Contextual multi-armed bandits

This is an Octave demo for contextual multi-armed bandits. A contextual multi-armed bandit strategy learns to select the best-performing model from a finite set of models given a context. Although I use randomly generated context vectors, the learning process is already visible after a small number of iterations over the same context set. In practice, the context should contain meaningful input data, e.g. the query words entered by a user into the input field of a search engine. My Octave code is based on the LinUCB pseudocode proposed in “A Contextual-Bandit Approach to Personalized News Article Recommendation”, Lihong Li, Wei Chu, Yahoo! Labs.

You can find the code in my git repository.
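To illustrate the idea, here is a minimal sketch of disjoint LinUCB as described in the cited paper: one ridge-regression model per arm, and arm selection by estimated reward plus an exploration bonus. This is not the repository's Octave code; it is written in Python/NumPy for brevity, and the random linear-reward setup, variable names, and parameters (`alpha`, noise level) are illustrative assumptions.

```python
import numpy as np

def linucb(contexts, true_theta, n_rounds=2000, alpha=1.0, seed=0):
    """Disjoint LinUCB: per-arm ridge regression (A_a, b_a), UCB arm choice.

    contexts:   fixed set of context vectors, reused across rounds
    true_theta: hidden per-arm linear reward models (for simulation only)
    Returns the fraction of rounds in which the truly best arm was chosen.
    """
    rng = np.random.default_rng(seed)
    n_arms, d = true_theta.shape
    A = np.stack([np.eye(d) for _ in range(n_arms)])  # one d x d matrix per arm
    b = np.zeros((n_arms, d))                         # one d-vector per arm
    correct = 0
    for _ in range(n_rounds):
        x = contexts[rng.integers(len(contexts))]     # draw from the fixed context set
        # UCB score per arm: estimated reward + exploration bonus
        p = np.empty(n_arms)
        for a in range(n_arms):
            A_inv = np.linalg.inv(A[a])
            theta_hat = A_inv @ b[a]
            p[a] = theta_hat @ x + alpha * np.sqrt(x @ A_inv @ x)
        a = int(np.argmax(p))
        # observe a noisy linear reward from the hidden true model
        r = true_theta[a] @ x + 0.1 * rng.standard_normal()
        A[a] += np.outer(x, x)                        # update the chosen arm only
        b[a] += r * x
        if a == int(np.argmax(true_theta @ x)):
            correct += 1
    return correct / n_rounds

d, n_arms = 5, 3
rng = np.random.default_rng(1)
contexts = rng.standard_normal((10, d))               # randomly generated contexts
true_theta = rng.standard_normal((n_arms, d))
print(linucb(contexts, true_theta))
```

Even with random contexts, the printed fraction of correct arm choices rises well above the 1/3 a uniform-random policy would achieve, which is the learning effect visible in the demo.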
