I would recommend using only the ad data. In recommender-engine speak these are "boolean" data points: a user either clicked an ad or did not, with no rating attached. You can 'recommend' ads this way.
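
To make the "boolean" idea concrete, here is a minimal sketch in plain Python (it is not Mahout's API). It assumes a made-up click log and scores the ads a user has not clicked by their Tanimoto similarity to the ads that user did click. The data and the names `clicks`, `tanimoto` and `recommend` are hypothetical and only for illustration.

```python
from collections import defaultdict

# Hypothetical click log: user id -> set of ad ids the user clicked.
clicks = {
    "u1": {"adA", "adB"},
    "u2": {"adA", "adB", "adC"},
    "u3": {"adB", "adC"},
    "u4": {"adD"},
}


def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two sets of user ids."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def recommend(user, clicks, top_n=3):
    """Rank ads the user has not clicked by summed similarity to clicked ads."""
    # Invert the log: ad id -> set of users who clicked it.
    users_by_ad = defaultdict(set)
    for u, ads in clicks.items():
        for ad in ads:
            users_by_ad[ad].add(u)

    seen = clicks.get(user, set())
    scores = {}
    for candidate, candidate_users in users_by_ad.items():
        if candidate in seen:
            continue
        score = sum(tanimoto(candidate_users, users_by_ad[s]) for s in seen)
        if score > 0:
            scores[candidate] = score
    # Highest score first; ties broken by ad id so the order is deterministic.
    return sorted(scores, key=lambda ad: (-scores[ad], ad))[:top_n]


print(recommend("u1", clicks))
```

With this toy data, user "u1" is recommended only "adC", because it shares clickers with both ads that "u1" clicked, while "adD" shares none. Note that only clicks enter the model: there are no ratings and no not-clicked records.
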
I understand your question is a bit more than that. First, you want to use the *not*-clicked data. My first question is: is this meaningful? I am served 1000 ads per day that I don't even look at; the fact that I do not click them does not say much. Is your situation some kind of interstitial ad that the user is forced to skip? That's more meaningful, but the same comment applies.

If you really do have such meaningful data, consider building a separate "anti-recommender" out of it. This will tell you which ads are probably the worst to show. You could then merge the two results to make your decision.

What to do with purchase data? You could ignore it on the grounds that when recommending ads, the only thing that matters is an ad's ability to induce a click -- whether it results in a purchase is a different matter. Or you could view a purchase as reaffirming that the ad click was a "strong click": that it is more likely the user was not merely curious or mis-clicked, but was significantly more interested in the advertised product.

You could go back and add "ratings" to your model -- a "1" for a click and a "5" for a click that results in a purchase? It's quite arbitrary and I don't know whether the results would be much better.

If you're serious about using this data too, I would again recommend looking at the ALS algorithm as presented in www2.research.att.com/~yifanhu/PUB/cf.pdf -- their model is nice in that it ingests a "confidence" in the association between a user and an item, which is much more like what you have than a "rating".

On Wed, Apr 4, 2012 at 10:35 AM, vinutha <vinutha...@yahoo.com> wrote:
>
> Hello!
>
> I have a data set containing user behavior, such as which products s/he
> clicked on and which products s/he bought from a retail site. I have
> another data set containing which ads the same user has clicked on, and
> the ads which were shown to him/her but haven't been clicked on. The idea
> is to use the user behavior data set to make recommendations for ads.
> As I've understood from Mahout in Action, there isn't a way to introduce
> user behavior as a feature set. One can only use user id, product id / ad
> id, and preferences.
>
> Is my understanding correct?
> Any suggestions would be most welcome!
>
> Thanks,
> Vinutha
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/recommend-ads-using-mahout-tp3883496p3883496.html
> Sent from the Mahout User List mailing list archive at Nabble.com.