On 03/12/12 04:06, Koobas wrote:
> Thank you very much. The pointer to Myrrix is a very useful piece of
> information. Myrrix, however, relies on an iterative sparse matrix
> factorization to do PCA.
>
> I want to produce Amazon-like recommendations, i.e., "70% of users who
> bought this also bought that." So, I specifically want the direct kNN
> algorithm. Any clue what Mahout + Hadoop can deliver on that one?
>
> Thanks,
> Jacob
While the "70% of users who bought this also bought that" figure could be generated by a suitable recommendation engine, I think it fits better with a frequent pattern mining approach, i.e. association rules. I don't know whether Amazon implements it that way, but it seems likely, since it is not really a personalized recommendation (unless we interpret the personalization as coming from the pages the user is visiting, i.e. real-time profile building).

I believe Mahout has a frequent itemset mining algorithm (FPGrowth), though I've never tried it myself. For your problem, you would choose a minimum support for your itemsets, which eliminates spurious associations between rarely bought items. The confidence of a rule item1 -> item2, i.e. the fraction of the transactions containing item1 that also contain item2, would then be directly your 70% value.

Your formulation selects only rules with a single item in the antecedent (item1 -> item2), but you could use the items the user visited earlier to build bigger antecedents.

Paulo
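
To make the support and confidence idea concrete, here is a minimal sketch in plain Python. It is not Mahout's FPGrowth and does not use Hadoop; it simply counts item pairs in memory, which is only practical for small data. The function name pair_rules, the thresholds, and the sample baskets are all made up for illustration.

```python
from collections import Counter
from itertools import combinations


def pair_rules(baskets, min_support=2, min_confidence=0.5):
    """Return (antecedent, consequent, support, confidence) tuples
    for rules with a single item in the antecedent."""
    item_counts = Counter()
    pair_counts = Counter()
    for basket in baskets:
        items = sorted(set(basket))  # ignore duplicates within one basket
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    rules = []
    for (a, b), n_ab in pair_counts.items():
        if n_ab < min_support:
            continue  # drop pairs seen too rarely to be trusted
        # Each frequent pair yields two candidate rules: a -> b and b -> a.
        for x, y in ((a, b), (b, a)):
            confidence = n_ab / item_counts[x]
            if confidence >= min_confidence:
                rules.append((x, y, n_ab, confidence))
    # Highest confidence first, then alphabetical for a stable order.
    return sorted(rules, key=lambda r: (-r[3], r[0], r[1]))


# Hypothetical purchase histories, one list per user.
baskets = [
    ["book", "lamp"],
    ["book", "lamp", "pen"],
    ["book", "pen"],
    ["book", "lamp"],
    ["lamp"],
]

for x, y, support, confidence in pair_rules(baskets):
    print(f"{confidence:.0%} of users who bought {x} also bought {y} "
          f"(support={support})")
```

Raising min_support removes pairs that co-occur only a handful of times, and the confidence printed for each rule is the percentage you would show to the user.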