Hi Deejay,

> 1. We're using Mahout as a recommendation system. Has anyone had any success
> plugging Neo4j into this?

I work with various companies that use graph databases for recommendation. A couple of them are also experimenting with Mahout over MySQL and Hadoop. I had never thought about backing Mahout with a graph database. That is an interesting idea...

Here are three related thoughts (though they do not directly address your question):

1. With Mahout (as I understand it as of a few versions back), only single-relational data can be processed. That is, it only supports data of the form "X likes[weight] Y," where weight can be binary or rational. When a domain model is sufficiently complex (people liking things, people knowing each other, people working in the same or similar places, products having features, designers, etc.), there is more information in the domain that can be capitalized on for recommendation.

2. With pure graph-based recommendation, no recommendation model (i.e. no intermediate data structure) is generated, as recommendations are calculated on the fly over the raw graph using traversal techniques. Traversals can propagate over more complex relations and are not limited to "X likes[weight] Y." Better yet, such basic relations can be derived from implicit relations (i.e. paths) [ http://markorodriguez.com/2011/02/08/property-graph-algorithms/ ]. Along this line of thought, the raw graph representation of your domain can be used for more than just recommendation, e.g. path analysis, global ranking, searching, reasoning, abstraction, etc. [ http://markorodriguez.com/2011/07/14/graphs-brains-and-gremlin/ ]

3. With various forms of graph sampling/weighting, it is possible to put as many clock cycles (thus, compute time) as desired into the determination of a recommendation. Generally, more clock cycles yield greater accuracy.
However, with accumulative methods, it is possible to reach an ergodic state [ http://en.wikipedia.org/wiki/Ergodicity ] whereby the contribution of more clock cycles does not yield more information (i.e. does not alter the order of the resultant recommendation ranking).

While your question was about overlaying Mahout on top of Neo4j, I argue that by using Neo4j in its native form (through its API and its approach to data analysis), there is much more beyond recommendation that you can exploit from your domain model.

To conclude, I recently wrote up a post on graph-based recommendation that may be of interest to you:

http://markorodriguez.com/2011/09/22/a-graph-based-movie-recommender-engine/

Good luck with your explorations,
Marko.

http://markorodriguez.com

_______________________________________________
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user
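
As a rough illustration of points 2 and 3 above, here is a minimal sketch in plain Python. It uses neither the Neo4j nor the Mahout API, and everything in it (the toy LIKES data and the function names) is hypothetical. The first function scores unseen items by counting user -> item -> co-liker -> item paths directly over the raw "likes" edges, with no intermediate model. The second approximates the same ranking with random walks, so the number of walks stands in for the "clock cycles" spent on a recommendation.

```python
import random
from collections import Counter

# Hypothetical toy data: each user maps to the set of items they like,
# i.e. a single "X likes Y" relation with binary weights.
LIKES = {
    "alice": {"matrix", "inception"},
    "bob": {"matrix", "inception", "memento"},
    "carol": {"matrix", "memento", "alien"},
    "dave": {"alien", "heat"},
}


def liked_by(likes):
    """Invert the likes edges: item -> set of users who like it."""
    index = {}
    for user, items in likes.items():
        for item in items:
            index.setdefault(item, set()).add(user)
    return index


def recommend_by_traversal(likes, user):
    """Exhaustive traversal: count every path
    user -> liked item -> other user -> item the user has not seen."""
    index = liked_by(likes)
    seen = likes[user]
    scores = Counter()
    for item in seen:
        for other in index[item]:
            if other == user:
                continue
            for candidate in likes[other]:
                if candidate not in seen:
                    scores[candidate] += 1
    return scores.most_common()


def recommend_by_sampling(likes, user, walks, seed=0):
    """Sampled traversal: take `walks` random three-step walks from the
    user and count where they end. More walks cost more compute time."""
    rng = random.Random(seed)
    index = liked_by(likes)
    seen = sorted(likes[user])
    scores = Counter()
    for _ in range(walks):
        item = rng.choice(seen)
        others = sorted(index[item] - {user})
        if not others:
            continue  # nobody else likes this item; the walk dies here
        other = rng.choice(others)
        candidate = rng.choice(sorted(likes[other]))
        if candidate not in likes[user]:
            scores[candidate] += 1
    return scores.most_common()


if __name__ == "__main__":
    print(recommend_by_traversal(LIKES, "alice"))
    # Watch the order of the ranking as the walk budget grows.
    for walks in (10, 100, 1000, 10000):
        ranking = [item for item, _ in recommend_by_sampling(LIKES, "alice", walks)]
        print(walks, ranking)
```

On a graph this small the sampled ranking settles almost immediately; once the order stops changing, additional walks only refine the counts, which is the saturation effect described in point 3. In a real property graph the same walk could follow other edge types as well (knows, works-at, has-feature), which is where this approach goes beyond a single "likes" relation.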