[ https://issues.apache.org/jira/browse/MAHOUT-1286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13737606#comment-13737606 ]
Ted Dunning commented on MAHOUT-1286:
-------------------------------------

Recommendation as search is just one model. I want to have a good demo of that available so that people can deploy recommenders very easily. I am generally less enthusiastic about other forms of recommenders, but definitely not to the extent of thinking that Mahout should not support them.

Regarding your other questions:

1) Yes, search engines can support real-time learning of some forms and can update during recommendation operations.

2) No, search engines like Solr or Lucene only support models that can be sparsified. You can build simple ensembles using complex queries, however.

3) (The question you didn't ask.) One particular strength of recommendation as search is that it supports multi-model recommendation.

My own feeling is that getting basic recommendations up quickly allows more time to experiment with additional data sources and with alternative UI presentations, business rules and dithering. In my experience, these make a much larger difference than the basic recommendation algorithm, so getting them up quickly and not wasting time on the algorithm itself is often warranted.

If you have time and engineers to spend after getting this very, very good, then going back to improve the algorithms can make good sense. I have never seen a startup that had this time or these engineers. Only a few large companies have them either.

So my strategy here is to facilitate early successes without endangering longer-term optimizations.
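As a rough illustration of point 2 (a simple ensemble expressed as one complex query), here is a minimal sketch that combines several sparse indicator fields into a single boosted query string in classic Lucene query syntax. The class name, the field names (purchase_indicators, view_indicators) and the boost values are hypothetical and do not come from the comment or from any Mahout or Solr demo; the sketch only builds a string and does not call a search engine.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: builds one query over several indicator fields.
final class IndicatorQuery {

    // historyByField: for each (hypothetical) indicator field, the item IDs
    // from the user's recent history that should be matched against it.
    // boosts: optional per-field weight; fields without an entry get 1.0.
    static String build(Map<String, List<String>> historyByField,
                        Map<String, Double> boosts) {
        List<String> clauses = new ArrayList<>();
        for (Map.Entry<String, List<String>> e : historyByField.entrySet()) {
            if (e.getValue().isEmpty()) {
                continue; // no history for this field, so no clause
            }
            double boost = boosts.getOrDefault(e.getKey(), 1.0);
            // field:(term1 term2)^boost, one clause per indicator field
            clauses.add(e.getKey() + ":(" + String.join(" ", e.getValue()) + ")^" + boost);
        }
        // OR-ing the clauses lets the engine sum the per-field scores.
        return String.join(" OR ", clauses);
    }

    public static void main(String[] args) {
        Map<String, List<String>> history = new LinkedHashMap<>();
        history.put("purchase_indicators", Arrays.asList("item12", "item87"));
        history.put("view_indicators", Arrays.asList("item5"));

        Map<String, Double> boosts = new HashMap<>();
        boosts.put("purchase_indicators", 2.0);

        System.out.println(build(history, boosts));
    }
}
```

Run as written, this prints purchase_indicators:(item12 item87)^2.0 OR view_indicators:(item5)^1.0. Each clause stands in for one sparse model, and the relative boosts act as crude ensemble weights.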
> Memory-efficient DataModel, supporting fast online updates and element-wise iteration
> -------------------------------------------------------------------------------------
>
>                 Key: MAHOUT-1286
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-1286
>             Project: Mahout
>          Issue Type: Improvement
>          Components: Collaborative Filtering
>   Affects Versions: 0.9
>            Reporter: Peng Cheng
>            Assignee: Sean Owen
>              Labels: collaborative-filtering, datamodel, patch, recommender
>             Fix For: 0.9
>
>         Attachments: InMemoryDataModel.java, InMemoryDataModelTest.java
>
>   Original Estimate: 336h
>  Remaining Estimate: 336h
>
> Most DataModel implementations in the current CF component use hash maps to
> enable fast 2D indexing and updates. This is not memory-efficient for big
> data sets; e.g., the Netflix Prize dataset takes 11G of heap space as a
> FileDataModel.
> An improved implementation of DataModel should use a more compact data
> structure (like arrays); this can trade a little time complexity in 2D
> indexing for a vast improvement in memory efficiency. In addition, any online
> recommender or online-to-batch converted recommender will not be affected by
> this in the training process.
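To make the trade-off in the issue description concrete, here is a minimal sketch of array-backed preference storage: all preferences live in three flat primitive arrays, and a (user, item) lookup is a binary search within the user's row instead of a hash probe. This is not the attached InMemoryDataModel.java and does not implement Mahout's DataModel interface; the class and method names are hypothetical. It assumes dense integer user indices, item IDs sorted within each user's row, and no online updates.

```java
import java.util.Arrays;

// Hypothetical compact store: row-compressed layout over primitive arrays.
final class CompactPreferences {
    private final int[] rowStart; // user u owns entries rowStart[u] .. rowStart[u + 1] - 1
    private final int[] itemIds;  // item IDs, sorted ascending within each user's row
    private final float[] values; // preference values, parallel to itemIds

    // itemsPerUser[u] must already be sorted ascending (assumption of this sketch).
    CompactPreferences(int[][] itemsPerUser, float[][] valuesPerUser) {
        int numUsers = itemsPerUser.length;
        rowStart = new int[numUsers + 1];
        for (int u = 0; u < numUsers; u++) {
            rowStart[u + 1] = rowStart[u] + itemsPerUser[u].length;
        }
        itemIds = new int[rowStart[numUsers]];
        values = new float[rowStart[numUsers]];
        for (int u = 0; u < numUsers; u++) {
            System.arraycopy(itemsPerUser[u], 0, itemIds, rowStart[u], itemsPerUser[u].length);
            System.arraycopy(valuesPerUser[u], 0, values, rowStart[u], valuesPerUser[u].length);
        }
    }

    // 2D lookup: O(log k) for a user with k preferences, versus O(1) expected for a hash map.
    // Returns Float.NaN when the user has no preference for the item.
    float get(int user, int item) {
        int pos = Arrays.binarySearch(itemIds, rowStart[user], rowStart[user + 1], item);
        return pos >= 0 ? values[pos] : Float.NaN;
    }

    // Element-wise iteration is a single pass over a flat array.
    double sumOfAllValues() {
        double sum = 0.0;
        for (float v : values) {
            sum += v;
        }
        return sum;
    }

    int size() {
        return values.length;
    }

    public static void main(String[] args) {
        CompactPreferences prefs = new CompactPreferences(
            new int[][] {{3, 7}, {1}},
            new float[][] {{5.0f, 4.0f}, {2.0f}});
        System.out.println(prefs.get(0, 7));
        System.out.println(prefs.sumOfAllValues());
    }
}
```

Each preference costs 8 bytes here (one int and one float) plus a small per-user offset, with no per-entry object overhead. The price is the logarithmic lookup and the lack of cheap insertion, which a real implementation with online updates would have to address separately.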