[ https://issues.apache.org/jira/browse/MAHOUT-1032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sean Owen resolved MAHOUT-1032.
-------------------------------

    Resolution: Not A Problem

> AggregateAndRecommendReducer gets OOM in setup() method
> -------------------------------------------------------
>
>                 Key: MAHOUT-1032
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-1032
>             Project: Mahout
>          Issue Type: Bug
>          Components: Collaborative Filtering
>    Affects Versions: 0.5, 0.6, 0.7, 0.8
>         Environment: hadoop cluster with -Xmx set to 2G
>            Reporter: CodyInnowhere
>            Assignee: Sean Owen
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> This bug is actually caused by the very first job: itemIDIndex. This job
> maps each itemID to an integer index, and the later
> AggregateAndRecommendReducer tries to read all items into the
> OpenIntLongHashMap indexItemIDMap. However, for large data sets (e.g., my
> test data set covers 100 million+ items, which is not that many for a large
> e-commerce website), tasks run out of memory in the setup() method. I don't
> think the itemIDIndex job is necessary; without it, the final
> AggregateAndRecommend step would not have to read all items into memory to
> do the reverse index mapping.
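To make the reported memory pressure concrete, here is a rough back-of-the-envelope sketch in Java. It is not Mahout code. The class name IndexMapMemoryEstimate, the per-slot layout (an int key, a long value, and one state byte held in parallel arrays), and the load factor of 0.5 are all assumptions made for illustration; the actual internals of OpenIntLongHashMap may differ.

```java
public class IndexMapMemoryEstimate {

    // Assumed per-slot cost of an open-addressing int -> long map:
    // 4 bytes (int key) + 8 bytes (long value) + 1 byte (slot state).
    static final long BYTES_PER_SLOT = 4 + 8 + 1;

    // Estimates the bytes held by the backing arrays for the given number
    // of entries at the given load factor (entries / slots).
    static long estimateBytes(long entries, double loadFactor) {
        long slots = (long) Math.ceil(entries / loadFactor);
        return slots * BYTES_PER_SLOT;
    }

    public static void main(String[] args) {
        long items = 100_000_000L;               // item count from the report
        long heap = 2L * 1024 * 1024 * 1024;     // -Xmx2G from the report
        long needed = estimateBytes(items, 0.5); // 0.5 is an assumed load factor

        System.out.printf("estimated map size: %d bytes%n", needed);
        System.out.printf("heap limit:         %d bytes%n", heap);
        System.out.printf("fits in heap:       %b%n", needed < heap);
    }
}
```

Under these assumptions the backing arrays alone come to roughly 2.6 GB for 100 million items, which is more than a 2G heap before any other task state is counted. That is consistent with the failure described in setup(), although the exact figure depends on the real table capacity and growth policy.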