CodyInnowhere created MAHOUT-1032:
-------------------------------------

             Summary: AggregateAndRecommendReducer gets OOM in setup() method
                 Key: MAHOUT-1032
                 URL: https://issues.apache.org/jira/browse/MAHOUT-1032
             Project: Mahout
          Issue Type: Bug
          Components: Collaborative Filtering
    Affects Versions: 0.6, 0.5, 0.7, 0.8
         Environment: Hadoop cluster with -Xmx set to 2G
            Reporter: CodyInnowhere
            Assignee: Sean Owen


The root cause of this bug is the very first job, itemIDIndex. This job maps 
each itemID to an integer index. Later, in its setup() method, 
AggregateAndRecommendReducer tries to load every item into the 
OpenIntLongHashMap indexItemIDMap. For large data sets this fails: my test data 
set covers 100 million+ items (not unusual for a large e-commerce website), and 
tasks run out of memory in setup(). I don't think itemIDIndex is necessary. 
Without this job, the final AggregateAndRecommend step would not have to read 
all items into memory to do the reverse index mapping.
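
To make the memory pressure concrete, here is a rough back-of-the-envelope 
sketch in Java. It is not Mahout code: the class and method names are 
hypothetical, the hash fold in idToIndex is only a stand-in for whatever 
itemIDIndex actually does, and the 13 bytes per slot (int key, long value, one 
state byte) and the 0.5 maximum load factor are assumptions about a typical 
open-addressing int-to-long hash map.

```java
public class ReverseIndexMemoryEstimate {

    // Hypothetical stand-in for the itemID -> index step: folds a 64-bit ID
    // into a non-negative 32-bit index. The fold is not invertible, which is
    // why a reverse (index -> itemID) table has to be kept somewhere.
    public static int idToIndex(long itemId) {
        return Long.hashCode(itemId) & 0x7FFFFFFF;
    }

    // Estimates the bytes held by the backing arrays of an open-addressing
    // int -> long map. ASSUMPTION: one int key, one long value and one state
    // byte per slot, and the table is sized so that entries / slots stays at
    // or below maxLoadFactor. Object headers and resize copies are ignored.
    public static long estimateBytes(long entries, double maxLoadFactor) {
        long slots = (long) Math.ceil(entries / maxLoadFactor);
        long bytesPerSlot = Integer.BYTES + Long.BYTES + Byte.BYTES;
        return slots * bytesPerSlot;
    }

    public static void main(String[] args) {
        // Two distinct IDs that fold to the same index.
        long a = 1L;
        long b = 1L << 32;
        System.out.println("index(a) = " + idToIndex(a)
                + ", index(b) = " + idToIndex(b));

        // 100 million items, assumed maximum load factor of 0.5.
        long bytes = estimateBytes(100_000_000L, 0.5);
        System.out.printf("estimated map size: %d bytes (%.2f GiB)%n",
                bytes, bytes / (1024.0 * 1024.0 * 1024.0));
    }
}
```

Under these assumptions, 100 million entries need about 2.6 billion bytes for 
the map arrays alone, which already exceeds a 2G heap before any other task 
state is counted.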


        
