[
https://issues.apache.org/jira/browse/MAHOUT-1032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13295584#comment-13295584
]
CodyInnowhere commented on MAHOUT-1032:
---------------------------------------
Well, we have a billion+ distinct items (www.taobao.com); the test data set is
a subset of our online items. I see the reason for the index mapping, but it
makes enterprise-scale data sets difficult to fit into Mahout CF.
BTW, the index mapping is also a problem with this many items, since our
itemIDs may exceed Integer.MAX_VALUE.
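The Integer.MAX_VALUE concern can be made concrete. The sketch below re-implements the idea behind an idToIndex-style mapping (hashing a long ID down to a non-negative int); the exact hash shown is an assumption for illustration, not a copy of the Mahout source. Because 64 bits are folded into 31, distinct long itemIDs can collide, and the original ID cannot be recovered from the index alone:

```java
// Sketch (an assumption, not the Mahout source): fold a 64-bit item ID
// into a non-negative 31-bit index, as any idToIndex-style mapping must.
public class IdToIndexSketch {

    static int idToIndex(long itemID) {
        // XOR the high and low 32 bits, then clear the sign bit.
        return 0x7FFFFFFF & ((int) itemID ^ (int) (itemID >>> 32));
    }

    public static void main(String[] args) {
        // Two distinct long IDs that land on the same int index:
        System.out.println(idToIndex(1L));            // prints 1
        System.out.println(idToIndex(4294967296L));   // 2^32, also prints 1
        // An ID beyond Integer.MAX_VALUE still maps to some index,
        // but the mapping is lossy: the ID is unrecoverable from it.
        System.out.println(idToIndex(10_000_000_000L));
    }
}
```

This lossiness is exactly why the reducer must hold the full index-to-itemID map in memory to translate indices back to real IDs.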
> AggregateAndRecommendReducer gets OOM in setup() method
> -------------------------------------------------------
>
> Key: MAHOUT-1032
> URL: https://issues.apache.org/jira/browse/MAHOUT-1032
> Project: Mahout
> Issue Type: Bug
> Components: Collaborative Filtering
> Affects Versions: 0.5, 0.6, 0.7, 0.8
> Environment: hadoop cluster with -Xmx set to 2G
> Reporter: CodyInnowhere
> Assignee: Sean Owen
> Original Estimate: 168h
> Remaining Estimate: 168h
>
> This bug is actually caused by the very first job, itemIDIndex. That job maps
> each itemID to an integer index, and the later AggregateAndRecommendReducer
> tries to read all items into the OpenIntLongHashMap indexItemIDMap. However,
> for large data sets (e.g., my test data set covers 100 million+ items, not an
> unusual count for a large e-commerce website), tasks run out of memory in the
> setup() method. I don't think the itemIDIndex job is necessary; without it,
> the final AggregateAndRecommend step would not have to read all items into
> memory to perform the reverse index mapping.
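A back-of-envelope estimate shows why loading the map in setup() fails at this scale. An OpenIntLongHashMap-style table stores parallel arrays (int keys, long values, and per-slot state bytes); the 13-bytes-per-slot figure and the 0.5 effective load factor below are assumptions for illustration, not measured Mahout internals:

```java
// Hedged estimate: heap needed just to hold an index->itemID map for ~100M
// items, assuming open addressing with int key (4) + long value (8) +
// state byte (1) per slot and tables kept roughly half full.
public class ReducerSetupMemorySketch {

    static long estimatedBytes(long items, double loadFactor, long bytesPerSlot) {
        // slots = items / loadFactor; total = slots * bytesPerSlot
        return (long) (items / loadFactor * bytesPerSlot);
    }

    public static void main(String[] args) {
        long bytes = estimatedBytes(100_000_000L, 0.5, 4 + 8 + 1);
        System.out.println(bytes / (1024 * 1024) + " MB for the map alone");
    }
}
```

Under these assumptions the map alone needs roughly 2.5 GB, which already exceeds the 2G -Xmx in the reported environment before the reducer does any actual work.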