[ 
https://issues.apache.org/jira/browse/MAHOUT-974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13672327#comment-13672327
 ] 

Sebastian Schelter commented on MAHOUT-974:
-------------------------------------------

Saikat,

In the preprocessing code of the ALS job (the first two MapReduce jobs), you 
would need to hash the long IDs to ints, ideally using the MultipleOutputs API 
so that we don't need additional jobs. The mapping needs to be stored together 
with the factorization and must be used in the PredictionJob, which uses the 
factorization to predict interactions. The PredictionJob has to map the ints 
back to longs. 
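
As a rough illustration of the idea described above, here is a minimal, 
standalone sketch in plain Java. It is not the Mahout implementation: the 
class IdIndexSketch and its methods idToIndex, register, and toLongId are 
hypothetical names chosen for this example, and the in-memory map merely 
stands in for the mapping that would be written next to the factorization. 
Hashing a long to an int can collide, so the sketch fails loudly when two 
different IDs land on the same index.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Mahout: hashes long IDs to ints and
// keeps the reverse mapping so predictions can be reported with long IDs.
final class IdIndexSketch {

    // Reverse mapping (int index -> original long ID). In the real job this
    // would have to be persisted together with the factorization.
    private final Map<Integer, Long> indexToId = new HashMap<>();

    // Hash a long ID to a non-negative int. Distinct IDs may collide.
    public static int idToIndex(long id) {
        return 0x7FFFFFFF & Long.hashCode(id);
    }

    // Record the mapping for one ID and return its int index.
    public int register(long id) {
        int index = idToIndex(id);
        Long previous = indexToId.putIfAbsent(index, id);
        if (previous != null && previous != id) {
            throw new IllegalStateException(
                "Hash collision between IDs " + previous + " and " + id);
        }
        return index;
    }

    // Map an int index back to the original long ID.
    public long toLongId(int index) {
        Long id = indexToId.get(index);
        if (id == null) {
            throw new IllegalArgumentException("Unknown index " + index);
        }
        return id;
    }
}
```

In this sketch, register would be called during preprocessing and toLongId 
during prediction. How collisions should be handled in practice (fail, 
rehash, or accept them) is left open here.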
                
> org.apache.mahout.cf.taste.hadoop.als.ParallelALSFactorizationJob  use 
> integer as userId and itemId
> ---------------------------------------------------------------------------------------------------
>
>                 Key: MAHOUT-974
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-974
>             Project: Mahout
>          Issue Type: Wish
>          Components: Collaborative Filtering
>    Affects Versions: 0.8
>            Reporter: Han Hui Wen 
>            Assignee: Sebastian Schelter
>              Labels: CF,recommendation,als
>             Fix For: 0.8
>
>   Original Estimate: 2h
>  Remaining Estimate: 2h
>
> org.apache.mahout.cf.taste.hadoop.als.ParallelALSFactorizationJob uses 
> integer as userId and itemId, but 
> org.apache.mahout.cf.taste.hadoop.similarity.item.ItemSimilarityJob and 
> org.apache.mahout.cf.taste.hadoop.item.RecommenderJob use Long as userId and 
> itemId.
> It would be best if ParallelALSFactorizationJob also used Long as userId and 
> itemId, so that the same dataset can be used with all the recommendation 
> algorithms.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
