[
https://issues.apache.org/jira/browse/MAHOUT-974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13672327#comment-13672327
]
Sebastian Schelter commented on MAHOUT-974:
-------------------------------------------
Saikat,
In the preprocessing code of the ALS job (the first two MapReduce jobs), you
would need to hash the long IDs to ints, ideally using the MultipleOutputs API
so that we don't need additional jobs. The mapping needs to be stored together
with the factorization and must be used in the PredictionJob, which uses the
factorization to predict interactions. It has to map the ints back to longs.
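
To make the idea concrete, here is a minimal sketch of the hashing and the
reverse lookup. It is plain Java without any Hadoop wiring, and the class and
method names (IdIndexMapping, idToIndex, register, toId) are hypothetical, not
existing Mahout API. In the real job the index-to-ID pairs would be written out
as a side output during preprocessing rather than kept in an in-memory map.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper (not part of Mahout): maps long IDs to int indexes and back. */
public class IdIndexMapping {

  // index -> original long ID; stands in for the mapping stored with the factorization
  private final Map<Integer, Long> indexToId = new HashMap<>();

  /** Folds a long ID into a non-negative int (XOR of both halves, sign bit cleared). */
  public static int idToIndex(long id) {
    return 0x7FFFFFFF & (int) (id ^ (id >>> 32));
  }

  /** Preprocessing side: records the mapping and returns the int index to factorize on. */
  public int register(long id) {
    int index = idToIndex(id);
    Long previous = indexToId.putIfAbsent(index, id);
    if (previous != null && previous != id) {
      // Two distinct long IDs hashed to the same int; this sketch simply fails.
      throw new IllegalStateException("Hash collision between IDs " + previous + " and " + id);
    }
    return index;
  }

  /** Prediction side: maps an int index back to the original long ID. */
  public long toId(int index) {
    Long id = indexToId.get(index);
    if (id == null) {
      throw new IllegalArgumentException("Unknown index " + index);
    }
    return id;
  }
}
```

Hashing 64-bit IDs into 31 bits can collide, so a real implementation would
have to decide how to handle that case; the sketch only detects it.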
> org.apache.mahout.cf.taste.hadoop.als.ParallelALSFactorizationJob use
> integer as userId and itemId
> ---------------------------------------------------------------------------------------------------
>
> Key: MAHOUT-974
> URL: https://issues.apache.org/jira/browse/MAHOUT-974
> Project: Mahout
> Issue Type: Wish
> Components: Collaborative Filtering
> Affects Versions: 0.8
> Reporter: Han Hui Wen
> Assignee: Sebastian Schelter
> Labels: CF,recommendation,als
> Fix For: 0.8
>
> Original Estimate: 2h
> Remaining Estimate: 2h
>
> org.apache.mahout.cf.taste.hadoop.als.ParallelALSFactorizationJob uses
> integer as userId and itemId, but
> org.apache.mahout.cf.taste.hadoop.similarity.item.ItemSimilarityJob and
> org.apache.mahout.cf.taste.hadoop.item.RecommenderJob use Long as userId and
> itemId.
> It would be best if ParallelALSFactorizationJob also used Long as userId and
> itemId, so that the same dataset can be used with all the recommendation
> algorithms.