[jira] [Commented] (MAHOUT-974) org.apache.mahout.cf.taste.hadoop.als.ParallelALSFactorizationJob use integer as userId and itemId
[ https://issues.apache.org/jira/browse/MAHOUT-974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13678706#comment-13678706 ]

Hudson commented on MAHOUT-974:
---

Integrated in Mahout-Quality #2056 (See [https://builds.apache.org/job/Mahout-Quality/2056/])
MAHOUT-974 org.apache.mahout.cf.taste.hadoop.als.ParallelALSFactorizationJob use integer as userId and itemId (Revision 1490930)

Result = FAILURE
ssc :
Files :
* /mahout/trunk/CHANGELOG
* /mahout/trunk/core/src/main/java/org/apache/mahout/cf/taste/hadoop/MutableRecommendedItem.java
* /mahout/trunk/core/src/main/java/org/apache/mahout/cf/taste/hadoop/TasteHadoopUtils.java
* /mahout/trunk/core/src/main/java/org/apache/mahout/cf/taste/hadoop/als/FactorizationEvaluator.java
* /mahout/trunk/core/src/main/java/org/apache/mahout/cf/taste/hadoop/als/ParallelALSFactorizationJob.java
* /mahout/trunk/core/src/main/java/org/apache/mahout/cf/taste/hadoop/als/PredictionMapper.java
* /mahout/trunk/core/src/main/java/org/apache/mahout/cf/taste/hadoop/als/RecommenderJob.java
* /mahout/trunk/core/src/main/java/org/apache/mahout/cf/taste/hadoop/item/AggregateAndRecommendReducer.java
* /mahout/trunk/core/src/main/java/org/apache/mahout/cf/taste/hadoop/similarity/item/ItemSimilarityJob.java
* /mahout/trunk/core/src/test/java/org/apache/mahout/cf/taste/hadoop/als/ParallelALSFactorizationJobTest.java
* /mahout/trunk/core/src/test/java/org/apache/mahout/math/hadoop/MathHelper.java

> org.apache.mahout.cf.taste.hadoop.als.ParallelALSFactorizationJob use integer as userId and itemId
> ---
>
> Key: MAHOUT-974
> URL: https://issues.apache.org/jira/browse/MAHOUT-974
> Project: Mahout
> Issue Type: Wish
> Components: Collaborative Filtering
> Affects Versions: 0.8
> Reporter: Han Hui Wen
> Assignee: Sebastian Schelter
> Labels: CF,recommendation,als
> Fix For: 0.8
>
> Attachments: MAHOUT-974.patch
>
> Original Estimate: 2h
> Remaining Estimate: 2h
>
> org.apache.mahout.cf.taste.hadoop.als.ParallelALSFactorizationJob uses integer as userId and itemId, but
> org.apache.mahout.cf.taste.hadoop.similarity.item.ItemSimilarityJob and
> org.apache.mahout.cf.taste.hadoop.item.RecommenderJob use Long as userId and itemId.
> It's best that ParallelALSFactorizationJob also uses Long as userId and itemId, so that the same dataset can be used with all the recommendation algorithms.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAHOUT-974) org.apache.mahout.cf.taste.hadoop.als.ParallelALSFactorizationJob use integer as userId and itemId
[ https://issues.apache.org/jira/browse/MAHOUT-974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13677975#comment-13677975 ]

Sebastian Schelter commented on MAHOUT-974:
---

Saikat,

I've had a deeper look and I think I'll take this issue. It's an ugly thing, and lots of small places in the code need to be updated... Nothing fancy for someone not deeply familiar with the codebase.

If you still wanna work on the ALS code, we should discuss this on the mailing list. I have a few ideas about what could be added for upcoming releases.
[jira] [Commented] (MAHOUT-974) org.apache.mahout.cf.taste.hadoop.als.ParallelALSFactorizationJob use integer as userId and itemId
[ https://issues.apache.org/jira/browse/MAHOUT-974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13677095#comment-13677095 ]

Saikat Kanjilal commented on MAHOUT-974:
---

Sebastian,

Looking at ABtJob, I see MultipleOutputs commented out. I searched for this class and it doesn't exist. Is this more of a concept than an actual class?
[jira] [Commented] (MAHOUT-974) org.apache.mahout.cf.taste.hadoop.als.ParallelALSFactorizationJob use integer as userId and itemId
[ https://issues.apache.org/jira/browse/MAHOUT-974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13676816#comment-13676816 ]

Sebastian Schelter commented on MAHOUT-974:
---

Hi Saikat,

The first two jobs create two versions of the ratings matrix, one partitioned by items, the other partitioned by users. The most elegant solution for this issue would be to make these jobs write out the mapping of ints to long ids via an emulation of MultipleOutputs, such as the one used in org.apache.mahout.math.hadoop.stochasticsvd.ABtJob.

I suggest we add an argument "usesLongIDs" to the job that the user can set to trigger the mapping.
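A minimal in-memory sketch of the preprocessing idea in this comment, assuming comma-separated `user,item,rating` input lines. It is not the Mahout implementation: the class name, the field names, and the behaviour when "usesLongIDs" is off (parsing the tokens directly as ints) are assumptions made for illustration, and plain maps stand in for the MapReduce outputs. The hash in `idToIndex` is likewise an assumed stand-in for whatever hash the jobs actually use.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: builds both "partitionings" of the ratings matrix and,
// when usesLongIDs is set, records the int -> long mappings as a side output.
final class RatingsPreprocessingSketch {
  final Map<Integer, Map<Integer, Float>> byUser = new HashMap<>(); // user index -> (item index -> rating)
  final Map<Integer, Map<Integer, Float>> byItem = new HashMap<>(); // item index -> (user index -> rating)
  final Map<Integer, Long> userIndexToId = new HashMap<>();         // side output: user mapping
  final Map<Integer, Long> itemIndexToId = new HashMap<>();         // side output: item mapping
  private final boolean usesLongIDs;

  RatingsPreprocessingSketch(boolean usesLongIDs) {
    this.usesLongIDs = usesLongIDs;
  }

  // Assumed hash: fold 64 bits into 32 and clear the sign bit.
  static int idToIndex(long id) {
    return 0x7FFFFFFF & Long.hashCode(id);
  }

  private int toIndex(String token, Map<Integer, Long> mapping) {
    if (!usesLongIDs) {
      return Integer.parseInt(token); // IDs are assumed to fit into an int already
    }
    long id = Long.parseLong(token);
    int index = idToIndex(id);
    mapping.put(index, id);           // remember how to get back to the long ID
    return index;
  }

  void addLine(String line) {
    String[] tokens = line.split(",");
    int user = toIndex(tokens[0].trim(), userIndexToId);
    int item = toIndex(tokens[1].trim(), itemIndexToId);
    float rating = Float.parseFloat(tokens[2].trim());
    byUser.computeIfAbsent(user, k -> new HashMap<>()).put(item, rating);
    byItem.computeIfAbsent(item, k -> new HashMap<>()).put(user, rating);
  }
}
```

In a real job the two mappings would be written alongside the matrices rather than held in memory, which is the role MultipleOutputs plays in the suggestion above.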
[jira] [Commented] (MAHOUT-974) org.apache.mahout.cf.taste.hadoop.als.ParallelALSFactorizationJob use integer as userId and itemId
[ https://issues.apache.org/jira/browse/MAHOUT-974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13676722#comment-13676722 ]

Saikat Kanjilal commented on MAHOUT-974:
---

Sebastian,

I finally had a chance to dig into this further tonight. Looking at the first two map-reduces, I see the ItemRatingVectorsMapper class. Two ideas here:

1) Get rid of this class, use the ItemIDIndexMapper class instead, and try to make that class work for ALS.
2) Make ItemRatingVectorsMapper handle the mapping. Unlike ItemIDIndexMapper, this class doesn't really handle an index; it deals with the rating matrix, which itself would need to be modified.

Any thoughts on the simplest solution? My vote would be 2, but I need to read through the code some more to get a deeper understanding. Also, please pardon me if I'm way off base on a solution here :)) There is a lot of code to read and understand.
[jira] [Commented] (MAHOUT-974) org.apache.mahout.cf.taste.hadoop.als.ParallelALSFactorizationJob use integer as userId and itemId
[ https://issues.apache.org/jira/browse/MAHOUT-974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13672327#comment-13672327 ]

Sebastian Schelter commented on MAHOUT-974:
---

Saikat,

In the preprocessing code of the ALS job (the first two MapReduce jobs), you would need to hash the long ids to ints, ideally using the MultipleOutputs API so that we don't need additional jobs.

The mapping needs to be stored together with the factorization and must be used in the PredictionJob, which uses the factorization to predict interactions. It has to map the ints back to longs.
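The round trip described here (hash longs to ints during preprocessing, then map the ints back to longs at prediction time) could look roughly like the sketch below. Everything in it is hypothetical: the class and method names are invented for illustration, the hash is an assumed stand-in, and failing loudly on a hash collision is a choice made for this sketch, not something the thread specifies.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of an int <-> long ID mapping that is built during
// preprocessing and consulted again when predictions are written out.
final class IdMappingRoundTrip {
  private final Map<Integer, Long> indexToId = new HashMap<>();

  // Assumed hash: fold 64 bits into 32 and clear the sign bit.
  static int idToIndex(long id) {
    return 0x7FFFFFFF & Long.hashCode(id);
  }

  // Preprocessing side: hash the long ID and remember the reverse mapping.
  int register(long id) {
    int index = idToIndex(id);
    Long previous = indexToId.putIfAbsent(index, id);
    if (previous != null && previous != id) {
      // Two different long IDs hashed to the same int; this sketch refuses to continue.
      throw new IllegalStateException("Collision: " + previous + " and " + id + " -> " + index);
    }
    return index;
  }

  // Prediction side: translate an internal index back to the original long ID.
  long toLongId(int index) {
    Long id = indexToId.get(index);
    if (id == null) {
      throw new IllegalArgumentException("Unknown index " + index);
    }
    return id;
  }
}
```

Because a 64-bit ID space is squeezed into 31 bits, distinct IDs can share an index; any real implementation has to decide how to treat that case.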
[jira] [Commented] (MAHOUT-974) org.apache.mahout.cf.taste.hadoop.als.ParallelALSFactorizationJob use integer as userId and itemId
[ https://issues.apache.org/jira/browse/MAHOUT-974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13672172#comment-13672172 ]

Saikat Kanjilal commented on MAHOUT-974:
---

Yes, although I could use some general guidance, being a newbie on this codebase. I've not had time to research this further. Can you respond to my comments above?

Thanks
[jira] [Commented] (MAHOUT-974) org.apache.mahout.cf.taste.hadoop.als.ParallelALSFactorizationJob use integer as userId and itemId
[ https://issues.apache.org/jira/browse/MAHOUT-974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13672163#comment-13672163 ]

Sebastian Schelter commented on MAHOUT-974:
---

Saikat, are you still on this?
[jira] [Commented] (MAHOUT-974) org.apache.mahout.cf.taste.hadoop.als.ParallelALSFactorizationJob use integer as userId and itemId
[ https://issues.apache.org/jira/browse/MAHOUT-974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13654553#comment-13654553 ]

Saikat Kanjilal commented on MAHOUT-974:
---

Thanks for the update, I'll look into this. I'm guessing the fix needs to be made inside the TasteHadoopUtils class.
[jira] [Commented] (MAHOUT-974) org.apache.mahout.cf.taste.hadoop.als.ParallelALSFactorizationJob use integer as userId and itemId
[ https://issues.apache.org/jira/browse/MAHOUT-974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13654376#comment-13654376 ]

Angel Martinez Gonzalez commented on MAHOUT-974:
---

Hi Saikat,

I think the mapping is done in ItemIDIndexMapper, which in turn calls TasteHadoopUtils.idToIndex.
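The thread does not show the body of TasteHadoopUtils.idToIndex. The sketch below only illustrates the general idea of turning a long ID into a non-negative int index; the specific hash (folding the upper and lower 32 bits together and clearing the sign bit) is an assumption, and the class name is made up.

```java
// Hypothetical stand-in for a long -> int index function; not Mahout's code.
final class IdToIndexSketch {

  // Long.hashCode XORs the upper and lower 32 bits of the value; masking with
  // 0x7FFFFFFF clears the sign bit so the result can serve as an index.
  static int idToIndex(long id) {
    return 0x7FFFFFFF & Long.hashCode(id);
  }

  public static void main(String[] args) {
    long small = 42L;
    long large = (1L << 32) | 43L; // a different ID chosen to land on the same index
    System.out.println(idToIndex(small)); // 42
    System.out.println(idToIndex(large)); // 42
  }
}
```

Both calls print 42: IDs that fit in 31 bits keep their value, but larger IDs can collide with them, which is why the reverse mapping has to be stored rather than recomputed.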
[jira] [Commented] (MAHOUT-974) org.apache.mahout.cf.taste.hadoop.als.ParallelALSFactorizationJob use integer as userId and itemId
[ https://issues.apache.org/jira/browse/MAHOUT-974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13630727#comment-13630727 ]

Saikat Kanjilal commented on MAHOUT-974:
---

I am reading through the PreparePreferenceMatrixJob and I was wondering if, by mapping from longs to ints, you're referring to the following lines of code:

    //convert items to an internal index
    Job itemIDIndex = prepareJob(getInputPath(), getOutputPath(ITEMID_INDEX), TextInputFormat.class,
        ItemIDIndexMapper.class, VarIntWritable.class, VarLongWritable.class, ItemIDIndexReducer.class,
        VarIntWritable.class, VarLongWritable.class, SequenceFileOutputFormat.class);
    itemIDIndex.setCombinerClass(ItemIDIndexReducer.class);
    boolean succeeded = itemIDIndex.waitForCompletion(true);
    if (!succeeded) {
      return -1;
    }
    //convert user preferences into a vector per user
    Job toUserVectors = prepareJob(getInputPath(), getOutputPath(USER_VECTORS), TextInputFormat.class,
        ToItemPrefsMapper.class, VarLongWritable.class,
        booleanData ? VarLongWritable.class : EntityPrefWritable.class,
        ToUserVectorsReducer.class, VarLongWritable.class, VectorWritable.class,
        SequenceFileOutputFormat.class);

Pardon my ignorance, as this is my first time looking at this code. I don't see any other parts of this class resembling a mapping. Also, Sebastian, I'm wondering whether the mapping itself needs to be present in mahout-core so that multiple jobs can leverage it.
[jira] [Commented] (MAHOUT-974) org.apache.mahout.cf.taste.hadoop.als.ParallelALSFactorizationJob use integer as userId and itemId
[ https://issues.apache.org/jira/browse/MAHOUT-974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13627925#comment-13627925 ]

Sebastian Schelter commented on MAHOUT-974:
---

I didn't do any work on it, but it could be a good starter project. You basically have to create a mapping for both user and item ids, which must also be used by related jobs like the RecommenderJob for ALS and the one that evaluates the error of a factorization.
[jira] [Commented] (MAHOUT-974) org.apache.mahout.cf.taste.hadoop.als.ParallelALSFactorizationJob use integer as userId and itemId
[ https://issues.apache.org/jira/browse/MAHOUT-974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13627881#comment-13627881 ]

Saikat Kanjilal commented on MAHOUT-974:
---

Sebastian,

Is this something I can help with? I don't see a patch, so I am not sure where you are with the fix. Let me know.

Regards
[jira] [Commented] (MAHOUT-974) org.apache.mahout.cf.taste.hadoop.als.ParallelALSFactorizationJob use integer as userId and itemId
[ https://issues.apache.org/jira/browse/MAHOUT-974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13215327#comment-13215327 ]

Han Hui Wen commented on MAHOUT-974:
---

org.apache.mahout.cf.taste.hadoop.preparation.PreparePreferenceMatrixJob only indexes itemId, not userId. It also converts user preferences into a vector per user and builds the rating matrix.

> Affects Versions: 0.6
> Assignee: Sean Owen
> Fix For: 0.7
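To make the asymmetry in this comment concrete, here is a small sketch in which user IDs stay as longs (they are only keys) while item IDs are turned into int indices inside each user's vector. It is an illustration under assumptions, not PreparePreferenceMatrixJob itself: the class name, the parallel-array input, the nested maps standing in for sparse vectors, and the hash are all invented for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: one sparse "vector" per user, keyed by the raw long user ID,
// with entries addressed by an int item index.
final class UserVectorSketch {

  // Assumed hash: fold 64 bits into 32 and clear the sign bit.
  static int idToIndex(long id) {
    return 0x7FFFFFFF & Long.hashCode(id);
  }

  static Map<Long, Map<Integer, Float>> toUserVectors(long[] userIds, long[] itemIds, float[] prefs) {
    Map<Long, Map<Integer, Float>> vectors = new HashMap<>();
    for (int i = 0; i < userIds.length; i++) {
      vectors.computeIfAbsent(userIds[i], k -> new HashMap<>()) // user ID is not indexed
             .put(idToIndex(itemIds[i]), prefs[i]);             // item ID is indexed
    }
    return vectors;
  }
}
```

A vector position has to be an int, which is why only the item side needs indexing in this layout; a factorization that also treats users as matrix rows needs the same treatment for user IDs.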
[jira] [Commented] (MAHOUT-974) org.apache.mahout.cf.taste.hadoop.als.ParallelALSFactorizationJob use integer as userId and itemId
[ https://issues.apache.org/jira/browse/MAHOUT-974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13202195#comment-13202195 ]

Sebastian Schelter commented on MAHOUT-974:
---

You are right. The item ID indexing (from longs to ints) that already exists in org.apache.mahout.cf.taste.hadoop.preparation.PreparePreferenceMatrixJob should be built into ParallelALSFactorizationJob too.

> Affects Versions: 0.6
> Assignee: Sean Owen
> Fix For: 0.7