Re: recommenditembased returns 0 records from last map-reduce job

Pat Ferrel Fri, 25 Jul 2014 12:58:29 -0700

I think I did explain below. Your user IDs must be in the range from 0 to the
number of rows - 1, and the same goes for item IDs (0 to the number of columns
- 1). This is done by taking your application-specific IDs and mapping them to
sequential non-negative integers. You need to maintain a mapping to/from
Mahout IDs somewhere in your own code.


For example imagine input of the form
-92, abc, 1.0
75000x, jkl, 2.0

Your first user ID is -92, so give it Mahout ID = 0. Your next user ID is
75000x, so give it Mahout ID = 1.
Your first item ID is abc, so give it Mahout ID = 0. Your next item ID is jkl,
so give it Mahout ID = 1.
Keep doing this each time you see a new unique ID in your input, and so on. A
Map will do this for you.

Then the input to Mahout would be:
0,0,1.0
1,1,2.0

The output will have Mahout IDs too so you need to map recommendations for 
Mahout User ID 0 back to your User ID of -92, and the same for all item IDs.
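
A minimal sketch of the mapping described above, assuming comma-separated
"user, item, preference" lines as in the example. The class and method names
(IdDictionary, IdMappingExample, translate) are made up for illustration and
are not part of Mahout. The thread mentions Guava's HashBiMap for this; the
sketch uses plain java.util collections only so that it has no dependencies.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper (not a Mahout class): hands out sequential IDs 0, 1, 2, ...
// to application IDs in first-seen order and supports lookup in both directions.
class IdDictionary {
    private final Map<String, Integer> toMahout = new HashMap<>();
    private final List<String> toApp = new ArrayList<>();

    // Returns the Mahout ID for appId, assigning the next free one on first sight.
    int mahoutId(String appId) {
        Integer id = toMahout.get(appId);
        if (id == null) {
            id = toApp.size();
            toMahout.put(appId, id);
            toApp.add(appId);
        }
        return id;
    }

    // Translates a Mahout ID found in the job output back to the application ID.
    String appId(int mahoutId) {
        return toApp.get(mahoutId);
    }

    int size() {
        return toApp.size();
    }
}

class IdMappingExample {
    // Rewrites "user, item, pref" lines into "mahoutUser,mahoutItem,pref" lines.
    static List<String> translate(List<String> lines, IdDictionary users, IdDictionary items) {
        List<String> out = new ArrayList<>();
        for (String line : lines) {
            String[] f = line.split(",");
            int u = users.mahoutId(f[0].trim());
            int i = items.mahoutId(f[1].trim());
            out.add(u + "," + i + "," + f[2].trim());
        }
        return out;
    }

    public static void main(String[] args) {
        IdDictionary users = new IdDictionary();
        IdDictionary items = new IdDictionary();
        List<String> input = Arrays.asList("-92, abc, 1.0", "75000x, jkl, 2.0");
        for (String line : translate(input, users, items)) {
            System.out.println(line);          // Mahout-ready input lines
        }
        System.out.println(users.appId(0));    // Mahout user 0 mapped back to the app ID
    }
}
```

Users and items get separate dictionaries because each ID space starts at 0
independently. For real data volumes the dictionaries would have to be
persisted alongside the job input so that the output can be translated back
afterwards; that part is left out here.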


On Jul 25, 2014, at 11:55 AM, Serega Sheypak <serega.shey...@gmail.com> wrote:

I'm preparing the data using Apache Hive: user_id:long, item_id:long,
preference[1.0, 2.0]
I don't understand "For most Mahout jobs you have to prepare your data to
have Mahout IDs". What are "Mahout IDs"? I tried to follow the Mahout site
docs, but I didn't find anything there related to Mahout IDs.
Please explain.


2014-07-25 22:39 GMT+04:00 Pat Ferrel <pat.fer...@gmail.com>:

> Sorry I haven’t read this thread carefully but it looks like you may be
> using the wrong IDs.
> 
> For most Mahout jobs you have to prepare your data to have Mahout IDs. You
> do this by looking at each datum, and as you see a new unique
> application-specific user or item ID you give it a Mahout ID, starting from
> 0. So Mahout IDs can be thought of as row and column numbers in a matrix.
> The Mahout IDs for rows will be 0 through (# of rows - 1), and the same for
> columns.
> 
> This always requires that you translate into Mahout IDs then after the job
> is run translate back into your application IDs. You need a bi-directional
> dictionary of some type. I use a HashBiMap from Guava.
> 
> Also I’d avoid the threshold for now. If you get that wrong it will mess
> things up badly and is very hard to tune. It’s there for completeness but I
> never use it.
> 
> 
> On Jul 25, 2014, at 12:55 AM, Serega Sheypak <serega.shey...@gmail.com>
> wrote:
> 
> Hi, nothing helps...
> I do use mahout 0.9 compiled for CDH 4.7
> I do provide only positive values
> I do use itemsimilarityJob and do get 2000 similarities for 1400 unique
> items
> Input data is:
> 16*10^6 preferences
> 4*10^6 users
> 0.6*10^ items
> I do use Pearson correlation and preference values are: 1.0 and 2.0
> 
> 
> 2014-07-22 9:32 GMT+04:00 Serega Sheypak <serega.shey...@gmail.com>:
> 
>> Ok, I have recompiled mahout 0.9 for CDH 4.7. I'll try this evening.
>> Right now I don't see how it can help me. As far as I know the stuff I try
>> to use is pretty old and stable.
>> Looks like I apply it in the wrong way.
>> 
>> There is an option for recommenditembased named "--threshold". I do
>> provide data for recommenditembased with preference values in range
>> [1.1..2.0].
>> I set --threshold to 1.2
>> Is --threshold absolute, so it can be in [1.1 .. 2+], or is it relative, so
>> it can be in [0.0 .. 0.99999]?
>> 
>> 
>> 2014-07-22 3:54 GMT+04:00 Ted Dunning <ted.dunn...@gmail.com>:
>> 
>>> That version is no longer supported.  You should upgrade to 0.9
>>> 
>>> 
>>> 
>>> 
>>> On Mon, Jul 21, 2014 at 11:41 AM, Serega Sheypak <
>>> serega.shey...@gmail.com>
>>> wrote:
>>> 
>>>> 0.7-cdh4.7.0
>>>> Anyway, recommenditembased does produce these catalogs:
>>>> 
>>>> /recommenditembased/temp/maxValues.bin
>>>> /recommenditembased/temp/norms.bin
>>>> /recommenditembased/temp/numNonZeroEntries.bin
>>>> /recommenditembased/temp/pairwiseSimilarity
>>>> /recommenditembased/temp/partialMultiply
>>>> /recommenditembased/temp/prePartialMultiply1
>>>> /recommenditembased/temp/prePartialMultiply2
>>>> /recommenditembased/temp/preparePreferenceMatrix
>>>> /recommenditembased/temp/similarityMatrix
>>>> /recommenditembased/temp/weights
>>>> 
>>>> I suppose that "/recommenditembased/temp/similarityMatrix" is the thing
>>>> I need. Right now I try to read it using
>>>> 
>>>> matrix = LOAD '/recommenditembased/temp/similarityMatrix' USING
>>>> com.twitter.elephantbird.pig.load.SequenceFileLoader(
>>>>   '-c com.twitter.elephantbird.pig.util.IntWritableConverter',
>>>>   '-c com.twitter.elephantbird.pig.mahout.VectorWritableConverter'
>>>> )  as (intId: int, vector:tuple(cardinality:int,
>>>> entries:bag{t:tuple(some_id:long, some_value:double)}));
>>>> 
>>>> 
>>>> Looks like the vector is empty... Or I am doing something wrong.
>>>> 
>>>> 
>>>> 
>>>> 2014-07-21 22:09 GMT+04:00 Ted Dunning <ted.dunn...@gmail.com>:
>>>> 
>>>>> Which version of Mahout?
>>>>> 
>>>>> 
>>>>> On Mon, Jul 21, 2014 at 11:05 AM, Serega Sheypak <serega.shey...@gmail.com>
>>>>> wrote:
>>>>> 
>>>>>> Hi, I've tried: Unexpected --outputPathForSimilarityMatrix while processing
>>>>>> Job-Specific
>>>>>>
>>>>>> sudo -u hdfs hadoop fs -rm -r hdfs://nameservice1/recommenditembased/output
>>>>>> sudo -u hdfs hadoop fs -rm -r hdfs://nameservice1/recommenditembased/temp
>>>>>> sudo -u oozie mahout recommenditembased \
>>>>>>   --input hdfs://nameservice1/user/hive/warehouse/staging_weighted_visits_and_rec_clicks \
>>>>>>   --output hdfs://nameservice1/recommenditembased/output \
>>>>>>   --similarityClassname SIMILARITY_LOGLIKELIHOOD \
>>>>>>   --numRecommendations 500 \
>>>>>>   --booleanData false \
>>>>>>   --maxPrefsPerUser 1000 \
>>>>>>   --maxSimilaritiesPerItem 1000 \
>>>>>>   --minPrefsPerUser 5 \
>>>>>>   --maxPrefsPerUserInItemSimilarity 30 \
>>>>>>   --threshold 1.1 \
>>>>>>   --tempDir hdfs://nameservice1/recommenditembased/temp \
>>>>>>   --outputPathForSimilarityMatrix hdfs://nameservice1/recommenditembased/sim_matrix
>>>>>> 
>>>>>> I'm on Cloudera cdh 4.7, looks like this feature is not supported.
>>>>>> 
>>>>>> 
>>>>>> 2014-07-21 11:18 GMT+04:00 Peng Zhang <pzhang.x...@gmail.com>:
>>>>>> 
>>>>>>> Serega,
>>>>>>> 
>>>>>>> See the last line on how to pass outputPathForSimilarityMatrix options to
>>>>>>> the recommenditembased command:
>>>>>>> 
>>>>>>> sudo -u oozie mahout recommenditembased \
>>>>>>>                  --input visited_items_with_inverted_items \
>>>>>>>                  --output result \
>>>>>>>                  --similarityClassname SIMILARITY_LOGLIKELIHOOD \
>>>>>>>                  --usersFile inverted_items \
>>>>>>>                  --numRecommendations 500 \
>>>>>>>                  --booleanData false \
>>>>>>>                  --maxPrefsPerUser 100 \
>>>>>>>                  --maxSimilaritiesPerItem 500 \
>>>>>>>                  --minPrefsPerUser 0 \
>>>>>>>                  --maxPrefsPerUserInItemSimilarity 30 \
>>>>>>>                  --threshold 0.91 \
>>>>>>>                  --tempDir  temp \
>>>>>>>                  --outputPathForSimilarityMatrix similarityMatri \
>>>>>>> 
>>>>>>> 
>>>>>>> Peng Zhang
>>>>>>> pzhang.x...@gmail.com
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> On Jul 21, 2014, at 3:09 PM, Serega Sheypak <serega.shey...@gmail.com>
>>>>>>> wrote:
>>>>>>> 
>>>>>>>> I've inspected the code; our approach wouldn't work with
>>>>>>>> booleanData=false.
>>>>>>>> We calculate item similarity in the wrong way...(((
>>>>>>>> Thank you
>>>>>>>> 1. We provide a "fake" user_id and provide --usersFile in order to get
>>>>>>>> recommendations for the "fake" user_id, where user_id is a negative
>>>>>>>> item_id. It worked when we provided user_id->item_id pairs without
>>>>>>>> preference.
>>>>>>>> 2. Our target is to get item similarities. We tried
>>>>>>>> org.apache.mahout.cf.taste.hadoop.similarity.item.ItemSimilarityJob but
>>>>>>>> it returns bad results compared to RecommenderJob with our "fake"
>>>>>>>> user_id (inverted item_id)
>>>>>>>>
>>>>>>>> 1. I'll try the option you provided.
>>>>>>>> 2. I will remove the input with fake user_id and the usersFile with
>>>>>>>> these fake ids
>>>>>>>>
>>>>>>>> 3.
>>>>>>>> https://github.com/apache/mahout/blob/master/mrlegacy/src/main/java/org/apache/mahout/cf/taste/hadoop/item/RecommenderJob.java
>>>>>>>> I don't understand how to pass the --outputPathForSimilarityMatrix
>>>>>>>> option to RecommenderJob
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 2014-07-21 4:58 GMT+04:00 Peng Zhang <pzhang.x...@gmail.com>:
>>>>>>>> 
>>>>>>>>> Serega,
>>>>>>>>>
>>>>>>>>> I have two comments:
>>>>>>>>> 1. Don’t use negative user ids. Since Mahout uses the user id as well
>>>>>>>>> as the item id as the row/column index, you’d better use 0, 1, 2, etc.
>>>>>>>>> as ids
>>>>>>>>> 2. If you want to get the item similarity information, you can use
>>>>>>>>> --outputPathForSimilarityMatrix in the command
>>>>>>>>> 
>>>>>>>>> Regards,
>>>>>>>>> Peng Zhang
>>>>>>>>> M: +86 186-1658-7856
>>>>>>>>> pzhang.x...@gmail.com
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> On Jul 21, 2014, at 4:00 AM, Serega Sheypak <serega.shey...@gmail.com>
>>>>>>>>> wrote:
>>>>>>>>> 
>>>>>>>>>> All bad things happen here:
>>>>>>>>>>
>>>>>>>>>> Name: RecommenderJob-PartialMultiplyMapper-Reducer
>>>>>>>>>> User: oozie
>>>>>>>>>> Process User: oozie
>>>>>>>>>> Group: oozie
>>>>>>>>>> Mapper Class: PartialMultiplyMapper
>>>>>>>>>> Reducer Class: AggregateAndRecommendReducer
>>>>>>>>>> Job Input Directory: hdfs://nameservice1/itemrec/temp/partialMultiply
>>>>>>>>>> Job Output Directory: hdfs://nameservice1/itemrec/output/
>>>>>>>>>>
>>>>>>>>>> 14/07/20 23:57:47 INFO mapred.JobClient:     Map input records=3312879
>>>>>>>>>> 14/07/20 23:57:47 INFO mapred.JobClient:     Map output records=3313251
>>>>>>>>>> 14/07/20 23:57:47 INFO mapred.JobClient:     Reduce input records=3313251
>>>>>>>>>> 14/07/20 23:57:47 INFO mapred.JobClient:     Reduce output records=0
>>>>>>>>>>
>>>>>>>>>> Why does mahout return 0 rows? It works when booleanData=true
>>>>>>>>>> (preferences are ignored...?)
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 2014-07-20 23:19 GMT+04:00 Serega Sheypak <serega.shey...@gmail.com>:
>>>>>>>>>> 
>>>>>>>>>>> the version is: CDH-4.7.0-1.cdh4.7.0.p0.40
>>>>>>>>>>> users_file:
>>>>>>>>>>> --inverted_item_id
>>>>>>>>>>> -1
>>>>>>>>>>> -2
>>>>>>>>>>> -3
>>>>>>>>>>> -4
>>>>>>>>>>> 
>>>>>>>>>>> users_items_prefs
>>>>>>>>>>> --inverted item_id
>>>>>>>>>>> -1 1 1.0
>>>>>>>>>>> -2 2 1.0
>>>>>>>>>>> -3 3 1.0
>>>>>>>>>>> -4 4 1.0
>>>>>>>>>>> --user_id item_id pref_value
>>>>>>>>>>> 11   1 1.6
>>>>>>>>>>> 11   2 1.6
>>>>>>>>>>> 123 3 2.0
>>>>>>>>>>> 123 4 2.0
>>>>>>>>>>> 333 1 2.0
>>>>>>>>>>> 333 2 1.6
>>>>>>>>>>> --e.t.c.
>>>>>>>>>>> 
>>>>>>>>>>> if I set --booleanData true
>>>>>>>>>>> then mahout returns the result.
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 2014-07-20 23:12 GMT+04:00 Andrew Musselman <andrew.mussel...@gmail.com>:
>>>>>>>>>>> 
>>>>>>>>>>>> I'm confused about how you're constructing the user file, and why
>>>>>>>>>>>> there are negated item ids here.
>>>>>>>>>>>>
>>>>>>>>>>>> Can you post some more details please, including Mahout version and
>>>>>>>>>>>> some sample data sets?
>>>>>>>>>>>> 
>>>>>>>>>>>> On Jul 20, 2014, at 11:57 AM, Serega Sheypak <serega.shey...@gmail.com>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Hi, I'm trying to create item similarity.
>>>>>>>>>>>>> I gather items which users visit during shopping and then create a
>>>>>>>>>>>>> file:
>>>>>>>>>>>>> user_id, item_id, weight (where weight can be: [1.0, 1.6, 1.9],
>>>>>>>>>>>>> depends on user action type and data source)
>>>>>>>>>>>>> UNION
>>>>>>>>>>>>> -item_id, item_id, 1.0 (from items dictionary)
>>>>>>>>>>>>>
>>>>>>>>>>>>> and I do provide a userFile, where user_id = -item_id
>>>>>>>>>>>>>
>>>>>>>>>>>>> The idea is to get item similarity. If any user visits an item named
>>>>>>>>>>>>> "A", I want to show him items "B", "c", "xxx" using the preferences
>>>>>>>>>>>>> of other users.
>>>>>>>>>>>>>
>>>>>>>>>>>>> The problem is that the last (???) mapreduce job returns 0 rows:
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Here are my settings:
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> sudo -u oozie mahout recommenditembased \
>>>>>>>>>>>>>                --input visited_items_with_inverted_items \
>>>>>>>>>>>>>                --output result \
>>>>>>>>>>>>>                --similarityClassname SIMILARITY_LOGLIKELIHOOD \
>>>>>>>>>>>>>                --usersFile inverted_items \
>>>>>>>>>>>>>                --numRecommendations 500 \
>>>>>>>>>>>>>                --booleanData false \
>>>>>>>>>>>>>                --maxPrefsPerUser 100 \
>>>>>>>>>>>>>                --maxSimilaritiesPerItem 500 \
>>>>>>>>>>>>>                --minPrefsPerUser 0 \
>>>>>>>>>>>>>                --maxPrefsPerUserInItemSimilarity 30 \
>>>>>>>>>>>>>                --threshold 0.91 \
>>>>>>>>>>>>>                --tempDir  temp \
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Some counters... I don't get what they mean....
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 14/07/20 22:43:08 INFO mapred.JobClient: org.apache.mahout.cf.taste.hadoop.item.ToUserVectorsReducer$Counters
>>>>>>>>>>>>> 14/07/20 22:43:08 INFO mapred.JobClient:     USERS=7528530
>>>>>>>>>>>>>
>>>>>>>>>>>>> 14/07/20 22:43:43 INFO mapred.JobClient: org.apache.mahout.cf.taste.hadoop.preparation.ToItemVectorsMapper$Elements
>>>>>>>>>>>>> 14/07/20 22:43:43 INFO mapred.JobClient:     USER_RATINGS_NEGLECTED=1,798,738
>>>>>>>>>>>>> 14/07/20 22:43:43 INFO mapred.JobClient:     USER_RATINGS_USED=12,429,693
>>>>>>>>>>>>>
>>>>>>>>>>>>> 14/07/20 22:44:24 INFO mapred.JobClient: org.apache.mahout.math.hadoop.similarity.cooccurrence.RowSimilarityJob$Counters
>>>>>>>>>>>>> 14/07/20 22:44:24 INFO mapred.JobClient:     ROWS=3312879
>>>>>>>>>>>>>
>>>>>>>>>>>>> 14/07/20 22:45:18 INFO mapred.JobClient: org.apache.mahout.math.hadoop.similarity.cooccurrence.RowSimilarityJob$Counters
>>>>>>>>>>>>> 14/07/20 22:45:18 INFO mapred.JobClient:     COOCCURRENCES=35882374
>>>>>>>>>>>>> 14/07/20 22:45:18 INFO mapred.JobClient:     PRUNED_COOCCURRENCES=0
>>>>>>>>>>>>>
>>>>>>>>>>>>> 14/07/20 22:46:00 INFO mapred.JobClient:     Map input records=3312879
>>>>>>>>>>>>> 14/07/20 22:46:00 INFO mapred.JobClient:     Map output records=17570268
>>>>>>>>>>>>> 14/07/20 22:46:00 INFO mapred.JobClient:     Reduce input records=5221907
>>>>>>>>>>>>> 14/07/20 22:46:00 INFO mapred.JobClient:     Reduce output records=3312879
>>>>>>>>>>>>>
>>>>>>>>>>>>> 14/07/20 22:46:34 INFO mapred.JobClient:     Reduce input records=3312879
>>>>>>>>>>>>> 14/07/20 22:46:34 INFO mapred.JobClient:     Reduce output records=3312879
>>>>>>>>>>>>> 14/07/20 22:46:34 INFO mapred.JobClient:     Reduce input records=3312879
>>>>>>>>>>>>> 14/07/20 22:46:34 INFO mapred.JobClient:     Reduce output records=3312879
>>>>>>>>>>>>>
>>>>>>>>>>>>> 14/07/20 22:47:06 INFO mapred.JobClient:     Map input records=7528530
>>>>>>>>>>>>> 14/07/20 22:47:06 INFO mapred.JobClient:     Map output records=3313251
>>>>>>>>>>>>> 14/07/20 22:47:06 INFO mapred.JobClient:     Reduce input records=3313251
>>>>>>>>>>>>> 14/07/20 22:47:06 INFO mapred.JobClient:     Reduce output records=3313251
>>>>>>>>>>>>>
>>>>>>>>>>>>> 14/07/20 22:47:40 INFO mapred.JobClient:     Map input records=6626130
>>>>>>>>>>>>> 14/07/20 22:47:40 INFO mapred.JobClient:     Map output records=6626130
>>>>>>>>>>>>> 14/07/20 22:47:40 INFO mapred.JobClient:     Reduce input records=6626130
>>>>>>>>>>>>> 14/07/20 22:47:40 INFO mapred.JobClient:     Reduce output records=3312879
>>>>>>>>>>>>>
>>>>>>>>>>>>> 14/07/20 22:48:26 INFO mapred.JobClient:     Map input records=3312879
>>>>>>>>>>>>> 14/07/20 22:48:26 INFO mapred.JobClient:     Map output records=3313251
>>>>>>>>>>>>> 14/07/20 22:48:26 INFO mapred.JobClient:     Reduce input records=3313251
>>>>>>>>>>>>>
>>>>>>>>>>>>> --------
>>>>>>>>>>>>> 14/07/20 22:48:26 INFO mapred.JobClient:     Reduce output records=0
>>>>>>>>>>>>> --------
>>>>>>>>>>>>> 
>>>>>>>>>>>>> why 0???
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>> 
>>>> 
>>> 
>> 
>> 
> 
>
