Re: Nested map reduce job

2012-05-05 Thread Shi Yu
A quick glance at your problem suggests a design problem in your code. In my opinion you should avoid nested Map/Reduce jobs. You could chain Map/Reduce jobs, but a nested or recursive structure is not recommended. I don't know how you implemented your nested M/R job, ma

RE: Nested map reduce job

2012-05-05 Thread Mingxi Wu
You may not need a nested map-reduce job. All you need to do is use keys to partition the permutations and duplicate the data in the mapper: output.collect(1, value); output.collect(2, value); . . . output.collect(n, value); Then set the number of reducers to n. When you emit data in the mapper, th
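
A minimal sketch of the fan-out idea above, in plain Java without Hadoop so it can be run on its own. The class and method names (FanOutSketch, map, shuffle) are hypothetical, and the shuffle step only imitates what the framework would do when the reducer count is set to n; it is an illustration under those assumptions, not the poster's actual code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class FanOutSketch {
    // Stand-in for the mapper: emit the same value once under each key 1..n.
    static List<Map.Entry<Integer, String>> map(String value, int n) {
        List<Map.Entry<Integer, String>> out = new ArrayList<>();
        for (int key = 1; key <= n; key++) {
            // Plays the role of output.collect(key, value) in the post.
            out.add(Map.entry(key, value));
        }
        return out;
    }

    // Stand-in for the shuffle: group emitted values by key.
    // With n reducers, each group would be handled by its own reducer.
    static Map<Integer, List<String>> shuffle(List<String> values, int n) {
        Map<Integer, List<String>> groups = new TreeMap<>();
        for (String v : values) {
            for (Map.Entry<Integer, String> e : map(v, n)) {
                groups.computeIfAbsent(e.getKey(), k -> new ArrayList<>())
                      .add(e.getValue());
            }
        }
        return groups;
    }

    public static void main(String[] args) {
        // Every key ends up with a full copy of the input values.
        System.out.println(shuffle(List.of("a", "b", "c"), 3));
    }
}
```

Each key receives a full copy of the data, so each reducer can work on its own slice of the permutation space, chosen by its key. The cost is that the map output grows by a factor of n.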

Nested map reduce job

2012-05-05 Thread venkataswamy
Hi, I encountered a strange issue while developing a system. I have data where a reducer receives about 3 million values. The reducer emits all the permutations of the values. Reducer{ List FindPermutations(List) foreach( permutation ) emit( key, permutation ) } It is feasible t

cannot use a map side join to merge the output of multiple map side joins

2012-05-05 Thread Jim Donofrio
I am trying to use a map side join to merge the output of multiple map side joins. This fails because of the code below in JobClient.writeOldSplits, which reorders the splits from largest to smallest. Why is that done? Is it so that the largest split, which will take the longest, gets proce
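
To illustrate the reordering described in this post, here is a small sketch in plain Java. It is not the Hadoop source: the Split class, the file names, and the sizes are all made up for the example, and it only shows how sorting by size can break an ordering that a later step may have relied on.

```java
import java.util.ArrayList;
import java.util.List;

public class SplitOrderSketch {
    // Hypothetical stand-in for an input split: a name and a length in bytes.
    static final class Split {
        final String name;
        final long length;

        Split(String name, long length) {
            this.name = name;
            this.length = length;
        }
    }

    // Return the split names after sorting from largest to smallest.
    static List<String> largestFirst(List<Split> splits) {
        List<Split> sorted = new ArrayList<>(splits);
        sorted.sort((a, b) -> Long.compare(b.length, a.length));
        List<String> names = new ArrayList<>();
        for (Split s : sorted) {
            names.add(s.name);
        }
        return names;
    }

    public static void main(String[] args) {
        List<Split> splits = List.of(
            new Split("part-00000", 10L),
            new Split("part-00001", 30L),
            new Split("part-00002", 20L));
        // The size-sorted order no longer matches the partition order.
        System.out.println(largestFirst(splits));
    }
}
```

If a downstream join assumes that the i-th split corresponds to the i-th partition, that assumption would no longer hold once the splits are sorted by size.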