RE: how to solve reducer memory problem?

java8964 Thu, 03 Apr 2014 07:40:53 -0700

Several issues could be at play here. Since only you know your data, we can 
only guess:
1) The mapred.child.java.opts=-Xmx2g setting only takes effect IF you didn't set 
"mapred.map.child.java.opts" or "mapred.reduce.child.java.opts"; otherwise, the 
latter override "mapred.child.java.opts". So double-check the settings and 
make sure the reducers really did get a 2G heap as you want.
2) In your implementation, you could hit an OOM as you store more and more data 
into "TrainingWeights result". So the question is: for each "Reducer group", or 
"Key", how much data could there be? If a key could contain big values, then 
all of these values will be saved in the memory of the "result" instance. That 
will require a lot of memory. If so, either you have to have that much memory, 
or redesign your key to be more fine-grained, so that each group requires less 
memory.
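
To make the precedence in point 1 concrete, here is a minimal plain-Java sketch of the lookup order. It does not call Hadoop: the helper name effectiveReduceOpts and the Map standing in for the job configuration are made up for illustration, and the -Xmx200m fallback is an assumption about the stock default on this Hadoop version.

```java
import java.util.HashMap;
import java.util.Map;

// Plain-Java model of the lookup order described above; it does not call Hadoop.
class ChildOptsSketch {
    // Hypothetical helper: returns the JVM options a reduce task would end up with.
    static String effectiveReduceOpts(Map<String, String> conf) {
        // Assumed fallback when the generic key is not set at all.
        String generic = conf.getOrDefault("mapred.child.java.opts", "-Xmx200m");
        // The task-specific key wins whenever it is present.
        return conf.getOrDefault("mapred.reduce.child.java.opts", generic);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("mapred.child.java.opts", "-Xmx2g");
        System.out.println(effectiveReduceOpts(conf)); // -Xmx2g

        // Once the reduce-specific key is set, the generic one is ignored.
        conf.put("mapred.reduce.child.java.opts", "-Xmx512m");
        System.out.println(effectiveReduceOpts(conf)); // -Xmx512m
    }
}
```

In a real job, the quickest check is probably to look for these three keys in the job's configuration page or job.xml and see which ones are actually set.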
Yong
Date: Thu, 3 Apr 2014 17:53:57 +0800
Subject: Re: how to solve reducer memory problem?
From: fancye...@gmail.com
To: user@hadoop.apache.org


you can think of each TrainingWeights as a very large double[] whose length 
is about 10,000,000

    TrainingWeights result = null;
    int total = 0;
    for (TrainingWeights weights : values) {
        if (result == null) {
            result = weights;
        } else {
            addWeights(result, weights);
        }
        total++;
    }
    if (total > 1) {
        divideWeights(result, total);
    }
    context.write(NullWritable.get(), result);
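
For reference, the same running average can be sketched in plain Java with a double[] standing in for TrainingWeights (the class and method names here are made up for illustration). A 10,000,000-element double[] is roughly 80 MB (10,000,000 x 8 bytes), and the loop holds one accumulator plus the current value. The sketch copies the first value instead of keeping a reference to it, on the assumption that the framework may reuse the value object between iterations, which Hadoop's reduce iterator is documented to do.

```java
import java.util.Arrays;
import java.util.List;

// Plain-Java sketch; double[] stands in for TrainingWeights.
class AverageSketch {
    // Averages the arrays element-wise while holding only one accumulator.
    static double[] average(Iterable<double[]> values) {
        double[] sum = null;
        int total = 0;
        for (double[] weights : values) {
            if (sum == null) {
                // Copy instead of aliasing: the caller may reuse 'weights'.
                sum = Arrays.copyOf(weights, weights.length);
            } else {
                for (int i = 0; i < sum.length; i++) {
                    sum[i] += weights[i];
                }
            }
            total++;
        }
        if (sum != null && total > 1) {
            for (int i = 0; i < sum.length; i++) {
                sum[i] /= total;
            }
        }
        return sum; // null when there were no values
    }

    public static void main(String[] args) {
        List<double[]> values =
                Arrays.asList(new double[] {1.0, 2.0}, new double[] {3.0, 6.0});
        System.out.println(Arrays.toString(average(values))); // [2.0, 4.0]
    }
}
```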


On Thu, Apr 3, 2014 at 5:49 PM, Gordon Wang <gw...@gopivotal.com> wrote:

What does the reducer do? Do you have any memory-intensive work in the 
reducer (e.g. caching a lot of data in memory)? I guess the OOM error comes 
from your code in the reducer.




On Thu, Apr 3, 2014 at 5:10 PM, Li Li <fancye...@gmail.com> wrote:



mapred.child.java.opts=-Xmx2g


On Thu, Apr 3, 2014 at 5:10 PM, Li Li <fancye...@gmail.com> wrote:




2g

On Thu, Apr 3, 2014 at 1:30 PM, Stanley Shi <s...@gopivotal.com> wrote:





This doesn't seem to be related to the data size.
How much memory do you use for the reducer? 




Regards,
Stanley Shi,



On Thu, Apr 3, 2014 at 8:04 AM, Li Li <fancye...@gmail.com> wrote:






I have a MapReduce program that does some matrix operations. In the
reducer, it averages many large matrices (each matrix takes up
400+ MB, according to Map output bytes). So if 50 matrices go to one reducer,
the total memory usage is 20 GB, and the reduce task gets this exception:



FATAL org.apache.hadoop.mapred.Child: Error running child :
java.lang.OutOfMemoryError: Java heap space
    at org.apache.hadoop.mapred.IFile$Reader.readNextBlock(IFile.java:344)
    at org.apache.hadoop.mapred.IFile$Reader.next(IFile.java:406)
    at org.apache.hadoop.mapred.Merger$Segment.next(Merger.java:238)
    at org.apache.hadoop.mapred.Merger$MergeQueue.merge(Merger.java:438)
    at org.apache.hadoop.mapred.Merger.merge(Merger.java:142)
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier.createKVIterator(ReduceTask.java:2539)
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier.access$400(ReduceTask.java:661)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:399)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
    at org.apache.hadoop.mapred.Child.main(Child.java:249)



One method I can come up with is to use a Combiner to save the sums of some
matrices and their counts,
but it still can't solve the problem because the combiner is not fully
controlled by me.
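
The sum-and-count idea can be sketched in plain Java as below. This is only an illustration, not a Hadoop Writable or a real Combiner: the class name PartialSum and its methods are made up. The point it shows is that merging (sum, count) pairs gives the same average no matter how the inputs were grouped beforehand, so the result stays correct whether a combiner ran zero, one, or several times.

```java
import java.util.Arrays;

// Plain-Java sketch of a (sum, count) partial aggregate; not a Hadoop Writable.
class PartialSum {
    final double[] sum;
    long count;

    // Wraps a single matrix (flattened to double[]) as a partial with count 1.
    PartialSum(double[] matrix) {
        this.sum = Arrays.copyOf(matrix, matrix.length);
        this.count = 1;
    }

    // Folds another partial into this one; the order of merging does not matter.
    void merge(PartialSum other) {
        for (int i = 0; i < sum.length; i++) {
            sum[i] += other.sum[i];
        }
        count += other.count;
    }

    // Element-wise average over everything merged so far.
    double[] average() {
        double[] avg = new double[sum.length];
        for (int i = 0; i < sum.length; i++) {
            avg[i] = sum[i] / count;
        }
        return avg;
    }

    public static void main(String[] args) {
        // Two matrices pre-merged (as a combiner might do) and one left alone.
        PartialSum combined = new PartialSum(new double[] {1.0, 1.0});
        combined.merge(new PartialSum(new double[] {3.0, 5.0}));
        PartialSum untouched = new PartialSum(new double[] {5.0, 9.0});

        combined.merge(untouched);
        System.out.println(Arrays.toString(combined.average())); // [3.0, 5.0]
    }
}
```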









-- 
Regards,
Gordon Wang
