Re: OOM Error Map output copy.

2011-12-10 Thread Niranjan Balasubramanian
All

Thanks for your inputs. Fortunately, our requirements for this job changed,
allowing me to use a combiner that reduced the data being pumped to the
reducers, and the problem went away.
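
For context, a minimal sketch of the kind of wiring involved (old 0.20.x mapred
API). The classes and the sum-per-key combiner logic below are illustrative
only, not our actual job:

// Minimal old-API (0.20.x) sketch of wiring in a combiner. The classes and
// the sum-per-key logic are illustrative only -- not the actual job.
import java.io.IOException;
import java.util.Iterator;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.*;

public class CombinerJob {

  // Emits (token, 1) per whitespace-separated token -- stand-in for the real mapper.
  public static class TokenMapper extends MapReduceBase
      implements Mapper<LongWritable, Text, Text, LongWritable> {
    private static final LongWritable ONE = new LongWritable(1);
    public void map(LongWritable offset, Text line,
        OutputCollector<Text, LongWritable> out, Reporter reporter)
        throws IOException {
      for (String tok : line.toString().split("\\s+")) {
        if (tok.length() > 0) {
          out.collect(new Text(tok), ONE);
        }
      }
    }
  }

  // Sums counts per key; used both as the combiner and the reducer.
  public static class SumReducer extends MapReduceBase
      implements Reducer<Text, LongWritable, Text, LongWritable> {
    public void reduce(Text key, Iterator<LongWritable> values,
        OutputCollector<Text, LongWritable> out, Reporter reporter)
        throws IOException {
      long sum = 0;
      while (values.hasNext()) {
        sum += values.next().get();
      }
      out.collect(key, new LongWritable(sum));
    }
  }

  public static void main(String[] args) throws IOException {
    JobConf conf = new JobConf(CombinerJob.class);
    conf.setJobName("combiner-example");

    conf.setOutputKeyClass(Text.class);
    conf.setOutputValueClass(LongWritable.class);

    conf.setMapperClass(TokenMapper.class);
    conf.setCombinerClass(SumReducer.class);   // map-side pre-aggregation shrinks the shuffle
    conf.setReducerClass(SumReducer.class);

    FileInputFormat.setInputPaths(conf, new Path(args[0]));
    FileOutputFormat.setOutputPath(conf, new Path(args[1]));

    JobClient.runJob(conf);
  }
}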

@Arun -- You are probably right that the number of reducers is rather
small, given the amount of data that is being reduced. We are currently
using the Fair scheduler and it fits our other use cases very well. The
newer versions of Fair scheduler appear to support limits on the reducer
pools. So this solution has to wait until we can make a switch.
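
Something like this in the Fair Scheduler allocation file is what I have in
mind; the pool name is made up, and <maxReduces> is only in the newer versions
mentioned above, so it needs verifying before we switch:

<?xml version="1.0"?>
<!-- Hypothetical fair-scheduler allocation file; names and numbers are illustrative. -->
<allocations>
  <pool name="shared-etl">
    <!-- Cap how many reduce slots this pool may hold at once. -->
    <maxReduces>20</maxReduces>
  </pool>
</allocations>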


~ Niranjan.

On Dec 9, 2011, at 8:51 PM, Chandraprakash Bhagtani wrote:

 Hi Niranjan,
 
 Your issue looks similar to
 https://issues.apache.org/jira/browse/MAPREDUCE-1182 . Which hadoop version
 are you using?
 
 
 On Sat, Dec 10, 2011 at 12:17 AM, Prashant Kommireddi
 prash1...@gmail.com wrote:
 
 Arun, I faced the same issue and increasing the # of reducers fixed the
 problem.
 
 I was initially under the impression that the MR framework spills to disk if
 data is too huge to keep in memory; however, on extraordinarily large reduce
 inputs this was not the case and the job failed while trying to allocate the
 in-memory buffer:
 
 private MapOutput shuffleInMemory(MapOutputLocation mapOutputLoc,
                                   URLConnection connection,
                                   InputStream input,
                                   int mapOutputLength,
                                   int compressedLength)
 throws IOException, InterruptedException {
   // Reserve ram for the map-output
   ...

   // Copy map-output into an in-memory buffer
   byte[] shuffleData = new byte[mapOutputLength];
 
 
 -Prashant Kommireddi
 
 On Fri, Dec 9, 2011 at 10:29 AM, Arun C Murthy a...@hortonworks.com
 wrote:
 
 Moving to mapreduce-user@, bcc common-user@. Please use project specific
 lists.
 
 Niranjan,
 
 If you average 0.5G of output per map, that's 5000 maps * 0.5G = 2.5TB over
 12 reduces, i.e. nearly 250G per reduce -- compressed!
 
 If you think you have 4:1 compression you are doing nearly a Terabyte per
 reducer... which is way too high!
 
 I'd recommend you bump to somewhere around 1000 reduces to get to 2.5G
 (compressed) per reducer for your job. If your compression ratio is 2:1,
 try 500 reduces and so on.
 
 If you are worried about other users, use the CapacityScheduler and
 submit
 your job to a queue with a small capacity and max-capacity to restrict
 your
 job to 10 or 20 concurrent reduces at a given point.
 
 Arun
 
 On Dec 7, 2011, at 10:51 AM, Niranjan Balasubramanian wrote:
 
 All
 
 I am encountering the following out-of-memory error during the reduce
 phase of a large job.
 
 Map output copy failure : java.lang.OutOfMemoryError: Java heap space
  at
 
 org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.shuffleInMemory(ReduceTask.java:1669)
  at
 
 org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.getMapOutput(ReduceTask.java:1529)
  at
 
 org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:1378)
  at
 
 org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:1310)
 I tried increasing the memory available using mapred.child.java.opts but
 that only helps a little. The reduce task eventually fails again. Here
 are
 some relevant job configuration details:
 
 1. The input to the mappers is about 2.5 TB (LZO compressed). The
 mappers filter out a small percentage of the input ( less than 1%).
 
 2. I am currently using 12 reducers and I can't increase this count by
 much to ensure availability of reduce slots for other users.
 
 3. mapred.child.java.opts -- -Xms512M -Xmx1536M -XX:+UseSerialGC
 
 4. mapred.job.shuffle.input.buffer.percent-- 0.70
 
 5. mapred.job.shuffle.merge.percent   -- 0.66
 
 6. mapred.inmem.merge.threshold   -- 1000
 
 7. I have nearly 5000 mappers which are supposed to produce LZO
 compressed outputs. The logs seem to indicate that the map outputs range
 between 0.3G to 0.8GB.
 
 Does anything here seem amiss? I'd appreciate any input on what settings
 to try. I can try different reduced values for the input buffer percent and
 the merge percent. Given that the job runs for about 7-8 hours before
 crashing, I would like to make some informed choices if possible.
 
 Thanks.
 ~ Niranjan.
 
 
 
 
 
 
 
 
 
 -- 
 Thanks & Regards,
 Chandra Prakash Bhagtani,
 Nokia India Pvt. Ltd.



RE: OOM Error Map output copy.

2011-12-09 Thread Devaraj K
Can you try increasing the max heap memory and see whether you still face the
problem?
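
For example, something along these lines in the job configuration -- the 2048M
figure is just an illustration, not a measured recommendation:

// Hypothetical per-job override; 2048M is only an illustrative value.
import org.apache.hadoop.mapred.JobConf;

public class HeapTuning {
  public static void configure(JobConf conf) {
    // mapred.child.java.opts applies to the map and reduce child JVMs on 0.20.x.
    conf.set("mapred.child.java.opts", "-Xms512M -Xmx2048M -XX:+UseSerialGC");
  }
}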



Devaraj K 

-Original Message-
From: Niranjan Balasubramanian [mailto:niran...@cs.washington.edu] 
Sent: Thursday, December 08, 2011 11:09 PM
To: common-user@hadoop.apache.org
Subject: Re: OOM Error Map output copy.

I am using version 0.20.203.

Thanks
~ Niranjan.
On Dec 8, 2011, at 9:26 AM, Niranjan Balasubramanian wrote:

 Devaraj
 
 These are indeed the actual settings I copied over from the job.xml. 
 
 ~ Niranjan.
 On Dec 8, 2011, at 12:10 AM, Devaraj K wrote:
 
 Hi Niranjan,
 
  Everything looks ok as per the info you have given. Can you check in the
 job.xml file whether these child opts are reflected or anything else is
 overwriting this config.
  
 3. mapred.child.java.opts -- -Xms512M -Xmx1536M -XX:+UseSerialGC
 
 
 and also, can you tell me which version of hadoop you are using?
 
 
 Devaraj K 
 
 -Original Message-
 From: Niranjan Balasubramanian [mailto:niran...@cs.washington.edu] 
 Sent: Thursday, December 08, 2011 12:21 AM
 To: common-user@hadoop.apache.org
 Subject: OOM Error Map output copy.
 
 All 
 
 I am encountering the following out-of-memory error during the reduce
phase
 of a large job.
 
 Map output copy failure : java.lang.OutOfMemoryError: Java heap space
  at
 org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.shuffleInMemory(ReduceTask.java:1669)
  at
 org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.getMapOutput(ReduceTask.java:1529)
  at
 org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:1378)
  at
 org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:1310)
 I tried increasing the memory available using mapred.child.java.opts but
 that only helps a little. The reduce task eventually fails again. Here
are
 some relevant job configuration details:
 
 1. The input to the mappers is about 2.5 TB (LZO compressed). The mappers
 filter out a small percentage of the input ( less than 1%).
 
 2. I am currently using 12 reducers and I can't increase this count by
much
 to ensure availability of reduce slots for other users. 
 
 3. mapred.child.java.opts -- -Xms512M -Xmx1536M -XX:+UseSerialGC
 
 4. mapred.job.shuffle.input.buffer.percent   -- 0.70
 
 5. mapred.job.shuffle.merge.percent  -- 0.66
 
 6. mapred.inmem.merge.threshold  -- 1000
 
 7. I have nearly 5000 mappers which are supposed to produce LZO
compressed
 outputs. The logs seem to indicate that the map outputs range between
0.3G
 to 0.8GB. 
 
 Does anything here seem amiss? I'd appreciate any input on what settings to
 try. I can try different reduced values for the input buffer percent and
the
 merge percent.  Given that the job runs for about 7-8 hours before
crashing,
 I would like to make some informed choices if possible.
 
 Thanks. 
 ~ Niranjan.
 
 
 
 
 



Re: OOM Error Map output copy.

2011-12-09 Thread Arun C Murthy
Moving to mapreduce-user@, bcc common-user@. Please use project specific lists.

Niranjan,

If you average 0.5G of output per map, that's 5000 maps * 0.5G = 2.5TB over 12
reduces, i.e. nearly 250G per reduce -- compressed!

If you think you have 4:1 compression you are doing nearly a Terabyte per 
reducer... which is way too high!

I'd recommend you bump to somewhere around 1000 reduces to get to 2.5G 
(compressed) per reducer for your job. If your compression ratio is 2:1, try 
500 reduces and so on.

If you are worried about other users, use the CapacityScheduler and submit your 
job to a queue with a small capacity and max-capacity to restrict your job to 
10 or 20 concurrent reduces at a given point.
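
Roughly, something like this (old-API sketch; the numbers are this thread's
round figures and the queue name is hypothetical -- it would have to exist in
your CapacityScheduler configuration):

// Back-of-the-envelope reducer sizing plus queue submission (old 0.20.x API).
// The numbers are the round figures from this thread; "small-etl" is a
// hypothetical queue name.
import org.apache.hadoop.mapred.JobConf;

public class ReducerSizing {
  public static void configure(JobConf conf) {
    int numMaps = 5000;
    double avgMapOutputGb = 0.5;          // logs show 0.3G-0.8G of LZO output per map
    double targetPerReducerGb = 2.5;      // compressed target per reducer

    double totalGb = numMaps * avgMapOutputGb;                    // ~2500 GB
    int reducers = (int) Math.ceil(totalGb / targetPerReducerGb); // ~1000

    conf.setNumReduceTasks(reducers);

    // Submit to a small-capacity queue so the job can't occupy every reduce slot.
    conf.set("mapred.job.queue.name", "small-etl");
  }
}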

Arun

On Dec 7, 2011, at 10:51 AM, Niranjan Balasubramanian wrote:

 All 
 
 I am encountering the following out-of-memory error during the reduce phase 
 of a large job.
 
 Map output copy failure : java.lang.OutOfMemoryError: Java heap space
   at 
 org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.shuffleInMemory(ReduceTask.java:1669)
   at 
 org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.getMapOutput(ReduceTask.java:1529)
   at 
 org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:1378)
   at 
 org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:1310)
 I tried increasing the memory available using mapred.child.java.opts but that 
 only helps a little. The reduce task eventually fails again. Here are some 
 relevant job configuration details:
 
 1. The input to the mappers is about 2.5 TB (LZO compressed). The mappers 
 filter out a small percentage of the input ( less than 1%).
 
 2. I am currently using 12 reducers and I can't increase this count by much 
 to ensure availability of reduce slots for other users. 
 
 3. mapred.child.java.opts -- -Xms512M -Xmx1536M -XX:+UseSerialGC
 
 4. mapred.job.shuffle.input.buffer.percent-- 0.70
 
 5. mapred.job.shuffle.merge.percent   -- 0.66
 
 6. mapred.inmem.merge.threshold   -- 1000
 
 7. I have nearly 5000 mappers which are supposed to produce LZO compressed 
 outputs. The logs seem to indicate that the map outputs range between 0.3G to 
 0.8GB. 
 
 Does anything here seem amiss? I'd appreciate any input on what settings to 
 try. I can try different reduced values for the input buffer percent and the 
 merge percent.  Given that the job runs for about 7-8 hours before crashing, 
 I would like to make some informed choices if possible.
 
 Thanks. 
 ~ Niranjan.
 
 
 



Re: OOM Error Map output copy.

2011-12-09 Thread Prashant Kommireddi
Arun, I faced the same issue and increasing the # of reducers fixed the
problem.

I was initially under the impression that the MR framework spills to disk if
data is too huge to keep in memory; however, on extraordinarily large reduce
inputs this was not the case and the job failed while trying to allocate the
in-memory buffer:

private MapOutput shuffleInMemory(MapOutputLocation mapOutputLoc,
                                  URLConnection connection,
                                  InputStream input,
                                  int mapOutputLength,
                                  int compressedLength)
throws IOException, InterruptedException {
  // Reserve ram for the map-output
  ...

  // Copy map-output into an in-memory buffer
  byte[] shuffleData = new byte[mapOutputLength];
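
To put rough numbers on it -- this assumes the 0.20.x shuffle internals (budget
of roughly heap * mapred.job.shuffle.input.buffer.percent, a per-copy cap of
about a quarter of that budget, and the default of 5 parallel copiers), so it
is worth double-checking against your exact version:

// Rough arithmetic only -- the fractions below are assumptions about the
// 0.20.x shuffle, not values read from this job.
public class ShuffleBudget {
  public static void main(String[] args) {
    double heapMb = 1536;                 // -Xmx1536M from this thread
    double inputBufferPercent = 0.70;     // mapred.job.shuffle.input.buffer.percent
    double singleShuffleFraction = 0.25;  // assumed per-copy cap in 0.20.x
    int parallelCopies = 5;               // assumed default of mapred.reduce.parallel.copies

    double shuffleBudgetMb = heapMb * inputBufferPercent;          // ~1075 MB
    double perCopyCapMb = shuffleBudgetMb * singleShuffleFraction; // ~269 MB

    System.out.printf(
        "shuffle budget ~%.0f MB of a %.0f MB heap; one in-memory copy may "
        + "reserve up to ~%.0f MB, with up to %d copiers active%n",
        shuffleBudgetMb, heapMb, perCopyCapMb, parallelCopies);
    // Whatever is left of the heap must also hold in-memory merges and user
    // objects, so large segments from only 12 reducers leave little headroom.
  }
}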


-Prashant Kommireddi

On Fri, Dec 9, 2011 at 10:29 AM, Arun C Murthy a...@hortonworks.com wrote:

 Moving to mapreduce-user@, bcc common-user@. Please use project specific
 lists.

 Niranjan,

 If you average 0.5G of output per map, that's 5000 maps * 0.5G = 2.5TB over
 12 reduces, i.e. nearly 250G per reduce -- compressed!

 If you think you have 4:1 compression you are doing nearly a Terabyte per
 reducer... which is way too high!

 I'd recommend you bump to somewhere around 1000 reduces to get to 2.5G
 (compressed) per reducer for your job. If your compression ratio is 2:1,
 try 500 reduces and so on.

 If you are worried about other users, use the CapacityScheduler and submit
 your job to a queue with a small capacity and max-capacity to restrict your
 job to 10 or 20 concurrent reduces at a given point.

 Arun

 On Dec 7, 2011, at 10:51 AM, Niranjan Balasubramanian wrote:

  All
 
  I am encountering the following out-of-memory error during the reduce
 phase of a large job.
 
  Map output copy failure : java.lang.OutOfMemoryError: Java heap space
at
 org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.shuffleInMemory(ReduceTask.java:1669)
at
 org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.getMapOutput(ReduceTask.java:1529)
at
 org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:1378)
at
 org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:1310)
  I tried increasing the memory available using mapred.child.java.opts but
 that only helps a little. The reduce task eventually fails again. Here are
 some relevant job configuration details:
 
  1. The input to the mappers is about 2.5 TB (LZO compressed). The
 mappers filter out a small percentage of the input ( less than 1%).
 
  2. I am currently using 12 reducers and I can't increase this count by
 much to ensure availability of reduce slots for other users.
 
  3. mapred.child.java.opts -- -Xms512M -Xmx1536M -XX:+UseSerialGC
 
  4. mapred.job.shuffle.input.buffer.percent-- 0.70
 
  5. mapred.job.shuffle.merge.percent   -- 0.66
 
  6. mapred.inmem.merge.threshold   -- 1000
 
  7. I have nearly 5000 mappers which are supposed to produce LZO
 compressed outputs. The logs seem to indicate that the map outputs range
 between 0.3G to 0.8GB.
 
  Does anything here seem amiss? I'd appreciate any input on what settings
 to try. I can try different reduced values for the input buffer percent and
 the merge percent.  Given that the job runs for about 7-8 hours before
 crashing, I would like to make some informed choices if possible.
 
  Thanks.
  ~ Niranjan.
 
 
 




Re: OOM Error Map output copy.

2011-12-09 Thread Chandraprakash Bhagtani
Hi Niranjan,

Your issue looks similar to
https://issues.apache.org/jira/browse/MAPREDUCE-1182 . Which hadoop version
are you using?


On Sat, Dec 10, 2011 at 12:17 AM, Prashant Kommireddi
prash1...@gmail.com wrote:

 Arun, I faced the same issue and increasing the # of reducers fixed the
 problem.

 I was initially under the impression that the MR framework spills to disk if
 data is too huge to keep in memory; however, on extraordinarily large reduce
 inputs this was not the case and the job failed while trying to allocate the
 in-memory buffer:

 private MapOutput shuffleInMemory(MapOutputLocation mapOutputLoc,
                                   URLConnection connection,
                                   InputStream input,
                                   int mapOutputLength,
                                   int compressedLength)
 throws IOException, InterruptedException {
   // Reserve ram for the map-output
   ...

   // Copy map-output into an in-memory buffer
   byte[] shuffleData = new byte[mapOutputLength];


 -Prashant Kommireddi

 On Fri, Dec 9, 2011 at 10:29 AM, Arun C Murthy a...@hortonworks.com
 wrote:

  Moving to mapreduce-user@, bcc common-user@. Please use project specific
  lists.
 
  Niranjan,
 
  If you average 0.5G of output per map, that's 5000 maps * 0.5G = 2.5TB over
  12 reduces, i.e. nearly 250G per reduce -- compressed!
 
  If you think you have 4:1 compression you are doing nearly a Terabyte per
  reducer... which is way too high!
 
  I'd recommend you bump to somewhere around 1000 reduces to get to 2.5G
  (compressed) per reducer for your job. If your compression ratio is 2:1,
  try 500 reduces and so on.
 
  If you are worried about other users, use the CapacityScheduler and
 submit
  your job to a queue with a small capacity and max-capacity to restrict
 your
  job to 10 or 20 concurrent reduces at a given point.
 
  Arun
 
  On Dec 7, 2011, at 10:51 AM, Niranjan Balasubramanian wrote:
 
   All
  
   I am encountering the following out-of-memory error during the reduce
  phase of a large job.
  
   Map output copy failure : java.lang.OutOfMemoryError: Java heap space
 at
 
 org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.shuffleInMemory(ReduceTask.java:1669)
 at
 
 org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.getMapOutput(ReduceTask.java:1529)
 at
 
 org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:1378)
 at
 
 org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:1310)
   I tried increasing the memory available using mapred.child.java.opts but
  that only helps a little. The reduce task eventually fails again. Here
 are
  some relevant job configuration details:
  
   1. The input to the mappers is about 2.5 TB (LZO compressed). The
  mappers filter out a small percentage of the input ( less than 1%).
  
   2. I am currently using 12 reducers and I can't increase this count by
  much to ensure availability of reduce slots for other users.
  
   3. mapred.child.java.opts -- -Xms512M -Xmx1536M -XX:+UseSerialGC
  
   4. mapred.job.shuffle.input.buffer.percent-- 0.70
  
   5. mapred.job.shuffle.merge.percent   -- 0.66
  
   6. mapred.inmem.merge.threshold   -- 1000
  
   7. I have nearly 5000 mappers which are supposed to produce LZO
  compressed outputs. The logs seem to indicate that the map outputs range
  between 0.3G to 0.8GB.
  
   Does anything here seem amiss? I'd appreciate any input on what settings
  to try. I can try different reduced values for the input buffer percent
 and
  the merge percent.  Given that the job runs for about 7-8 hours before
  crashing, I would like to make some informed choices if possible.
  
   Thanks.
   ~ Niranjan.
  
  
  
 
 




-- 
Thanks & Regards,
Chandra Prakash Bhagtani,
Nokia India Pvt. Ltd.


RE: OOM Error Map output copy.

2011-12-08 Thread Devaraj K
Hi Niranjan,

Everything looks ok as per the info you have given. Can you check in the
job.xml file whether these child opts are reflected or anything else is
overwriting this config.

3. mapred.child.java.opts -- -Xms512M -Xmx1536M -XX:+UseSerialGC


and also, can you tell me which version of hadoop you are using?


Devaraj K 

-Original Message-
From: Niranjan Balasubramanian [mailto:niran...@cs.washington.edu] 
Sent: Thursday, December 08, 2011 12:21 AM
To: common-user@hadoop.apache.org
Subject: OOM Error Map output copy.

All 

I am encountering the following out-of-memory error during the reduce phase
of a large job.

Map output copy failure : java.lang.OutOfMemoryError: Java heap space
at
org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.shuffleInMemory(ReduceTask.java:1669)
at
org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.getMapOutput(ReduceTask.java:1529)
at
org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:1378)
at
org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:1310)
I tried increasing the memory available using mapred.child.java.opts but
that only helps a little. The reduce task eventually fails again. Here are
some relevant job configuration details:

1. The input to the mappers is about 2.5 TB (LZO compressed). The mappers
filter out a small percentage of the input ( less than 1%).

2. I am currently using 12 reducers and I can't increase this count by much
to ensure availability of reduce slots for other users. 

3. mapred.child.java.opts -- -Xms512M -Xmx1536M -XX:+UseSerialGC

4. mapred.job.shuffle.input.buffer.percent  -- 0.70

5. mapred.job.shuffle.merge.percent -- 0.66

6. mapred.inmem.merge.threshold -- 1000

7. I have nearly 5000 mappers which are supposed to produce LZO compressed
outputs. The logs seem to indicate that the map outputs range between 0.3G
to 0.8GB. 

Does anything here seem amiss? I'd appreciate any input on what settings to
try. I can try different reduced values for the input buffer percent and the
merge percent.  Given that the job runs for about 7-8 hours before crashing,
I would like to make some informed choices if possible.

Thanks. 
~ Niranjan.






Re: OOM Error Map output copy.

2011-12-08 Thread Niranjan Balasubramanian
Devaraj

These are indeed the actual settings I copied over from the job.xml. 

~ Niranjan.
On Dec 8, 2011, at 12:10 AM, Devaraj K wrote:

 Hi Niranjan,
 
   Everything looks ok as per the info you have given. Can you check in the
 job.xml file whether these child opts are reflected or anything else is
 overwriting this config.
   
 3. mapred.child.java.opts -- -Xms512M -Xmx1536M -XX:+UseSerialGC
 
 
 and also, can you tell me which version of hadoop you are using?
 
 
 Devaraj K 
 
 -Original Message-
 From: Niranjan Balasubramanian [mailto:niran...@cs.washington.edu] 
 Sent: Thursday, December 08, 2011 12:21 AM
 To: common-user@hadoop.apache.org
 Subject: OOM Error Map output copy.
 
 All 
 
 I am encountering the following out-of-memory error during the reduce phase
 of a large job.
 
 Map output copy failure : java.lang.OutOfMemoryError: Java heap space
   at
 org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.shuffleInMemory(ReduceTask.java:1669)
   at
 org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.getMapOutput(ReduceTask.java:1529)
   at
 org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:1378)
   at
 org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:1310)
 I tried increasing the memory available using mapred.child.java.opts but
 that only helps a little. The reduce task eventually fails again. Here are
 some relevant job configuration details:
 
 1. The input to the mappers is about 2.5 TB (LZO compressed). The mappers
 filter out a small percentage of the input ( less than 1%).
 
 2. I am currently using 12 reducers and I can't increase this count by much
 to ensure availability of reduce slots for other users. 
 
 3. mapred.child.java.opts -- -Xms512M -Xmx1536M -XX:+UseSerialGC
 
 4. mapred.job.shuffle.input.buffer.percent-- 0.70
 
 5. mapred.job.shuffle.merge.percent   -- 0.66
 
 6. mapred.inmem.merge.threshold   -- 1000
 
 7. I have nearly 5000 mappers which are supposed to produce LZO compressed
 outputs. The logs seem to indicate that the map outputs range between 0.3G
 to 0.8GB. 
 
 Does anything here seem amiss? I'd appreciate any input on what settings to
 try. I can try different reduced values for the input buffer percent and the
 merge percent.  Given that the job runs for about 7-8 hours before crashing,
 I would like to make some informed choices if possible.
 
 Thanks. 
 ~ Niranjan.
 
 
 
 



Re: OOM Error Map output copy.

2011-12-08 Thread Niranjan Balasubramanian
I am using version 0.20.203.

Thanks
~ Niranjan.
On Dec 8, 2011, at 9:26 AM, Niranjan Balasubramanian wrote:

 Devaraj
 
 These are indeed the actual settings I copied over from the job.xml. 
 
 ~ Niranjan.
 On Dec 8, 2011, at 12:10 AM, Devaraj K wrote:
 
 Hi Niranjan,
 
  Everything looks ok as per the info you have given. Can you check in the
 job.xml file whether these child opts are reflected or anything else is
 overwriting this config.
  
 3. mapred.child.java.opts -- -Xms512M -Xmx1536M -XX:+UseSerialGC
 
 
 and also, can you tell me which version of hadoop you are using?
 
 
 Devaraj K 
 
 -Original Message-
 From: Niranjan Balasubramanian [mailto:niran...@cs.washington.edu] 
 Sent: Thursday, December 08, 2011 12:21 AM
 To: common-user@hadoop.apache.org
 Subject: OOM Error Map output copy.
 
 All 
 
 I am encountering the following out-of-memory error during the reduce phase
 of a large job.
 
 Map output copy failure : java.lang.OutOfMemoryError: Java heap space
  at
 org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.shuffleInMemory(ReduceTask.java:1669)
  at
 org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.getMapOutput(ReduceTask.java:1529)
  at
 org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:1378)
  at
 org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:1310)
 I tried increasing the memory available using mapred.child.java.opts but
 that only helps a little. The reduce task eventually fails again. Here are
 some relevant job configuration details:
 
 1. The input to the mappers is about 2.5 TB (LZO compressed). The mappers
 filter out a small percentage of the input ( less than 1%).
 
 2. I am currently using 12 reducers and I can't increase this count by much
 to ensure availability of reduce slots for other users. 
 
 3. mapred.child.java.opts -- -Xms512M -Xmx1536M -XX:+UseSerialGC
 
 4. mapred.job.shuffle.input.buffer.percent   -- 0.70
 
 5. mapred.job.shuffle.merge.percent  -- 0.66
 
 6. mapred.inmem.merge.threshold  -- 1000
 
 7. I have nearly 5000 mappers which are supposed to produce LZO compressed
 outputs. The logs seem to indicate that the map outputs range between 0.3G
 to 0.8GB. 
 
 Does anything here seem amiss? I'd appreciate any input on what settings to
 try. I can try different reduced values for the input buffer percent and the
 merge percent.  Given that the job runs for about 7-8 hours before crashing,
 I would like to make some informed choices if possible.
 
 Thanks. 
 ~ Niranjan.
 
 
 
 
 



OOM Error Map output copy.

2011-12-07 Thread Niranjan Balasubramanian
All 

I am encountering the following out-of-memory error during the reduce phase of 
a large job.

Map output copy failure : java.lang.OutOfMemoryError: Java heap space
at 
org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.shuffleInMemory(ReduceTask.java:1669)
at 
org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.getMapOutput(ReduceTask.java:1529)
at 
org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:1378)
at 
org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:1310)
I tried increasing the memory available using mapred.child.java.opts but that 
only helps a little. The reduce task eventually fails again. Here are some 
relevant job configuration details:

1. The input to the mappers is about 2.5 TB (LZO compressed). The mappers 
filter out a small percentage of the input ( less than 1%).

2. I am currently using 12 reducers and I can't increase this count by much to 
ensure availability of reduce slots for other users. 

3. mapred.child.java.opts -- -Xms512M -Xmx1536M -XX:+UseSerialGC

4. mapred.job.shuffle.input.buffer.percent  -- 0.70

5. mapred.job.shuffle.merge.percent -- 0.66

6. mapred.inmem.merge.threshold -- 1000

7. I have nearly 5000 mappers which are supposed to produce LZO compressed 
outputs. The logs seem to indicate that the map outputs range between 0.3G to 
0.8GB. 

Does anything here seem amiss? I'd appreciate any input on what settings to 
try. I can try different reduced values for the input buffer percent and the 
merge percent.  Given that the job runs for about 7-8 hours before crashing, I 
would like to make some informed choices if possible.
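
For concreteness, the kind of per-job overrides I could try -- the 0.50 values
below are arbitrary illustrations, not tuned recommendations:

// Hypothetical per-job overrides; the 0.50 values are arbitrary illustrations.
import org.apache.hadoop.mapred.JobConf;

public class ShuffleKnobs {
  public static void configure(JobConf conf) {
    conf.setFloat("mapred.job.shuffle.input.buffer.percent", 0.50f); // currently 0.70
    conf.setFloat("mapred.job.shuffle.merge.percent", 0.50f);        // currently 0.66
    conf.setInt("mapred.inmem.merge.threshold", 1000);               // unchanged
  }
}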

Thanks. 
~ Niranjan.