Hi Geelong

Here are my thoughts.

You have 8G of RAM per node, but 8 + 8 = 16 task slots with a task JVM size
of 1G. If all slots are in use at the same time, the tasks need 16G while
only 8G is available, so OOM errors are very likely.

When you decide on the number of slots, you need to account for the memory
used by the OS, the Hadoop daemons, etc. Only the remaining memory should be
divided among the task slots.
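
As a rough sketch of that budgeting, assuming Hadoop 1.x (which the
TaskRunner stack trace suggests) and assuming about 2G is set aside for the
OS, DataNode and TaskTracker (a guess on my part, measure it on your nodes):
roughly 6G is left, which fits six 1G task JVMs. The 4/2 map/reduce split
below is only an illustration, not a tuned recommendation:

```xml
<!-- mapred-site.xml on each TaskTracker node (Hadoop 1.x property names) -->
<!-- Assumed budget: 8G total - ~2G for OS and daemons = ~6G for tasks -->
<!-- (4 map slots + 2 reduce slots) x 1G heap = 6G -->
<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>4</value> <!-- illustrative value -->
</property>
<property>
  <name>mapred.tasktracker.reduce.tasks.maximum</name>
  <value>2</value> <!-- illustrative value -->
</property>
<property>
  <name>mapred.child.java.opts</name>
  <value>-Xmx1024m</value> <!-- 1G heap per task JVM -->
</property>
```

Keep in mind that a JVM uses somewhat more than its -Xmx heap, so leave some
headroom rather than dividing the memory exactly.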

Increasing the number of reduce tasks alone won't give much of a
performance improvement. In MR, sort and shuffle is the most expensive
phase, so try tuning there. Some things I can think of:
1. Use map output compression
2. Use a combiner if possible
3. Reduce spills by adjusting io.sort.mb, io.sort.factor, etc.
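
A minimal sketch of points 1 and 3 in mapred-site.xml, using Hadoop 1.x
property names. The values are assumptions for illustration (the defaults are
io.sort.mb=100 and io.sort.factor=10), and io.sort.mb has to fit comfortably
inside the 1G task heap. DefaultCodec is used here only because it needs no
extra native libraries; a faster codec may suit you better if it is installed
on your cluster:

```xml
<!-- Compress intermediate map output to cut shuffle traffic and disk I/O -->
<property>
  <name>mapred.compress.map.output</name>
  <value>true</value>
</property>
<property>
  <name>mapred.map.output.compression.codec</name>
  <value>org.apache.hadoop.io.compress.DefaultCodec</value>
</property>
<!-- Larger in-memory sort buffer (MB) means fewer spills to disk -->
<property>
  <name>io.sort.mb</name>
  <value>200</value> <!-- illustrative value -->
</property>
<!-- Number of spill files merged at once during the sort -->
<property>
  <name>io.sort.factor</name>
  <value>50</value> <!-- illustrative value -->
</property>
```

The combiner from point 2 is set in the job driver (for example with
JobConf.setCombinerClass), so it does not appear in this file.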

Apart from this, if you are running custom code, controlling/filtering the
data volume in the early stages of a multi-stage MR job could bring a
considerable performance improvement.



On Wed, Apr 17, 2013 at 3:12 PM, 姚吉龙 <geelong...@gmail.com> wrote:

> Hi everyone
>
> We have a cluster of 31 datanodes and 1 namenode, each with an 8-core CPU
> and 8G RAM.
> I am studying how to improve the performance of this cluster. We currently
> have a 100G data file as the test case.
> When I increased the number of reducers from 100 to 200, I did not see a
> large improvement: the run time went from 23m52s to 19m44s. Besides, two
> failed tasks appeared in this process:
>
> java.lang.Throwable: Child Error at
> org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:271) Caused by:
> java.io.IOException: Task process exit with nonzero status of 126. at
> org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:258)
>
> Here is my conf in mapred-site.xml: [image: inline image 1]
>
> 1. Could anyone help with the failed tasks? Why would this happen?
> 2. How can I further speed up this job?
> Any suggestions are welcome.
>
>
> BRs
> Geelong
>
> --
> From Good To Great
>
