after several attempts. You may need to increase the heap size for each
task by JobConf.set("mapred.child.java.opts","-Xmx***m").

-Gang

- Original Message -
From: Reik Schatz
To: "common-user@hadoop.apache.org"
Sent: 2010/3/17 (Wed) 10:13:45 AM
Subject: Re: optimization help needed

> Very good input not to send the "original xml" o
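The per-task heap suggested above can also be set cluster-wide instead of per job. A hedged sketch of the equivalent mapred-site.xml entry, using an illustrative 512 MB value where the message elides the actual number:

```xml
<!-- mapred-site.xml: default JVM options for map/reduce child tasks.
     512m is an illustrative value, not the one elided above. -->
<property>
  <name>mapred.child.java.opts</name>
  <value>-Xmx512m</value>
</property>
```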
> …reduce the amount of data sent from mappers to reducers. Using a
> combiner to pre-aggregate the data may also help.
>
> -Gang
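The combiner suggestion above amounts to local pre-aggregation of map output before it crosses the network in the shuffle. A minimal sketch of the idea in plain Java, with no Hadoop dependency and hypothetical word-count-style records:

```java
// Sketch of what a combiner does: collapse repeated keys emitted by one
// mapper into partial sums, so fewer records reach the reducers.
// Plain Java, no Hadoop dependencies; the data is hypothetical.
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class CombinerSketch {
    // Pre-aggregate a mapper's (key, count) output: one record per key.
    static Map<String, Integer> combine(List<Map.Entry<String, Integer>> mapOutput) {
        Map<String, Integer> combined = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> kv : mapOutput) {
            combined.merge(kv.getKey(), kv.getValue(), Integer::sum);
        }
        return combined;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> mapOutput = new ArrayList<>();
        mapOutput.add(Map.entry("a@example.com", 1));
        mapOutput.add(Map.entry("b@example.com", 1));
        mapOutput.add(Map.entry("a@example.com", 1));

        // Three map records shrink to two combined records.
        System.out.println(combine(mapOutput));
    }
}
```

In a real job this is wired up with JobConf.setCombinerClass(...); note the operation must be associative, since Hadoop may run the combiner zero or more times.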
- Original Message -
From: Reik Schatz
To: "common-user@hadoop.apache.org"
Sent: 2010/3/17 (Wed) 5:04:33 AM
Subject: optimization help needed
Preparing a Hadoop presentation here. For demonstration I start up a 5
machine m1.large cluster in EC2 via cloudera scripts ($hadoop-ec2
launch-cluster my-hadoop-cluster 5). Then I sent a 500 MB xml file over
into HDFS. The Mapper will receive an XML block as the key, select an
email address from
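The mapper step described above (pull an email address out of an XML block) can be sketched with a regular expression; the pattern and the sample record below are illustrative assumptions, not the poster's actual code:

```java
// Hedged sketch: extract the first email address from an XML fragment.
// The regex is deliberately simple; real-world address matching is looser.
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class EmailExtractor {
    private static final Pattern EMAIL =
        Pattern.compile("[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\\.[A-Za-z]{2,}");

    // Returns the first address found in the block, or null if none.
    static String firstEmail(String xmlBlock) {
        Matcher m = EMAIL.matcher(xmlBlock);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String block = "<record><email>jane.doe@example.com</email></record>";
        System.out.println(firstEmail(block)); // jane.doe@example.com
    }
}
```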