Hi,
The block size is configured to 128 MB. I've read that it is recommended to
increase it in order to get better performance.
What value do you recommend setting it to?

Avi

-----Original Message-----
From: madhu phatak [mailto:phatak....@gmail.com] 
Sent: Tuesday, June 21, 2011 12:54 PM
To: common-user@hadoop.apache.org
Subject: Re: Help with adjusting Hadoop configuration files

If you reduce the default DFS block size (which is set in the configuration
file) and you use the default InputFormat, each job creates a larger number
of mappers at a time, which may help you use the RAM effectively. Another
way is to create as many parallel jobs as possible programmatically, so
that all of the available RAM is used.
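
For illustration, a minimal hdfs-site.xml sketch along those lines, assuming
the 0.20.x property name dfs.block.size (the value is in bytes, and 32 MB is
only an illustrative figure, not a recommendation):

  <property>
    <name>dfs.block.size</name>
    <value>33554432</value> <!-- 32 MB; smaller blocks mean more splits/mappers -->
  </property>

Note that the block size only applies to files written after the change;
existing files keep the block size they were created with.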

On Tue, Jun 21, 2011 at 3:17 PM, Avi Vaknin <avivakni...@gmail.com> wrote:

> Hi Madhu,
> First of all, thanks for the quick reply.
> I've searched the net for the properties of the configuration files, and I
> specifically wanted to know whether there is a property related to memory
> tuning (as you can see, I have 7.5 GB of RAM on each datanode and I really
> want to use it properly).
> Also, I've changed mapred.tasktracker.map.tasks.maximum and
> mapred.tasktracker.reduce.tasks.maximum to 10 (the number of cores on the
> datanodes) and unfortunately I haven't seen any change in the performance
> or duration of running jobs.
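>
> A minimal mapred-site.xml sketch of those settings, for reference; the
> mapred.child.java.opts entry is only an added illustration of the per-task
> JVM heap, which is typically what limits how much RAM each task can use:
>
>   <property>
>     <name>mapred.tasktracker.map.tasks.maximum</name>
>     <value>10</value>
>   </property>
>   <property>
>     <name>mapred.tasktracker.reduce.tasks.maximum</name>
>     <value>10</value>
>   </property>
>   <property>
>     <name>mapred.child.java.opts</name>
>     <value>-Xmx1024m</value> <!-- illustrative per-task heap -->
>   </property>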
>
> Avi
>
> -----Original Message-----
> From: madhu phatak [mailto:phatak....@gmail.com]
> Sent: Tuesday, June 21, 2011 12:33 PM
> To: common-user@hadoop.apache.org
> Subject: Re: Help with adjusting Hadoop configuration files
>
> The utilization of the cluster depends upon the number of jobs and the
> number of mappers and reducers. The configuration files only help you set
> up the cluster by specifying that information. You can also specify some
> details, such as block size and replication, in the configuration files,
> which may help you with job management. You can read about all the
> available configuration properties here:
> http://hadoop.apache.org/common/docs/current/cluster_setup.html
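>
> For example, a minimal hdfs-site.xml sketch for the replication setting
> (3 is just the usual default, not a recommendation):
>
>   <property>
>     <name>dfs.replication</name>
>     <value>3</value> <!-- copies kept of each HDFS block -->
>   </property>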
>
> On Tue, Jun 21, 2011 at 2:13 PM, Avi Vaknin <avivakni...@gmail.com> wrote:
>
> > Hi Everyone,
> > We are a start-up company that has been using the Hadoop cluster
> > platform (version 0.20.2) on the Amazon EC2 environment.
> > We tried to set up a cluster in two different forms:
> > Cluster 1: 1 master (namenode) + 5 datanodes - all of the machines are
> > small EC2 instances (1.6 GB RAM)
> > Cluster 2: 1 master (namenode) + 2 datanodes - the master is a small
> > EC2 instance and the other two datanodes are large EC2 instances (7.5
> > GB RAM)
> > We made changes to the configuration files (the core-site, hdfs-site
> > and mapred-site xml files) and we expected to see a significant
> > improvement in the performance of cluster 2, but unfortunately this has
> > yet to happen.
> >
> > Are there any special parameters in the configuration files that we
> > need to change in order to adjust Hadoop to a larger hardware
> > environment? Are there any best practices you recommend?
> >
> > Thanks in advance.
> >
> > Avi
> >
> >
> >
> >
>
