Re: reference document which properties are set in which configuration file

2012-02-13 Thread Kleegrewe, Christian
Hi all,

Thank you for the answers; the hints from Harsh were especially helpful. Since
I was in a bit of a hurry, I tried out the different properties in the different
files and finally succeeded. I think my main problem was a typo in the property
definition. If I watch out for that in future, I do not think I will need a
reference document any more.

Thanks and best regards,

Christian

8<--
Siemens AG
Corporate Technology
Corporate Research and Technologies
CT T DE IT3
Otto-Hahn-Ring 6
81739 München, Deutschland
Tel.: +49 89 636-42722
Fax: +49 89 636-41423
mailto:christian.kleegr...@siemens.com

Siemens Aktiengesellschaft: Vorsitzender des Aufsichtsrats: Gerhard Cromme; 
Vorstand: Peter Löscher, Vorsitzender; Roland Busch, Brigitte Ederer, Klaus 
Helmrich, Joe Kaeser, Barbara Kux, Hermann Requardt, Siegfried Russwurm, Peter 
Y. Solmssen, Michael Süß; Sitz der Gesellschaft: Berlin und München, 
Deutschland; Registergericht: Berlin Charlottenburg, HRB 12300, München, HRB 
6684; WEEE-Reg.-Nr. DE 23691322


-Original Message-
From: Praveen Sripati [mailto:praveensrip...@gmail.com]
Sent: Saturday, 11 February 2012 04:58
To: common-user@hadoop.apache.org
Subject: Re: reference document which properties are set in which configuration
file

The mapred.task.tracker.http.address will go in the mapred-site.xml file.

In the Hadoop installation directory, check the core-default.xml,
hdfs-default.xml and mapred-default.xml files to learn about the different
properties. Some properties that are used in the code may not be listed in
these XML files; they fall back to defaults.
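
To see which properties a *-default.xml file documents, the file can be read
with the XML parser that ships with the JDK. The following is only a sketch:
the class name ListDefaults and the inline SAMPLE document are made up for
illustration, and the values in it are sample data rather than a copy of any
real mapred-default.xml.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class ListDefaults {
    // Hypothetical sample in the <configuration>/<property> layout used by
    // the Hadoop *-default.xml and *-site.xml files; values are illustrative.
    public static final String SAMPLE =
        "<configuration>"
        + "<property><name>mapred.task.tracker.http.address</name>"
        + "<value>0.0.0.0:50060</value></property>"
        + "<property><name>mapred.reduce.tasks</name>"
        + "<value>1</value></property>"
        + "</configuration>";

    /** Returns the <name> of every <property> element, in document order. */
    public static List<String> propertyNames(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList props = doc.getElementsByTagName("property");
            List<String> names = new ArrayList<>();
            for (int i = 0; i < props.getLength(); i++) {
                Element prop = (Element) props.item(i);
                NodeList nameNodes = prop.getElementsByTagName("name");
                if (nameNodes.getLength() > 0) {
                    names.add(nameNodes.item(0).getTextContent().trim());
                }
            }
            return names;
        } catch (Exception e) {
            // Wrap checked parser exceptions to keep the sketch short.
            throw new IllegalStateException("could not parse configuration XML", e);
        }
    }

    public static void main(String[] args) {
        for (String name : propertyNames(SAMPLE)) {
            System.out.println(name);
        }
    }
}
```

A property that is missing from such a listing may still be read by the code
with a hard-coded default, so an empty search result does not prove that the
property is unsupported.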

Praveen



Re: reference document which properties are set in which configuration file

2012-02-10 Thread Raj Vishwanathan
Harsh, All

This was one of the first questions that I asked. It is sometimes not clear
whether a parameter is site related or job related, or whether it belongs
to the NN, JT, DN or TT.

If I get some time during the weekend, I will try to put this into a document
and see if it helps.

Raj




Re: reference document which properties are set in which configuration file

2012-02-10 Thread Harsh J
As a rule of thumb, all properties starting with mapred.* or mapreduce.*
go to mapred-site.xml, all properties starting with dfs.* go to
hdfs-site.xml, and the rest may be put in core-site.xml to be safe.

If you notice MR- or HDFS-specific properties that fall outside this
naming convention, please file a JIRA so we can deprecate the old name
and rename it with a more appropriate prefix.
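
The prefix rule can be written down as a small lookup. This is only a sketch
of the rule of thumb, not something Hadoop provides: the class name
ConfigFileGuess and the method siteFileFor are made up for illustration, and
properties that do not follow the naming convention will be misfiled by it.

```java
public class ConfigFileGuess {
    /** Returns the *-site.xml file suggested by the prefix rule of thumb. */
    public static String siteFileFor(String property) {
        if (property.startsWith("mapred.") || property.startsWith("mapreduce.")) {
            return "mapred-site.xml";
        }
        if (property.startsWith("dfs.")) {
            return "hdfs-site.xml";
        }
        // Everything else goes to core-site.xml "to be safe".
        return "core-site.xml";
    }

    public static void main(String[] args) {
        String[] examples = {
            "mapred.task.tracker.http.address",
            "dfs.replication",
            "fs.inmemory.size.mb"
        };
        for (String p : examples) {
            System.out.println(p + " -> " + siteFileFor(p));
        }
    }
}
```

For the property asked about in this thread, mapred.task.tracker.http.address,
the lookup returns mapred-site.xml.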




-- 
Harsh J
Customer Ops. Engineer
Cloudera | http://tiny.cloudera.com/about


reference document which properties are set in which configuration file

2012-02-07 Thread Kleegrewe, Christian
Dear all,

While configuring our Hadoop cluster, I wondered whether there is a reference
document that says which configuration property has to be specified in which
configuration file. In particular, I do not know where
mapred.task.tracker.http.address has to be set. Is it in mapred-site.xml or
in hdfs-site.xml?

Any hint will be appreciated.

thanks

Christian





Re: In which configuration file to configure the "fs.inmemory.size.mb" parameter?

2010-07-01 Thread Yu Li
Hi Ted,

Thanks for your help. I've seen the code under the path you mentioned,
but it seems this class has been deprecated. Do you know of any other
parameter that has similar functionality?

Others, any comments/suggestions? Thanks.

Best Regards,
Carp



Re: In which configuration file to configure the "fs.inmemory.size.mb" parameter?

2010-07-01 Thread Ted Yu
fs.inmemory.size.mb is used in 0.20.2
See src/core/org/apache/hadoop/fs/InMemoryFileSystem.java:

public void initialize(URI uri, Configuration conf) {
  setConf(conf);
  int size = Integer.parseInt(conf.get("fs.inmemory.size.mb", "100"));
  this.fsSize = size * 1024L * 1024L;



Re: In which configuration file to configure the "fs.inmemory.size.mb" parameter?

2010-07-01 Thread Yu Li
Hi Ted,

Thanks for your help. Another question is: if this parameter is not used in
the 0.20.X release and the "Cluster Setup" is not updated, is there any
parameter replacing this one? It's a useful parameter IMHO.

Does anyone know about this? Thanks in advance!

Best Regards,
Carp



Re: In which configuration file to configure the "fs.inmemory.size.mb" parameter?

2010-07-01 Thread Ted Yu
I found https://issues.apache.org/jira/browse/HADOOP-6812

You can add the following to core-site.xml:

<property>
  <name>fs.inmemory.size.mb</name>
  <value>100</value>
</property>


Default value is 100:
  int size = Integer.parseInt(conf.get("fs.inmemory.size.mb", "100"));
./src/core/org/apache/hadoop/fs/InMemoryFileSystem.java
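
The lookup-with-default pattern in that line can be tried without a Hadoop
installation. In the sketch below, java.util.Properties stands in for
Hadoop's Configuration (an assumption made only so the example runs on a
plain JDK), and the class name InMemorySizeDemo is made up.

```java
import java.util.Properties;

public class InMemorySizeDemo {
    /**
     * Mirrors the quoted line: read fs.inmemory.size.mb, fall back to "100"
     * when it is not set, and convert megabytes to bytes.
     */
    public static long fsSizeBytes(Properties conf) {
        int sizeMb = Integer.parseInt(conf.getProperty("fs.inmemory.size.mb", "100"));
        return sizeMb * 1024L * 1024L;
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        // Nothing set yet, so the default of 100 MB applies.
        System.out.println(fsSizeBytes(conf));

        // Setting the property, as a core-site.xml entry would, replaces it.
        conf.setProperty("fs.inmemory.size.mb", "200");
        System.out.println(fsSizeBytes(conf));
    }
}
```

This also shows why the parameter can be absent from core-default.xml and
still take effect: the default lives in the code, not in the XML file.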



Re: In which configuration file to configure the "fs.inmemory.size.mb" parameter?

2010-07-01 Thread Yu Li
Hi Sriguru,

Thanks for your comments. Do you know how to set this parameter?

Best Regards,
Carp



RE: In which configuration file to configure the "fs.inmemory.size.mb" parameter?

2010-07-01 Thread Srigurunath Chakravarthi
Carp,
 IMHO, 0.20.x has it. fs.inmemory.size.mb is the reduce-side equivalent of
io.sort.mb. In the reducer tasks, intermediate map output is collected into a
buffer (whose size is governed by this parameter's value), and data is flushed
into files as (partially) sorted KVs.

 These files will be re-merged if we end up with more than io.sort.factor
files; otherwise KVs will be served out of these files to the reduce
function directly.

 I don't know where in the code it is though, sorry.

cheers,
Sriguru




In which configuration file to configure the "fs.inmemory.size.mb" parameter?

2010-07-01 Thread Yu Li
Hi all,

I looked through the "Cluster Setup" guide at
http://hadoop.apache.org/common/docs/r0.20.1/cluster_setup.html and
found there's a "fs.inmemory.size.mb" parameter for specifying the memory
allocated for the in-memory file system used to merge map outputs at
the reduces, and that this parameter is set in "core-site.xml". But
when I checked the "core-default.xml" under the path
"$HADOOP_HOME/src/core/", I didn't find the parameter at all, nor
could I find it through the JTUI after launching jobs.
Does anybody know about this parameter? Has it been removed from
release 0.20.X? If it hasn't been removed, how can I set the
parameter other than with the -D option? Thanks in advance.

Best Regards,
Carp


Re: configuration file

2010-02-04 Thread Eric Arenas
Hi Gang,

You have to load the XML config file in your M/R code.

Something like this:
FSDataInputStream inS = fs.open(in);
conf.addResource(inS); 

 
Where "conf" is your Configuration.

This will in effect read all the parameters from that XML and override anything 
that you have previously set with:
conf.set("parameter",parameterValue);

regards,
Eric Arenas





Re: configuration file

2010-02-04 Thread Gang Luo
I gave the path to that XML file in the command. Do I need to add that path to
the classpath? When I tried giving a wrong path, no error was reported.

Aren't those parameters all configurable? I mean ones like io.sort.mb,
mapred.reduce.tasks, io.sort.factor, etc.

Thanks.
-Gang






Re: configuration file

2010-02-04 Thread Amogh Vasekar
Hi,
A shot in the dark, is the conf file in your classpath? If yes, are the 
parameters you are trying to override marked final?
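
To illustrate the second question: a property marked final in one
configuration resource keeps its value when a later resource tries to
override it. The sketch below is a simplified model of that behaviour and
not Hadoop's Configuration class; the class FinalPropertyDemo and its
methods are made up for illustration.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class FinalPropertyDemo {
    private final Map<String, String> values = new HashMap<>();
    private final Set<String> finals = new HashSet<>();

    /** Applies one property from a resource; resources are loaded in order. */
    public void load(String name, String value, boolean isFinal) {
        if (finals.contains(name)) {
            return; // an earlier resource marked it final, so ignore the override
        }
        values.put(name, value);
        if (isFinal) {
            finals.add(name);
        }
    }

    public String get(String name) {
        return values.get(name);
    }

    public static void main(String[] args) {
        FinalPropertyDemo conf = new FinalPropertyDemo();
        conf.load("io.sort.mb", "100", true);          // site file, marked final
        conf.load("mapred.reduce.tasks", "1", false);  // site file, not final
        conf.load("io.sort.mb", "200", false);         // job file, ignored
        conf.load("mapred.reduce.tasks", "8", false);  // job file, applied
        System.out.println(conf.get("io.sort.mb"));
        System.out.println(conf.get("mapred.reduce.tasks"));
    }
}
```

If the values printed by conf.get() in a job match the site files rather than
the file passed on the command line, a final marker is one thing to check for.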

Amogh






configuration file

2010-02-03 Thread Gang Luo
Hi,
I am writing a script to run a whole bunch of jobs automatically, but the
configuration file doesn't seem to be working. I think there is something wrong
in my command.

The command in my script is like:
bin/hadoop jar myJarFile myClass -conf myConfigurationFile.xml  arg1  arg2

I use conf.get() to show the values of some parameters, but the values are not
what I defined in that XML file. Is there something wrong?

Thanks.
-Gang


