Re: Overriding mapred.tasktracker.map.tasks.maximum with -jobconf

2009-02-18 Thread Amareshwari Sriramadasu

Yes. The configuration is read only when the TaskTracker starts.
You can see more discussion on the JIRA issue HADOOP-5170
(http://issues.apache.org/jira/browse/HADOOP-5170) about making it per-job.

-Amareshwari
jason hadoop wrote:
> I certainly hope it changes but I am unaware that it is in the todo queue at
> present.
>
> 2009/2/18 S D
>
> > Thanks Jason. That's useful information. Are you aware of plans to change
> > this so that the maximum values can be changed without restarting the
> > server?
> >
> > John
> >
> > 2009/2/18 jason hadoop
> >
> > > The .maximum values are only loaded by the Tasktrackers at server start
> > > time at present, and any changes you make will be ignored.




Re: Overriding mapred.tasktracker.map.tasks.maximum with -jobconf

2009-02-18 Thread Rasit OZDAS
I see, John.
I also use 0.19. Just a note: the -D option should come first, since it's one
of the generic options. I use it that way without any errors.
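
To illustrate the ordering, here is a sketch of a streaming invocation with the
generic options before the command options (the jar location, HDFS paths, and
mapper are illustrative, not from the thread):

```shell
# Generic options (-D, -conf, -fs, -jt, ...) must precede streaming
# options such as -input/-output/-mapper, or you get the usage message.
bin/hadoop jar contrib/streaming/hadoop-streaming.jar \
  -D mapred.reduce.tasks=0 \
  -input /user/john/input \
  -output /user/john/output \
  -mapper /bin/cat
```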

Cheers,
Rasit

2009/2/18 S D 

> Thanks for your response Rasit. You may have missed a portion of my post.
>
> > On a different note, when I attempt to pass params via -D I get a usage
> message; when I use
> > -jobconf the command goes through (and works in the case of
> mapred.reduce.tasks=0 for
> > example) but I get  a deprecation warning).
>
> I'm using Hadoop 0.19.0 and -D is not working. Are you using version 0.19.0
> as well?
>
> John
>
>
> On Wed, Feb 18, 2009 at 9:14 AM, Rasit OZDAS  wrote:
>
> > John, did you try -D option instead of -jobconf,
> >
> > I had -D option in my code, I changed it with -jobconf, this is what I
> get:
> >
> > ...
> > ...
> > Options:
> >  -input DFS input file(s) for the Map step
> >  -outputDFS output directory for the Reduce step
> >  -mapper The streaming command to run
> >  -combiner  Combiner has to be a Java class
> >  -reducerThe streaming command to run
> >  -file  File/dir to be shipped in the Job jar file
> >  -inputformat
> > TextInputFormat(default)|SequenceFileAsTextInputFormat|JavaClassName
> > Optional.
> >  -outputformat TextOutputFormat(default)|JavaClassName  Optional.
> >  -partitioner JavaClassName  Optional.
> >  -numReduceTasks   Optional.
> >  -inputreader   Optional.
> >  -cmdenv   =Optional. Pass env.var to streaming commands
> >  -mapdebug   Optional. To run this script when a map task fails
> >  -reducedebug   Optional. To run this script when a reduce task
> fails
> >
> >  -verbose
> >
> > Generic options supported are
> > -conf  specify an application configuration file
> > -D use value for given property
> > -fs   specify a namenode
> > -jt specify a job tracker
> > -files specify comma separated files
> to
> > be copied to the map reduce cluster
> > -libjars specify comma separated jar
> > files
> > to include in the classpath.
> > -archives specify comma separated
> > archives to be unarchived on the compute machines.
> >
> > The general command line syntax is
> > bin/hadoop command [genericOptions] [commandOptions]
> >
> > For more details about these options:
> > Use $HADOOP_HOME/bin/hadoop jar build/hadoop-streaming.jar -info
> >
> >
> >
> > I think -jobconf is not used in v.0.19 .
> >
> > 2009/2/18 S D 
> >
> > > I'm having trouble overriding the maximum number of map tasks that run
> on
> > a
> > > given machine in my cluster. The default value of
> > > mapred.tasktracker.map.tasks.maximum is set to 2 in hadoop-default.xml.
> > > When
> > > running my job I passed
> > >
> > > -jobconf mapred.tasktracker.map.tasks.maximum=1
> > >
> > > to limit map tasks to one per machine but each machine was still
> > allocated
> > > 2
> > > map tasks (simultaneously).  The only way I was able to guarantee a
> > maximum
> > > of one map task per machine was to change the value of the property in
> > > hadoop-site.xml. This is unsatisfactory since I'll often be changing
> the
> > > maximum on a per job basis. Any hints?
> > >
> > > On a different note, when I attempt to pass params via -D I get a usage
> > > message; when I use -jobconf the command goes through (and works in the
> > > case
> > > of mapred.reduce.tasks=0 for example) but I get  a deprecation
> warning).
> > >
> > > Thanks,
> > > John
> > >
> >
> >
> >
> > --
> > M. Raşit ÖZDAŞ
> >
>



-- 
M. Raşit ÖZDAŞ


Re: Overriding mapred.tasktracker.map.tasks.maximum with -jobconf

2009-02-18 Thread jason hadoop
I certainly hope it changes but I am unaware that it is in the todo queue at
present.

2009/2/18 S D 

> Thanks Jason. That's useful information. Are you aware of plans to change
> this so that the maximum values can be changed without restarting the
> server?
>
> John
>
> 2009/2/18 jason hadoop 
>
> > The .maximum values are only loaded by the Tasktrackers at server start
> > time
> > at present, and any changes you make will be ignored.
> >
> >
> > 2009/2/18 S D 
> >
> > > Thanks for your response Rasit. You may have missed a portion of my
> post.
> > >
> > > > On a different note, when I attempt to pass params via -D I get a
> usage
> > > message; when I use
> > > > -jobconf the command goes through (and works in the case of
> > > mapred.reduce.tasks=0 for
> > > > example) but I get  a deprecation warning).
> > >
> > > I'm using Hadoop 0.19.0 and -D is not working. Are you using version
> > 0.19.0
> > > as well?
> > >
> > > John
> > >
> > >
> > > On Wed, Feb 18, 2009 at 9:14 AM, Rasit OZDAS 
> > wrote:
> > >
> > > > John, did you try -D option instead of -jobconf,
> > > >
> > > > I had -D option in my code, I changed it with -jobconf, this is what
> I
> > > get:
> > > >
> > > > ...
> > > > ...
> > > > Options:
> > > >  -input DFS input file(s) for the Map step
> > > >  -outputDFS output directory for the Reduce step
> > > >  -mapper The streaming command to run
> > > >  -combiner  Combiner has to be a Java class
> > > >  -reducerThe streaming command to run
> > > >  -file  File/dir to be shipped in the Job jar file
> > > >  -inputformat
> > > > TextInputFormat(default)|SequenceFileAsTextInputFormat|JavaClassName
> > > > Optional.
> > > >  -outputformat TextOutputFormat(default)|JavaClassName  Optional.
> > > >  -partitioner JavaClassName  Optional.
> > > >  -numReduceTasks   Optional.
> > > >  -inputreader   Optional.
> > > >  -cmdenv   =Optional. Pass env.var to streaming commands
> > > >  -mapdebug   Optional. To run this script when a map task fails
> > > >  -reducedebug   Optional. To run this script when a reduce task
> > > fails
> > > >
> > > >  -verbose
> > > >
> > > > Generic options supported are
> > > > -conf  specify an application configuration
> > file
> > > > -D use value for given property
> > > > -fs   specify a namenode
> > > > -jt specify a job tracker
> > > > -files specify comma separated
> files
> > > to
> > > > be copied to the map reduce cluster
> > > > -libjars specify comma separated
> jar
> > > > files
> > > > to include in the classpath.
> > > > -archives specify comma
> separated
> > > > archives to be unarchived on the compute machines.
> > > >
> > > > The general command line syntax is
> > > > bin/hadoop command [genericOptions] [commandOptions]
> > > >
> > > > For more details about these options:
> > > > Use $HADOOP_HOME/bin/hadoop jar build/hadoop-streaming.jar -info
> > > >
> > > >
> > > >
> > > > I think -jobconf is not used in v.0.19 .
> > > >
> > > > 2009/2/18 S D 
> > > >
> > > > > I'm having trouble overriding the maximum number of map tasks that
> > run
> > > on
> > > > a
> > > > > given machine in my cluster. The default value of
> > > > > mapred.tasktracker.map.tasks.maximum is set to 2 in
> > hadoop-default.xml.
> > > > > When
> > > > > running my job I passed
> > > > >
> > > > > -jobconf mapred.tasktracker.map.tasks.maximum=1
> > > > >
> > > > > to limit map tasks to one per machine but each machine was still
> > > > allocated
> > > > > 2
> > > > > map tasks (simultaneously).  The only way I was able to guarantee a
> > > > maximum
> > > > > of one map task per machine was to change the value of the property
> > in
> > > > > hadoop-site.xml. This is unsatisfactory since I'll often be
> changing
> > > the
> > > > > maximum on a per job basis. Any hints?
> > > > >
> > > > > On a different note, when I attempt to pass params via -D I get a
> > usage
> > > > > message; when I use -jobconf the command goes through (and works in
> > the
> > > > > case
> > > > > of mapred.reduce.tasks=0 for example) but I get  a deprecation
> > > warning).
> > > > >
> > > > > Thanks,
> > > > > John
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > M. Raşit ÖZDAŞ
> > > >
> > >
> >
>


Re: Overriding mapred.tasktracker.map.tasks.maximum with -jobconf

2009-02-18 Thread S D
Thanks Jason. That's useful information. Are you aware of plans to change
this so that the maximum values can be changed without restarting the
server?

John

2009/2/18 jason hadoop 

> The .maximum values are only loaded by the Tasktrackers at server start
> time
> at present, and any changes you make will be ignored.
>
>
> 2009/2/18 S D 
>
> > Thanks for your response Rasit. You may have missed a portion of my post.
> >
> > > On a different note, when I attempt to pass params via -D I get a usage
> > message; when I use
> > > -jobconf the command goes through (and works in the case of
> > mapred.reduce.tasks=0 for
> > > example) but I get  a deprecation warning).
> >
> > I'm using Hadoop 0.19.0 and -D is not working. Are you using version
> 0.19.0
> > as well?
> >
> > John
> >
> >
> > On Wed, Feb 18, 2009 at 9:14 AM, Rasit OZDAS 
> wrote:
> >
> > > John, did you try -D option instead of -jobconf,
> > >
> > > I had -D option in my code, I changed it with -jobconf, this is what I
> > get:
> > >
> > > ...
> > > ...
> > > Options:
> > >  -input DFS input file(s) for the Map step
> > >  -outputDFS output directory for the Reduce step
> > >  -mapper The streaming command to run
> > >  -combiner  Combiner has to be a Java class
> > >  -reducerThe streaming command to run
> > >  -file  File/dir to be shipped in the Job jar file
> > >  -inputformat
> > > TextInputFormat(default)|SequenceFileAsTextInputFormat|JavaClassName
> > > Optional.
> > >  -outputformat TextOutputFormat(default)|JavaClassName  Optional.
> > >  -partitioner JavaClassName  Optional.
> > >  -numReduceTasks   Optional.
> > >  -inputreader   Optional.
> > >  -cmdenv   =Optional. Pass env.var to streaming commands
> > >  -mapdebug   Optional. To run this script when a map task fails
> > >  -reducedebug   Optional. To run this script when a reduce task
> > fails
> > >
> > >  -verbose
> > >
> > > Generic options supported are
> > > -conf  specify an application configuration
> file
> > > -D use value for given property
> > > -fs   specify a namenode
> > > -jt specify a job tracker
> > > -files specify comma separated files
> > to
> > > be copied to the map reduce cluster
> > > -libjars specify comma separated jar
> > > files
> > > to include in the classpath.
> > > -archives specify comma separated
> > > archives to be unarchived on the compute machines.
> > >
> > > The general command line syntax is
> > > bin/hadoop command [genericOptions] [commandOptions]
> > >
> > > For more details about these options:
> > > Use $HADOOP_HOME/bin/hadoop jar build/hadoop-streaming.jar -info
> > >
> > >
> > >
> > > I think -jobconf is not used in v.0.19 .
> > >
> > > 2009/2/18 S D 
> > >
> > > > I'm having trouble overriding the maximum number of map tasks that
> run
> > on
> > > a
> > > > given machine in my cluster. The default value of
> > > > mapred.tasktracker.map.tasks.maximum is set to 2 in
> hadoop-default.xml.
> > > > When
> > > > running my job I passed
> > > >
> > > > -jobconf mapred.tasktracker.map.tasks.maximum=1
> > > >
> > > > to limit map tasks to one per machine but each machine was still
> > > allocated
> > > > 2
> > > > map tasks (simultaneously).  The only way I was able to guarantee a
> > > maximum
> > > > of one map task per machine was to change the value of the property
> in
> > > > hadoop-site.xml. This is unsatisfactory since I'll often be changing
> > the
> > > > maximum on a per job basis. Any hints?
> > > >
> > > > On a different note, when I attempt to pass params via -D I get a
> usage
> > > > message; when I use -jobconf the command goes through (and works in
> the
> > > > case
> > > > of mapred.reduce.tasks=0 for example) but I get  a deprecation
> > warning).
> > > >
> > > > Thanks,
> > > > John
> > > >
> > >
> > >
> > >
> > > --
> > > M. Raşit ÖZDAŞ
> > >
> >
>


Re: Overriding mapred.tasktracker.map.tasks.maximum with -jobconf

2009-02-18 Thread jason hadoop
The .maximum values are only loaded by the Tasktrackers at server start time
at present, and any changes you make will be ignored.
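
So until that changes, picking up a new .maximum value means editing
hadoop-site.xml on every tasktracker node and restarting the MapReduce
daemons. A sketch, assuming the standard scripts shipped in the 0.19 tarball:

```shell
# After editing conf/hadoop-site.xml on the tasktracker nodes,
# restart the MapReduce daemons so the new .maximum values are read.
bin/stop-mapred.sh
bin/start-mapred.sh
```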


2009/2/18 S D 

> Thanks for your response Rasit. You may have missed a portion of my post.
>
> > On a different note, when I attempt to pass params via -D I get a usage
> message; when I use
> > -jobconf the command goes through (and works in the case of
> mapred.reduce.tasks=0 for
> > example) but I get  a deprecation warning).
>
> I'm using Hadoop 0.19.0 and -D is not working. Are you using version 0.19.0
> as well?
>
> John
>
>
> On Wed, Feb 18, 2009 at 9:14 AM, Rasit OZDAS  wrote:
>
> > John, did you try -D option instead of -jobconf,
> >
> > I had -D option in my code, I changed it with -jobconf, this is what I
> get:
> >
> > ...
> > ...
> > Options:
> >  -input DFS input file(s) for the Map step
> >  -outputDFS output directory for the Reduce step
> >  -mapper The streaming command to run
> >  -combiner  Combiner has to be a Java class
> >  -reducerThe streaming command to run
> >  -file  File/dir to be shipped in the Job jar file
> >  -inputformat
> > TextInputFormat(default)|SequenceFileAsTextInputFormat|JavaClassName
> > Optional.
> >  -outputformat TextOutputFormat(default)|JavaClassName  Optional.
> >  -partitioner JavaClassName  Optional.
> >  -numReduceTasks   Optional.
> >  -inputreader   Optional.
> >  -cmdenv   =Optional. Pass env.var to streaming commands
> >  -mapdebug   Optional. To run this script when a map task fails
> >  -reducedebug   Optional. To run this script when a reduce task
> fails
> >
> >  -verbose
> >
> > Generic options supported are
> > -conf  specify an application configuration file
> > -D use value for given property
> > -fs   specify a namenode
> > -jt specify a job tracker
> > -files specify comma separated files
> to
> > be copied to the map reduce cluster
> > -libjars specify comma separated jar
> > files
> > to include in the classpath.
> > -archives specify comma separated
> > archives to be unarchived on the compute machines.
> >
> > The general command line syntax is
> > bin/hadoop command [genericOptions] [commandOptions]
> >
> > For more details about these options:
> > Use $HADOOP_HOME/bin/hadoop jar build/hadoop-streaming.jar -info
> >
> >
> >
> > I think -jobconf is not used in v.0.19 .
> >
> > 2009/2/18 S D 
> >
> > > I'm having trouble overriding the maximum number of map tasks that run
> on
> > a
> > > given machine in my cluster. The default value of
> > > mapred.tasktracker.map.tasks.maximum is set to 2 in hadoop-default.xml.
> > > When
> > > running my job I passed
> > >
> > > -jobconf mapred.tasktracker.map.tasks.maximum=1
> > >
> > > to limit map tasks to one per machine but each machine was still
> > allocated
> > > 2
> > > map tasks (simultaneously).  The only way I was able to guarantee a
> > maximum
> > > of one map task per machine was to change the value of the property in
> > > hadoop-site.xml. This is unsatisfactory since I'll often be changing
> the
> > > maximum on a per job basis. Any hints?
> > >
> > > On a different note, when I attempt to pass params via -D I get a usage
> > > message; when I use -jobconf the command goes through (and works in the
> > > case
> > > of mapred.reduce.tasks=0 for example) but I get  a deprecation
> warning).
> > >
> > > Thanks,
> > > John
> > >
> >
> >
> >
> > --
> > M. Raşit ÖZDAŞ
> >
>


Re: Overriding mapred.tasktracker.map.tasks.maximum with -jobconf

2009-02-18 Thread S D
Thanks for your response Rasit. You may have missed a portion of my post.

> On a different note, when I attempt to pass params via -D I get a usage
message; when I use
> -jobconf the command goes through (and works in the case of
mapred.reduce.tasks=0 for
> example) but I get  a deprecation warning).

I'm using Hadoop 0.19.0 and -D is not working. Are you using version 0.19.0
as well?

John


On Wed, Feb 18, 2009 at 9:14 AM, Rasit OZDAS  wrote:

> John, did you try -D option instead of -jobconf,
>
> I had -D option in my code, I changed it with -jobconf, this is what I get:
>
> ...
> ...
> Options:
>  -input DFS input file(s) for the Map step
>  -outputDFS output directory for the Reduce step
>  -mapper The streaming command to run
>  -combiner  Combiner has to be a Java class
>  -reducerThe streaming command to run
>  -file  File/dir to be shipped in the Job jar file
>  -inputformat
> TextInputFormat(default)|SequenceFileAsTextInputFormat|JavaClassName
> Optional.
>  -outputformat TextOutputFormat(default)|JavaClassName  Optional.
>  -partitioner JavaClassName  Optional.
>  -numReduceTasks   Optional.
>  -inputreader   Optional.
>  -cmdenv   =Optional. Pass env.var to streaming commands
>  -mapdebug   Optional. To run this script when a map task fails
>  -reducedebug   Optional. To run this script when a reduce task fails
>
>  -verbose
>
> Generic options supported are
> -conf  specify an application configuration file
> -D use value for given property
> -fs   specify a namenode
> -jt specify a job tracker
> -files specify comma separated files to
> be copied to the map reduce cluster
> -libjars specify comma separated jar
> files
> to include in the classpath.
> -archives specify comma separated
> archives to be unarchived on the compute machines.
>
> The general command line syntax is
> bin/hadoop command [genericOptions] [commandOptions]
>
> For more details about these options:
> Use $HADOOP_HOME/bin/hadoop jar build/hadoop-streaming.jar -info
>
>
>
> I think -jobconf is not used in v.0.19 .
>
> 2009/2/18 S D 
>
> > I'm having trouble overriding the maximum number of map tasks that run on
> a
> > given machine in my cluster. The default value of
> > mapred.tasktracker.map.tasks.maximum is set to 2 in hadoop-default.xml.
> > When
> > running my job I passed
> >
> > -jobconf mapred.tasktracker.map.tasks.maximum=1
> >
> > to limit map tasks to one per machine but each machine was still
> allocated
> > 2
> > map tasks (simultaneously).  The only way I was able to guarantee a
> maximum
> > of one map task per machine was to change the value of the property in
> > hadoop-site.xml. This is unsatisfactory since I'll often be changing the
> > maximum on a per job basis. Any hints?
> >
> > On a different note, when I attempt to pass params via -D I get a usage
> > message; when I use -jobconf the command goes through (and works in the
> > case
> > of mapred.reduce.tasks=0 for example) but I get  a deprecation warning).
> >
> > Thanks,
> > John
> >
>
>
>
> --
> M. Raşit ÖZDAŞ
>


Re: Overriding mapred.tasktracker.map.tasks.maximum with -jobconf

2009-02-18 Thread Rasit OZDAS
John, did you try -D option instead of -jobconf,

I had -D option in my code, I changed it with -jobconf, this is what I get:

...
...
Options:
  -input    <path>                DFS input file(s) for the Map step
  -output   <path>                DFS output directory for the Reduce step
  -mapper   <cmd|JavaClassName>   The streaming command to run
  -combiner <JavaClassName>       Combiner has to be a Java class
  -reducer  <cmd|JavaClassName>   The streaming command to run
  -file     <file>                File/dir to be shipped in the Job jar file
  -inputformat
TextInputFormat(default)|SequenceFileAsTextInputFormat|JavaClassName
Optional.
  -outputformat TextOutputFormat(default)|JavaClassName  Optional.
  -partitioner JavaClassName  Optional.
  -numReduceTasks <num>  Optional.
  -inputreader <spec>  Optional.
  -cmdenv   <n>=<v>    Optional. Pass env.var to streaming commands
  -mapdebug <path>  Optional. To run this script when a map task fails
  -reducedebug <path>  Optional. To run this script when a reduce task fails

  -verbose

Generic options supported are
-conf <configuration file>     specify an application configuration file
-D <property=value>            use value for given property
-fs <local|namenode:port>      specify a namenode
-jt <local|jobtracker:port>    specify a job tracker
-files <comma separated list of files>    specify comma separated files to
be copied to the map reduce cluster
-libjars <comma separated list of jars>    specify comma separated jar files
to include in the classpath.
-archives <comma separated list of archives>    specify comma separated
archives to be unarchived on the compute machines.

The general command line syntax is
bin/hadoop command [genericOptions] [commandOptions]

For more details about these options:
Use $HADOOP_HOME/bin/hadoop jar build/hadoop-streaming.jar -info



I think -jobconf is not used in v0.19.

2009/2/18 S D 

> I'm having trouble overriding the maximum number of map tasks that run on a
> given machine in my cluster. The default value of
> mapred.tasktracker.map.tasks.maximum is set to 2 in hadoop-default.xml.
> When
> running my job I passed
>
> -jobconf mapred.tasktracker.map.tasks.maximum=1
>
> to limit map tasks to one per machine but each machine was still allocated
> 2
> map tasks (simultaneously).  The only way I was able to guarantee a maximum
> of one map task per machine was to change the value of the property in
> hadoop-site.xml. This is unsatisfactory since I'll often be changing the
> maximum on a per job basis. Any hints?
>
> On a different note, when I attempt to pass params via -D I get a usage
> message; when I use -jobconf the command goes through (and works in the
> case
> of mapred.reduce.tasks=0 for example) but I get  a deprecation warning).
>
> Thanks,
> John
>



-- 
M. Raşit ÖZDAŞ


Overriding mapred.tasktracker.map.tasks.maximum with -jobconf

2009-02-18 Thread S D
I'm having trouble overriding the maximum number of map tasks that run on a
given machine in my cluster. The default value of
mapred.tasktracker.map.tasks.maximum is set to 2 in hadoop-default.xml. When
running my job I passed

-jobconf mapred.tasktracker.map.tasks.maximum=1

to limit map tasks to one per machine, but each machine was still allocated 2
map tasks (simultaneously). The only way I was able to guarantee a maximum
of one map task per machine was to change the value of the property in
hadoop-site.xml. This is unsatisfactory since I'll often be changing the
maximum on a per-job basis. Any hints?
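
For reference, the cluster-wide override that did work is the hadoop-site.xml
entry; a minimal sketch of the property block (the comment states my
understanding of when it is read, not anything the docs guarantee per-job):

```xml
<!-- hadoop-site.xml on each tasktracker node; the TaskTracker reads
     this at startup, so a restart is required after changes. -->
<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>1</value>
</property>
```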

On a different note, when I attempt to pass params via -D I get a usage
message; when I use -jobconf the command goes through (and works in the case
of mapred.reduce.tasks=0, for example) but I get a deprecation warning.

Thanks,
John