Re: Passing Command-line Parameters to the Job Submit Command

2012-09-26 Thread Varad Meru
Thanks Hemanth,

Yes, the Java variables are passed as -Dkey=value. But for the arguments passed to
the main method (i.e. String[] args), I cannot find any other way to pass them
apart from hadoop jar CLASSNAME arguments. So if I have a job file, I will
compulsorily have to use the Java variables, and not the command-line arguments.
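
For reference, a minimal sketch of the usual way around this: write the driver as a
Tool so that ToolRunner/GenericOptionsParser moves -Dkey=value options into the job
Configuration instead of relying on String[] args. The class name MyDriver and the
property my.job.samples below are hypothetical, not something from the examples jar.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class MyDriver extends Configured implements Tool {

  @Override
  public int run(String[] args) throws Exception {
    // getConf() already holds anything passed as -Dkey=value, because
    // ToolRunner runs GenericOptionsParser before calling run().
    Configuration conf = getConf();
    int samples = conf.getInt("my.job.samples", 10); // hypothetical property

    Job job = new Job(conf, "my-job");
    job.setJarByClass(MyDriver.class);
    // ... set mapper/reducer classes and input/output paths here ...
    System.out.println("running with " + samples + " samples");
    return job.waitForCompletion(true) ? 0 : 1;
  }

  public static void main(String[] args) throws Exception {
    // e.g. hadoop jar myjob.jar MyDriver -D my.job.samples=5
    System.exit(ToolRunner.run(new Configuration(), new MyDriver(), args));
  }
}

With that pattern, the same value can also be shipped in the job.xml handed to
hadoop job -submit, since it is just another configuration property.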

Thanks,
Varad

On 25-Sep-2012, at 12:40 PM, Hemanth Yamijala wrote:

 By java environment variables, do you mean the ones passed as
 -Dkey=value ? That's one way of passing them. I suppose another way is
 to have a client side site configuration (like mapred-site.xml) that
 is in the classpath of the client app.
 
 Thanks
 Hemanth
 
 On Tue, Sep 25, 2012 at 12:20 AM, Varad Meru meru.va...@gmail.com wrote:
 Thanks Hemanth,
 
 But in general, if we want to pass arguments to any job (not only
 PiEstimator from examples-jar) and submit the Job to the Job queue
 scheduler, by the looks of it, we might always need to use the java
 environment variables only.
 
 Is my above assumption correct?
 
 Thanks,
 Varad
 
 On Mon, Sep 24, 2012 at 9:48 AM, Hemanth Yamijala yhema...@gmail.com wrote:
 
 Varad,
 
 Looking at the code for the PiEstimator class which implements the
 'pi' example, the two arguments are mandatory and are used *before*
 the job is submitted for execution - i.e on the client side. In
 particular, one of them (nSamples) is used not by the MapReduce job,
 but by the client code (i.e. PiEstimator) to generate some input.
 
 Hence, I believe all of this additional work that is being done by the
 PiEstimator class will be bypassed if we directly use the job -submit
 command. In other words, I don't think these two ways of running the
 job:
 
 - using the hadoop jar examples pi
 - using hadoop job -submit
 
 are equivalent.
 
 As a general answer to your question though, if additional parameters
 are used by the Mappers or reducers, then they will generally be set
 as additional job specific configuration items. So, one way of using
 them with the job -submit command will be to find out the specific
 names of the configuration items (from code, or some other
 documentation), and include them in the job.xml used when submitting
 the job.
 
 Thanks
 Hemanth
 
 On Sun, Sep 23, 2012 at 1:24 PM, Varad Meru meru.va...@gmail.com wrote:
 Hi,
 
 I want to run the PiEstimator example using the following command
 
 $hadoop job -submit pieestimatorconf.xml
 
 which contains all the info required by hadoop to run the job. E.g. the
 input file location, the output file location and other details.
 
 
 <property><name>mapred.jar</name><value>file:Users/varadmeru/Work/Hadoop/hadoop-examples-1.0.3.jar</value></property>
 <property><name>mapred.map.tasks</name><value>20</value></property>
 <property><name>mapred.reduce.tasks</name><value>2</value></property>
 ...
 <property><name>mapred.job.name</name><value>PiEstimator</value></property>
 <property><name>mapred.output.dir</name><value>file:Users/varadmeru/Work/out</value></property>
 
 Now, as we know, to run the PiEstimator, we can use the following command too
 
 $hadoop jar hadoop-examples-1.0.3.jar pi 5 10
 
 where 5 and 10 are the arguments to the main class of the PiEstimator.
 How
 can I pass the same arguments (5 and 10) using the job -submit command
 through conf. file or any other way, without changing the code of the
 examples to reflect the use of environment variables.
 
 Thanks in advance,
 Varad
 
 -
 Varad Meru
 Software Engineer,
 Business Intelligence and Analytics,
 Persistent Systems and Solutions Ltd.,
 Pune, India.
 



Re: Passing Command-line Parameters to the Job Submit Command

2012-09-25 Thread Hemanth Yamijala
By java environment variables, do you mean the ones passed as
-Dkey=value ? That's one way of passing them. I suppose another way is
to have a client side site configuration (like mapred-site.xml) that
is in the classpath of the client app.

Thanks
Hemanth

On Tue, Sep 25, 2012 at 12:20 AM, Varad Meru meru.va...@gmail.com wrote:
 Thanks Hemanth,

 But in general, if we want to pass arguments to any job (not only
 PiEstimator from examples-jar) and submit the Job to the Job queue
 scheduler, by the looks of it, we might always need to use the java
 environment variables only.

 Is my above assumption correct?

 Thanks,
 Varad

 On Mon, Sep 24, 2012 at 9:48 AM, Hemanth Yamijala yhema...@gmail.com wrote:

 Varad,

 Looking at the code for the PiEstimator class which implements the
 'pi' example, the two arguments are mandatory and are used *before*
 the job is submitted for execution - i.e on the client side. In
 particular, one of them (nSamples) is used not by the MapReduce job,
 but by the client code (i.e. PiEstimator) to generate some input.

 Hence, I believe all of this additional work that is being done by the
 PiEstimator class will be bypassed if we directly use the job -submit
 command. In other words, I don't think these two ways of running the
 job:

 - using the hadoop jar examples pi
 - using hadoop job -submit

 are equivalent.

 As a general answer to your question though, if additional parameters
 are used by the Mappers or reducers, then they will generally be set
 as additional job specific configuration items. So, one way of using
 them with the job -submit command will be to find out the specific
 names of the configuration items (from code, or some other
 documentation), and include them in the job.xml used when submitting
 the job.

 Thanks
 Hemanth

 On Sun, Sep 23, 2012 at 1:24 PM, Varad Meru meru.va...@gmail.com wrote:
  Hi,
 
  I want to run the PiEstimator example using the following command
 
  $hadoop job -submit pieestimatorconf.xml
 
  which contains all the info required by hadoop to run the job. E.g. the
  input file location, the output file location and other details.
 
 
  <property><name>mapred.jar</name><value>file:Users/varadmeru/Work/Hadoop/hadoop-examples-1.0.3.jar</value></property>
  <property><name>mapred.map.tasks</name><value>20</value></property>
  <property><name>mapred.reduce.tasks</name><value>2</value></property>
  ...
  <property><name>mapred.job.name</name><value>PiEstimator</value></property>
  <property><name>mapred.output.dir</name><value>file:Users/varadmeru/Work/out</value></property>
 
  Now, as we know, to run the PiEstimator, we can use the following command too
 
  $hadoop jar hadoop-examples-1.0.3.jar pi 5 10
 
  where 5 and 10 are the arguments to the main class of the PiEstimator.
 How
  can I pass the same arguments (5 and 10) using the job -submit command
  through conf. file or any other way, without changing the code of the
  examples to reflect the use of environment variables.
 
  Thanks in advance,
  Varad
 
  -
  Varad Meru
  Software Engineer,
  Business Intelligence and Analytics,
  Persistent Systems and Solutions Ltd.,
  Pune, India.



Re: Passing Command-line Parameters to the Job Submit Command

2012-09-25 Thread Bertrand Dechoux
Building on Hemanth's answer: in the end, your variables should be in the
job.xml (the second file, alongside the jar, needed to run a job). Building this
job.xml can be done in various ways; it inherits from your local configuration
and you can change it using the Java API, but in the end it is only an XML file,
so your hands are not tied.
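
As a rough sketch of that (reusing the property names from earlier in the thread;
my.job.samples is an invented, job-specific key), a small client program can start
from the local configuration, override what it wants, and serialize the result to
an XML file that hadoop job -submit can take:

import java.io.FileOutputStream;
import java.io.OutputStream;

import org.apache.hadoop.conf.Configuration;

public class WriteJobXml {
  public static void main(String[] args) throws Exception {
    // Starts from the *-site.xml files on the client classpath.
    Configuration conf = new Configuration();
    conf.set("mapred.job.name", "PiEstimator");
    conf.set("mapred.jar", "file:Users/varadmeru/Work/Hadoop/hadoop-examples-1.0.3.jar");
    conf.setInt("mapred.map.tasks", 20);
    conf.setInt("mapred.reduce.tasks", 2);
    conf.set("my.job.samples", "10"); // hypothetical job-specific parameter

    // Dumps every property as <property><name>...</name><value>...</value></property>.
    OutputStream out = new FileOutputStream("pieestimatorconf.xml");
    try {
      conf.writeXml(out);
    } finally {
      out.close();
    }
  }
}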

I know there is a job file that you can provide with the shell command:
http://hadoop.apache.org/docs/r1.0.3/commands_manual.html#job

But I haven't used it yet, so I can't tell you more about this option.

Regards

Bertrand

On Tue, Sep 25, 2012 at 9:10 AM, Hemanth Yamijala yhema...@gmail.com wrote:

 By java environment variables, do you mean the ones passed as
 -Dkey=value ? That's one way of passing them. I suppose another way is
 to have a client side site configuration (like mapred-site.xml) that
 is in the classpath of the client app.

 Thanks
 Hemanth

 On Tue, Sep 25, 2012 at 12:20 AM, Varad Meru meru.va...@gmail.com wrote:
  Thanks Hemanth,
 
  But in general, if we want to pass arguments to any job (not only
  PiEstimator from examples-jar) and submit the Job to the Job queue
  scheduler, by the looks of it, we might always need to use the java
  environment variables only.
 
  Is my above assumption correct?
 
  Thanks,
  Varad
 
  On Mon, Sep 24, 2012 at 9:48 AM, Hemanth Yamijala yhema...@gmail.com
 wrote:
 
  Varad,
 
  Looking at the code for the PiEstimator class which implements the
  'pi' example, the two arguments are mandatory and are used *before*
  the job is submitted for execution - i.e on the client side. In
  particular, one of them (nSamples) is used not by the MapReduce job,
  but by the client code (i.e. PiEstimator) to generate some input.
 
  Hence, I believe all of this additional work that is being done by the
  PiEstimator class will be bypassed if we directly use the job -submit
  command. In other words, I don't think these two ways of running the
  job:
 
  - using the hadoop jar examples pi
  - using hadoop job -submit
 
  are equivalent.
 
  As a general answer to your question though, if additional parameters
  are used by the Mappers or reducers, then they will generally be set
  as additional job specific configuration items. So, one way of using
  them with the job -submit command will be to find out the specific
  names of the configuration items (from code, or some other
  documentation), and include them in the job.xml used when submitting
  the job.
 
  Thanks
  Hemanth
 
  On Sun, Sep 23, 2012 at 1:24 PM, Varad Meru meru.va...@gmail.com
 wrote:
   Hi,
  
   I want to run the PiEstimator example using the following command
  
   $hadoop job -submit pieestimatorconf.xml
  
   which contains all the info required by hadoop to run the job. E.g.
 the
   input file location, the output file location and other details.
  
  
 
   <property><name>mapred.jar</name><value>file:Users/varadmeru/Work/Hadoop/hadoop-examples-1.0.3.jar</value></property>
   <property><name>mapred.map.tasks</name><value>20</value></property>
   <property><name>mapred.reduce.tasks</name><value>2</value></property>
   ...
   <property><name>mapred.job.name</name><value>PiEstimator</value></property>
   <property><name>mapred.output.dir</name><value>file:Users/varadmeru/Work/out</value></property>
  
   Now, as we know, to run the PiEstimator, we can use the following command too
  
   $hadoop jar hadoop-examples-1.0.3.jar pi 5 10
  
   where 5 and 10 are the arguments to the main class of the PiEstimator.
  How
   can I pass the same arguments (5 and 10) using the job -submit command
   through conf. file or any other way, without changing the code of the
   examples to reflect the use of environment variables.
  
   Thanks in advance,
   Varad
  
   -
   Varad Meru
   Software Engineer,
   Business Intelligence and Analytics,
   Persistent Systems and Solutions Ltd.,
   Pune, India.
 




-- 
Bertrand Dechoux


Re: Passing Command-line Parameters to the Job Submit Command

2012-09-25 Thread Mohit Anchlia
You could always write your own properties file and read it as a resource.
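
For example (a hedged sketch; my-job-params.xml and my.job.samples are made-up
names), the client can layer such a file on top of the default configuration:

import org.apache.hadoop.conf.Configuration;

public class LoadParams {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // Looks the file up on the client classpath; it uses the same
    // <property><name>...</name><value>...</value></property> format.
    conf.addResource("my-job-params.xml");
    // Or from an explicit path:
    //   conf.addResource(new org.apache.hadoop.fs.Path("/local/path/my-job-params.xml"));
    System.out.println(conf.get("my.job.samples", "not set"));
  }
}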

On Tue, Sep 25, 2012 at 12:10 AM, Hemanth Yamijala yhema...@gmail.com wrote:

 By java environment variables, do you mean the ones passed as
 -Dkey=value ? That's one way of passing them. I suppose another way is
 to have a client side site configuration (like mapred-site.xml) that
 is in the classpath of the client app.

 Thanks
 Hemanth

 On Tue, Sep 25, 2012 at 12:20 AM, Varad Meru meru.va...@gmail.com wrote:
  Thanks Hemanth,
 
  But in general, if we want to pass arguments to any job (not only
  PiEstimator from examples-jar) and submit the Job to the Job queue
  scheduler, by the looks of it, we might always need to use the java
  environment variables only.
 
  Is my above assumption correct?
 
  Thanks,
  Varad
 
  On Mon, Sep 24, 2012 at 9:48 AM, Hemanth Yamijala yhema...@gmail.com
 wrote:
 
  Varad,
 
  Looking at the code for the PiEstimator class which implements the
  'pi' example, the two arguments are mandatory and are used *before*
  the job is submitted for execution - i.e on the client side. In
  particular, one of them (nSamples) is used not by the MapReduce job,
  but by the client code (i.e. PiEstimator) to generate some input.
 
  Hence, I believe all of this additional work that is being done by the
  PiEstimator class will be bypassed if we directly use the job -submit
  command. In other words, I don't think these two ways of running the
  job:
 
  - using the hadoop jar examples pi
  - using hadoop job -submit
 
  are equivalent.
 
  As a general answer to your question though, if additional parameters
  are used by the Mappers or reducers, then they will generally be set
  as additional job specific configuration items. So, one way of using
  them with the job -submit command will be to find out the specific
  names of the configuration items (from code, or some other
  documentation), and include them in the job.xml used when submitting
  the job.
 
  Thanks
  Hemanth
 
  On Sun, Sep 23, 2012 at 1:24 PM, Varad Meru meru.va...@gmail.com
 wrote:
   Hi,
  
   I want to run the PiEstimator example using the following command
  
   $hadoop job -submit pieestimatorconf.xml
  
   which contains all the info required by hadoop to run the job. E.g.
 the
   input file location, the output file location and other details.
  
  
 
   <property><name>mapred.jar</name><value>file:Users/varadmeru/Work/Hadoop/hadoop-examples-1.0.3.jar</value></property>
   <property><name>mapred.map.tasks</name><value>20</value></property>
   <property><name>mapred.reduce.tasks</name><value>2</value></property>
   ...
   <property><name>mapred.job.name</name><value>PiEstimator</value></property>
   <property><name>mapred.output.dir</name><value>file:Users/varadmeru/Work/out</value></property>
  
   Now, as we know, to run the PiEstimator, we can use the following command too
  
   $hadoop jar hadoop-examples-1.0.3.jar pi 5 10
  
   where 5 and 10 are the arguments to the main class of the PiEstimator.
  How
   can I pass the same arguments (5 and 10) using the job -submit command
   through conf. file or any other way, without changing the code of the
   examples to reflect the use of environment variables.
  
   Thanks in advance,
   Varad
  
   -
   Varad Meru
   Software Engineer,
   Business Intelligence and Analytics,
   Persistent Systems and Solutions Ltd.,
   Pune, India.
 



Re: Passing Command-line Parameters to the Job Submit Command

2012-09-24 Thread Varad Meru
Thanks Hemanth,

But in general, if we want to pass arguments to any job (not only
PiEstimator from examples-jar) and submit the Job to the Job queue
scheduler, by the looks of it, we might always need to use the java
environment variables only.

Is my above assumption correct?

Thanks,
Varad

On Mon, Sep 24, 2012 at 9:48 AM, Hemanth Yamijala yhema...@gmail.com wrote:

 Varad,

 Looking at the code for the PiEstimator class which implements the
 'pi' example, the two arguments are mandatory and are used *before*
 the job is submitted for execution - i.e on the client side. In
 particular, one of them (nSamples) is used not by the MapReduce job,
 but by the client code (i.e. PiEstimator) to generate some input.

 Hence, I believe all of this additional work that is being done by the
 PiEstimator class will be bypassed if we directly use the job -submit
 command. In other words, I don't think these two ways of running the
 job:

 - using the hadoop jar examples pi
 - using hadoop job -submit

 are equivalent.

 As a general answer to your question though, if additional parameters
 are used by the Mappers or reducers, then they will generally be set
 as additional job specific configuration items. So, one way of using
 them with the job -submit command will be to find out the specific
 names of the configuration items (from code, or some other
 documentation), and include them in the job.xml used when submitting
 the job.

 Thanks
 Hemanth

 On Sun, Sep 23, 2012 at 1:24 PM, Varad Meru meru.va...@gmail.com wrote:
  Hi,
 
  I want to run the PiEstimator example using the following command
 
  $hadoop job -submit pieestimatorconf.xml
 
  which contains all the info required by hadoop to run the job. E.g. the
  input file location, the output file location and other details.
 
 
  <property><name>mapred.jar</name><value>file:Users/varadmeru/Work/Hadoop/hadoop-examples-1.0.3.jar</value></property>
  <property><name>mapred.map.tasks</name><value>20</value></property>
  <property><name>mapred.reduce.tasks</name><value>2</value></property>
  ...
  <property><name>mapred.job.name</name><value>PiEstimator</value></property>
  <property><name>mapred.output.dir</name><value>file:Users/varadmeru/Work/out</value></property>
 
  Now, as we know, to run the PiEstimator, we can use the following command too
 
  $hadoop jar hadoop-examples-1.0.3.jar pi 5 10
 
  where 5 and 10 are the arguments to the main class of the PiEstimator.
 How
  can I pass the same arguments (5 and 10) using the job -submit command
  through conf. file or any other way, without changing the code of the
  examples to reflect the use of environment variables.
 
  Thanks in advance,
  Varad
 
  -
  Varad Meru
  Software Engineer,
  Business Intelligence and Analytics,
  Persistent Systems and Solutions Ltd.,
  Pune, India.



Passing Command-line Parameters to the Job Submit Command

2012-09-23 Thread Varad Meru
Hi,

I want to run the PiEstimator example using the following command

$hadoop job -submit pieestimatorconf.xml

which contains all the info required by hadoop to run the job. E.g. the
input file location, the output file location and other details.

<property><name>mapred.jar</name><value>file:Users/varadmeru/Work/Hadoop/hadoop-examples-1.0.3.jar</value></property>
<property><name>mapred.map.tasks</name><value>20</value></property>
<property><name>mapred.reduce.tasks</name><value>2</value></property>
...
<property><name>mapred.job.name</name><value>PiEstimator</value></property>
<property><name>mapred.output.dir</name><value>file:Users/varadmeru/Work/out</value></property>

Now, as we know, to run the PiEstimator, we can use the following command too

$hadoop jar hadoop-examples-1.0.3.jar pi 5 10

where 5 and 10 are the arguments to the main class of the PiEstimator. How
can I pass the same arguments (5 and 10) using the job -submit command
through conf. file or any other way, without changing the code of the
examples to reflect the use of environment variables.

Thanks in advance,
Varad

-
Varad Meru
Software Engineer,
Business Intelligence and Analytics,
Persistent Systems and Solutions Ltd.,
Pune, India.


Re: Passing Command-line Parameters to the Job Submit Command

2012-09-23 Thread Hemanth Yamijala
Varad,

Looking at the code for the PiEstimator class which implements the
'pi' example, the two arguments are mandatory and are used *before*
the job is submitted for execution - i.e on the client side. In
particular, one of them (nSamples) is used not by the MapReduce job,
but by the client code (i.e. PiEstimator) to generate some input.
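
(For readers who want to see the shape of that client-side work, here is a
simplified sketch of the pattern of pre-generating one small input file per map
task before submitting a job. It is an illustration only, not PiEstimator's
actual code, and the generated-input path is made up.)

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.SequenceFile;

public class GenerateInputs {
  public static void main(String[] args) throws Exception {
    int numMaps = Integer.parseInt(args[0]);    // e.g. 5
    long numSamples = Long.parseLong(args[1]);  // e.g. 10
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    Path inDir = new Path("generated-input");   // hypothetical input directory

    for (int i = 0; i < numMaps; i++) {
      Path file = new Path(inDir, "part" + i);
      // key = offset into the sample space, value = number of samples for this map
      SequenceFile.Writer writer = SequenceFile.createWriter(
          fs, conf, file, LongWritable.class, LongWritable.class);
      try {
        writer.append(new LongWritable(i * numSamples), new LongWritable(numSamples));
      } finally {
        writer.close();
      }
    }
  }
}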

Hence, I believe all of this additional work that is being done by the
PiEstimator class will be bypassed if we directly use the job -submit
command. In other words, I don't think these two ways of running the
job:

- using the hadoop jar examples pi
- using hadoop job -submit

are equivalent.

As a general answer to your question though, if additional parameters
are used by the Mappers or reducers, then they will generally be set
as additional job specific configuration items. So, one way of using
them with the job -submit command will be to find out the specific
names of the configuration items (from code, or some other
documentation), and include them in the job.xml used when submitting
the job.
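
Concretely (a hypothetical sketch; my.job.threshold is an invented property name,
not one used by the examples), the mapper side of such a job-specific configuration
item would look something like this, with a matching
<property><name>my.job.threshold</name><value>42</value></property> entry in the
job.xml used for submission:

import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class ThresholdMapper extends Mapper<LongWritable, Text, Text, LongWritable> {

  private int threshold;

  @Override
  protected void setup(Context context) {
    // Reads the job-specific item that was shipped in the job.xml.
    threshold = context.getConfiguration().getInt("my.job.threshold", 0);
  }

  @Override
  protected void map(LongWritable key, Text value, Context context)
      throws IOException, InterruptedException {
    // Emit only the lines longer than the configured threshold.
    if (value.getLength() > threshold) {
      context.write(value, key);
    }
  }
}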

Thanks
Hemanth

On Sun, Sep 23, 2012 at 1:24 PM, Varad Meru meru.va...@gmail.com wrote:
 Hi,

 I want to run the PiEstimator example using the following command

 $hadoop job -submit pieestimatorconf.xml

 which contains all the info required by hadoop to run the job. E.g. the
 input file location, the output file location and other details.

 <property><name>mapred.jar</name><value>file:Users/varadmeru/Work/Hadoop/hadoop-examples-1.0.3.jar</value></property>
 <property><name>mapred.map.tasks</name><value>20</value></property>
 <property><name>mapred.reduce.tasks</name><value>2</value></property>
 ...
 <property><name>mapred.job.name</name><value>PiEstimator</value></property>
 <property><name>mapred.output.dir</name><value>file:Users/varadmeru/Work/out</value></property>

 Now, as we know, to run the PiEstimator, we can use the following command too

 $hadoop jar hadoop-examples-1.0.3.jar pi 5 10

 where 5 and 10 are the arguments to the main class of the PiEstimator. How
 can I pass the same arguments (5 and 10) using the job -submit command
 through conf. file or any other way, without changing the code of the
 examples to reflect the use of environment variables.

 Thanks in advance,
 Varad

 -
 Varad Meru
 Software Engineer,
 Business Intelligence and Analytics,
 Persistent Systems and Solutions Ltd.,
 Pune, India.