Re: Passing Command-line Parameters to the Job Submit Command

Hemanth Yamijala Tue, 25 Sep 2012 00:10:45 -0700

By Java environment variables, do you mean the ones passed as
-Dkey=value? That's one way of passing them. I suppose another way is
to have a client-side site configuration (like mapred-site.xml) that
is in the classpath of the client app.
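
To illustrate the first option, here is a rough sketch, written in Python for brevity rather than in Hadoop's own Java. It is not Hadoop's parser, and the function name split_generic_options is made up for this example. It only shows the general idea: options of the form -Dkey=value are set aside as configuration overrides, while the remaining positional arguments (such as the 5 and 10 given to the pi example) are left for the client program.

```python
def split_generic_options(argv):
    """Separate -Dkey=value pairs from the remaining positional arguments.

    Illustrative only: this mimics the idea of -D style options, it is
    not the parser Hadoop itself uses.
    """
    overrides = {}
    remaining = []
    args = iter(argv)
    for arg in args:
        if arg == "-D":
            # Form "-D key=value": the pair is the next token.
            pair = next(args, "")
        elif arg.startswith("-D"):
            # Form "-Dkey=value": the pair follows the flag directly.
            pair = arg[2:]
        else:
            # Anything else is left for the client program itself.
            remaining.append(arg)
            continue
        key, sep, value = pair.partition("=")
        if not sep or not key:
            raise ValueError(f"expected key=value after -D, got {pair!r}")
        overrides[key] = value
    return overrides, remaining


overrides, remaining = split_generic_options(["-Dmapred.reduce.tasks=2", "5", "10"])
print(overrides)  # {'mapred.reduce.tasks': '2'}
print(remaining)  # ['5', '10']
```

Whether a given client program actually honours such options depends on how it reads its arguments, so this would need to be checked against the program's code.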


Thanks
Hemanth

On Tue, Sep 25, 2012 at 12:20 AM, Varad Meru <meru.va...@gmail.com> wrote:
> Thanks Hemanth,
>
> But in general, if we want to pass arguments to any job (not only
> PiEstimator from examples-jar) and submit the Job to the Job queue
> scheduler, by the looks of it, we might always need to use the Java
> environment variables only.
>
> Is my above assumption correct?
>
> Thanks,
> Varad
>
> On Mon, Sep 24, 2012 at 9:48 AM, Hemanth Yamijala <yhema...@gmail.com> wrote:
>
>> Varad,
>>
>> Looking at the code for the PiEstimator class which implements the
>> 'pi' example, the two arguments are mandatory and are used *before*
>> the job is submitted for execution - i.e. on the client side. In
>> particular, one of them (nSamples) is used not by the MapReduce job,
>> but by the client code (i.e. PiEstimator) to generate some input.
>>
>> Hence, I believe all of this additional work that is being done by the
>> PiEstimator class will be bypassed if we directly use the job -submit
>> command. In other words, I don't think these two ways of running the
>> job:
>>
>> - using the "hadoop jar examples pi"
>> - using hadoop job -submit
>>
>> are equivalent.
>>
>> As a general answer to your question though, if additional parameters
>> are used by the Mappers or reducers, then they will generally be set
>> as additional job specific configuration items. So, one way of using
>> them with the job -submit command will be to find out the specific
>> names of the configuration items (from code, or some other
>> documentation), and include them in the job.xml used when submitting
>> the job.
>>
>> Thanks
>> Hemanth
>>
>> On Sun, Sep 23, 2012 at 1:24 PM, Varad Meru <meru.va...@gmail.com> wrote:
>> > Hi,
>> >
>> > I want to run the PiEstimator example using the following command
>> >
>> > $hadoop job -submit pieestimatorconf.xml
>> >
>> > which contains all the info required by hadoop to run the job. E.g. the
>> > input file location, the output file location and other details.
>> >
>> >
>> > <property><name>mapred.jar</name><value>file:////Users/varadmeru/Work/Hadoop/hadoop-examples-1.0.3.jar</value></property>
>> > <property><name>mapred.map.tasks</name><value>20</value></property>
>> > <property><name>mapred.reduce.tasks</name><value>2</value></property>
>> > ...
>> > <property><name>mapred.job.name</name><value>PiEstimator</value></property>
>> > <property><name>mapred.output.dir</name><value>file:////Users/varadmeru/Work/out</value></property>
>> >
>> > Now, as we know, to run the PiEstimator, we can also use the following
>> > command
>> >
>> > $hadoop jar hadoop-examples-1.0.3.jar pi 5 10
>> >
>> > where 5 and 10 are the arguments to the main class of the PiEstimator.
>> > How can I pass the same arguments (5 and 10) using the job -submit
>> > command, through the conf. file or any other way, without changing the
>> > code of the examples to reflect the use of environment variables?
>> >
>> > Thanks in advance,
>> > Varad
>> >
>> > -----------------
>> > Varad Meru
>> > Software Engineer,
>> > Business Intelligence and Analytics,
>> > Persistent Systems and Solutions Ltd.,
>> > Pune, India.
>>
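
The suggestion earlier in the thread, to include job-specific configuration items in the job.xml used with job -submit, might be sketched roughly as follows. This is a minimal Python sketch using only the standard library. The helper name add_properties and the property name example.job.param are hypothetical; the real names would have to be found in the job's code or documentation. It also assumes the file uses the usual <configuration> root element with <property> children.

```python
import xml.etree.ElementTree as ET


def add_properties(conf_xml, extra):
    """Return conf_xml with each name/value pair in `extra` set as a <property>.

    Existing properties with the same name are updated in place;
    new ones are appended to the <configuration> root.
    """
    root = ET.fromstring(conf_xml)
    # Index the existing <property> elements by their (stripped) name.
    existing = {
        p.findtext("name", "").strip(): p for p in root.findall("property")
    }
    for name, value in extra.items():
        prop = existing.get(name)
        if prop is None:
            prop = ET.SubElement(root, "property")
            ET.SubElement(prop, "name").text = name
        value_el = prop.find("value")
        if value_el is None:
            value_el = ET.SubElement(prop, "value")
        value_el.text = str(value)
    return ET.tostring(root, encoding="unicode")


base = """<configuration>
<property><name>mapred.job.name</name><value>PiEstimator</value></property>
<property><name>mapred.reduce.tasks</name><value>2</value></property>
</configuration>"""

# "example.job.param" is a made-up name standing in for whatever
# configuration item the mappers or reducers actually read.
updated = add_properties(
    base, {"example.job.param": "10", "mapred.reduce.tasks": "4"}
)
print(updated)
```

As noted in the thread, this only helps for parameters that the mappers or reducers read from the job configuration. Arguments that the client code consumes before submission, as PiEstimator does with nSamples, are not covered by this approach.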
