Re: Submitting and running hadoop jobs Programmatically

2011-07-27 Thread Harsh J
Madhu,

Ditch the '*' in the classpath element that has the configuration
directory. The directory ought to be on the classpath, not the files
AFAIK.

Try and let us know if it then picks up the proper config (right now,
it's using the local mode).
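
Applying that advice to the command quoted below, a corrected invocation might look like this (a sketch using the paths from the original mail; adjust them to your install):

```shell
# The conf directory itself (no trailing '/*') goes on the classpath, so the
# *-site.xml files are found as classpath resources; the jar wildcards stay.
java -cp Nectar-analytics-0.0.1-SNAPSHOT.jar:/home/hadoop/hadoop-for-nectar/hadoop-0.21.0/conf:$HADOOP_COMMON_HOME/lib/*:$HADOOP_COMMON_HOME/* \
  com.zinnia.nectar.regression.hadoop.primitive.jobs.SigmaJob input/book.csv kkk11fffrrw 1
```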

On Wed, Jul 27, 2011 at 10:25 AM, madhu phatak phatak@gmail.com wrote:
 Hi
 I am submitting the job as follows

 java -cp
  Nectar-analytics-0.0.1-SNAPSHOT.jar:/home/hadoop/hadoop-for-nectar/hadoop-0.21.0/conf/*:$HADOOP_COMMON_HOME/lib/*:$HADOOP_COMMON_HOME/*
 com.zinnia.nectar.regression.hadoop.primitive.jobs.SigmaJob input/book.csv
 kkk11fffrrw 1

 I get the log in CLI as below

 11/07/27 10:22:54 INFO security.Groups: Group mapping
 impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapping;
 cacheTimeout=30
 11/07/27 10:22:54 INFO jvm.JvmMetrics: Initializing JVM Metrics with
 processName=JobTracker, sessionId=
 11/07/27 10:22:54 INFO jvm.JvmMetrics: Cannot initialize JVM Metrics with
 processName=JobTracker, sessionId= - already initialized
 11/07/27 10:22:54 WARN mapreduce.JobSubmitter: Use GenericOptionsParser for
 parsing the arguments. Applications should implement Tool for the same.
 11/07/27 10:22:54 INFO mapreduce.JobSubmitter: Cleaning up the staging area
 file:/tmp/hadoop-hadoop/mapred/staging/hadoop-1331241340/.staging/job_local_0001

 It doesn't create any job in Hadoop.
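
As an aside, the WARN about GenericOptionsParser in the log above goes away if the driver implements Tool. A minimal sketch (class and job names here are illustrative, not taken from Nectar):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

// Hypothetical driver implementing Tool so that ToolRunner's
// GenericOptionsParser handles -conf/-D/-fs/-jt before the job args.
public class SigmaJobDriver extends Configured implements Tool {
    @Override
    public int run(String[] args) throws Exception {
        Job job = new Job(getConf());      // getConf() carries the parsed options
        job.setJarByClass(SigmaJobDriver.class);
        job.setJobName("sigma");
        // ... set mapper/reducer and input/output paths from args ...
        return job.waitForCompletion(true) ? 0 : 1;
    }

    public static void main(String[] args) throws Exception {
        System.exit(ToolRunner.run(new Configuration(), new SigmaJobDriver(), args));
    }
}
```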


-- 
Harsh J


Re: Submitting and running hadoop jobs Programmatically

2011-07-27 Thread Steve Loughran

On 27/07/11 05:55, madhu phatak wrote:

Hi
I am submitting the job as follows

java -cp
  
Nectar-analytics-0.0.1-SNAPSHOT.jar:/home/hadoop/hadoop-for-nectar/hadoop-0.21.0/conf/*:$HADOOP_COMMON_HOME/lib/*:$HADOOP_COMMON_HOME/*
com.zinnia.nectar.regression.hadoop.primitive.jobs.SigmaJob input/book.csv
kkk11fffrrw 1


My code to submit jobs (via a declarative configuration) is up online

http://smartfrog.svn.sourceforge.net/viewvc/smartfrog/trunk/core/hadoop-components/hadoop-ops/src/org/smartfrog/services/hadoop/operations/components/submitter/SubmitterImpl.java?revision=8590&view=markup

It's LGPL, but ask nicely and I'll change the header to Apache.

That code doesn't set up the classpath by pushing out more JARs (I'm 
planning to push out .groovy scripts instead), but it can also poll for 
job completion, take a timeout (useful in small test runs), and do other 
things. I currently mainly use it for testing.




Re: Submitting and running hadoop jobs Programmatically

2011-07-27 Thread madhu phatak
Thank you. I will have a look at it.

On Wed, Jul 27, 2011 at 3:28 PM, Steve Loughran ste...@apache.org wrote:

 On 27/07/11 05:55, madhu phatak wrote:

 Hi
 I am submitting the job as follows

 java -cp
  Nectar-analytics-0.0.1-SNAPSHOT.jar:/home/hadoop/hadoop-for-nectar/hadoop-0.21.0/conf/*:$HADOOP_COMMON_HOME/lib/*:$HADOOP_COMMON_HOME/*
 com.zinnia.nectar.regression.hadoop.primitive.jobs.SigmaJob input/book.csv
 kkk11fffrrw 1


 My code to submit jobs (via a declarative configuration) is up online

 http://smartfrog.svn.sourceforge.net/viewvc/smartfrog/trunk/core/hadoop-components/hadoop-ops/src/org/smartfrog/services/hadoop/operations/components/submitter/SubmitterImpl.java?revision=8590&view=markup

 It's LGPL, but ask nicely and I'll change the header to Apache.

 That code doesn't set up the classpath by pushing out more JARs (I'm
 planning to push out .groovy scripts instead), but it can also poll for job
 completion, take a timeout (useful in small test runs), and do other things.
 I currently mainly use it for testing




Re: Submitting and running hadoop jobs Programmatically

2011-07-27 Thread madhu phatak
Thank you, Harsh. I am able to run the jobs after ditching the '*'.

On Wed, Jul 27, 2011 at 11:41 AM, Harsh J ha...@cloudera.com wrote:

 Madhu,

 Ditch the '*' in the classpath element that has the configuration
 directory. The directory ought to be on the classpath, not the files
 AFAIK.

 Try and let us know if it then picks up the proper config (right now,
 it's using the local mode).


Re: Submitting and running hadoop jobs Programmatically

2011-07-26 Thread Harsh J
A simple job.submit(…) or JobClient.runJob(jobConf) submits your job
right from the Java API. Does this not work for you? If not, what
error do you face?

Forking out and launching from a system process is a bad idea unless
there's absolutely no way.

On Tue, Jul 26, 2011 at 3:28 PM, madhu phatak phatak@gmail.com wrote:
 Hi,
  I am working on a open source project
 Nectar <https://github.com/zinnia-phatak-dev/Nectar>, where
 i am trying to create the hadoop jobs depending upon the user input. I was
 using Java Process API to run the bin/hadoop shell script to submit the
 jobs. But it seems not good way because the process creation model is
 not consistent across different operating systems . Is there any better way
 to submit the jobs rather than invoking the shell script? I am using
 hadoop-0.21.0 version and i am running my program in the same user where
 hadoop is installed . Some of the older thread told if I add configuration
 files in path it will work fine . But i am not able to run in that way . So
 anyone tried this before? If So , please can you give detailed instruction
 how to achieve it . Advanced thanks for your help.

 Regards,
 Madhukara Phatak




-- 
Harsh J


RE: Submitting and running hadoop jobs Programmatically

2011-07-26 Thread Devaraj K
Hi Madhu,

   You can submit jobs programmatically from any system using the Job API.
The job submission code can be written this way:

 import org.apache.hadoop.conf.Configuration;
 import org.apache.hadoop.fs.Path;
 import org.apache.hadoop.mapreduce.Job;
 import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
 import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

 // Create a new Job
 Job job = new Job(new Configuration());
 job.setJarByClass(MyJob.class);

 // Specify various job-specific parameters
 job.setJobName("myjob");

 FileInputFormat.addInputPath(job, new Path("in"));
 FileOutputFormat.setOutputPath(job, new Path("out"));

 job.setMapperClass(MyJob.MyMapper.class);
 job.setReducerClass(MyJob.MyReducer.class);

 // Submit the job
 job.submit();



To submit this, you need to add the Hadoop jar files and configuration
files to the classpath of the application from which you want to submit the
job.

You can refer to these docs for more info on the Job API:
http://hadoop.apache.org/mapreduce/docs/current/api/org/apache/hadoop/mapreduce/Job.html



Devaraj K 

-Original Message-
From: madhu phatak [mailto:phatak@gmail.com] 
Sent: Tuesday, July 26, 2011 3:29 PM
To: common-user@hadoop.apache.org
Subject: Submitting and running hadoop jobs Programmatically

Hi,
  I am working on an open source project, Nectar
<https://github.com/zinnia-phatak-dev/Nectar>, where I am trying to create
Hadoop jobs depending upon the user input. I was using the Java Process API
to run the bin/hadoop shell script to submit the jobs, but that does not
seem like a good way, because the process creation model is not consistent
across different operating systems. Is there any better way to submit the
jobs rather than invoking the shell script? I am using hadoop-0.21.0 and
running my program as the same user under which Hadoop is installed. Some
older threads said it would work if I added the configuration files to the
path, but I am not able to run it that way. Has anyone tried this before?
If so, could you give detailed instructions on how to achieve it? Thanks in
advance for your help.

Regards,
Madhukara Phatak



Re: Submitting and running hadoop jobs Programmatically

2011-07-26 Thread madhu phatak
Hi,
 I am using the same APIs, but I am not able to run the jobs by just adding
the configuration files and jars. It never creates a job in Hadoop; it just
shows "cleaning up the staging area" and fails.

On Tue, Jul 26, 2011 at 3:46 PM, Devaraj K devara...@huawei.com wrote:





Re: Submitting and running hadoop jobs Programmatically

2011-07-26 Thread Harsh J
Madhu,

Do you get a specific error message / stack trace? Could you also
paste your JT logs?

On Tue, Jul 26, 2011 at 4:05 PM, madhu phatak phatak@gmail.com wrote:
 Hi
  I am using the same APIs but i am not able to run the jobs by just adding
 the configuration files and jars . It never create a job in Hadoop , it just
 shows cleaning up staging area and fails.


-- 
Harsh J


Re: Submitting and running hadoop jobs Programmatically

2011-07-26 Thread madhu phatak
I am using JobControl.add() to add a job, running the JobControl in a
separate thread, and using JobControl.allFinished() to see whether all
jobs have completed. Does this work the same as Job.submit()?
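The pattern described above can be sketched like this (assuming the Hadoop 0.21 mapreduce JobControl API; the class and group names are illustrative):

```java
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob;
import org.apache.hadoop.mapreduce.lib.jobcontrol.JobControl;

// Run a JobControl in its own thread and poll allFinished() for completion.
public class JobControlRunner {
    public static void runJob(Job job) throws InterruptedException {
        JobControl control = new JobControl("nectar-jobs");
        control.addJob(new ControlledJob(job, null)); // no dependent jobs
        Thread t = new Thread(control);               // JobControl is Runnable
        t.setDaemon(true);
        t.start();
        while (!control.allFinished()) {              // poll until all jobs end
            Thread.sleep(1000);
        }
        control.stop();
    }
}
```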

On Tue, Jul 26, 2011 at 4:08 PM, Harsh J ha...@cloudera.com wrote:

 Madhu,

 Do you get a specific error message / stack trace? Could you also
 paste your JT logs?




Re: Submitting and running hadoop jobs Programmatically

2011-07-26 Thread Harsh J
Yes. Internally, it calls regular submit APIs.

On Tue, Jul 26, 2011 at 4:32 PM, madhu phatak phatak@gmail.com wrote:
 I am using JobControl.add() to add a job and running job control in
 a separate thread and using JobControl.allFinished() to see all jobs
 completed or not . Is this work same as Job.submit()??


-- 
Harsh J


RE: Submitting and running hadoop jobs Programmatically

2011-07-26 Thread Devaraj K
Madhu,

 Can you check the client logs to see whether any error/exception occurs
while submitting the job?

Devaraj K 

-Original Message-
From: Harsh J [mailto:ha...@cloudera.com] 
Sent: Tuesday, July 26, 2011 5:01 PM
To: common-user@hadoop.apache.org
Subject: Re: Submitting and running hadoop jobs Programmatically

Yes. Internally, it calls regular submit APIs.

On Tue, Jul 26, 2011 at 4:32 PM, madhu phatak phatak@gmail.com wrote:
 I am using JobControl.add() to add a job and running job control in
 a separate thread and using JobControl.allFinished() to see all jobs
 completed or not . Is this work same as Job.submit()??




-- 
Harsh J



Re: Submitting and running hadoop jobs Programmatically

2011-07-26 Thread madhu phatak
Hi
I am submitting the job as follows

java -cp
 
Nectar-analytics-0.0.1-SNAPSHOT.jar:/home/hadoop/hadoop-for-nectar/hadoop-0.21.0/conf/*:$HADOOP_COMMON_HOME/lib/*:$HADOOP_COMMON_HOME/*
com.zinnia.nectar.regression.hadoop.primitive.jobs.SigmaJob input/book.csv
kkk11fffrrw 1

I get the log in CLI as below

11/07/27 10:22:54 INFO security.Groups: Group mapping
impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapping;
cacheTimeout=30
11/07/27 10:22:54 INFO jvm.JvmMetrics: Initializing JVM Metrics with
processName=JobTracker, sessionId=
11/07/27 10:22:54 INFO jvm.JvmMetrics: Cannot initialize JVM Metrics with
processName=JobTracker, sessionId= - already initialized
11/07/27 10:22:54 WARN mapreduce.JobSubmitter: Use GenericOptionsParser for
parsing the arguments. Applications should implement Tool for the same.
11/07/27 10:22:54 INFO mapreduce.JobSubmitter: Cleaning up the staging area
file:/tmp/hadoop-hadoop/mapred/staging/hadoop-1331241340/.staging/job_local_0001

It doesn't create any job in Hadoop.

On Tue, Jul 26, 2011 at 5:11 PM, Devaraj K devara...@huawei.com wrote:

 Madhu,

  Can you check the client logs, whether any error/exception is coming while
 submitting the job?

 Devaraj K
