Re: Submitting and running hadoop jobs Programmatically
Madhu,

Ditch the '*' in the classpath element that has the configuration directory. The directory itself ought to be on the classpath, not the individual files, AFAIK. Try that and let us know whether it then picks up the proper config (right now it is using local mode).

On Wed, Jul 27, 2011 at 10:25 AM, madhu phatak phatak@gmail.com wrote:
> Hi,
> I am submitting the job as follows:
>
> java -cp Nectar-analytics-0.0.1-SNAPSHOT.jar:/home/hadoop/hadoop-for-nectar/hadoop-0.21.0/conf/*:$HADOOP_COMMON_HOME/lib/*:$HADOOP_COMMON_HOME/* com.zinnia.nectar.regression.hadoop.primitive.jobs.SigmaJob input/book.csv kkk11fffrrw 1
>
> The client log ends with mapreduce.JobSubmitter cleaning up the staging area
> file:/tmp/hadoop-hadoop/mapred/staging/hadoop-1331241340/.staging/job_local_0001
> and it doesn't create any job in Hadoop.
> [...]

--
Harsh J
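Harsh's advice works because of how the JVM expands classpath wildcards: a "dir/*" entry is replaced by the JAR files in that directory and nothing else, so loose XML config files are invisible to it, and the conf directory itself must be listed for Configuration to find the *-site.xml files. A minimal pure-JDK sketch of that expansion rule (expandWildcard is our own illustrative helper, not JVM or Hadoop code):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class WildcardDemo {
    // Approximates the JVM's classpath wildcard rule: "dir/*" expands to the
    // JAR files (.jar, case-insensitive) in dir and nothing else. Loose .xml
    // files are never added, which is why "conf/*" hides the *-site.xml files.
    static List<String> expandWildcard(List<String> dirEntries) {
        List<String> jars = new ArrayList<>();
        for (String name : dirEntries) {
            if (name.toLowerCase().endsWith(".jar")) {
                jars.add(name);
            }
        }
        return jars;
    }

    public static void main(String[] args) {
        // What a Hadoop conf directory typically contains:
        List<String> confDir = Arrays.asList(
                "core-site.xml", "hdfs-site.xml", "mapred-site.xml", "hadoop-env.sh");
        // "-cp conf/*" sees none of it, so Configuration falls back to
        // local-mode defaults; "-cp conf/" (the directory itself) is needed.
        System.out.println(expandWildcard(confDir)); // prints: []
        System.out.println(expandWildcard(Arrays.asList("hadoop-common.jar"))); // prints: [hadoop-common.jar]
    }
}
```

Under this rule the conf/* entry in the original command contributed nothing, which matches the local-mode fallback visible in the logs.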
Re: Submitting and running hadoop jobs Programmatically
On 27/07/11 05:55, madhu phatak wrote:
> Hi,
> I am submitting the job as follows:
> java -cp Nectar-analytics-0.0.1-SNAPSHOT.jar:/home/hadoop/hadoop-for-nectar/hadoop-0.21.0/conf/*:$HADOOP_COMMON_HOME/lib/*:$HADOOP_COMMON_HOME/* com.zinnia.nectar.regression.hadoop.primitive.jobs.SigmaJob input/book.csv kkk11fffrrw 1

My code to submit jobs (via a declarative configuration) is up online:

http://smartfrog.svn.sourceforge.net/viewvc/smartfrog/trunk/core/hadoop-components/hadoop-ops/src/org/smartfrog/services/hadoop/operations/components/submitter/SubmitterImpl.java?revision=8590&view=markup

It's LGPL, but ask nicely and I'll change the header to Apache. That code doesn't set up the classpath by pushing out more JARs (I'm planning to push out .groovy scripts instead), but it can also poll for job completion, take a timeout (useful in small test runs), and do other things. I currently use it mainly for testing.
Re: Submitting and running hadoop jobs Programmatically
Thank you. I will have a look at it.

On Wed, Jul 27, 2011 at 3:28 PM, Steve Loughran ste...@apache.org wrote:
> My code to submit jobs (via a declarative configuration) is up online:
> http://smartfrog.svn.sourceforge.net/viewvc/smartfrog/trunk/core/hadoop-components/hadoop-ops/src/org/smartfrog/services/hadoop/operations/components/submitter/SubmitterImpl.java?revision=8590&view=markup
> It's LGPL, but ask nicely and I'll change the header to Apache.
> [...]
Re: Submitting and running hadoop jobs Programmatically
Thank you Harsh. I am able to run the jobs by ditching the '*'.

On Wed, Jul 27, 2011 at 11:41 AM, Harsh J ha...@cloudera.com wrote:
> Ditch the '*' in the classpath element that has the configuration directory. The directory itself ought to be on the classpath, not the individual files, AFAIK. Try that and let us know whether it then picks up the proper config (right now it is using local mode).
> [...]
Re: Submitting and running hadoop jobs Programmatically
A simple job.submit() or JobClient.runJob(jobConf) submits your job right from the Java API. Does this not work for you? If not, what error do you face? Forking out and launching from a system process is a bad idea unless there's absolutely no other way.

On Tue, Jul 26, 2011 at 3:28 PM, madhu phatak phatak@gmail.com wrote:
> Hi,
> I am working on an open source project, Nectar (https://github.com/zinnia-phatak-dev/Nectar), where I am trying to create Hadoop jobs based on user input. I was using the Java Process API to run the bin/hadoop shell script to submit the jobs, but the process creation model is not consistent across operating systems. Is there a better way to submit jobs than invoking the shell script?
> [...]

--
Harsh J
RE: Submitting and running hadoop jobs Programmatically
Hi Madhu,

You can submit jobs programmatically from any system using the Job API. The job submission code can be written this way:

// Create a new Job
Job job = new Job(new Configuration());
job.setJarByClass(MyJob.class);

// Specify various job-specific parameters
job.setJobName("myjob");
FileInputFormat.addInputPath(job, new Path("in"));
FileOutputFormat.setOutputPath(job, new Path("out"));
job.setMapperClass(MyJob.MyMapper.class);
job.setReducerClass(MyJob.MyReducer.class);

// Submit the job
job.submit();

To submit this way, you need to add the Hadoop jar files and configuration files to the classpath of the application from which you want to submit the job. You can refer to these docs for more info on the Job API:
http://hadoop.apache.org/mapreduce/docs/current/api/org/apache/hadoop/mapreduce/Job.html

Devaraj K

-----Original Message-----
From: madhu phatak [mailto:phatak@gmail.com]
Sent: Tuesday, July 26, 2011 3:29 PM
To: common-user@hadoop.apache.org
Subject: Submitting and running hadoop jobs Programmatically

Hi,
I am working on an open source project, Nectar (https://github.com/zinnia-phatak-dev/Nectar), where I am trying to create Hadoop jobs based on user input. I was using the Java Process API to run the bin/hadoop shell script to submit the jobs, but that seems a poor approach because the process creation model is not consistent across operating systems. Is there a better way to submit jobs than invoking the shell script?

I am using hadoop-0.21.0, and I am running my program as the same user under which Hadoop is installed. Some older threads said that adding the configuration files to the path would make it work, but I am not able to run it that way. Has anyone tried this before? If so, can you give detailed instructions on how to achieve it? Thanks in advance for your help.

Regards,
Madhukara Phatak
Re: Submitting and running hadoop jobs Programmatically
Hi,
I am using the same APIs, but I am not able to run the jobs by just adding the configuration files and jars. It never creates a job in Hadoop; it just shows "cleaning up the staging area" and fails.

On Tue, Jul 26, 2011 at 3:46 PM, Devaraj K devara...@huawei.com wrote:
> You can submit jobs programmatically from any system using the Job API. For that you need to add the Hadoop jar files and configuration files to the classpath of the application from which you submit the job.
> [...]
Re: Submitting and running hadoop jobs Programmatically
Madhu,

Do you get a specific error message / stack trace? Could you also paste your JT logs?

On Tue, Jul 26, 2011 at 4:05 PM, madhu phatak phatak@gmail.com wrote:
> I am using the same APIs, but I am not able to run the jobs by just adding the configuration files and jars. It never creates a job in Hadoop; it just shows "cleaning up the staging area" and fails.
> [...]

--
Harsh J
Re: Submitting and running hadoop jobs Programmatically
I am using JobControl.add() to add a job, running the JobControl in a separate thread, and using JobControl.allFinished() to see whether all jobs have completed. Does this work the same as Job.submit()?

On Tue, Jul 26, 2011 at 4:08 PM, Harsh J ha...@cloudera.com wrote:
> Do you get a specific error message / stack trace? Could you also paste your JT logs?
> [...]
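The setup described here (drive the jobs from a separate thread, poll until everything finishes) can be sketched without any Hadoop dependency. This is a generic illustration of the pattern only; FakeJob, allFinished, and the other names are stand-ins we made up, not the Hadoop JobControl API:

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

// Illustrative stand-in for a controlled job (NOT the Hadoop API): the real
// JobControl tracks Job states; here a flag marks completion.
class FakeJob implements Runnable {
    final AtomicBoolean done = new AtomicBoolean(false);
    public void run() {
        done.set(true); // real map/reduce work would happen before this
    }
}

public class PollingDemo {
    // Same shape as JobControl.allFinished(): true once every job is done.
    static boolean allFinished(List<FakeJob> jobs) {
        for (FakeJob j : jobs) {
            if (!j.done.get()) return false;
        }
        return true;
    }

    public static void main(String[] args) throws InterruptedException {
        List<FakeJob> jobs = Arrays.asList(new FakeJob(), new FakeJob());
        // Run the "job control" in a separate thread, as described above.
        Thread runner = new Thread(() -> jobs.forEach(FakeJob::run));
        runner.start();
        while (!allFinished(jobs)) { // the caller just polls
            Thread.sleep(10);
        }
        System.out.println("all jobs finished");
    }
}
```

The polling loop is the only contract: whether the jobs are submitted one by one via Job.submit() or as a batch, completion is observed, not awaited, which is why a timeout around such a loop (as in Steve's submitter) is useful.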
Re: Submitting and running hadoop jobs Programmatically
Yes. Internally, it calls the regular submit APIs.

On Tue, Jul 26, 2011 at 4:32 PM, madhu phatak phatak@gmail.com wrote:
> I am using JobControl.add() to add a job, running the JobControl in a separate thread, and using JobControl.allFinished() to see whether all jobs have completed. Does this work the same as Job.submit()?
> [...]

--
Harsh J
RE: Submitting and running hadoop jobs Programmatically
Madhu,

Can you check the client logs to see whether any error/exception occurs while submitting the job?

Devaraj K

-----Original Message-----
From: Harsh J [mailto:ha...@cloudera.com]
Sent: Tuesday, July 26, 2011 5:01 PM
To: common-user@hadoop.apache.org
Subject: Re: Submitting and running hadoop jobs Programmatically

Yes. Internally, it calls the regular submit APIs.
[...]

--
Harsh J
Re: Submitting and running hadoop jobs Programmatically
Hi,
I am submitting the job as follows:

java -cp Nectar-analytics-0.0.1-SNAPSHOT.jar:/home/hadoop/hadoop-for-nectar/hadoop-0.21.0/conf/*:$HADOOP_COMMON_HOME/lib/*:$HADOOP_COMMON_HOME/* com.zinnia.nectar.regression.hadoop.primitive.jobs.SigmaJob input/book.csv kkk11fffrrw 1

I get the following log in the CLI:

11/07/27 10:22:54 INFO security.Groups: Group mapping impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapping; cacheTimeout=30
11/07/27 10:22:54 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
11/07/27 10:22:54 INFO jvm.JvmMetrics: Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
11/07/27 10:22:54 WARN mapreduce.JobSubmitter: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
11/07/27 10:22:54 INFO mapreduce.JobSubmitter: Cleaning up the staging area file:/tmp/hadoop-hadoop/mapred/staging/hadoop-1331241340/.staging/job_local_0001

It doesn't create any job in Hadoop.

On Tue, Jul 26, 2011 at 5:11 PM, Devaraj K devara...@huawei.com wrote:
> Can you check the client logs to see whether any error/exception occurs while submitting the job?
> [...]
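One detail worth noting in that log: the id job_local_0001 in the staging path is produced by the LocalJobRunner, whereas a real JobTracker issues ids that embed its start timestamp (for example job_201107261530_0001), so the job id alone shows whether the cluster configuration was picked up. A tiny illustrative check (isLocalJob is our own helper, not a Hadoop API):

```java
public class JobIdCheck {
    // LocalJobRunner ids look like job_local_0001; ids issued by a real
    // JobTracker embed its start timestamp, e.g. job_201107261530_0001.
    static boolean isLocalJob(String jobId) {
        return jobId.startsWith("job_local");
    }

    public static void main(String[] args) {
        System.out.println(isLocalJob("job_local_0001"));        // prints: true
        System.out.println(isLocalJob("job_201107261530_0001")); // prints: false
    }
}
```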