Oleg, Steve already covered this.
The "hadoop jar" subcommand merely runs the jar program for you, as a utility - it has nothing to do with job submission, really. Have you tried submitting your program by running your jar as a regular Java program (java -jar <jar>), with the proper classpath? (You may use "hadoop classpath" to get the string.) It would go through fine, and submit the job jar, with classes included, over to the JobTracker.

On Tue, Oct 18, 2011 at 9:13 PM, Oleg Ruchovets <oruchov...@gmail.com> wrote:
> Let me be more specific. It is not a dependent jar. It is a jar which
> contains the map/reduce/combine classes and some business logic.
> When executing our job from the command line, the class which parses
> parameters and submits the job has this line of code:
>     job.setJarByClass(HadoopJobExecutor.class);
>
> We execute it locally on the hadoop master machine using a command such as:
>     /opt/hadoop/bin/hadoop jar /opt/hadoop/hadoop-jobs/my_hadoop_job.jar
>         -inputPath /opt/inputs/ -outputPath /data/output_jobs/output
>
> and of course my_hadoop_job.jar is found, because it is located on the
> same machine.
>
> Now, suppose I am going to submit the job remotely (from a web
> application), and I have the same line of code:
>     job.setJarByClass(HadoopJobExecutor.class);
>
> If my_hadoop_job.jar is located on the remote hadoop machine (on its
> classpath), my JobClient will fail, because there is no job jar on the
> local classpath (it is located on the remote hadoop machine). Am I right?
> I simply don't know how to submit a job remotely (in my case the job is
> not just map/combine/reduce classes; it is a jar which contains other
> classes too).
>
> Regarding remotely invoking the shell script that contains the hadoop jar
> command with any required input arguments: it is possible to do it with
>     Runtime.getRuntime().exec(submitCommand.toString().split(" "));
> but I prefer to use JobClient, because then I can monitor my job (get
> counters and other useful information).
>
> Thanks in advance,
> Oleg.
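[Editor's note: the Runtime.exec() route Oleg mentions can be sketched as below. This is a minimal, hypothetical sketch: it uses ProcessBuilder instead of Runtime.exec() so the script's output can be captured, and `echo` stands in for the real /opt/hadoop/bin/hadoop launcher so the snippet runs anywhere.]

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;

// Sketch: shell out to a job-submit command and capture its output.
// "echo" is a stand-in for the real launcher, e.g.
// /opt/hadoop/bin/hadoop jar /opt/hadoop/hadoop-jobs/my_hadoop_job.jar ...
public class ShellSubmit {
    static String run(String... command) throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder(command);
        pb.redirectErrorStream(true);            // merge stderr into stdout
        Process p = pb.start();
        StringBuilder out = new StringBuilder();
        try (BufferedReader r = new BufferedReader(
                new InputStreamReader(p.getInputStream()))) {
            String line;
            while ((line = r.readLine()) != null) {
                out.append(line).append('\n');
            }
        }
        p.waitFor();                             // block until the script exits
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // Stand-in command; replace with the real hadoop jar invocation.
        System.out.print(run("echo", "submitted", "my_hadoop_job.jar"));
    }
}
```

[As Oleg notes, the drawback of this route is losing the JobClient handle: progress and counters would have to be scraped from the script's output.]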
>
> On Tue, Oct 18, 2011 at 4:34 PM, Bejoy KS <bejoy.had...@gmail.com> wrote:
>
>> Hi Oleg
>>          I haven't tried out a scenario like the one you mentioned, but I
>> think there shouldn't be any issue in submitting a job whose jar holds
>> some dependent classes with the business logic referred to from the
>> mapper, reducer or combiner. You should be able to do the job submission
>> remotely the same way we were discussing in this thread. If you need to
>> distribute any dependent jars/files along with the application jar, you
>> can use the -libjars option in the CLI, or use the DistributedCache
>> methods like addArchiveToClassPath()/addFileToClassPath() in your Java
>> code. If it is a dependent jar, it is better to deploy it in the cluster
>> environment itself, so that you don't have to transfer the jar over the
>> network again and again every time you submit your job.
>>          Just a suggestion: if you can execute the job from within your
>> hadoop cluster, you don't have to do a remote job submission. You just
>> need to remotely invoke the shell script that contains the hadoop jar
>> command with any required input arguments. Sorry if I'm not getting your
>> requirement exactly.
>>
>> Regards
>> Bejoy.K.S
>>
>> On Tue, Oct 18, 2011 at 6:29 PM, Oleg Ruchovets <oruchov...@gmail.com
>> >wrote:
>>
>> > Thank you all for your answers, but I still have questions:
>> > Currently we run our jobs using shell scripts located on the hadoop
>> > master machine.
>> >
>> > Here is an example of the command line:
>> >     /opt/hadoop/bin/hadoop jar /opt/hadoop/hadoop-jobs/my_hadoop_job.jar
>> >         -inputPath /opt/inputs/ -outputPath /data/output_jobs/output
>> >
>> > my_hadoop_job.jar has a class which parses the input parameters and
>> > submits a job.
>> > Our code is very similar to what you wrote:
>> > ......
>> >
>> >     job.setJarByClass(HadoopJobExecutor.class);
>> >     job.setMapperClass(MultipleOutputMap.class);
>> >     job.setCombinerClass(BaseCombine.class);
>> >     job.setReducerClass(HBaseReducer.class);
>> >     job.setOutputKeyClass(Text.class);
>> >     job.setOutputValueClass(MapWritable.class);
>> >
>> >     FileOutputFormat.setOutputPath(job, new Path(finalOutPutPath));
>> >
>> >     jobCompleteStatus = job.waitForCompletion(true);
>> > ...............
>> >
>> > My questions are:
>> >
>> > 1) my_hadoop_job.jar contains other classes (business logic), not only
>> > the Map, Combine and Reduce classes, and I still don't understand how I
>> > can submit a job which needs all the classes from my_hadoop_job.jar.
>> > 2) Do I need to submit my_hadoop_job.jar too? If yes, what is the way
>> > to do it?
>> >
>> > Thanks In Advance
>> > Oleg.
>> >
>> > On Tue, Oct 18, 2011 at 2:11 PM, Uma Maheswara Rao G 72686 <
>> > mahesw...@huawei.com> wrote:
>> >
>> > > ----- Original Message -----
>> > > From: Bejoy KS <bejoy.had...@gmail.com>
>> > > Date: Tuesday, October 18, 2011 5:25 pm
>> > > Subject: Re: execute hadoop job from remote web application
>> > > To: common-user@hadoop.apache.org
>> > >
>> > > > Oleg
>> > > >          If you are looking at how to submit your jobs using
>> > > > JobClient, then the sample below can give you a start.
>> > > >
>> > > >     //get the configuration parameters and assign a job name
>> > > >     JobConf conf = new JobConf(getConf(), MyClass.class);
>> > > >     conf.setJobName("SMS Reports");
>> > > >
>> > > >     //set the key/value types for mapper and reducer outputs
>> > > >     conf.setOutputKeyClass(Text.class);
>> > > >     conf.setOutputValueClass(Text.class);
>> > > >
>> > > >     //specify the custom reducer class
>> > > >     conf.setReducerClass(SmsReducer.class);
>> > > >
>> > > >     //specify the input directories (@ runtime) and Mappers
>> > > >     //independently for inputs from multiple sources
>> > > >     FileInputFormat.addInputPath(conf, new Path(args[0]));
>> > > >
>> > > >     //specify the output directory @ runtime
>> > > >     FileOutputFormat.setOutputPath(conf, new Path(args[1]));
>> > > >
>> > > >     JobClient.runJob(conf);
>> > > >
>> > > > Along with the hadoop jars, you may need to have the config files on
>> > > > your client as well.
>> > > >
>> > > > The sample is from the old MapReduce API. You can use the new one as
>> > > > well; there we use Job instead of JobClient.
>> > > >
>> > > > Hope it helps!..
>> > > >
>> > > > Regards
>> > > > Bejoy.K.S
>> > > >
>> > > >
>> > > > On Tue, Oct 18, 2011 at 5:00 PM, Oleg Ruchovets
>> > > > <oruchov...@gmail.com>wrote:
>> > > > > Excellent. Can you give a small example of code?
>> > > > >
>> > > Good sample by Bejoy - hope you have access to this site.
>> > > Also please go through this doc:
>> > >
>> > > http://hadoop.apache.org/common/docs/r0.20.2/mapred_tutorial.html#Example%3A+WordCount+v2.0
>> > >
>> > > Here is the wordcount example.
>> > >
>> > > > >
>> > > > > On Tue, Oct 18, 2011 at 1:13 PM, Uma Maheswara Rao G 72686 <
>> > > > > mahesw...@huawei.com> wrote:
>> > > > >
>> > > > > >
>> > > > > > ----- Original Message -----
>> > > > > > From: Oleg Ruchovets <oruchov...@gmail.com>
>> > > > > > Date: Tuesday, October 18, 2011 4:11 pm
>> > > > > > Subject: execute hadoop job from remote web application
>> > > > > > To: common-user@hadoop.apache.org
>> > > > > >
>> > > > > > > Hi, what is the way to execute a hadoop job on a remote
>> > > > > > > cluster? I want to execute my hadoop job from a remote web
>> > > > > > > application, but I didn't find any hadoop client (remote
>> > > > > > > API) to do it.
>> > > > > > >
>> > > > > > > Please advise.
>> > > > > > > Oleg
>> > > > > > >
>> > > > > > You can put the Hadoop jars in your web application's classpath,
>> > > > > > find the class JobClient, and submit the jobs using it.
>> > > > > >
>> > > > > > Regards,
>> > > > > > Uma
>> > > > > >
>> > > > > >
>> > > > >
>> > > >
>> > > Regards
>> > > Uma
>> >
>>

-- 
Harsh J
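[Editor's note: Harsh's suggestion - running the driver as a plain Java program with the output of "hadoop classpath" appended - can be sketched as below. This helper and its values are purely illustrative (the HadoopJobExecutor class name and all paths come from the thread; the sample hadoop classpath string is made up); in practice the classpath would come from actually running "hadoop classpath" on a machine with Hadoop installed.]

```java
// Sketch: assemble the plain-java submit command Harsh describes.
// The job's own jar goes first on the classpath, followed by the string
// printed by `hadoop classpath`, so the driver and its business-logic
// classes are all visible to the JVM doing the submission.
public class SubmitCommandBuilder {
    static String build(String jobJar, String hadoopClasspath,
                        String mainClass, String... args) {
        StringBuilder cmd = new StringBuilder("java -cp ")
                .append(jobJar).append(':').append(hadoopClasspath)
                .append(' ').append(mainClass);
        for (String arg : args) {
            cmd.append(' ').append(arg);
        }
        return cmd.toString();
    }

    public static void main(String[] args) {
        // Illustrative classpath only; obtain the real one by running
        // `hadoop classpath`.
        System.out.println(build(
                "/opt/hadoop/hadoop-jobs/my_hadoop_job.jar",
                "/opt/hadoop/conf:/opt/hadoop/hadoop-core.jar",
                "HadoopJobExecutor",
                "-inputPath", "/opt/inputs/",
                "-outputPath", "/data/output_jobs/output"));
    }
}
```

[Because the driver still runs job.setJarByClass(HadoopJobExecutor.class) and the jar is on the local classpath, the job jar gets shipped to the JobTracker as in the thread, and the caller keeps the JobClient/Job handle for monitoring counters.]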