So you mean that if I submit the job remotely, and my_hadoop_job.jar is on
the classpath of my web application, it will submit the job, with
my_hadoop_job.jar, to the remote Hadoop machine (cluster)?
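The thread below suggests launching the job jar as a regular java program with the Hadoop classpath prepended, so that JobClient can find the job jar locally and ship it to the JobTracker. A minimal sketch of assembling such a launch command; all paths, the main class name, and the classpath string are hypothetical placeholders, and in practice the classpath string would come from running "hadoop classpath" on a machine with Hadoop installed:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SubmitCommandBuilder {
    // Builds the argument list for launching the job jar as a plain java
    // program. The job jar is put on the classpath ahead of the Hadoop
    // jars so that the submitting class (and setJarByClass) can find it.
    public static List<String> buildCommand(String hadoopClasspath,
                                            String jobJar,
                                            String mainClass,
                                            String... jobArgs) {
        List<String> cmd = new ArrayList<>();
        cmd.add("java");
        cmd.add("-cp");
        cmd.add(jobJar + ":" + hadoopClasspath);
        cmd.add(mainClass);
        cmd.addAll(Arrays.asList(jobArgs));
        return cmd;
    }

    public static void main(String[] args) {
        // Placeholder paths for illustration only.
        List<String> cmd = buildCommand(
                "/opt/hadoop/conf:/opt/hadoop/hadoop-core.jar",
                "/opt/hadoop/hadoop-jobs/my_hadoop_job.jar",
                "HadoopJobExecutor",
                "-inputPath", "/opt/inputs/",
                "-outputPath", "/data/output_jobs/output");
        System.out.println(String.join(" ", cmd));
    }
}
```

Building the command as a list of separate arguments (rather than one string) also makes it safe to hand to ProcessBuilder later, with no word-splitting problems.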
On Tue, Oct 18, 2011 at 6:13 PM, Harsh J <ha...@cloudera.com> wrote:
> Oleg,
>
> Steve already covered this.
>
> The "hadoop jar" subcommand merely runs the jar program for you, as a
> utility - it has nothing to do with submission really.
>
> Have you tried submitting your program by running your jar as a
> regular java program (java -jar <jar>), with the proper classpath?
> (You may use "hadoop classpath" to get a string.)
>
> It would go through fine, and submit the job jar, with its classes
> included, over to the JobTracker.
>
> On Tue, Oct 18, 2011 at 9:13 PM, Oleg Ruchovets <oruchov...@gmail.com>
> wrote:
> > Let me be more specific. It is not a dependent jar. It is a jar
> > which contains the map/reduce/combine classes and some business
> > logic. When we execute our job from the command line, the class
> > which parses the parameters and submits the job has this line of
> > code:
> >
> >     job.setJarByClass(HadoopJobExecutor.class);
> >
> > We execute it locally on the hadoop master machine using a command
> > such as:
> >
> >     /opt/hadoop/bin/hadoop jar /opt/hadoop/hadoop-jobs/my_hadoop_job.jar
> >       -inputPath /opt/inputs/ -outputPath /data/output_jobs/output
> >
> > and of course my_hadoop_job.jar is found, because it is located on
> > the same machine.
> >
> > Now suppose I am going to submit the job remotely (from a web
> > application), and I have the same line of code:
> >
> >     job.setJarByClass(HadoopJobExecutor.class);
> >
> > If my_hadoop_job.jar is located only on the remote hadoop machine
> > (on its classpath), my JobClient will fail, because there is no job
> > jar on the local classpath (it is located on the remote hadoop
> > machine). Am I right? I simply don't know how to submit a job
> > remotely (in my case the job is not just map/combine/reduce
> > classes; it is a jar which contains other classes too).
> >
> > Regarding remotely invoking the shell script that contains the
> > hadoop jar command with any required input arguments:
> > It is possible to do that with
> >
> >     Runtime.getRuntime().exec(submitCommand.toString().split(" "));
> >
> > but I prefer to use JobClient, because then I can monitor my job
> > (get counters and other useful information).
> >
> > Thanks in advance,
> > Oleg.
> >
> > On Tue, Oct 18, 2011 at 4:34 PM, Bejoy KS <bejoy.had...@gmail.com>
> > wrote:
> >
> >> Hi Oleg
> >>          I haven't tried out a scenario like the one you mentioned,
> >> but I think there shouldn't be any issue in submitting a job whose
> >> jar holds some dependent classes with the business logic referred
> >> to from the mapper, reducer or combiner. You should be able to do
> >> the job submission remotely the same way we were discussing in
> >> this thread. If you need to distribute any dependent jars/files
> >> along with the application jar, you can use the -libjars option on
> >> the CLI, or use the DistributedCache methods like
> >> addArchiveToClassPath()/addFileToClassPath() in your java code. If
> >> it is a dependent jar, it is better to deploy it in the cluster
> >> environment itself, so that you don't have to transfer the jar
> >> over the network again and again every time you submit your job.
> >>          Just a suggestion: if you can execute the job from within
> >> your hadoop cluster, you don't have to do a remote job submission
> >> at all. You just need to remotely invoke the shell script that
> >> contains the hadoop jar command with any required input arguments.
> >> Sorry if I'm not getting your requirement exactly.
> >>
> >> Regards
> >> Bejoy.K.S
> >>
> >> On Tue, Oct 18, 2011 at 6:29 PM, Oleg Ruchovets
> >> <oruchov...@gmail.com> wrote:
> >>
> >> > Thank you all for your answers, but I still have some questions.
> >> > Currently we run our jobs using shell scripts located on the
> >> > hadoop master machine.
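A sketch of the exec-based approach mentioned above. Splitting a command string on spaces is fragile (any path containing a space becomes two bogus arguments); passing each argument explicitly to ProcessBuilder avoids that. The hadoop paths here are placeholders, and actually starting the process would of course require Hadoop to be installed:

```java
import java.util.Arrays;

public class ExecSketch {
    public static void main(String[] args) {
        // Naive splitting breaks as soon as an argument contains a space:
        String submitCommand =
                "/opt/hadoop/bin/hadoop jar"
                + " /opt/hadoop/hadoop-jobs/my_hadoop_job.jar"
                + " -inputPath /opt/inputs/"
                + " -outputPath /data/output_jobs/output";
        String[] naive = submitCommand.split(" ");
        // Fine here (7 tokens), but a path like "/data/my outputs/"
        // would be split into two bogus arguments.
        System.out.println(naive.length + " tokens");

        // Passing each argument explicitly avoids the quoting problem;
        // ProcessBuilder also lets the child's stdout/stderr be inherited:
        ProcessBuilder pb = new ProcessBuilder(Arrays.asList(
                "/opt/hadoop/bin/hadoop", "jar",
                "/opt/hadoop/hadoop-jobs/my_hadoop_job.jar",
                "-inputPath", "/opt/inputs/",
                "-outputPath", "/data/output_jobs/output"))
                .inheritIO();
        // pb.start().waitFor();  // not run here: requires a Hadoop install
    }
}
```

As noted in the thread, this approach only gives you the child process's exit code and console output; submitting through JobClient is what makes counters and job status available programmatically.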
> >> >
> >> > Here is an example of a command line:
> >> >
> >> >     /opt/hadoop/bin/hadoop jar /opt/hadoop/hadoop-jobs/my_hadoop_job.jar
> >> >       -inputPath /opt/inputs/ -outputPath /data/output_jobs/output
> >> >
> >> > my_hadoop_job.jar has a class which parses the input parameters
> >> > and submits the job.
> >> > Our code is very similar to what you wrote:
> >> >
> >> >     ......
> >> >     job.setJarByClass(HadoopJobExecutor.class);
> >> >     job.setMapperClass(MultipleOutputMap.class);
> >> >     job.setCombinerClass(BaseCombine.class);
> >> >     job.setReducerClass(HBaseReducer.class);
> >> >     job.setOutputKeyClass(Text.class);
> >> >     job.setOutputValueClass(MapWritable.class);
> >> >
> >> >     FileOutputFormat.setOutputPath(job, new Path(finalOutPutPath));
> >> >
> >> >     jobCompleteStatus = job.waitForCompletion(true);
> >> >     ...............
> >> >
> >> > My questions are:
> >> >
> >> > 1) my_hadoop_job.jar contains other classes (business logic), not
> >> > only the Map, Combine and Reduce classes, and I still don't
> >> > understand how I can submit a job which needs all the classes
> >> > from my_hadoop_job.jar.
> >> > 2) Do I need to submit my_hadoop_job.jar itself too? If yes, what
> >> > is the way to do it?
> >> >
> >> > Thanks in advance,
> >> > Oleg.
> >> >
> >> > On Tue, Oct 18, 2011 at 2:11 PM, Uma Maheswara Rao G 72686 <
> >> > mahesw...@huawei.com> wrote:
> >> >
> >> > > ----- Original Message -----
> >> > > From: Bejoy KS <bejoy.had...@gmail.com>
> >> > > Date: Tuesday, October 18, 2011 5:25 pm
> >> > > Subject: Re: execute hadoop job from remote web application
> >> > > To: common-user@hadoop.apache.org
> >> > >
> >> > > > Oleg
> >> > > >         If you are looking at how to submit your jobs using
> >> > > > JobClient, then the sample below can give you a start.
> >> > > >
> >> > > > //get the configuration parameters and assign a job name
> >> > > > JobConf conf = new JobConf(getConf(), MyClass.class);
> >> > > > conf.setJobName("SMS Reports");
> >> > > >
> >> > > > //set the key/value types for the mapper and reducer outputs
> >> > > > conf.setOutputKeyClass(Text.class);
> >> > > > conf.setOutputValueClass(Text.class);
> >> > > >
> >> > > > //specify the custom reducer class
> >> > > > conf.setReducerClass(SmsReducer.class);
> >> > > >
> >> > > > //specify the input directories (@ runtime) and Mappers
> >> > > > //independently, for inputs from multiple sources
> >> > > > FileInputFormat.addInputPath(conf, new Path(args[0]));
> >> > > >
> >> > > > //specify the output directory @ runtime
> >> > > > FileOutputFormat.setOutputPath(conf, new Path(args[1]));
> >> > > >
> >> > > > JobClient.runJob(conf);
> >> > > >
> >> > > > Along with the hadoop jars, you may need to have the config
> >> > > > files on your client as well.
> >> > > >
> >> > > > The sample uses the old map reduce API. You can use the new
> >> > > > one as well; there we use Job instead of JobClient.
> >> > > >
> >> > > > Hope it helps!
> >> > > >
> >> > > > Regards
> >> > > > Bejoy.K.S
> >> > > >
> >> > > >
> >> > > > On Tue, Oct 18, 2011 at 5:00 PM, Oleg Ruchovets
> >> > > > <oruchov...@gmail.com> wrote:
> >> > > > > Excellent. Can you give a small code example?
> >> > > > >
> >> > > Good sample by Bejoy.
> >> > > Hope you have access to this site.
> >> > > Also please go through this doc:
> >> > >
> >> > > http://hadoop.apache.org/common/docs/r0.20.2/mapred_tutorial.html#Example%3A+WordCount+v2.0
> >> > >
> >> > > Here is the wordcount example.
> >> > >
> >> > > > >
> >> > > > > On Tue, Oct 18, 2011 at 1:13 PM, Uma Maheswara Rao G 72686 <
> >> > > > > mahesw...@huawei.com> wrote:
> >> > > > >
> >> > > > > > ----- Original Message -----
> >> > > > > > From: Oleg Ruchovets <oruchov...@gmail.com>
> >> > > > > > Date: Tuesday, October 18, 2011 4:11 pm
> >> > > > > > Subject: execute hadoop job from remote web application
> >> > > > > > To: common-user@hadoop.apache.org
> >> > > > > >
> >> > > > > > > Hi, what is the way to execute a hadoop job on a remote
> >> > > > > > > cluster? I want to execute my hadoop job from a remote
> >> > > > > > > web application, but I didn't find any hadoop client
> >> > > > > > > (remote API) to do it.
> >> > > > > > >
> >> > > > > > > Please advise.
> >> > > > > > > Oleg
> >> > > > > > >
> >> > > > > > You can put the Hadoop jars in your web application's
> >> > > > > > classpath, find the class JobClient, and submit the jobs
> >> > > > > > using it.
> >> > > > > >
> >> > > > > > Regards,
> >> > > > > > Uma
> >> > > > > >
> >> > > > >
> >> > >
> >> > > Regards
> >> > > Uma
> >> >
> >>
>
> --
> Harsh J
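The thread mentions twice that, along with the Hadoop jars, the client needs the cluster's config files so that JobClient knows where to send the job. A minimal sketch of what those client-side files could contain, using the Hadoop 0.20-era property names discussed in this thread; the host names and ports are placeholders and must match the cluster's actual NameNode and JobTracker:

```
<!-- core-site.xml on the client: where HDFS lives -->
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://namenode-host:9000</value>
  </property>
</configuration>

<!-- mapred-site.xml on the client: where the JobTracker lives -->
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>jobtracker-host:9001</value>
  </property>
</configuration>
```

With these on the web application's classpath next to the Hadoop jars, a JobClient/Job created in the web application submits to the remote cluster instead of running locally.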