Can you show some sample code for submitting a distcp job?

Cheers!
Manoj.
On Fri, Dec 14, 2012 at 11:44 AM, David Parks <davidpark...@yahoo.com> wrote:
> Can I do that with s3distcp / distcp? The job is configured in the
> run() method of s3distcp (as it implements Tool), so I think I can't use
> this approach. I use it for the jobs I control, of course, but the problem
> is things like distcp, where I don't control the configuration.
>
> Dave
>
> From: Manoj Babu [mailto:manoj...@gmail.com]
> Sent: Friday, December 14, 2012 12:57 PM
> To: user@hadoop.apache.org
> Subject: Re: How to submit Tool jobs programatically in parallel?
>
> David,
>
> Instead of runJob(), you can try submitJob(), like below:
>
> JobClient jc = new JobClient(job);
> jc.submitJob(job);
>
> Cheers!
> Manoj.
>
> On Fri, Dec 14, 2012 at 10:09 AM, David Parks <davidpark...@yahoo.com> wrote:
>
> I'm submitting unrelated jobs programmatically (using AWS EMR) so they run
> in parallel.
>
> I'd like to run an s3distcp job in parallel as well, but the interface to
> that job is a Tool, e.g. ToolRunner.run(...).
>
> ToolRunner blocks until the job completes, though, so presumably I'd need to
> create a thread pool to run these jobs in parallel.
>
> But creating multiple threads to submit concurrent jobs via ToolRunner,
> each blocking on its job's completion, just feels improper. Is there an
> alternative?
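
For reference, the thread-pool approach described in the original question could look roughly like the sketch below. It is only an illustration under stated assumptions, not a confirmed s3distcp recipe: the class name ParallelToolSubmitter and the method runAll are made up for this example, and the tasks in main() are stand-ins that sleep instead of launching Hadoop jobs. With Hadoop on the classpath, each task would presumably wrap a blocking call such as ToolRunner.run(conf, tool, args), which returns the tool's exit code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper (not part of Hadoop): runs blocking tasks concurrently.
final class ParallelToolSubmitter {

    // Runs the blocking tasks on a fixed-size pool (at most poolSize at a
    // time) and returns their exit codes in the same order as the input list.
    static List<Integer> runAll(List<Callable<Integer>> tasks, int poolSize) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            // invokeAll blocks until every task has completed or failed.
            List<Future<Integer>> futures = pool.invokeAll(tasks);
            List<Integer> exitCodes = new ArrayList<>();
            for (Future<Integer> f : futures) {
                // get() rethrows a task failure wrapped in ExecutionException.
                exitCodes.add(f.get());
            }
            return exitCodes;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        // Stand-in tasks (assumption). A real task would be something like
        // () -> ToolRunner.run(conf, tool, toolArgs) with Hadoop available.
        List<Callable<Integer>> tasks = new ArrayList<>();
        tasks.add(() -> { Thread.sleep(200); return 0; });
        tasks.add(() -> { Thread.sleep(200); return 0; });
        System.out.println("exit codes: " + runAll(tasks, 2));
    }
}
```

Each pool thread still blocks on its own job, which is the part that felt improper in the question, but the calling code only has to wait once, on runAll, and gets every exit code back together.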