[jira] Commented: (HADOOP-1622) Hadoop should provide a way to allow the user to specify jar file(s) the user job depends on

Doug Cutting (JIRA) Wed, 18 Jul 2007 09:18:29 -0700

    [ https://issues.apache.org/jira/browse/HADOOP-1622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12513637 ]


Doug Cutting commented on HADOOP-1622:
--------------------------------------

I don't disagree with any of your statements in the previous message: currently 
we encourage the main to be in the top-level jar specified, which can be 
awkward; and, yes, it would be more convenient to let users list multiple jars 
when submitting jobs.

I'm suggesting that JobClient should jar things together.  This would change 
the way that the job jar is determined, so the relationship between main() 
and the user's jar files could be changed at the same time.

Users should be able to submit jobs specifying a set of jars.  That's the crux 
of this issue, and I agree we ought to support it.  But I suggest that the way 
we ought to implement this is to change JobClient to pack together the user's 
jars into a single jar, and submit this.  Few if any changes should be required 
to the JobTracker or TaskTracker.  Does that make sense?  Do you have an 
alternate implementation proposal?
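
To make the client-side packing idea concrete, here is a minimal sketch of how 
several user jars might be combined into one job jar before submission.  It is 
not JobClient code: the class JobJarPacker and its pack() method are 
hypothetical names, and it assumes the layout mentioned in the issue 
description, where dependent jars are placed in the lib dir of the job jar.  
It uses only java.util.zip and java.nio.file from the JDK.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

/** Hypothetical helper for illustration; not part of Hadoop. */
public class JobJarPacker {

    /**
     * Writes jobJar containing every entry of mainJar, plus each dependency
     * jar stored unmodified under lib/.
     */
    public static void pack(Path mainJar, List<Path> dependencyJars, Path jobJar)
            throws IOException {
        Set<String> written = new HashSet<>();
        byte[] buffer = new byte[8192];

        try (OutputStream fileOut = Files.newOutputStream(jobJar);
             ZipOutputStream out = new ZipOutputStream(fileOut)) {

            // Copy the main jar entry by entry (manifest and classes included).
            try (InputStream fileIn = Files.newInputStream(mainJar);
                 ZipInputStream in = new ZipInputStream(fileIn)) {
                ZipEntry entry;
                while ((entry = in.getNextEntry()) != null) {
                    if (!written.add(entry.getName())) {
                        continue; // ignore a repeated entry name
                    }
                    out.putNextEntry(new ZipEntry(entry.getName()));
                    int n;
                    while ((n = in.read(buffer)) != -1) {
                        out.write(buffer, 0, n);
                    }
                    out.closeEntry();
                }
            }

            // Store each dependency jar as a single nested entry under lib/.
            for (Path dep : dependencyJars) {
                String name = "lib/" + dep.getFileName();
                if (!written.add(name)) {
                    throw new IOException("Duplicate entry in job jar: " + name);
                }
                out.putNextEntry(new ZipEntry(name));
                Files.copy(dep, out);
                out.closeEntry();
            }
        }
    }
}
```

With something along these lines done on the client, the cluster would still 
receive a single jar, which is why few if any JobTracker or TaskTracker 
changes should be needed.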

> Hadoop should provide a way to allow the user to specify jar file(s) the user 
> job depends on
> --------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-1622
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1622
>             Project: Hadoop
>          Issue Type: Improvement
>            Reporter: Runping Qi
>
> More likely than not, a user's job depends on multiple jars.
> Right now, when submitting a job through bin/hadoop, there is no way for the 
> user to specify that. 
> A workaround is to re-package all the dependent jars into a new jar 
> or to put the dependent jar files in the lib dir of the new jar.
> This workaround causes unnecessary inconvenience to the user. Furthermore, 
> if the user does not own the main function 
> (as is the case when the user uses Aggregate, datajoin, or streaming), the 
> user has to re-package those system jar files too.
> It is highly desirable that Hadoop provide a clean and simple way for the 
> user to specify a list of dependent jar files at the time 
> of job submission. Something like:
> bin/hadoop .... --depending_jars j1.jar:j2.jar

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
