[ 
https://issues.apache.org/jira/browse/PIG-4047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14061076#comment-14061076
 ] 

Daniel Dai commented on PIG-4047:
---------------------------------

Thanks for trying. Let me first explain what the patch does:
1. The original jar target still works, but it only builds pig.jar (h1 or h2, 
depending on hadoopversion) and copies the dependencies to lib
2. Use tar-h12 for the Apache release. It bundles:
* pig-h1.jar, pig-h2.jar: Pig core jar compiled for H1 & H2
* lib: the libraries Pig depends on, except Hadoop, in two parts: core 
dependencies that Pig needs at runtime, and convenience dependencies that 
users may need when using some UDFs
* lib/hadoop1: the Hadoop 1 runtime. It is only used in local mode, in case 
the user doesn't have a local Hadoop installation
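
As a rough illustration of the layout above, here is a minimal Python sketch of how a launcher could assemble a classpath from it. This is not what bin/pig actually does. The function name build_classpath and its parameters pig_home, hadoop_major and bundled_hadoop are hypothetical, and the jar names under lib in the demo are placeholders.

```python
import os
import tempfile
from pathlib import Path


def build_classpath(pig_home, hadoop_major=1, bundled_hadoop=False):
    """Return a classpath string for the tar-h12 layout described above.

    hadoop_major picks pig-h1.jar or pig-h2.jar. bundled_hadoop adds the
    jars under lib/hadoop1, for local mode without a local Hadoop install.
    All names here are illustrative assumptions, not Pig's launcher API.
    """
    home = Path(pig_home)
    # Pig core jar compiled for the chosen Hadoop major version.
    entries = [home / f"pig-h{hadoop_major}.jar"]
    # Core and convenience dependencies sit directly under lib/.
    # The glob is not recursive, so lib/hadoop1 is not picked up here.
    entries += sorted((home / "lib").glob("*.jar"))
    if bundled_hadoop:
        # Only a Hadoop 1 runtime is bundled, so pairing it with the
        # Hadoop 2 build of Pig is rejected.
        if hadoop_major != 1:
            raise ValueError("only a Hadoop 1 runtime is bundled")
        entries += sorted((home / "lib" / "hadoop1").glob("*.jar"))
    return os.pathsep.join(str(e) for e in entries)


if __name__ == "__main__":
    # Build a throwaway directory that mimics the layout, then print the
    # classpath a local-mode run without a Hadoop install would use.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "lib" / "hadoop1").mkdir(parents=True)
        (root / "lib" / "example-dep.jar").touch()
        (root / "lib" / "hadoop1" / "example-hadoop.jar").touch()
        print(build_classpath(root, hadoop_major=1, bundled_hadoop=True))
```

Keeping lib/hadoop1 out of the default classpath mirrors the point that the bundled runtime is only a fallback for local mode.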

I don't see the problem you mentioned when I build locally with 
PIG-4047-2.patch:
1. Do you mean lib/hadoop2? I don't bundle the hadoop2 runtime. I only bundle 
the hadoop1 runtime, in case the user doesn't have a local Hadoop 
installation. It does not seem necessary to bundle two Hadoop runtimes. And it 
only happens when running tar-h12.
2. If no hadoopversion is specified, only pig-h1.jar is generated

Now to answer your questions:
1. Actually I didn't think about this case. How did it work before?
2. This will be in trunk, and will hopefully be in 0.14.0
3. I believe so
4. "uber-jar" means a jar that wraps other jars, such as 
pig-withouthadoop.jar, which not only contains the Pig classes but also wraps 
antlr-runtime.jar, automaton.jar, etc.

> Break up pig withouthadoop and fat jar
> --------------------------------------
>
>                 Key: PIG-4047
>                 URL: https://issues.apache.org/jira/browse/PIG-4047
>             Project: Pig
>          Issue Type: Improvement
>          Components: build
>    Affects Versions: site
>            Reporter: fang fang chen
>            Assignee: fang fang chen
>              Labels: build
>             Fix For: 0.14.0
>
>         Attachments: PIG-4047-1.patch, PIG-4047-2.patch, PIG-4047.patch
>
>
> The pig-withouthadoop jar packages Pig core and the Pig core dependencies. 
> This jar should be removed for the following reasons:
> 1. The name is confusing. Users cannot tell at a glance what the jar is 
> for.
> 2. It is not clear to users what the core dependencies are.
> 3. It is hard to maintain dependencies, for example when updating a 
> dependency version. A user may want to try a different version of avro 
> without repackaging.
> It would be better to drop the pig-withouthadoop jar and instead:
> 1. Divide the withouthadoop jar into Pig core and the Pig core dependencies.
> 2. Save the jars from step 1 in the lib directory.
> 3. In the pig script, always add all the jars in the lib directory to the 
> classpath, and add the Pig core jar to the classpath.
> I have used Pig this way since version 0.8.1, launching the Pig grunt 
> shell. No issues found yet.
> The current branch-0.13 packages the following jars into the pig-withouthadoop jar:
>              <include name="antlr-runtime-${antlr.version}.jar"/>
>              <include name="ST4-${stringtemplate.version}.jar"/>
>              <include name="jline-${jline.version}.jar"/>
>              <include name="jackson-mapper-asl-${jackson.version}.jar"/>
>              <include name="jackson-core-asl-${jackson.version}.jar"/>
>              <include name="joda-time-${joda-time.version}.jar"/>
>              <include name="guava-${guava.version}.jar"/>
>              <include name="automaton-${automaton.version}.jar"/>
>              <include name="jansi-${jansi.version}.jar"/>
>              <include name="avro-${avro.version}.jar"/>
>              <include name="avro-mapred-${avro.version}.jar"/>
>              <include name="trevni-core-${avro.version}.jar"/>
>              <include name="trevni-avro-${avro.version}.jar"/>
>              <include name="snappy-java-${snappy.version}.jar"/>
> We could save the jars above and the pig-core jar into the lib directory.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
