[
https://issues.apache.org/jira/browse/PIG-2262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13113208#comment-13113208
]
Daniel Dai commented on PIG-2262:
---------------------------------
There are a couple of issues with this approach. Most of them are not
specific to AvroStorage; they stem from how we deal with UDF-dependent jars:
1. Pig doesn't automatically ship all classes in pig-withouthadoop.jar
We also need to make a code change in JarManager.java to denote the packages
to ship. Merely putting a jar into pig-withouthadoop.jar is equivalent to
putting that jar on the classpath. This mechanism is confusing, and we should
stop putting more jars into pig-withouthadoop.jar.
2. Conflicts with Hadoop-bundled jars
Hadoop 0.20.204 bundles jackson-1.0.1, which is too old for AvroLoader. On the
frontend, we can force Hadoop to take our jackson-1.7.3 by setting
HADOOP_USER_CLASSPATH_FIRST=true. But on the backend, Hadoop seems to always
pick the bundled jackson-1.0.1, which results in a job failure.
3. Do we need to bundle piggybank-dependent jars?
We don't even bundle hbase.jar, though HbaseLoader is in builtin. Further,
these jars are not even in the Pig distribution. They are Ivy dependencies and
are only retrieved during compilation. My thinking is that we need to bundle
some popular jars (hbase.jar, avro.jar, etc.) in lib so users know where to
find them when needed. But we don't want to ship all those jars to the
backend. Ideally, Pig should be smart enough to ship jars only when needed (as
we do for jython.jar).
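
For reference, the frontend workaround mentioned in point 2 could look roughly
like the sketch below. The JACKSON_DIR location is a hypothetical placeholder,
not a path from the Pig distribution, and the sketch assumes the hadoop
launcher script honors HADOOP_USER_CLASSPATH_FIRST. As noted above, this only
affects the client side and does not change what the backend tasks load.

```shell
# Hypothetical directory holding the newer Jackson jars; adjust to your setup.
JACKSON_DIR="${JACKSON_DIR:-/path/to/jackson-1.7.3}"

# Prepend the newer Jackson jars to Hadoop's client-side classpath,
# keeping any entries that were already set.
JACKSON_JARS="${JACKSON_DIR}/jackson-core-asl-1.7.3.jar:${JACKSON_DIR}/jackson-mapper-asl-1.7.3.jar"
export HADOOP_CLASSPATH="${JACKSON_JARS}${HADOOP_CLASSPATH:+:${HADOOP_CLASSPATH}}"

# Ask the hadoop launcher to put user classpath entries ahead of bundled jars.
export HADOOP_USER_CLASSPATH_FIRST=true
```

With these variables exported, Pig would then be started from the same shell
so that the frontend sees the newer Jackson first.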
> AvroStorage dependencies are missing from the release tarball
> -------------------------------------------------------------
>
> Key: PIG-2262
> URL: https://issues.apache.org/jira/browse/PIG-2262
> Project: Pig
> Issue Type: Bug
> Components: build, piggybank
> Reporter: Tom White
> Assignee: Tom White
> Attachments: PIG-2262.patch
>
>
> This makes AvroStorage hard to use, since users have to download the
> dependencies manually, or build Pig themselves.