[ https://issues.apache.org/jira/browse/PIG-1511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12891648#action_12891648 ]

Alan Gates commented on PIG-1511:
---------------------------------

The issue there is that blacklists are hard to maintain.  Every time someone 
adds a package to Pig, they have to remember to add it to that blacklist.  

If you register your jar, Pig will wrap it up and take it along.  Does this not 
work for your use case?

> Pig removes packages from its own jar when building the JAR to ship to Hadoop
> -----------------------------------------------------------------------------
>
>                 Key: PIG-1511
>                 URL: https://issues.apache.org/jira/browse/PIG-1511
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.7.0
>            Reporter: Eric Tschetter
>         Attachments: pig-1511.diff
>
>
> Pig generates a new jar file to ship over to Hadoop.  Pig has a couple of 
> packages whitelisted that it includes from its own jar.  Pig throws away 
> everything else.
> I package all of my dependencies into a single jar file.  Pig is included in 
> this jar file.  I do it this way because my code needs to run reliably and 
> reproducibly in production.  Pig throws away all of my dependencies.
> I don't know what the performance gain is of shaving ~5MB off of a jar that 
> is pushed to a job tracker once and then used to run over 100s of GB of data.
> The overhead is minimal on my cluster.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
