[ 
https://issues.apache.org/jira/browse/PIG-4373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohini Palaniswamy updated PIG-4373:
------------------------------------
    Description: duplicate jars get added to distributed cache

[~daijy],
   The patch fixes OOZIE-3300, but not the original issue of this jira. We can 
move this patch to a different jira if the intent is to just fix it for Hadoop 
3. There is still one issue with the patch: it sets the visibility of every 
resource to APPLICATION instead of PUBLIC or PRIVATE, which will hurt cluster 
performance. [~jlowe] already asked us to fix that in TezResourceManager for the 
other resources we ship, as he saw a lot of churn in our clusters. Doing the 
same for all the files from Oozie as well will make it worse. 

Jason was fine with rolling back the change in Hadoop and marked MAPREDUCE-7118 
a Blocker for Hadoop 3 releases. It just needs another Hadoop PMC member to 
chime in with a +1. It does not make sense to introduce an unwanted backward 
incompatibility for MapReduce, which is slowly marching towards end of life. So 
we can postpone it on the Pig side (and do the proper fix) and have your Hadoop 
team pull MAPREDUCE-7118 instead.


> Implement PIG-3861 in Tez
> -------------------------
>
>                 Key: PIG-4373
>                 URL: https://issues.apache.org/jira/browse/PIG-4373
>             Project: Pig
>          Issue Type: Improvement
>          Components: tez
>    Affects Versions: 0.14.0
>            Reporter: Rohini Palaniswamy
>            Assignee: Daniel Dai
>            Priority: Major
>              Labels: MissingFeature
>             Fix For: 0.18.0
>
>         Attachments: PIG-4373_1.patch
>
>
> duplicate jars get added to distributed cache



