Ryan Blue created PIG-4850:
------------------------------

             Summary: Registered jars do not use submit replication
                 Key: PIG-4850
                 URL: https://issues.apache.org/jira/browse/PIG-4850
             Project: Pig
          Issue Type: Bug
          Components: impl
            Reporter: Ryan Blue
            Assignee: Ryan Blue


PIG-4074 added support for mapred.submit.replication, which sets the 
replication factor for files added to the distributed cache. The purpose is to 
avoid a large number of task attempts downloading the same file from HDFS at 
once during localization and slowing down because of contention over too few 
replicas. The replication factor was set correctly for those files, but 
registered jars are added to HDFS through a different code path that does not 
use the submit replication factor. As a result, localization time for jobs can 
increase by as much as 10 minutes (at which point the tasks are killed).
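
To illustrate the contention described above, here is a rough back-of-the-envelope
sketch, not Pig or Hadoop code. It assumes readers spread evenly across replicas,
which real HDFS block placement does not guarantee. The class name, method name,
and the task count of 3000 are hypothetical; 3 and 10 are the usual defaults for
dfs.replication and mapred.submit.replication respectively.

```java
public class ReplicaContention {
    /**
     * Average number of concurrent readers per replica when `tasks` task
     * attempts localize the same file at once, assuming an even spread
     * (an idealization). Uses ceiling division.
     */
    public static int readersPerReplica(int tasks, int replication) {
        if (tasks < 0 || replication <= 0) {
            throw new IllegalArgumentException("tasks must be >= 0 and replication > 0");
        }
        return (tasks + replication - 1) / replication;
    }

    public static void main(String[] args) {
        int tasks = 3000;            // hypothetical number of concurrent task attempts
        int defaultReplication = 3;  // usual dfs.replication default
        int submitReplication = 10;  // usual mapred.submit.replication default

        // A registered jar left at the default replication factor
        System.out.println("default replication: "
                + readersPerReplica(tasks, defaultReplication) + " readers per replica");
        // The same jar written with the submit replication factor
        System.out.println("submit replication:  "
                + readersPerReplica(tasks, submitReplication) + " readers per replica");
    }
}
```

Under these assumptions, a jar written with the submit replication factor sees
roughly a third as many readers per replica as one left at the default.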



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
