Ryan Blue created PIG-4850:
------------------------------
Summary: Registered jars do not use submit replication
Key: PIG-4850
URL: https://issues.apache.org/jira/browse/PIG-4850
Project: Pig
Issue Type: Bug
Components: impl
Reporter: Ryan Blue
Assignee: Ryan Blue
PIG-4074 added support for mapred.submit.replication, which sets the
replication factor for files added to the distributed cache. The purpose is to
keep a large number of task attempts from downloading the same HDFS file at
once during localization and slowing down because of contention over too few
replicas. The replication factor was set correctly for distributed cache
files, but registered jars are added to HDFS through a different code path
that does not apply the submit replication factor. As a result, localization
time for jobs can increase by as much as 10 minutes (at which point the tasks
are killed).
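
To make the contention argument concrete, here is a rough back-of-the-envelope
sketch in plain Java. It is not Pig or Hadoop code: the class name
ReplicaContention, the method readersPerReplica, and the figure of 1000
concurrent task attempts are all hypothetical and chosen only for
illustration. It assumes reads are spread evenly across replicas, and
compares a replication factor of 3 (the usual dfs.replication default) with
10 (the usual mapred.submit.replication default).

```java
// Hypothetical model for illustration only; the numbers are not measurements
// from this issue, and real HDFS read scheduling is not perfectly even.
final class ReplicaContention {
    private ReplicaContention() {}

    /**
     * Worst-case number of concurrent readers hitting a single replica,
     * assuming reads are spread as evenly as possible (ceiling division).
     */
    static int readersPerReplica(int concurrentTasks, int replication) {
        if (concurrentTasks < 0 || replication < 1) {
            throw new IllegalArgumentException(
                "concurrentTasks must be >= 0 and replication must be >= 1");
        }
        return (concurrentTasks + replication - 1) / replication;
    }

    public static void main(String[] args) {
        int tasks = 1000; // assumed number of task attempts localizing at once
        System.out.println("replication 3:  " + readersPerReplica(tasks, 3));
        System.out.println("replication 10: " + readersPerReplica(tasks, 10));
    }
}
```

Under these assumptions, a registered jar left at replication 3 sees roughly
three times as many readers per replica as one stored with the submit
replication factor of 10, which is the kind of imbalance described above.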