Chao Shi created CRUNCH-352:
-------------------------------

             Summary: Share library jars between MR stages
                 Key: CRUNCH-352
                 URL: https://issues.apache.org/jira/browse/CRUNCH-352
             Project: Crunch
          Issue Type: Improvement
            Reporter: Chao Shi


Currently, library jars are copied to the staging directory every time an MR
job is submitted. This is time-consuming when a pipeline consists of tens of
stages. To make matters worse, the job client may run on a network far from
the cluster.

I found that Hive and Pig have, or will have, this optimization (HIVE-860 and
PIG-2672). YARN also has a similar plan (YARN-1492).

Although this is better done at the YARN/MR level, we can still implement a
client-side solution to benefit users who cannot upgrade to the latest YARN or
have to use legacy MRv1.
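
To illustrate the client-side idea, here is a minimal sketch, not a proposed
patch. It assumes jars are stored once in a shared directory under a
content-derived name, so later stages find the existing copy and skip the
upload. The class name SharedJarCache and its methods are hypothetical, and
the local filesystem (java.nio) stands in for HDFS so that the sketch runs
without Hadoop on the classpath.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.DigestInputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical helper, not part of Crunch: copies each jar at most once. */
public class SharedJarCache {
    private final Path cacheDir;   // assumed shared location (HDFS in practice)
    private int uploads = 0;       // number of copies actually performed

    public SharedJarCache(Path cacheDir) {
        this.cacheDir = cacheDir;
    }

    /** Returns the hex SHA-256 digest of the file contents. */
    static String sha256(Path file) throws IOException {
        MessageDigest md;
        try {
            md = MessageDigest.getInstance("SHA-256");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
        try (InputStream in = new DigestInputStream(Files.newInputStream(file), md)) {
            byte[] buf = new byte[8192];
            while (in.read(buf) != -1) {
                // reading through the stream updates the digest
            }
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : md.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    /**
     * Copies the jar into the shared directory only if a file with the same
     * checksum-based name is not already there, and returns the shared path.
     */
    public Path ensureCached(Path localJar) throws IOException {
        String name = sha256(localJar) + "-" + localJar.getFileName();
        Path target = cacheDir.resolve(name);
        if (!Files.exists(target)) {
            Files.createDirectories(cacheDir);
            Files.copy(localJar, target);
            uploads++;
        }
        return target;
    }

    public int uploadCount() {
        return uploads;
    }
}
```

With something like this, each stage would call ensureCached for its library
jars and add the returned paths to the job's classpath instead of copying the
jars again. A real implementation would also need to handle concurrent
clients and cleanup of stale entries, which this sketch ignores.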



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
