[ 
https://issues.apache.org/jira/browse/PIG-2672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13880084#comment-13880084
 ] 

Dmitriy V. Ryaboy commented on PIG-2672:
----------------------------------------

It seems like a lot of effort is being spent here reinventing what is 
already designed for the general use case in the YARN ticket Aniket linked. 
Let's not let the best be the enemy of the good; let's just get something in 
that will be decent for most cases, and if people don't like it, they can turn 
it off. This is an intermediate solution until that YARN patch goes in, at 
which point all of this becomes moot. 


> Optimize the use of DistributedCache
> ------------------------------------
>
>                 Key: PIG-2672
>                 URL: https://issues.apache.org/jira/browse/PIG-2672
>             Project: Pig
>          Issue Type: Improvement
>            Reporter: Rohini Palaniswamy
>             Fix For: 0.13.0
>
>         Attachments: PIG-2672-5.patch, PIG-2672.patch
>
>
> Pig currently copies jar files to a temporary location in HDFS and then adds 
> them to DistributedCache for each job launched. This is inefficient in terms 
> of:
>    * Space - The jars are distributed to task trackers for every job, taking 
> up a lot of local temporary space on the task trackers.
>    * Performance - The jar distribution increases the job launch time.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
