[ 
https://issues.apache.org/jira/browse/PIG-2672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13779273#comment-13779273
 ] 

Rohini Palaniswamy commented on PIG-2672:
-----------------------------------------

To clarify:
  1) In 0.10, all jars (including the Pig jar and registered jars) were 
packaged into job.jar, which JobClient copied into /user/<username>/.staging. 
In 0.11, registered extra jars are instead copied to 
FileLocalizer.getTemporaryPath(pigContext), which is a directory under 
FileLocalizer.relativeRoot, while job.jar is still copied into 
/user/<username>/.staging by JobClient. To secure the 
FileLocalizer.getTemporaryPath location, we need to set permissions to 700 on 
FileLocalizer.relativeRoot. This is an existing security problem in 0.11. 
Your patch copies the jars to a shared or user cache location, but if neither 
is configured it still falls back to FileLocalizer.getTemporaryPath, so the 
problem still needs to be addressed.
  2) The second issue is writing to a user cache location, which this patch 
introduces. Before writing to it, we need to check that it has 700 
permissions and is owned by that user, similar to the check JobClient does 
for /user/<username>/.staging.
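
A minimal sketch of the kind of check meant in 2), and of the 700 setting 
meant in 1). This is only an illustration under assumptions: it runs against 
the local filesystem with java.nio so that it is standalone, whereas in Pig 
the same idea would go through Hadoop's FileSystem, FileStatus and 
FsPermission on HDFS. The class and method names (PrivateDirCheck, 
isPrivateTo, createPrivateDir) are hypothetical and are not part of the patch.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Set;

// Hypothetical helper, not part of Pig or of the attached patch.
public class PrivateDirCheck {
    // 700: read/write/execute for the owner, nothing for group or others.
    static final Set<PosixFilePermission> OWNER_ONLY =
            PosixFilePermissions.fromString("rwx------");

    // True only if dir is a directory, is owned by user, and is exactly 700.
    public static boolean isPrivateTo(Path dir, String user) throws IOException {
        if (!Files.isDirectory(dir)) {
            return false;
        }
        String owner = Files.getOwner(dir).getName();
        Set<PosixFilePermission> perms = Files.getPosixFilePermissions(dir);
        return owner.equals(user) && perms.equals(OWNER_ONLY);
    }

    // Creates dir if needed and then forces 700 on it explicitly, since the
    // permissions applied at creation time depend on the process umask.
    public static Path createPrivateDir(Path dir) throws IOException {
        Files.createDirectories(dir);
        Files.setPosixFilePermissions(dir, OWNER_ONLY);
        return dir;
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("pig-cache-demo");
        // The owner of a directory we just created is the current user.
        String user = Files.getOwner(root).getName();

        Path cache = createPrivateDir(root.resolve("jars"));
        System.out.println("private after create: " + isPrivateTo(cache, user));

        // Loosen the permissions; the check should now reject the directory.
        Files.setPosixFilePermissions(cache,
                PosixFilePermissions.fromString("rwxr-xr-x"));
        System.out.println("private after chmod 755: " + isPrivateTo(cache, user));
    }
}
```

The idea is that a caller refuses to write jars into the cache directory 
unless isPrivateTo returns true, and fails the job otherwise instead of 
silently fixing up the permissions.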
                
> Optimize the use of DistributedCache
> ------------------------------------
>
>                 Key: PIG-2672
>                 URL: https://issues.apache.org/jira/browse/PIG-2672
>             Project: Pig
>          Issue Type: Improvement
>            Reporter: Rohini Palaniswamy
>             Fix For: 0.12.0
>
>         Attachments: PIG-2672.patch
>
>
> Pig currently copies jar files to a temporary location in HDFS and then adds 
> them to DistributedCache for each job launched. This is inefficient in terms 
> of 
>    * Space - The jars are distributed to task trackers for every job, taking 
> up a lot of local temporary space on the tasktrackers.
>    * Performance - The jar distribution increases the job launch time.  

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
