[ 
https://issues.apache.org/jira/browse/PIG-2672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13907447#comment-13907447
 ] 

Brock Noland commented on PIG-2672:
-----------------------------------

FYI, in HIVE-860 a reviewer asked me whether the following code (copied from this 
patch) closes the stream:

{noformat}
String checksum = DigestUtils.shaHex(url.openStream());
{noformat}

It doesn't look like it does, according to the commons-codec source. Therefore I 
think Pig has a file descriptor leak.
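
One possible fix, as a rough sketch rather than the actual patch: open the stream in a try-with-resources block so it is closed whether or not hashing succeeds. To keep the example runnable with only the JDK, it uses java.security.MessageDigest in place of DigestUtils; the class name StreamChecksum and the method sha1Hex are made up for illustration. On Java 6 the same effect needs an explicit try/finally.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not part of Pig or commons-codec.
public class StreamChecksum {

    // Returns the lowercase hex SHA-1 of the bytes behind the URL.
    public static String sha1Hex(URL url) throws IOException, NoSuchAlgorithmException {
        MessageDigest digest = MessageDigest.getInstance("SHA-1");
        // try-with-resources closes the stream even if read() throws,
        // so the file descriptor is not leaked.
        try (InputStream in = url.openStream()) {
            byte[] buffer = new byte[8192];
            int read;
            while ((read = in.read(buffer)) != -1) {
                digest.update(buffer, 0, read);
            }
        }
        // Hex-encode the digest bytes.
        StringBuilder hex = new StringBuilder();
        for (byte b : digest.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }
}
```

If the patch keeps DigestUtils, the equivalent change would be to assign url.openStream() to a variable, pass that to the hashing call, and close it in a finally block.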

> Optimize the use of DistributedCache
> ------------------------------------
>
>                 Key: PIG-2672
>                 URL: https://issues.apache.org/jira/browse/PIG-2672
>             Project: Pig
>          Issue Type: Improvement
>            Reporter: Rohini Palaniswamy
>            Assignee: Aniket Mokashi
>             Fix For: 0.13.0
>
>         Attachments: PIG-2672-10.patch, PIG-2672-5.patch, PIG-2672-7.patch, 
> PIG-2672.patch
>
>
> Pig currently copies jar files to a temporary location in HDFS and then adds 
> them to DistributedCache for each job launched. This is inefficient in terms 
> of:
>    * Space - The jars are distributed to task trackers for every job, taking 
> up a lot of local temporary space on the tasktrackers.
>    * Performance - The jar distribution impacts the job launch time.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
