[ https://issues.apache.org/jira/browse/PIG-1218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12832713#action_12832713 ]

Olga Natkovich commented on PIG-1218:
-------------------------------------

Looks like this patch is for trunk. Since we are planning to merge the LSR branch onto trunk next week, it would be better if this patch applied directly to LSR.

> Use distributed cache to store samples
> --------------------------------------
>
>                 Key: PIG-1218
>                 URL: https://issues.apache.org/jira/browse/PIG-1218
>             Project: Pig
>          Issue Type: Improvement
>            Reporter: Olga Natkovich
>            Assignee: Richard Ding
>             Fix For: 0.7.0
>
>         Attachments: PIG-1218.patch
>
>
> Currently, in the case of skew join and order by, we use a sample that is just
> written to the DFS (not the distributed cache) and, as a result, gets opened and
> copied around more than necessary. This impacts query performance and also
> places unnecessary load on the name node.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.