[ https://issues.apache.org/jira/browse/PIG-1218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Daniel Dai closed PIG-1218.
---------------------------

> Use distributed cache to store samples
> --------------------------------------
>
> Key: PIG-1218
> URL: https://issues.apache.org/jira/browse/PIG-1218
> Project: Pig
> Issue Type: Improvement
> Reporter: Olga Natkovich
> Assignee: Richard Ding
> Fix For: 0.7.0
>
> Attachments: PIG-1218.patch, PIG-1218_2.patch, PIG-1218_3.patch
>
>
> Currently, for skew join and order by, we use a sample that is just
> written to the DFS (not the distributed cache) and, as a result, gets opened and
> copied around more than necessary. This impacts query performance and also
> places unnecessary load on the name node.