[ https://issues.apache.org/jira/browse/PIG-4135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Cheolsoo Park updated PIG-4135:
-------------------------------
    Attachment: PIG-4135-1.patch

Uploading a patch that marks the plan as fetchable only if it is a map-only job, finishes with dump, and includes a limit. I also updated the Pig docs to describe this condition.

> Fetch optimization should be disabled if plan contains no limit
> ---------------------------------------------------------------
>
>                 Key: PIG-4135
>                 URL: https://issues.apache.org/jira/browse/PIG-4135
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Cheolsoo Park
>            Assignee: Cheolsoo Park
>             Fix For: 0.14.0
>
>         Attachments: PIG-4135-1.patch
>
>
> After deploying fetch optimization in production, a couple of users ran into
> this situation. They had fairly large input data, but after filtering it by a
> regular expression, it became small, so they didn't add a limit to the query.
> The problem is that even though the output is small, processing the input
> must be done in the cluster, not in the client. However, fetch optimization
> blindly fetches the entire input into the client since the plan is a map-only
> job and finishes with dump.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
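
The condition described in the patch comment above can be sketched roughly as follows. This is only an illustration, assuming a plan can be summarized by three flags. The names PlanSummary and is_fetchable are hypothetical and are not taken from the Pig codebase or from PIG-4135-1.patch; the real check lives in Pig's Java code and may consider further conditions.

```python
from dataclasses import dataclass


@dataclass
class PlanSummary:
    """Hypothetical summary of a query plan (not a Pig class)."""
    map_only: bool        # the plan needs no reduce phase
    ends_with_dump: bool  # the script finishes with dump rather than store
    has_limit: bool       # the plan contains a limit


def is_fetchable(plan: PlanSummary) -> bool:
    # All three conditions must hold, per the patch description.
    return plan.map_only and plan.ends_with_dump and plan.has_limit


# The case from the issue description: large input, regex filter, dump, no limit.
regex_filter_no_limit = PlanSummary(map_only=True, ends_with_dump=True, has_limit=False)
print(is_fetchable(regex_filter_no_limit))  # False
```

Under this sketch, the regex-filter query from the description is no longer treated as fetchable, so its input would be processed in the cluster instead of being pulled into the client.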