[ 
https://issues.apache.org/jira/browse/PIG-4135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14105186#comment-14105186
 ] 

Lorand Bendig commented on PIG-4135:
------------------------------------

[~cheolsoo], thanks for pointing out this issue. Checking for a filter is a 
good safeguard, but it also narrows the use cases of fetch.
I'm wondering whether we could use an input size estimate instead, similar to 
pig.auto.local.input.maxbytes?
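
A minimal sketch of what such a size-based check could look like, for 
illustration only. The class FetchSizeGuard and the method isFetchAllowed are 
hypothetical and not part of Pig; the sketch assumes the per-input sizes have 
already been estimated elsewhere and that the threshold comes from a setting 
analogous to pig.auto.local.input.maxbytes.

```java
import java.util.List;

/** Hypothetical helper for illustration; this class does not exist in Pig. */
final class FetchSizeGuard {
    private FetchSizeGuard() {}

    /**
     * Decides whether fetch may run in the client, based on estimated input size.
     *
     * @param inputSizesInBytes estimated size of each input; a negative value
     *                          means the size could not be determined
     * @param maxBytes          assumed threshold, analogous in spirit to
     *                          pig.auto.local.input.maxbytes
     * @return true only if every size is known and the total fits within maxBytes
     */
    static boolean isFetchAllowed(List<Long> inputSizesInBytes, long maxBytes) {
        if (maxBytes < 0) {
            return false; // a negative threshold is treated as "never fetch"
        }
        long total = 0L;
        for (long size : inputSizesInBytes) {
            if (size < 0) {
                return false; // unknown size: stay conservative, run in the cluster
            }
            if (size > maxBytes - total) {
                return false; // adding this input would exceed the threshold
            }
            total += size;
        }
        return true;
    }
}
```

With a check like this, a plan with a filter but no limit could still be 
fetched when its input is small, while a large input would be sent to the 
cluster regardless of how small the filtered output turns out to be.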

> Fetch optimization should be disabled if plan contains no limit
> ---------------------------------------------------------------
>
>                 Key: PIG-4135
>                 URL: https://issues.apache.org/jira/browse/PIG-4135
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Cheolsoo Park
>            Assignee: Cheolsoo Park
>             Fix For: 0.14.0
>
>         Attachments: PIG-4135-1.patch
>
>
> After deploying fetch optimization in production, a couple of users ran into 
> this situation. They had fairly large input data, but after filtering it by a 
> regular expression, it became small, so they didn't add a limit to the query. 
> The problem is that even though the output is small, the input must be 
> processed in the cluster, not in the client. However, fetch optimization 
> blindly fetches the entire input into the client because the plan is a 
> map-only job that finishes with a dump.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
