[ https://issues.apache.org/jira/browse/PIG-4329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Cheolsoo Park updated PIG-4329:
-------------------------------
Attachment: PIG-4329-1.patch
Uploading a patch that disables fetch optimization when limit is not pushed up.
> Fetch optimization should be disabled when limit is not pushed up
> -----------------------------------------------------------------
>
> Key: PIG-4329
> URL: https://issues.apache.org/jira/browse/PIG-4329
> Project: Pig
> Issue Type: Bug
> Reporter: Cheolsoo Park
> Assignee: Cheolsoo Park
> Fix For: 0.15.0
>
> Attachments: PIG-4329-1.patch
>
>
> Although PIG-4135 disables fetch optimization when there is no limit in the
> plan, that doesn't solve the problem completely. In fact, fetch optimization
> should still be disabled if the limit is not pushed up. Consider the following
> query:
> {code}
> random_lists = load 'prodhive.schakraborty.search_server_denorm_impressions'
> using DseStorage();
> random_lists = filter random_lists by entity_section=='random';
> random_lists = limit random_lists 10;
> dump random_lists;
> {code}
> Because the {{filter by}} blocks the limit from being pushed up to the
> loader, POLoad actually scans the full table. In this case, fetch
> optimization makes the job extremely slow.
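>
> Until the fix is in, one possible workaround is to turn fetch off for the
> affected script. The sketch below assumes a Pig version that supports the
> {{opt.fetch}} property (starting Pig with {{-N}} / {{-no_fetch}} should have
> the same effect); the query itself is unchanged from the one above.
> {code}
> -- Assumption: opt.fetch is honored by this Pig version.
> -- With fetch disabled, the dump is executed as a regular job instead of
> -- being read directly by the client.
> set opt.fetch false;
> random_lists = load 'prodhive.schakraborty.search_server_denorm_impressions'
> using DseStorage();
> random_lists = filter random_lists by entity_section=='random';
> random_lists = limit random_lists 10;
> dump random_lists;
> {code}
> Running {{explain random_lists;}} before the dump is a way to check whether
> Pig still plans to fetch the result directly.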