[ 
https://issues.apache.org/jira/browse/PIG-4329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cheolsoo Park updated PIG-4329:
-------------------------------
    Description: 
Although PIG-4135 disables fetch optimization when there is no limit in the 
plan, that doesn't solve the problem completely. In fact, fetch optimization 
should also be disabled if the limit is not pushed up. Consider the following 
query:
{code}
random_lists = load 'prodhive.schakraborty.search_server_denorm_impressions' 
using DseStorage();
random_lists = filter random_lists by entity_section=='random';
random_lists = limit random_lists 10;
dump random_lists;
{code}
Because the {{filter by}} blocks limit from being pushed up, POLoad actually 
scans the full table. In this case, fetch optimization makes the job extremely 
slow.

  was:
Although PIG-4135 disables fetch optimization when there is no limit in the 
plan, that doesn't solve the problem completely. In fact, fetch optimization 
should also be disabled if the limit is not pushed up. Consider the following 
query:
{code}
random_lists = load 'prodhive.schakraborty.search_server_denorm_impressions' 
using DseStorage();
random_lists = filter random_lists by entity_section=='random');
random_lists = limit random_lists 10;
dump random_lists;
{code}
Because the {{filter by}} blocks limit from being pushed up, POLoad actually 
scans the full table. In this case, fetch optimization makes the job extremely 
slow.


> Fetch optimization should be disabled when limit is not pushed up
> -----------------------------------------------------------------
>
>                 Key: PIG-4329
>                 URL: https://issues.apache.org/jira/browse/PIG-4329
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Cheolsoo Park
>            Assignee: Cheolsoo Park
>             Fix For: 0.15.0
>
>
> Although PIG-4135 disables fetch optimization when there is no limit in the 
> plan, that doesn't solve the problem completely. In fact, fetch optimization 
> should also be disabled if the limit is not pushed up. Consider the following 
> query:
> {code}
> random_lists = load 'prodhive.schakraborty.search_server_denorm_impressions' 
> using DseStorage();
> random_lists = filter random_lists by entity_section=='random';
> random_lists = limit random_lists 10;
> dump random_lists;
> {code}
> Because the {{filter by}} blocks limit from being pushed up, POLoad actually 
> scans the full table. In this case, fetch optimization makes the job 
> extremely slow.
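
Until a fix is available, one possible workaround is to switch fetch optimization off for the affected script. The sketch below reuses the query from the description and assumes that the {{opt.fetch}} property controls fetch optimization in the Pig version in use; this should be verified against that version's documentation.
{code}
-- Assumption: opt.fetch is the switch for fetch optimization. Setting it to
-- false should make the dump run as a regular job instead of in fetch mode.
set opt.fetch false;

random_lists = load 'prodhive.schakraborty.search_server_denorm_impressions' 
using DseStorage();
-- The filter sits between the load and the limit, so the limit cannot be
-- pushed up to the load.
random_lists = filter random_lists by entity_section=='random';
random_lists = limit random_lists 10;
dump random_lists;
{code}
The same effect may also be available from the command line through the {{-N}} / {{-no_fetch}} option, if the installed Pig supports it.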



