[
https://issues.apache.org/jira/browse/PIG-4329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Cheolsoo Park updated PIG-4329:
-------------------------------
Description:
Although PIG-4135 disables fetch optimization when there is no limit in the
plan, that doesn't solve the problem completely. In fact, fetch optimization
should still be disabled if the limit is not pushed up. Consider the following
query:
{code}
random_lists = load 'prodhive.schakraborty.search_server_denorm_impressions'
using DseStorage();
random_lists = filter random_lists by entity_section=='random';
random_lists = limit random_lists 10;
dump random_lists;
{code}
Because the {{filter by}} blocks the limit from being pushed up, POLoad
actually scans the full table. In this case, fetch optimization makes the job
extremely slow.
was:
Although PIG-4135 disables fetch optimization when there is no limit in the
plan, that doesn't solve the problem completely. In fact, fetch optimization
should still be disabled if the limit is not pushed up. Consider the following
query:
{code}
random_lists = load 'prodhive.schakraborty.search_server_denorm_impressions'
using DseStorage();
random_lists = filter random_lists by entity_section=='random');
random_lists = limit random_lists 10;
dump random_lists;
{code}
Because the {{filter by}} blocks the limit from being pushed up, POLoad
actually scans the full table. In this case, fetch optimization makes the job
extremely slow.
> Fetch optimization should be disabled when limit is not pushed up
> -----------------------------------------------------------------
>
> Key: PIG-4329
> URL: https://issues.apache.org/jira/browse/PIG-4329
> Project: Pig
> Issue Type: Bug
> Reporter: Cheolsoo Park
> Assignee: Cheolsoo Park
> Fix For: 0.15.0
>
>
> Although PIG-4135 disables fetch optimization when there is no limit in the
> plan, that doesn't solve the problem completely. In fact, fetch optimization
> should still be disabled if the limit is not pushed up. Consider the
> following query:
> {code}
> random_lists = load 'prodhive.schakraborty.search_server_denorm_impressions'
> using DseStorage();
> random_lists = filter random_lists by entity_section=='random';
> random_lists = limit random_lists 10;
> dump random_lists;
> {code}
> Because the {{filter by}} blocks the limit from being pushed up, POLoad
> actually scans the full table. In this case, fetch optimization makes the
> job extremely slow.
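
Until a fix is available, one possible workaround is to switch fetch
optimization off for the affected script, so the filter and limit run as a
regular job. The sketch below is not part of the issue description; it assumes
the {{opt.fetch}} property is the toggle for fetch optimization in the
deployed Pig version (0.13 or later) and reuses the query from the report.
{code}
-- Assumption: opt.fetch controls fetch optimization in this Pig version.
-- Disabling it keeps the full-table scan out of fetch mode.
set opt.fetch false;

random_lists = load 'prodhive.schakraborty.search_server_denorm_impressions'
using DseStorage();
-- The filter sits between the load and the limit, so the limit cannot be
-- pushed up to the loader.
random_lists = filter random_lists by entity_section=='random';
random_lists = limit random_lists 10;
dump random_lists;
{code}
Running {{explain random_lists;}} before the {{dump}} may help confirm which
execution path the plan takes.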
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)