Illya Yalovyy created HIVE-9225:
-----------------------------------
Summary: Windowing functions are not executing efficiently when
the window is identical
Key: HIVE-9225
URL: https://issues.apache.org/jira/browse/HIVE-9225
Project: Hive
Issue Type: Improvement
Components: PTF-Windowing
Affects Versions: 0.13.0
Environment: Linux
Reporter: Illya Yalovyy
Neither the Hive optimizer nor the runtime recognizes when several windowing
functions share the same window specification. Even when the windows are
identical, the windowing is re-executed for each function, so the runtime
increases in proportion to the number of windows.
Example:
{code:sql}
select code,
min(emp) over (partition by code order by emp range between current row and
300000000 following)
from sample_big limit 10;
{code}
*Time taken: 1h:36m:12s*
{code:sql}
select code,
min(emp) over (partition by code order by emp range between current row and
300000000 following),
max(emp) over (partition by code order by emp range between current row and
300000000 following),
min(salary) over (partition by code order by emp range between current row and
300000000 following),
max(salary) over (partition by code order by emp range between current row and
300000000 following)
from sample_big limit 10;
{code}
*Time taken: 4h:0m:37s*
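For reference, the second query can also be written with a named window using Hive's WINDOW clause, so that the shared window specification is stated only once. This is a sketch of an equivalent formulation, not a confirmed workaround: the alias w is illustrative, and it has not been verified whether Hive 0.13.0 executes this form any faster.
{code:sql}
-- "w" is an illustrative alias for the window shared by all four functions.
select code,
min(emp) over w,
max(emp) over w,
min(salary) over w,
max(salary) over w
from sample_big
window w as (partition by code order by emp range between current row and
300000000 following)
limit 10;
{code}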