Shubham Chaurasia created HIVE-22903:
----------------------------------------
Summary: Vectorized row_number() resets the row number after one
batch in case of constant expression in partition clause
Key: HIVE-22903
URL: https://issues.apache.org/jira/browse/HIVE-22903
Project: Hive
Issue Type: Bug
Components: UDF, Vectorization
Affects Versions: 4.0.0
Reporter: Shubham Chaurasia
Assignee: Shubham Chaurasia
The vectorized row_number() implementation resets the row number after each
batch when a constant expression is passed in the partition clause.
Repro queries:
{code}
select row_number() over(partition by 1) r1, t from over10k_n8;
-- or
select row_number() over() r1, t from over10k_n8;
{code}
where the table over10k_n8 contains more than 1024 records, i.e. more rows
than fit in a single vectorized row batch at the default batch size.
This happens because VectorPTFOperator currently resets the evaluators
whenever only a partition clause is present:
{code:java}
// If we are only processing a PARTITION BY, reset our evaluators.
if (!isPartitionOrderBy) {
  groupBatches.resetEvaluators();
}
{code}
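The effect of that reset can be illustrated with a small, self-contained toy model. This is a sketch, not Hive code: the class RowNumberResetDemo, the RowNumberEvaluator counter, and the number() helper are all hypothetical names, and the batch size of 1024 is assumed to match the record threshold mentioned above. All rows are treated as belonging to a single partition, as they would with a constant partition expression.

```java
import java.util.ArrayList;
import java.util.List;

public class RowNumberResetDemo {
    // Assumed batch size, chosen to match the 1024-record threshold in this issue.
    static final int BATCH_SIZE = 1024;

    /** Toy stand-in for a row_number evaluator: a running counter (hypothetical). */
    static final class RowNumberEvaluator {
        private long next = 1;

        long nextRowNumber() {
            return next++;
        }

        void reset() {
            next = 1;
        }
    }

    /**
     * Numbers totalRows rows that all belong to one partition, one batch at a time.
     * When resetPerBatch is true, the counter is reset after every batch,
     * which mimics the behaviour reported here.
     */
    static List<Long> number(int totalRows, boolean resetPerBatch) {
        RowNumberEvaluator evaluator = new RowNumberEvaluator();
        List<Long> result = new ArrayList<>(totalRows);
        for (int start = 0; start < totalRows; start += BATCH_SIZE) {
            int end = Math.min(start + BATCH_SIZE, totalRows);
            for (int i = start; i < end; i++) {
                result.add(evaluator.nextRowNumber());
            }
            if (resetPerBatch) {
                evaluator.reset();
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Long> withReset = number(2000, true);
        List<Long> withoutReset = number(2000, false);
        // Row 1025 is the first row of the second batch.
        System.out.println("row 1025 with reset:    " + withReset.get(1024));    // 1
        System.out.println("row 1025 without reset: " + withoutReset.get(1024)); // 1025
    }
}
```

In this model the numbering restarts at 1 on the first row of the second batch, which is the symptom described in the summary.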
To resolve this, we should also check whether the entire partition clause is a
constant expression; if it is, we should not call {{groupBatches.resetEvaluators()}}.
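One possible shape for that extra check is sketched below. It is only an illustration under stated assumptions, not the actual patch: PartitionExpr, isConstantPartition() and shouldResetEvaluators() are hypothetical names that do not exist in Hive, and an empty partition list (the over() case) is assumed to count as constant.

```java
import java.util.List;

public class ResetGuardSketch {
    /** Hypothetical stand-in for a partition expression; not a Hive type. */
    interface PartitionExpr {
        boolean isConstant();
    }

    // Helpers for building toy expressions in this sketch.
    static PartitionExpr constant() {
        return () -> true;
    }

    static PartitionExpr column() {
        return () -> false;
    }

    /** True when every partition expression is a constant (or there are none). */
    static boolean isConstantPartition(List<PartitionExpr> exprs) {
        return exprs.stream().allMatch(PartitionExpr::isConstant);
    }

    /**
     * Reset only when there is no ORDER BY and the partition key can actually
     * change between rows, i.e. the partition clause is not entirely constant.
     */
    static boolean shouldResetEvaluators(boolean isPartitionOrderBy,
                                         List<PartitionExpr> exprs) {
        return !isPartitionOrderBy && !isConstantPartition(exprs);
    }

    public static void main(String[] args) {
        // partition by 1: constant, so no reset
        System.out.println(shouldResetEvaluators(false, List.of(constant()))); // false
        // partition by a column: reset as before
        System.out.println(shouldResetEvaluators(false, List.of(column())));   // true
        // over(): assumed to behave like a constant partition
        System.out.println(shouldResetEvaluators(false, List.of()));           // false
    }
}
```

A clause that mixes a constant with a column is not entirely constant, so in this sketch it still triggers the reset.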