Shubham Chaurasia created HIVE-22903:
----------------------------------------

             Summary: Vectorized row_number() resets the row number after one 
batch in case of constant expression in partition clause
                 Key: HIVE-22903
                 URL: https://issues.apache.org/jira/browse/HIVE-22903
             Project: Hive
          Issue Type: Bug
          Components: UDF, Vectorization
    Affects Versions: 4.0.0
            Reporter: Shubham Chaurasia
            Assignee: Shubham Chaurasia


The vectorized row_number() implementation resets the row number after each 
batch when a constant expression is passed in the partition clause.

Repro Query
{code}
select row_number() over(partition by 1) r1, t from over10k_n8;

Or

select row_number() over() r1, t from over10k_n8;
{code}
where table over10k_n8 contains more than 1024 records, i.e. more than one 
vectorized row batch at the default batch size.

This happens because VectorPTFOperator currently resets the evaluators 
whenever only a partition clause is present.
{code:java}
    // If we are only processing a PARTITION BY, reset our evaluators.
    if (!isPartitionOrderBy) {
      groupBatches.resetEvaluators();
    }
{code}

To resolve this, we should also check whether the entire partition clause is 
a constant expression; if so, we should not call 
{{groupBatches.resetEvaluators()}}.

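
The effect of the reset, and of the proposed guard, can be illustrated with a small standalone toy model. This is only a sketch and not Hive code: the class {{RowNumberResetSketch}}, the {{RowNumberEvaluator}} helper, and the {{partitionIsConstant}} and {{applyFix}} flags are hypothetical names made up for illustration, and the model assumes the reset runs once per batch of 1024 rows.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: none of these names exist in Hive.
public class RowNumberResetSketch {
    // Assumed batch size, matching the 1024-record threshold in the repro.
    static final int BATCH_SIZE = 1024;

    // Minimal stand-in for a row_number() evaluator.
    static final class RowNumberEvaluator {
        private long next = 1;
        long nextRowNumber() { return next++; }
        void reset() { next = 1; }
    }

    // Returns the row numbers produced for totalRows rows that all belong
    // to a single partition (constant or empty partition clause).
    static List<Long> rowNumbers(int totalRows, boolean isPartitionOrderBy,
                                 boolean partitionIsConstant, boolean applyFix) {
        RowNumberEvaluator evaluator = new RowNumberEvaluator();
        List<Long> out = new ArrayList<>(totalRows);
        for (int start = 0; start < totalRows; start += BATCH_SIZE) {
            // Hypothetical guard: skip the per-batch reset when the whole
            // partition clause is a constant expression.
            boolean skipReset = applyFix && partitionIsConstant;
            if (!isPartitionOrderBy && !skipReset) {
                evaluator.reset();
            }
            int end = Math.min(start + BATCH_SIZE, totalRows);
            for (int i = start; i < end; i++) {
                out.add(evaluator.nextRowNumber());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Long> current = rowNumbers(2500, false, true, false);
        List<Long> proposed = rowNumbers(2500, false, true, true);
        // Row 1025 is the first row of the second batch.
        System.out.println("current behaviour,  row 1025: " + current.get(1024));
        System.out.println("proposed behaviour, row 1025: " + proposed.get(1024));
    }
}
```

In this model the unguarded reset makes the 1025th row restart at 1, while the guarded variant keeps counting to 1025. How the actual operator would detect a constant partition clause is not covered by the sketch.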


--
This message was sent by Atlassian Jira
(v8.3.4#803005)
