[ 
https://issues.apache.org/jira/browse/HIVE-27291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis reassigned HIVE-27291:
------------------------------------------

    Assignee: Krisztian Kasa  (was: Stamatis Zampetakis)

> Constant reduction in CBO does not work for UNIX_TIMESTAMP
> ----------------------------------------------------------
>
>                 Key: HIVE-27291
>                 URL: https://issues.apache.org/jira/browse/HIVE-27291
>             Project: Hive
>          Issue Type: Improvement
>          Components: CBO
>    Affects Versions: 4.0.0-alpha-2
>            Reporter: Stamatis Zampetakis
>            Assignee: Krisztian Kasa
>            Priority: Major
>              Labels: pull-request-available
>
> The {{UNIX_TIMESTAMP}} function always returns the same output for the same 
> input for the duration of a query. In Hive terminology, this function is a 
> [runtimeConstant|https://github.com/apache/hive/blob/59058c65457fb7ab9d8575a555034e6633962661/udf/src/java/org/apache/hadoop/hive/ql/udf/UDFType.java#L72].
> Such functions can be computed statically (reduced) at compile time, and this 
> works for the vast majority of them, the most relevant example being 
> {{CURRENT_TIMESTAMP()}}.
> However, constant reduction does not work for {{UNIX_TIMESTAMP}} in CBO:
> {code:sql}
> EXPLAIN CBO SELECT unix_timestamp();
> {code}
> {noformat}
> HiveProject(_o__c0=[UNIX_TIMESTAMP()])
>   HiveTableScan(table=[[_dummy_database, _dummy_table]], table:alias=[_dummy_table])
> {noformat}
> {code:sql}
> EXPLAIN CBO SELECT unix_timestamp('2009-03-20', 'yyyy-MM-dd');
> {code}
> {noformat}
> CBO PLAN:
> HiveProject(_o__c0=[UNIX_TIMESTAMP(_UTF-16LE'2009-03-20':VARCHAR(2147483647) CHARACTER SET "UTF-16LE", _UTF-16LE'yyyy-MM-dd':VARCHAR(2147483647) CHARACTER SET "UTF-16LE")])
>   HiveTableScan(table=[[_dummy_database, _dummy_table]], table:alias=[_dummy_table])
> {noformat}
> In contrast, constant reduction works fine in the physical plan:
> {code:sql}
> EXPLAIN SELECT unix_timestamp();
> {code}
> {noformat}
> STAGE DEPENDENCIES:
>   Stage-0 is a root stage
> STAGE PLANS:
>   Stage: Stage-0
>     Fetch Operator
>       limit: -1
>       Processor Tree:
>         TableScan
>           alias: _dummy_table
>           Row Limit Per Split: 1
>           Select Operator
>             expressions: 1682411039L (type: bigint)
>             outputColumnNames: _col0
>             ListSink
> {noformat}
> Generally, we want to perform as much constant reduction as possible at the 
> CBO level because it can affect expression pushdown to various storage 
> handlers (HIVE-21388) as well as predicate simplification/elimination.
> Currently we fail to reduce {{UNIX_TIMESTAMP}} at the CBO level because the 
> respective operator is marked as a 
> [dynamicFunction|https://github.com/apache/hive/blob/59058c65457fb7ab9d8575a555034e6633962661/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/reloperators/HiveUnixTimestampSqlOperator.java#L38]
>  and the reduction rules in Calcite explicitly skip reduction [in this 
> case|https://github.com/apache/calcite/blob/68b02dfd4af15bc94a91a0cd2a30655d04439555/core/src/main/java/org/apache/calcite/rel/rules/ReduceExpressionsRule.java#L1098].
> As of Calcite 1.28.0 (CALCITE-2736), the reduction of dynamic functions 
> is configurable, so we may be able to exploit this feature. 
> Alternatively, we will have to treat {{UNIX_TIMESTAMP}} in a similar fashion to 
> {{CURRENT_TIMESTAMP}} and possibly rely on HiveSqlFunction.
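
The skip described above can be illustrated with a toy model. This is only a sketch under stated assumptions: the class ConstantReductionSketch, the Op type, and the treatDynamicAsConstant flag are hypothetical names made up for illustration and are not Calcite or Hive APIs. The literal value is the one shown in the physical plan above, hard-coded so that the example is deterministic.

```java
import java.util.function.LongSupplier;

public class ConstantReductionSketch {

    /** Toy stand-in for an operator; hypothetical, not Calcite's SqlOperator. */
    static final class Op {
        final String name;
        final boolean dynamic;   // models an operator flagged as a dynamic function
        final LongSupplier eval; // how the value would be computed at compile time

        Op(String name, boolean dynamic, LongSupplier eval) {
            this.name = name;
            this.dynamic = dynamic;
            this.eval = eval;
        }
    }

    /**
     * Returns the literal text if the call is reduced, or the call text
     * unchanged if the reducer skips it. The flag is a hypothetical switch
     * standing in for "reduction of dynamic functions is configurable".
     */
    static String reduce(Op op, boolean treatDynamicAsConstant) {
        if (op.dynamic && !treatDynamicAsConstant) {
            return op.name + "()"; // skipped: the plan keeps the function call
        }
        return op.eval.getAsLong() + "L"; // folded into a bigint literal
    }

    public static void main(String[] args) {
        // Fixed value copied from the physical plan in the issue (assumption:
        // a real reducer would evaluate the function once per query instead).
        Op unixTimestamp = new Op("UNIX_TIMESTAMP", true, () -> 1682411039L);

        System.out.println(reduce(unixTimestamp, false)); // prints UNIX_TIMESTAMP()
        System.out.println(reduce(unixTimestamp, true));  // prints 1682411039L
    }
}
```

With the flag off, the call survives into the plan, which matches the EXPLAIN CBO output above. With it on, the call is folded to a literal, as already happens in the physical plan.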



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
