[ https://issues.apache.org/jira/browse/HIVE-27291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Stamatis Zampetakis reassigned HIVE-27291:
------------------------------------------
Assignee: Krisztian Kasa (was: Stamatis Zampetakis)
> Constant reduction in CBO does not work for UNIX_TIMESTAMP
> ----------------------------------------------------------
>
> Key: HIVE-27291
> URL: https://issues.apache.org/jira/browse/HIVE-27291
> Project: Hive
> Issue Type: Improvement
> Components: CBO
> Affects Versions: 4.0.0-alpha-2
> Reporter: Stamatis Zampetakis
> Assignee: Krisztian Kasa
> Priority: Major
> Labels: pull-request-available
>
> The {{UNIX_TIMESTAMP}} function always returns the same output given the same
> input for the duration of the query. In Hive terminology, this function is a
> [runtimeConstant|https://github.com/apache/hive/blob/59058c65457fb7ab9d8575a555034e6633962661/udf/src/java/org/apache/hadoop/hive/ql/udf/UDFType.java#L72].
> Such functions can be computed statically (reduced) at compile time, and this
> happens successfully for the vast majority of them, the most relevant
> example being {{CURRENT_TIMESTAMP()}}.
> However, constant reduction does not work for {{UNIX_TIMESTAMP}} in CBO:
> {code:sql}
> EXPLAIN CBO SELECT unix_timestamp();
> {code}
> {noformat}
> HiveProject(_o__c0=[UNIX_TIMESTAMP()])
>   HiveTableScan(table=[[_dummy_database, _dummy_table]], table:alias=[_dummy_table])
> {noformat}
> {code:sql}
> EXPLAIN CBO SELECT unix_timestamp('2009-03-20', 'yyyy-MM-dd');
> {code}
> {noformat}
> CBO PLAN:
> HiveProject(_o__c0=[UNIX_TIMESTAMP(_UTF-16LE'2009-03-20':VARCHAR(2147483647) CHARACTER SET "UTF-16LE", _UTF-16LE'yyyy-MM-dd':VARCHAR(2147483647) CHARACTER SET "UTF-16LE")])
>   HiveTableScan(table=[[_dummy_database, _dummy_table]], table:alias=[_dummy_table])
> {noformat}
> Observe that constant reduction works fine in the physical plan.
> {code:sql}
> EXPLAIN SELECT unix_timestamp();
> {code}
> {noformat}
> STAGE DEPENDENCIES:
>   Stage-0 is a root stage
> STAGE PLANS:
>   Stage: Stage-0
>     Fetch Operator
>       limit: -1
>       Processor Tree:
>         TableScan
>           alias: _dummy_table
>           Row Limit Per Split: 1
>           Select Operator
>             expressions: 1682411039L (type: bigint)
>             outputColumnNames: _col0
>             ListSink
> {noformat}
> Generally, we want to perform as much constant reduction as possible at the
> CBO level because it can affect expression pushdown in various storage
> handlers (HIVE-21388) as well as predicate simplification/elimination.
> Currently, we fail to reduce {{UNIX_TIMESTAMP}} at the CBO level because the
> respective operator is marked as a
> [dynamicFunction|https://github.com/apache/hive/blob/59058c65457fb7ab9d8575a555034e6633962661/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/reloperators/HiveUnixTimestampSqlOperator.java#L38]
> and the reduction rules in Calcite explicitly skip reduction [in this
> case|https://github.com/apache/calcite/blob/68b02dfd4af15bc94a91a0cd2a30655d04439555/core/src/main/java/org/apache/calcite/rel/rules/ReduceExpressionsRule.java#L1098].
> As of Calcite 1.28.0 (CALCITE-2736), the reduction of dynamic functions is
> configurable, so we may be able to exploit this feature.
> Alternatively, we will have to treat {{UNIX_TIMESTAMP}} in a similar fashion to
> {{CURRENT_TIMESTAMP}} and possibly rely on HiveSqlFunction.
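
To make the behaviour described above concrete, here is a minimal toy model of a reducer that skips dynamic calls unless told to treat them as constants. It is only a sketch: ReduceSketch, Expr, Literal, Call and the treatDynamicCallsAsConstant flag are hypothetical names made up for this illustration, not Calcite or Hive classes, and the real rule works on RexNode trees rather than on this simplified structure. The MINUS call and the 3600 literal are likewise invented; only the timestamp value is taken from the EXPLAIN output quoted above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.function.Function;

// Toy model only: none of these types are Calcite or Hive classes.
final class ReduceSketch {

    /** Minimal expression tree node: either a Literal or a Call. */
    interface Expr {}

    static final class Literal implements Expr {
        final long value;
        Literal(long value) { this.value = value; }
    }

    static final class Call implements Expr {
        final String name;
        final boolean dynamic; // models an operator flagged as a dynamic function
        final List<Expr> args;
        final Function<List<Long>, Long> impl;

        Call(String name, boolean dynamic, List<Expr> args,
             Function<List<Long>, Long> impl) {
            this.name = name;
            this.dynamic = dynamic;
            this.args = args;
            this.impl = impl;
        }
    }

    /**
     * Folds a call into a literal when every argument folds to a literal and
     * the call is either not dynamic or dynamic calls are treated as constants.
     */
    static Expr reduce(Expr e, boolean treatDynamicCallsAsConstant) {
        if (!(e instanceof Call)) {
            return e; // literals are already reduced
        }
        Call call = (Call) e;
        List<Expr> reducedArgs = new ArrayList<>();
        List<Long> values = new ArrayList<>();
        for (Expr arg : call.args) {
            Expr reduced = reduce(arg, treatDynamicCallsAsConstant);
            reducedArgs.add(reduced);
            if (reduced instanceof Literal) {
                values.add(((Literal) reduced).value);
            }
        }
        boolean allLiterals = values.size() == reducedArgs.size();
        boolean blocked = call.dynamic && !treatDynamicCallsAsConstant;
        if (!allLiterals || blocked) {
            // Keep the call, but with whatever argument folding was possible.
            return new Call(call.name, call.dynamic, reducedArgs, call.impl);
        }
        return new Literal(call.impl.apply(values));
    }

    public static void main(String[] args) {
        // Fixed once per "query", like a runtime constant; the value is the
        // one shown in the physical plan quoted in the issue.
        final long queryStart = 1682411039L;
        Expr now = new Call("UNIX_TIMESTAMP", true,
            Collections.<Expr>emptyList(), a -> queryStart);
        Expr minusHour = new Call("MINUS", false,
            Arrays.asList(now, new Literal(3600L)), a -> a.get(0) - a.get(1));

        // Flag off: the dynamic call survives and blocks folding of its parent.
        System.out.println(reduce(minusHour, false) instanceof Call); // true
        // Flag on: the whole expression folds to a single literal.
        System.out.println(((Literal) reduce(minusHour, true)).value); // 1682407439
    }
}
```

The point of the sketch is that one unreduced dynamic call keeps every enclosing expression from folding, which is why the limitation matters for pushdown and predicate simplification. Whether flipping the equivalent setting in Calcite is safe for Hive is a separate question that this toy does not answer.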