[ https://issues.apache.org/jira/browse/IMPALA-9035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17013035#comment-17013035 ]
ASF subversion and git services commented on IMPALA-9035: --------------------------------------------------------- Commit 6e3d1216f1680e7fca4b5979f182a19673018a2f in impala's branch refs/heads/master from xuzhou [ https://gitbox.apache.org/repos/asf?p=impala.git;h=6e3d121 ] IMPALA-9035: Simplify casting string to timestamp. This change will help with queries generated by some BI tools. Case 1: Simplify 'string -> bigint -> timestamp' TO 'string -> timestamp': cast(unix_timestamp('timestr') as timestamp) -> cast('timestr' as timestamp) Case 2: Simplify 'string[fmt] -> bigint -> timestamp' TO 'string -> timestamp': cast(unix_timestamp('timestr', 'fmt') as timestamp) -> to_timestamp('timestr', 'fmt') Tests: Add front-end tests in ExprRewriteRulesTest. Change-Id: I4ed72d6e7886eaf50d2be60cf45170ffaef5e72d Reviewed-on: http://gerrit.cloudera.org:8080/14896 Reviewed-by: Attila Jeges <atti...@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenk...@cloudera.com> > Simplify cast string to timestamp > --------------------------------- > > Key: IMPALA-9035 > URL: https://issues.apache.org/jira/browse/IMPALA-9035 > Project: IMPALA > Issue Type: Improvement > Components: Frontend > Reporter: 徐洲 > Assignee: xuzhou > Priority: Critical > > Some BI tools generate compatible sqls to cast string to timestamp: > cast(unix_timestamp('timestr', 'fmt') as timestamp) or > cast(unix_timestamp('timestr') as timestamp) > The internal type change in impala: > StringVal->TimestampValue->BigIntVal→TimestampValue→TimestampVal > In earlier 2.x versions, casting TimestampValue to BigIntVal using libc > functions which is more expensive than current design with CCTZ. > However, the cast to BIgIntVal seems to be redundant. We can simplify > cast(unix_timestamp('timestr', 'fmt') as timestamp) to > to_timestamp('timestr', 'fmt'), simplify cast(unix_timestamp('timestr') as > timestamp) to cast('timestr' as timestamp). > I managed to modify the fe to support such expr rewriting. > Benefiting from the rewriting, the query time cost is reduced from 2.52s to > 2.02s. > Here is the sql: > SELECT `t2`.`phy_category1_name` `d0`, SUM(`t1`.`apply_count`) `m0` FROM > `tr_orders` `t1` LEFT JOIN `dim_sku` `t2` ON (`t1`.`sku_id` = `t2`.`sku_id`) > WHERE (((CAST(UNIX_TIMESTAMP(`t1`.`date_id`) AS TIMESTAMP)) >= > cast('2018-08-26' as timestamp)) AND ((CAST(UNIX_TIMESTAMP(`t1`.`date_id`) AS > TIMESTAMP)) < cast('2019-09-02' as timestamp))) GROUP BY > `t2`.`phy_category1_name` order by d0; > Query: SELECT `t2`.`phy_category1_name` `d0`, SUM(`t1`.`apply_count`) `m0` > FROM `tr_orders` `t1` LEFT JOIN `dim_sku` `t2` ON (`t1`.`sku_id` = > `t2`.`sku_id`) WHERE (((CAST(UNIX_TIMESTAMP(`t1`.`date_id`) AS TIMESTAMP)) >= > cast('2018-08-26' as timestamp)) AND ((CAST(UNIX_TIMESTAMP(`t1`.`date_id`) AS > TIMESTAMP)) < cast('2019-09-02' as timestamp))) GROUP BY > `t2`.`phy_category1_name` order by d0 > -- This message was sent by Atlassian Jira (v8.3.4#803005) --------------------------------------------------------------------- To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org