[ https://issues.apache.org/jira/browse/IMPALA-10116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aman Sinha resolved IMPALA-10116.
---------------------------------
    Fix Version/s: Impala 4.0
       Resolution: Fixed

> Builtin cast function's selectivity is different from that of explicit cast
> ---------------------------------------------------------------------------
>
>                 Key: IMPALA-10116
>                 URL: https://issues.apache.org/jira/browse/IMPALA-10116
>             Project: IMPALA
>          Issue Type: Sub-task
>          Components: Frontend
>    Affects Versions: Impala 3.4.0
>            Reporter: Aman Sinha
>            Assignee: Aman Sinha
>            Priority: Major
>             Fix For: Impala 4.0
>
>
> Query 1 below uses 'casttobigint()' in the IS NOT NULL predicate and its
> selectivity is computed as the default 10% of the input rows, resulting in
> cardinality = 7.3K. The predicate in Query 2 with 'CAST' expr computes the
> correct cardinality of 73.05K.
> Query 1:
> {noformat}
> Query: explain select * from date_dim d1, date_dim d2 where d1.d_week_seq =
> d2.d_week_seq - 52 and casttobigint(d1.d_week_seq) is not null and
> casttobigint(d2.d_week_seq) is not null
> |                                                             |
> | 00:SCAN HDFS [tpcds.date_dim d1]                            |
> |    HDFS partitions=1/1 files=1 size=9.84MB                  |
> |    predicates: casttobigint(d1.d_week_seq) IS NOT NULL      |
> |    runtime filters: RF000 -> d1.d_week_seq                  |
> |    row-size=255B cardinality=7.30K                          |
> +-------------------------------------------------------------+
> {noformat}
> Query 2:
> {noformat}
> Query: explain select * from date_dim d1, date_dim d2 where d1.d_week_seq =
> d2.d_week_seq - 52 and cast(d1.d_week_seq as bigint) is not null and
> cast(d2.d_week_seq as bigint) is not null
> | 00:SCAN HDFS [tpcds.date_dim d1]                            |
> |    HDFS partitions=1/1 files=1 size=9.84MB                  |
> |    predicates: CAST(d1.d_week_seq AS BIGINT) IS NOT NULL    |
> |    runtime filters: RF000 -> d1.d_week_seq                  |
> |    row-size=255B cardinality=73.05K                         |
> +-------------------------------------------------------------+
> {noformat}
> Query 1 should ideally provide the same cardinality as Query 2.
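
The gap between the two plans matches simple arithmetic: a 10% default selectivity applied to the scan input gives roughly 7.3K rows, while a statistics-based estimate for IS NOT NULL on a column with no NULLs keeps all of them. The sketch below only illustrates that arithmetic and is not Impala's planner code. The class and method names are hypothetical, and the 73,049-row input and the zero NULL count are assumptions chosen to be consistent with the 73.05K shown in the plan.

```java
// Illustrative sketch only; names are hypothetical and not part of Impala.
class SelectivitySketch {
    // Fallback selectivity, taken from the "default 10%" mentioned in the report.
    static final double DEFAULT_SELECTIVITY = 0.1;

    // Assumed stats-based estimate for "<col> IS NOT NULL":
    // the fraction of rows whose value is not NULL.
    static double isNotNullSelectivity(long numRows, long numNulls) {
        if (numRows <= 0 || numNulls < 0 || numNulls > numRows) {
            return DEFAULT_SELECTIVITY; // stats unusable, fall back to the default
        }
        return (double) (numRows - numNulls) / numRows;
    }

    // Estimated output rows for a scan with a single predicate.
    static long cardinality(long inputRows, double selectivity) {
        return Math.round(inputRows * selectivity);
    }

    public static void main(String[] args) {
        long rows = 73_049L; // assumed row count of tpcds.date_dim
        long nulls = 0L;     // assumed: d_week_seq contains no NULLs

        // Query 1: the predicate is not recognized, so the default applies.
        System.out.println(cardinality(rows, DEFAULT_SELECTIVITY));
        // Query 2: the predicate is estimated from column statistics.
        System.out.println(cardinality(rows, isNotNullSelectivity(rows, nulls)));
    }
}
```

Under these assumptions the first estimate is about a tenth of the second, which is the discrepancy the two EXPLAIN outputs show.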
> Note that I had to comment out the following lines in FunctionCallExpr.java
> because a user query is not supposed to directly call the builtin cast
> function. However, for an external frontend module that calls functions in
> impala-frontend.jar, this is supported and we should make the behavior
> consistent.
> {noformat}
> +//      if (isBuiltinCastFunction()) {
> +//        throw new AnalysisException(toSql() +
> +//            " is reserved for internal use only. Use 'cast(expr AS type)' instead.");
> +//      }
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)