[ https://issues.apache.org/jira/browse/IMPALA-10349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17835825#comment-17835825 ]
ASF subversion and git services commented on IMPALA-10349:
----------------------------------------------------------

Commit 8ff51fbf74b4572ce3d1e43389fc10d35a8dd576 in impala's branch refs/heads/master from Csaba Ringhofer
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=8ff51fbf7 ]

IMPALA-5323: Support BINARY columns in Kudu tables

The patch adds read and write support for BINARY columns in Kudu
tables.

Predicate push down is implemented, but is incomplete:
a constant binary argument will be only pushed down if
the constant folding never encounters non-ascii strings.
Examples:
- cast(unhex(hex("aa")) as binary) can be pushed down
- cast(hex(unhex("aa")) as binary) can't be pushed down as
  unhex("aa") is not ascii (even though the final result is ascii)
See IMPALA-10349 for more details on this limitation.

The patch also changes casting BINARY <-> STRING from noop to calling
an actual function. While this may add some small overhead
it allows the backend to know whether an expression returns
STRING or BINARY.

Change-Id: Iff701a4b3a09ce7b6982c5d238e65f3d4f3d1151
Reviewed-on: http://gerrit.cloudera.org:8080/18868
Reviewed-by: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenk...@cloudera.com>


> Revisit constant folding on non-ASCII strings
> ---------------------------------------------
>
>                 Key: IMPALA-10349
>                 URL: https://issues.apache.org/jira/browse/IMPALA-10349
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Frontend
>            Reporter: Quanlong Huang
>            Priority: Critical
>
> Constant folding may produce non-ASCII strings. In such cases, we currently
> abandon folding the constant. See the commit message of IMPALA-1788 or the
> code here:
> [https://github.com/apache/impala/blob/9672d945963e1ca3c8699340f92d7d6ce1d91c9f/fe/src/main/java/org/apache/impala/analysis/LiteralExpr.java#L274-L282]
> I think we should allow folding non-ASCII strings if they are legal UTF-8
> strings.
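
As a rough illustration of the proposal above, the difference between an ASCII-only check and a UTF-8 validity check could be sketched in Java as follows. This is not Impala's code: the class and method names are hypothetical, and the sketch assumes that substr counts bytes, so that substr('引擎', 1, 3) yields the three UTF-8 bytes of the first character.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper, for illustration only; not part of Impala.
final class FoldedStringCheck {
  private FoldedStringCheck() {}

  // ASCII-only rule: every byte must be in the range 0..127.
  static boolean isAscii(byte[] bytes) {
    for (byte b : bytes) {
      if (b < 0) return false;  // high bit set means non-ASCII
    }
    return true;
  }

  // Proposed, more permissive rule: the bytes must decode as well-formed UTF-8.
  static boolean isValidUtf8(byte[] bytes) {
    CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
        .onMalformedInput(CodingErrorAction.REPORT)
        .onUnmappableCharacter(CodingErrorAction.REPORT);
    try {
      decoder.decode(ByteBuffer.wrap(bytes));
      return true;
    } catch (CharacterCodingException e) {
      return false;
    }
  }

  public static void main(String[] args) {
    // "\u5f15\u64ce" is the literal '引擎' from the issue; each character is 3 bytes in UTF-8.
    byte[] full = "\u5f15\u64ce".getBytes(StandardCharsets.UTF_8);
    byte[] firstChar = Arrays.copyOfRange(full, 0, 3);  // assumed result of substr(..., 1, 3)
    byte[] truncated = Arrays.copyOfRange(full, 0, 2);  // cut in the middle of a character

    System.out.println("first char is ASCII:      " + isAscii(firstChar));
    System.out.println("first char is valid UTF-8: " + isValidUtf8(firstChar));
    System.out.println("truncated is valid UTF-8:  " + isValidUtf8(truncated));
  }
}
```

Under these assumptions, the folded value from the second example in the issue fails the ASCII-only rule but passes the UTF-8 rule, while a result cut in the middle of a character, or a lone byte such as the one produced by unhex("aa"), fails both.
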
> Example where constant folding works:
> {code:java}
> Query: explain select * from functional.alltypes where string_col = substr('123', 1, 1)
> +-------------------------------------------------------------+
> | Explain String                                              |
> +-------------------------------------------------------------+
> | Max Per-Host Resource Reservation: Memory=32.00KB Threads=3 |
> | Per-Host Resource Estimates: Memory=160MB                   |
> | Codegen disabled by planner                                 |
> |                                                             |
> | PLAN-ROOT SINK                                              |
> | |                                                           |
> | 01:EXCHANGE [UNPARTITIONED]                                 |
> | |                                                           |
> | 00:SCAN HDFS [functional.alltypes]                          |
> |    HDFS partitions=24/24 files=24 size=478.45KB             |
> |    predicates: string_col = '1'                             |
> |    row-size=89B cardinality=730                             |
> +-------------------------------------------------------------+
> {code}
> Example where constant folding doesn't work:
> {code:java}
> Query: explain select * from functional.alltypes where string_col = substr('引擎', 1, 3)
> +-------------------------------------------------------------+
> | Explain String                                              |
> +-------------------------------------------------------------+
> | Max Per-Host Resource Reservation: Memory=32.00KB Threads=3 |
> | Per-Host Resource Estimates: Memory=160MB                   |
> | Codegen disabled by planner                                 |
> |                                                             |
> | PLAN-ROOT SINK                                              |
> | |                                                           |
> | 01:EXCHANGE [UNPARTITIONED]                                 |
> | |                                                           |
> | 00:SCAN HDFS [functional.alltypes]                          |
> |    HDFS partitions=24/24 files=24 size=478.45KB             |
> |    predicates: string_col = substr('引擎', 1, 3)              |
> |    row-size=89B cardinality=730                             |
> +-------------------------------------------------------------+
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org