[ 
https://issues.apache.org/jira/browse/IMPALA-10349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17835825#comment-17835825
 ] 

ASF subversion and git services commented on IMPALA-10349:
----------------------------------------------------------

Commit 8ff51fbf74b4572ce3d1e43389fc10d35a8dd576 in impala's branch 
refs/heads/master from Csaba Ringhofer
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=8ff51fbf7 ]

IMPALA-5323: Support BINARY columns in Kudu tables

The patch adds read and write support for BINARY columns in Kudu
tables.

Predicate pushdown is implemented but incomplete: a constant
binary argument is pushed down only if constant folding never
encounters a non-ASCII string.
Examples:
 - cast(unhex(hex("aa")) as binary) can be pushed down
 - cast(hex(unhex("aa")) as binary) can't be pushed
   down because unhex("aa") is not ASCII (even though
   the final result is ASCII)
See IMPALA-10349 for more details on this limitation.
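
The two examples above can be traced with a minimal sketch. This is not
Impala code: the class AsciiFoldingSketch and its hex(), unhex() and
allStepsAscii() helpers are hypothetical stand-ins that only model the
rule "push down only if every intermediate folded value is ASCII".

```java
import java.nio.charset.StandardCharsets;

// Hypothetical model, not Impala code: each folding step yields raw bytes,
// and the constant counts as pushable only if every step is pure ASCII.
final class AsciiFoldingSketch {
  // True if every byte is in the 7-bit ASCII range 0x00..0x7F.
  static boolean isAscii(byte[] bytes) {
    for (byte b : bytes) {
      if (b < 0) return false;  // Java bytes are signed, so 0x80..0xFF is negative.
    }
    return true;
  }

  // Stand-in for unhex(): decodes pairs of hex digits into raw bytes.
  static byte[] unhex(byte[] hexChars) {
    String s = new String(hexChars, StandardCharsets.US_ASCII);
    byte[] out = new byte[s.length() / 2];
    for (int i = 0; i < out.length; i++) {
      out[i] = (byte) Integer.parseInt(s.substring(2 * i, 2 * i + 2), 16);
    }
    return out;
  }

  // Stand-in for hex(): encodes each raw byte as two uppercase hex digits.
  static byte[] hex(byte[] raw) {
    StringBuilder sb = new StringBuilder();
    for (byte b : raw) {
      sb.append(String.format("%02X", b & 0xFF));
    }
    return sb.toString().getBytes(StandardCharsets.US_ASCII);
  }

  // True only if the input and every intermediate result are ASCII.
  static boolean allStepsAscii(byte[]... steps) {
    for (byte[] step : steps) {
      if (!isAscii(step)) return false;
    }
    return true;
  }

  public static void main(String[] args) {
    byte[] aa = "aa".getBytes(StandardCharsets.US_ASCII);

    // unhex(hex("aa")): every step stays ASCII.
    byte[] hexed = hex(aa);
    byte[] back = unhex(hexed);
    System.out.println("unhex(hex(\"aa\")) pushable: " + allStepsAscii(aa, hexed, back));

    // hex(unhex("aa")): the intermediate byte 0xAA is not ASCII.
    byte[] raw = unhex(aa);
    byte[] rehexed = hex(raw);
    System.out.println("hex(unhex(\"aa\")) pushable: " + allStepsAscii(aa, raw, rehexed));
  }
}
```

The sketch checks every intermediate value rather than only the final
one, which is why the second expression is rejected even though its
result is ASCII.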

The patch also changes casting BINARY <-> STRING from a no-op
to an actual function call. This may add a small overhead, but
it lets the backend know whether an expression returns STRING
or BINARY.

Change-Id: Iff701a4b3a09ce7b6982c5d238e65f3d4f3d1151
Reviewed-on: http://gerrit.cloudera.org:8080/18868
Reviewed-by: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenk...@cloudera.com>


> Revisit constant folding on non-ASCII strings
> ---------------------------------------------
>
>                 Key: IMPALA-10349
>                 URL: https://issues.apache.org/jira/browse/IMPALA-10349
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Frontend
>            Reporter: Quanlong Huang
>            Priority: Critical
>
> Constant folding may produce non-ASCII strings. In such cases, we currently 
> abandon folding the constant. See the commit message of IMPALA-1788 or the 
> code here: 
> [https://github.com/apache/impala/blob/9672d945963e1ca3c8699340f92d7d6ce1d91c9f/fe/src/main/java/org/apache/impala/analysis/LiteralExpr.java#L274-L282]
> I think we should allow folding non-ASCII strings if they are legal UTF-8 
> strings.
> Example where constant folding works:
> {code:java}
> Query: explain select * from functional.alltypes where string_col = 
> substr('123', 1, 1)
> +-------------------------------------------------------------+
> | Explain String                                              |
> +-------------------------------------------------------------+
> | Max Per-Host Resource Reservation: Memory=32.00KB Threads=3 |
> | Per-Host Resource Estimates: Memory=160MB                   |
> | Codegen disabled by planner                                 |
> |                                                             |
> | PLAN-ROOT SINK                                              |
> | |                                                           |
> | 01:EXCHANGE [UNPARTITIONED]                                 |
> | |                                                           |
> | 00:SCAN HDFS [functional.alltypes]                          |
> |    HDFS partitions=24/24 files=24 size=478.45KB             |
> |    predicates: string_col = '1'                             |
> |    row-size=89B cardinality=730                             |
> +-------------------------------------------------------------+
> {code}
> Example where constant folding does not work:
> {code:java}
> Query: explain select * from functional.alltypes where string_col = 
> substr('引擎', 1, 3)
> +-------------------------------------------------------------+
> | Explain String                                              |
> +-------------------------------------------------------------+
> | Max Per-Host Resource Reservation: Memory=32.00KB Threads=3 |
> | Per-Host Resource Estimates: Memory=160MB                   |
> | Codegen disabled by planner                                 |
> |                                                             |
> | PLAN-ROOT SINK                                              |
> | |                                                           |
> | 01:EXCHANGE [UNPARTITIONED]                                 |
> | |                                                           |
> | 00:SCAN HDFS [functional.alltypes]                          |
> |    HDFS partitions=24/24 files=24 size=478.45KB             |
> |    predicates: string_col = substr('引擎', 1, 3)            |
> |    row-size=89B cardinality=730                             |
> +-------------------------------------------------------------+
> {code}
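
The "legal UTF-8" condition proposed in the issue could be checked along
the following lines. This is a minimal sketch, not the LiteralExpr code:
the class Utf8FoldingSketch and the method isLegalUtf8() are hypothetical
names, and it assumes substr() counts bytes, as the second example seems
to imply (the first character of the literal occupies three bytes in
UTF-8).

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper, not Impala code: decides whether a folded byte
// sequence could be kept as a string literal because it is legal UTF-8.
final class Utf8FoldingSketch {
  // Strict decode: malformed or truncated sequences raise an exception
  // instead of being silently replaced with U+FFFD.
  static boolean isLegalUtf8(byte[] bytes) {
    try {
      StandardCharsets.UTF_8.newDecoder()
          .onMalformedInput(CodingErrorAction.REPORT)
          .onUnmappableCharacter(CodingErrorAction.REPORT)
          .decode(ByteBuffer.wrap(bytes));
      return true;
    } catch (CharacterCodingException e) {
      return false;
    }
  }

  public static void main(String[] args) {
    // The two-character literal from the EXPLAIN example, written with
    // Unicode escapes so the source file stays ASCII-only.
    byte[] all = "\u5F15\u64CE".getBytes(StandardCharsets.UTF_8);

    // Bytes 1..3 form one complete character: non-ASCII but legal UTF-8.
    byte[] firstChar = Arrays.copyOfRange(all, 0, 3);
    // Bytes 1..2 cut that character in half: not legal UTF-8.
    byte[] truncated = Arrays.copyOfRange(all, 0, 2);

    System.out.println("first 3 bytes legal: " + isLegalUtf8(firstChar));
    System.out.println("first 2 bytes legal: " + isLegalUtf8(truncated));
  }
}
```

Under this check, a result such as the three-byte prefix would be
foldable, while a byte range that splits a character would still be left
unfolded.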



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

