Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16066 )

Change subject: IMPALA-9482: Support for BINARY columns
......................................................................


Patch Set 14:

(11 comments)

Ps 15 is rebase + conflict resolution, ps 16 contains fixes for the comments.

http://gerrit.cloudera.org:8080/#/c/16066/14/be/src/exec/hbase-table-writer.cc
File be/src/exec/hbase-table-writer.cc:

http://gerrit.cloudera.org:8080/#/c/16066/14/be/src/exec/hbase-table-writer.cc@161
PS14, Line 161:               data_len = encoded_value.length();
> I just realized we support writing to HBase. It worths adding tests for thi
Thanks for the comment, it turned out that the writer was buggy as 
'base64_encoded_value' went out of scope while 'data' kept a pointer to its 
buffer.

Added a binary column to insertalltypesagg/insertalltypesaggbinary (only used 
by HBase tests) and added few tests to hbase-inserts.test

Also checked manually that Hive can read our data.


http://gerrit.cloudera.org:8080/#/c/16066/14/be/src/exec/text-converter.cc
File be/src/exec/text-converter.cc:

http://gerrit.cloudera.org:8080/#/c/16066/14/be/src/exec/text-converter.cc@359
PS14, Line 359:       return auxType.string_subtype != 
AuxColumnType::StringSubtype::BINARY;
> Add JIRA for codegen on reading BINARY from text files? I guess we just nee
Created a jira.


http://gerrit.cloudera.org:8080/#/c/16066/14/be/src/runtime/descriptors.h
File be/src/runtime/descriptors.h:

http://gerrit.cloudera.org:8080/#/c/16066/14/be/src/runtime/descriptors.h@256
PS14, Line 256:     return col_descs_[slot_desc->col_path().back()];
> Not sure if this is correct for nested types. Can we add tests on BINARY in
Added tests with binary in struct, binary in array was only tested manually so 
far.


http://gerrit.cloudera.org:8080/#/c/16066/14/be/src/runtime/types.cc
File be/src/runtime/types.cc:

http://gerrit.cloudera.org:8080/#/c/16066/14/be/src/runtime/types.cc@141
PS14, Line 141:       return TPrimitiveType::BINARY;
> Should we return INVALID_TYPE here because of DCHECK(false) so that we catc
Agree with returning INVALID_TYPE to improve error handling in release builds.


http://gerrit.cloudera.org:8080/#/c/16066/14/be/src/service/hs2-util.cc
File be/src/service/hs2-util.cc:

http://gerrit.cloudera.org:8080/#/c/16066/14/be/src/service/hs2-util.cc@872
PS14, Line 872:   AuxColumnType aux_type(columnType);
> If this is only used in the TYPE_STRING, perhaps it should be initialized t
Done


http://gerrit.cloudera.org:8080/#/c/16066/14/be/src/service/hs2-util.cc@950
PS14, Line 950:       type_entry.__set_type(thrift::TTypeId::STRING_TYPE);
> If DCHECK is false, do we want to set this to STRING_TYPE?  Or maybe we sho
There is no "invalid type" concept in the HS2 thrift api + this field is 
required, so I don't see a better idea here (maybe we could return some status 
in case of error, but that seems a bit too much effort for the unlikely case of 
not catching this scenario in debug builds.


http://gerrit.cloudera.org:8080/#/c/16066/14/be/src/service/impala-beeswax-server.cc
File be/src/service/impala-beeswax-server.cc:

http://gerrit.cloudera.org:8080/#/c/16066/14/be/src/service/impala-beeswax-server.cc@212
PS14, Line 212:     return "string";
> How come we don't return TypeToOdbcString here?
What do you mean for inlineing? It is also used in some tests, so we can't keep 
it in this file.

About returning TypeToOdbcString - while odbc does support binary type, my 
guess is that existing beeswax clients don't, so it seemed safer to return 
"string"


http://gerrit.cloudera.org:8080/#/c/16066/14/fe/src/main/java/org/apache/impala/analysis/CastExpr.java
File fe/src/main/java/org/apache/impala/analysis/CastExpr.java:

http://gerrit.cloudera.org:8080/#/c/16066/14/fe/src/main/java/org/apache/impala/analysis/CastExpr.java@187
PS14, Line 187:         // No built-in function needed for BINARY <-> STRING 
conversion.
> I think the comment should also reflect other datatypes versus binary are n
Done


http://gerrit.cloudera.org:8080/#/c/16066/14/fe/src/main/java/org/apache/impala/analysis/CastExpr.java@357
PS14, Line 357:     boolean isStringBinaryCast =
> I may be understanding this wrong...
Added a precondition to ensure that if one is binary, the other must be string.


http://gerrit.cloudera.org:8080/#/c/16066/14/fe/src/main/java/org/apache/impala/analysis/LikePredicate.java
File fe/src/main/java/org/apache/impala/analysis/LikePredicate.java:

http://gerrit.cloudera.org:8080/#/c/16066/14/fe/src/main/java/org/apache/impala/analysis/LikePredicate.java@116
PS14, Line 116:   private static boolean isLikeableType(Type type) {
> No real comment here. But I feel the other types might get upset for not be
:D poor other types - tbh I don't remember whether the naming was intentional


http://gerrit.cloudera.org:8080/#/c/16066/14/testdata/workloads/functional-query/queries/QueryTest/binary-type.test
File testdata/workloads/functional-query/queries/QueryTest/binary-type.test:

http://gerrit.cloudera.org:8080/#/c/16066/14/testdata/workloads/functional-query/queries/QueryTest/binary-type.test@68
PS14, Line 68:   where binary_col_with_nulls = cast("01/02/09" as binary)
> This is identical to the above query.
oops, I think I wanted to test > here



--
To view, visit http://gerrit.cloudera.org:8080/16066
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I36861a9ca6c2047b0d76862507c86f7f153bc582
Gerrit-Change-Number: 16066
Gerrit-PatchSet: 14
Gerrit-Owner: Csaba Ringhofer <csringho...@cloudera.com>
Gerrit-Reviewer: Attila Jeges <atti...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com>
Gerrit-Reviewer: Gabor Kaszab <gaborkas...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <huangquanl...@gmail.com>
Gerrit-Reviewer: Steve Carlin <scar...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <borokna...@cloudera.com>
Gerrit-Comment-Date: Wed, 03 Aug 2022 18:10:42 +0000
Gerrit-HasComments: Yes

Reply via email to