Daniel Vanko has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/22922 )

Change subject: IMPALA-13625: Allow reading Parquet int32/int64 as decimal 
without logical types
......................................................................


Patch Set 13:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/22922/12/be/src/exec/parquet/parquet-data-converter.h
File be/src/exec/parquet/parquet-data-converter.h:

http://gerrit.cloudera.org:8080/#/c/22922/12/be/src/exec/parquet/parquet-data-converter.h@99
PS12, Line 99:     if (col_type_->type == TYPE_TIMESTAMP) {
             :       return timestamp_decoder_.NeedsConversion();
             :     }
             :     if (col_type_->type == TYPE_CHAR) {
             :       r
> IIUC, prior to this, we didn't do conversion when
Yes, I changed to checking the logical type.


http://gerrit.cloudera.org:8080/#/c/22922/10/be/src/exec/parquet/parquet-data-converter.h
File be/src/exec/parquet/parquet-data-converter.h:

http://gerrit.cloudera.org:8080/#/c/22922/10/be/src/exec/parquet/parquet-data-converter.h@94
PS10, Line 94:
> This change might not needed if we change GetPrecision(), see comment in th
I removed this, but had to introduce a check for INTs otherwise negative values 
aren't correctly displayed without conversion.


http://gerrit.cloudera.org:8080/#/c/22922/10/be/src/exec/parquet/parquet-metadata-utils.cc
File be/src/exec/parquet/parquet-metadata-utils.cc:

http://gerrit.cloudera.org:8080/#/c/22922/10/be/src/exec/parquet/parquet-metadata-utils.cc@377
PS10, Line 377: pe == TYPE_DECIMAL) {
> The code could be greatly simplified if we changed GetPrecision() to return
Done


http://gerrit.cloudera.org:8080/#/c/22922/10/be/src/exec/parquet/parquet-metadata-utils.cc@423
PS10, Line 423:   return Status::OK();
              : }
              :
              : parquet::Type::type 
ParquetMetadataUtils::ConvertInternalToParquetType(
              :     PrimitiveType type, TParquetTimestampType::type 
timestamp_type) {
              :   DCHECK_GE(type, 0);
> I think we shouldn't report an error from now on if the file type is INT32/
Done


http://gerrit.cloudera.org:8080/#/c/22922/10/testdata/workloads/functional-query/queries/QueryTest/parquet-type-widening.test
File 
testdata/workloads/functional-query/queries/QueryTest/parquet-type-widening.test:

http://gerrit.cloudera.org:8080/#/c/22922/10/testdata/workloads/functional-query/queries/QueryTest/parquet-type-widening.test@11
PS10, Line 11: # IMPALA-13625: Allow reading Parquet int32/int64 as decimal 
without logical types
> Please add tests for INT_MIN, INT_MAX, and INT64_MIN, INT64_MAX.
Done



--
To view, visit http://gerrit.cloudera.org:8080/22922
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I56006eb3cca28c81ec8467d77b35005fbf669680
Gerrit-Change-Number: 22922
Gerrit-PatchSet: 13
Gerrit-Owner: Daniel Vanko <[email protected]>
Gerrit-Reviewer: Daniel Vanko <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]>
Gerrit-Comment-Date: Tue, 08 Jul 2025 11:30:13 +0000
Gerrit-HasComments: Yes

Reply via email to