Daniel Vanko has posted comments on this change. ( http://gerrit.cloudera.org:8080/22922 )
Change subject: IMPALA-13625: Allow reading Parquet int32/int64 as decimal without logical types ...................................................................... Patch Set 13: (5 comments) http://gerrit.cloudera.org:8080/#/c/22922/12/be/src/exec/parquet/parquet-data-converter.h File be/src/exec/parquet/parquet-data-converter.h: http://gerrit.cloudera.org:8080/#/c/22922/12/be/src/exec/parquet/parquet-data-converter.h@99 PS12, Line 99: if (col_type_->type == TYPE_TIMESTAMP) { : return timestamp_decoder_.NeedsConversion(); : } : if (col_type_->type == TYPE_CHAR) { : r > IIUC, prior to this, we didn't do conversion when Yes, I changed to checking the logical type. http://gerrit.cloudera.org:8080/#/c/22922/10/be/src/exec/parquet/parquet-data-converter.h File be/src/exec/parquet/parquet-data-converter.h: http://gerrit.cloudera.org:8080/#/c/22922/10/be/src/exec/parquet/parquet-data-converter.h@94 PS10, Line 94: > This change might not needed if we change GetPrecision(), see comment in th I removed this, but had to introduce a check for INTs otherwise negative values aren't correctly displayed without conversion. http://gerrit.cloudera.org:8080/#/c/22922/10/be/src/exec/parquet/parquet-metadata-utils.cc File be/src/exec/parquet/parquet-metadata-utils.cc: http://gerrit.cloudera.org:8080/#/c/22922/10/be/src/exec/parquet/parquet-metadata-utils.cc@377 PS10, Line 377: pe == TYPE_DECIMAL) { > The code could be greatly simplified if we changed GetPrecision() to return Done http://gerrit.cloudera.org:8080/#/c/22922/10/be/src/exec/parquet/parquet-metadata-utils.cc@423 PS10, Line 423: return Status::OK(); : } : : parquet::Type::type ParquetMetadataUtils::ConvertInternalToParquetType( : PrimitiveType type, TParquetTimestampType::type timestamp_type) { : DCHECK_GE(type, 0); > I think we shouldn't report an error from now on if the file type is INT32/ Done http://gerrit.cloudera.org:8080/#/c/22922/10/testdata/workloads/functional-query/queries/QueryTest/parquet-type-widening.test File testdata/workloads/functional-query/queries/QueryTest/parquet-type-widening.test: http://gerrit.cloudera.org:8080/#/c/22922/10/testdata/workloads/functional-query/queries/QueryTest/parquet-type-widening.test@11 PS10, Line 11: # IMPALA-13625: Allow reading Parquet int32/int64 as decimal without logical types > Please add tests for INT_MIN, INT_MAX, and INT64_MIN, INT64_MAX. Done -- To view, visit http://gerrit.cloudera.org:8080/22922 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I56006eb3cca28c81ec8467d77b35005fbf669680 Gerrit-Change-Number: 22922 Gerrit-PatchSet: 13 Gerrit-Owner: Daniel Vanko <[email protected]> Gerrit-Reviewer: Daniel Vanko <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]> Gerrit-Comment-Date: Tue, 08 Jul 2025 11:30:13 +0000 Gerrit-HasComments: Yes
