Hello Thomas Marshall,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/10483

to look at the new patch set (#2).

Change subject: IMPALA-7044: Prevent overflow when computing Parquet block size
......................................................................

IMPALA-7044: Prevent overflow when computing Parquet block size

When writing Parquet files we compute a minimum block size based on the
number of columns in the target table:

    3 * page_size * num_cols

For tables with a large number of columns (> ~10k), this value exceeds
2GB. When we pass it to hdfsOpenFile() in
HdfsTableSink::CreateNewTmpFile(), it is cast to a signed int32 and can
overflow.

To fix this, we return an error if we detect that the minimum block size
exceeds 2GB.

This change adds a test that uses CTAS into a table with 12k columns to
make sure that Impala returns the correct error.

Change-Id: I6e63420e5a093c0bbc789201771708865b16e138
---
M be/src/exec/hdfs-parquet-table-writer.cc
M be/src/exec/hdfs-parquet-table-writer.h
M be/src/exec/hdfs-table-sink.cc
M tests/query_test/test_insert_parquet.py
4 files changed, 35 insertions(+), 11 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/83/10483/2
--
To view, visit http://gerrit.cloudera.org:8080/10483
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6e63420e5a093c0bbc789201771708865b16e138
Gerrit-Change-Number: 10483
Gerrit-PatchSet: 2
Gerrit-Owner: Lars Volker <l...@cloudera.com>
Gerrit-Reviewer: Thomas Marshall <thomasmarsh...@cmu.edu>
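
Illustrative note on the arithmetic in the commit message: the sketch below is not code from the patch. It assumes a 64KB page size, which the message does not state, and all function names are hypothetical. It shows how 3 * page_size * num_cols passes the signed int32 limit at roughly 10.9k columns, what a wrapping cast to int32 would typically produce for a 12k-column table, and the kind of up-front check the change describes.

```python
# Sketch of the overflow described in the commit message.
# ASSUMPTION: a 64KB page size; the message does not give the value.
# All names below are hypothetical and are not taken from the patch.

INT32_MAX = 2**31 - 1
ASSUMED_PAGE_SIZE = 64 * 1024


def min_block_size(num_cols, page_size=ASSUMED_PAGE_SIZE):
    """Minimum block size from the commit message: 3 * page_size * num_cols."""
    return 3 * page_size * num_cols


def to_int32(value):
    """Keep the low 32 bits and reinterpret them as a signed int32,
    mimicking a wrapping narrowing cast."""
    value &= 0xFFFFFFFF
    return value - 2**32 if value >= 2**31 else value


def checked_min_block_size(num_cols, page_size=ASSUMED_PAGE_SIZE):
    """Return the minimum block size, or raise if it does not fit in int32."""
    size = min_block_size(num_cols, page_size)
    if size > INT32_MAX:
        raise ValueError(
            "minimum block size %d exceeds the int32 limit %d" % (size, INT32_MAX))
    return size


if __name__ == "__main__":
    for cols in (10000, 10922, 10923, 12000):
        size = min_block_size(cols)
        print(cols, size, to_int32(size), size > INT32_MAX)
```

Under the assumed page size, 10,922 columns still fit in an int32 and 10,923 do not, which is consistent with the "> ~10k" estimate in the message.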