[Impala-ASF-CR] IMPALA-10879: Add parquet stats to iceberg manifest
Attila Jeges has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/17806 )

Change subject: IMPALA-10879: Add parquet stats to iceberg manifest
..

IMPALA-10879: Add parquet stats to iceberg manifest

This patch adds parquet stats to iceberg manifest as per-datafile metrics.

The following metrics are supported:
- column_sizes : Map from column id to the total size on disk of all regions that store the column. Does not include bytes necessary to read other columns, like footers.
- null_value_counts : Map from column id to number of null values in the column.
- lower_bounds : Map from column id to lower bound in the column serialized as binary. Each value must be less than or equal to all non-null, non-NaN values in the column for the file.
- upper_bounds : Map from column id to upper bound in the column serialized as binary. Each value must be greater than or equal to all non-null, non-NaN values in the column for the file.

The corresponding parquet stats are collected by 'ColumnStats' (in 'min_value_', 'max_value_', 'null_count_' members) and 'HdfsParquetTableWriter::BaseColumnWriter' (in 'total_compressed_byte_size_' member).

Testing:
- New e2e test was added to verify that the metrics are written to the Iceberg manifest upon inserting data.
- New e2e test was added to verify that lower_bounds/upper_bounds metrics are used to prune data files on querying iceberg tables.
- Existing e2e tests were updated to work with the new behavior.
- BE test for single-value serialization.

Relevant Iceberg documentation:
- Manifest: https://iceberg.apache.org/spec/#manifests
- Values in lower_bounds and upper_bounds maps should be Single-value serialized to binary: https://iceberg.apache.org/spec/#appendix-d-single-value-serialization

Change-Id: Ic31f2260bc6f6a7f307ac955ff05eb154917675b
Reviewed-on: http://gerrit.cloudera.org:8080/17806
Tested-by: Impala Public Jenkins
Reviewed-by: Attila Jeges
---
M be/src/exec/hdfs-table-sink.cc
M be/src/exec/hdfs-table-writer.h
M be/src/exec/parquet/CMakeLists.txt
M be/src/exec/parquet/hdfs-parquet-table-writer.cc
M be/src/exec/parquet/hdfs-parquet-table-writer.h
M be/src/exec/parquet/parquet-column-stats.h
M be/src/exec/parquet/parquet-column-stats.inline.h
A be/src/exec/parquet/serialize-single-value-test.cc
M be/src/runtime/dml-exec-state.cc
M be/src/runtime/dml-exec-state.h
M be/src/util/bit-util.h
M common/fbs/IcebergObjects.fbs
M common/protobuf/control_service.proto
M fe/src/main/java/org/apache/impala/service/IcebergCatalogOpExecutor.java
M infra/python/deps/requirements.txt
M testdata/workloads/functional-query/queries/QueryTest/iceberg-partition-transform-insert.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-partitioned-insert.test
A testdata/workloads/functional-query/queries/QueryTest/iceberg-upper-lower-bound-metrics.test
M tests/query_test/test_iceberg.py
19 files changed, 1,025 insertions(+), 33 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Attila Jeges: Looks good to me, approved

--
To view, visit http://gerrit.cloudera.org:8080/17806
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ic31f2260bc6f6a7f307ac955ff05eb154917675b
Gerrit-Change-Number: 17806
Gerrit-PatchSet: 11
Gerrit-Owner: Attila Jeges
Gerrit-Reviewer: Attila Jeges
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Zoltan Borok-Nagy
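
The lower_bounds and upper_bounds values described in the commit message are stored as single-value serialized binary. As a rough illustration only (this is not the C++ code from the patch, and every function and file name here is hypothetical), the Python sketch below encodes a few primitive types the way Appendix D of the Iceberg spec describes them, and then shows how a reader could use per-file bounds to skip files for an equality predicate on an INT column. Files without bounds for the column are kept, since nothing can be proven about them.

```python
import struct


def serialize_boolean(value: bool) -> bytes:
    # 0x00 for false, a non-zero byte for true.
    return b"\x01" if value else b"\x00"


def serialize_int(value: int) -> bytes:
    # 32-bit signed integer, little-endian.
    return struct.pack("<i", value)


def serialize_long(value: int) -> bytes:
    # 64-bit signed integer, little-endian.
    return struct.pack("<q", value)


def serialize_double(value: float) -> bytes:
    # 64-bit IEEE 754 value, little-endian.
    return struct.pack("<d", value)


def serialize_string(value: str) -> bytes:
    # Raw UTF-8 bytes, no length prefix.
    return value.encode("utf-8")


def deserialize_int(data: bytes) -> int:
    return struct.unpack("<i", data)[0]


def files_to_scan(file_metrics, column_id, target):
    """Keep every file whose [lower, upper] range may contain target."""
    selected = []
    for path, metrics in file_metrics.items():
        lower = metrics["lower_bounds"].get(column_id)
        upper = metrics["upper_bounds"].get(column_id)
        if lower is None or upper is None:
            # No bounds recorded for this column: the file cannot be pruned.
            selected.append(path)
        elif deserialize_int(lower) <= target <= deserialize_int(upper):
            selected.append(path)
    return selected


# Hypothetical per-file metrics for an INT column with field id 1.
FILE_METRICS = {
    "file_a.parq": {"lower_bounds": {1: serialize_int(1)},
                    "upper_bounds": {1: serialize_int(10)}},
    "file_b.parq": {"lower_bounds": {1: serialize_int(11)},
                    "upper_bounds": {1: serialize_int(20)}},
    "file_c.parq": {"lower_bounds": {}, "upper_bounds": {}},
}

print(files_to_scan(FILE_METRICS, column_id=1, target=15))
```

With these made-up metrics, a predicate such as `col = 15` rules out file_a.parq because 15 lies outside its [1, 10] range, while file_b.parq and the bound-less file_c.parq still have to be read.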
[Impala-ASF-CR] IMPALA-10879: Add parquet stats to iceberg manifest
Attila Jeges has posted comments on this change. ( http://gerrit.cloudera.org:8080/17806 )

Change subject: IMPALA-10879: Add parquet stats to iceberg manifest
..

Patch Set 10: Code-Review+2

--
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic31f2260bc6f6a7f307ac955ff05eb154917675b
Gerrit-Change-Number: 17806
Gerrit-PatchSet: 10
Gerrit-Owner: Attila Jeges
Gerrit-Reviewer: Attila Jeges
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Zoltan Borok-Nagy
Gerrit-Comment-Date: Thu, 02 Sep 2021 21:34:13 +
Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10879: Add parquet stats to iceberg manifest
Attila Jeges has uploaded a new patch set (#8). ( http://gerrit.cloudera.org:8080/17806 )

Change subject: IMPALA-10879: Add parquet stats to iceberg manifest
..

19 files changed, 1,025 insertions(+), 33 deletions(-)

git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/06/17806/8

--
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ic31f2260bc6f6a7f307ac955ff05eb154917675b
Gerrit-Change-Number: 17806
Gerrit-PatchSet: 8
Gerrit-Owner: Attila Jeges
Gerrit-Reviewer: Attila Jeges
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-10879: Add parquet stats to iceberg manifest
Attila Jeges has posted comments on this change. ( http://gerrit.cloudera.org:8080/17806 )

Change subject: IMPALA-10879: Add parquet stats to iceberg manifest
..

Patch Set 6:

Patch set #6 contains the refactored flat buffer generation.

--
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic31f2260bc6f6a7f307ac955ff05eb154917675b
Gerrit-Change-Number: 17806
Gerrit-PatchSet: 6
Gerrit-Owner: Attila Jeges
Gerrit-Reviewer: Attila Jeges
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Zoltan Borok-Nagy
Gerrit-Comment-Date: Fri, 27 Aug 2021 18:18:34 +
Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10879: Add parquet stats to iceberg manifest
Attila Jeges has uploaded a new patch set (#6). ( http://gerrit.cloudera.org:8080/17806 )

Change subject: IMPALA-10879: Add parquet stats to iceberg manifest
..

19 files changed, 1,025 insertions(+), 33 deletions(-)

git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/06/17806/6

--
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ic31f2260bc6f6a7f307ac955ff05eb154917675b
Gerrit-Change-Number: 17806
Gerrit-PatchSet: 6
Gerrit-Owner: Attila Jeges
Gerrit-Reviewer: Attila Jeges
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-10879: Add parquet stats to iceberg manifest
Attila Jeges has uploaded a new patch set (#5). ( http://gerrit.cloudera.org:8080/17806 )

Change subject: IMPALA-10879: Add parquet stats to iceberg manifest
..

19 files changed, 1,036 insertions(+), 17 deletions(-)

git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/06/17806/5

--
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ic31f2260bc6f6a7f307ac955ff05eb154917675b
Gerrit-Change-Number: 17806
Gerrit-PatchSet: 5
Gerrit-Owner: Attila Jeges
Gerrit-Reviewer: Attila Jeges
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-10879: Add parquet stats to iceberg manifest
Attila Jeges has uploaded a new patch set (#4). ( http://gerrit.cloudera.org:8080/17806 )

Change subject: IMPALA-10879: Add parquet stats to iceberg manifest
..

19 files changed, 1,034 insertions(+), 17 deletions(-)

git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/06/17806/4

--
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ic31f2260bc6f6a7f307ac955ff05eb154917675b
Gerrit-Change-Number: 17806
Gerrit-PatchSet: 4
Gerrit-Owner: Attila Jeges
Gerrit-Reviewer: Attila Jeges
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-10879: Add parquet stats to iceberg manifest
Attila Jeges has uploaded a new patch set (#3). ( http://gerrit.cloudera.org:8080/17806 )

Change subject: IMPALA-10879: Add parquet stats to iceberg manifest
..

19 files changed, 1,033 insertions(+), 17 deletions(-)

git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/06/17806/3

--
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ic31f2260bc6f6a7f307ac955ff05eb154917675b
Gerrit-Change-Number: 17806
Gerrit-PatchSet: 3
Gerrit-Owner: Attila Jeges
Gerrit-Reviewer: Attila Jeges
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-10840: Add support for "FOR SYSTEM_TIME AS OF" and "FOR SYSTEM_VERSION AS OF" for Iceberg tables
Attila Jeges has posted comments on this change. ( http://gerrit.cloudera.org:8080/17765 )

Change subject: IMPALA-10840: Add support for "FOR SYSTEM_TIME AS OF" and "FOR SYSTEM_VERSION AS OF" for Iceberg tables
..

Patch Set 3: Code-Review+2

--
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib523c5e47b8d9c377bea39a82fe20249177cf824
Gerrit-Change-Number: 17765
Gerrit-PatchSet: 3
Gerrit-Owner: Zoltan Borok-Nagy
Gerrit-Reviewer: Attila Jeges
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Zoltan Borok-Nagy
Gerrit-Comment-Date: Thu, 26 Aug 2021 15:09:15 +
Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10840: Add support for "FOR SYSTEM_TIME AS OF" and "FOR SYSTEM_VERSION AS OF" for Iceberg tables
Attila Jeges has posted comments on this change. ( http://gerrit.cloudera.org:8080/17765 )

Change subject: IMPALA-10840: Add support for "FOR SYSTEM_TIME AS OF" and "FOR SYSTEM_VERSION AS OF" for Iceberg tables
..

Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/17765/1/tests/query_test/test_iceberg.py
File tests/query_test/test_iceberg.py:

http://gerrit.cloudera.org:8080/#/c/17765/1/tests/query_test/test_iceberg.py@197
PS1, Line 197: # Query old snapshot
> We are using the local timezone of the machine that executes the test. I do
What I was thinking of is to set the TIMEZONE query option to a specific timezone for the queries and get the current timestamp after each query with "select now();" (with TIMEZONE set to the same timezone). This way we wouldn't depend on the local timezone of the machine. A test could compare the results for the same timestamp in different TIMEZONEs, to prove that time travel uses the coordinator's local timezone.
Anyway, it was just a silly idea. Now that I think about it, it doesn't sound too useful. Feel free to ignore it.

http://gerrit.cloudera.org:8080/#/c/17765/1/tests/query_test/test_iceberg.py@197
PS1, Line 197: # Query old snapshot
> Currently querying the future behaves the same as querying by now(). I'm no
Sure, use your best judgement.

--
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib523c5e47b8d9c377bea39a82fe20249177cf824
Gerrit-Change-Number: 17765
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy
Gerrit-Reviewer: Attila Jeges
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Zoltan Borok-Nagy
Gerrit-Comment-Date: Thu, 26 Aug 2021 15:08:45 +
Gerrit-HasComments: Yes
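
To make the timezone dependence discussed in this comment concrete, here is a small Python sketch. It is an illustration under assumptions, not Impala or Iceberg code: the snapshot list, the fixed UTC offsets standing in for TIMEZONE settings, and the helper names are all hypothetical. The selection rule assumed is "latest snapshot committed at or before the requested instant", which is consistent with the remark that querying a future timestamp behaves like querying by now().

```python
from datetime import datetime, timedelta, timezone

# Hypothetical snapshot log: (snapshot id, commit time in UTC), oldest first.
SNAPSHOTS = [
    (101, datetime(2021, 8, 25, 9, 0, tzinfo=timezone.utc)),
    (102, datetime(2021, 8, 25, 11, 0, tzinfo=timezone.utc)),
    (103, datetime(2021, 8, 25, 13, 0, tzinfo=timezone.utc)),
]


def to_instant(local_ts: str, utc_offset_hours: int) -> datetime:
    """Interpret a zone-less timestamp string in a zone with a fixed UTC offset."""
    zone = timezone(timedelta(hours=utc_offset_hours))
    naive = datetime.strptime(local_ts, "%Y-%m-%d %H:%M:%S")
    return naive.replace(tzinfo=zone)


def snapshot_as_of(instant: datetime):
    """Return the id of the latest snapshot committed at or before instant."""
    eligible = [sid for sid, committed in SNAPSHOTS if committed <= instant]
    return eligible[-1] if eligible else None


ts = "2021-08-25 12:00:00"
# The same literal resolves to different snapshots under different zones.
print(snapshot_as_of(to_instant(ts, 0)))  # read as UTC
print(snapshot_as_of(to_instant(ts, 2)))  # read as UTC+2, i.e. 10:00 UTC
# A timestamp in the future resolves to the newest snapshot.
print(snapshot_as_of(to_instant("2030-01-01 00:00:00", 0)))
```

A test along the lines suggested above would pin the zone explicitly for both the time travel query and the "select now()" call, so the outcome no longer depends on the machine's local timezone.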
[Impala-ASF-CR] IMPALA-10840: Add support for "FOR SYSTEM_TIME AS OF" and "FOR SYSTEM_VERSION AS OF" for Iceberg tables
Attila Jeges has posted comments on this change. ( http://gerrit.cloudera.org:8080/17765 )

Change subject: IMPALA-10840: Add support for "FOR SYSTEM_TIME AS OF" and "FOR SYSTEM_VERSION AS OF" for Iceberg tables
..

Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/17765/1/fe/src/main/java/org/apache/impala/analysis/TableRef.java
File fe/src/main/java/org/apache/impala/analysis/TableRef.java:

http://gerrit.cloudera.org:8080/#/c/17765/1/fe/src/main/java/org/apache/impala/analysis/TableRef.java@222
PS1, Line 222: timeTravelSpec_ = other.timeTravelSpec_;
> Maybe cloning the TimeTravelSpec object would be safer than just copying th
Alternatively, you could set timeTravelSpec_ to null in reset(), instead of calling timeTravelSpec_.reset().

--
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib523c5e47b8d9c377bea39a82fe20249177cf824
Gerrit-Change-Number: 17765
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy
Gerrit-Reviewer: Attila Jeges
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Zoltan Borok-Nagy
Gerrit-Comment-Date: Wed, 25 Aug 2021 17:55:20 +
Gerrit-HasComments: Yes
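
A minimal Python sketch of the alternative suggested here, using hypothetical stand-ins rather than the real Java TableRef and TimeTravelSpec classes: because reset() only drops this instance's reference instead of mutating the shared object, a copy that still points at the same spec is left untouched. Whether dropping the reference is acceptable in the actual patch is not something this sketch can settle.

```python
class TimeTravelSpec:
    """Hypothetical stand-in that holds an analyzed 'as of' version."""

    def __init__(self, as_of_version):
        self.as_of_version = as_of_version


class TableRef:
    """Hypothetical stand-in for a table reference that may carry a spec."""

    def __init__(self, spec=None):
        self.time_travel_spec = spec

    def copy(self):
        # Copies the reference only, so both instances share one spec object.
        return TableRef(self.time_travel_spec)

    def reset(self):
        # Drop our own reference; the shared spec object is not modified.
        self.time_travel_spec = None


original = TableRef(TimeTravelSpec(as_of_version=42))
duplicate = original.copy()
original.reset()

print(original.time_travel_spec)
print(duplicate.time_travel_spec.as_of_version)
```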
[Impala-ASF-CR] IMPALA-10840: Add support for "FOR SYSTEM_TIME AS OF" and "FOR SYSTEM_VERSION AS OF" for Iceberg tables
Attila Jeges has posted comments on this change. ( http://gerrit.cloudera.org:8080/17765 )

Change subject: IMPALA-10840: Add support for "FOR SYSTEM_TIME AS OF" and "FOR SYSTEM_VERSION AS OF" for Iceberg tables
..

Patch Set 1:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/17765/1/fe/src/main/java/org/apache/impala/analysis/TableRef.java
File fe/src/main/java/org/apache/impala/analysis/TableRef.java:

http://gerrit.cloudera.org:8080/#/c/17765/1/fe/src/main/java/org/apache/impala/analysis/TableRef.java@152
PS1, Line 152: protected TimeTravelSpec timeTravelSpec_;
Please add a comment.

http://gerrit.cloudera.org:8080/#/c/17765/1/fe/src/main/java/org/apache/impala/analysis/TableRef.java@222
PS1, Line 222: timeTravelSpec_ = other.timeTravelSpec_;
Maybe cloning the TimeTravelSpec object would be safer than just copying the reference. Perhaps it is not an issue, but if two TableRef instances share the same 'timeTravelSpec_' reference and one of them is reset(), that will affect the other instance as well, right?

http://gerrit.cloudera.org:8080/#/c/17765/1/fe/src/main/java/org/apache/impala/analysis/TimeTravelSpec.java
File fe/src/main/java/org/apache/impala/analysis/TimeTravelSpec.java:

http://gerrit.cloudera.org:8080/#/c/17765/1/fe/src/main/java/org/apache/impala/analysis/TimeTravelSpec.java@113
PS1, Line 113: asOfVersion_ = asOfExpr_.evalToInteger(analyzer, "SYSTEM_VERSION AS OF");
You could also check that asOfVersion_ > 0.

http://gerrit.cloudera.org:8080/#/c/17765/1/fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java
File fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java:

http://gerrit.cloudera.org:8080/#/c/17765/1/fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java@132
PS1, Line 132: } catch (IOException ex) {
Does 'ex' contain dataFile.path()? If not, please add dataFile.path() to the exception in L133.

--
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib523c5e47b8d9c377bea39a82fe20249177cf824
Gerrit-Change-Number: 17765
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy
Gerrit-Reviewer: Attila Jeges
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Zoltan Borok-Nagy
Gerrit-Comment-Date: Wed, 25 Aug 2021 16:08:15 +
Gerrit-HasComments: Yes
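
The aliasing concern raised about line 222 can be illustrated with a short Python sketch. The classes below are hypothetical stand-ins, not the Java TableRef and TimeTravelSpec; they only show how a shared reference lets one instance's reset() leak into another, and how cloning the spec on copy avoids that.

```python
import copy


class TimeTravelSpec:
    """Hypothetical stand-in with some analysis state that reset() clears."""

    def __init__(self, as_of_version):
        self.as_of_version = as_of_version
        self.analyzed = True

    def reset(self):
        self.analyzed = False


class TableRef:
    """Hypothetical stand-in for a table reference holding a spec."""

    def __init__(self, spec):
        self.spec = spec

    def copy_sharing_spec(self):
        # Mirrors 'timeTravelSpec_ = other.timeTravelSpec_': one shared object.
        return TableRef(self.spec)

    def copy_cloning_spec(self):
        # Gives the copy its own spec object.
        return TableRef(copy.copy(self.spec))

    def reset(self):
        self.spec.reset()


source = TableRef(TimeTravelSpec(as_of_version=7))
shared = source.copy_sharing_spec()
cloned = source.copy_cloning_spec()

source.reset()

print(shared.spec.analyzed)  # affected by source.reset()
print(cloned.spec.analyzed)  # unaffected
```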
[Impala-ASF-CR] IMPALA-10840: Add support for "FOR SYSTEM_TIME AS OF" and "FOR SYSTEM_VERSION AS OF" for Iceberg tables
Attila Jeges has posted comments on this change. ( http://gerrit.cloudera.org:8080/17765 )

Change subject: IMPALA-10840: Add support for "FOR SYSTEM_TIME AS OF" and "FOR SYSTEM_VERSION AS OF" for Iceberg tables
..

Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/17765/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17765/1//COMMIT_MSG@9
PS1, Line 9: This patch adds support "FOR SYSTEM_TIME AS OF" and
Please clarify that the timestamp specified with "FOR SYSTEM_TIME AS OF" is interpreted to be in the local timezone, meaning the coordinator node's local timezone.

http://gerrit.cloudera.org:8080/#/c/17765/1/tests/query_test/test_iceberg.py
File tests/query_test/test_iceberg.py:

http://gerrit.cloudera.org:8080/#/c/17765/1/tests/query_test/test_iceberg.py@197
PS1, Line 197: # Query old snapshot
> Maybe add another test to query with a timestamp in the future.
You could also test (if not too much work) that switching Impala to another timezone (e.g. using the TIMEZONE query option) changes the results of the time travel query.

--
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib523c5e47b8d9c377bea39a82fe20249177cf824
Gerrit-Change-Number: 17765
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy
Gerrit-Reviewer: Attila Jeges
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Zoltan Borok-Nagy
Gerrit-Comment-Date: Wed, 25 Aug 2021 11:58:09 +
Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10840: Add support for "FOR SYSTEM_TIME AS OF" and "FOR SYSTEM_VERSION AS OF" for Iceberg tables
Attila Jeges has posted comments on this change. ( http://gerrit.cloudera.org:8080/17765 )

Change subject: IMPALA-10840: Add support for "FOR SYSTEM_TIME AS OF" and "FOR SYSTEM_VERSION AS OF" for Iceberg tables
..

Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/17765/1/tests/query_test/test_iceberg.py
File tests/query_test/test_iceberg.py:

http://gerrit.cloudera.org:8080/#/c/17765/1/tests/query_test/test_iceberg.py@197
PS1, Line 197: # Query old snapshot
Maybe add another test to query with a timestamp in the future.

--
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib523c5e47b8d9c377bea39a82fe20249177cf824
Gerrit-Change-Number: 17765
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy
Gerrit-Reviewer: Attila Jeges
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Zoltan Borok-Nagy
Gerrit-Comment-Date: Wed, 25 Aug 2021 11:03:10 +
Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10874: Upgrade impyla to the latest version
Attila Jeges has posted comments on this change. ( http://gerrit.cloudera.org:8080/17795 )

Change subject: IMPALA-10874: Upgrade impyla to the latest version
..

Patch Set 2:

I've added Joe McDonnell as a reviewer to approve the change.

--
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I990e5cdde4e98d6ab3581fe48f53a5d0590ce492
Gerrit-Change-Number: 17795
Gerrit-PatchSet: 2
Gerrit-Owner: Wenzhe Zhou
Gerrit-Reviewer: Attila Jeges
Gerrit-Reviewer: Bikramjeet Vig
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Joe McDonnell
Gerrit-Reviewer: Wenzhe Zhou
Gerrit-Comment-Date: Tue, 24 Aug 2021 18:43:21 +
Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10874: Upgrade impyla to the latest version
Attila Jeges has posted comments on this change. ( http://gerrit.cloudera.org:8080/17795 ) Change subject: IMPALA-10874: Upgrade impyla to the latest version .. Patch Set 2: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/17795 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I990e5cdde4e98d6ab3581fe48f53a5d0590ce492 Gerrit-Change-Number: 17795 Gerrit-PatchSet: 2 Gerrit-Owner: Wenzhe Zhou Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Tue, 24 Aug 2021 18:40:50 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10840: Add support for "FOR SYSTEM_TIME AS OF" and "FOR SYSTEM_VERSION AS OF" for Iceberg tables
Attila Jeges has posted comments on this change. ( http://gerrit.cloudera.org:8080/17765 ) Change subject: IMPALA-10840: Add support for "FOR SYSTEM_TIME AS OF" and "FOR SYSTEM_VERSION AS OF" for Iceberg tables .. Patch Set 1: (3 comments) The patch looks good. At first glance I found only minor issues. http://gerrit.cloudera.org:8080/#/c/17765/1/fe/src/main/java/org/apache/impala/util/IcebergUtil.java File fe/src/main/java/org/apache/impala/util/IcebergUtil.java: http://gerrit.cloudera.org:8080/#/c/17765/1/fe/src/main/java/org/apache/impala/util/IcebergUtil.java@515 PS1, Line 515: TableScan scan = createScanAsOf( nit: Since 'baseTable' is not used anywhere else, you could move L514 inside createScanAsOf(). This way createScanAsOf() would need only 2 params : table and timeTravelSpec. http://gerrit.cloudera.org:8080/#/c/17765/1/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java File fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java: http://gerrit.cloudera.org:8080/#/c/17765/1/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java@4884 PS1, Line 4884: iceT, "FOR SYSTEM_VERSION AS OF must be an integer type but is"); The end of the error msg was left out intentionally? http://gerrit.cloudera.org:8080/#/c/17765/1/tests/query_test/test_iceberg.py File tests/query_test/test_iceberg.py: http://gerrit.cloudera.org:8080/#/c/17765/1/tests/query_test/test_iceberg.py@134 PS1, Line 134: ts 'snapshot_id' ? -- To view, visit http://gerrit.cloudera.org:8080/17765 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib523c5e47b8d9c377bea39a82fe20249177cf824 Gerrit-Change-Number: 17765 Gerrit-PatchSet: 1 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 24 Aug 2021 18:01:16 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10879: Add parquet stats to iceberg manifest
Attila Jeges has uploaded a new patch set (#2). ( http://gerrit.cloudera.org:8080/17806 ) Change subject: IMPALA-10879: Add parquet stats to iceberg manifest .. IMPALA-10879: Add parquet stats to iceberg manifest This patch adds parquet stats to iceberg manifest as per-datafile metrics. The following metrics are supported: - column_sizes : Map from column id to the total size on disk of all regions that store the column. Does not include bytes necessary to read other columns, like footers. - null_value_counts : Map from column id to number of null values in the column. - lower_bounds : Map from column id to lower bound in the column serialized as binary. Each value must be less than or equal to all non-null, non-NaN values in the column for the file. - upper_bounds : Map from column id to upper bound in the column serialized as binary. Each value must be greater than or equal to all non-null, non-NaN values in the column for the file. The corresponding parquet stats are collected by 'ColumnStats' (in 'min_value_', 'max_value_', 'null_count_' members) and 'HdfsParquetTableWriter::BaseColumnWriter' (in 'total_compressed_byte_size_' member). Testing: - New e2e test was added to verify that the metrics are written to the Iceberg manifest upon inserting data. - New e2e test was added to verify that lower_bounds/upper_bounds metrics are used to prune data files on querying iceberg tables. - Existing e2e tests were updated to work with the new behavior. 
Relevant Iceberg documentation: - Manifest: https://iceberg.apache.org/spec/#manifests - Values in lower_bounds and upper_bounds maps should be Single-value serialized to binary: https://iceberg.apache.org/spec/#appendix-d-single-value-serialization Change-Id: Ic31f2260bc6f6a7f307ac955ff05eb154917675b --- M be/src/exec/hdfs-table-sink.cc M be/src/exec/hdfs-table-writer.h M be/src/exec/parquet/hdfs-parquet-table-writer.cc M be/src/exec/parquet/hdfs-parquet-table-writer.h M be/src/exec/parquet/parquet-column-stats.h M be/src/exec/parquet/parquet-column-stats.inline.h M be/src/runtime/dml-exec-state.cc M be/src/runtime/dml-exec-state.h M be/src/util/bit-util.h M common/fbs/IcebergObjects.fbs M common/protobuf/control_service.proto M fe/src/main/java/org/apache/impala/service/IcebergCatalogOpExecutor.java M infra/python/deps/requirements.txt M testdata/workloads/functional-query/queries/QueryTest/iceberg-partition-transform-insert.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-partitioned-insert.test A testdata/workloads/functional-query/queries/QueryTest/iceberg-upper-lower-bound-metrics.test M tests/query_test/test_iceberg.py 17 files changed, 964 insertions(+), 17 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/06/17806/2 -- To view, visit http://gerrit.cloudera.org:8080/17806 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ic31f2260bc6f6a7f307ac955ff05eb154917675b Gerrit-Change-Number: 17806 Gerrit-PatchSet: 2 Gerrit-Owner: Attila Jeges Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy
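For readers unfamiliar with the single-value serialization referenced above, here is a minimal Python sketch of the encoding for a few primitive types, as I read Appendix D of the Iceberg spec. It is not the C++ code from this patch; the function name serialize_single_value and the string type tags are hypothetical and used only for illustration.

```python
import struct


def serialize_single_value(iceberg_type, value):
    """Encode one bound value as bytes (illustrative helper, not Impala code).

    Covers only a handful of primitive types; decimal, timestamp, uuid,
    fixed and binary are left out of this sketch.
    """
    if iceberg_type == "boolean":
        return b"\x01" if value else b"\x00"
    if iceberg_type in ("int", "date"):
        # 4-byte little-endian; a date is stored as days since 1970-01-01.
        return struct.pack("<i", value)
    if iceberg_type == "long":
        # 8-byte little-endian.
        return struct.pack("<q", value)
    if iceberg_type == "float":
        # 4-byte little-endian IEEE 754.
        return struct.pack("<f", value)
    if iceberg_type == "double":
        # 8-byte little-endian IEEE 754.
        return struct.pack("<d", value)
    if iceberg_type == "string":
        # UTF-8 bytes without a length prefix.
        return value.encode("utf-8")
    raise ValueError("type not covered by this sketch: %s" % iceberg_type)
```

A lower_bounds or upper_bounds entry would then map a column id to such a byte string, for example serialize_single_value("int", 1) for an INT column whose smallest value is 1.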
[Impala-ASF-CR] IMPALA-10879: Add parquet stats to iceberg manifest
Attila Jeges has posted comments on this change. ( http://gerrit.cloudera.org:8080/17806 ) Change subject: IMPALA-10879: Add parquet stats to iceberg manifest .. Patch Set 1: (11 comments) http://gerrit.cloudera.org:8080/#/c/17806/1/tests/query_test/test_iceberg.py File tests/query_test/test_iceberg.py: http://gerrit.cloudera.org:8080/#/c/17806/1/tests/query_test/test_iceberg.py@24 PS1, Line 24: import avro.schema > flake8: F401 'avro.schema' imported but unused Done http://gerrit.cloudera.org:8080/#/c/17806/1/tests/query_test/test_iceberg.py@25 PS1, Line 25: from avro.datafile import DataFileReader, DataFileWriter > flake8: F401 'avro.datafile.DataFileWriter' imported but unused Done http://gerrit.cloudera.org:8080/#/c/17806/1/tests/query_test/test_iceberg.py@26 PS1, Line 26: from avro.io import DatumReader, DatumWriter > flake8: F401 'avro.io.DatumWriter' imported but unused Done http://gerrit.cloudera.org:8080/#/c/17806/1/tests/query_test/test_iceberg.py@199 PS1, Line 199: > flake8: E202 whitespace before ']' Done http://gerrit.cloudera.org:8080/#/c/17806/1/tests/query_test/test_iceberg.py@199 PS1, Line 199: > flake8: E201 whitespace after '[' Done http://gerrit.cloudera.org:8080/#/c/17806/1/tests/query_test/test_iceberg.py@260 PS1, Line 260: , > flake8: E231 missing whitespace after ',' Done http://gerrit.cloudera.org:8080/#/c/17806/1/tests/query_test/test_iceberg.py@268 PS1, Line 268: , > flake8: E231 missing whitespace after ',' Done http://gerrit.cloudera.org:8080/#/c/17806/1/tests/query_test/test_iceberg.py@282 PS1, Line 282: , > flake8: E231 missing whitespace after ',' Done http://gerrit.cloudera.org:8080/#/c/17806/1/tests/query_test/test_iceberg.py@290 PS1, Line 290: , > flake8: E231 missing whitespace after ',' Done http://gerrit.cloudera.org:8080/#/c/17806/1/tests/query_test/test_iceberg.py@292 PS1, Line 292: : > flake8: E231 missing whitespace after ':' Done http://gerrit.cloudera.org:8080/#/c/17806/1/tests/query_test/test_iceberg.py@307 PS1, Line 307: : > flake8: E231 missing whitespace after ':' Done -- To view, visit http://gerrit.cloudera.org:8080/17806 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic31f2260bc6f6a7f307ac955ff05eb154917675b Gerrit-Change-Number: 17806 Gerrit-PatchSet: 1 Gerrit-Owner: Attila Jeges Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 24 Aug 2021 12:01:33 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10879: Add parquet stats to iceberg manifest
Attila Jeges has uploaded this change for review. ( http://gerrit.cloudera.org:8080/17806 ) Change subject: IMPALA-10879: Add parquet stats to iceberg manifest .. IMPALA-10879: Add parquet stats to iceberg manifest This patch adds parquet stats to iceberg manifest as per-datafile metrics. The following metrics are supported: - column_sizes : Map from column id to the total size on disk of all regions that store the column. Does not include bytes necessary to read other columns, like footers. - null_value_counts : Map from column id to number of null values in the column. - lower_bounds : Map from column id to lower bound in the column serialized as binary. Each value must be less than or equal to all non-null, non-NaN values in the column for the file. - upper_bounds : Map from column id to upper bound in the column serialized as binary. Each value must be greater than or equal to all non-null, non-NaN values in the column for the file. The corresponding parquet stats are collected by 'ColumnStats' (in 'min_value_', 'max_value_', 'null_count_' members) and 'HdfsParquetTableWriter::BaseColumnWriter' (in 'total_compressed_byte_size_' member). Testing: - New e2e test was added to verify that the metrics are written to the Iceberg manifest upon inserting data. - New e2e test was added to verify that lower_bounds/upper_bounds metrics are used to prune data files on querying iceberg tables. - Existing e2e tests were updated to work with the new behavior. 
Relevant Iceberg documentation: - Manifest: https://iceberg.apache.org/spec/#manifests - Values in lower_bounds and upper_bounds maps should be Single-value serialized to binary: https://iceberg.apache.org/spec/#appendix-d-single-value-serialization Change-Id: Ic31f2260bc6f6a7f307ac955ff05eb154917675b --- M be/src/exec/hdfs-table-sink.cc M be/src/exec/hdfs-table-writer.h M be/src/exec/parquet/hdfs-parquet-table-writer.cc M be/src/exec/parquet/hdfs-parquet-table-writer.h M be/src/exec/parquet/parquet-column-stats.h M be/src/exec/parquet/parquet-column-stats.inline.h M be/src/runtime/dml-exec-state.cc M be/src/runtime/dml-exec-state.h M be/src/util/bit-util.h M common/fbs/IcebergObjects.fbs M common/protobuf/control_service.proto M fe/src/main/java/org/apache/impala/service/IcebergCatalogOpExecutor.java M infra/python/deps/requirements.txt M testdata/workloads/functional-query/queries/QueryTest/iceberg-partition-transform-insert.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-partitioned-insert.test A testdata/workloads/functional-query/queries/QueryTest/iceberg-upper-lower-bound-metrics.test M tests/query_test/test_iceberg.py 17 files changed, 965 insertions(+), 17 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/06/17806/1 -- To view, visit http://gerrit.cloudera.org:8080/17806 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Ic31f2260bc6f6a7f307ac955ff05eb154917675b Gerrit-Change-Number: 17806 Gerrit-PatchSet: 1 Gerrit-Owner: Attila Jeges
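To illustrate what "used to prune data files" means for the lower_bounds/upper_bounds metrics, here is a rough Python sketch of min/max pruning for an equality predicate. It assumes the bounds have already been decoded from their binary form, and the dict layout, paths and function names are hypothetical; the actual pruning is done by the Iceberg library during scan planning, not by code like this.

```python
def can_skip_file(lower, upper, value):
    """Return True when `col = value` cannot match any row in the file."""
    if lower is None or upper is None:
        # No bound recorded for this column: the file has to be read.
        return False
    return value < lower or value > upper


def prune(files, column_id, value):
    """Return the paths of data files that may contain `value`."""
    kept = []
    for data_file in files:
        lower = data_file["lower_bounds"].get(column_id)
        upper = data_file["upper_bounds"].get(column_id)
        if not can_skip_file(lower, upper, value):
            kept.append(data_file["path"])
    return kept


# Hypothetical manifest entries for a table with a single INT column (id 1).
files = [
    {"path": "a.parq", "lower_bounds": {1: 0}, "upper_bounds": {1: 9}},
    {"path": "b.parq", "lower_bounds": {1: 10}, "upper_bounds": {1: 19}},
    {"path": "c.parq", "lower_bounds": {}, "upper_bounds": {}},
]
```

With these entries, prune(files, 1, 12) keeps b.parq (12 lies within its bounds) and c.parq (no bounds recorded, so it cannot be ruled out), and skips a.parq.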
[Impala-ASF-CR] IMPALA-10874: Upgrade impyla to the latest version
Attila Jeges has posted comments on this change. ( http://gerrit.cloudera.org:8080/17795 ) Change subject: IMPALA-10874: Upgrade impyla to the latest version .. Patch Set 1: @Bikramjeet If I remember correctly, impyla doesn't rely on 'sasl' anymore, so that dependency can be removed from requirements.txt. I think it is safe to upgrade 'bitarray'. -- To view, visit http://gerrit.cloudera.org:8080/17795 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I990e5cdde4e98d6ab3581fe48f53a5d0590ce492 Gerrit-Change-Number: 17795 Gerrit-PatchSet: 1 Gerrit-Owner: Wenzhe Zhou Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Mon, 23 Aug 2021 19:27:32 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10874: Upgrade impyla to the latest version
Attila Jeges has posted comments on this change. ( http://gerrit.cloudera.org:8080/17795 ) Change subject: IMPALA-10874: Upgrade impyla to the latest version .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/17795/1/infra/python/deps/requirements.txt File infra/python/deps/requirements.txt: http://gerrit.cloudera.org:8080/#/c/17795/1/infra/python/deps/requirements.txt@40 PS1, Line 40: sasl == 0.3.1 I think impyla doesn't rely on sasl package anymore. Could you please check it? -- To view, visit http://gerrit.cloudera.org:8080/17795 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I990e5cdde4e98d6ab3581fe48f53a5d0590ce492 Gerrit-Change-Number: 17795 Gerrit-PatchSet: 1 Gerrit-Owner: Wenzhe Zhou Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Mon, 23 Aug 2021 19:24:01 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10784 (part 2): Fix retaining cookies for impala-shell
Attila Jeges has posted comments on this change. ( http://gerrit.cloudera.org:8080/17796 ) Change subject: IMPALA-10784 (part 2): Fix retaining cookies for impala-shell .. Patch Set 3: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/17796 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I65432b952929c1c96a081bb87fd4a096624d711b Gerrit-Change-Number: 17796 Gerrit-PatchSet: 3 Gerrit-Owner: Wenzhe Zhou Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Mon, 23 Aug 2021 19:17:59 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10741: Set engine.hive.enabled=true table property for Iceberg tables
Attila Jeges has posted comments on this change. ( http://gerrit.cloudera.org:8080/17750 ) Change subject: IMPALA-10741: Set engine.hive.enabled=true table property for Iceberg tables .. Patch Set 1: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/17750 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6aa0240829697a27f48d0defcce48920a5d6f49b Gerrit-Change-Number: 17750 Gerrit-PatchSet: 1 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Thu, 05 Aug 2021 08:50:59 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10739: Support setting new partition spec for Iceberg tables
Attila Jeges has uploaded a new patch set (#3). ( http://gerrit.cloudera.org:8080/17723 ) Change subject: IMPALA-10739: Support setting new partition spec for Iceberg tables .. IMPALA-10739: Support setting new partition spec for Iceberg tables With this patch Impala will support partition evolution for Iceberg tables. The DDL statement to change the default partition spec is: ALTER TABLE SET PARTITION SPEC() Hive uses the same SQL syntax. Testing: - Added FE test to exercise parsing various well-formed and ill-formed ALTER TABLE SET PARTITION SPEC statements. - Added e2e tests for: - ALTER TABLE SET PARTITION SPEC works for tables with HadoopTables and HadoopCatalog Catalog. - When evolving partition spec, the old data written with an earlier spec remains unchanged. New data is written using the new spec in a new layout. Data written with earlier spec and new spec can be fetched in a single query. - Invalid ALTER TABLE SET PARTITION SPEC statements yield the expected analysis error messages. 
Change-Id: I9bd935b8a82e977df9ee90d464b5fe2a7acc83f2 --- M be/src/exec/hdfs-table-sink.cc M common/thrift/JniCatalog.thrift M fe/src/main/cup/sql-parser.cup A fe/src/main/java/org/apache/impala/analysis/AlterTableSetPartitionSpecStmt.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/main/java/org/apache/impala/service/IcebergCatalogOpExecutor.java M fe/src/test/java/org/apache/impala/analysis/ParserTest.java M testdata/workloads/functional-query/queries/QueryTest/iceberg-alter.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-negative.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-partitioned-insert.test 10 files changed, 306 insertions(+), 3 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/17723/3 -- To view, visit http://gerrit.cloudera.org:8080/17723 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I9bd935b8a82e977df9ee90d464b5fe2a7acc83f2 Gerrit-Change-Number: 17723 Gerrit-PatchSet: 3 Gerrit-Owner: Attila Jeges Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-10739: Support setting new partition spec for Iceberg tables
Attila Jeges has posted comments on this change. ( http://gerrit.cloudera.org:8080/17723 ) Change subject: IMPALA-10739: Support setting new partition spec for Iceberg tables .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/17723/2/testdata/workloads/functional-query/queries/QueryTest/iceberg-partitioned-insert.test File testdata/workloads/functional-query/queries/QueryTest/iceberg-partitioned-insert.test: http://gerrit.cloudera.org:8080/#/c/17723/2/testdata/workloads/functional-query/queries/QueryTest/iceberg-partitioned-insert.test@309 PS2, Line 309: .*.0.parq','.*','' > This matches to all data files. Can we exclude '=' from the first .*? Good catch! Done. -- To view, visit http://gerrit.cloudera.org:8080/17723 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9bd935b8a82e977df9ee90d464b5fe2a7acc83f2 Gerrit-Change-Number: 17723 Gerrit-PatchSet: 2 Gerrit-Owner: Attila Jeges Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 03 Aug 2021 08:55:15 + Gerrit-HasComments: Yes
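The point raised in the comment above can be reproduced with a short Python sketch. The file paths are hypothetical, and the stricter pattern is only one possible way to exclude '=' from the first wildcard; the exact change made in patch set 3 is not shown in this thread.

```python
import re

# Pattern quoted in the review comment: the leading '.*' also matches
# partition directories such as 'i=1'.
loose = re.compile(r".*.0.parq")
# Assumed fix: the leading wildcard may not contain '='.
strict = re.compile(r"[^=]*.0.parq")

# Hypothetical data file paths: one written with an unpartitioned spec,
# one written under a partition directory.
paths = [
    "/test-warehouse/tbl/data/abc.0.parq",
    "/test-warehouse/tbl/data/i=1/def.0.parq",
]

loose_hits = [p for p in paths if loose.fullmatch(p)]
strict_hits = [p for p in paths if strict.fullmatch(p)]
```

Here loose_hits contains both paths, while strict_hits contains only the path without a partition directory.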
[Impala-ASF-CR] IMPALA-10739: Support setting new partition spec for Iceberg tables
Attila Jeges has uploaded a new patch set (#2). ( http://gerrit.cloudera.org:8080/17723 ) Change subject: IMPALA-10739: Support setting new partition spec for Iceberg tables .. IMPALA-10739: Support setting new partition spec for Iceberg tables With this patch Impala will support partition evolution for Iceberg tables. The DDL statement to change the default partition spec is: ALTER TABLE SET PARTITION SPEC() Hive uses the same SQL syntax. Testing: - Added FE test to exercise parsing various well-formed and ill-formed ALTER TABLE SET PARTITION SPEC statements. - Added e2e tests for: - ALTER TABLE SET PARTITION SPEC works for tables with HadoopTables and HadoopCatalog Catalog. - When evolving partition spec, the old data written with an earlier spec remains unchanged. New data is written using the new spec in a new layout. Data written with earlier spec and new spec can be fetched in a single query. - Invalid ALTER TABLE SET PARTITION SPEC statements yield the expected analysis error messages. 
Change-Id: I9bd935b8a82e977df9ee90d464b5fe2a7acc83f2 --- M be/src/exec/hdfs-table-sink.cc M common/thrift/JniCatalog.thrift M fe/src/main/cup/sql-parser.cup A fe/src/main/java/org/apache/impala/analysis/AlterTableSetPartitionSpecStmt.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/main/java/org/apache/impala/service/IcebergCatalogOpExecutor.java M fe/src/test/java/org/apache/impala/analysis/ParserTest.java M testdata/workloads/functional-query/queries/QueryTest/iceberg-alter.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-negative.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-partitioned-insert.test 10 files changed, 306 insertions(+), 3 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/17723/2 -- To view, visit http://gerrit.cloudera.org:8080/17723 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I9bd935b8a82e977df9ee90d464b5fe2a7acc83f2 Gerrit-Change-Number: 17723 Gerrit-PatchSet: 2 Gerrit-Owner: Attila Jeges Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-10739: Support setting new partition spec for Iceberg tables
Attila Jeges has posted comments on this change. ( http://gerrit.cloudera.org:8080/17723 ) Change subject: IMPALA-10739: Support setting new partition spec for Iceberg tables .. Patch Set 1: (5 comments) http://gerrit.cloudera.org:8080/#/c/17723/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/17723/1//COMMIT_MSG@15 PS1, Line 15: rhe > the Done http://gerrit.cloudera.org:8080/#/c/17723/1//COMMIT_MSG@16 PS1, Line 16: > Please add section about testing. Done http://gerrit.cloudera.org:8080/#/c/17723/1/common/thrift/JniCatalog.thrift File common/thrift/JniCatalog.thrift: http://gerrit.cloudera.org:8080/#/c/17723/1/common/thrift/JniCatalog.thrift@469 PS1, Line 469: from > for Done http://gerrit.cloudera.org:8080/#/c/17723/1/fe/src/main/java/org/apache/impala/analysis/AlterTableSetPartitionSpecStmt.java File fe/src/main/java/org/apache/impala/analysis/AlterTableSetPartitionSpecStmt.java: http://gerrit.cloudera.org:8080/#/c/17723/1/fe/src/main/java/org/apache/impala/analysis/AlterTableSetPartitionSpecStmt.java@50 PS1, Line 50: sb.append(getTbl()).append(" SET PARTITION SPEC ") > Do we need to add parenthesis here, or are they added by icebergPartSpec_.t The parentheses are added in icebergPartSpec_.toSql() http://gerrit.cloudera.org:8080/#/c/17723/1/testdata/workloads/functional-query/queries/QueryTest/iceberg-partitioned-insert.test File testdata/workloads/functional-query/queries/QueryTest/iceberg-partitioned-insert.test: http://gerrit.cloudera.org:8080/#/c/17723/1/testdata/workloads/functional-query/queries/QueryTest/iceberg-partitioned-insert.test@303 PS1, Line 303: TYPES > Could you please add a SHOW FILES statement at the end so we can see the di Done -- To view, visit http://gerrit.cloudera.org:8080/17723 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9bd935b8a82e977df9ee90d464b5fe2a7acc83f2 Gerrit-Change-Number: 17723 Gerrit-PatchSet: 1 
Gerrit-Owner: Attila Jeges Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 02 Aug 2021 11:57:10 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10739: Support setting new partition spec for Iceberg tables
Attila Jeges has uploaded this change for review. ( http://gerrit.cloudera.org:8080/17723 Change subject: IMPALA-10739: Support setting new partition spec for Iceberg tables .. IMPALA-10739: Support setting new partition spec for Iceberg tables With this patch Impala will support partition evolution for Iceberg tables. The DDL statement to change the default partition spec is: ALTER TABLE SET PARTITION SPEC() Hive uses rhe same SQL syntax. Change-Id: I9bd935b8a82e977df9ee90d464b5fe2a7acc83f2 --- M be/src/exec/hdfs-table-sink.cc M common/thrift/JniCatalog.thrift M fe/src/main/cup/sql-parser.cup A fe/src/main/java/org/apache/impala/analysis/AlterTableSetPartitionSpecStmt.java M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java M fe/src/main/java/org/apache/impala/service/IcebergCatalogOpExecutor.java M fe/src/test/java/org/apache/impala/analysis/ParserTest.java M testdata/workloads/functional-query/queries/QueryTest/iceberg-alter.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-negative.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-partitioned-insert.test 10 files changed, 295 insertions(+), 3 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/17723/1 -- To view, visit http://gerrit.cloudera.org:8080/17723 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I9bd935b8a82e977df9ee90d464b5fe2a7acc83f2 Gerrit-Change-Number: 17723 Gerrit-PatchSet: 1 Gerrit-Owner: Attila Jeges
[Impala-ASF-CR] IMPALA-10820: Fix calculating default block size for parquet files
Attila Jeges has uploaded a new patch set (#2). ( http://gerrit.cloudera.org:8080/17719 ) Change subject: IMPALA-10820: Fix calculating default block size for parquet files .. IMPALA-10820: Fix calculating default block size for parquet files This patch fixes a bug introduced in IMPALA-10627. Because of the bug, the wrong default block size was used for parquet files, which broke the TestInsertWideTable.test_insert_wide_table e2e test. Testing: - Ran test_insert_wide_table with exhaustive strategy. Change-Id: Iac8c6dd80dfe84cb7b3d2106713eae87ce923934 --- M be/src/exec/parquet/hdfs-parquet-table-writer.cc M be/src/exec/parquet/hdfs-parquet-table-writer.h 2 files changed, 12 insertions(+), 10 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/19/17719/2 -- To view, visit http://gerrit.cloudera.org:8080/17719 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Iac8c6dd80dfe84cb7b3d2106713eae87ce923934 Gerrit-Change-Number: 17719 Gerrit-PatchSet: 2 Gerrit-Owner: Attila Jeges Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-10820: Fix calculating default block size for parquet files
Attila Jeges has uploaded this change for review. ( http://gerrit.cloudera.org:8080/17719 ) Change subject: IMPALA-10820: Fix calculating default block size for parquet files .. IMPALA-10820: Fix calculating default block size for parquet files This patch fixes a bug introduced in IMPALA-10627. Because of the bug, the wrong default block size was used for parquet files, which broke the TestInsertWideTable.test_insert_wide_table e2e test. Testing: - Ran test_insert_wide_table with exhaustive strategy. Change-Id: Iac8c6dd80dfe84cb7b3d2106713eae87ce923934 --- M be/src/exec/parquet/hdfs-parquet-table-writer.cc M be/src/exec/parquet/hdfs-parquet-table-writer.h 2 files changed, 8 insertions(+), 8 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/19/17719/1 -- To view, visit http://gerrit.cloudera.org:8080/17719 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Iac8c6dd80dfe84cb7b3d2106713eae87ce923934 Gerrit-Change-Number: 17719 Gerrit-PatchSet: 1 Gerrit-Owner: Attila Jeges
[Impala-ASF-CR] IMPALA-10627: Use standard parquet-related Iceberg table properties
Attila Jeges has posted comments on this change. ( http://gerrit.cloudera.org:8080/17654 ) Change subject: IMPALA-10627: Use standard parquet-related Iceberg table properties .. Patch Set 8: Code-Review+2 Carry +2 after rebase. -- To view, visit http://gerrit.cloudera.org:8080/17654 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3b8aa9a52c13c41b48310d2f7c9c7426e1ff5f23 Gerrit-Change-Number: 17654 Gerrit-PatchSet: 8 Gerrit-Owner: Attila Jeges Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Tue, 20 Jul 2021 10:14:48 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10627: Use standard parquet-related Iceberg table properties
Attila Jeges has uploaded a new patch set (#8). ( http://gerrit.cloudera.org:8080/17654 ) Change subject: IMPALA-10627: Use standard parquet-related Iceberg table properties .. IMPALA-10627: Use standard parquet-related Iceberg table properties This patch adds support for the following standard Iceberg properties: write.parquet.compression-codec: Parquet compression codec. Supported values are: NONE, GZIP, SNAPPY (default value), LZ4, ZSTD. The table property will be ignored if COMPRESSION_CODEC query option is set. write.parquet.compression-level: Parquet compression level. Used with ZSTD compression only. Supported range is [1, 22]. Default value is 3. The table property will be ignored if COMPRESSION_CODEC query option is set. write.parquet.row-group-size-bytes : Parquet row group size in bytes. Supported range is [8388608, 2146435072] (8MB - 2047MB). The table property will be ignored if PARQUET_FILE_SIZE query option is set. If neither the table property nor the PARQUET_FILE_SIZE query option is set, the way Impala calculates row group size will remain unchanged. write.parquet.page-size-bytes: Parquet page size in bytes. Used for PLAIN encoding. Supported range is [65536, 1073741824] (64KB - 1GB). If the table property is unset, the way Impala calculates page size will remain unchanged. write.parquet.dict-size-bytes: Parquet dictionary page size in bytes. Used for dictionary encoding. Supported range is [65536, 1073741824] (64KB - 1GB). If the table property is unset, the way Impala calculates dictionary page size will remain unchanged. This patch also renames 'iceberg.file_format' table property to 'write.format.default' which is the standard Iceberg name for the table property. 
Change-Id: I3b8aa9a52c13c41b48310d2f7c9c7426e1ff5f23 --- M be/src/exec/parquet/hdfs-parquet-table-writer.cc M be/src/exec/parquet/hdfs-parquet-table-writer.h M be/src/runtime/descriptors.cc M be/src/runtime/descriptors.h M common/thrift/CatalogObjects.thrift M fe/src/main/java/org/apache/impala/analysis/AlterTableSetTblProperties.java M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java M fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java M fe/src/main/java/org/apache/impala/catalog/IcebergTable.java M fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergCtasTarget.java M fe/src/main/java/org/apache/impala/catalog/local/LocalIcebergTable.java M fe/src/main/java/org/apache/impala/util/IcebergUtil.java M testdata/datasets/functional/functional_schema_template.sql M testdata/workloads/functional-query/queries/QueryTest/iceberg-alter.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-catalogs.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-create.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-insert.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-negative.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-query.test M testdata/workloads/functional-query/queries/QueryTest/show-create-table.test 20 files changed, 1,147 insertions(+), 115 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/54/17654/8 -- To view, visit http://gerrit.cloudera.org:8080/17654 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I3b8aa9a52c13c41b48310d2f7c9c7426e1ff5f23 Gerrit-Change-Number: 17654 Gerrit-PatchSet: 8 Gerrit-Owner: Attila Jeges Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng
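The ranges and the precedence rule listed in the commit message above can be summarized in a small Python sketch. This is not the FE/BE code from the patch: the function names are hypothetical, and case-insensitive matching of codec names is an assumption made for this example.

```python
# Supported ranges, copied from the commit message (inclusive bounds).
INT_RANGES = {
    "write.parquet.compression-level": (1, 22),
    "write.parquet.row-group-size-bytes": (8 * 1024 * 1024, 2047 * 1024 * 1024),
    "write.parquet.page-size-bytes": (64 * 1024, 1024 * 1024 * 1024),
    "write.parquet.dict-size-bytes": (64 * 1024, 1024 * 1024 * 1024),
}
SUPPORTED_CODECS = ("NONE", "GZIP", "SNAPPY", "LZ4", "ZSTD")


def validate_parquet_properties(props):
    """Return a list of error strings for out-of-range table properties."""
    errors = []
    codec = props.get("write.parquet.compression-codec")
    # Assumption: codec names are compared case-insensitively.
    if codec is not None and codec.upper() not in SUPPORTED_CODECS:
        errors.append("unsupported codec: %s" % codec)
    for key, (low, high) in INT_RANGES.items():
        if key not in props:
            continue
        try:
            value = int(props[key])
        except ValueError:
            errors.append("%s: not an integer" % key)
            continue
        if not low <= value <= high:
            errors.append("%s: %d outside [%d, %d]" % (key, value, low, high))
    return errors


def effective_codec(query_option, props):
    """The COMPRESSION_CODEC query option, when set, wins over the table property."""
    if query_option:
        return query_option
    return props.get("write.parquet.compression-codec", "SNAPPY")
```

For example, validate_parquet_properties({"write.parquet.compression-level": "23"}) reports one error, and effective_codec(None, {}) falls back to SNAPPY, the default named in the commit message.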
[Impala-ASF-CR] IMPALA-10627: Use standard parquet-related Iceberg table properties
Attila Jeges has posted comments on this change. ( http://gerrit.cloudera.org:8080/17654 ) Change subject: IMPALA-10627: Use standard parquet-related Iceberg table properties .. Patch Set 7: (2 comments) http://gerrit.cloudera.org:8080/#/c/17654/6/be/src/exec/parquet/hdfs-parquet-table-writer.cc File be/src/exec/parquet/hdfs-parquet-table-writer.cc: http://gerrit.cloudera.org:8080/#/c/17654/6/be/src/exec/parquet/hdfs-parquet-table-writer.cc@1150 PS6, Line 1150: : columns_.resize(num_co > Now the implementation of these could be moved to Configure()/ConfigureForI Done http://gerrit.cloudera.org:8080/#/c/17654/6/fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java File fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java: http://gerrit.cloudera.org:8080/#/c/17654/6/fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java@325 PS6, Line 325: able.getParameters(); > For backward compatibility we might also want to search for "iceberg.file_f Done -- To view, visit http://gerrit.cloudera.org:8080/17654 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3b8aa9a52c13c41b48310d2f7c9c7426e1ff5f23 Gerrit-Change-Number: 17654 Gerrit-PatchSet: 7 Gerrit-Owner: Attila Jeges Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Fri, 16 Jul 2021 14:11:48 + Gerrit-HasComments: Yes
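The backward-compatibility comment above amounts to a two-step lookup. A minimal sketch follows, assuming the legacy key is the 'iceberg.file_format' property that the commit message says was renamed, and assuming the standard key takes precedence when both are present. The helper name `get_file_format` is hypothetical, and the actual change lives in Java (FeIcebergTable.java).

```python
NEW_KEY = "write.format.default"   # standard Iceberg name, per the commit message
OLD_KEY = "iceberg.file_format"    # legacy name, per the commit message


def get_file_format(params, default=None):
    """Prefer the standard key; fall back to the legacy key for older tables."""
    if NEW_KEY in params:
        return params[NEW_KEY]
    # Assumption: tables created before the rename only carry the legacy key.
    return params.get(OLD_KEY, default)
```

For example, `get_file_format({"iceberg.file_format": "parquet"})` would still resolve a format for a table created before the rename.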
[Impala-ASF-CR] IMPALA-10627: Use standard parquet-related Iceberg table properties
Attila Jeges has uploaded a new patch set (#7). ( http://gerrit.cloudera.org:8080/17654 ) Change subject: IMPALA-10627: Use standard parquet-related Iceberg table properties .. IMPALA-10627: Use standard parquet-related Iceberg table properties This patch adds support for the following standard Iceberg properties: write.parquet.compression-codec: Parquet compression codec. Supported values are: NONE, GZIP, SNAPPY (default value), LZ4, ZSTD. The table property will be ignored if COMPRESSION_CODEC query option is set. write.parquet.compression-level: Parquet compression level. Used with ZSTD compression only. Supported range is [1, 22]. Default value is 3. The table property will be ignored if COMPRESSION_CODEC query option is set. write.parquet.row-group-size-bytes : Parquet row group size in bytes. Supported range is [8388608, 2146435072] (8MB - 2047MB). The table property will be ignored if PARQUET_FILE_SIZE query option is set. If neither the table property nor the PARQUET_FILE_SIZE query option is set, the way Impala calculates row group size will remain unchanged. write.parquet.page-size-bytes: Parquet page size in bytes. Used for PLAIN encoding. Supported range is [65536, 1073741824] (64KB - 1GB). If the table property is unset, the way Impala calculates page size will remain unchanged. write.parquet.dict-size-bytes: Parquet dictionary page size in bytes. Used for dictionary encoding. Supported range is [65536, 1073741824] (64KB - 1GB). If the table property is unset, the way Impala calculates dictionary page size will remain unchanged. This patch also renames 'iceberg.file_format' table property to 'write.format.default' which is the standard Iceberg name for the table property. 
Change-Id: I3b8aa9a52c13c41b48310d2f7c9c7426e1ff5f23 --- M be/src/exec/parquet/hdfs-parquet-table-writer.cc M be/src/exec/parquet/hdfs-parquet-table-writer.h M be/src/runtime/descriptors.cc M be/src/runtime/descriptors.h M common/thrift/CatalogObjects.thrift M fe/src/main/java/org/apache/impala/analysis/AlterTableSetTblProperties.java M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java M fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java M fe/src/main/java/org/apache/impala/catalog/IcebergTable.java M fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergCtasTarget.java M fe/src/main/java/org/apache/impala/catalog/local/LocalIcebergTable.java M fe/src/main/java/org/apache/impala/util/IcebergUtil.java M testdata/datasets/functional/functional_schema_template.sql M testdata/workloads/functional-query/queries/QueryTest/iceberg-alter.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-create.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-insert.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-negative.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-query.test M testdata/workloads/functional-query/queries/QueryTest/show-create-table.test 19 files changed, 1,147 insertions(+), 109 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/54/17654/7 -- To view, visit http://gerrit.cloudera.org:8080/17654 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I3b8aa9a52c13c41b48310d2f7c9c7426e1ff5f23 Gerrit-Change-Number: 17654 Gerrit-PatchSet: 7 Gerrit-Owner: Attila Jeges Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng
[Impala-ASF-CR] IMPALA-10627: Use standard parquet-related Iceberg table properties
Attila Jeges has uploaded a new patch set (#6). ( http://gerrit.cloudera.org:8080/17654 ) Change subject: IMPALA-10627: Use standard parquet-related Iceberg table properties .. IMPALA-10627: Use standard parquet-related Iceberg table properties This patch adds support for the following standard Iceberg properties: write.parquet.compression-codec: Parquet compression codec. Supported values are: NONE, GZIP, SNAPPY (default value), LZ4, ZSTD. The table property will be ignored if COMPRESSION_CODEC query option is set. write.parquet.compression-level: Parquet compression level. Used with ZSTD compression only. Supported range is [1, 22]. Default value is 3. The table property will be ignored if COMPRESSION_CODEC query option is set. write.parquet.row-group-size-bytes : Parquet row group size in bytes. Supported range is [8388608, 2146435072] (8MB - 2047MB). The table property will be ignored if PARQUET_FILE_SIZE query option is set. If neither the table property nor the PARQUET_FILE_SIZE query option is set, the way Impala calculates row group size will remain unchanged. write.parquet.page-size-bytes: Parquet page size in bytes. Used for PLAIN encoding. Supported range is [65536, 1073741824] (64KB - 1GB). If the table property is unset, the way Impala calculates page size will remain unchanged. write.parquet.dict-size-bytes: Parquet dictionary page size in bytes. Used for dictionary encoding. Supported range is [65536, 1073741824] (64KB - 1GB). If the table property is unset, the way Impala calculates dictionary page size will remain unchanged. This patch also renames 'iceberg.file_format' table property to 'write.format.default' which is the standard Iceberg name for the table property. 
Change-Id: I3b8aa9a52c13c41b48310d2f7c9c7426e1ff5f23 --- M be/src/exec/parquet/hdfs-parquet-table-writer.cc M be/src/exec/parquet/hdfs-parquet-table-writer.h M be/src/runtime/descriptors.cc M be/src/runtime/descriptors.h M common/thrift/CatalogObjects.thrift M fe/src/main/java/org/apache/impala/analysis/AlterTableSetTblProperties.java M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java M fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java M fe/src/main/java/org/apache/impala/catalog/IcebergTable.java M fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergCtasTarget.java M fe/src/main/java/org/apache/impala/catalog/local/LocalIcebergTable.java M fe/src/main/java/org/apache/impala/util/IcebergUtil.java M testdata/datasets/functional/functional_schema_template.sql M testdata/workloads/functional-query/queries/QueryTest/iceberg-alter.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-create.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-insert.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-negative.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-query.test M testdata/workloads/functional-query/queries/QueryTest/show-create-table.test 19 files changed, 1,134 insertions(+), 93 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/54/17654/6 -- To view, visit http://gerrit.cloudera.org:8080/17654 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I3b8aa9a52c13c41b48310d2f7c9c7426e1ff5f23 Gerrit-Change-Number: 17654 Gerrit-PatchSet: 6 Gerrit-Owner: Attila Jeges Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng
[Impala-ASF-CR] IMPALA-10627: Use standard parquet-related Iceberg table properties
Attila Jeges has uploaded a new patch set (#5). ( http://gerrit.cloudera.org:8080/17654 ) Change subject: IMPALA-10627: Use standard parquet-related Iceberg table properties .. IMPALA-10627: Use standard parquet-related Iceberg table properties This patch adds support for the following standard Iceberg properties: write.parquet.compression-codec: Parquet compression codec. Supported values are: NONE, GZIP, SNAPPY (default value), LZ4, ZSTD. The table property will be ignored if COMPRESSION_CODEC query option is set. write.parquet.compression-level: Parquet compression level. Used with ZSTD compression only. Supported range is [1, 22]. Default value is 3. The table property will be ignored if COMPRESSION_CODEC query option is set. write.parquet.row-group-size-bytes : Parquet row group size in bytes. Supported range is [8388608, 2146435072] (8MB - 2047MB). The table property will be ignored if PARQUET_FILE_SIZE query option is set. If neither the table property nor the PARQUET_FILE_SIZE query option is set, the way Impala calculates row group size will remain unchanged. write.parquet.page-size-bytes: Parquet page size in bytes. Used for PLAIN encoding. Supported range is [65536, 1073741824] (64KB - 1GB). If the table property is unset, the way Impala calculates page size will remain unchanged. write.parquet.dict-size-bytes: Parquet dictionary page size in bytes. Used for dictionary encoding. Supported range is [65536, 1073741824] (64KB - 1GB). If the table property is unset, the way Impala calculates dictionary page size will remain unchanged. This patch also renames 'iceberg.file_format' table property to 'write.format.default' which is the standard Iceberg name for the table property. 
Change-Id: I3b8aa9a52c13c41b48310d2f7c9c7426e1ff5f23 --- M be/src/exec/parquet/hdfs-parquet-table-writer.cc M be/src/exec/parquet/hdfs-parquet-table-writer.h M be/src/runtime/descriptors.cc M be/src/runtime/descriptors.h M common/thrift/CatalogObjects.thrift M fe/src/main/java/org/apache/impala/analysis/AlterTableSetTblProperties.java M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java M fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java M fe/src/main/java/org/apache/impala/catalog/IcebergTable.java M fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergCtasTarget.java M fe/src/main/java/org/apache/impala/catalog/local/LocalIcebergTable.java M fe/src/main/java/org/apache/impala/util/IcebergUtil.java M testdata/datasets/functional/functional_schema_template.sql M testdata/workloads/functional-query/queries/QueryTest/iceberg-alter.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-create.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-insert.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-negative.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-query.test M testdata/workloads/functional-query/queries/QueryTest/show-create-table.test 19 files changed, 1,134 insertions(+), 93 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/54/17654/5 -- To view, visit http://gerrit.cloudera.org:8080/17654 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I3b8aa9a52c13c41b48310d2f7c9c7426e1ff5f23 Gerrit-Change-Number: 17654 Gerrit-PatchSet: 5 Gerrit-Owner: Attila Jeges Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng
[Impala-ASF-CR] IMPALA-10732: Use consistent DDL for specifying Iceberg partitions
Attila Jeges has posted comments on this change. ( http://gerrit.cloudera.org:8080/17575 ) Change subject: IMPALA-10732: Use consistent DDL for specifying Iceberg partitions .. Patch Set 5: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/17575 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib72ae445fd68fb0ab75d87b34779dbab922bbc62 Gerrit-Change-Number: 17575 Gerrit-PatchSet: 5 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Wed, 14 Jul 2021 14:07:12 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10627: Use standard parquet-related Iceberg table properties
Attila Jeges has uploaded a new patch set (#4). ( http://gerrit.cloudera.org:8080/17654 ) Change subject: IMPALA-10627: Use standard parquet-related Iceberg table properties .. IMPALA-10627: Use standard parquet-related Iceberg table properties This patch adds support for the following standard Iceberg properties: write.parquet.compression-codec: Parquet compression codec. Supported values are: NONE, GZIP, SNAPPY (default value), LZ4, ZSTD. The table property will be ignored if COMPRESSION_CODEC query option is set. write.parquet.compression-level: Parquet compression level. Used with ZSTD compression only. Supported range is [1, 22]. Default value is 3. The table property will be ignored if COMPRESSION_CODEC query option is set. write.parquet.row-group-size-bytes : Parquet row group size in bytes. Supported range is [8388608, 2146435072] (8MB - 2047MB). The table property will be ignored if PARQUET_FILE_SIZE query option is set. If neither the table property nor the PARQUET_FILE_SIZE query option is set, the way Impala calculates row group size will remain unchanged. write.parquet.page-size-bytes: Parquet page size in bytes. Used for PLAIN encoding. Supported range is [65536, 1073741824] (64KB - 1GB). If the table property is unset, the way Impala calculates page size will remain unchanged. write.parquet.dict-size-bytes: Parquet dictionary page size in bytes. Used for dictionary encoding. Supported range is [65536, 1073741824] (64KB - 1GB). If the table property is unset, the way Impala calculates dictionary page size will remain unchanged. This patch also renames 'iceberg.file_format' table property to 'write.format.default' which is the standard Iceberg name for the table property. 
Change-Id: I3b8aa9a52c13c41b48310d2f7c9c7426e1ff5f23 --- M be/src/exec/parquet/hdfs-parquet-table-writer.cc M be/src/exec/parquet/hdfs-parquet-table-writer.h M be/src/runtime/descriptors.cc M be/src/runtime/descriptors.h M common/thrift/CatalogObjects.thrift M fe/src/main/java/org/apache/impala/analysis/AlterTableSetTblProperties.java M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java M fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java M fe/src/main/java/org/apache/impala/catalog/IcebergTable.java M fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergCtasTarget.java M fe/src/main/java/org/apache/impala/catalog/local/LocalIcebergTable.java M fe/src/main/java/org/apache/impala/service/IcebergCatalogOpExecutor.java M fe/src/main/java/org/apache/impala/util/IcebergUtil.java M testdata/datasets/functional/functional_schema_template.sql M testdata/workloads/functional-query/queries/QueryTest/iceberg-alter.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-create.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-insert.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-negative.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-query.test M testdata/workloads/functional-query/queries/QueryTest/show-create-table.test 20 files changed, 1,088 insertions(+), 75 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/54/17654/4 -- To view, visit http://gerrit.cloudera.org:8080/17654 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I3b8aa9a52c13c41b48310d2f7c9c7426e1ff5f23 Gerrit-Change-Number: 17654 Gerrit-PatchSet: 4 Gerrit-Owner: Attila Jeges Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng
[Impala-ASF-CR] IMPALA-10732: Use consistent DDL for specifying Iceberg partitions
Attila Jeges has posted comments on this change. ( http://gerrit.cloudera.org:8080/17575 ) Change subject: IMPALA-10732: Use consistent DDL for specifying Iceberg partitions .. Patch Set 4: (4 comments) http://gerrit.cloudera.org:8080/#/c/17575/4//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/17575/4//COMMIT_MSG@33 PS4, Line 33: makes Impala to use typo: makes Impala use http://gerrit.cloudera.org:8080/#/c/17575/4/fe/src/main/java/org/apache/impala/util/IcebergUtil.java File fe/src/main/java/org/apache/impala/util/IcebergUtil.java: http://gerrit.cloudera.org:8080/#/c/17575/4/fe/src/main/java/org/apache/impala/util/IcebergUtil.java@291 PS4, Line 291: transformType.startsWit Not your change, but why is startsWith() used instead of equals() for BUCKET and TRUNCATE transforms? http://gerrit.cloudera.org:8080/#/c/17575/4/fe/src/main/java/org/apache/impala/util/IcebergUtil.java@302 PS4, Line 302: "Unsupported iceberg partition type: " Do we have a test that exercises this error message? http://gerrit.cloudera.org:8080/#/c/17575/4/fe/src/main/java/org/apache/impala/util/IcebergUtil.java@296 PS4, Line 296: switch (transformType) { : case "HOUR": case "HOURS": return TIcebergPartitionTransformType.HOUR; : case "DAY": case "DAYS": return TIcebergPartitionTransformType.DAY; : case "MONTH": case "MONTHS": return TIcebergPartitionTransformType.MONTH; : case "YEAR": case "YEARS": return TIcebergPartitionTransformType.YEAR; : default: : throw new TableLoadingException("Unsupported iceberg partition type: " + : transformType); : } nit: Maybe adding these transform type strings and the ones above to a String -> TIcebergPartitionTransformType immutable map would make the code shorter and simpler. 
-- To view, visit http://gerrit.cloudera.org:8080/17575 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib72ae445fd68fb0ab75d87b34779dbab922bbc62 Gerrit-Change-Number: 17575 Gerrit-PatchSet: 4 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng Gerrit-Comment-Date: Wed, 07 Jul 2021 14:04:52 + Gerrit-HasComments: Yes
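The map-based lookup suggested in the nit above could look roughly like the following. This is a Python sketch of the idea only, since the code under review is Java: `TransformType` stands in for TIcebergPartitionTransformType, `ValueError` stands in for TableLoadingException, and `parse_transform` is a hypothetical name. Only the time-based transforms quoted in the comment are covered.

```python
from enum import Enum
from types import MappingProxyType


class TransformType(Enum):
    """Stand-in for TIcebergPartitionTransformType (time-based values only)."""
    HOUR = "HOUR"
    DAY = "DAY"
    MONTH = "MONTH"
    YEAR = "YEAR"


# Read-only view, so the table cannot be modified after module load.
_TRANSFORMS = MappingProxyType({
    "HOUR": TransformType.HOUR, "HOURS": TransformType.HOUR,
    "DAY": TransformType.DAY, "DAYS": TransformType.DAY,
    "MONTH": TransformType.MONTH, "MONTHS": TransformType.MONTH,
    "YEAR": TransformType.YEAR, "YEARS": TransformType.YEAR,
})


def parse_transform(name):
    """Look up a transform by its exact name, mirroring the switch's default branch."""
    try:
        return _TRANSFORMS[name]
    except KeyError:
        raise ValueError("Unsupported iceberg partition type: " + name) from None
```

The lookup replaces one `case` pair per transform with a single table entry, so supporting a new spelling means adding a row rather than a branch.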
[Impala-ASF-CR] IMPALA-10627: Use standard parquet-related Iceberg table properties
Attila Jeges has uploaded a new patch set (#3). ( http://gerrit.cloudera.org:8080/17654 ) Change subject: IMPALA-10627: Use standard parquet-related Iceberg table properties .. IMPALA-10627: Use standard parquet-related Iceberg table properties This patch adds support for the following standard Iceberg properties: write.parquet.compression-codec: Parquet compression codec. Supported values are: NONE, GZIP, SNAPPY (default), LZ4, ZSTD. The table property will be ignored if COMPRESSION_CODEC query option is set. write.parquet.compression-level: Parquet compression level. Used with ZSTD compression only. Supported range is [1, 22]. Default value is 3. The table property will be ignored if COMPRESSION_CODEC query option is set. write.parquet.row-group-size-bytes : Parquet row group size in bytes. Supported range is [8388608, 2146435072] (8MB - 2047MB). Setting it to 0 signals that the table property should be ignored. The table property will also be ignored if PARQUET_FILE_SIZE query option is set. write.parquet.page-size-bytes: Parquet page size in bytes. Used for PLAIN encoding. Supported range is [65536, 1073741824] (64KB - 1GB). Setting it to 0 signals that the table property should be ignored. write.parquet.dict-size-bytes: Parquet dictionary page size in bytes. Used for dictionary encoding. Supported range is [65536, 1073741824] (64KB - 1GB). Setting it to 0 signals that the table property should be ignored. This patch also renames 'iceberg.file_format' table property to 'write.format.default' which is the standard Iceberg name for the table property. 
Change-Id: I3b8aa9a52c13c41b48310d2f7c9c7426e1ff5f23 --- M be/src/exec/parquet/hdfs-parquet-table-writer.cc M be/src/exec/parquet/hdfs-parquet-table-writer.h M be/src/runtime/descriptors.cc M be/src/runtime/descriptors.h M common/thrift/CatalogObjects.thrift M fe/src/main/java/org/apache/impala/analysis/AlterTableSetTblProperties.java M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java M fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java M fe/src/main/java/org/apache/impala/catalog/IcebergTable.java M fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergCtasTarget.java M fe/src/main/java/org/apache/impala/catalog/local/LocalIcebergTable.java M fe/src/main/java/org/apache/impala/service/IcebergCatalogOpExecutor.java M fe/src/main/java/org/apache/impala/util/IcebergUtil.java M testdata/datasets/functional/functional_schema_template.sql M testdata/workloads/functional-query/queries/QueryTest/iceberg-alter.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-create.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-insert.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-negative.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-query.test M testdata/workloads/functional-query/queries/QueryTest/show-create-table.test 20 files changed, 1,123 insertions(+), 82 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/54/17654/3 -- To view, visit http://gerrit.cloudera.org:8080/17654 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I3b8aa9a52c13c41b48310d2f7c9c7426e1ff5f23 Gerrit-Change-Number: 17654 Gerrit-PatchSet: 3 Gerrit-Owner: Attila Jeges Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: wangsheng
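The precedence described in this patch set for write.parquet.row-group-size-bytes (query option first, then a non-zero table property, otherwise Impala's existing calculation) can be sketched as a small decision function. The function name, its parameters, and the returned labels are hypothetical. The sketch only reports which source wins and makes no claim about how Impala derives the final size.

```python
def row_group_size_source(parquet_file_size_set, table_prop_value):
    """Return which setting decides the row group size.

    parquet_file_size_set: True if the PARQUET_FILE_SIZE query option is set.
    table_prop_value: integer value of the table property, or None if absent.
    """
    if parquet_file_size_set:
        return "query_option"      # The table property is ignored.
    if table_prop_value:           # None and 0 both mean "ignore the property".
        return "table_property"
    return "impala_default"        # Existing Impala behaviour applies.
```

For instance, `row_group_size_source(False, 0)` yields `"impala_default"`, reflecting the "setting it to 0" rule quoted in this patch set's commit message.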
[Impala-ASF-CR] IMPALA-10627: Use standard parquet-related Iceberg table properties
Attila Jeges has uploaded this change for review. ( http://gerrit.cloudera.org:8080/17654 ) Change subject: IMPALA-10627: Use standard parquet-related Iceberg table properties .. IMPALA-10627: Use standard parquet-related Iceberg table properties This patch adds support for the following standard Iceberg properties: write.parquet.compression-codec: Parquet compression codec. Supported values are: NONE, GZIP, SNAPPY (default), LZ4, ZSTD. The table property will be ignored if COMPRESSION_CODEC query option is set. write.parquet.compression-level: Parquet compression level. Used with ZSTD compression only. Supported range is [1, 22]. Default value is 3. The table property will be ignored if COMPRESSION_CODEC query option is set. write.parquet.row-group-size-bytes : Parquet row group size in bytes. Supported range is [8388608, 2146435072] (8MB - 2047MB). Setting it to 0 signals that the table property should be ignored. The table property will also be ignored if PARQUET_FILE_SIZE query option is set. write.parquet.page-size-bytes: Parquet page size in bytes. Used for PLAIN encoding. Supported range is [65536, 1073741824] (64KB - 1GB). Setting it to 0 signals that the table property should be ignored. write.parquet.dict-size-bytes: Parquet dictionary page size in bytes. Used for dictionary encoding. Supported range is [65536, 1073741824] (64KB - 1GB). Setting it to 0 signals that the table property should be ignored. This patch also renames 'iceberg.file_format' table property to 'write.format.default' which is the standard Iceberg name for the table property. 
Change-Id: I3b8aa9a52c13c41b48310d2f7c9c7426e1ff5f23 --- M be/src/exec/parquet/hdfs-parquet-table-writer.cc M be/src/exec/parquet/hdfs-parquet-table-writer.h M be/src/runtime/descriptors.cc M be/src/runtime/descriptors.h M common/thrift/CatalogObjects.thrift M fe/src/main/java/org/apache/impala/analysis/AlterTableSetTblProperties.java M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java M fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java M fe/src/main/java/org/apache/impala/catalog/IcebergTable.java M fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergCtasTarget.java M fe/src/main/java/org/apache/impala/catalog/local/LocalIcebergTable.java M fe/src/main/java/org/apache/impala/service/IcebergCatalogOpExecutor.java M fe/src/main/java/org/apache/impala/util/IcebergUtil.java M testdata/datasets/functional/functional_schema_template.sql M testdata/workloads/functional-query/queries/QueryTest/iceberg-alter.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-create.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-insert.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-negative.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-query.test M testdata/workloads/functional-query/queries/QueryTest/show-create-table.test 20 files changed, 1,123 insertions(+), 82 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/54/17654/2 -- To view, visit http://gerrit.cloudera.org:8080/17654 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I3b8aa9a52c13c41b48310d2f7c9c7426e1ff5f23 Gerrit-Change-Number: 17654 Gerrit-PatchSet: 2 Gerrit-Owner: Attila Jeges
[Impala-ASF-CR] IMPALA-10750: Impala-shell changes for HS2 compatibility
Attila Jeges has posted comments on this change. ( http://gerrit.cloudera.org:8080/17590 ) Change subject: IMPALA-10750: Impala-shell changes for HS2 compatibility .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/17590/2/shell/impala_client.py File shell/impala_client.py: http://gerrit.cloudera.org:8080/#/c/17590/2/shell/impala_client.py@846 PS2, Line 846: is_null.frombytes(tcol.nulls) Would it be possible for tcol.nulls to be None (no NULL values)? If so, this line will raise an exception. -- To view, visit http://gerrit.cloudera.org:8080/17590 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id3a4c4ce8a5d60db136df1743f32dba22172ee13 Gerrit-Change-Number: 17590 Gerrit-PatchSet: 2 Gerrit-Owner: Steve Carlin Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Tue, 15 Jun 2021 13:26:49 + Gerrit-HasComments: Yes
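The concern raised above (a missing null bitmap breaking the decode step) can be guarded against as in the sketch below. It uses plain Python instead of the bitarray call quoted in the comment, and it assumes that bit i of the bitmap lives in byte i // 8 at position i % 8, least significant bit first. The function name `decode_nulls` is hypothetical.

```python
def decode_nulls(nulls, num_values):
    """Return a list of booleans, True where the value at that index is NULL.

    nulls may be None or empty, in which case no value is treated as NULL.
    """
    if not nulls:
        return [False] * num_values
    bitmap = bytearray(nulls)  # Integer indexing on both Python 2 and 3.
    result = []
    for i in range(num_values):
        byte_idx = i // 8
        if byte_idx >= len(bitmap):
            # Bitmap shorter than the column: treat the remainder as non-NULL.
            result.append(False)
        else:
            result.append(bool((bitmap[byte_idx] >> (i % 8)) & 1))
    return result
```

Treating a short or absent bitmap as "no NULLs" is a design choice of this sketch; whether the server can actually send None here is the open question in the review comment.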
[Impala-ASF-CR] IMPALA-5121: Fix AVG() on timestamp col with use_local_tz_for_unix_timestamp_conversions
Attila Jeges has posted comments on this change. ( http://gerrit.cloudera.org:8080/17412 ) Change subject: IMPALA-5121: Fix AVG() on timestamp col with use_local_tz_for_unix_timestamp_conversions .. Patch Set 1: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/17412 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I999099de8e07269b96b75d473f5753be4479cecd Gerrit-Change-Number: 17412 Gerrit-PatchSet: 1 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Fri, 14 May 2021 08:09:46 + Gerrit-HasComments: No
[Impala-ASF-CR] POC: use puresasl instead of sasl in impala-shell
Attila Jeges has posted comments on this change. ( http://gerrit.cloudera.org:8080/17351 ) Change subject: POC: use puresasl instead of sasl in impala-shell .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/17351/4/infra/python/deps/requirements.txt File infra/python/deps/requirements.txt: http://gerrit.cloudera.org:8080/#/c/17351/4/infra/python/deps/requirements.txt@39 PS4, Line 39: pure-sasl == 0.6.2 Note that there's also a shell/packaging/requirements.txt that refers to sasl==0.2.1 -- To view, visit http://gerrit.cloudera.org:8080/17351 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iba5a15e867969938792d120cd8f1ad1ed6370906 Gerrit-Change-Number: 17351 Gerrit-PatchSet: 4 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Thu, 29 Apr 2021 16:11:33 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10662: Change EE tests to return the same results for HS2 as Beeswax
Attila Jeges has posted comments on this change. ( http://gerrit.cloudera.org:8080/17325 ) Change subject: IMPALA-10662: Change EE tests to return the same results for HS2 as Beeswax .. Patch Set 4: Code-Review+2 Thanks! -- To view, visit http://gerrit.cloudera.org:8080/17325 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If69ae90c6333ff245c2b951af5689e3071f85cb2 Gerrit-Change-Number: 17325 Gerrit-PatchSet: 4 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tamas Mate Gerrit-Comment-Date: Tue, 20 Apr 2021 16:33:30 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10662: Change EE tests to return the same results for HS2 as Beeswax
Attila Jeges has posted comments on this change. ( http://gerrit.cloudera.org:8080/17325 ) Change subject: IMPALA-10662: Change EE tests to return the same results for HS2 as Beeswax .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/17325/3/tests/common/impala_connection.py File tests/common/impala_connection.py: http://gerrit.cloudera.org:8080/#/c/17325/3/tests/common/impala_connection.py@301 PS3, Line 301: convert_types=False According to the impyla comments: convert_types : bool, optional When `False`, timestamps and decimal values will not be converted to Python `datetime` and `Decimal` values. (These conversions are expensive.) Only applies when using HS2 protocol versions > 6. The comment mentions DECIMAL and TIMESTAMP values but it doesn't mention FLOAT & DOUBLE values. I'm just curious, why were FLOAT/DOUBLE values previously converted to a lower precision value? -- To view, visit http://gerrit.cloudera.org:8080/17325 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If69ae90c6333ff245c2b951af5689e3071f85cb2 Gerrit-Change-Number: 17325 Gerrit-PatchSet: 3 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tamas Mate Gerrit-Comment-Date: Tue, 20 Apr 2021 16:18:47 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10536: Fix saml2_callback_token_ttl's description
Attila Jeges has posted comments on this change. ( http://gerrit.cloudera.org:8080/17107 ) Change subject: IMPALA-10536: Fix saml2_callback_token_ttl's description .. Patch Set 1: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/17107 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib1057f0c5694883d1b1e14075876c780d6c942a8 Gerrit-Change-Number: 17107 Gerrit-PatchSet: 1 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Mon, 22 Feb 2021 20:18:41 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10496: Remove checking port in FLAGS_saml2_sp_callback_url
Attila Jeges has posted comments on this change. ( http://gerrit.cloudera.org:8080/17087 ) Change subject: IMPALA-10496: Remove checking port in FLAGS_saml2_sp_callback_url .. Patch Set 2: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/17087 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2534b7a1a2bf16bf48ba533dc13fd300f690f4e5 Gerrit-Change-Number: 17087 Gerrit-PatchSet: 2 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Fri, 19 Feb 2021 13:24:58 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10234: Add support for cookie authentication to impala-shell
Attila Jeges has posted comments on this change. ( http://gerrit.cloudera.org:8080/16660 ) Change subject: IMPALA-10234: Add support for cookie authentication to impala-shell .. Patch Set 6: Code-Review+2 Fixed test failures. Carry +2 -- To view, visit http://gerrit.cloudera.org:8080/16660 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Icb0bc6e0f58f236866ca9913a2e63d97d5148f51 Gerrit-Change-Number: 16660 Gerrit-PatchSet: 6 Gerrit-Owner: Attila Jeges Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Tue, 17 Nov 2020 13:53:50 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10234: Add support for cookie authentication to impala-shell
Attila Jeges has uploaded a new patch set (#6). ( http://gerrit.cloudera.org:8080/16660 ) Change subject: IMPALA-10234: Add support for cookie authentication to impala-shell .. IMPALA-10234: Add support for cookie authentication to impala-shell IMPALA-8584 added support for cookie authentication to Impala. This change adds cookie authentication support to impala-shell as well when using 'hs2-http' protocol. Testing: - Unit tests were added to test cookie handling methods. - Tested e2e manually with nginx HTTP proxy. TODO: - Test with Knox HTTP proxy as well. Change-Id: Icb0bc6e0f58f236866ca9913a2e63d97d5148f51 --- M fe/src/test/java/org/apache/impala/customcluster/LdapImpalaShellTest.java M shell/ImpalaHttpClient.py A shell/cookie_util.py M shell/impala_client.py M shell/impala_shell.py M shell/make_shell_tarball.sh M shell/packaging/make_python_package.sh A tests/shell/test_cookie_util.py 8 files changed, 286 insertions(+), 14 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/60/16660/6 -- To view, visit http://gerrit.cloudera.org:8080/16660 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Icb0bc6e0f58f236866ca9913a2e63d97d5148f51 Gerrit-Change-Number: 16660 Gerrit-PatchSet: 6 Gerrit-Owner: Attila Jeges Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall
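For readers unfamiliar with cookie handling over HTTP, the idea described in the commit message can be sketched with Python's standard http.cookies module. This is only an illustration under stated assumptions: the helper names and the cookie name 'example.auth' are hypothetical and are not taken from shell/cookie_util.py.

```python
from http.cookies import SimpleCookie


def extract_cookies(set_cookie_headers, wanted_names):
    """Collect name -> value for the cookies we want to send back.

    set_cookie_headers: list of raw 'Set-Cookie' header values.
    wanted_names: set of cookie names the client should remember
                  (hypothetical; the real names are server-specific).
    """
    jar = {}
    for header in set_cookie_headers:
        parsed = SimpleCookie()
        parsed.load(header)
        for name, morsel in parsed.items():
            if name in wanted_names:
                jar[name] = morsel.value
    return jar


def build_cookie_header(jar):
    """Build the value of the 'Cookie' request header for the next call."""
    return "; ".join("%s=%s" % (name, value) for name, value in sorted(jar.items()))


# Hypothetical server response headers.
headers = ["example.auth=abc123; HttpOnly; Max-Age=3600", "other=1; Path=/"]
jar = extract_cookies(headers, {"example.auth"})
cookie_header = build_cookie_header(jar)
```

The client would attach cookie_header to subsequent requests so the server can skip re-authentication while the cookie is valid.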
[Impala-ASF-CR] IMPALA-10234: Add support for cookie authentication to impala-shell
Attila Jeges has uploaded a new patch set (#5). ( http://gerrit.cloudera.org:8080/16660 ) Change subject: IMPALA-10234: Add support for cookie authentication to impala-shell .. IMPALA-10234: Add support for cookie authentication to impala-shell IMPALA-8584 added support for cookie authentication to Impala. This change adds cookie authentication support to impala-shell as well when using 'hs2-http' protocol. Testing: - Unit tests were added to test cookie handling methods. - Tested e2e manually with nginx HTTP proxy. TODO: - Test with Knox HTTP proxy as well. Change-Id: Icb0bc6e0f58f236866ca9913a2e63d97d5148f51 --- M fe/src/test/java/org/apache/impala/customcluster/LdapImpalaShellTest.java M shell/ImpalaHttpClient.py A shell/cookie_util.py M shell/impala_client.py M shell/impala_shell.py A tests/shell/test_cookie_util.py 6 files changed, 284 insertions(+), 14 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/60/16660/5 -- To view, visit http://gerrit.cloudera.org:8080/16660 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Icb0bc6e0f58f236866ca9913a2e63d97d5148f51 Gerrit-Change-Number: 16660 Gerrit-PatchSet: 5 Gerrit-Owner: Attila Jeges Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall
[Impala-ASF-CR] IMPALA-10234: Add support for cookie authentication to impala-shell
Attila Jeges has uploaded a new patch set (#4). ( http://gerrit.cloudera.org:8080/16660 ) Change subject: IMPALA-10234: Add support for cookie authentication to impala-shell .. IMPALA-10234: Add support for cookie authentication to impala-shell IMPALA-8584 added support for cookie authentication to Impala. This change adds cookie authentication support to impala-shell as well when using 'hs2-http' protocol. Testing: - Unit tests were added to test cookie handling methods. - Tested e2e manually with nginx HTTP proxy. TODO: - Test with Knox HTTP proxy as well. Change-Id: Icb0bc6e0f58f236866ca9913a2e63d97d5148f51 --- M fe/src/test/java/org/apache/impala/customcluster/LdapImpalaShellTest.java M shell/ImpalaHttpClient.py A shell/cookie_util.py M shell/impala_client.py M shell/impala_shell.py A tests/shell/test_cookie_util.py 6 files changed, 284 insertions(+), 14 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/60/16660/4 -- To view, visit http://gerrit.cloudera.org:8080/16660 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Icb0bc6e0f58f236866ca9913a2e63d97d5148f51 Gerrit-Change-Number: 16660 Gerrit-PatchSet: 4 Gerrit-Owner: Attila Jeges Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall
[Impala-ASF-CR] IMPALA-10234: Add support for cookie authentication to impala-shell
Attila Jeges has uploaded a new patch set (#3). ( http://gerrit.cloudera.org:8080/16660 ) Change subject: IMPALA-10234: Add support for cookie authentication to impala-shell .. IMPALA-10234: Add support for cookie authentication to impala-shell IMPALA-8584 added support for cookie authentication to Impala. This change adds cookie authentication support to impala-shell as well when using 'hs2-http' protocol. Testing: - Unit tests were added to test cookie handling methods. - Tested e2e manually. Change-Id: Icb0bc6e0f58f236866ca9913a2e63d97d5148f51 --- M shell/ImpalaHttpClient.py A shell/cookie_util.py M shell/impala_client.py M shell/impala_shell.py A tests/shell/test_cookie_util.py 5 files changed, 314 insertions(+), 56 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/60/16660/3 -- To view, visit http://gerrit.cloudera.org:8080/16660 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Icb0bc6e0f58f236866ca9913a2e63d97d5148f51 Gerrit-Change-Number: 16660 Gerrit-PatchSet: 3 Gerrit-Owner: Attila Jeges Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall
[Impala-ASF-CR] IMPALA-10234: Add support for cookie authentication to impala-shell
Attila Jeges has uploaded a new patch set (#2). ( http://gerrit.cloudera.org:8080/16660 ) Change subject: IMPALA-10234: Add support for cookie authentication to impala-shell .. IMPALA-10234: Add support for cookie authentication to impala-shell IMPALA-8584 added support for cookie authentication to Impala. This change adds cookie authentication support to impala-shell as well when using 'hs2-http' protocol. Testing: - Unit tests were added to test cookie handling methods. - Tested e2e manually. Change-Id: Icb0bc6e0f58f236866ca9913a2e63d97d5148f51 --- M shell/ImpalaHttpClient.py A shell/cookie_util.py M shell/impala_client.py M shell/impala_shell.py A tests/shell/test_cookie_util.py 5 files changed, 314 insertions(+), 56 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/60/16660/2 -- To view, visit http://gerrit.cloudera.org:8080/16660 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Icb0bc6e0f58f236866ca9913a2e63d97d5148f51 Gerrit-Change-Number: 16660 Gerrit-PatchSet: 2 Gerrit-Owner: Attila Jeges Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall
[Impala-ASF-CR] IMPALA-10234: Add support for cookie authentication to impala-shell
Attila Jeges has uploaded this change for review. ( http://gerrit.cloudera.org:8080/16660 Change subject: IMPALA-10234: Add support for cookie authentication to impala-shell .. IMPALA-10234: Add support for cookie authentication to impala-shell IMPALA-8584 added support for cookie authentication to Impala. This change adds cookie authentication support to impala-shell as well when using 'hs2-http' protocol. Testing: - Unit tests were added to test cookie handling methods. - Tested e2e manually. Change-Id: Icb0bc6e0f58f236866ca9913a2e63d97d5148f51 --- M shell/ImpalaHttpClient.py A shell/cookie_util.py M shell/impala_client.py M shell/impala_shell.py A tests/shell/test_cookie_util.py 5 files changed, 307 insertions(+), 56 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/60/16660/1 -- To view, visit http://gerrit.cloudera.org:8080/16660 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Icb0bc6e0f58f236866ca9913a2e63d97d5148f51 Gerrit-Change-Number: 16660 Gerrit-PatchSet: 1 Gerrit-Owner: Attila Jeges
[Impala-ASF-CR] IMPALA-10224: Add startup flag not to expose debug web url to clients
Attila Jeges has uploaded a new patch set (#2). ( http://gerrit.cloudera.org:8080/16573 ) Change subject: IMPALA-10224: Add startup flag not to expose debug web url to clients .. IMPALA-10224: Add startup flag not to expose debug web url to clients This patch introduces a new startup flag --ping_expose_webserver_url (true by default) to control whether PingImpalaService, PingImpalaHS2Service RPC calls should expose the debug web url to the client or not. This is necessary as the debug web UI is not something that end-users will necessarily have access to. If the flag is set to false, the RPC calls will return an empty string instead of the real url signalling that the debug web ui is not available. Note that if the webserver is disabled (--enable_webserver flag is set to false) the RPC calls will behave the same and return an empty string for the url. Change-Id: I7ec3e92764d712b8fee63c1f45b038c31c184cfc --- M be/src/runtime/exec-env.cc M be/src/runtime/exec-env.h M be/src/service/impala-beeswax-server.cc M be/src/service/impala-hs2-server.cc M shell/impala_client.py M shell/impala_shell.py M tests/custom_cluster/test_web_pages.py 7 files changed, 62 insertions(+), 15 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/73/16573/2 -- To view, visit http://gerrit.cloudera.org:8080/16573 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7ec3e92764d712b8fee63c1f45b038c31c184cfc Gerrit-Change-Number: 16573 Gerrit-PatchSet: 2 Gerrit-Owner: Attila Jeges Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-10224: Add startup flag not to expose debug web url to clients
Attila Jeges has posted comments on this change. ( http://gerrit.cloudera.org:8080/16573 )

Change subject: IMPALA-10224: Add startup flag not to expose debug web url to clients
..

Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/16573/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/16573/1//COMMIT_MSG@7
PS1, Line 7: IMPALA-10224: Add startup flag not to expose debug web url to clients
> It looks like one of the calls to print the query link in impala_shell is g
Good catch, thanks! I've also noticed that there's another get_query_link() call in impala_client.py. I've added the guard there too.

-- To view, visit http://gerrit.cloudera.org:8080/16573 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7ec3e92764d712b8fee63c1f45b038c31c184cfc Gerrit-Change-Number: 16573 Gerrit-PatchSet: 1 Gerrit-Owner: Attila Jeges Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 13 Oct 2020 09:54:18 + Gerrit-HasComments: Yes
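The kind of client-side guard mentioned in this comment might look roughly like the sketch below. It is not the code from impala_shell.py or impala_client.py: the function name and the URL layout are assumptions made for illustration only.

```python
def format_query_link(webserver_url, query_id):
    """Return a printable query link, or None when no debug web URL is known.

    webserver_url: the URL returned by the ping RPC; an empty string means
                   the server did not expose its debug web UI.
    The '/query_plan?query_id=' path is a hypothetical layout.
    """
    if not webserver_url:
        # Guard: nothing to print when the server returned an empty URL.
        return None
    return "%s/query_plan?query_id=%s" % (webserver_url.rstrip("/"), query_id)


link = format_query_link("http://host:25000/", "abc")
no_link = format_query_link("", "abc")
```

Callers would print the link only when the return value is not None.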
[Impala-ASF-CR] IMPALA-10224: Add startup flag not to expose debug web url to clients
Attila Jeges has uploaded this change for review. ( http://gerrit.cloudera.org:8080/16573 Change subject: IMPALA-10224: Add startup flag not to expose debug web url to clients .. IMPALA-10224: Add startup flag not to expose debug web url to clients This patch introduces a new startup flag --ping_expose_webserver_url (true by default) to control whether PingImpalaService, PingImpalaHS2Service RPC calls should expose the debug web url to the client or not. This is necessary as the debug web UI is not something that end-users will necessarily have access to. If the flag is set to false, the RPC calls will return an empty string instead of the real url signalling that the debug web ui is not available. Note that if the webserver is disabled (--enable_webserver flag is set to false) the RPC calls will behave the same and return an empty string for the url. Change-Id: I7ec3e92764d712b8fee63c1f45b038c31c184cfc --- M be/src/runtime/exec-env.cc M be/src/runtime/exec-env.h M be/src/service/impala-beeswax-server.cc M be/src/service/impala-hs2-server.cc M tests/custom_cluster/test_web_pages.py 5 files changed, 48 insertions(+), 3 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/73/16573/1 -- To view, visit http://gerrit.cloudera.org:8080/16573 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I7ec3e92764d712b8fee63c1f45b038c31c184cfc Gerrit-Change-Number: 16573 Gerrit-PatchSet: 1 Gerrit-Owner: Attila Jeges
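The behaviour described in the commit message reduces to a small decision rule. The sketch below restates it in Python purely for illustration; the function name is hypothetical and the actual change lives in the C++ backend files listed above.

```python
def webserver_url_for_ping(enable_webserver, ping_expose_webserver_url, real_url):
    """Return the URL a ping RPC would report under the described rules.

    An empty string signals that the debug web UI is not available,
    either because the webserver is disabled or because the new flag
    hides the URL from clients.
    """
    if enable_webserver and ping_expose_webserver_url:
        return real_url
    return ""


url = "http://host:25000"  # hypothetical debug web URL
```

Both "webserver disabled" and "URL hidden" look identical to the client, which is why a single empty-string check on the client side is sufficient.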
[Impala-ASF-CR] IMPALA-10225: bump impyla version to 0.17a1
Attila Jeges has posted comments on this change. ( http://gerrit.cloudera.org:8080/16562 ) Change subject: IMPALA-10225: bump impyla version to 0.17a1 .. Patch Set 2: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/16562 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I70a0e883275f3c29e2b01fd5bab7725857c8a1ed Gerrit-Change-Number: 16562 Gerrit-PatchSet: 2 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Fri, 09 Oct 2020 14:09:42 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10054: Fix flakiness in test_multiple_sort_run_bytes_limits
Attila Jeges has posted comments on this change. ( http://gerrit.cloudera.org:8080/16301 ) Change subject: IMPALA-10054: Fix flakiness in test_multiple_sort_run_bytes_limits .. Patch Set 3: Code-Review+2 Thanks for the explanation! -- To view, visit http://gerrit.cloudera.org:8080/16301 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I84a8b579c943cddba4432cf183f7f002ef8ec6ad Gerrit-Change-Number: 16301 Gerrit-PatchSet: 3 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Sun, 09 Aug 2020 11:47:17 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10054: Fix flakiness in test_multiple_sort_run_bytes_limits
Attila Jeges has posted comments on this change. ( http://gerrit.cloudera.org:8080/16301 )

Change subject: IMPALA-10054: Fix flakiness in test_multiple_sort_run_bytes_limits
..

Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/16301/1/tests/query_test/test_sort.py
File tests/query_test/test_sort.py:

http://gerrit.cloudera.org:8080/#/c/16301/1/tests/query_test/test_sort.py@90
PS1, Line 90: ' - SpilledRuns:.*'
> nit: Perhaps you could use a more complete regex pattern here:
Also, please use raw strings for regex patterns, e.g.:

    r'\s+\- SpilledRuns: %s' % spilled_runs

-- To view, visit http://gerrit.cloudera.org:8080/16301 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I84a8b579c943cddba4432cf183f7f002ef8ec6ad Gerrit-Change-Number: 16301 Gerrit-PatchSet: 1 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Fri, 07 Aug 2020 11:02:29 + Gerrit-HasComments: Yes
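A short illustration of why raw strings are preferred for regex patterns: in a raw string the backslash is kept literally, so the regex engine sees exactly what was typed. The value of spilled_runs and the sample profile line are made up for this sketch.

```python
import re

spilled_runs = 3  # hypothetical expected counter value

# Raw string: backslashes reach the regex engine unchanged.
pattern = r'\s+\- SpilledRuns: %s' % spilled_runs

# In a normal string literal '\n' is one newline character,
# while the raw literal r'\n' is a backslash followed by 'n'.
raw_len = len(r'\n')
cooked_len = len('\n')

profile_line = "     - SpilledRuns: 3 (3)"  # made-up runtime profile line
match = re.search(pattern, profile_line)
```

Escapes such as '\s' happen to survive in a normal string because Python does not recognise them, but relying on that triggers warnings in newer Python versions, so the raw form is the safer habit.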
[Impala-ASF-CR] IMPALA-10054: Fix flakiness in test_multiple_sort_run_bytes_limits
Attila Jeges has posted comments on this change. ( http://gerrit.cloudera.org:8080/16301 )

Change subject: IMPALA-10054: Fix flakiness in test_multiple_sort_run_bytes_limits
..

Patch Set 1: Code-Review+1

(1 comment)

http://gerrit.cloudera.org:8080/#/c/16301/1/tests/query_test/test_sort.py
File tests/query_test/test_sort.py:

http://gerrit.cloudera.org:8080/#/c/16301/1/tests/query_test/test_sort.py@90
PS1, Line 90: ' - SpilledRuns:.*'
nit: Perhaps you could use a more complete regex pattern here:

    '\s+\- SpilledRuns: %s.*' % spilled_runs

and then you can remove the extra check in L92.

You can also use re.search() instead of re.findall() since you don't need to scan the whole runtime profile after the first match.

-- To view, visit http://gerrit.cloudera.org:8080/16301 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I84a8b579c943cddba4432cf183f7f002ef8ec6ad Gerrit-Change-Number: 16301 Gerrit-PatchSet: 1 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Fri, 07 Aug 2020 10:02:09 + Gerrit-HasComments: Yes
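The difference between the two calls can be sketched as follows. The profile text and the helper name are invented for illustration; this is not the code in tests/query_test/test_sort.py.

```python
import re


def has_spilled_runs(profile, spilled_runs):
    """True if the profile contains a SpilledRuns counter with this value.

    re.search() stops at the first match, so the rest of the (possibly
    large) runtime profile is never scanned.
    """
    pattern = r'\s+\- SpilledRuns: %s.*' % spilled_runs
    return re.search(pattern, profile) is not None


# Made-up runtime profile fragment with the counter appearing twice.
profile = """Sort Node:
   - SpilledRuns: 3 (3)
   - SortDataSize: 10.00 MB
Other Node:
   - SpilledRuns: 3 (3)
"""

# re.findall() keeps scanning and returns every match.
all_matches = re.findall(r'\s+\- SpilledRuns: 3.*', profile)
```

Because the expected value is part of the pattern, a separate assertion on the captured number is no longer needed. One caveat of this sketch: a pattern ending in '3.*' would also accept a value such as 30, so a stricter test might anchor the number with a following space.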
[Impala-ASF-CR] IMPALA-10006: handle non-writable /opt/impala/logs
Attila Jeges has posted comments on this change. ( http://gerrit.cloudera.org:8080/16237 ) Change subject: IMPALA-10006: handle non-writable /opt/impala/logs .. Patch Set 2: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/16237 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If32d6eef75422b51f8877478bbfb1a709c02f756 Gerrit-Change-Number: 16237 Gerrit-PatchSet: 2 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Thu, 30 Jul 2020 16:34:37 + Gerrit-HasComments: No
[Impala-ASF-CR] WIP IMPALA-9482 Support for BINARY columns
Attila Jeges has posted comments on this change. ( http://gerrit.cloudera.org:8080/16066 )

Change subject: WIP IMPALA-9482 Support for BINARY columns
..

Patch Set 2:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/16066/2/be/src/runtime/types.cc
File be/src/runtime/types.cc:

http://gerrit.cloudera.org:8080/#/c/16066/2/be/src/runtime/types.cc@122
PS2, Line 122: ToThrift(PrimitiveType ptype)
Should this function work as the inverse of ThriftToType()? If so, shouldn't it take an AuxColumnType parameter as well?

http://gerrit.cloudera.org:8080/#/c/16066/2/be/src/runtime/types.cc@222
PS2, Line 222: ColumnType::ToThrift(TColumnType* thrift_type)
Same as above, maybe it now needs an additional AuxColumnType parameter?

http://gerrit.cloudera.org:8080/#/c/16066/2/fe/src/main/java/org/apache/impala/analysis/CastExpr.java
File fe/src/main/java/org/apache/impala/analysis/CastExpr.java:

http://gerrit.cloudera.org:8080/#/c/16066/2/fe/src/main/java/org/apache/impala/analysis/CastExpr.java@126
PS2, Line 126: // No built-in function needed for BINARY <-> STRING conversion.
             : if (fromType.getPrimitiveType() == PrimitiveType.BINARY ||
             :     toType.getPrimitiveType() == PrimitiveType.BINARY){
             :   continue;
             : }
BINARY<->STRING conversions are no-op conversions, so maybe this block should be moved after L191.

http://gerrit.cloudera.org:8080/#/c/16066/2/fe/src/main/java/org/apache/impala/catalog/Type.java
File fe/src/main/java/org/apache/impala/catalog/Type.java:

http://gerrit.cloudera.org:8080/#/c/16066/2/fe/src/main/java/org/apache/impala/catalog/Type.java@820
PS2, Line 820: // STRING <-> BINARY conversion is not lossy, but implicit cast is not allowed.
I'm probably misunderstanding something, but the commit msg suggests that BINARY to STRING implicit conversion is supported:

"UDF/UDAFs that expect STRING argument accept BINARY too, while in Hive explicit cast is needed in this case."

Please clarify.

http://gerrit.cloudera.org:8080/#/c/16066/2/fe/src/main/java/org/apache/impala/util/AvroSchemaConverter.java
File fe/src/main/java/org/apache/impala/util/AvroSchemaConverter.java:

http://gerrit.cloudera.org:8080/#/c/16066/2/fe/src/main/java/org/apache/impala/util/AvroSchemaConverter.java@154
PS2, Line 154: case BINARY: return Schema.create(Schema.Type.STRING);
I'm not sure about this, maybe binary should be converted to avro bytes. The avro documentation on primitive types states that:

    bytes: sequence of 8-bit unsigned bytes
    string: unicode character sequence

http://gerrit.cloudera.org:8080/#/c/16066/2/testdata/bin/generate-schema-statements.py
File testdata/bin/generate-schema-statements.py:

http://gerrit.cloudera.org:8080/#/c/16066/2/testdata/bin/generate-schema-statements.py@213
PS2, Line 213: string
Again, avro has a bytes type which might be a better fit for binary.

-- To view, visit http://gerrit.cloudera.org:8080/16066 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I36861a9ca6c2047b0d76862507c86f7f153bc582 Gerrit-Change-Number: 16066 Gerrit-PatchSet: 2 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Thu, 18 Jun 2020 15:24:12 + Gerrit-HasComments: Yes
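The concern about mapping BINARY to an Avro string can be made concrete with a small check: a string is a Unicode character sequence, and not every byte sequence is valid text. The sketch below uses UTF-8 as the text encoding and a hypothetical helper name; it does not use any Avro library.

```python
def fits_unicode_string(value):
    """True if the bytes can be interpreted as UTF-8 text.

    Arbitrary binary data often fails this check, which is the case
    where a 'bytes' type is the safer schema choice.
    """
    try:
        value.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


text_like = b"hello"
binary_blob = b"\xff\xfe\x00"  # 0xff can never start a UTF-8 sequence
```

A BINARY column can hold values like binary_blob, so a schema that declares the field as a string may reject or corrupt such data, while a bytes field carries it unchanged.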
[Impala-ASF-CR] IMPALA-9555 part 2: [Hive3] Fix test failure introduced by HIVE-22589
Attila Jeges has uploaded this change for review. ( http://gerrit.cloudera.org:8080/15618 Change subject: IMPALA-9555 part 2: [Hive3] Fix test failure introduced by HIVE-22589 .. IMPALA-9555 part 2: [Hive3] Fix test failure introduced by HIVE-22589 This patch is a continuation of IMPALA-9555. It makes Avro DATE tests more resilient by using regex for expected error messages instead of using concrete error messages. Change-Id: I36340be70a37b75997cf49625a173ec2690ed9b8 --- M testdata/workloads/functional-query/queries/QueryTest/avro_date.test 1 file changed, 8 insertions(+), 8 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/18/15618/1 -- To view, visit http://gerrit.cloudera.org:8080/15618 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I36340be70a37b75997cf49625a173ec2690ed9b8 Gerrit-Change-Number: 15618 Gerrit-PatchSet: 1 Gerrit-Owner: Attila Jeges
[Impala-ASF-CR] IMPALA-9555: [Hive3] Fix test failure introduced by HIVE-22589
Attila Jeges has posted comments on this change. ( http://gerrit.cloudera.org:8080/15564 )

Change subject: IMPALA-9555: [Hive3] Fix test failure introduced by HIVE-22589
..

Patch Set 1:

> > 1. the test is skipped for ORC (not sure if this is on purpose or
> by accident).
> My guess is that updating this test was forgotten in the quite
> recent https://gerrit.cloudera.org/#/c/14982/
>
> I think that in the ideal case we should test both: Julian to test
> that invalid dates are handled properly (this probably has to be
> file format specific, as error messages are different) and
> Gregorian to have a more extended suite of tests that can run on
> more file formats.
>
> The change itself looks good to me, but I am worried about the back
> and forth changes in Hive.

Thanks for the review. Let's merge this in now to unblock the core test suite.

I agree that DATE testing across different fileformats and Hive versions is pretty messy now and we don't cover all the different scenarios. There's a lot of room for improvement, but we should address that in a separate patch-set.

-- To view, visit http://gerrit.cloudera.org:8080/15564 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I51dd933867ea7877235e7f6e1f2b56711dca107e Gerrit-Change-Number: 15564 Gerrit-PatchSet: 1 Gerrit-Owner: Attila Jeges Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Comment-Date: Fri, 27 Mar 2020 12:39:37 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9555: [Hive3] Fix test failure introduced by HIVE-22589
Attila Jeges has posted comments on this change. ( http://gerrit.cloudera.org:8080/15564 )

Change subject: IMPALA-9555: [Hive3] Fix test failure introduced by HIVE-22589
..

Patch Set 1:

> I have a basic design question: couldn't we set hive.avro.proleptic.gregorian
> to true during dataload instead of changing the tests? As other
> formats use gregorian as far as I know, this seems a better to me,
> at least to test interop with Impala.

Parquet and Orc fileformats have the same issues with the DATE type as Avro. They may also use Gregorian or Julian Calendar depending on which version of Hive they were written by.

The failing test is failing only for Avro because:
1. the test is skipped for ORC (not sure if this is on purpose or by accident).
2. the Parquet test table has been written by Impala (instead of Hive) during the data load.

We also have tests for ORC and Parquet to demonstrate the issues related to the Julian vs Gregorian Calendars, but they use pre-created ORC/Parquet files (written by Hive2) and are not affected by HIVE-22589.

I don't see much value in forcing Gregorian Calendar for writing Avro tables. The rewritten tests show the default behavior users can expect: pre-1582-10-12 DATEs are incorrect, but everything after that is working fine.

-- To view, visit http://gerrit.cloudera.org:8080/15564 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I51dd933867ea7877235e7f6e1f2b56711dca107e Gerrit-Change-Number: 15564 Gerrit-PatchSet: 1 Gerrit-Owner: Attila Jeges Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Comment-Date: Thu, 26 Mar 2020 20:38:30 + Gerrit-HasComments: No
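To make the Julian vs Gregorian discrepancy concrete, the sketch below converts a Julian-calendar date to the proleptic Gregorian calendar through the Julian Day Number. It is a standalone illustration of the calendar arithmetic, not code from Hive or Impala, and the function names are hypothetical. Python's datetime.date is proleptic Gregorian, which makes it a convenient reference.

```python
from datetime import date


def julian_to_jdn(year, month, day):
    """Julian Day Number of a date given in the Julian calendar."""
    a = (14 - month) // 12
    y = year + 4800 - a
    m = month + 12 * a - 3
    return day + (153 * m + 2) // 5 + 365 * y + y // 4 - 32083


def julian_to_proleptic_gregorian(year, month, day):
    """Same physical day, expressed in the proleptic Gregorian calendar.

    date.fromordinal(1) is 0001-01-01 (proleptic Gregorian), whose
    Julian Day Number is 1721426, hence the constant offset below.
    """
    return date.fromordinal(julian_to_jdn(year, month, day) - 1721425)


# Around the historical switch the two calendars were 10 days apart.
switch_day = julian_to_proleptic_gregorian(1582, 10, 5)
day_before = julian_to_proleptic_gregorian(1582, 10, 4)
```

A writer using the hybrid Julian/Gregorian calendar and a reader using the proleptic Gregorian calendar therefore disagree about historical dates by the number of days shown here, while agreeing on dates after the switch.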
[Impala-ASF-CR] IMPALA-9555: [Hive3] Fix test failure introduced by HIVE-22589
Attila Jeges has uploaded this change for review. ( http://gerrit.cloudera.org:8080/15564 Change subject: IMPALA-9555: [Hive3] Fix test failure introduced by HIVE-22589 .. IMPALA-9555: [Hive3] Fix test failure introduced by HIVE-22589 With HIVE-22589 Hive3 switched back to using Julian Calendar for historical dates by default which caused an Impala test failure around Avro DATE values. Change-Id: I51dd933867ea7877235e7f6e1f2b56711dca107e --- M testdata/workloads/functional-query/queries/QueryTest/avro_date.test M tests/query_test/test_date_queries.py 2 files changed, 33 insertions(+), 21 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/64/15564/1 -- To view, visit http://gerrit.cloudera.org:8080/15564 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I51dd933867ea7877235e7f6e1f2b56711dca107e Gerrit-Change-Number: 15564 Gerrit-PatchSet: 1 Gerrit-Owner: Attila Jeges
[native-toolchain-CR] IMPALA-9226: Add patch to ORC-1.6.2 in the toolchain
Attila Jeges has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/15265 ) Change subject: IMPALA-9226: Add patch to ORC-1.6.2 in the toolchain .. IMPALA-9226: Add patch to ORC-1.6.2 in the toolchain This commit adds a bugfix patch that blocks IMPALA-9226. Tests: - Run query_test for orc/def/block locally. - Builds succeeded in all supported platforms. Change-Id: I0f86d9493d3907e51a8d559adeb4f4b042379457 Reviewed-on: http://gerrit.cloudera.org:8080/15265 Reviewed-by: Quanlong Huang Tested-by: Attila Jeges --- M buildall.sh A source/orc/orc-1.6.2-patches/0007-ORC-600-Fix-StringDictionaryColumnReader-to-update-i.patch 2 files changed, 154 insertions(+), 1 deletion(-) Approvals: Quanlong Huang: Looks good to me, approved Attila Jeges: Verified -- To view, visit http://gerrit.cloudera.org:8080/15265 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I0f86d9493d3907e51a8d559adeb4f4b042379457 Gerrit-Change-Number: 15265 Gerrit-PatchSet: 2 Gerrit-Owner: Norbert Luksa Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Quanlong Huang
[native-toolchain-CR] IMPALA-9226: Add patch to ORC-1.6.2 in the toolchain
Attila Jeges has posted comments on this change. ( http://gerrit.cloudera.org:8080/15265 ) Change subject: IMPALA-9226: Add patch to ORC-1.6.2 in the toolchain .. Patch Set 1: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/15265 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0f86d9493d3907e51a8d559adeb4f4b042379457 Gerrit-Change-Number: 15265 Gerrit-PatchSet: 1 Gerrit-Owner: Norbert Luksa Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Mon, 24 Feb 2020 11:33:59 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9395: fix duplicate broadcast SetFilter() calls
Attila Jeges has posted comments on this change. ( http://gerrit.cloudera.org:8080/15242 ) Change subject: IMPALA-9395: fix duplicate broadcast SetFilter() calls .. Patch Set 1: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/15242 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I95d0620c4dbb5e4066702db48442cebee7389f5a Gerrit-Change-Number: 15242 Gerrit-PatchSet: 1 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Fri, 21 Feb 2020 13:46:18 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9385: Unix time conversion cleanup + ORC fix
Attila Jeges has posted comments on this change. ( http://gerrit.cloudera.org:8080/15222 ) Change subject: IMPALA-9385: Unix time conversion cleanup + ORC fix .. Patch Set 12: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/15222 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I14e2a7e512ccd013d5d9fe480a5467ed4c46b76e Gerrit-Change-Number: 15222 Gerrit-PatchSet: 12 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Norbert Luksa Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Fri, 21 Feb 2020 13:45:56 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9036: Fix CTRL+C a multiline query in impala-shell
Attila Jeges has posted comments on this change. ( http://gerrit.cloudera.org:8080/15233 ) Change subject: IMPALA-9036: Fix CTRL+C a multiline query in impala-shell .. Patch Set 4: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/15233 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id8d8bdaee929e2655eb66e886ae92a02d3fbd83f Gerrit-Change-Number: 15233 Gerrit-PatchSet: 4 Gerrit-Owner: Adam Tamas Gerrit-Reviewer: Adam Tamas Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Wed, 19 Feb 2020 15:24:12 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9385: Unix time conversion cleanup + ORC fix
Attila Jeges has posted comments on this change. ( http://gerrit.cloudera.org:8080/15222 ) Change subject: IMPALA-9385: Unix time conversion cleanup + ORC fix .. Patch Set 7: (1 comment) http://gerrit.cloudera.org:8080/#/c/15222/7/be/src/exec/data-source-scan-node.cc File be/src/exec/data-source-scan-node.cc: http://gerrit.cloudera.org:8080/#/c/15222/7/be/src/exec/data-source-scan-node.cc@352 PS7, Line 352: // TODO The timezone depends on flag use_local_tz_for_unix_timestamp_conversions. : // Check if this is the intended behaviour. : RETURN_IF_ERROR(MaterializeNextRow( : state->time_zone_for_unix_time_conversions(), tuple_pool, tuple)); > I was thinking about UTCPTR instead. Using local_time_zone() would mean tha Yes, you're correct, it should be UTCPTR. -- To view, visit http://gerrit.cloudera.org:8080/15222 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I14e2a7e512ccd013d5d9fe480a5467ed4c46b76e Gerrit-Change-Number: 15222 Gerrit-PatchSet: 7 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Norbert Luksa Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Wed, 19 Feb 2020 13:05:03 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9036: Fix CTRL+C a multiline query in impala-shell
Attila Jeges has posted comments on this change. ( http://gerrit.cloudera.org:8080/15233 ) Change subject: IMPALA-9036: Fix CTRL+C a multiline query in impala-shell .. Patch Set 2: (3 comments) http://gerrit.cloudera.org:8080/#/c/15233/2/tests/shell/test_shell_interactive.py File tests/shell/test_shell_interactive.py: http://gerrit.cloudera.org:8080/#/c/15233/2/tests/shell/test_shell_interactive.py@252 PS2, Line 252: "wrong\n1", "[1]: incorrect\n2", : "select 3 --comment\n", "[2]: select 4 --comment", : "select 5 --comment\n\n\n", "[3]: select 6 --comment", : "select /*comment*/\n7", "[4]: select /*comment*/\n8", : "select\n/*comm\nent*/\n9", "[5]: select\n/*comm\nent*/\n10" I'd use input lines like "line 1", "line 2", "one", "two" or something similar, with newlines scattered through. You can throw in some SQL- or C-style comments too for good measure, but keep them short. In general, the idea is to keep these input lines as simple as possible so that the intent is clear: we expect these erroneous lines to be ignored because of Ctrl-C. http://gerrit.cloudera.org:8080/#/c/15233/2/tests/shell/test_shell_interactive.py@256 PS2, Line 256: "[5]: select\n/*comm\nent*/\n10" I think there are two test cases here that we should address: 1. When the last line before Ctrl-C ends with a newline. 2. When it doesn't. http://gerrit.cloudera.org:8080/#/c/15233/2/tests/shell/test_shell_interactive.py@259 PS2, Line 259: child_proc.sendintr() If the very last line before Ctrl-C ends with a newline, you should add a child_proc.expect(' >') before line 259 to make it clear that impala-shell is waiting for more input. 
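The two cases described in the comments above can be sketched as a small planning helper. This is only an illustration, not the code from the patch: `plan_ctrl_c_steps` and the step tuples are hypothetical names, and the only details taken from the review are the `' >'` continuation prompt and the `sendintr()` call. A real test would feed these steps to the pexpect child process.

```python
# Hypothetical helper: plans the driver steps for a Ctrl-C test case.
# Only the ' >' prompt and sendintr() come from the review comments.
CONTINUATION_PROMPT = " >"


def plan_ctrl_c_steps(input_lines):
    """Return (action, argument) steps for a pexpect-style driver."""
    steps = [("send", line) for line in input_lines]
    # Case 1: the last line ends with a newline, so the shell should show
    # the continuation prompt before we interrupt it.
    if input_lines and input_lines[-1].endswith("\n"):
        steps.append(("expect", CONTINUATION_PROMPT))
    # Case 2 (no trailing newline) goes straight to the interrupt.
    steps.append(("sendintr", None))
    return steps


with_newline = plan_ctrl_c_steps(["line 1\n", "line 2\n"])
without_newline = plan_ctrl_c_steps(["one\n", "two"])
```

Keeping the inputs as plain as "line 1" and "two" follows the suggestion in the first comment: the lines only need to be recognizably wrong, since the expectation is that Ctrl-C discards them.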
-- To view, visit http://gerrit.cloudera.org:8080/15233 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id8d8bdaee929e2655eb66e886ae92a02d3fbd83f Gerrit-Change-Number: 15233 Gerrit-PatchSet: 2 Gerrit-Owner: Adam Tamas Gerrit-Reviewer: Adam Tamas Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Wed, 19 Feb 2020 10:35:41 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9385: Unix time conversion cleanup + ORC fix
Attila Jeges has posted comments on this change. ( http://gerrit.cloudera.org:8080/15222 ) Change subject: IMPALA-9385: Unix time conversion cleanup + ORC fix .. Patch Set 7: (10 comments) http://gerrit.cloudera.org:8080/#/c/15222/6//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/15222/6//COMMIT_MSG@13 PS6, Line 13: is was was no ? http://gerrit.cloudera.org:8080/#/c/15222/6//COMMIT_MSG@26 PS6, Line 26: that nit: not necessary http://gerrit.cloudera.org:8080/#/c/15222/6/be/src/common/global-types.h File be/src/common/global-types.h: http://gerrit.cloudera.org:8080/#/c/15222/6/be/src/common/global-types.h@31 PS6, Line 31: #define UTCPTR nullptr Why not define a const Timezone* instead? http://gerrit.cloudera.org:8080/#/c/15222/7/be/src/exec/data-source-scan-node.cc File be/src/exec/data-source-scan-node.cc: http://gerrit.cloudera.org:8080/#/c/15222/7/be/src/exec/data-source-scan-node.cc@352 PS7, Line 352: // TODO The timezone depends on flag use_local_tz_for_unix_timestamp_conversions. : // Check if this is the intended behaviour. : RETURN_IF_ERROR(MaterializeNextRow( : state->time_zone_for_unix_time_conversions(), tuple_pool, tuple)); You're raising a good point in the comment. I think here we should just pass state->local_time_zone(). http://gerrit.cloudera.org:8080/#/c/15222/6/be/src/exprs/expr-test.cc File be/src/exprs/expr-test.cc: http://gerrit.cloudera.org:8080/#/c/15222/6/be/src/exprs/expr-test.cc@170 PS6, Line 170: Returning const Timezone* here might simplify things a bit. http://gerrit.cloudera.org:8080/#/c/15222/6/be/src/runtime/timestamp-value.h File be/src/runtime/timestamp-value.h: http://gerrit.cloudera.org:8080/#/c/15222/6/be/src/runtime/timestamp-value.h@102 PS6, Line 102: 'unix_time' is assumed to be UTC nit: The comment is a bit confusing. 'unix_time' is always in UTC (by definition), not just when 'local_tz' is set to non-UTC. 
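The point in the last comment above, that a Unix time is UTC by definition and the time zone only affects how it is rendered, can be illustrated with a short sketch. It does not model Impala's Timezone type or UTCPTR; `render` and `CET_FIXED` are hypothetical names, and the fixed +01:00 offset is an assumption standing in for a real time zone database entry.

```python
from datetime import datetime, timedelta, timezone

# Assumption: a fixed +01:00 offset stands in for a real "CET" zone.
CET_FIXED = timezone(timedelta(hours=1), "CET")


def render(unix_time, tz=timezone.utc):
    """Format a Unix time (seconds since the UTC epoch) in the given zone."""
    return datetime.fromtimestamp(unix_time, tz=tz).strftime("%Y-%m-%d %H:%M:%S")


unix_time = 1230768000  # 2009-01-01 00:00:00 UTC

as_utc = render(unix_time)              # '2009-01-01 00:00:00'
as_cet = render(unix_time, CET_FIXED)   # '2009-01-01 01:00:00'
```

The same integer produces both strings, so the zone argument belongs to the conversion step and says nothing about the input value.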
http://gerrit.cloudera.org:8080/#/c/15222/6/testdata/workloads/functional-query/queries/QueryTest/out-of-range-timestamp-local-tz-conversion.test File testdata/workloads/functional-query/queries/QueryTest/out-of-range-timestamp-local-tz-conversion.test: http://gerrit.cloudera.org:8080/#/c/15222/6/testdata/workloads/functional-query/queries/QueryTest/out-of-range-timestamp-local-tz-conversion.test@3 PS6, Line 3: This test is also called with convert_legacy_hive_parquet_utc_timestamps=true. The comment is confusing: I think the test is only called with convert_legacy_hive_parquet_utc_timestamps=true. http://gerrit.cloudera.org:8080/#/c/15222/6/testdata/workloads/functional-query/queries/QueryTest/utc-timestamp-functions.test File testdata/workloads/functional-query/queries/QueryTest/utc-timestamp-functions.test: http://gerrit.cloudera.org:8080/#/c/15222/6/testdata/workloads/functional-query/queries/QueryTest/utc-timestamp-functions.test@18 PS6, Line 18: : QUERY Move the new sections to a separate .test file or rename this file to something more appropriate. Originally this test file was meant for testing UTC timestamp functions only. http://gerrit.cloudera.org:8080/#/c/15222/6/testdata/workloads/functional-query/queries/QueryTest/utc-timestamp-functions.test@45 PS6, Line 45: QUERY : SET timezone=CET; : select min(timestamp_col) from functional_avro.alltypestiny; : TYPES : STRING : RESULTS : '2009-01-01 00:00:00' Since functional_avro.alltypestiny.timestamp_col is a string, you can probably remove this section. http://gerrit.cloudera.org:8080/#/c/15222/6/tests/custom_cluster/test_hive_parquet_timestamp_conversion.py File tests/custom_cluster/test_hive_parquet_timestamp_conversion.py: http://gerrit.cloudera.org:8080/#/c/15222/6/tests/custom_cluster/test_hive_parquet_timestamp_conversion.py@74 PS6, Line 74: self.check_sanity(True) : # Test with UTC too to check the optimizations added in IMPALA-9385. 
: for tz_name in ["PST8PDT", "UTC"]: : # The value read from the Hive table should be the same as reading a UTC converted : # value from the Impala table. : data = self.execute_query_expect_success(self.client, """ : SELECT h.id, h.day, h.timestamp_col, i.timestamp_col : FROM functional_parquet.alltypesagg_hive_13_1 h : JOIN functional_parquet.alltypesagg : i ON i.id = h.id AND i.day = h.day -- serves as a unique key : WHERE : (h.timestamp_col IS NULL AND i.timestamp_col IS NOT NULL) : OR (h.timestamp_col IS NOT NULL AND i.timestamp_col IS NULL) : OR h.timestamp_col !=
[Impala-ASF-CR] IMPALA-9279: Update the Kudu version to include VARCHAR support
Attila Jeges has posted comments on this change. ( http://gerrit.cloudera.org:8080/15134 ) Change subject: IMPALA-9279: Update the Kudu version to include VARCHAR support .. Patch Set 9: Code-Review+2 Carry +2 -- To view, visit http://gerrit.cloudera.org:8080/15134 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iafe56342d43cb63e35c0bbb1b4a99327dda0a44a Gerrit-Change-Number: 15134 Gerrit-PatchSet: 9 Gerrit-Owner: Attila Jeges Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Laszlo Gaal Gerrit-Comment-Date: Wed, 12 Feb 2020 08:33:43 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9279: Update the Kudu version to include VARCHAR support
Attila Jeges has posted comments on this change. ( http://gerrit.cloudera.org:8080/15134 ) Change subject: IMPALA-9279: Update the Kudu version to include VARCHAR support .. Patch Set 8: > Uploaded patch set 7. Bumped Kudu version and got rid of installing libcurl3 dependency. -- To view, visit http://gerrit.cloudera.org:8080/15134 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iafe56342d43cb63e35c0bbb1b4a99327dda0a44a Gerrit-Change-Number: 15134 Gerrit-PatchSet: 8 Gerrit-Owner: Attila Jeges Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Laszlo Gaal Gerrit-Comment-Date: Tue, 11 Feb 2020 13:30:47 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9279: Update the Kudu version to include VARCHAR support
Attila Jeges has uploaded a new patch set (#8). ( http://gerrit.cloudera.org:8080/15134 ) Change subject: IMPALA-9279: Update the Kudu version to include VARCHAR support .. IMPALA-9279: Update the Kudu version to include VARCHAR support Before this change the preferred way of getting Kudu was to pull it in from the specified CDH build (even if USE_CDP_HIVE was set to true). Optionally by setting USE_CDH_KUDU to false, one could force Impala to use the native toolchain Kudu. But even then, the Kudu Java artifacts would be downloaded from CDH. Since Kudu VARCHAR support won't be backported to CDH, this behavior blocks the Impala side of the Kudu/Impala VARCHAR integration. With this change: 1. Using the native toolchain Kudu (including the Java artifacts) is the default behavior. From now on USE_CDH_KUDU will be set to false by default. Impala can be forced to fall back on using the CDH Kudu by explicitly setting USE_CDH_KUDU to true. 2. Kudu version is updated to include the VARCHAR support. Testing: Ran exhaustive tests with USE_CDH_KUDU=true and USE_CDH_KUDU=false. Change-Id: Iafe56342d43cb63e35c0bbb1b4a99327dda0a44a --- M bin/bootstrap_toolchain.py M bin/impala-config.sh M impala-parent/pom.xml 3 files changed, 42 insertions(+), 25 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/34/15134/8 -- To view, visit http://gerrit.cloudera.org:8080/15134 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Iafe56342d43cb63e35c0bbb1b4a99327dda0a44a Gerrit-Change-Number: 15134 Gerrit-PatchSet: 8 Gerrit-Owner: Attila Jeges Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Laszlo Gaal
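The default described in the commit message above (toolchain Kudu unless USE_CDH_KUDU is explicitly set to true) can be sketched as follows. This is a minimal illustration only: the real logic lives in bin/impala-config.sh and bin/bootstrap_toolchain.py and is not shown in this thread, and the function names and the "cdh"/"toolchain" labels are hypothetical.

```python
# Hypothetical sketch of the USE_CDH_KUDU default; not the project's code.
def use_cdh_kudu(env):
    """True only when USE_CDH_KUDU is explicitly set to 'true'."""
    return env.get("USE_CDH_KUDU", "false").strip().lower() == "true"


def kudu_source(env):
    """Pick where Kudu (including its Java artifacts) would come from."""
    return "cdh" if use_cdh_kudu(env) else "toolchain"


default_source = kudu_source({})
forced_cdh = kudu_source({"USE_CDH_KUDU": "true"})
```

In practice the mapping passed in would be `os.environ`; a plain dict is used here so the two configurations mentioned under "Testing" can be compared side by side.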
[Impala-ASF-CR] IMPALA-9279: Update the Kudu version to include VARCHAR support
Attila Jeges has posted comments on this change. ( http://gerrit.cloudera.org:8080/15134 ) Change subject: IMPALA-9279: Update the Kudu version to include VARCHAR support .. Patch Set 8: > Uploaded patch set 8. Rebased patch-set. -- To view, visit http://gerrit.cloudera.org:8080/15134 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iafe56342d43cb63e35c0bbb1b4a99327dda0a44a Gerrit-Change-Number: 15134 Gerrit-PatchSet: 8 Gerrit-Owner: Attila Jeges Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Laszlo Gaal Gerrit-Comment-Date: Tue, 11 Feb 2020 13:29:42 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9279: Update the Kudu version to include VARCHAR support
Attila Jeges has uploaded a new patch set (#7). ( http://gerrit.cloudera.org:8080/15134 ) Change subject: IMPALA-9279: Update the Kudu version to include VARCHAR support .. IMPALA-9279: Update the Kudu version to include VARCHAR support Before this change the preferred way of getting Kudu was to pull it in from the specified CDH build (even if USE_CDP_HIVE was set to true). Optionally by setting USE_CDH_KUDU to false, one could force Impala to use the native toolchain Kudu. But even then, the Kudu Java artifacts would be downloaded from CDH. Since Kudu VARCHAR support won't be backported to CDH, this behavior blocks the Impala side of the Kudu/Impala VARCHAR integration. With this change: 1. Using the native toolchain Kudu (including the Java artifacts) is the default behavior. From now on USE_CDH_KUDU will be set to false by default. Impala can be forced to fall back on using the CDH Kudu by explicitly setting USE_CDH_KUDU to true. 2. Kudu version is updated to include the VARCHAR support. Testing: Ran exhaustive tests with USE_CDH_KUDU=true and USE_CDH_KUDU=false. Change-Id: Iafe56342d43cb63e35c0bbb1b4a99327dda0a44a --- M bin/bootstrap_toolchain.py M bin/impala-config.sh M impala-parent/pom.xml 3 files changed, 42 insertions(+), 25 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/34/15134/7 -- To view, visit http://gerrit.cloudera.org:8080/15134 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Iafe56342d43cb63e35c0bbb1b4a99327dda0a44a Gerrit-Change-Number: 15134 Gerrit-PatchSet: 7 Gerrit-Owner: Attila Jeges Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Laszlo Gaal
[native-toolchain-CR] IMPALA-9279: part 2: Bump Kudu version to 5c610bf40
Attila Jeges has posted comments on this change. ( http://gerrit.cloudera.org:8080/15192 ) Change subject: IMPALA-9279: part 2: Bump Kudu version to 5c610bf40 .. Patch Set 1: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/15192 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3ff92cc5e1de220a4c140cf6c0117b5fa1e89226 Gerrit-Change-Number: 15192 Gerrit-PatchSet: 1 Gerrit-Owner: Attila Jeges Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Joe McDonnell Gerrit-Comment-Date: Tue, 11 Feb 2020 06:40:20 + Gerrit-HasComments: No
[native-toolchain-CR] IMPALA-9279: part 2: Bump Kudu version to 5c610bf40
Attila Jeges has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/15192 ) Change subject: IMPALA-9279: part 2: Bump Kudu version to 5c610bf40 .. IMPALA-9279: part 2: Bump Kudu version to 5c610bf40 This pulls in a Kudu change that links Kudu executables statically to libcurl in Kudu's thirdparty directory instead of relying on the dynamic linker to find libcurl at runtime. Testing: - Ran the C6 toolchain build job with the Kudu version bump for native toolchain to make sure that it builds on all supported platforms. Change-Id: I3ff92cc5e1de220a4c140cf6c0117b5fa1e89226 Reviewed-on: http://gerrit.cloudera.org:8080/15192 Reviewed-by: Joe McDonnell Tested-by: Attila Jeges --- M buildall.sh 1 file changed, 1 insertion(+), 1 deletion(-) Approvals: Joe McDonnell: Looks good to me, approved Attila Jeges: Verified -- To view, visit http://gerrit.cloudera.org:8080/15192 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I3ff92cc5e1de220a4c140cf6c0117b5fa1e89226 Gerrit-Change-Number: 15192 Gerrit-PatchSet: 2 Gerrit-Owner: Attila Jeges Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Joe McDonnell
[Impala-ASF-CR] IMPALA-9279: Update the Kudu version to include VARCHAR support
Attila Jeges has posted comments on this change. ( http://gerrit.cloudera.org:8080/15134 ) Change subject: IMPALA-9279: Update the Kudu version to include VARCHAR support .. Patch Set 6: > I just saw this commit in Kudu, and I'm wondering if it helps with the libcurl situation: https://gerrit.cloudera.org/#/c/15180/ > If they link libcurl, then we might not need to install it. Correct, this Kudu change eliminates the runtime dependency on the libcurl shared library. Here's a native-toolchain CR to bump the Kudu version once again to include the fix: https://gerrit.cloudera.org/#/c/15192/ -- To view, visit http://gerrit.cloudera.org:8080/15134 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iafe56342d43cb63e35c0bbb1b4a99327dda0a44a Gerrit-Change-Number: 15134 Gerrit-PatchSet: 6 Gerrit-Owner: Attila Jeges Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Laszlo Gaal Gerrit-Comment-Date: Mon, 10 Feb 2020 15:43:46 + Gerrit-HasComments: No
[native-toolchain-CR] IMPALA-9279: part 2: Bump Kudu version to 5c610bf40
Attila Jeges has uploaded this change for review. ( http://gerrit.cloudera.org:8080/15192 Change subject: IMPALA-9279: part 2: Bump Kudu version to 5c610bf40 .. IMPALA-9279: part 2: Bump Kudu version to 5c610bf40 This pulls in a Kudu change that links Kudu executables statically to libcurl in Kudu's thirdparty directory instead of relying on the dynamic linker to find libcurl at runtime. Testing: - Ran the C6 toolchain build job with the Kudu version bump for native toolchain to make sure that it builds on all supported platforms. Change-Id: I3ff92cc5e1de220a4c140cf6c0117b5fa1e89226 --- M buildall.sh 1 file changed, 1 insertion(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/native-toolchain refs/changes/92/15192/1 -- To view, visit http://gerrit.cloudera.org:8080/15192 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I3ff92cc5e1de220a4c140cf6c0117b5fa1e89226 Gerrit-Change-Number: 15192 Gerrit-PatchSet: 1 Gerrit-Owner: Attila Jeges
[Impala-ASF-CR] IMPALA-9279: Update the Kudu version to include VARCHAR support
Attila Jeges has posted comments on this change. ( http://gerrit.cloudera.org:8080/15134 ) Change subject: IMPALA-9279: Update the Kudu version to include VARCHAR support .. Patch Set 2: (2 comments) > > > Patch Set 2: > > > The verify job failed because kudu-3ba5ec5d0 (kudu-1.12.0-SNAPSHOT) has a new run-time dependency: libcurl.so.4 which is not available in the ubuntu-16.04-configured jenkins worker label. I'm discussing with laszlog the possibility of adding libcurl.so.4 to the worker label. > > If we decide to take this new Kudu version as a dependency, then the correct way to handle libcurl.so.4 as a new runtime dependency is to add it to the list of packages we install in bin/bootstrap_system.sh. The worker image referenced above is only minimally preconfigured to allow fast startup times; Impala runtime/development time dependencies should be managed in the bootstrap scripts. > > Additionally, the dependency on libcurl.so.4 should be evaluated for all OS platforms we claim to have support for: e.g. a brief scan of this article[1] claims that running both libcurl.so.3 and libcurl.so.4 on Ubuntu 18.04 is at least non-trivial to set up. > > [1]: https://dev.to/jake/using-libcurl3-and-libcurl4-on-ubuntu-1804-bionic-184g, "Using libcurl3 and libcurl4 on Ubuntu 18.04 (Bionic)" > In bin/bootstrap_system.sh, I don't see us installing curl for ubuntu, but I see us installing it for centos. I would try adding it and see if that helps. (We have curl installed in all the docker images we use to build kudu for the native toolchain.) > We can run a ubuntu-18.04-from-scratch job to see if it works. Installing curl on Ubuntu 16.04 installs libcurl-gnutls.so.4 but it doesn't install the required libcurl.so.4. "apt install libcurl3", on the other hand, works for all supported Ubuntu releases, so I've added that to bin/bootstrap_system.sh. 
http://gerrit.cloudera.org:8080/#/c/15134/2/bin/impala-config.sh File bin/impala-config.sh: http://gerrit.cloudera.org:8080/#/c/15134/2/bin/impala-config.sh@719 PS2, Line 719: export IMPALA_TOOLCHAIN_KUDU_MAVEN_REPOSITORY="file://${IMPALA_TOOLCHAIN}" > Since this is disabled, I think we can set it to an empty string. If that w Setting the URL to an empty string results in an error, but I can set it to something like "file:///non/existing/repo". What do you think? http://gerrit.cloudera.org:8080/#/c/15134/2/bin/impala-config.sh@722 PS2, Line 722: export IMPALA_KUDU_VERSION="3ba5ec5d0" : export IMPALA_KUDU_JAVA_VERSION="1.12.0-SNAPSHOT" > One use case that we want to support is for someone to be able to override Done -- To view, visit http://gerrit.cloudera.org:8080/15134 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iafe56342d43cb63e35c0bbb1b4a99327dda0a44a Gerrit-Change-Number: 15134 Gerrit-PatchSet: 2 Gerrit-Owner: Attila Jeges Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Laszlo Gaal Gerrit-Comment-Date: Mon, 03 Feb 2020 14:49:40 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9279: Update the Kudu version to include VARCHAR support
Attila Jeges has uploaded a new patch set (#3). ( http://gerrit.cloudera.org:8080/15134 ) Change subject: IMPALA-9279: Update the Kudu version to include VARCHAR support .. IMPALA-9279: Update the Kudu version to include VARCHAR support Before this change the preferred way of getting Kudu was to pull it in from the specified CDH build (even if USE_CDP_HIVE was set to true). Optionally by setting USE_CDH_KUDU to false, one could force Impala to use the native toolchain Kudu. But even then, the Kudu Java artifacts would be downloaded from CDH. Since Kudu VARCHAR support won't be backported to CDH, this behavior blocks the Impala side of the Kudu/Impala VARCHAR integration. With this change: 1. Using the native toolchain Kudu (including the Java artifacts) is the default behavior. From now on USE_CDH_KUDU will be set to false by default. Impala can be forced to fall back on using the CDH Kudu by explicitly setting USE_CDH_KUDU to true. 2. Kudu version is updated to include the VARCHAR support. Testing: Ran exhaustive tests with USE_CDH_KUDU=true and USE_CDH_KUDU=false. Change-Id: Iafe56342d43cb63e35c0bbb1b4a99327dda0a44a --- M bin/bootstrap_system.sh M bin/bootstrap_toolchain.py M bin/impala-config.sh M impala-parent/pom.xml 4 files changed, 43 insertions(+), 26 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/34/15134/3 -- To view, visit http://gerrit.cloudera.org:8080/15134 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Iafe56342d43cb63e35c0bbb1b4a99327dda0a44a Gerrit-Change-Number: 15134 Gerrit-PatchSet: 3 Gerrit-Owner: Attila Jeges Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Laszlo Gaal
[Impala-ASF-CR] IMPALA-9279: Update the Kudu version to include VARCHAR support
Attila Jeges has posted comments on this change. ( http://gerrit.cloudera.org:8080/15134 ) Change subject: IMPALA-9279: Update the Kudu version to include VARCHAR support .. Patch Set 2: > (2 comments) > Thanks for working on this. This is looking pretty good. I'm thinking through the edge cases where we want to override some versions, so I may have a couple more comments. > In the meantime, I'm going to run an upstream verify job on this review. The verify job failed because kudu-3ba5ec5d0 (kudu-1.12.0-SNAPSHOT) has a new run-time dependency: libcurl.so.4, which is not available in the ubuntu-16.04-configured jenkins worker label. I'm discussing with laszlog the possibility of adding libcurl.so.4 to the worker label. As far as I know, both CDH GBN Kudu and CDP GBN Kudu are based on kudu-1.11.0, which doesn't depend on libcurl.so.4. I considered updating toolchain Kudu to 1.11.0 or 1.11.1 (which is the latest upstream Kudu release) instead of 3ba5ec5d0, but kudu-1.11.x doesn't have support for VARCHAR yet. -- To view, visit http://gerrit.cloudera.org:8080/15134 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iafe56342d43cb63e35c0bbb1b4a99327dda0a44a Gerrit-Change-Number: 15134 Gerrit-PatchSet: 2 Gerrit-Owner: Attila Jeges Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Laszlo Gaal Gerrit-Comment-Date: Fri, 31 Jan 2020 14:11:20 + Gerrit-HasComments: No
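A missing run-time dependency like the one described above typically shows up as a "not found" entry in `ldd` output. The following sketch parses such output to list the unresolved libraries. It is not part of the patch: `missing_shared_libs` is a hypothetical helper, and the sample text is made up for illustration rather than captured from the failing worker.

```python
# Hypothetical helper: lists libraries that ldd reports as unresolved.
def missing_shared_libs(ldd_output):
    """Return the library names whose ldd entry reads '=> not found'."""
    missing = []
    for line in ldd_output.splitlines():
        name, sep, target = line.strip().partition(" => ")
        if sep and target.strip() == "not found":
            missing.append(name)
    return missing


# Made-up sample in the usual ldd layout, not real output from the job.
SAMPLE_LDD_OUTPUT = """\
    linux-vdso.so.1 (0x00007ffd5a5f2000)
    libcurl.so.4 => not found
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f1c2a000000)
"""

unresolved = missing_shared_libs(SAMPLE_LDD_OUTPUT)
```

On a real machine the input would come from running `ldd` on the Kudu binary in question; feeding the text in as a string keeps the check independent of what happens to be installed.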
[Impala-ASF-CR] IMPALA-9279: Update the Kudu version to include VARCHAR support
Attila Jeges has uploaded a new patch set (#2). ( http://gerrit.cloudera.org:8080/15134 ) Change subject: IMPALA-9279: Update the Kudu version to include VARCHAR support .. IMPALA-9279: Update the Kudu version to include VARCHAR support Before this change the preferred way of getting Kudu was to pull it in from the specified CDH build (even if USE_CDP_HIVE was set to true). Optionally by setting USE_CDH_KUDU to false, one could force Impala to use the native toolchain Kudu. But even then, the Kudu Java artifacts would be downloaded from CDH. Since Kudu VARCHAR support won't be backported to CDH, this behavior blocks the Impala side of the Kudu/Impala VARCHAR integration. With this change: 1. Using the native toolchain Kudu (including the Java artifacts) is the default behavior. From now on USE_CDH_KUDU will be set to false by default. Impala can be forced to fall back on using the CDH Kudu by explicitly setting USE_CDH_KUDU to true. 2. Kudu version is updated to include the VARCHAR support. Testing: Ran exhaustive tests with USE_CDH_KUDU=true and USE_CDH_KUDU=false. Change-Id: Iafe56342d43cb63e35c0bbb1b4a99327dda0a44a --- M bin/bootstrap_toolchain.py M bin/impala-config.sh M impala-parent/pom.xml 3 files changed, 43 insertions(+), 26 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/34/15134/2 -- To view, visit http://gerrit.cloudera.org:8080/15134 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Iafe56342d43cb63e35c0bbb1b4a99327dda0a44a Gerrit-Change-Number: 15134 Gerrit-PatchSet: 2 Gerrit-Owner: Attila Jeges Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Laszlo Gaal
[Impala-ASF-CR] IMPALA-9279: Update the Kudu version to include VARCHAR support
Attila Jeges has uploaded this change for review. ( http://gerrit.cloudera.org:8080/15134 Change subject: IMPALA-9279: Update the Kudu version to include VARCHAR support .. IMPALA-9279: Update the Kudu version to include VARCHAR support Before this change the preferred way of getting Kudu was to pull it in from the specified CDH build (even if USE_CDP_HIVE was set to true). Optionally by setting USE_CDH_KUDU to false, one could force Impala to use the native toolchain Kudu. But even then, the Kudu Java artifacts would be downloaded from CDH. Since Kudu VARCHAR support won't be backported to CDH, this behavior blocks the Impala side of the Kudu/Impala VARCHAR integration. With this change: 1. Using the native toolchain Kudu (including the Java artifacts) is the default behavior. From now on USE_CDH_KUDU will be set to false by default. Impala can be forced to fall back on using the CDH Kudu by explicitly setting USE_CDH_KUDU to true. 2. Kudu version is updated to include the VARCHAR support. Testing: Ran exhaustive tests with USE_CDH_KUDU=true and USE_CDH_KUDU=false. Change-Id: Iafe56342d43cb63e35c0bbb1b4a99327dda0a44a --- M bin/bootstrap_toolchain.py M bin/impala-config.sh M impala-parent/pom.xml 3 files changed, 42 insertions(+), 26 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/34/15134/1 -- To view, visit http://gerrit.cloudera.org:8080/15134 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Iafe56342d43cb63e35c0bbb1b4a99327dda0a44a Gerrit-Change-Number: 15134 Gerrit-PatchSet: 1 Gerrit-Owner: Attila Jeges
[native-toolchain-CR] IMPALA-9279: Bump Kudu version to 3ba5ec5d0
Attila Jeges has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/15119 ) Change subject: IMPALA-9279: Bump Kudu version to 3ba5ec5d0 .. IMPALA-9279: Bump Kudu version to 3ba5ec5d0 This pulls in Kudu VARCHAR support which is needed for the Impala side of the Kudu/Impala VARCHAR integration. Testing: - Ran the C6 toolchain build job with the kudu version bump for native toolchain to make sure that it builds on all supported platforms. - Built Impala locally with kudu-3ba5ec5d0 and ran test_kudu.py E2E and AnalyzeKuduDDLTest FE tests. Change-Id: Ibc3fd6f0c7d31f1f80753402adc0ca5b3c5759a0 Reviewed-on: http://gerrit.cloudera.org:8080/15119 Reviewed-by: Joe McDonnell Tested-by: Attila Jeges --- M buildall.sh 1 file changed, 1 insertion(+), 1 deletion(-) Approvals: Joe McDonnell: Looks good to me, approved Attila Jeges: Verified -- To view, visit http://gerrit.cloudera.org:8080/15119 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: Ibc3fd6f0c7d31f1f80753402adc0ca5b3c5759a0 Gerrit-Change-Number: 15119 Gerrit-PatchSet: 2 Gerrit-Owner: Attila Jeges Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Joe McDonnell
[native-toolchain-CR] IMPALA-9279: Bump Kudu version to 3ba5ec5d0
Attila Jeges has posted comments on this change. ( http://gerrit.cloudera.org:8080/15119 ) Change subject: IMPALA-9279: Bump Kudu version to 3ba5ec5d0 .. Patch Set 1: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/15119 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ibc3fd6f0c7d31f1f80753402adc0ca5b3c5759a0 Gerrit-Change-Number: 15119 Gerrit-PatchSet: 1 Gerrit-Owner: Attila Jeges Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Joe McDonnell Gerrit-Comment-Date: Wed, 29 Jan 2020 08:13:30 + Gerrit-HasComments: No
[native-toolchain-CR] IMPALA-9279: Bump Kudu version to 3ba5ec5d0
Attila Jeges has uploaded this change for review. ( http://gerrit.cloudera.org:8080/15119 ) Change subject: IMPALA-9279: Bump Kudu version to 3ba5ec5d0 .. IMPALA-9279: Bump Kudu version to 3ba5ec5d0 This pulls in Kudu VARCHAR support which is needed for the Impala side of the Kudu/Impala VARCHAR integration. Testing: - Ran the C6 toolchain build job with the kudu version bump for native toolchain to make sure that it builds on all supported platforms. - Built Impala locally with kudu-3ba5ec5d0 and ran test_kudu.py E2E and AnalyzeKuduDDLTest FE tests. Change-Id: Ibc3fd6f0c7d31f1f80753402adc0ca5b3c5759a0 --- M buildall.sh 1 file changed, 1 insertion(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/native-toolchain refs/changes/19/15119/1 -- To view, visit http://gerrit.cloudera.org:8080/15119 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Ibc3fd6f0c7d31f1f80753402adc0ca5b3c5759a0 Gerrit-Change-Number: 15119 Gerrit-PatchSet: 1 Gerrit-Owner: Attila Jeges
[native-toolchain-CR] IMPALA-9265: Support for toolchain Kudu to provide Java artifacts
Attila Jeges has posted comments on this change. ( http://gerrit.cloudera.org:8080/15072 ) Change subject: IMPALA-9265: Support for toolchain Kudu to provide Java artifacts .. Patch Set 3: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/15072 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iba03dfe9c302513b825cbed7146c582e7d97c3af Gerrit-Change-Number: 15072 Gerrit-PatchSet: 3 Gerrit-Owner: Attila Jeges Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Laszlo Gaal Gerrit-Comment-Date: Tue, 28 Jan 2020 14:51:30 + Gerrit-HasComments: No
[native-toolchain-CR] IMPALA-9265: Support for toolchain Kudu to provide Java artifacts
Attila Jeges has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/15072 ) Change subject: IMPALA-9265: Support for toolchain Kudu to provide Java artifacts .. IMPALA-9265: Support for toolchain Kudu to provide Java artifacts The build script was modified to generate Kudu JARs and add them to the Kudu tarball. redhat6 and redhat7 docker images were modified to update Java 8 to a newer version that is suitable for building the Java artifacts. ubuntu1404 docker image was modified to include CA certificate file for Java. Testing: Ran the C6 toolchain build job to verify that native-toolchain builds on all supported platforms. Change-Id: Iba03dfe9c302513b825cbed7146c582e7d97c3af Reviewed-on: http://gerrit.cloudera.org:8080/15072 Reviewed-by: Joe McDonnell Tested-by: Attila Jeges --- M docker/all/postinstall.sh A docker/redhat/Centos7-Vault.repo M docker/redhat6.df M docker/redhat7.df M docker/ubuntu1404.df M source/kudu/build.sh 6 files changed, 45 insertions(+), 6 deletions(-) Approvals: Joe McDonnell: Looks good to me, approved Attila Jeges: Verified -- To view, visit http://gerrit.cloudera.org:8080/15072 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: Iba03dfe9c302513b825cbed7146c582e7d97c3af Gerrit-Change-Number: 15072 Gerrit-PatchSet: 4 Gerrit-Owner: Attila Jeges Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Laszlo Gaal
[native-toolchain-CR] IMPALA-9265: Support for toolchain Kudu to provide Java artifacts
Attila Jeges has posted comments on this change. ( http://gerrit.cloudera.org:8080/15072 ) Change subject: IMPALA-9265: Support for toolchain Kudu to provide Java artifacts .. Patch Set 3: (1 comment)

http://gerrit.cloudera.org:8080/#/c/15072/3/source/kudu/build.sh
File source/kudu/build.sh:

http://gerrit.cloudera.org:8080/#/c/15072/3/source/kudu/build.sh@137
PS3, Line 137: local JAVA_INSTALL_DIR="$LOCAL_INSTALL/java"
             : mkdir -p "$JAVA_INSTALL_DIR"
             : pushd java
             : export GRADLE_USER_HOME="$(pwd)"
             : wrap ./gradlew :kudu-hive:assemble :kudu-client:assemble
             : # Copy kudu-hive jars to JAVA_INSTALL_DIR.
             : local F
             : for F in kudu-hive/build/libs/kudu-hive-*.jar; do
             :   cp "$F" "$JAVA_INSTALL_DIR"
             : done
             : # Install kudu-client artifacts to the Local Maven Repository:
             : wrap ./gradlew -Dmaven.repo.local="${JAVA_INSTALL_DIR}/repository" :kudu-client:install
             : popd
I've also tested Impala quickly with kudu-hive and kudu-client built like this locally on my dev machine. It looks good, impala builds without an issue and the kudu-related E2E tests that I've tried passed.

-- To view, visit http://gerrit.cloudera.org:8080/15072 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iba03dfe9c302513b825cbed7146c582e7d97c3af Gerrit-Change-Number: 15072 Gerrit-PatchSet: 3 Gerrit-Owner: Attila Jeges Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Laszlo Gaal Gerrit-Comment-Date: Mon, 27 Jan 2020 14:38:17 + Gerrit-HasComments: Yes
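The build steps quoted in the comment above leave two things under the Java install directory: the kudu-hive jars and a local Maven repository in a `repository` subdirectory. A minimal sketch of a post-build sanity check for that layout follows. The function name `check_kudu_java_artifacts` and its messages are hypothetical and not part of native-toolchain; only the `kudu-hive-*.jar` pattern and the `repository` directory name come from the quoted snippet.

```shell
# Hypothetical helper (not part of native-toolchain): checks that a Java
# install directory contains what the quoted build steps are expected to
# leave behind. Written for POSIX sh; the function body runs in a subshell
# so its variables and "exit" calls do not leak into the caller.
check_kudu_java_artifacts() (
  java_install_dir="$1"

  # At least one kudu-hive jar should have been copied into the directory.
  found_jar=0
  for f in "$java_install_dir"/kudu-hive-*.jar; do
    if [ -f "$f" ]; then
      found_jar=1
      break
    fi
  done
  if [ "$found_jar" -ne 1 ]; then
    echo "no kudu-hive jar found in $java_install_dir" >&2
    exit 1
  fi

  # The kudu-client install step targets a local Maven repository here.
  if [ ! -d "$java_install_dir/repository" ]; then
    echo "no local Maven repository in $java_install_dir" >&2
    exit 1
  fi

  echo "ok"
)
```

Under these assumptions, a build script could call `check_kudu_java_artifacts "$LOCAL_INSTALL/java"` after the final `popd` and stop the build if it returns a non-zero status.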
[native-toolchain-CR] IMPALA-9265: Support for toolchain Kudu to provide Java artifacts
Attila Jeges has uploaded a new patch set (#3). ( http://gerrit.cloudera.org:8080/15072 ) Change subject: IMPALA-9265: Support for toolchain Kudu to provide Java artifacts .. IMPALA-9265: Support for toolchain Kudu to provide Java artifacts The build script was modified to generate Kudu JARs and add them to the Kudu tarball. redhat6 and redhat7 docker images were modified to update Java 8 to a newer version that is suitable for building the Java artifacts. ubuntu1404 docker image was modified to include CA certificate file for Java. Testing: Ran the C6 toolchain build job to verify that native-toolchain builds on all supported platforms. Change-Id: Iba03dfe9c302513b825cbed7146c582e7d97c3af --- M docker/all/postinstall.sh A docker/redhat/Centos7-Vault.repo M docker/redhat6.df M docker/redhat7.df M docker/ubuntu1404.df M source/kudu/build.sh 6 files changed, 45 insertions(+), 6 deletions(-) git pull ssh://gerrit.cloudera.org:29418/native-toolchain refs/changes/72/15072/3 -- To view, visit http://gerrit.cloudera.org:8080/15072 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Iba03dfe9c302513b825cbed7146c582e7d97c3af Gerrit-Change-Number: 15072 Gerrit-PatchSet: 3 Gerrit-Owner: Attila Jeges Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Laszlo Gaal
[native-toolchain-CR] IMPALA-9265: Support for toolchain Kudu to provide Java artifacts
Attila Jeges has posted comments on this change. ( http://gerrit.cloudera.org:8080/15072 ) Change subject: IMPALA-9265: Support for toolchain Kudu to provide Java artifacts .. Patch Set 2: (3 comments)

> (3 comments)
>
> I think the changes to the docker side make sense. I have a few
> small comments, but it sounds good.
>
> For the actual kudu build steps, I think we'll need to see what
> Impala needs from the Kudu java side to get this right. I left a
> comment about what I think we would need, but I don't think our
> progress needs to be blocked on everything being perfect. One way
> forward is that we put the Kudu java artifacts in a subdirectory
> (which we know will not conflict with the existing implementation)
> and then it becomes fairly harmless to check in something that is
> not perfect. Then, as we do the Impala change and find what we
> need, we do an additional change to get anything we missed. Another
> way forward is to merge the docker stuff (which lets us update the
> docker images) and then do the Kudu part in concert with the Impala
> change.

Thanks for the help. I decided to generate the Java artifacts and put them into a subdirectory in this patch-set. I'll do a kudu version bump in a separate patch and finally change Impala to consume the generated Java artifacts in a third patch.

http://gerrit.cloudera.org:8080/#/c/15072/2/docker/redhat6.df
File docker/redhat6.df:

http://gerrit.cloudera.org:8080/#/c/15072/2/docker/redhat6.df@15
PS2, Line 15: # Install a newer java-1.8.0-openjdk-devel from centos:6.8.
            : # The java-1.8.0-openjdk-devel version shipped with centos:6.6 is unable to handle ECDHE
            : # ciphers.
            : RUN yum-install --disablerepo='*' --enablerepo=C6.8-base java-1.8.0-openjdk-devel
> The way I think about this is that there are some libraries we use newer ve
Done

http://gerrit.cloudera.org:8080/#/c/15072/2/docker/redhat7.df
File docker/redhat7.df:

http://gerrit.cloudera.org:8080/#/c/15072/2/docker/redhat7.df@9
PS2, Line 9: # We get a newer java-1.8.0-openjdk-devel from centos:7.4.
           : # The java-1.8.0-openjdk version shipped with centos:7.2 is unable to handle ECDHE
           : # ciphers.
           : RUN yum-install --disablerepo='*' --enablerepo=C7.4-base java-1.8.0-openjdk-devel
> Same as the redhat6.df comment, move this below the big install command.
Done

http://gerrit.cloudera.org:8080/#/c/15072/2/source/kudu/build.sh
File source/kudu/build.sh:

http://gerrit.cloudera.org:8080/#/c/15072/2/source/kudu/build.sh@140
PS2, Line 140: wrap ./gradlew :kudu-hive:assemble
             : for F in kudu-hive/build/libs/kudu-*.jar; do
             :   cp "$F" "$JAVA_INSTALL_DIR"
             : done
> My thought on this part is that we are going to want to test it with the co
I've changed the code to put the kudu-hive jars and the kudu-client maven repo to $LOCAL_INSTALL/java. Done.

-- To view, visit http://gerrit.cloudera.org:8080/15072 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iba03dfe9c302513b825cbed7146c582e7d97c3af Gerrit-Change-Number: 15072 Gerrit-PatchSet: 2 Gerrit-Owner: Attila Jeges Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Laszlo Gaal Gerrit-Comment-Date: Sat, 25 Jan 2020 07:21:59 + Gerrit-HasComments: Yes
[native-toolchain-CR] IMPALA-9265: Support for toolchain Kudu to provide Java artifacts
Attila Jeges has posted comments on this change. ( http://gerrit.cloudera.org:8080/15072 ) Change subject: IMPALA-9265: Support for toolchain Kudu to provide Java artifacts .. Patch Set 2:

> Took a first pass through it, it looks pretty good.
> There is one issue that makes me wonder:
> 1. We explicitly include the C/C++ compiler version in the resulting
> tarballs' name
> 2. We now start including Java binaries in the same tarballs. Java
> binaries can (in theory at least, if not in current practice) be
> produced by different JDK versions,
> so should we start including the JDK version (or distro+version)
> string in the artifact names?
> Currently we build only with JDK 8, but as JDK 8 is nearing its End
> of Support Date, this may change one day.
> We can also defer the decision and establish the convention that no
> explicit Java version means JDK 8, and everything else is marked;
> this can be left to our future selves.

Distro names are already included in the tarball names uploaded to the S3 bucket. I agree that the JDK version should be added to the tarball names eventually. I'm not sure whether we should do it now or as a separate change. Let's see what Joe thinks about it.

-- To view, visit http://gerrit.cloudera.org:8080/15072 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iba03dfe9c302513b825cbed7146c582e7d97c3af Gerrit-Change-Number: 15072 Gerrit-PatchSet: 2 Gerrit-Owner: Attila Jeges Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Laszlo Gaal Gerrit-Comment-Date: Tue, 21 Jan 2020 15:10:06 + Gerrit-HasComments: No
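The convention floated in the quoted review (no explicit Java version means JDK 8, anything else is marked) could look roughly like the sketch below. The function name `artifact_name`, the field order, the `-jdk<N>` suffix, and the sample values are all assumptions made for illustration. The thread does not show the actual native-toolchain tarball naming scheme, so this should not be read as that scheme.

```shell
# Hypothetical naming helper (not the native-toolchain scheme): builds
#   <package>-<version>-<compiler>-<distro>[-jdk<N>].tar.gz
# Written for POSIX sh; the subshell body keeps the variables local.
artifact_name() (
  package="$1"
  version="$2"
  compiler="$3"
  distro="$4"
  jdk="${5:-8}"   # assumed default: JDK 8 when no version is given

  # Convention from the review discussion: JDK 8 stays implicit,
  # any other JDK version is marked in the name.
  suffix=""
  if [ "$jdk" != "8" ]; then
    suffix="-jdk${jdk}"
  fi

  printf '%s-%s-%s-%s%s.tar.gz\n' \
    "$package" "$version" "$compiler" "$distro" "$suffix"
)
```

With these made-up inputs, `artifact_name kudu 3ba5ec5d0 gcc-4.9.2 centos7` yields `kudu-3ba5ec5d0-gcc-4.9.2-centos7.tar.gz`, while passing `11` as a fifth argument yields `kudu-3ba5ec5d0-gcc-4.9.2-centos7-jdk11.tar.gz`. Keeping JDK 8 implicit means existing artifact names would not change if such a convention were adopted later.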