Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/19353 )
Change subject: IMPALA-11708: Add support for mixed Iceberg tables with AVRO file format ...................................................................... IMPALA-11708: Add support for mixed Iceberg tables with AVRO file format This patch extends the support of Iceberg tables containing multiple file formats. Now AVRO data files can also be read in a mixed table besides Parquet and ORC. Impala uses its avro scanner to read AVRO files, therefore all the avro related limitations apply here as well: writes/metadata changes are not supported. testing: - E2E testing: extending 'iceberg-mixed-file-format.test' to include AVRO files as well, in order to test reading all three currently supported file formats: avro+orc+parquet Change-Id: I941adfb659218283eb5fec1b394bb3003f8072a6 Reviewed-on: http://gerrit.cloudera.org:8080/19353 Reviewed-by: Impala Public Jenkins <impala-public-jenk...@cloudera.com> Tested-by: Impala Public Jenkins <impala-public-jenk...@cloudera.com> --- M be/src/exec/hdfs-scan-node-base.cc M fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java D testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_avro_mixed/data/00000-0-data-noemi_20221026130844_b228ff88-5625-494b-b27a-7819aad52ced-job_16629766502890_0016-1-00001.avro D testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_avro_mixed/data/00000-0-data-noemi_20221028111610_c7e89043-49e0-40fe-95a5-bf24d958ebc7-job_16629766502890_0017-1-00001.avro D testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_avro_mixed/data/00000-0-data-noemi_20221028113321_fbfa5f31-421d-406a-9d46-6bec36d7a93c-job_16629766502890_0018-1-00001.orc D testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_avro_mixed/data/00000-0-data-noemi_20221028114730_e2f7d99d-7ad8-478c-a814-19e2d7912ad1-job_16629766502890_0019-1-00001.parquet D testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_avro_mixed/metadata/13c55017-b018-4ccb-a407-08e37e28eec8-m0.avro D testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_avro_mixed/metadata/7b422180-e3f8-4500-b240-1424ef012246-m0.avro D testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_avro_mixed/metadata/80a79f8a-5a47-44c9-b16d-4bef4a5ecec3-m0.avro D testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_avro_mixed/metadata/8e66c338-5cd3-4b85-b986-18ec29b67d94-m0.avro D testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_avro_mixed/metadata/snap-1131576191504541058-1-8e66c338-5cd3-4b85-b986-18ec29b67d94.avro D testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_avro_mixed/metadata/snap-1744181916149214787-1-13c55017-b018-4ccb-a407-08e37e28eec8.avro D testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_avro_mixed/metadata/snap-3243718219085059034-1-7b422180-e3f8-4500-b240-1424ef012246.avro D testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_avro_mixed/metadata/snap-5089000375160183133-1-80a79f8a-5a47-44c9-b16d-4bef4a5ecec3.avro D testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_avro_mixed/metadata/v1.metadata.json D testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_avro_mixed/metadata/v2.metadata.json D testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_avro_mixed/metadata/v3.metadata.json D testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_avro_mixed/metadata/v4.metadata.json D testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_avro_mixed/metadata/v5.metadata.json D testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_avro_mixed/metadata/v6.metadata.json D testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_avro_mixed/metadata/v7.metadata.json D testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_avro_mixed/metadata/v8.metadata.json D testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_avro_mixed/metadata/v9.metadata.json D testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_avro_mixed/metadata/version-hint.txt D testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_avro_only/data/00000-0-data-noemi_20221021195331_77fbb37f-2393-4a66-9656-61cd56b94b46-job_16629766502890_0015-1-00001.avro D testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_avro_only/metadata/a9f8d35c-a852-49fe-996a-d94ae1896c32-m0.avro D testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_avro_only/metadata/snap-725782911885631732-1-a9f8d35c-a852-49fe-996a-d94ae1896c32.avro D testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_avro_only/metadata/v1.metadata.json D testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_avro_only/metadata/v2.metadata.json D testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_avro_only/metadata/version-hint.txt D testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_avro_only/version-hint.txt M testdata/datasets/functional/functional_schema_template.sql M testdata/datasets/functional/schema_constraints.csv M testdata/workloads/functional-query/queries/QueryTest/iceberg-avro.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-mixed-file-format.test M testdata/workloads/functional-query/queries/QueryTest/iceberg-negative.test M tests/query_test/test_iceberg.py 37 files changed, 61 insertions(+), 1,283 deletions(-) Approvals: Impala Public Jenkins: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/19353 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I941adfb659218283eb5fec1b394bb3003f8072a6 Gerrit-Change-Number: 19353 Gerrit-PatchSet: 5 Gerrit-Owner: Noemi Pap-Takacs <npaptak...@cloudera.com> Gerrit-Reviewer: Gergely Fürnstáhl <gfurnst...@cloudera.com> Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com> Gerrit-Reviewer: Noemi Pap-Takacs <npaptak...@cloudera.com> Gerrit-Reviewer: Tamas Mate <tma...@apache.org>