Noemi Pap-Takacs has uploaded this change for review. ( http://gerrit.cloudera.org:8080/19353 )
Change subject: IMPALA-11708: Add support for mixed Iceberg tables with AVRO file format
......................................................................

IMPALA-11708: Add support for mixed Iceberg tables with AVRO file format

This patch extends the support of Iceberg tables containing multiple
file formats. Now AVRO data files can also be read in a mixed table
besides Parquet and ORC.

Impala uses its avro scanner to read AVRO files, therefore all the
avro related limitations apply here as well: writes/metadata changes
are not supported.

testing:
- E2E testing: extending 'iceberg-mixed-file-format.test' to include
  AVRO files as well, in order to test reading all three currently
  supported file formats: avro+orc+parquet

Change-Id: I941adfb659218283eb5fec1b394bb3003f8072a6
---
M be/src/exec/hdfs-scan-node-base.cc
M fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java
D testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_avro_mixed/data/00000-0-data-noemi_20221026130844_b228ff88-5625-494b-b27a-7819aad52ced-job_16629766502890_0016-1-00001.avro
D testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_avro_mixed/data/00000-0-data-noemi_20221028111610_c7e89043-49e0-40fe-95a5-bf24d958ebc7-job_16629766502890_0017-1-00001.avro
D testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_avro_mixed/data/00000-0-data-noemi_20221028113321_fbfa5f31-421d-406a-9d46-6bec36d7a93c-job_16629766502890_0018-1-00001.orc
D testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_avro_mixed/data/00000-0-data-noemi_20221028114730_e2f7d99d-7ad8-478c-a814-19e2d7912ad1-job_16629766502890_0019-1-00001.parquet
D testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_avro_mixed/metadata/13c55017-b018-4ccb-a407-08e37e28eec8-m0.avro
D testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_avro_mixed/metadata/7b422180-e3f8-4500-b240-1424ef012246-m0.avro
D testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_avro_mixed/metadata/80a79f8a-5a47-44c9-b16d-4bef4a5ecec3-m0.avro
D testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_avro_mixed/metadata/8e66c338-5cd3-4b85-b986-18ec29b67d94-m0.avro
D testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_avro_mixed/metadata/snap-1131576191504541058-1-8e66c338-5cd3-4b85-b986-18ec29b67d94.avro
D testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_avro_mixed/metadata/snap-1744181916149214787-1-13c55017-b018-4ccb-a407-08e37e28eec8.avro
D testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_avro_mixed/metadata/snap-3243718219085059034-1-7b422180-e3f8-4500-b240-1424ef012246.avro
D testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_avro_mixed/metadata/snap-5089000375160183133-1-80a79f8a-5a47-44c9-b16d-4bef4a5ecec3.avro
D testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_avro_mixed/metadata/v1.metadata.json
D testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_avro_mixed/metadata/v2.metadata.json
D testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_avro_mixed/metadata/v3.metadata.json
D testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_avro_mixed/metadata/v4.metadata.json
D testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_avro_mixed/metadata/v5.metadata.json
D testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_avro_mixed/metadata/v6.metadata.json
D testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_avro_mixed/metadata/v7.metadata.json
D testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_avro_mixed/metadata/v8.metadata.json
D testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_avro_mixed/metadata/v9.metadata.json
D testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_avro_mixed/metadata/version-hint.txt
D testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_avro_only/data/00000-0-data-noemi_20221021195331_77fbb37f-2393-4a66-9656-61cd56b94b46-job_16629766502890_0015-1-00001.avro
D testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_avro_only/metadata/a9f8d35c-a852-49fe-996a-d94ae1896c32-m0.avro
D testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_avro_only/metadata/snap-725782911885631732-1-a9f8d35c-a852-49fe-996a-d94ae1896c32.avro
D testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_avro_only/metadata/v1.metadata.json
D testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_avro_only/metadata/v2.metadata.json
D testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_avro_only/metadata/version-hint.txt
D testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_avro_only/version-hint.txt
M testdata/datasets/functional/functional_schema_template.sql
M testdata/datasets/functional/schema_constraints.csv
M testdata/workloads/functional-query/queries/QueryTest/iceberg-avro.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-mixed-file-format.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-negative.test
M tests/query_test/test_iceberg.py
37 files changed, 60 insertions(+), 1,283 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/53/19353/1
--
To view, visit http://gerrit.cloudera.org:8080/19353
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I941adfb659218283eb5fec1b394bb3003f8072a6
Gerrit-Change-Number: 19353
Gerrit-PatchSet: 1
Gerrit-Owner: Noemi Pap-Takacs <npaptak...@cloudera.com>
Gerrit-Reviewer: Tamas Mate <tma...@apache.org>