Hello Quanlong Huang, Tamas Mate, Gergely Fürnstáhl, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/18514

to look at the new patch set (#2).

Change subject: IMPALA-801, IMPALA-8011: Add INPUT__FILE__NAME virtual column for file name
......................................................................

IMPALA-801, IMPALA-8011: Add INPUT__FILE__NAME virtual column for file name

Hive has a virtual column, INPUT__FILE__NAME, which returns the name of
the data file that stores the actual row. It can be used in several
ways; see the above two Jira tickets for examples. This virtual column
is also needed to support position-based delete files in Iceberg V2
tables.

This patch also adds the foundations to support further table-level
virtual columns later. Virtual columns are stored at the table level,
in a list separate from the table schema. During path resolution in
Path.resolve() we also try to resolve virtual columns. Slot descriptors
also record whether they refer to a virtual column.

Currently we only add the INPUT__FILE__NAME virtual column. The value
of this column can be set in the template tuple of the scanners.

All kinds of operations are possible on this virtual column: users can
invoke additional functions on it, filter rows by it, group by it, etc.
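
The design described above can be illustrated with a small Python model. This is only a sketch under stated assumptions, not the code in this patch: the names `Column`, `Table`, `resolve` and `scan_file` are hypothetical, and the lookup order (schema columns first, then virtual columns) and the case-insensitive matching are assumptions made for illustration. It shows virtual columns kept in a list separate from the schema, a resolver that also consults that list, and a scan that fills virtual slots once per file from a template.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, List, Optional


@dataclass(frozen=True)
class Column:
    # Hypothetical stand-in for a resolved column / slot descriptor.
    name: str
    is_virtual: bool = False


@dataclass
class Table:
    # Regular schema columns and virtual columns live in separate lists.
    columns: List[Column]
    virtual_columns: List[Column] = field(
        default_factory=lambda: [Column("INPUT__FILE__NAME", is_virtual=True)]
    )


def resolve(table: Table, name: str) -> Optional[Column]:
    """Look up `name` in the schema first, then among the virtual columns.

    The ordering and the case-insensitive match are assumptions.
    """
    wanted = name.lower()
    for col in table.columns:
        if col.name.lower() == wanted:
            return col
    for col in table.virtual_columns:
        if col.name.lower() == wanted:
            return col
    return None


def scan_file(file_name: str, rows: List[Dict], slots: List[Column]) -> Iterator[Dict]:
    """Yield one output row per input row.

    Virtual slots are filled from a per-file template, since the file name
    is the same for every row read from that file.
    """
    template = {s.name: file_name for s in slots if s.is_virtual}
    for row in rows:
        out = dict(template)
        out.update({s.name: row[s.name] for s in slots if not s.is_virtual})
        yield out
```

Because the virtual column resolves to an ordinary slot in this model, downstream operators could filter or group on it like any other column, which mirrors the behavior the commit message describes.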

Testing:
 * added e2e tests

Change-Id: I498591f1db08a91a5c846df59086d2291df4ff61
---
M be/src/exec/file-metadata-utils.cc
M be/src/exec/file-metadata-utils.h
M be/src/exec/hdfs-orc-scanner.cc
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/hdfs-scanner.cc
M be/src/exec/orc-column-readers.cc
M be/src/exec/parquet/hdfs-parquet-scanner.cc
M be/src/runtime/descriptors.cc
M be/src/runtime/descriptors.h
M common/thrift/CatalogObjects.thrift
M common/thrift/Descriptors.thrift
M fe/src/main/java/org/apache/impala/analysis/Path.java
M fe/src/main/java/org/apache/impala/analysis/SlotDescriptor.java
M fe/src/main/java/org/apache/impala/catalog/FeTable.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/IcebergTable.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
A fe/src/main/java/org/apache/impala/catalog/VirtualColumn.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalFsTable.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalIcebergTable.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalTable.java
A testdata/workloads/functional-query/queries/QueryTest/virtual-column-input-file-name-complextypes.test
A testdata/workloads/functional-query/queries/QueryTest/virtual-column-input-file-name.test
M tests/query_test/test_scanners.py
25 files changed, 558 insertions(+), 46 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/14/18514/2
--
To view, visit http://gerrit.cloudera.org:8080/18514
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I498591f1db08a91a5c846df59086d2291df4ff61
Gerrit-Change-Number: 18514
Gerrit-PatchSet: 2
Gerrit-Owner: Zoltan Borok-Nagy <borokna...@cloudera.com>
Gerrit-Reviewer: Gergely Fürnstáhl <gfurnst...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <huangquanl...@gmail.com>
Gerrit-Reviewer: Tamas Mate <tma...@apache.org>