Gabor Kaszab has uploaded this change for review. ( http://gerrit.cloudera.org:8080/19654 )


Change subject: IMPALA-11954: Fix for URL encoded partition columns for Iceberg 
tables
......................................................................

IMPALA-11954: Fix for URL encoded partition columns for Iceberg tables

There is a bug when an Iceberg table has a string partition column and
Impala inserts special characters into this column that need to be URL
encoded. In this case the partition name is URL encoded so that it does
not confuse the file paths for that partition. E.g. the 'b=1/2' value
is converted to 'b=1%2F2'.
This is fine for path creation. However, for Iceberg tables the same
URL encoded partition name is saved into the catalog as the partition
name, which is also used for Iceberg column stats. This leads to
incorrect results when querying the table, as the URL encoded values
are returned in a SELECT * query instead of what the user inserted.
Additionally, when adding a filter to the query, Iceberg will filter
out all the rows because it compares the non-encoded values to the URL
encoded values.
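
The mismatch described above can be illustrated with a minimal sketch.
This is not Impala's code: Python's urllib.parse.quote stands in for
Impala's own partition-name encoding (which may escape a different set
of characters), and the helper names encode_partition_value,
partition_path and filter_matches are hypothetical.

```python
from urllib.parse import quote, unquote


def encode_partition_value(value: str) -> str:
    """Percent-encode a partition value so it is safe inside a file path.

    Assumption: urllib's quote() is only a stand-in for Impala's encoder.
    safe="" makes sure '/' is encoded as well.
    """
    return quote(value, safe="")


def partition_path(column: str, value: str) -> str:
    """Build the directory-name fragment for one partition column."""
    return f"{column}={encode_partition_value(value)}"


def filter_matches(stored_value: str, predicate_value: str) -> bool:
    """Mimic an equality filter evaluated against stored partition metadata."""
    return stored_value == predicate_value


inserted = "1/2"

# The encoded form is what the file path needs.
path_fragment = partition_path("b", inserted)

# Buggy behaviour: the encoded value is also stored as the partition value.
buggy_metadata_value = encode_partition_value(inserted)

# Intended behaviour: the metadata keeps the value the user inserted.
fixed_metadata_value = inserted

print(path_fragment)                                   # b=1%2F2
print(filter_matches(buggy_metadata_value, inserted))  # False
print(filter_matches(fixed_metadata_value, inserted))  # True
```

With the encoded value stored in metadata, a filter such as b = '1/2'
compares '1/2' against '1%2F2' and matches nothing, which corresponds
to the empty results described above.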

Testing:
  - Added new tests to iceberg-partitioned-insert.test to cover this
    scenario.
  - Re-ran the existing test suite.

Change-Id: I67edc3d04738306fed0d4ebc5312f3d8d4f14254
---
M be/src/exec/hdfs-table-sink.cc
M be/src/exec/hdfs-table-sink.h
M be/src/exec/output-partition.h
M be/src/runtime/dml-exec-state.cc
M common/fbs/IcebergObjects.fbs
M fe/src/main/java/org/apache/impala/service/IcebergCatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/util/IcebergUtil.java
M testdata/workloads/functional-query/queries/QueryTest/iceberg-partitioned-insert.test
8 files changed, 199 insertions(+), 52 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/54/19654/1
--
To view, visit http://gerrit.cloudera.org:8080/19654
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I67edc3d04738306fed0d4ebc5312f3d8d4f14254
Gerrit-Change-Number: 19654
Gerrit-PatchSet: 1
Gerrit-Owner: Gabor Kaszab <[email protected]>
