Daniel Becker has uploaded this change for review. ( http://gerrit.cloudera.org:8080/23166 )

Change subject: IMPALA-14224: Cleanup subdirectories in TRUNCATE
......................................................................

IMPALA-14224: Cleanup subdirectories in TRUNCATE

If an external table contains data files in subdirectories, and
recursive listing is enabled, Impala considers the files in the
subdirectories as part of the table. However, currently INSERT OVERWRITE
and TRUNCATE do not always delete these files, leading to data
corruption.

This change takes care of TRUNCATE.

Currently TRUNCATE can be run in two different ways:
 - if the table is being replicated, the HMS api is used
 - otherwise catalogd deletes the files itself.

Two differences between these methods are:
 - calling HMS leads to an ALTER_TABLE event
 - calling HMS leads to recursive delete while catalogd only deletes
   files directly in the partition/table directory.

This commit introduces the '--truncate_external_tables_with_hms' startup
flag, with default value 'true'. If this flag is set to true, Impala
always uses the HMS api for TRUNCATE operations.

Note that HMS always deletes stats on TRUNCATE, so setting the
DELETE_STATS_IN_TRUNCATE query option to false is not supported if
'--truncate_external_tables_with_hms' is set to true: an exception is
thrown.

Testing:
 - extended the tests in test_recursive_listing.py::TestRecursiveListing
   to include TRUNCATE
 - Moved tests with DELETE_STATS_IN_TRUNCATE=0 from truncate-table.test
   to truncate-table-no-delete-stats.test, which is run in a new custom
   cluster test (custom_cluster/test_no_delete_stats_in_truncate.py).

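
For reviewers, the dispatch rule described in the commit message can be sketched roughly as below. This is only an illustration in Python, not the code in CatalogOpExecutor.java: the names TruncateRequest, TruncateNotSupportedError and choose_truncate_path are hypothetical, and the sketch assumes that the only inputs are the startup flag, whether the table is replicated, and the DELETE_STATS_IN_TRUNCATE query option. It does not model how DELETE_STATS_IN_TRUNCATE=false behaves on the replicated path when the flag is off, since the message does not say.

```python
from dataclasses import dataclass


class TruncateNotSupportedError(Exception):
    """Hypothetical error type standing in for the exception the message mentions."""


@dataclass(frozen=True)
class TruncateRequest:
    """Hypothetical container for the inputs named in the commit message."""
    is_replicated: bool          # table is being replicated
    delete_stats: bool = True    # DELETE_STATS_IN_TRUNCATE query option


def choose_truncate_path(req: TruncateRequest,
                         truncate_external_tables_with_hms: bool = True) -> str:
    """Return "hms" or "catalogd" following the rules in the commit message."""
    if truncate_external_tables_with_hms:
        # HMS always deletes stats, so keeping stats cannot be honoured here.
        if not req.delete_stats:
            raise TruncateNotSupportedError(
                "DELETE_STATS_IN_TRUNCATE=false is not supported when "
                "--truncate_external_tables_with_hms is true")
        return "hms"
    # Flag disabled: previous behaviour, HMS only for replicated tables.
    if req.is_replicated:
        return "hms"
    return "catalogd"
```

With the default flag value every TRUNCATE takes the HMS path, which is what gives the recursive delete of subdirectories; turning the flag off falls back to the earlier split between replicated and non-replicated tables.
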
Change-Id: Ic0fcc6cf1eca8a0bcf2f93dbb61240da05e35519
---
M be/src/catalog/catalog-server.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
A testdata/workloads/functional-query/queries/QueryTest/truncate-table-no-delete-stats.test
M testdata/workloads/functional-query/queries/QueryTest/truncate-table.test
M tests/custom_cluster/test_events_custom_configs.py
A tests/custom_cluster/test_no_delete_stats_in_truncate.py
M tests/metadata/test_ddl.py
M tests/metadata/test_recursive_listing.py
11 files changed, 213 insertions(+), 119 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/66/23166/5
--
To view, visit http://gerrit.cloudera.org:8080/23166
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ic0fcc6cf1eca8a0bcf2f93dbb61240da05e35519
Gerrit-Change-Number: 23166
Gerrit-PatchSet: 5
Gerrit-Owner: Daniel Becker <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
