Vihang Karajgaonkar has uploaded a new patch set (#4). ( http://gerrit.cloudera.org:8080/16439 )
Change subject: IMPALA-9664: Support hive replication
......................................................................

IMPALA-9664: Support hive replication

This patch makes some improvements to the INSERT event generated by
Impala. Specifically, the INSERT event will now include new file
information when Impala inserts into a table. This information can be
used by external tools like Hive Replication to replicate the changes
made by Impala in their source databases. Additionally, this patch
modifies the truncate table execution so that it uses the HMS API to
truncate the table instead of deleting the files directly on the
filesystem.

The following changes were made:
1. Fires insert events for insert overwrite.
2. Includes the names of the new files in the events. In case of
insert overwrite, this is just the list of files which were added by
the insert overwrite operation.
3. In case of ACID tables, fires the transactional notification API
for all the partitions into which data is inserted.
4. For tables which have replication enabled, the truncate table
operation now uses an HMS API to truncate the table. This is necessary
because the HMS API moves the files to a replication change manager
location if needed. Additionally, it generates ALTER_TABLE events with
the truncate flag set to true.

TODO:
1. For external tables, replication does not seem to work in the dev
environment. This will be done as a followup.

Testing:
1. Created a new test in test_event_processing.py which inserts into
managed tables which are being replicated. It makes sure that hive
replication detects the new rows which are added into the tables. The
test also exercises insert overwrite and truncate statements and makes
sure that the table is replicated correctly.

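As a rough illustration of item 2 above, the set of "new files" carried by an insert event can be thought of as the files present in a partition after the insert that were not there before. The sketch below is a standalone Python model of that idea only. It is not code from this patch: the function name new_files_per_partition, the dictionary layout, and the file names are all hypothetical, and the patch itself may compute the list differently.

```python
from typing import Dict, List, Set


def new_files_per_partition(
    before: Dict[str, Set[str]],
    after: Dict[str, Set[str]],
) -> Dict[str, List[str]]:
    """Return, per partition, the files present after an insert but not before.

    Hypothetical helper for illustration; partitions are keyed by name and
    map to the set of file names they contain.
    """
    added: Dict[str, List[str]] = {}
    for partition, files_after in after.items():
        files_before = before.get(partition, set())
        # Files that exist now but did not exist before the insert.
        new_files = sorted(files_after - files_before)
        if new_files:
            added[partition] = new_files
    return added


# Plain INSERT: existing files stay, new ones are appended.
before = {"p=1": {"a.parq"}}
after_insert = {"p=1": {"a.parq", "b.parq"}, "p=2": {"c.parq"}}
insert_event_files = new_files_per_partition(before, after_insert)

# INSERT OVERWRITE: old files are replaced, so only the added files remain.
after_overwrite = {"p=1": {"d.parq"}}
overwrite_event_files = new_files_per_partition(before, after_overwrite)
```

Under this model, an insert overwrite reports only the files it wrote, which matches the behavior described in item 2. Partitions with no added files are left out of the result.
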
Change-Id: Icaf3fe0adff755ff853960f270ceb45b11a84f0a
---
M be/src/service/client-request-state.cc
M common/thrift/CatalogService.thrift
M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/HiveStorageDescriptorFactory.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/util/AcidUtils.java
M fe/src/main/java/org/apache/impala/util/MetaStoreUtil.java
M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
M fe/src/test/java/org/apache/impala/catalog/events/SynchronousHMSEventProcessorForTests.java
M fe/src/test/resources/hive-site.xml.py
M tests/custom_cluster/test_event_processing.py
13 files changed, 792 insertions(+), 186 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/39/16439/4
--
To view, visit http://gerrit.cloudera.org:8080/16439
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Icaf3fe0adff755ff853960f270ceb45b11a84f0a
Gerrit-Change-Number: 16439
Gerrit-PatchSet: 4
Gerrit-Owner: Vihang Karajgaonkar <vih...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vih...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <borokna...@cloudera.com>