-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69631/
-----------------------------------------------------------
(Updated Dec. 26, 2018, 10:04 p.m.)


Review request for atlas, Abhishek Kadam, Ayub Pathan, Ashutosh Mestry, keval bhatt, Kapildeo Nayak, Abhay Kulkarni, Nixon Rodrigues, Sarath Subramanian, and Sharmadha Sainath.


Changes
-------

- added caching of hive_table ignore/prune state
- updated to use the following Atlas server configuration:
  . atlas.notification.consumer.preprocess.hive_table.ignore.pattern
  . atlas.notification.consumer.preprocess.hive_table.prune.pattern
  . atlas.notification.consumer.preprocess.hive_table.cache.size
- updated to use the following Hive hook configuration:
  . atlas.hive.hook.hive_table.ignore.pattern
  . atlas.hive.hook.hive_table.prune.pattern
  . atlas.hive.hook.hive_table.cache.size


Bugs: ATLAS-3006
    https://issues.apache.org/jira/browse/ATLAS-3006


Repository: atlas


Description
-------

Introduced the following configurations to identify temporary/staging Hive tables, so that the Hive hook and the Atlas server can ignore or prune these tables. For pruned tables, columns and column-lineage details will be ignored.

# configurations for Hive hook
atlas.hook.hive.ignore.hive_table.pattern=
atlas.hook.hive.prune.hive_table.pattern=

# configurations for Atlas server
atlas.notification.consumer.ignore.hive_table.pattern=
atlas.notification.consumer.prune.hive_table.pattern=

Appropriate use of these configurations can avoid loading Atlas with unnecessary metadata of transient tables.
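
As an illustration of how such pattern lists might be evaluated, here is a minimal Python sketch. It is not the code in this patch. It assumes that each property holds a comma-separated list of regular expressions, that a pattern must match the whole table qualified name (of the form db.table@cluster), that ignore takes precedence over prune, and that the cache-size setting bounds a simple LRU cache. The names `classify` and `compile_patterns`, the cluster name `cl1`, and the cache size value are hypothetical.

```python
import re
from functools import lru_cache

# Hypothetical example values; each is a comma-separated list of regular expressions.
IGNORE_PATTERNS = r"temp\..*,test\..*"
PRUNE_PATTERNS = r"staging\..*,.*_stg@.*"
CACHE_SIZE = 10000  # stands in for the cache.size setting; the value is arbitrary


def compile_patterns(csv):
    """Split a comma-separated pattern list and compile each non-empty entry."""
    return [re.compile(p.strip()) for p in csv.split(",") if p.strip()]


_IGNORE = compile_patterns(IGNORE_PATTERNS)
_PRUNE = compile_patterns(PRUNE_PATTERNS)


@lru_cache(maxsize=CACHE_SIZE)
def classify(qualified_name):
    """Return 'ignore', 'prune', or 'keep' for a hive_table qualified name.

    Assumptions: the whole name must match a pattern, and ignore wins over prune.
    Results are cached so repeated notifications for a table skip the regex scan.
    """
    if any(p.fullmatch(qualified_name) for p in _IGNORE):
        return "ignore"
    if any(p.fullmatch(qualified_name) for p in _PRUNE):
        return "prune"
    return "keep"


if __name__ == "__main__":
    for name in (
        "temp.temptable@cl1",
        "staging.stagingtable@cl1",
        "prod.mytable_stg@cl1",
        "prod.prodtable@cl1",
    ):
        print(name, "->", classify(name))
```

Caching the per-table decision keeps the cost of repeated notifications for the same table to a dictionary lookup, at the price of holding up to CACHE_SIZE names in memory.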


Diffs (updated)
-----

  addons/hive-bridge/src/main/java/org/apache/atlas/hive/hook/AtlasHiveHookContext.java 23cb853ca
  addons/hive-bridge/src/main/java/org/apache/atlas/hive/hook/HiveHook.java 7b6055387
  addons/hive-bridge/src/main/java/org/apache/atlas/hive/hook/events/AlterTableRename.java 6ced340b2
  addons/hive-bridge/src/main/java/org/apache/atlas/hive/hook/events/BaseHiveEvent.java 2d90a1560
  addons/hive-bridge/src/main/java/org/apache/atlas/hive/hook/events/CreateTable.java 2afaf9f4f
  webapp/src/main/java/org/apache/atlas/notification/NotificationHookConsumer.java 2d2a6fb95
  webapp/src/main/java/org/apache/atlas/notification/preprocessor/EntityPreprocessor.java PRE-CREATION
  webapp/src/main/java/org/apache/atlas/notification/preprocessor/HivePreprocessor.java PRE-CREATION
  webapp/src/main/java/org/apache/atlas/notification/preprocessor/PreprocessorContext.java PRE-CREATION


Diff: https://reviews.apache.org/r/69631/diff/2/

Changes: https://reviews.apache.org/r/69631/diff/1-2/


Testing
-------

Verified with the following configuration and script that the Hive hook and the Atlas server ignore/prune metadata for the specified Hive tables:

# configurations for Hive hook
atlas.hook.hive.ignore.hive_table.pattern=temp\..*,test\..*
atlas.hook.hive.prune.hive_table.pattern=staging\..*,.*_stg@.*

# configurations for Atlas server
atlas.notification.consumer.ignore.hive_table.pattern=temp\..*,test\..*
atlas.notification.consumer.prune.hive_table.pattern=staging\..*,.*_stg@.*

CREATE DATABASE IF NOT EXISTS test;
CREATE DATABASE IF NOT EXISTS temp;
CREATE DATABASE IF NOT EXISTS staging;
CREATE DATABASE IF NOT EXISTS prod;

DROP VIEW IF EXISTS test.testView;
DROP TABLE IF EXISTS test.testTable;
DROP VIEW IF EXISTS temp.tempView;
DROP TABLE IF EXISTS temp.tempTable;
DROP VIEW IF EXISTS staging.stagingView;
DROP TABLE IF EXISTS staging.stagingTable;
DROP VIEW IF EXISTS prod.prodView;
DROP TABLE IF EXISTS prod.prodTable;
DROP TABLE IF EXISTS prod.prodSourceTable;
DROP VIEW IF EXISTS prod.myView_stg;
DROP TABLE IF EXISTS prod.myTable_stg;

CREATE TABLE test.testTable(id INT, name STRING);
CREATE VIEW test.testView AS SELECT * FROM test.testTable;
CREATE TABLE temp.tempTable(id INT, name STRING);
CREATE VIEW temp.tempView AS SELECT * FROM temp.tempTable;
CREATE TABLE staging.stagingTable(id INT, name STRING);
CREATE VIEW staging.stagingView AS SELECT * FROM staging.stagingTable;
CREATE TABLE prod.prodSourceTable(id INT, name STRING);
CREATE TABLE prod.prodTable(id INT, name STRING);
CREATE VIEW prod.prodView AS SELECT * FROM prod.prodTable;
CREATE TABLE prod.myTable_stg(id INT, name STRING);
CREATE VIEW prod.myView_stg AS SELECT * FROM prod.prodTable;

INSERT INTO TABLE prod.prodTable SELECT * FROM staging.stagingTable;
INSERT INTO TABLE prod.prodTable SELECT * FROM temp.tempTable;
INSERT INTO TABLE prod.prodTable SELECT * FROM prod.myView_stg;
INSERT INTO TABLE prod.prodTable SELECT * FROM staging.stagingView;
INSERT INTO TABLE prod.prodTable SELECT * FROM prod.prodSourceTable;


Thanks,

Madhan Neethiraj