-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69631/
-----------------------------------------------------------

(Updated Dec. 26, 2018, 10:04 p.m.)


Review request for atlas, Abhishek Kadam, Ayub Pathan, Ashutosh Mestry, keval 
bhatt, Kapildeo Nayak, Abhay Kulkarni, Nixon Rodrigues, Sarath Subramanian, and 
Sharmadha Sainath.


Changes
-------

- added caching of hive_table ignore/prune state
- updated to use the following Atlas server configurations:
. atlas.notification.consumer.preprocess.hive_table.ignore.pattern
. atlas.notification.consumer.preprocess.hive_table.prune.pattern
. atlas.notification.consumer.preprocess.hive_table.cache.size
- updated to use the following Hive hook configurations:
. atlas.hive.hook.hive_table.ignore.pattern
. atlas.hive.hook.hive_table.prune.pattern
. atlas.hive.hook.hive_table.cache.size
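
A minimal sketch of what a bounded cache of per-table ignore/prune state could 
look like. This is an illustration only, not the code in the patch: the class 
name TableStateCache, the compute callback and the least-recently-used eviction 
policy are all assumptions, and Python is used for brevity even though the 
patch itself is Java. The max_size argument plays the role of the 
hive_table.cache.size setting listed above.

```python
from collections import OrderedDict


class TableStateCache:
    """Hypothetical bounded cache: table qualified name -> ignore/prune state."""

    def __init__(self, max_size):
        self._max_size = max_size
        self._entries = OrderedDict()

    def get(self, qualified_name, compute):
        """Return the cached state, calling compute(qualified_name) on a miss."""
        if qualified_name in self._entries:
            # Mark the entry as most recently used.
            self._entries.move_to_end(qualified_name)
            return self._entries[qualified_name]

        state = compute(qualified_name)
        self._entries[qualified_name] = state

        # Evict the least recently used entry once the size limit is exceeded
        # (LRU is an assumed policy, not one stated in the review).
        if len(self._entries) > self._max_size:
            self._entries.popitem(last=False)

        return state
```

The point of such a cache is that the regular-expression checks run once per 
table rather than once per notification that mentions the table.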


Bugs: ATLAS-3006
    https://issues.apache.org/jira/browse/ATLAS-3006


Repository: atlas


Description
-------

Introduced the following configurations to identify temporary/staging Hive 
tables, so that the Hive hook and the Atlas server can ignore or prune these 
tables. For pruned tables, columns and column-lineage details are ignored.

  # configurations for Hive hook
  atlas.hook.hive.ignore.hive_table.pattern=
  atlas.hook.hive.prune.hive_table.pattern=

  # configurations for Atlas server
  atlas.notification.consumer.ignore.hive_table.pattern=
  atlas.notification.consumer.prune.hive_table.pattern=


Appropriate use of these configurations can avoid loading Atlas with 
unnecessary metadata of transient tables.
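
As a rough illustration of how such patterns might be applied, the sketch 
below classifies table names using the pattern values from the Testing section. 
It is not the code in the patch, and it rests on several assumptions: each 
property value is a comma-separated list of regular expressions, a pattern must 
match the whole name, names have the form db.table@cluster (suggested by the 
'@' in '.*_stg@.*'), and ignore takes precedence over prune. The function 
names and the cluster name cl1 are hypothetical, and Python is used for brevity 
even though the patch itself is Java.

```python
import re


def compile_patterns(csv):
    """Split a comma-separated property value into compiled regexes."""
    return [re.compile(p.strip()) for p in csv.split(",") if p.strip()]


def classify(qualified_name, ignore, prune):
    """Return 'IGNORE', 'PRUNE' or 'NONE' for a db.table@cluster name."""
    # Assumed precedence: an ignored table is never merely pruned.
    if any(p.fullmatch(qualified_name) for p in ignore):
        return "IGNORE"
    if any(p.fullmatch(qualified_name) for p in prune):
        return "PRUNE"
    return "NONE"


IGNORE = compile_patterns(r"temp\..*,test\..*")
PRUNE = compile_patterns(r"staging\..*,.*_stg@.*")

for name in ("temp.temptable@cl1",
             "staging.stagingview@cl1",
             "prod.mytable_stg@cl1",
             "prod.prodtable@cl1"):
    print(name, classify(name, IGNORE, PRUNE))
```

Splitting on commas means a pattern cannot itself contain a comma (for example 
a quantifier such as {1,2}) in this sketch.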


Diffs (updated)
-----

  
  addons/hive-bridge/src/main/java/org/apache/atlas/hive/hook/AtlasHiveHookContext.java 23cb853ca
  addons/hive-bridge/src/main/java/org/apache/atlas/hive/hook/HiveHook.java 7b6055387
  addons/hive-bridge/src/main/java/org/apache/atlas/hive/hook/events/AlterTableRename.java 6ced340b2
  addons/hive-bridge/src/main/java/org/apache/atlas/hive/hook/events/BaseHiveEvent.java 2d90a1560
  addons/hive-bridge/src/main/java/org/apache/atlas/hive/hook/events/CreateTable.java 2afaf9f4f
  webapp/src/main/java/org/apache/atlas/notification/NotificationHookConsumer.java 2d2a6fb95
  webapp/src/main/java/org/apache/atlas/notification/preprocessor/EntityPreprocessor.java PRE-CREATION
  webapp/src/main/java/org/apache/atlas/notification/preprocessor/HivePreprocessor.java PRE-CREATION
  webapp/src/main/java/org/apache/atlas/notification/preprocessor/PreprocessorContext.java PRE-CREATION


Diff: https://reviews.apache.org/r/69631/diff/2/

Changes: https://reviews.apache.org/r/69631/diff/1-2/


Testing
-------

Verified with the following configuration and script that the Hive hook and 
the Atlas server ignore/prune metadata for the specified Hive tables:

# configurations for Hive hook
atlas.hook.hive.ignore.hive_table.pattern=temp\..*,test\..*
atlas.hook.hive.prune.hive_table.pattern=staging\..*,.*_stg@.*


# configurations for Atlas server
atlas.notification.consumer.ignore.hive_table.pattern=temp\..*,test\..*
atlas.notification.consumer.prune.hive_table.pattern=staging\..*,.*_stg@.*

CREATE DATABASE IF NOT EXISTS test;
CREATE DATABASE IF NOT EXISTS temp;
CREATE DATABASE IF NOT EXISTS staging;
CREATE DATABASE IF NOT EXISTS prod;

DROP VIEW  IF EXISTS test.testView;
DROP TABLE IF EXISTS test.testTable;

DROP VIEW  IF EXISTS temp.tempView;
DROP TABLE IF EXISTS temp.tempTable;

DROP VIEW  IF EXISTS staging.stagingView;
DROP TABLE IF EXISTS staging.stagingTable;

DROP VIEW  IF EXISTS prod.prodView;
DROP TABLE IF EXISTS prod.prodTable;
DROP TABLE IF EXISTS prod.prodSourceTable;

DROP VIEW  IF EXISTS prod.myView_stg;
DROP TABLE IF EXISTS prod.myTable_stg;

CREATE TABLE test.testTable(id INT, name STRING);
CREATE VIEW  test.testView AS SELECT * FROM test.testTable;

CREATE TABLE temp.tempTable(id INT, name STRING);
CREATE VIEW  temp.tempView AS SELECT * FROM temp.tempTable;

CREATE TABLE staging.stagingTable(id INT, name STRING);
CREATE VIEW  staging.stagingView AS SELECT * FROM staging.stagingTable;

CREATE TABLE prod.prodSourceTable(id INT, name STRING);
CREATE TABLE prod.prodTable(id INT, name STRING);
CREATE VIEW  prod.prodView AS SELECT * FROM prod.prodTable;

CREATE TABLE prod.myTable_stg(id INT, name STRING);
CREATE VIEW  prod.myView_stg AS SELECT * FROM prod.prodTable;

INSERT INTO TABLE prod.prodTable SELECT * FROM staging.stagingTable;
INSERT INTO TABLE prod.prodTable SELECT * FROM temp.tempTable;
INSERT INTO TABLE prod.prodTable SELECT * FROM prod.myView_stg;
INSERT INTO TABLE prod.prodTable SELECT * FROM staging.stagingView;
INSERT INTO TABLE prod.prodTable SELECT * FROM prod.prodSourceTable;


Thanks,

Madhan Neethiraj
