[ https://issues.apache.org/jira/browse/ATLAS-3085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Chiran Ravani updated ATLAS-3085:
---------------------------------
    Description:
When Atlas HiveHook is enabled, it captures DML statements such as inserts, which in some cases do not need to be captured. A configuration property is needed to disable this functionality.

Steps to reproduce:
{code:java}
CREATE TABLE test_hive_atlas
(SSIN int,
name string)
CLUSTERED BY (SSIN) INTO 3 BUCKETS STORED AS ORC
TBLPROPERTIES('transactional'='true');
set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
set hive.support.concurrency=true;
set hive.enforce.bucketing=true;
set hive.exec.dynamic.partition.mode=nonstrict;
set hive.txn.strict.locking.mode=true;
insert into test_hive_atlas values(12398431, 'Name1');
insert into test_hive_atlas values(342198432, 'Name2');
{code}

  was:
When Atlas HiveHook is enabled, it captures DML statements, which can contain sensitive data, and an entity is created under type hive_process with the DML statement itself as its name.

Steps to reproduce:
{code:java}
CREATE TABLE test_hive_atlas
(SSIN int,
name string)
CLUSTERED BY (SSIN) INTO 3 BUCKETS STORED AS ORC
TBLPROPERTIES('transactional'='true');
set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
set hive.support.concurrency=true;
set hive.enforce.bucketing=true;
set hive.exec.dynamic.partition.mode=nonstrict;
set hive.txn.strict.locking.mode=true;
insert into test_hive_atlas values(12398431, 'Name1');
insert into test_hive_atlas values(342198432, 'Name2');
{code}
After running the above statements, an entity is created in Atlas under hive_process with the insert statement as its name. A configuration option to disable DML capture for HiveHook would help.
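For illustration only, the requested switch could look roughly like the sketch below. It is not taken from the Atlas code base: the class name DmlCaptureFilter, the property name example.hivehook.skip.dml, and the keyword-based DML check are all assumptions made up for this example, and a real hook would more likely decide from Hive's own operation metadata than from the statement text.

```java
import java.util.Locale;
import java.util.Properties;
import java.util.Set;

/** Hypothetical sketch: decides whether a hook should record a statement. */
public class DmlCaptureFilter {
    // Hypothetical property name for this sketch; not a documented Atlas setting.
    public static final String SKIP_DML_PROPERTY = "example.hivehook.skip.dml";

    // Leading keywords treated as DML in this simplified check.
    private static final Set<String> DML_KEYWORDS =
            Set.of("INSERT", "UPDATE", "DELETE", "MERGE");

    private final boolean skipDml;

    public DmlCaptureFilter(Properties config) {
        // Defaults to false, so DML is still captured unless explicitly disabled.
        this.skipDml = Boolean.parseBoolean(
                config.getProperty(SKIP_DML_PROPERTY, "false"));
    }

    /** Returns true if the first word of the statement is a DML keyword. */
    static boolean isDml(String statement) {
        String trimmed = statement.trim();
        if (trimmed.isEmpty()) {
            return false;
        }
        String firstWord = trimmed.split("\\s+", 2)[0].toUpperCase(Locale.ROOT);
        return DML_KEYWORDS.contains(firstWord);
    }

    /** Returns false only when DML capture is disabled and the statement is DML. */
    public boolean shouldCapture(String statement) {
        return !(skipDml && isDml(statement));
    }
}
```

With the hypothetical property set to true, the two insert statements from the steps above would be skipped while the CREATE TABLE statement would still be captured. Leaving the default at false keeps the current behaviour for users who do not set the property.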
> Disable capturing Hive DMLs through HiveHook
> --------------------------------------------
>
>                 Key: ATLAS-3085
>                 URL: https://issues.apache.org/jira/browse/ATLAS-3085
>             Project: Atlas
>          Issue Type: Bug
>          Components: atlas-core
>    Affects Versions: 1.1.0
>            Reporter: Chiran Ravani
>            Priority: Critical
>
> When Atlas HiveHook is enabled, it captures DML statements such as inserts,
> which in some cases do not need to be captured. A configuration property is
> needed to disable this functionality.
> Steps to reproduce:
> {code:java}
> CREATE TABLE test_hive_atlas
> (SSIN int,
> name string)
> CLUSTERED BY (SSIN) INTO 3 BUCKETS STORED AS ORC
> TBLPROPERTIES('transactional'='true');
> set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
> set hive.support.concurrency=true;
> set hive.enforce.bucketing=true;
> set hive.exec.dynamic.partition.mode=nonstrict;
> set hive.txn.strict.locking.mode=true;
> insert into test_hive_atlas values(12398431, 'Name1');
> insert into test_hive_atlas values(342198432, 'Name2');
> {code}