[ 
https://issues.apache.org/jira/browse/ATLAS-3085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chiran Ravani updated ATLAS-3085:
---------------------------------
    Description: 
When the Atlas HiveHook is enabled, it captures DML statements such as inserts, 
which in some cases do not need to be captured. We need a way to disable this 
functionality via a configuration property.

Steps to reproduce:
{code:java}
CREATE TABLE test_hive_atlas
(SSIN int,
name string)
CLUSTERED BY (SSIN) INTO 3 BUCKETS STORED AS ORC 
TBLPROPERTIES('transactional'='true');

set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
set hive.support.concurrency=true;
set hive.enforce.bucketing=true;
set hive.exec.dynamic.partition.mode=nonstrict;
set hive.txn.strict.locking.mode=true;

insert into test_hive_atlas values(12398431, 'Name1');
insert into test_hive_atlas values(342198432, 'Name2');{code}
 

  was:
When the Atlas HiveHook is enabled, it captures DML statements, which can 
contain sensitive data, and an entity is created under type hive_process with 
the DML statement itself as its name.

Steps to reproduce:
{code:java}
CREATE TABLE test_hive_atlas
(SSIN int,
name string)
CLUSTERED BY (SSIN) INTO 3 BUCKETS STORED AS ORC 
TBLPROPERTIES('transactional'='true');

set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
set hive.support.concurrency=true;
set hive.enforce.bucketing=true;
set hive.exec.dynamic.partition.mode=nonstrict;
set hive.txn.strict.locking.mode=true;

insert into test_hive_atlas values(12398431, 'Name1');
insert into test_hive_atlas values(342198432, 'Name2');{code}
 

After running the above statements, an entity is created in Atlas under 
hive_process with the insert statement as its name. A configuration option to 
disable DML capture in HiveHook would help.


> Disable capturing Hive DMLs through HiveHook
> --------------------------------------------
>
>                 Key: ATLAS-3085
>                 URL: https://issues.apache.org/jira/browse/ATLAS-3085
>             Project: Atlas
>          Issue Type: Bug
>          Components:  atlas-core
>    Affects Versions: 1.1.0
>            Reporter: Chiran Ravani
>            Priority: Critical
>
> When the Atlas HiveHook is enabled, it captures DML statements such as inserts, 
> which in some cases do not need to be captured. We need a way to disable this 
> functionality via a configuration property.
> Steps to reproduce:
> {code:java}
> CREATE TABLE test_hive_atlas
> (SSIN int,
> name string)
> CLUSTERED BY (SSIN) INTO 3 BUCKETS STORED AS ORC 
> TBLPROPERTIES('transactional'='true');
> set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
> set hive.support.concurrency=true;
> set hive.enforce.bucketing=true;
> set hive.exec.dynamic.partition.mode=nonstrict;
> set hive.txn.strict.locking.mode=true;
> insert into test_hive_atlas values(12398431, 'Name1');
> insert into test_hive_atlas values(342198432, 'Name2');{code}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
