Ramesh Mani created ATLAS-2649:
----------------------------------

             Summary: Hive Hook should create lineage entities when storage handler mechanism to create hbase tables via hive
                 Key: ATLAS-2649
                 URL: https://issues.apache.org/jira/browse/ATLAS-2649
             Project: Atlas
          Issue Type: Bug
            Reporter: Ramesh Mani

The Hive hook should create lineage entities when HBase tables are created via Hive using the storage handler mechanism.

When a Hive table is backed by HBase through Hive's HBaseStorageHandler mechanism, a corresponding HBase table is created in HBase and the data is stored in it. In this process, the Hive hook should show the Hive table as the input and the HBase table as the output.

For example:

CREATE TABLE hbase_table_emp(id int, name string, role string)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf1:name,cf1:role")
TBLPROPERTIES ("hbase.table.name" = "emp");

This creates a corresponding HBase table named emp:

hbase(main):003:0> list
TABLE
ATLAS_ENTITY_AUDIT_EVENTS
atlas_janus
emp
3 row(s)
Took 0.0127 seconds
=> ["ATLAS_ENTITY_AUDIT_EVENTS", "atlas_janus", "emp"]

hbase(main):004:0> describe 'emp'
Table emp is ENABLED
emp
COLUMN FAMILIES DESCRIPTION
{NAME => 'cf1', VERSIONS => '1', EVICT_BLOCKS_ON_CLOSE => 'false', NEW_VERSION_BEHAVIOR => 'false', KEEP_DELETED_CELLS => 'FALSE', CACHE_DATA_ON_WRITE => 'false', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', REPLICATION_SCOPE => '0', BLOOMFILTER => 'ROW', CACHE_INDEX_ON_WRITE => 'false', IN_MEMORY => 'false', CACHE_BLOOMS_ON_WRITE => 'false', PREFETCH_BLOCKS_ON_OPEN => 'false', COMPRESSION => 'NONE', BLOCKCACHE => 'true', BLOCKSIZE => '65536'}
1 row(s)
Took 0.1961 seconds

In this process, the Hive hook should provide the lineage information for the corresponding Hive table -> HBase table storage.
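
To make the requested behaviour concrete, here is a minimal sketch of the lineage the hook might emit for the CREATE TABLE statement in this report. It is an illustration only, not the Atlas hook implementation. The function name expected_lineage, the cluster name "primary", the use of the "default" Hive database and HBase namespace, the dictionary layout, the type names (hive_process, hive_table, hbase_table) and the qualifiedName formats are all assumptions made for this example.

```python
# Hypothetical sketch: shows the shape of the lineage this issue asks for.
# Nothing here is taken from the Atlas code base.

def expected_lineage(db, hive_table, tbl_properties, cluster):
    """Describe the Hive table -> HBase table lineage as a plain dict."""
    # Hive uses the Hive table name for the HBase table when
    # "hbase.table.name" is not set in TBLPROPERTIES.
    hbase_name = tbl_properties.get("hbase.table.name", hive_table)

    # An HBase table name may carry a namespace prefix ("ns:table").
    # Assumption: fall back to the "default" namespace when none is given.
    namespace, _, table = hbase_name.rpartition(":")
    namespace = namespace or "default"

    return {
        # Assumed type names and qualifiedName formats.
        "typeName": "hive_process",
        "inputs": [
            {
                "typeName": "hive_table",
                "qualifiedName": f"{db}.{hive_table}@{cluster}",
            }
        ],
        "outputs": [
            {
                "typeName": "hbase_table",
                "qualifiedName": f"{namespace}:{table}@{cluster}",
            }
        ],
    }


if __name__ == "__main__":
    lineage = expected_lineage(
        db="default",
        hive_table="hbase_table_emp",
        tbl_properties={"hbase.table.name": "emp"},
        cluster="primary",  # hypothetical cluster name
    )
    print(lineage["inputs"][0]["qualifiedName"])
    print(lineage["outputs"][0]["qualifiedName"])
```

With the values from the example above, the input would be the Hive table hbase_table_emp and the output the HBase table emp, which matches the direction of lineage described in this issue.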