[ https://issues.apache.org/jira/browse/ATLAS-3836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Nikhil Bonte updated ATLAS-3836:
--------------------------------
    Description:
Apache Ozone is the new object store for Hadoop - [https://hadoop.apache.org/ozone/]

Apache Atlas needs to add entity types to support creation of Ozone entities.

Hive hook should also be updated to create lineage between Ozone entities and Hive tables (for EXTERNAL TABLE).

+*Approach :*+
# Refactored BaseHiveEvent.getPathEntity() -> moved to AtlasPathExtractorUtil.java
# Created PathExtractorContext.java to wrap most arguments.
# AtlasPathExtractorUtil.getPathEntity() -> accepts Path, PathExtractorContext -> returns AtlasEntityWithExtInfo
# Added a specific condition in AtlasPathExtractorUtil.getPathEntity() to handle Ozone paths -> path starts with "ofs://" or "o3fs://"
# Added UT around AtlasPathExtractorUtil.getPathEntity() -> AtlasPathExtractorUtilTest.java

+*Examples :*+
{code:sql}
-> CREATE EXTERNAL TABLE sales (id int) row format delimited fields terminated by ' ' stored as textfile location 'o3fs://bucket1.volume1.ozone1/sale1/q1/sales';
{code}
|| ||Name||Qualified Name||
|ozone_key|/sale1/q1/sales|o3fs://bucket1.volume1.ozone1/sale1/q1/sales@cm|
|ozone_bucket|bucket1|o3fs://volume1.bucket1@cm|
|ozone_volume|volume1|o3fs://volume1@cm|

{code:sql}
-> create EXTERNAL table stocks (id int) row format delimited fields terminated by ' ' stored as textfile;
-> load data inpath 'o3fs://bucket1.volume1.ozone1/stocks.txt' into table stocks;
{code}
|| ||Name||Qualified Name||
|ozone_key|/stocks.txt|o3fs://bucket1.volume1.ozone1/stocks.txt@cm|
|ozone_bucket|bucket1|o3fs://volume1.bucket1@cm|
|ozone_volume|volume1|o3fs://volume1@cm|

{code:sql}
-> create EXTERNAL table stocks_q1 (id int) row format delimited fields terminated by ' ' stored as textfile;
-> load data inpath 'o3fs://bucket1.volume1.ozone1/quarter1/stocks_q1.txt' into table stocks_q1;
{code}
|| ||Name||Qualified Name||
|ozone_key|/quarter1/stocks_q1.txt|o3fs://bucket1.volume1.ozone1/quarter1/stocks_q1.txt@cm|
|ozone_bucket|bucket1|o3fs://volume1.bucket1@cm|
|ozone_volume|volume1|o3fs://volume1@cm|

*Note:* The approach has been updated in ATLAS-3879

> Add Apache Ozone support in hive hook
> -------------------------------------
>
>                 Key: ATLAS-3836
>                 URL: https://issues.apache.org/jira/browse/ATLAS-3836
>             Project: Atlas
>          Issue Type: New Feature
>          Components: atlas-core
>    Affects Versions: 2.0.0
>            Reporter: Sarath Subramanian
>            Assignee: Nikhil Bonte
>            Priority: Major
>              Labels: hive-hooks, ozone
>             Fix For: 2.1.0
>
>         Attachments: Hive_table_lineage.png, Hive_table_lineage_load_in_path.png, Ozone_bucket.png, Ozone_key.png, Ozone_volume.png

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
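
The qualified names in the example tables of the description appear to follow a fixed pattern for "o3fs://" locations: the authority is read as bucket.volume.service, and each entity's qualified name gets an "@<cluster>" suffix. The following is a minimal sketch of that pattern only, not the code in AtlasPathExtractorUtil. The class name OzonePathSketch, the method qualifiedNames, and the clusterName parameter are hypothetical names introduced for illustration, and the cluster value "cm" is inferred from the "@cm" suffix in the tables. "ofs://" paths are not covered because the description gives no example for them, and since the approach was updated in ATLAS-3879 the actual naming may differ.

```java
import java.net.URI;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; this is NOT the Atlas API.
class OzonePathSketch {

    // Derives the three qualified names shown in the example tables
    // from an o3fs location of the form o3fs://bucket.volume.service/key.
    static Map<String, String> qualifiedNames(String location, String clusterName) {
        URI uri = URI.create(location);
        if (!"o3fs".equals(uri.getScheme()) || uri.getAuthority() == null) {
            throw new IllegalArgumentException("not an o3fs location: " + location);
        }

        // Assumption taken from the examples: authority = bucket.volume.service
        String[] parts = uri.getAuthority().split("\\.");
        if (parts.length < 2) {
            throw new IllegalArgumentException("expected bucket.volume in: " + location);
        }
        String bucket = parts[0];
        String volume = parts[1];

        String prefix = uri.getScheme() + "://";
        String suffix = "@" + clusterName;

        Map<String, String> result = new LinkedHashMap<>();
        result.put("ozone_volume", prefix + volume + suffix);
        result.put("ozone_bucket", prefix + volume + "." + bucket + suffix);
        result.put("ozone_key", location + suffix);
        return result;
    }

    public static void main(String[] args) {
        // Location taken from the first example; "cm" is the assumed cluster name.
        qualifiedNames("o3fs://bucket1.volume1.ozone1/sale1/q1/sales", "cm")
                .forEach((type, qn) -> System.out.println(type + " -> " + qn));
    }
}
```

Note that in this pattern the bucket's qualified name places the volume before the bucket (volume1.bucket1), the reverse of their order in the o3fs authority.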