[ https://issues.apache.org/jira/browse/ATLAS-1948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16086851#comment-16086851 ]
Sharmadha Sainath commented on ATLAS-1948:
------------------------------------------

[~ashutoshm],

Hive commands:
> create database database1;
> create database database2;
> create table database1.table1(id int,name string);
> create table database2.table2 as select * from database1.table1;

Export command:
{code}
curl -v -X POST -u admin:admin -H "Content-Type: application/json" -H "Cache-Control: no-cache" -d @db1tb1.json "http://host1:21000/api/atlas/admin/export" > db1tb1.zip
{code}

db1tb1.json contents:
{code}
{
    "itemsToExport": [
        { "typeName": "hive_table", "uniqueAttributes": { "qualifiedName": "database1.table1@cl1" } }
    ],
    "options": {
        "fetchType": "full"
    }
}
{code}

Zip file after export: [^db1tb1.zip]

Import command:
{code}
curl -v -X POST -u admin:admin -H "Content-Type: multipart/form-data" -H "Cache-Control: no-cache" -F data=@db1tb1.zip "http://host2:21000/api/atlas/admin/import" -F request=@tabletransform.json
{code}

tabletransform.json file:
{code}
{
    "options": {
        "transforms": "{ \"hive_column\": { \"qualifiedName\": [ \"replace:cl1:cl2\" ] }}"
    }
}
{code}

Result:
{code}
{"errorCode":"ATLAS-500-00-001","errorMessage":"org.apache.atlas.exception.AtlasBaseException: ObjectId is not valid AtlasObjectId{guid='a6954b5a-a5c4-4e30-bdf5-7b3408842bfa', typeName='hive_column', uniqueAttributes={}}"}
{code}

> Importing hive_table in a database which is a CTAS of another table in
> different database throws exception due to export order.
> -------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: ATLAS-1948
>                 URL: https://issues.apache.org/jira/browse/ATLAS-1948
>             Project: Atlas
>          Issue Type: Bug
>          Components: atlas-core
>    Affects Versions: 0.9-incubating
>            Reporter: Sharmadha Sainath
>            Assignee: Ashutosh Mestry
>            Priority: Critical
>             Fix For: 0.9-incubating
>
>         Attachments: db1tb1.zip, ImportTransformsErrorOnCTASonDiffDB.txt
>
>
> 1. Created 2 databases, db1 and db2, in cluster1.
> 2. Created 2 tables:
>     1. db1.t1
>     2. db2.t2 as select * from db1.t1
> 3. Exported db1.t1 into a zip file.
> 4. Imported the zip file into cluster2 with the transforms option:
> {code}
> {
>   "options": {
>     "transforms": "{ \"hive_column\": { \"qualifiedName\": [ \"replace:cl1:cl2\" ]} }"
>   }
> }
> {code}
> 5. Import fails with
> {code}
> {"errorCode":"ATLAS-500-00-001","errorMessage":"org.apache.atlas.exception.AtlasBaseException: ObjectId is not valid AtlasObjectId{guid='51c77c1e-265e-46ab-bbb5-5316cf80a53c', typeName='hive_column', uniqueAttributes={}}"}
> {code}
> Only db1.t1 is imported into Atlas, without any lineage.
> Attached the exception stack trace.
> After this, exporting db2.t2 and importing it completes successfully.
> That is, the first import, of either db1.t1 or db2.t2, fails with the
> exception. The next import is successful.
> The exception *doesn't* happen, and the tables are imported successfully, if
> both tables are in a single database. The export order if the tables are in
> the same db is:
> 1. table1
> 2. db
> 3. table2
> 4. hive_process
> 5. hive_column_lineage
> If the tables are in different dbs, the order is:
> 1. table1
> 2. db1
> 3. hive_process
> 4. hive_column_lineage
> 5. ctas table
> 6. db2
> which is possibly causing the issue.
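
The two export orders listed above can be replayed in a small simulation. This is only a sketch under stated assumptions, not Atlas code: it assumes the importer rejects an item as soon as one of its references has not been imported yet, that columns arrive together with their table, and that hive_column_lineage refers to columns of both table1 and the CTAS table (called table2 here). The names REFERENCES and first_unresolved are hypothetical.

```python
# Hypothetical reference map: which earlier items an exported item needs.
# Only the hive_column_lineage -> column references from the report are
# modeled; columns are assumed to be imported together with their table.
REFERENCES = {
    "hive_column_lineage": ["table1", "table2"],
}


def first_unresolved(order):
    """Return (item, missing_reference) for the first item whose reference
    has not been imported yet, or None if every item resolves."""
    imported = set()
    for item in order:
        for ref in REFERENCES.get(item, []):
            if ref not in imported:
                return item, ref
        imported.add(item)
    return None


# Export orders as listed in the report (table2 is the CTAS table).
SAME_DB_ORDER = ["table1", "db", "table2", "hive_process", "hive_column_lineage"]
DIFFERENT_DB_ORDER = [
    "table1", "db1", "hive_process", "hive_column_lineage", "table2", "db2",
]

print(first_unresolved(SAME_DB_ORDER))       # None
print(first_unresolved(DIFFERENT_DB_ORDER))  # ('hive_column_lineage', 'table2')
```

Under these assumptions the same-db order resolves cleanly, while the different-db order stops at hive_column_lineage because the CTAS table has not been imported yet.
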
> When cluster2 starts importing, it imports table1 and db1, and when it comes
> to hive_column_lineage, it finds that the column specified in
> hive_column_lineage is not in cluster2 yet, since the ctas table comes after
> hive_column_lineage in the import order, and it throws "ObjectId is not valid
> AtlasObjectId{guid='51c77c1e-265e-46ab-bbb5-5316cf80a53c',
> typeName='hive_column'".
> Thanks [~ayubkhan] for the analysis.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)