[ https://issues.apache.org/jira/browse/ATLAS-4595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Dharshana M Krishnamoorthy updated ATLAS-4595:
----------------------------------------------
    Description:
Scenario:
Use --filename in the import script along with --output so that the v2 API is invoked.
Eg:
{code:java}
export JAVA_HOME=/usr/java/default;
/opt/cloudera/parcels/CDH/lib/atlas/hook-bin/import-hive.sh --filename /tmp/file_tejqc.txt --output /tmp/db_okgbi.zip
{code}
Steps:
# Create 2 databases, db_1 and db_2
# Create 2 tables under each database
# Run the import using a file that contains the name of database db_1

The import succeeded, but the entities are not reflected in Atlas.
{code:java}
2022-04-28 10:50:52,693|INFO|MainThread|machine.py:185 - run()||GUID=003c431b-4087-4990-9d50-d763ef06c51a|RUNNING: ssh -l root -i /tmp/hw-qe-keypair.pem -q -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null quasar-jagdkt-5.quasar-jagdkt.root.hwx.site "sudo -u root sh -c 'export JAVA_HOME=/usr/java/default; /opt/cloudera/parcels/CDH/lib/atlas/hook-bin/import-hive.sh --filename /tmp/file_tejqc.txt --output /tmp/db_okgbi.zip'"
2022-04-28 10:50:52,957|INFO|MainThread|machine.py:200 - run()||GUID=003c431b-4087-4990-9d50-d763ef06c51a|Using Hive configuration directory [/etc/hive/conf]
2022-04-28 10:50:53,152|INFO|MainThread|machine.py:200 - run()||GUID=003c431b-4087-4990-9d50-d763ef06c51a|/etc/hive/conf:/opt/cloudera/parcels/CDH-7.1.8-1.cdh7.1.8.p0.25947682/lib/hadoop/libexec/../../hadoop/lib/*:/opt/cloudera/parcels/CDH-7.1.8-1.cdh7.1.8.p0.25947682/lib/hadoop/libexec/../../hadoop/.//*:/opt/cloudera/parcels/CDH-7.1.8-1.cdh7.1.8.p0.25947682/lib/hadoop/libexec/../../hadoop-hdfs/./:/opt/cloudera/parcels/CDH-7.1.8-1.cdh7.1.8.p0.25947682/lib/hadoop/libexec/../../hadoop-hdfs/lib/*:/opt/cloudera/parcels/CDH-7.1.8-1.cdh7.1.8.p0.25947682/lib/hadoop/libexec/../../hadoop-hdfs/.//*:/opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/.//*:/opt/cloudera/parcels/CDH-7.1.8-1.cdh7.1.8.p0.25947682/lib/hadoop/libexec/../../hadoop-yarn/./:/opt/cloudera/parcels/CDH-7.1.8-1.cdh7.1.8.p0.25947682/lib/hadoop/libexec/../../hadoop-yarn/lib/*:/opt/cloudera/parcels/CDH-7.1.8-1.cdh7.1.8.p0.25947682/lib/hadoop/libexec/../../hadoop-yarn/.//*
2022-04-28 10:50:53,152|INFO|MainThread|machine.py:200 - run()||GUID=003c431b-4087-4990-9d50-d763ef06c51a|Log file for import is /var/log/atlas/import-hive.log
2022-04-28 10:50:55,328|INFO|MainThread|machine.py:200 - run()||GUID=003c431b-4087-4990-9d50-d763ef06c51a|log4j:WARN No such property [maxFileSize] in org.apache.log4j.PatternLayout.
2022-04-28 10:50:55,329|INFO|MainThread|machine.py:200 - run()||GUID=003c431b-4087-4990-9d50-d763ef06c51a|log4j:WARN No such property [maxBackupIndex] in org.apache.log4j.PatternLayout.
2022-04-28 10:51:18,889|INFO|MainThread|machine.py:200 - run()||GUID=003c431b-4087-4990-9d50-d763ef06c51a|WARNING: An illegal reflective access operation has occurred
2022-04-28 10:51:18,890|INFO|MainThread|machine.py:200 - run()||GUID=003c431b-4087-4990-9d50-d763ef06c51a|WARNING: Illegal reflective access by org.apache.hadoop.hive.common.StringInternUtils (file:/opt/cloudera/parcels/CDH-7.1.8-1.cdh7.1.8.p0.25947682/jars/hive-exec-3.1.3000.7.1.8.0-581.jar) to field java.net.URI.string
2022-04-28 10:51:18,890|INFO|MainThread|machine.py:200 - run()||GUID=003c431b-4087-4990-9d50-d763ef06c51a|WARNING: Please consider reporting this to the maintainers of org.apache.hadoop.hive.common.StringInternUtils
2022-04-28 10:51:18,890|INFO|MainThread|machine.py:200 - run()||GUID=003c431b-4087-4990-9d50-d763ef06c51a|WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
2022-04-28 10:51:18,891|INFO|MainThread|machine.py:200 - run()||GUID=003c431b-4087-4990-9d50-d763ef06c51a|WARNING: All illegal access operations will be denied in a future release
2022-04-28 10:51:20,824|INFO|MainThread|machine.py:200 - run()||GUID=003c431b-4087-4990-9d50-d763ef06c51a|Hive Meta Data imported successfully!
2022-04-28 10:51:20,850|INFO|MainThread|machine.py:227 - run()||GUID=003c431b-4087-4990-9d50-d763ef06c51a|Exit Code: 0
{code}

Additional details: content of file_tejqc.txt
{code:java}
cat /tmp/file_tejqc.txt
db_hive_db_dumeh
{code}
Tables in the db:
{code:java}
0: jdbc:hive2://quasar-jagdkt-1.quasar-jagdkt> use db_hive_db_dumeh;
INFO : Compiling command(queryId=hive_20220428115112_876a9b0b-a19c-4ee6-b827-c777e4398463): use db_hive_db_dumeh
INFO : Semantic Analysis Completed (retrial = false)
INFO : Created Hive schema: Schema(fieldSchemas:null, properties:null)
INFO : Completed compiling command(queryId=hive_20220428115112_876a9b0b-a19c-4ee6-b827-c777e4398463); Time taken: 0.016 seconds
INFO : Executing command(queryId=hive_20220428115112_876a9b0b-a19c-4ee6-b827-c777e4398463): use db_hive_db_dumeh
INFO : Starting task [Stage-0:DDL] in serial mode
INFO : Completed executing command(queryId=hive_20220428115112_876a9b0b-a19c-4ee6-b827-c777e4398463); Time taken: 0.007 seconds
INFO : OK
No rows affected (0.036 seconds)
0: jdbc:hive2://quasar-jagdkt-1.quasar-jagdkt> show tables;
INFO : Compiling command(queryId=hive_20220428115116_01b637f7-7869-49a8-95df-255eb6be7314): show tables
INFO : Semantic Analysis Completed (retrial = false)
INFO : Created Hive schema: Schema(fieldSchemas:[FieldSchema(name:tab_name, type:string, comment:from deserializer)], properties:null)
INFO : Completed compiling command(queryId=hive_20220428115116_01b637f7-7869-49a8-95df-255eb6be7314); Time taken: 0.134 seconds
INFO : Executing command(queryId=hive_20220428115116_01b637f7-7869-49a8-95df-255eb6be7314): show tables
INFO : Starting task [Stage-0:DDL] in serial mode
INFO : Completed executing command(queryId=hive_20220428115116_01b637f7-7869-49a8-95df-255eb6be7314); Time taken: 0.015 seconds
INFO : OK
+-----------+
| tab_name  |
+-----------+
| table_1   |
| table_2   |
+-----------+
2 rows selected (0.688 seconds)
{code}

  was:
Scenario:
use --filename in the import script along with --output so
that the v2 API is invoked
Eg:
{code:java}
/opt/cloudera/parcels/CDH/lib/atlas/hook-bin/import-hive.sh --filename /tmp/file_hqavs.txt --output /tmp/db_axmqv.zip
{code}
There is a delay of a few seconds before the import is reflected in Atlas.

Steps:
# Create 2 databases, db_1 and db_2
# Run the import using a file that lists tables belonging to db_1

When a search is performed immediately after the import, the data is not reflected in Atlas; if we wait 5 seconds and search again, the data is reflected.

This does not happen in the following scenarios:
# when the v1 API is used
# when the v2 API is used with a database name
# when the v2 API is used with a table name

*It happens only when the v2 API is used along with a file name*

This is not a blocker bug, as the data is eventually reflected in Atlas. The issue was created to find out why this happens only when a file name is used with the v2 API.

    Summary: [Hive import v2]When using file name to import via v2 api, the entities are not reflected in atlas though the import is successful  (was: [Hive import v2] [Performance]When using file name to import via v2 api, there is some delay before the entities are reflected in atlas)

> [Hive import v2]When using file name to import via v2 api, the entities are not reflected in atlas though the import is successful
> ----------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: ATLAS-4595
>                 URL: https://issues.apache.org/jira/browse/ATLAS-4595
>             Project: Atlas
>          Issue Type: Bug
>          Components:  atlas-core
>            Reporter: Dharshana M Krishnamoorthy
>            Priority: Major
>

--
This message was sent by Atlassian Jira
(v8.20.7#820007)
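
The check described in the steps above (run the import, then look for the entities in Atlas) could be scripted roughly as follows. This is only a sketch under stated assumptions: the helper names are hypothetical, the `<db>@<namespace>` form of the hive_db qualifiedName and the lower-casing of the database name are assumed, and the lookup relies on the Atlas v2 endpoint `GET /api/atlas/v2/entity/uniqueAttribute/type/hive_db?attr:qualifiedName=...`. Polling with a timeout helps tell "entities appear after a short delay" (the earlier description) apart from "entities never appear" (the current one).

```python
import time
import urllib.error
import urllib.parse
import urllib.request
from typing import Callable, Iterable, List


def read_names(path: str) -> List[str]:
    """Return the non-empty, stripped lines of the file passed to --filename."""
    with open(path, encoding="utf-8") as handle:
        return [line.strip() for line in handle if line.strip()]


def qualified_name(db: str, namespace: str) -> str:
    # Assumption: hive_db entities are keyed as "<lower-cased db>@<namespace>".
    return f"{db.lower()}@{namespace}"


def atlas_has_hive_db(base_url: str, qname: str, auth_header: str) -> bool:
    """Return True if Atlas answers 200 for the hive_db with this qualifiedName."""
    url = (f"{base_url}/api/atlas/v2/entity/uniqueAttribute/type/hive_db"
           f"?attr:qualifiedName={urllib.parse.quote(qname)}")
    request = urllib.request.Request(
        url, headers={"Authorization": auth_header, "Accept": "application/json"})
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return response.status == 200
    except urllib.error.HTTPError as err:
        if err.code == 404:  # entity not (yet) present
            return False
        raise


def wait_for_entities(names: Iterable[str],
                      exists: Callable[[str], bool],
                      timeout: float = 30.0,
                      interval: float = 1.0,
                      sleep: Callable[[float], None] = time.sleep) -> List[str]:
    """Poll exists() until every name is found or the timeout expires.

    Returns the names that are still missing (an empty list means all were found).
    """
    missing = list(names)
    deadline = time.monotonic() + timeout
    while True:
        missing = [name for name in missing if not exists(name)]
        if not missing or time.monotonic() >= deadline:
            return missing
        sleep(interval)


# Example use (host, port, namespace "cm" and credentials are placeholders):
# names = [qualified_name(n, "cm") for n in read_names("/tmp/file_tejqc.txt")]
# missing = wait_for_entities(
#     names, lambda q: atlas_has_hive_db("http://atlas-host:31000", q, "Basic ..."))
# print("missing after timeout:", missing)
```

A non-empty result after a generous timeout would match the behaviour reported here, while an empty result after a few polls would match the earlier "delay" description.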