[jira] [Commented] (ATLAS-503) Not all Hive tables are imported into Atlas when interrupted with search queries while importing.
[ https://issues.apache.org/jira/browse/ATLAS-503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15310423#comment-15310423 ]

Hemanth Yamijala commented on ATLAS-503:

[~ssainath], what is the value of {{atlas.graph.storage.lock.wait-time}} in your tests?

> Not all Hive tables are imported into Atlas when interrupted with search
> queries while importing.
> ----------------------------------------------------------------------
>
>          Key: ATLAS-503
>          URL: https://issues.apache.org/jira/browse/ATLAS-503
>      Project: Atlas
>   Issue Type: Bug
>     Reporter: Sharmadha Sainath
>     Assignee: Hemanth Yamijala
>     Priority: Critical
>      Fix For: 0.7-incubating
>
>  Attachments: ATLAS-503.patch, hiv2atlaslogs.rtf
>
> On running a file containing 100 table creation commands using beeline -f,
> all Hive tables are created. But only 81 of them are imported into Atlas
> (HiveHook enabled) when queries like "hive_table" are run frequently
> while the import process for the tables is going on.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (ATLAS-503) Not all Hive tables are imported into Atlas when interrupted with search queries while importing.
[ https://issues.apache.org/jira/browse/ATLAS-503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15310209#comment-15310209 ]

Sharmadha Sainath commented on ATLAS-503:

With Atlas commit id 65d95ebe23744f72c0ed40b9a171d3c42cd2906c plus ATLAS-503.patch, {{atlas.graph.storage.lock.retries=10}}, {{atlas.notification.hook.numthreads=5}}, and the ATLAS_HOOK topic created with 5 partitions, the following are the results of creating tags using Apache JMeter to simulate load:

* 1 user, 100 loops: 35 secs
* 10 users, 10 loops: 32 secs
* 100 users, 1 loop: 21 secs (no lock exceptions)

But the following error was thrown continuously:

{{ERROR - [Thread-821:] ~ Evicted [359@0a1d3e8020994-Loaner-5142-local1] from cache but waiting too long for transactions to close. Stale transaction alert on: [standardtitantx[null], standardtitantx[null], standardtitantx[null], standardtitantx[null], standardtitantx[null]] (ManagementLogger:189)}}

Result of querying using the DSL query (hive_table where name="db.table@cluster") while importing 10,000 tables: 23 mins 07 secs 679 ms (no lock or read-timed-out exceptions).

Then, as suggested by Hemanth, set {{atlas.graph.storage.cache.db-cache-time=12}} and created tags with 100 users 1 loop, and with 10 users 10 loops. No errors were thrown.
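For reference, the tuning knobs exercised in the test above are plain entries in the Atlas application properties file. A minimal sketch with the values quoted in this thread (the file name {{atlas-application.properties}} and the comments are mine, not from the thread):

```properties
# atlas-application.properties (file name assumed)

# Retries for Titan storage-lock acquisition under concurrent writers
atlas.graph.storage.lock.retries=10

# Max time (ms) a DB-cache entry may serve possibly-stale data;
# lowering this resolved the errors per the comment above
atlas.graph.storage.cache.db-cache-time=12

# Number of consumer threads processing hook notifications
atlas.notification.hook.numthreads=5
```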
[jira] [Commented] (ATLAS-503) Not all Hive tables are imported into Atlas when interrupted with search queries while importing.
[ https://issues.apache.org/jira/browse/ATLAS-503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15299499#comment-15299499 ]

Hemanth Yamijala commented on ATLAS-503:

This is just new tag creation, not associating to any entity. One thing you could help with: I tried to add a print to {{HBaseKeyColumnValueStore.acquireLock}} to print the key and column values as strings. It produces unprintable output like {{^HDq^Prt%__type.test}}. Do you know how to convert that byte array to something printable, if at all possible? I tried lots of tricks yesterday, including String encodings and HBase utilities, but none seemed to work. The only thing that comes out is __type.test, which could mean anything.
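One option for the byte-array question above: HBase ships {{org.apache.hadoop.hbase.util.Bytes.toStringBinary(byte[])}}, which keeps printable ASCII and escapes everything else as {{\xNN}}. A self-contained sketch of the same idea, for dropping into a debug print without pulling in the HBase dependency (the class and method names here are illustrative, not from the Atlas code base):

```java
// Sketch: render an opaque key/column byte[] as printable text, escaping
// non-printable bytes as \xNN (same idea as HBase's Bytes.toStringBinary).
public class ByteDump {
    public static String toPrintable(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            int v = b & 0xFF;
            if (v >= 0x20 && v < 0x7F) {
                sb.append((char) v);               // printable ASCII as-is
            } else {
                sb.append(String.format("\\x%02X", v)); // escape the rest
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Bytes resembling a Titan key: control characters mixed with text
        byte[] key = {0x01, 'H', 'D', 'q', 0x10, '_', '_', 't', 'y', 'p', 'e'};
        System.out.println(toPrintable(key)); // control bytes show as \x01, \x10
    }
}
```

With this kind of escaping, two locks that print identically really are on the same key/column bytes, which would confirm the contention hypothesis below.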
[jira] [Commented] (ATLAS-503) Not all Hive tables are imported into Atlas when interrupted with search queries while importing.
[ https://issues.apache.org/jira/browse/ATLAS-503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15299451#comment-15299451 ]

Suma Shivaprasad commented on ATLAS-503:

[~yhemanth] Is it trying to create multiple tags on the same entity? If so, could it possibly be the lock being held on TRAIT_NAMES_PROPERTY_KEY and/or the vertex id on which tags are being created?
[jira] [Commented] (ATLAS-503) Not all Hive tables are imported into Atlas when interrupted with search queries while importing.
[ https://issues.apache.org/jira/browse/ATLAS-503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298263#comment-15298263 ]

Hemanth Yamijala commented on ATLAS-503:

I am able to replicate the issue with a test case that tries to create multiple tags at the same time. The exception stack trace shows a {{PermanentLockingException}}. This is being thrown from the shaded {{atlas-titan}} module, specifically the {{HBaseKeyColumnValueStore.acquireLock}} method. Turning on some debug statements, it looks like many threads are trying to acquire a transactional lock from these objects, and for some reason they are all trying to acquire a lock on the same key and column, even though the tags being created are different. Further, we don't have any retries in place to handle a failure to acquire a lock, with the result that we fail immediately in such situations. So, two questions to consider:

* What are these keys/columns on which everyone is trying to acquire a lock?
* If this is a valid scenario and there is this much contention, should we possibly add some retries until the lock succeeds?

Doing further debugging along these lines.
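The retry idea in the second question could be sketched as a small wrapper around the lock-acquiring call. This is only an illustration of the shape such a fix might take; {{withRetries}} and {{LockFailedException}} are hypothetical names standing in for the real Titan/HBase classes, and the retry count and backoff are illustrative, not recommendations:

```java
import java.util.concurrent.Callable;

// Sketch: retry an operation (e.g. a transactional lock acquisition) that
// may throw under contention, with a simple linear backoff between attempts.
public class LockRetry {
    // Stand-in for the real locking exception (e.g. PermanentLockingException)
    static class LockFailedException extends Exception {}

    public static <T> T withRetries(Callable<T> acquire, int maxRetries, long backoffMs)
            throws Exception {
        Exception last = null;
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            try {
                return acquire.call();             // success: return immediately
            } catch (Exception e) {
                last = e;                          // remember the failure
                Thread.sleep(backoffMs * (attempt + 1)); // linear backoff
            }
        }
        throw last; // all attempts failed; surface the last lock exception
    }
}
```

Note that Titan already exposes {{atlas.graph.storage.lock.retries}} for its own lock layer (set to 10 in the test results later in this thread), so the real fix may be configuration rather than new code.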
[jira] [Commented] (ATLAS-503) Not all Hive tables are imported into Atlas when interrupted with search queries while importing.
[ https://issues.apache.org/jira/browse/ATLAS-503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298058#comment-15298058 ]

Hemanth Yamijala commented on ATLAS-503:

Started looking at this. My primary focus for the bug will be to replicate, debug and fix it with an HBase backend, as BerkeleyDB is not recommended for production. I have been trying to replicate the issue by trying what [~ssainath] reported, except with HBase as the backend. However, I have had no success so far (imported 1000, 5000, 1 tables) in replicating the issue. Talking to Sharmadha, I found that we don't see this specific problem with HBase, only with BerkeleyDB. However, there are other scenarios where even HBase gives a lock exception. Some scenarios are:

* Multiple consumer threads and partitions while importing data into Atlas.
* Multiple threads creating tags.

I am assuming the underlying cause is the same and will try to use one of these scenarios to replicate the issue.
[jira] [Commented] (ATLAS-503) Not all Hive tables are imported into Atlas when interrupted with search queries while importing.
[ https://issues.apache.org/jira/browse/ATLAS-503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15275317#comment-15275317 ]

Hemanth Yamijala commented on ATLAS-503:

Ran into the same bug with ATLAS-759.
[jira] [Commented] (ATLAS-503) Not all Hive tables are imported into Atlas when interrupted with search queries while importing.
[ https://issues.apache.org/jira/browse/ATLAS-503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15150023#comment-15150023 ]

Selvamohan Neethiraj commented on ATLAS-503:

[~ssainath] - Can you please provide the exact version of Atlas you were using? Seems like a critical bug; changing the priority to Critical.