[ https://issues.apache.org/jira/browse/ATLAS-1720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sarath Subramanian updated ATLAS-1720:
--------------------------------------
    Summary: Add titan storage.lock.wait-time for Berkley DB to fix intermittent IT failures  (was: Increase titan storage.lock.wait-time for Berkley DB to fix intermittent IT failures )

> Add titan storage.lock.wait-time for Berkley DB to fix intermittent IT failures
> --------------------------------------------------------------------------------
>
>                 Key: ATLAS-1720
>                 URL: https://issues.apache.org/jira/browse/ATLAS-1720
>             Project: Atlas
>          Issue Type: Bug
>          Components: atlas-core
>    Affects Versions: trunk, 0.9-incubating
>            Reporter: Sarath Subramanian
>            Assignee: Sarath Subramanian
>
> Some of the ITs in Atlas fail intermittently with the exception "Could not
> execute operation due to backend exception".
> Investigation shows that this is caused by a Berkeley DB LockTimeoutException
> (https://github.com/thinkaurelius/titan/issues/1113).
> The default lock timeout for Berkeley DB is 500 ms. If a thread (some IT) is
> waiting on a Titan storage resource that is locked by another thread, and that
> thread does not release the lock within 500 ms, the waiting thread fails with
> the above exception (see the error log below).
> The fix is to increase storage.lock.wait-time for Berkeley DB to 10000 ms.
> This is consistent with the lock wait timeout specified for HBase.
> {code}
> Caused by: com.sleepycat.je.LockTimeoutException: (JE 5.0.73) Lock expired.
> Locker 1516581475 7535_NotificationHookConsumer thread-0_Txn: waited for lock
> on database=edgestore LockAddr:284896285 LSN=0x0/0x21d55f type=WRITE
> grant=WAIT_PROMOTION timeoutMillis=500 startTime=1491261268442
> endTime=1491261268942
> Owners: [<LockInfo locker="1445928922 7537_qtp184901207-1038 -
> e015a355-d6c5-4424-b7a7-833a289aea9d_Txn" type="READ"/>, <LockInfo
> locker="1516581475 7535_NotificationHookConsumer thread-0_Txn" type="READ"/>]
> Waiters: []
> Transaction 1445928922 7537_qtp184901207-1038 -
> e015a355-d6c5-4424-b7a7-833a289aea9d_Txn waits for LockAddr:471572402
> Owners:<LockInfo locker="1516581475 7535_NotificationHookConsumer
> thread-0_Txn" type="WRITE"/> Waiters:[<LockInfo locker="1445928922
> 7537_qtp184901207-1038 - e015a355-d6c5-4424-b7a7-833a289aea9d_Txn"
> type="READ"/>]
> Transaction 1516581475 7535_NotificationHookConsumer thread-0_Txn owns
> LockAddr:471572402 <LockInfo locker="1516581475 7535_NotificationHookConsumer
> thread-0_Txn" type="WRITE"/>
> Transaction 1516581475 7535_NotificationHookConsumer thread-0_Txn waits for
> LockAddr:284896285
> {code}
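
For illustration only, the proposed setting could be sketched as follows. This is a minimal sketch, not the actual patch: the property key atlas.graph.storage.lock.wait-time assumes that Atlas passes Titan options through its "atlas.graph." prefix, the class LockWaitConfig and its helpers are hypothetical, and the 500 ms fallback simply models the default described in the issue rather than how Titan itself resolves the value.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper, not part of Atlas or Titan.
public class LockWaitConfig {
    // Default Berkeley DB lock timeout as reported in the issue (500 ms).
    static final long DEFAULT_LOCK_WAIT_MS = 500L;

    // Assumed key: Titan's storage.lock.wait-time under Atlas's "atlas.graph." prefix.
    static final String LOCK_WAIT_KEY = "atlas.graph.storage.lock.wait-time";

    // Hypothetical configuration fragment carrying the value proposed in the issue.
    static final String SAMPLE =
        "atlas.graph.storage.backend=berkeleyje\n" +
        "atlas.graph.storage.lock.wait-time=10000\n";

    // Parse properties-style text into a Properties object.
    static Properties load(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    // Return the configured lock wait in ms, or the 500 ms default if the key is absent.
    static long lockWaitMillis(Properties props) {
        String raw = props.getProperty(LOCK_WAIT_KEY);
        return raw == null ? DEFAULT_LOCK_WAIT_MS : Long.parseLong(raw.trim());
    }

    public static void main(String[] args) {
        long configured = lockWaitMillis(load(SAMPLE));
        System.out.println("lock wait = " + configured + " ms");
    }
}
```

With the key present, a thread waiting on a locked resource would have up to 10000 ms before timing out instead of the 500 ms seen in the log (timeoutMillis=500).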