[jira] [Updated] (HIVE-22420) DbTxnManager.stopHeartbeat() should be thread-safe

2022-04-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-22420:
--
Labels: pull-request-available  (was: )

> DbTxnManager.stopHeartbeat() should be thread-safe
> --
>
> Key: HIVE-22420
> URL: https://issues.apache.org/jira/browse/HIVE-22420
> Project: Hive
> Issue Type: Bug
> Affects Versions: 3.1.0
> Reporter: Aron Hamvas
> Assignee: Aron Hamvas
> Priority: Major
> Labels: pull-request-available
> Fix For: 4.0.0, 4.0.0-alpha-1
>
> Attachments: HIVE-22420.1.patch, HIVE-22420.2.patch
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> When a transactional query is being executed and interrupted via HS2 close 
> operation request, both the background pool thread executing the query and 
> the HttpHandler thread running the close operation logic will eventually call 
> the below method:
> {noformat}
> Driver.releaseLocksAndCommitOrRollback(commit boolean)
> {noformat}
> Since this method is invoked several times in both threads, it can happen 
> that the two threads invoke it at the same time, and due to a race condition, 
> the txnId field of the DbTxnManager used by both threads could be set to 0 
> without actually successfully aborting the transaction.
> The root cause is that the stopHeartbeat() method in DbTxnManager is not 
> thread-safe:
> When Thread-1 and Thread-2 enter stopHeartbeat() almost simultaneously, 
> Thread-1 might successfully cancel the heartbeat task and set the 
> heartbeatTask field to null while Thread-2 is still trying to observe its 
> state. Thread-1 returns to the calling rollbackTxn() method and continues 
> execution there, while Thread-2 is thrown back to the same method with a 
> NullPointerException. Thread-2 then sets txnId to 0, and Thread-1 sends 
> this 0 value to HMS. As a result, the txn is not aborted, and the locks 
> cannot be released later on either.
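
One way to picture a thread-safe stopHeartbeat() is to make "read the task and clear the field" a single atomic step, so that only one caller ever obtains a non-null task. The following is a minimal sketch under that assumption and is not the code from the attached patches; the class HeartbeatHolder and its method signatures are hypothetical stand-ins for the heartbeat-related state of DbTxnManager.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical stand-in for the heartbeat-related state of DbTxnManager.
final class HeartbeatHolder {
    // Holds the scheduled heartbeat, or null when no heartbeat is running.
    private final AtomicReference<ScheduledFuture<?>> heartbeatTask = new AtomicReference<>();

    void startHeartbeat(ScheduledExecutorService pool, Runnable beat, long periodMs) {
        heartbeatTask.set(pool.scheduleAtFixedRate(beat, 0L, periodMs, TimeUnit.MILLISECONDS));
    }

    // Atomically takes the task and clears the field in one step, so a second
    // caller sees null and returns instead of dereferencing a cleared field.
    boolean stopHeartbeat() {
        ScheduledFuture<?> task = heartbeatTask.getAndSet(null);
        if (task == null) {
            return false; // another thread already stopped the heartbeat
        }
        task.cancel(true);
        return true;
    }

    public static void main(String[] args) {
        ScheduledExecutorService pool = Executors.newSingleThreadScheduledExecutor();
        HeartbeatHolder holder = new HeartbeatHolder();
        holder.startHeartbeat(pool, () -> { }, 100L);
        System.out.println(holder.stopHeartbeat()); // first caller cancels the task
        System.out.println(holder.stopHeartbeat()); // second caller finds nothing to stop
        pool.shutdownNow();
    }
}
```

Declaring the method synchronized would serve the same purpose; the atomic reference is used here only because it keeps the sketch short.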



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Updated] (HIVE-22420) DbTxnManager.stopHeartbeat() should be thread-safe

2019-11-05 Thread Peter Vary (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Vary updated HIVE-22420:
--
Fix Version/s: 4.0.0
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Pushed to master.
Thanks for the patch [~hamvas.aron]!

> DbTxnManager.stopHeartbeat() should be thread-safe
> --
>
> Key: HIVE-22420
> URL: https://issues.apache.org/jira/browse/HIVE-22420
> Project: Hive
> Issue Type: Bug
> Affects Versions: 3.1.0
> Reporter: Aron Hamvas
> Assignee: Aron Hamvas
> Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-22420.1.patch, HIVE-22420.2.patch
>
>



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22420) DbTxnManager.stopHeartbeat() should be thread-safe

2019-11-04 Thread Aron Hamvas (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aron Hamvas updated HIVE-22420:
---
Target Version/s: 4.0.0  (was: 4.0.0, 3.1.3)

> DbTxnManager.stopHeartbeat() should be thread-safe
> --
>
> Key: HIVE-22420
> URL: https://issues.apache.org/jira/browse/HIVE-22420
> Project: Hive
> Issue Type: Bug
> Affects Versions: 3.1.0
> Reporter: Aron Hamvas
> Assignee: Aron Hamvas
> Priority: Major
> Attachments: HIVE-22420.1.patch, HIVE-22420.2.patch
>
>


[jira] [Updated] (HIVE-22420) DbTxnManager.stopHeartbeat() should be thread-safe

2019-11-04 Thread Aron Hamvas (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aron Hamvas updated HIVE-22420:
---
Attachment: HIVE-22420.2.patch
Status: Patch Available  (was: In Progress)

Do not allow starting multiple heartbeaters for same transaction.
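
A guard of this kind can be sketched as an atomic "start only if absent" keyed by transaction id. This is a minimal illustration under stated assumptions, not the content of HIVE-22420.2.patch; HeartbeaterRegistry, startIfAbsent, and stop are hypothetical names, not Hive APIs.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

// Hypothetical registry that allows at most one heartbeater per transaction id.
final class HeartbeaterRegistry {
    private final ConcurrentHashMap<Long, ScheduledFuture<?>> tasks = new ConcurrentHashMap<>();
    private final ScheduledExecutorService pool;

    HeartbeaterRegistry(ScheduledExecutorService pool) {
        this.pool = pool;
    }

    // Schedules a heartbeater only if none is registered for txnId.
    // computeIfAbsent runs the mapping function atomically, at most once per key.
    boolean startIfAbsent(long txnId, Runnable beat, long periodMs) {
        boolean[] started = {false};
        tasks.computeIfAbsent(txnId, id -> {
            started[0] = true;
            return pool.scheduleAtFixedRate(beat, 0L, periodMs, TimeUnit.MILLISECONDS);
        });
        return started[0];
    }

    // Removes and cancels the heartbeater; only one caller gets the task.
    boolean stop(long txnId) {
        ScheduledFuture<?> task = tasks.remove(txnId);
        if (task == null) {
            return false;
        }
        task.cancel(true);
        return true;
    }
}
```

A second startIfAbsent call for the same id returns false and schedules nothing, so repeated start requests cannot leave an orphaned heartbeater behind.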

> DbTxnManager.stopHeartbeat() should be thread-safe
> --
>
> Key: HIVE-22420
> URL: https://issues.apache.org/jira/browse/HIVE-22420
> Project: Hive
> Issue Type: Bug
> Affects Versions: 3.1.0
> Reporter: Aron Hamvas
> Assignee: Aron Hamvas
> Priority: Major
> Attachments: HIVE-22420.1.patch, HIVE-22420.2.patch
>
>


[jira] [Updated] (HIVE-22420) DbTxnManager.stopHeartbeat() should be thread-safe

2019-11-04 Thread Aron Hamvas (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aron Hamvas updated HIVE-22420:
---
Status: In Progress  (was: Patch Available)

> DbTxnManager.stopHeartbeat() should be thread-safe
> --
>
> Key: HIVE-22420
> URL: https://issues.apache.org/jira/browse/HIVE-22420
> Project: Hive
> Issue Type: Bug
> Affects Versions: 3.1.0
> Reporter: Aron Hamvas
> Assignee: Aron Hamvas
> Priority: Major
> Attachments: HIVE-22420.1.patch
>
>


[jira] [Updated] (HIVE-22420) DbTxnManager.stopHeartbeat() should be thread-safe

2019-10-31 Thread Aron Hamvas (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aron Hamvas updated HIVE-22420:
---
Status: Patch Available  (was: In Progress)

> DbTxnManager.stopHeartbeat() should be thread-safe
> --
>
> Key: HIVE-22420
> URL: https://issues.apache.org/jira/browse/HIVE-22420
> Project: Hive
> Issue Type: Bug
> Affects Versions: 3.1.0
> Reporter: Aron Hamvas
> Assignee: Aron Hamvas
> Priority: Major
> Attachments: HIVE-22420.1.patch
>
>


[jira] [Updated] (HIVE-22420) DbTxnManager.stopHeartbeat() should be thread-safe

2019-10-31 Thread Aron Hamvas (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aron Hamvas updated HIVE-22420:
---
Status: In Progress  (was: Patch Available)

> DbTxnManager.stopHeartbeat() should be thread-safe
> --
>
> Key: HIVE-22420
> URL: https://issues.apache.org/jira/browse/HIVE-22420
> Project: Hive
> Issue Type: Bug
> Affects Versions: 3.1.0
> Reporter: Aron Hamvas
> Assignee: Aron Hamvas
> Priority: Major
> Attachments: HIVE-22420.1.patch
>
>


[jira] [Updated] (HIVE-22420) DbTxnManager.stopHeartbeat() should be thread-safe

2019-10-30 Thread Aron Hamvas (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aron Hamvas updated HIVE-22420:
---
Attachment: HIVE-22420.1.patch
Status: Patch Available  (was: In Progress)

> DbTxnManager.stopHeartbeat() should be thread-safe
> --
>
> Key: HIVE-22420
> URL: https://issues.apache.org/jira/browse/HIVE-22420
> Project: Hive
> Issue Type: Bug
> Affects Versions: 3.1.0
> Reporter: Aron Hamvas
> Assignee: Aron Hamvas
> Priority: Major
> Attachments: HIVE-22420.1.patch
>
>


[jira] [Updated] (HIVE-22420) DbTxnManager.stopHeartbeat() should be thread-safe

2019-10-30 Thread Aron Hamvas (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aron Hamvas updated HIVE-22420:
---
Affects Version/s: 3.1.0  (was: 3.1.2)

> DbTxnManager.stopHeartbeat() should be thread-safe
> --
>
> Key: HIVE-22420
> URL: https://issues.apache.org/jira/browse/HIVE-22420
> Project: Hive
> Issue Type: Bug
> Affects Versions: 3.1.0
> Reporter: Aron Hamvas
> Assignee: Aron Hamvas
> Priority: Major
>


[jira] [Updated] (HIVE-22420) DbTxnManager.stopHeartbeat() should be thread-safe

2019-10-30 Thread Aron Hamvas (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aron Hamvas updated HIVE-22420:
---
Affects Version/s: 3.1.2  (was: 3.1.0)

> DbTxnManager.stopHeartbeat() should be thread-safe
> --
>
> Key: HIVE-22420
> URL: https://issues.apache.org/jira/browse/HIVE-22420
> Project: Hive
> Issue Type: Bug
> Affects Versions: 3.1.2
> Reporter: Aron Hamvas
> Assignee: Aron Hamvas
> Priority: Major
>


[jira] [Updated] (HIVE-22420) DbTxnManager.stopHeartbeat() should be thread-safe

2019-10-30 Thread Aron Hamvas (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aron Hamvas updated HIVE-22420:
---
Target Version/s: 4.0.0, 3.1.3

> DbTxnManager.stopHeartbeat() should be thread-safe
> --
>
> Key: HIVE-22420
> URL: https://issues.apache.org/jira/browse/HIVE-22420
> Project: Hive
> Issue Type: Bug
> Affects Versions: 3.1.0
> Reporter: Aron Hamvas
> Assignee: Aron Hamvas
> Priority: Major
>


[jira] [Updated] (HIVE-22420) DbTxnManager.stopHeartbeat() should be thread-safe

2019-10-30 Thread Aron Hamvas (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aron Hamvas updated HIVE-22420:
---
Description: 
When a transactional query is being executed and interrupted via HS2 close 
operation request, both the background pool thread executing the query and the 
HttpHandler thread running the close operation logic will eventually call the 
below method:
{noformat}
Driver.releaseLocksAndCommitOrRollback(commit boolean)
{noformat}
Since this method is invoked several times in both threads, it can happen that 
the two threads invoke it at the same time, and due to a race condition, the 
txnId field of the DbTxnManager used by both threads could be set to 0 without 
actually successfully aborting the transaction.

The root cause is that the stopHeartbeat() method in DbTxnManager is not 
thread-safe:

When Thread-1 and Thread-2 enter stopHeartbeat() almost simultaneously, 
Thread-1 might successfully cancel the heartbeat task and set the 
heartbeatTask field to null while Thread-2 is still trying to observe its 
state. Thread-1 returns to the calling rollbackTxn() method and continues 
execution there, while Thread-2 is thrown back to the same method with a 
NullPointerException. Thread-2 then sets txnId to 0, and Thread-1 sends 
this 0 value to HMS. As a result, the txn is not aborted, and the locks 
cannot be released later on either.

  was:
When a transactional query is being executed and interrupted via HS2 close 
operation request, both the background pool thread executing the query and the 
HttpHandler thread running the close operation logic will eventually call the 
below method:

{noformat}
Driver.releaseLocksAndCommitOrRollback(commit boolean)
{noformat}

Since this method is invoked several times in both threads, it can happen that 
the two threads invoke it at the same time, and due to a race condition, the 
txnId field of the DbTxnManager used by both threads could be set to 0 without 
actually successfully aborting the transaction.

E.g. if the two threads reach the stopHeartbeat() call at the same time, one 
will set the heartbeat task to null, the other will run into a 
NullPointerException and due to the unsuccessful call, set the value of txnId 
to 0 before the other thread (which successfully ran the stopHeartbeat() call) 
could invoke rollback on HMS with the proper txnId.
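
The ordering hazard in this example (one thread zeroing txnId before the other has read it for the rollback call) can be illustrated with an atomic "claim" of the id. This is only a sketch under stated assumptions, not how DbTxnManager is implemented; TxnIdHolder, open, and claimForRollback are hypothetical names.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical holder showing how a single thread can claim the txn id.
final class TxnIdHolder {
    // 0 means "no open transaction", mirroring the convention in the report.
    private final AtomicLong txnId = new AtomicLong(0L);

    void open(long id) {
        txnId.set(id);
    }

    // Reads the current id and resets it to 0 in one atomic step.
    // Exactly one caller receives the real id; every later caller receives 0
    // and therefore knows it must not send a rollback request.
    long claimForRollback() {
        return txnId.getAndSet(0L);
    }

    public static void main(String[] args) throws InterruptedException {
        TxnIdHolder holder = new TxnIdHolder();
        holder.open(42L);
        long[] claimed = new long[2];
        Thread t1 = new Thread(() -> claimed[0] = holder.claimForRollback());
        Thread t2 = new Thread(() -> claimed[1] = holder.claimForRollback());
        t1.start();
        t2.start();
        t1.join();
        t2.join();
        // One slot holds the real id and the other holds 0, in either order.
        System.out.println(claimed[0] + claimed[1]);
    }
}
```

With this shape, the thread that receives a nonzero value is the only one that would go on to call HMS, and it does so with the id it claimed rather than with a field another thread may have reset.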


> DbTxnManager.stopHeartbeat() should be thread-safe
> --
>
> Key: HIVE-22420
> URL: https://issues.apache.org/jira/browse/HIVE-22420
> Project: Hive
> Issue Type: Bug
> Affects Versions: 3.1.0
> Reporter: Aron Hamvas
> Assignee: Aron Hamvas
> Priority: Major
>


[jira] [Updated] (HIVE-22420) DbTxnManager.stopHeartbeat() should be thread-safe

2019-10-30 Thread Aron Hamvas (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aron Hamvas updated HIVE-22420:
---
Summary: DbTxnManager.stopHeartbeat() should be thread-safe  (was: 
Driver.releaseLocksAndCommitOrRollback is not thread safe)

> DbTxnManager.stopHeartbeat() should be thread-safe
> --
>
> Key: HIVE-22420
> URL: https://issues.apache.org/jira/browse/HIVE-22420
> Project: Hive
> Issue Type: Bug
> Affects Versions: 3.1.0
> Reporter: Aron Hamvas
> Assignee: Aron Hamvas
> Priority: Major
>
> When a transactional query is being executed and interrupted via HS2 close 
> operation request, both the background pool thread executing the query and 
> the HttpHandler thread running the close operation logic will eventually call 
> the below method:
> {noformat}
> Driver.releaseLocksAndCommitOrRollback(commit boolean)
> {noformat}
> Since this method is invoked several times in both threads, it can happen 
> that the two threads invoke it at the same time, and due to a race 
> condition, the txnId field of the DbTxnManager used by both threads could be 
> set to 0 without actually successfully aborting the transaction.
> E.g. if the two threads reach the stopHeartbeat() call at the same time, one 
> will set the heartbeat task to null, the other will run into a 
> NullPointerException and due to the unsuccessful call, set the value of txnId 
> to 0 before the other thread (which successfully ran the stopHeartbeat() 
> call) could invoke rollback on HMS with the proper txnId.


