Rajkumar Singh created HIVE-20442:
-------------------------------------
Summary: Hive stale lock when the hiveserver2 background thread
died abruptly
Key: HIVE-20442
URL: https://issues.apache.org/jira/browse/HIVE-20442
Project: Hive
Issue Type: Bug
Components: Hive, Transactions
Affects Versions: 2.1.1
Environment: Hive-2.1
Reporter: Rajkumar Singh
This looks like a race condition in which the background thread is unable to release
the lock it acquired.
1. The HiveServer2 background thread requests a lock:
{code}
2018-08-20T14:13:38,813 INFO [HiveServer2-Background-Pool: Thread-XXXXX]:
lockmgr.DbLockManager (DbLockManager.java:lock(100)) - Requesting:
queryId=hive_xxxxxxx LockRequest(component:[LockComponent(type:SHARED_READ,
level:TABLE, dbname:testdb, tablename:test_table, operationType:SELECT)],
txnid:0, user:hive, hostname:HOSTNAME, agentInfo:hive_xxxxxxx)
{code}
2. The thread acquires the lock and starts heartbeating:
{code}
2018-08-20T14:36:30,233 INFO [HiveServer2-Background-Pool: Thread-XXXXX]:
lockmgr.DbTxnManager (DbTxnManager.java:startHeartbeat(517)) - Started
heartbeat with delay/interval = 150000/150000 MILLISECONDS for
query: agentInfo:hive_xxxxxxx
{code}
3. Between events #1 and #2, the client disconnected and deleteContext
cleaned up the session directory:
{code}
2018-08-21T15:39:57,820 INFO [HiveServer2-Handler-Pool: Thread-XXX]:
thrift.ThriftCLIService (ThriftBinaryCLIService.java:deleteContext(136)) -
Session disconnected without closing properly.
2018-08-21T15:39:57,820 INFO [HiveServer2-Handler-Pool: Thread-XXXX]:
thrift.ThriftCLIService (ThriftBinaryCLIService.java:deleteContext(140)) -
Closing the session: SessionHandle [3be07faf-5544-4178-8b50-8173002b171a]
2018-08-21T15:39:57,820 INFO [HiveServer2-Handler-Pool: Thread-XXXX]:
service.CompositeService (SessionManager.java:closeSession(363)) - Session
closed, SessionHandle [xxxxxxxxxxxxxxxxxxxxxxx], current sessions:2
{code}
4. The background thread died with an NPE while trying to get the query ID:
{code}
java.lang.NullPointerException: null
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1568)
~[hive-exec-2.1.0.2.6.5.0-292.jar:2.1.0.2.6.5.0-292]
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1414)
~[hive-exec-2.1.0.2.6.5.0-292.jar:2.1.0.2.6.5.0-292]
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1211)
~[hive-exec-2.1.0.2.6.5.0-292.jar:2.1.0.2.6.5.0-292]
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1204)
~[hive-exec-2.1.0.2.6.5.0-292.jar:2.1.0.2.6.5.0-292]
at
org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:242)
[hive-service-2.1.0.2.6.5.0-292.jar:2.1.0.2.6.5.0-292]
at
org.apache.hive.service.cli.operation.SQLOperation.access$800(SQLOperation.java:91)
[hive-service-2.1.0.2.6.5.0-292.jar:2.1.0.2.6.5.0-292]
at
org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:336)
[hive-service-2.1.0.2.6.5.0-292.jar:2.1.0.2.6.5.0-292]
at java.security.AccessController.doPrivileged(Native Method)
[?:1.8.0_77]
at javax.security.auth.Subject.doAs(Subject.java:422) [?:1.8.0_77]
{code}
The thread never got a chance to release the lock, and the heartbeater thread continues
to heartbeat indefinitely.
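One possible way to guard against this kind of leak is to tie the lock release and the heartbeat shutdown to a finally block, so that an unexpected exception in the worker cannot skip them. The sketch below is only an illustration of that pattern under stated assumptions: HeldLock, runQuery and the 10 ms interval are hypothetical names and values made up for this example, not Hive's DbLockManager or DbTxnManager API, and it is not a proposed patch.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

class GuardedLockRelease {

    /** Hypothetical stand-in for an acquired lock plus its heartbeater (not Hive code). */
    static final class HeldLock implements AutoCloseable {
        final AtomicBoolean released = new AtomicBoolean(false);
        final AtomicInteger heartbeats = new AtomicInteger();
        private final ScheduledExecutorService pool =
                Executors.newSingleThreadScheduledExecutor();
        private final ScheduledFuture<?> heartbeat;

        HeldLock(long intervalMillis) {
            // Start a periodic heartbeat as soon as the lock is held.
            heartbeat = pool.scheduleAtFixedRate(
                    heartbeats::incrementAndGet,
                    intervalMillis, intervalMillis, TimeUnit.MILLISECONDS);
        }

        @Override
        public void close() {
            // Stop heartbeating first, then mark the lock as released.
            heartbeat.cancel(false);
            pool.shutdownNow();
            released.set(true);
        }

        boolean heartbeatStopped() {
            return heartbeat.isCancelled() && pool.isShutdown();
        }
    }

    /** Runs the work while holding the lock; cleanup happens even if the work throws. */
    static HeldLock runQuery(Runnable work) {
        HeldLock lock = new HeldLock(10);
        try {
            work.run();
        } catch (RuntimeException e) {
            // A real server would log and report the failure; the sketch only
            // demonstrates that the cleanup below still runs.
        } finally {
            lock.close();
        }
        return lock;
    }

    public static void main(String[] args) {
        // Simulate the worker dying with an NPE after the lock was acquired.
        HeldLock lock = runQuery(() -> {
            throw new NullPointerException("simulated failure");
        });
        System.out.println("released=" + lock.released.get()
                + " heartbeatStopped=" + lock.heartbeatStopped());
    }
}
```

With the simulated NPE, the finally block still cancels the heartbeat task and marks the lock released. Whether the actual fix belongs in Driver, SQLOperation or the transaction manager is not something this sketch tries to answer.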