[jira] [Commented] (PHOENIX-4131) UngroupedAggregateRegionObserver.preClose() and doPostScannerOpen() can deadlock

2017-08-31 Thread Hudson (JIRA)

[ https://issues.apache.org/jira/browse/PHOENIX-4131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16150003#comment-16150003 ]

Hudson commented on PHOENIX-4131:
---------------------------------

FAILURE: Integrated in Jenkins build Phoenix-master #1762 (See [https://builds.apache.org/job/Phoenix-master/1762/])
PHOENIX-4131 UngroupedAggregateRegionObserver.preClose() and (samarth: rev c2e85f2131669c381e61cc3d6982ab66e4ed63b9)
* (edit) phoenix-core/src/main/java/org/apache/phoenix/coprocessor/MetaDataEndpointImpl.java
* (edit) phoenix-core/src/main/java/org/apache/phoenix/coprocessor/UngroupedAggregateRegionObserver.java
* (edit) phoenix-core/src/test/java/org/apache/phoenix/query/BaseTest.java


> UngroupedAggregateRegionObserver.preClose() and doPostScannerOpen() can deadlock
> --------------------------------------------------------------------------------
>
>                 Key: PHOENIX-4131
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-4131
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: Samarth Jain
>            Assignee: Samarth Jain
>             Fix For: 4.12.0
>
>         Attachments: PHOENIX-4131.patch
>
>

[jira] [Commented] (PHOENIX-4131) UngroupedAggregateRegionObserver.preClose() and doPostScannerOpen() can deadlock

2017-08-31 Thread Samarth Jain (JIRA)

[ https://issues.apache.org/jira/browse/PHOENIX-4131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16149834#comment-16149834 ]

Samarth Jain commented on PHOENIX-4131:
---------------------------------------

Thanks for the review, [~jamestaylor]. Pushed to 4.x and master branches.

[jira] [Commented] (PHOENIX-4131) UngroupedAggregateRegionObserver.preClose() and doPostScannerOpen() can deadlock

2017-08-31 Thread James Taylor (JIRA)

[ https://issues.apache.org/jira/browse/PHOENIX-4131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16149797#comment-16149797 ]

James Taylor commented on PHOENIX-4131:
---------------------------------------

+1. LGTM. Thanks, [~samarthjain].

> UngroupedAggregateRegionObserver.preClose() and doPostScannerOpen() can deadlock
> --------------------------------------------------------------------------------
>
>                 Key: PHOENIX-4131
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-4131
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: Samarth Jain
>            Assignee: Samarth Jain
>         Attachments: PHOENIX-4131.patch
>
>
> On my local test run, I saw that the tests were not completing because the
> mini cluster couldn't shut down. So I took a jstack and discovered the
> following deadlock:
> {code}
> "RS:0;samarthjai-wsm4:59006" #16265 prio=5 os_prio=31 tid=0x7fafa6327000 nid=0x37b3f runnable [0x7000115f5000]
>    java.lang.Thread.State: RUNNABLE
>   at java.lang.Object.wait(Native Method)
>   at org.apache.phoenix.coprocessor.UngroupedAggregateRegionObserver.preClose(UngroupedAggregateRegionObserver.java:1201)
>   - locked <0x00072bc406b8> (a java.lang.Object)
>   at org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost$4.call(RegionCoprocessorHost.java:494)
>   at org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost$RegionOperation.call(RegionCoprocessorHost.java:1673)
>   at org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.execOperation(RegionCoprocessorHost.java:1749)
>   at org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.preClose(RegionCoprocessorHost.java:490)
>   at org.apache.hadoop.hbase.regionserver.HRegionServer.closeRegion(HRegionServer.java:2843)
>   at org.apache.hadoop.hbase.regionserver.HRegionServer.closeRegionIgnoreErrors(HRegionServer.java:2805)
>   at org.apache.hadoop.hbase.regionserver.HRegionServer.closeUserRegions(HRegionServer.java:2423)
>   at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:1052)
>   at org.apache.hadoop.hbase.MiniHBaseCluster$MiniHBaseClusterRegionServer.runRegionServer(MiniHBaseCluster.java:157)
>   at org.apache.hadoop.hbase.MiniHBaseCluster$MiniHBaseClusterRegionServer.access$000(MiniHBaseCluster.java:110)
>   at org.apache.hadoop.hbase.MiniHBaseCluster$MiniHBaseClusterRegionServer$1.run(MiniHBaseCluster.java:141)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:360)
>   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1637)
>   at org.apache.hadoop.hbase.security.User$SecureHadoopUser.runAs(User.java:334)
>   at org.apache.hadoop.hbase.MiniHBaseCluster$MiniHBaseClusterRegionServer.run(MiniHBaseCluster.java:139)
>   at java.lang.Thread.run(Thread.java:748)
> {code}
> {code}
> "RpcServer.FifoWFPBQ.default.handler=3,queue=0,port=59006" #16246 daemon prio=5 os_prio=31 tid=0x7fafae856000 nid=0x1abdb waiting for monitor entry [0x7000102bc000]
>    java.lang.Thread.State: BLOCKED (on object monitor)
>   at org.apache.phoenix.coprocessor.UngroupedAggregateRegionObserver.doPostScannerOpen(UngroupedAggregateRegionObserver.java:734)
>   - waiting to lock <0x00072bc406b8> (a java.lang.Object)
>   at org.apache.phoenix.coprocessor.BaseScannerRegionObserver$RegionScannerHolder.overrideDelegate(BaseScannerRegionObserver.java:236)
>   at org.apache.phoenix.coprocessor.BaseScannerRegionObserver$RegionScannerHolder.nextRaw(BaseScannerRegionObserver.java:281)
>   at org.apache.hadoop.hbase.regionserver.RSRpcServices.scan(RSRpcServices.java:2629)
>   - locked <0x00072b625a90> (a org.apache.phoenix.coprocessor.BaseScannerRegionObserver$RegionScannerHolder)
>   at org.apache.hadoop.hbase.regionserver.RSRpcServices.scan(RSRpcServices.java:2833)
>   at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:34950)
>   at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2339)
>   at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:123)
>   at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:188)
>   at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:168)
> {code}
> preClose() holds the object monitor and is waiting for scanReferencesCount to
> go down to 0. doPostScannerOpen() is trying to acquire the same lock so that
> it can reduce scanReferencesCount to 0.
> I think this bug was introduced by PHOENIX-3111, which was meant to solve
> other deadlocks.
> FYI, [~rajeshbabu], [~sergey.soldatov], [~enis], [~lhofhansl].
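
The locking pattern described in the issue can be illustrated in isolation. The following is only a minimal sketch and is neither the Phoenix code nor the PHOENIX-4131 patch: the class `ScanRefCounter` and its methods `scanStarted`, `scanFinished`, and `awaitNoScans` are hypothetical names chosen for illustration. It shows one conventional way to let a closing thread wait for a scan reference count to drain without blocking the threads that must decrement it: the closer waits with `Object.wait()` on the shared monitor, which releases the monitor for the duration of the wait, and it gives up after a timeout instead of waiting forever.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only; none of these names come from Phoenix.
class ScanRefCounter {
    private final Object lock = new Object();
    private int scanReferences = 0;   // number of scans currently in flight
    private boolean closing = false;  // set once a close has been requested

    /** Registers a scan; returns false if a close is already in progress. */
    public boolean scanStarted() {
        synchronized (lock) {
            if (closing) {
                return false;
            }
            scanReferences++;
            return true;
        }
    }

    /** Unregisters a scan and wakes up any thread waiting for the count to drain. */
    public void scanFinished() {
        synchronized (lock) {
            scanReferences--;
            lock.notifyAll();
        }
    }

    /**
     * Waits until no scans are in flight or the timeout elapses.
     * lock.wait() releases the monitor while waiting, so scanFinished()
     * can acquire it and make progress.
     */
    public boolean awaitNoScans(long timeoutMillis) throws InterruptedException {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        synchronized (lock) {
            closing = true;
            while (scanReferences > 0) {
                long remainingMs = TimeUnit.NANOSECONDS.toMillis(deadline - System.nanoTime());
                if (remainingMs <= 0) {
                    return false; // timed out; the caller decides how to proceed
                }
                lock.wait(remainingMs);
            }
            return true;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        ScanRefCounter counter = new ScanRefCounter();
        counter.scanStarted();
        // Simulated scanner thread that finishes shortly after the close begins.
        Thread scanner = new Thread(() -> {
            try {
                Thread.sleep(100);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            counter.scanFinished();
        });
        scanner.start();
        boolean drained = counter.awaitNoScans(5_000);
        scanner.join();
        System.out.println("drained=" + drained);
    }
}
```

The timeout is a design choice in this sketch: it bounds how long a shutdown can be held up by a scan that never finishes, at the cost of the caller having to handle the "not drained" case.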



--
This message was sent by Atlassian JIRA
(v6.4.14#64029

[jira] [Commented] (PHOENIX-4131) UngroupedAggregateRegionObserver.preClose() and doPostScannerOpen() can deadlock

2017-08-31 Thread Hadoop QA (JIRA)

[ https://issues.apache.org/jira/browse/PHOENIX-4131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16148556#comment-16148556 ]

Hadoop QA commented on PHOENIX-4131:
------------------------------------

{color:red}-1 overall{color}.  Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12884597/PHOENIX-4131.patch
  against master branch at commit 7d8b8430212fae117ac09faf6b7c22bf673e9073.
  ATTACHMENT ID: 12884597

{color:green}+1 @author{color}.  The patch does not contain any @author tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new or modified tests.

{color:green}+1 javac{color}.  The applied patch does not increase the total number of javac compiler warnings.

{color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 67 warning messages.

{color:green}+1 release audit{color}.  The applied patch does not increase the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines longer than 100.

{color:red}-1 core tests{color}.  The patch failed these unit tests:

./phoenix-core/target/failsafe-reports/TEST-org.apache.phoenix.end2end.KeyOnlyIT
./phoenix-core/target/failsafe-reports/TEST-org.apache.phoenix.end2end.MultiCfQueryExecIT
./phoenix-core/target/failsafe-reports/TEST-org.apache.phoenix.end2end.QueryWithTableSampleIT
./phoenix-core/target/failsafe-reports/TEST-org.apache.phoenix.end2end.ParallelIteratorsIT
./phoenix-core/target/failsafe-reports/TEST-org.apache.phoenix.end2end.TenantSpecificTablesDDLIT
./phoenix-core/target/failsafe-reports/TEST-org.apache.phoenix.end2end.SaltedViewIT
./phoenix-core/target/failsafe-reports/TEST-org.apache.phoenix.coprocessor.StatisticsCollectionRunTrackerIT
./phoenix-core/target/failsafe-reports/TEST-org.apache.phoenix.end2end.ExplainPlanWithStatsEnabledIT

Test results: https://builds.apache.org/job/PreCommit-PHOENIX-Build/1331//testReport/
Javadoc warnings: https://builds.apache.org/job/PreCommit-PHOENIX-Build/1331//artifact/patchprocess/patchJavadocWarnings.txt
Console output: https://builds.apache.org/job/PreCommit-PHOENIX-Build/1331//console

This message is automatically generated.

[jira] [Commented] (PHOENIX-4131) UngroupedAggregateRegionObserver.preClose() and doPostScannerOpen() can deadlock

2017-08-29 Thread Rajeshbabu Chintaguntla (JIRA)

[ https://issues.apache.org/jira/browse/PHOENIX-4131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16144862#comment-16144862 ]

Rajeshbabu Chintaguntla commented on PHOENIX-4131:
--------------------------------------------------

[~samarthjain], are you working on it, or do you want me to take a look?
