[jira] [Commented] (DRILL-5156) Bit-Client thread finds closed allocator in TestDrillbitResilience unit test

2016-12-22 Thread Paul Rogers (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15771772#comment-15771772
 ] 

Paul Rogers commented on DRILL-5156:


Continuing to investigate, it appears that the RPC threads hang around after 
shutting down the Drillbit if the debugger is stopped at the exception 
breakpoint. In particular, I ran the full test with the 
{{IllegalStateException}} breakpoint set. The breakpoint was hit in one test. 
That test shuts down its Drillbit at the end. Then another test started and 
created a new Drillbit. It seems that the first Drillbit did not wait for the 
RPC threads to exit, leaving orphaned threads (those stopped in the debugger). 
It seems the Drillbit should refuse to exit until all child threads have completed.

> Bit-Client thread finds closed allocator in TestDrillbitResilience unit test
> 
>
> Key: DRILL-5156
> URL: https://issues.apache.org/jira/browse/DRILL-5156
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Paul Rogers
>Priority: Minor
>
> RPC thread attempts to access a closed allocator during the 
> {{TestDrillbitResilience}} unit test.
> Set a Java exception breakpoint for {{IllegalStateException}}. Run the 
> {{TestDrillbitResilience}} unit tests.
> You will see quite a few exceptions, including the following in a thread 
> called BitClient-1:
> {code}
> RootAllocator(BaseAllocator).assertOpen() line 109
> RootAllocator(BaseAllocator).buffer(int) line 191
> DrillByteBufAllocator.buffer(int) line 49
> DrillByteBufAllocator.ioBuffer(int) line 64
> AdaptiveRecvByteBufAllocator$HandleImpl.allocate(ByteBufAllocator) line 104
> NioSocketChannel$NioSocketChannelUnsafe(...).read() line 117
> ...
> NioEventLoop.run() line 354
> {code}
> The test continues (then fails for some other reason), which is why this is 
> marked as minor. Still, it seems odd that the client thread should attempt to 
> access a closed allocator.
> At this point, it is not clear how we got into this state. The test itself is 
> waiting for a response from the server in the {{tailsAfterMSorterSorting}} 
> test.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-5156) Bit-Client thread finds closed allocator in TestDrillbitResilience unit test

2016-12-23 Thread Paul Rogers (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15772269#comment-15772269
 ] 

Paul Rogers commented on DRILL-5156:


The problem appears to be a bug in {{BootStrapContext}} which creates two 
thread pools, but does not close them. The two pools are for the "BitClient-n" 
and "BitServer-n" threads. During close, the {{BootStrapContext.close()}} 
method closes the allocator but leaves the threads running.

Since they are left running, the BitClient thread attempts to use the (now 
closed) allocator and triggers the {{IllegalStateException}}. This behavior is 
easy to see by setting the breakpoint described above. Leave the thread stopped 
at that breakpoint. The rest of the Drillbit shuts down around the suspended 
thread, showing that the Drillbit did not wait for the thread.

The fix is simple:

{code}
  public void close() {
    try {
      loop2.shutdownGracefully(0, 0, TimeUnit.SECONDS);
    } catch ( Exception e ) {
      logger.warn("Failure During Bit-Client shutdown.", e);
    }
    try {
      loop.shutdownGracefully(0, 0, TimeUnit.SECONDS);
    } catch ( Exception e ) {
      logger.warn("Failure During Bit-Server shutdown.", e);
    }
    ...
{code}

After this fix, the test case runs fine with no {{IllegalStateExceptions}}.
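
The ordering problem described above can be illustrated with JDK classes only. This is a minimal sketch, not Drill code: {{FakeAllocator}} and {{FakeContext}} are hypothetical names, and a plain {{ExecutorService}} stands in for the Netty event loop groups. It assumes the intended order is "stop the threads, wait for them, then close the allocator".

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

public class ShutdownOrderSketch {

  /** Hypothetical stand-in for an allocator that rejects use after close. */
  static final class FakeAllocator implements AutoCloseable {
    private final AtomicBoolean open = new AtomicBoolean(true);

    int buffer(int size) {
      if (!open.get()) {
        throw new IllegalStateException("allocator is closed");
      }
      return size;
    }

    @Override
    public void close() {
      open.set(false);
    }
  }

  /** Hypothetical context that owns both a worker pool and the allocator. */
  static final class FakeContext implements AutoCloseable {
    final FakeAllocator allocator = new FakeAllocator();
    final ExecutorService pool = Executors.newFixedThreadPool(2);

    @Override
    public void close() {
      // Stop accepting work and wait for the workers before touching the allocator.
      pool.shutdown();
      try {
        if (!pool.awaitTermination(5, TimeUnit.SECONDS)) {
          pool.shutdownNow();
        }
      } catch (InterruptedException e) {
        pool.shutdownNow();
        Thread.currentThread().interrupt();
      }
      // Only now close the resource the workers were using.
      allocator.close();
    }
  }

  /** Submits some work, closes the context, and reports whether the pool has terminated. */
  static boolean runAndClose() {
    FakeContext context = new FakeContext();
    for (int i = 0; i < 4; i++) {
      context.pool.submit(() -> { context.allocator.buffer(64); });
    }
    context.close();
    return context.pool.isTerminated();
  }

  public static void main(String[] args) {
    System.out.println("pool terminated: " + runAndClose());
  }
}
```

Reversing the two steps in {{FakeContext.close()}} would reintroduce the race: a worker still running after {{allocator.close()}} could hit the {{IllegalStateException}}.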



[jira] [Commented] (DRILL-5156) Bit-Client thread finds closed allocator in TestDrillbitResilience unit test

2016-12-23 Thread Paul Rogers (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15773608#comment-15773608
 ] 

Paul Rogers commented on DRILL-5156:


Also seeing a similar problem in {{FragmentContext.close()}} in the unit test 
{{TestConvertFunctions#testConvertFromConvertToInt}}. This test fails with the 
Snappy library issue. Then, when tearing down, we get an 
{{IllegalStateException}} in the {{OperatorContextImpl.close()}} method here:

{code}
if (allocator != null) {
  allocator.close(); // Error here
}
{code}

Likely, again, the thread is not being closed properly before the memory 
allocator is released.

> Bit-Client thread finds closed allocator in TestDrillbitResilience unit test
> 
>
> Key: DRILL-5156
> URL: https://issues.apache.org/jira/browse/DRILL-5156
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Paul Rogers
>Assignee: Paul Rogers
>Priority: Minor
>


[jira] [Commented] (DRILL-5156) Bit-Client thread finds closed allocator in TestDrillbitResilience unit test

2016-12-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15782140#comment-15782140
 ] 

ASF GitHub Bot commented on DRILL-5156:
---

GitHub user paul-rogers opened a pull request:

https://github.com/apache/drill/pull/709

DRILL-5156: BootStrapContext should close threads

The Bit-Client thread (that's the thread name) finds a closed allocator in 
the TestDrillbitResilience unit test. This fix (along with DRILL-5157) eliminates 
two run-time problems seen in this unit test.

BootStrapContext creates two thread pools, but does not close them. This 
allows the code running in the threads to attempt to access their allocators 
after the allocator is closed. This fix ensures that the
thread pools are closed to avoid the issue.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/paul-rogers/drill DRILL-5156

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/709.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #709


commit 73607c33c78dd89373412c8e4c70356fa19f81fd
Author: Paul Rogers 
Date:   2016-12-28T01:45:16Z

DRILL-5156: BootStrapContext should close threads

Bit-Client thread finds closed allocator in the TestDrillbitResilience unit
test. This fix (along with DRILL-5157) eliminates two run-time problems
seen in this unit test.

BootStrapContext creates two thread pools, but does not close them.
This allows the code running in the threads to attempt to access their
allocators after the allocator is closed. This fix ensures that the
thread pools are closed to avoid the issue.






[jira] [Commented] (DRILL-5156) Bit-Client thread finds closed allocator in TestDrillbitResilience unit test

2017-01-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15795944#comment-15795944
 ] 

ASF GitHub Bot commented on DRILL-5156:
---

Github user sudheeshkatkam commented on the issue:

https://github.com/apache/drill/pull/709
  
This change may not be that simple, but I could be wrong. I tried to do 
something similar as part of [PR 
429](https://github.com/apache/drill/pull/429/commits/0394f4ca5aed142bb2ba0b192f3588cfda7b).

The close happens elsewhere. The "loop" is actually closed as part of 

[BasicServer#close](https://github.com/apache/drill/blob/master/exec/rpc/src/main/java/org/apache/drill/exec/rpc/BasicServer.java#L218).
 But it will be closed multiple times (because there may be multiple instances 
of sub-classes of BasicServer, and all use the same loop), and looks like 
"loop2" is not closed anywhere. The changes in PR 429 close the loops exactly 
once, but I do not recollect why the PR is not merged.
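
One way to make a shared loop safe to close from several owners is an idempotent close guarded by an atomic flag. The sketch below uses only JDK classes; {{SharedLoop}} and {{FakeServer}} are hypothetical names, and this is not necessarily how PR 429 closes the loops.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

public class CloseOnceSketch {

  /** Hypothetical shared resource, standing in for a loop used by several servers. */
  static final class SharedLoop implements AutoCloseable {
    private final AtomicBoolean closed = new AtomicBoolean(false);
    private final AtomicInteger shutdownCount = new AtomicInteger();

    @Override
    public void close() {
      // compareAndSet succeeds for exactly one caller, so the shutdown work runs once.
      if (closed.compareAndSet(false, true)) {
        shutdownCount.incrementAndGet(); // real shutdown work would go here
      }
    }

    int shutdownCount() {
      return shutdownCount.get();
    }
  }

  /** Hypothetical server that closes the loop it was handed. */
  static final class FakeServer implements AutoCloseable {
    private final SharedLoop loop;

    FakeServer(SharedLoop loop) {
      this.loop = loop;
    }

    @Override
    public void close() {
      loop.close();
    }
  }

  /** Closes two servers that share one loop and returns how often the loop really shut down. */
  static int closeTwoServers() {
    SharedLoop loop = new SharedLoop();
    FakeServer first = new FakeServer(loop);
    FakeServer second = new FakeServer(loop);
    first.close();
    second.close();
    return loop.shutdownCount();
  }

  public static void main(String[] args) {
    System.out.println("shutdown ran " + closeTwoServers() + " time(s)");
  }
}
```

The alternative is to give the loop a single owner that closes it, so that the servers never close it themselves.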




[jira] [Commented] (DRILL-5156) Bit-Client thread finds closed allocator in TestDrillbitResilience unit test

2018-06-01 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-5156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16498342#comment-16498342
 ] 

ASF GitHub Bot commented on DRILL-5156:
---

ilooner commented on issue #709: DRILL-5156: BootStrapContext should close 
threads
URL: https://github.com/apache/drill/pull/709#issuecomment-393959637
 
 
   @paul-rogers is this fix still valid? Or can we close this?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org





--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-5156) Bit-Client thread finds closed allocator in TestDrillbitResilience unit test

2018-06-01 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-5156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16498388#comment-16498388
 ] 

ASF GitHub Bot commented on DRILL-5156:
---

paul-rogers closed pull request #709: DRILL-5156: BootStrapContext should close 
threads
URL: https://github.com/apache/drill/pull/709
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/exec/java-exec/src/main/java/org/apache/drill/exec/server/BootStrapContext.java b/exec/java-exec/src/main/java/org/apache/drill/exec/server/BootStrapContext.java
index c498185046..dc0a392ba1 100644
--- a/exec/java-exec/src/main/java/org/apache/drill/exec/server/BootStrapContext.java
+++ b/exec/java-exec/src/main/java/org/apache/drill/exec/server/BootStrapContext.java
@@ -123,6 +123,16 @@ public ScanResult getClasspathScan() {
 
   @Override
   public void close() {
+    try {
+      loop2.shutdownGracefully(0, 0, TimeUnit.SECONDS);
+    } catch ( Exception e ) {
+      logger.warn("Failure During Bit-Client shutdown.", e);
+    }
+    try {
+      loop.shutdownGracefully(0, 0, TimeUnit.SECONDS);
+    } catch ( Exception e ) {
+      logger.warn("Failure During Bit-Server shutdown.", e);
+    }
     try {
       DrillMetrics.resetMetrics();
     } catch (Error | Exception e) {


 




[jira] [Commented] (DRILL-5156) Bit-Client thread finds closed allocator in TestDrillbitResilience unit test

2018-06-01 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-5156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16498387#comment-16498387
 ] 

ASF GitHub Bot commented on DRILL-5156:
---

paul-rogers commented on issue #709: DRILL-5156: BootStrapContext should close 
threads
URL: https://github.com/apache/drill/pull/709#issuecomment-393971351
 
 
   We can close this. The origin was seeing a resource leak. There was debate. 
There have been other related fixes.
   
   If the leak still exists, folks will find it and we can devise a fix based 
on current code.

