[jira] [Commented] (DRILL-1866) Tests that include limit sporadically fail when run as part of entire test suite on Linux
[ https://issues.apache.org/jira/browse/DRILL-1866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14517635#comment-14517635 ]

Steven Phillips commented on DRILL-1866:

Are you still seeing this problem?

Tests that include limit sporadically fail when run as part of entire test suite on Linux

Key: DRILL-1866
URL: https://issues.apache.org/jira/browse/DRILL-1866
Project: Apache Drill
Issue Type: Bug
Components: Tools, Build & Test
Reporter: Jacques Nadeau
Assignee: Steven Phillips
Priority: Critical
Fix For: 1.0.0

Seems to be a timing issue where memory is not being released as part of limit cancellation of a query.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-1866) Tests that include limit sporadically fail when run as part of entire test suite on Linux
[ https://issues.apache.org/jira/browse/DRILL-1866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14390816#comment-14390816 ]

Chris Westin commented on DRILL-1866:

We've seen these caused by the client disconnecting before the query has wrapped up its processing. The test then tries to shut down the Drillbit immediately, and the shutdown goes through before the query processing has been cleaned up. The allocators complain because the query cleanup hasn't had a chance to close them yet.

There's code in Drillbit now that won't allow Drillbit.close() to proceed until all currently executing fragments complete; that stopped this from happening. So this probably just needs to be verified.

Tests that include limit sporadically fail when run as part of entire test suite on Linux

Key: DRILL-1866
URL: https://issues.apache.org/jira/browse/DRILL-1866
Project: Apache Drill
Issue Type: Bug
Components: Tools, Build & Test
Reporter: Jacques Nadeau
Assignee: Steven Phillips
Priority: Critical
Fix For: 0.9.0

Seems to be a timing issue where memory is not being released as part of limit cancellation of a query.
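For readers following the thread, the idea of holding back close() until executing fragments have finished can be sketched roughly as below. This is only an illustration under assumptions: RunningFragments and its methods are hypothetical names invented here, not the actual Drillbit code, and the timeout is an addition so that the sketch cannot block forever.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical bookkeeping class (not Drill's real implementation): counts
// executing fragments so that a shutdown path can wait for them to drain.
final class RunningFragments {
    private int running = 0;

    // Called when a fragment begins executing.
    synchronized void fragmentStarted() {
        running++;
    }

    // Called from the fragment's cleanup path, after its resources are closed.
    synchronized void fragmentFinished() {
        running--;
        if (running == 0) {
            notifyAll(); // wake any thread blocked in awaitDrained()
        }
    }

    // Blocks until no fragments are running or the timeout elapses.
    // Returns true if everything drained, false on timeout.
    synchronized boolean awaitDrained(long timeoutMillis) throws InterruptedException {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (running > 0) {
            long remainingNanos = deadline - System.nanoTime();
            if (remainingNanos <= 0) {
                return false;
            }
            // Timed wait on this object's monitor; the loop re-checks the count.
            TimeUnit.NANOSECONDS.timedWait(this, remainingNanos);
        }
        return true;
    }
}
```

In a sketch like this, a close() method would call awaitDrained(...) first and only then close the top-level allocator, so that fragment cleanup has had a chance to release its child allocators.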
[jira] [Commented] (DRILL-1866) Tests that include limit sporadically fail when run as part of entire test suite on Linux
[ https://issues.apache.org/jira/browse/DRILL-1866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14292229#comment-14292229 ]

Daniel Barclay (Drill/MapR) commented on DRILL-1866:

For future reference for debugging this:

The workaround mentioned above is the calls to nextUntilEnd() in TestViews and TestJdbcDistQuery, which read to the end of the results in order to suppress query cancelation (the cancelation that happens if a new query is made before a ResultSet has read enough of the results).

To try to reproduce the problem, disable the calls to nextUntilEnd(). (In my environment, the tests seemed to fail reliably, with several test methods failing in each run, but which test methods failed was not deterministic.)

The underlying problem seems to be a race condition involved in cancelation and/or early termination (as for a LIMIT clause) of a query. Something (overall-query processing?) proceeds with closing a TopLevelAllocator before a subordinate something (fragment processing?) has closed a child allocator taken from that TopLevelAllocator.

It possibly involves Foreman, its cancelExecutingFragments(), QueryManager, FragmentExecutor, its closeOutResources(), ScreenRoot, its stop(), LimitRecordBatch's innerNext(), ScreenRoot.innerNext(), etc.

Tests that include limit sporadically fail when run as part of entire test suite on Linux

Key: DRILL-1866
URL: https://issues.apache.org/jira/browse/DRILL-1866
Project: Apache Drill
Issue Type: Bug
Components: Tools, Build & Test
Reporter: Jacques Nadeau
Assignee: Daniel Barclay (Drill/MapR)
Priority: Critical
Fix For: 0.8.0

Seems to be a timing issue where memory is not being released as part of limit cancellation of a query.
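The symptom described in this comment (a top-level allocator being closed while a child allocator taken from it is still open) can be modeled with a toy class like the one below. ToyAllocator and everything in it is an assumption made for illustration; it is not Drill's TopLevelAllocator or its API, and it only shows why the close order matters.

```java
import java.util.HashSet;
import java.util.Set;

// Toy model only (hypothetical names, not Drill's allocator API): a parent
// allocator refuses to close while any child taken from it is still open.
final class ToyAllocator implements AutoCloseable {
    private final ToyAllocator parent;
    private final Set<ToyAllocator> openChildren = new HashSet<>();
    private boolean closed = false;

    // Creates a top-level allocator with no parent.
    ToyAllocator() {
        this(null);
    }

    private ToyAllocator(ToyAllocator parent) {
        this.parent = parent;
    }

    // Hands out a child allocator and remembers it until the child is closed.
    synchronized ToyAllocator newChild() {
        if (closed) {
            throw new IllegalStateException("allocator already closed");
        }
        ToyAllocator child = new ToyAllocator(this);
        openChildren.add(child);
        return child;
    }

    @Override
    public void close() {
        synchronized (this) {
            if (closed) {
                return;
            }
            if (!openChildren.isEmpty()) {
                // This is the "complaint" seen when shutdown wins the race.
                throw new IllegalStateException(
                        openChildren.size() + " child allocator(s) still open");
            }
            closed = true;
        }
        // Notify the parent outside our own lock to keep lock ordering simple.
        if (parent != null) {
            parent.childClosed(this);
        }
    }

    private synchronized void childClosed(ToyAllocator child) {
        openChildren.remove(child);
    }
}
```

In this model, closing the child first and the parent second succeeds, while the reverse order throws. A race between query-level and fragment-level cleanup would make the outcome depend on timing, which matches the nondeterministic failures reported above.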
[jira] [Commented] (DRILL-1866) Tests that include limit sporadically fail when run as part of entire test suite on Linux
[ https://issues.apache.org/jira/browse/DRILL-1866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14286626#comment-14286626 ]

Parth Chandra commented on DRILL-1866:

Can you try this one out again? Several fixes for Limit were committed since this issue was logged.

Tests that include limit sporadically fail when run as part of entire test suite on Linux

Key: DRILL-1866
URL: https://issues.apache.org/jira/browse/DRILL-1866
Project: Apache Drill
Issue Type: Bug
Components: Tools, Build & Test
Reporter: Jacques Nadeau
Assignee: Daniel Barclay (Drill/MapR)
Priority: Critical
Fix For: 0.8.0

Seems to be a timing issue where memory is not being released as part of limit cancellation of a query.
[jira] [Commented] (DRILL-1866) Tests that include limit sporadically fail when run as part of entire test suite on Linux
[ https://issues.apache.org/jira/browse/DRILL-1866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14286825#comment-14286825 ]

Daniel Barclay (Drill/MapR) commented on DRILL-1866:

Yes, the problem (a race condition somewhere in canceling queries and/or fragments and updating state) still seems to exist.

(My in-progress change for DRILL-1735 (a big chain starting with closing JDBC connections) includes a workaround change to two tests to suppress the race-condition failures. Trying a version with that workaround removed seems to show the same symptom (child allocators not closed by the time a top-level allocator is closed).)

Tests that include limit sporadically fail when run as part of entire test suite on Linux

Key: DRILL-1866
URL: https://issues.apache.org/jira/browse/DRILL-1866
Project: Apache Drill
Issue Type: Bug
Components: Tools, Build & Test
Reporter: Jacques Nadeau
Assignee: Daniel Barclay (Drill/MapR)
Priority: Critical
Fix For: 0.8.0

Seems to be a timing issue where memory is not being released as part of limit cancellation of a query.