[ https://issues.apache.org/jira/browse/DRILL-8480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17831903#comment-17831903 ]
ASF GitHub Bot commented on DRILL-8480:
---------------------------------------

rymarm opened a new pull request, #2897:
URL: https://github.com/apache/drill/pull/2897

# [DRILL-8480](https://issues.apache.org/jira/browse/DRILL-8480): Make Nested Loop Join operator properly process empty batches and batches with new schema

## Description

The Nested Loop Join operator (`NestedLoopJoinBatch`, `NestedLoopJoin`) improperly handles the batch iteration outcome `OK` with 0 records. Drill's design for batch processing involves 5 states:
* `NONE` (batch can have only 0 records)
* `OK` (batch can have 0+ records)
* `OK_NEW_SCHEMA` (batch can have 0+ records)
* `NOT_YET` (undefined)
* `EMIT` (batch can have 0+ records)

In some circumstances the Nested Loop Join operator could receive an `OK` outcome with 0 records. Instead of requesting the next batch, the operator stops data processing and returns a `NONE` outcome to upstream batches (operators) without freeing the resources of underlying batches.

## Documentation
-

## Testing
Manual testing with a file from the Jira ticket [DRILL-8480](https://issues.apache.org/jira/browse/DRILL-8480)

> Cleanup before finished. 0 out of 1 streams have finished
> ---------------------------------------------------------
>
> Key: DRILL-8480
> URL: https://issues.apache.org/jira/browse/DRILL-8480
> Project: Apache Drill
> Issue Type: Bug
> Affects Versions: 1.21.1
> Reporter: Maksym Rymar
> Assignee: Maksym Rymar
> Priority: Major
> Attachments: 1a349ff1-d1f9-62bf-ed8c-26346c548005.sys.drill, tableWithNumber2.parquet
>
>
> Drill fails to execute a query with the following exception:
> {code:java}
> org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: IllegalStateException: Cleanup before finished. 0 out of 1 streams have finished
> Fragment: 1:0
> Please, refer to logs for more information.
> [Error Id: 270da8f4-0bb6-4985-bf4f-34853138881c on compute7.vmcluster.com:31010]
>     at org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:657)
>     at org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:395)
>     at org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:245)
>     at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:362)
>     at org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38)
>     at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>     at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>     at java.base/java.lang.Thread.run(Thread.java:829)
> Caused by: java.lang.IllegalStateException: Cleanup before finished. 0 out of 1 streams have finished
>     at org.apache.drill.exec.work.batch.BaseRawBatchBuffer.close(BaseRawBatchBuffer.java:111)
>     at org.apache.drill.common.AutoCloseables.close(AutoCloseables.java:91)
>     at org.apache.drill.common.AutoCloseables.close(AutoCloseables.java:71)
>     at org.apache.drill.exec.work.batch.AbstractDataCollector.close(AbstractDataCollector.java:121)
>     at org.apache.drill.common.AutoCloseables.close(AutoCloseables.java:91)
>     at org.apache.drill.exec.work.batch.IncomingBuffers.close(IncomingBuffers.java:144)
>     at org.apache.drill.exec.ops.FragmentContextImpl.suppressingClose(FragmentContextImpl.java:581)
>     at org.apache.drill.exec.ops.FragmentContextImpl.close(FragmentContextImpl.java:567)
>     at org.apache.drill.exec.work.fragment.FragmentExecutor.closeOutResources(FragmentExecutor.java:417)
>     at org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:240)
>     ... 5 common frames omitted
>     Suppressed: java.lang.IllegalStateException: Cleanup before finished. 0 out of 1 streams have finished
>         ... 15 common frames omitted
>     Suppressed: java.lang.IllegalStateException: Memory was leaked by query. Memory leaked: (32768)
> Allocator(op:1:0:8:UnorderedReceiver) 1000000/32768/32768/10000000000 (res/actual/peak/limit)
>         at org.apache.drill.exec.memory.BaseAllocator.close(BaseAllocator.java:519)
>         at org.apache.drill.exec.ops.BaseOperatorContext.close(BaseOperatorContext.java:159)
>         at org.apache.drill.exec.ops.OperatorContextImpl.close(OperatorContextImpl.java:77)
>         at org.apache.drill.exec.ops.FragmentContextImpl.suppressingClose(FragmentContextImpl.java:581)
>         at org.apache.drill.exec.ops.FragmentContextImpl.close(FragmentContextImpl.java:571)
>         ... 7 common frames omitted
>     Suppressed: java.lang.IllegalStateException: Memory was leaked by query. Memory leaked: (1016640)
> Allocator(frag:1:0) 30000000/1016640/30016640/90715827882 (res/actual/peak/limit)
>         at org.apache.drill.exec.memory.BaseAllocator.close(BaseAllocator.java:519)
>         at org.apache.drill.exec.ops.FragmentContextImpl.suppressingClose(FragmentContextImpl.java:581)
>         at org.apache.drill.exec.ops.FragmentContextImpl.close(FragmentContextImpl.java:574)
>         ... 7 common frames omitted
> {code}
> Steps to reproduce:
> 1. Enable unequal join:
> {code:java}
> alter session set `planner.enable_nljoin_for_scalar_only`=false;
> {code}
> 2. Disable join optimization to prevent Drill from flipping the sides of the join, which may break query execution because the NestedLoopJoin operator that executes unequal joins supports only the left join.
> {code:java}
> alter session set `planner.enable_join_optimization`=false;
> {code}
> 3. Execute a join in which one side is a UNION ALL of DISTINCT subqueries:
> {code:java}
> SELECT *
> FROM (
>     (
>         SELECT DISTINCT
>             log_number
>         FROM
>             dfs.tmp.`tableWithNumber2.parquet`
>     )
>     UNION ALL
>     (
>         SELECT DISTINCT
>             log_number
>         FROM
>             dfs.tmp.`tableWithNumber2.parquet`
>     )
> ) t1
> LEFT JOIN (
>     SELECT
>         log_number AS server_number
>     FROM
>         dfs.tmp.`tableWithNumber2.parquet`
> ) t3
> ON (
>     t3.server_number >= t1.log_number
> )
> {code}
> {{Parquet file for the reproduction: [^tableWithNumber2.parquet]. It contains 10 000 rows, with a single column of random double values. The file was generated by Drill:}}
> {code:java}
> apache drill> select *, sqlTypeOf(log_number) as log_number_column_type from dfs.tmp.`tableWithNumber2.parquet` limit 10;
> +------------+------------------------+
> | log_number | log_number_column_type |
> +------------+------------------------+
> | 4.0        | DOUBLE                 |
> | 5.0        | DOUBLE                 |
> | 4.0        | DOUBLE                 |
> | 3.0        | DOUBLE                 |
> | 4.0        | DOUBLE                 |
> | 0.0        | DOUBLE                 |
> | 0.0        | DOUBLE                 |
> | 8.0        | DOUBLE                 |
> | 3.0        | DOUBLE                 |
> | 4.0        | DOUBLE                 |
> +------------+------------------------+
> 10 rows selected (0.175 seconds)
> {code}
> Also attaching the profile of the failed query.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
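
For readers following the pull request description, the intended handling of the five outcomes can be illustrated with a minimal, self-contained sketch. This is not Drill's code and not the actual fix: `EmptyBatchSketch`, `Batch`, `Outcome`, and `drain` are hypothetical names invented for the example. It only assumes what the description states, namely that an `OK` (or `OK_NEW_SCHEMA`, `EMIT`) batch may carry 0 records and that only `NONE` signals the end of the stream.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical illustration only; none of these types are Drill classes.
class EmptyBatchSketch {

    // Mirrors the five states listed in the pull request description.
    enum Outcome { NONE, OK, OK_NEW_SCHEMA, NOT_YET, EMIT }

    // Stand-in for a record batch: an outcome plus a record count.
    static final class Batch {
        final Outcome outcome;
        final int recordCount;

        Batch(Outcome outcome, int recordCount) {
            this.outcome = outcome;
            this.recordCount = recordCount;
        }
    }

    // Consumes batches until NONE and returns the record counts of the
    // non-empty ones. An empty OK batch does not end the stream.
    static List<Integer> drain(Iterator<Batch> incoming) {
        List<Integer> consumed = new ArrayList<>();
        while (incoming.hasNext()) {
            Batch batch = incoming.next();
            switch (batch.outcome) {
                case NONE:
                    // Only NONE means the stream is finished.
                    return consumed;
                case NOT_YET:
                    // No data available yet: request the next batch.
                    continue;
                case OK:
                case OK_NEW_SCHEMA:
                case EMIT:
                    if (batch.recordCount == 0) {
                        // Empty batch: keep requesting instead of stopping.
                        // (A real operator would also have to react to a
                        // schema change here; that is omitted.)
                        continue;
                    }
                    consumed.add(batch.recordCount);
                    break;
            }
        }
        return consumed;
    }
}
```

In this sketch the failure mode from the description would correspond to returning from the loop on the first empty `OK` batch, which leaves later batches unread.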