[jira] [Commented] (DRILL-8372) Unfreed buffers when running a LIMIT 0 query over delimited text

ASF GitHub Bot (Jira) Wed, 25 Jan 2023 22:54:06 -0800


    [ 
https://issues.apache.org/jira/browse/DRILL-8372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17680906#comment-17680906
 ]


ASF GitHub Bot commented on DRILL-8372:
---------------------------------------

paul-rogers commented on code in PR #2728:
URL: https://github.com/apache/drill/pull/2728#discussion_r1087483902


##########
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/limit/LimitRecordBatch.java:
##########
@@ -75,7 +75,7 @@ public IterOutcome innerNext() {
         upStream = next(incoming);
       }
       // If EMIT that means leaf operator is UNNEST, in this case refresh the limit states and return EMIT.
-      if (upStream == EMIT) {
+      if (upStream == EMIT || upStream == NONE) {

Review Comment:
   This doesn't seem to be quite the right solution. This block of code 
handles a very particular case: an EMIT outcome from an UNNEST leaf operator.
   
   Expanding this code, look at the top of the loop:
   
   ```java
       if (!first && !needMoreRecords(numberOfRecords)) {
   ```
   
   With a LIMIT 0, we hit the limit on the first batch, but the `!first` 
condition means this check is skipped on the first call. I'm not quite sure why 
the `!first` is in place. Maybe the commit history would tell us. Perhaps the 
right answer is something like:
   
   ```java
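   // Check the limit on every call, including the first one.
   if (!needMoreRecords(numberOfRecords)) {
     outgoingSv.setRecordCount(0);
     // Release the upstream vectors so that no buffers stay allocated.
     VectorAccessibleUtilities.clear(incoming);
     return super.innerNext();
   }
   if (!first) {
     ...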
   ```
   
   I suspect that the logic actually needs more analysis. What does it do on 
the first batch now? What does `super.innerNext()` do, and do we want that if 
we've reached the limit?
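   
   To make the first-batch question concrete, here is a toy model of the two 
   guards. It is not Drill code: the class name, the helper methods and the 
   `remaining > 0` definition of `needMoreRecords` are all assumptions made 
   for illustration, and the real method may behave differently.
   
   ```java
   // Toy model only; none of these names come from LimitRecordBatch.
   class FirstGuardSketch {
     // Assumed stand-in for the operator's check: true while rows are still wanted.
     static boolean needMoreRecords(int remaining) {
       return remaining > 0;
     }

     // Guard as quoted from the top of the loop: never exits early on the first call.
     static boolean exitsEarlyWithFirstGuard(boolean first, int remaining) {
       return !first && !needMoreRecords(remaining);
     }

     // Guard as suggested above: the limit check does not depend on `first`.
     static boolean exitsEarlyWithoutFirstGuard(boolean first, int remaining) {
       return !needMoreRecords(remaining);
     }

     public static void main(String[] args) {
       int limit = 0; // LIMIT 0
       System.out.println(exitsEarlyWithFirstGuard(true, limit));    // prints false
       System.out.println(exitsEarlyWithoutFirstGuard(true, limit)); // prints true
     }
   }
   ```
   
   Under these assumptions, the existing guard lets the first batch through 
   even when the limit is already zero, which is the case worth stepping 
   through in the real operator.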
   
   Generally, the debugger is the best way to sort this out. Try a LIMIT 0, a 
LIMIT n where n < size of the first batch, LIMIT n where n > batch size && n < 
2 * batch size, etc.
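   
   As a rough aid for that debugging session, the sketch below picks one 
   LIMIT value for each case listed above, given a batch size. The class and 
   method names are hypothetical, the batch size of 4096 is an arbitrary 
   example rather than a Drill default, and the query text reuses the `foo` 
   directory from the issue description.
   
   ```java
   import java.util.LinkedHashMap;
   import java.util.Map;

   // Hypothetical helper; not part of Drill or its test framework.
   class LimitCasesSketch {
     // Returns one labelled LIMIT value per boundary case named in the review.
     static Map<String, Integer> limitCases(int batchSize) {
       if (batchSize < 2) {
         throw new IllegalArgumentException("batchSize must be at least 2");
       }
       Map<String, Integer> cases = new LinkedHashMap<>();
       cases.put("LIMIT 0", 0);
       // 0 < n < batchSize
       cases.put("LIMIT below first batch", batchSize / 2);
       // batchSize < n < 2 * batchSize
       cases.put("LIMIT between one and two batches", batchSize + batchSize / 2);
       return cases;
     }

     public static void main(String[] args) {
       // Print one query per case so each can be run under the debugger.
       limitCases(4096).forEach((label, n) ->
           System.out.println(label + ": select * from `foo` limit " + n + ";"));
     }
   }
   ```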





> Unfreed buffers when running a LIMIT 0 query over delimited text
> ----------------------------------------------------------------
>
>                 Key: DRILL-8372
>                 URL: https://issues.apache.org/jira/browse/DRILL-8372
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Storage - Text & CSV
>    Affects Versions: 1.21.0
>            Reporter: James Turton
>            Assignee: James Turton
>            Priority: Major
>             Fix For: 1.21.0
>
>
> With the following data layout
>  
> {code:java}
> /tmp/foo/bar:
> large_csv.csvh
> /tmp/foo/boo:
> large_csv.csvh
> {code}
> a LIMIT 0 query over it results in unfreed buffer errors.
> {code:java}
> apache drill (dfs.tmp)> select * from `foo` limit 0;
> Error: SYSTEM ERROR: IllegalStateException: Allocator[op:0:0:4:EasySubScan] 
> closed with outstanding buffers allocated (3).
> Allocator(op:0:0:4:EasySubScan) 1000000/299008/3182592/10000000000 
> (res/actual/peak/limit)
>   child allocators: 0
>   ledgers: 3
>     ledger[113] allocator: op:0:0:4:EasySubScan), isOwning: true, size: 
> 262144, references: 1, life: 277785186322881..0, allocatorManager: [109, 
> life: 277785186258906..0] holds 1 buffers.
>         DrillBuf[142], udle: [110 0..262144]
>     ledger[114] allocator: op:0:0:4:EasySubScan), isOwning: true, size: 
> 32768, references: 1, life: 277785186463824..0, allocatorManager: [110, life: 
> 277785186414654..0] holds 1 buffers.
>         DrillBuf[143], udle: [111 0..32768]
>     ledger[112] allocator: op:0:0:4:EasySubScan), isOwning: true, size: 4096, 
> references: 1, life: 277785186046095..0, allocatorManager: [108, life: 
> 277785185921147..0] holds 1 buffers.
>         DrillBuf[141], udle: [109 0..4096]
>   reservations: 0 {code}
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
