[ https://issues.apache.org/jira/browse/DRILL-8511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17882786#comment-17882786 ]

ASF GitHub Bot commented on DRILL-8511:
---------------------------------------

rymarm opened a new pull request, #2943:
URL: https://github.com/apache/drill/pull/2943

   # [DRILL-8511](https://issues.apache.org/jira/browse/DRILL-8511): Overflow 
appeared when the batch reached rows limit
   
   ## Description
   
   The size-aware scan framework fails to end a batch: on batch end it tries to 
reallocate the vector because of a hidden, minor bug in `BitColumnWriter`. The 
bug is normally harmless, but it surfaces in one specific case: when the 
initial vector allocation size is exceeded and a reader reaches the batch row 
limit.
   
   `BitColumnWriter` uses the value count instead of a write index when sizing 
the vector in `setValueCount()`, which triggers an unexpected vector 
reallocation and the resulting `FULL_BATCH` overflow (see the changes).
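   The effect can be sketched outside Drill (hypothetical code, not the actual 
`BitColumnWriter`; the sizes are illustrative): for a bit-packed vector, 
treating a value count as if it were a write index asks for one value past the 
end, so the capacity check fails exactly when the vector is full and forces a 
needless resize at batch end.

   ```java
   // Hypothetical sketch: bit values are packed 8 per byte. Sizing the buffer
   // from "valueCount + 1" (count used as an index) instead of
   // "lastWriteIndex + 1" overestimates by one value when the vector is full.
   public class BitCapacitySketch {

       // Bytes needed to hold `count` bit values, packed 8 per byte.
       static int bytesNeeded(int count) {
           return (count + 7) / 8;
       }

       public static void main(String[] args) {
           int allocatedBytes = 4096;   // room for 32768 bit values
           int lastWriteIndex = 32767;  // last row actually written
           int valueCount = 32768;      // row count reported at batch end

           // Sizing from the write index: vector is exactly full, no resize.
           boolean resizeFromIndex =
               bytesNeeded(lastWriteIndex + 1) > allocatedBytes;

           // Sizing from the count as if it were an index: one value past the
           // end, so a resize (and, in a full batch, an overflow) is triggered.
           boolean resizeFromCount =
               bytesNeeded(valueCount + 1) > allocatedBytes;

           System.out.println(resizeFromIndex + " " + resizeFromCount);
       }
   }
   ```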
   
   ## Documentation
   No changes required.
   
   ## Testing
   Manual tests
   




> Overflow appeared when the batch reached rows limit
> ---------------------------------------------------
>
>                 Key: DRILL-8511
>                 URL: https://issues.apache.org/jira/browse/DRILL-8511
>             Project: Apache Drill
>          Issue Type: Bug
>    Affects Versions: 1.21.2
>            Reporter: Maksym Rymar
>            Assignee: Maksym Rymar
>            Priority: Major
>         Attachments: complex.zip
>
>
>  
> Drill fails to read a JSON file with the exception 
> {{java.lang.IllegalStateException: Unexpected state: FULL_BATCH}}:
> {code:java}
> Caused by: java.lang.IllegalStateException: Unexpected state: FULL_BATCH
>         at 
> org.apache.drill.exec.physical.resultSet.impl.ResultSetLoaderImpl.overflowed(ResultSetLoaderImpl.java:639)
>         at 
> org.apache.drill.exec.physical.resultSet.impl.ColumnState$PrimitiveColumnState.overflowed(ColumnState.java:73)
>         at 
> org.apache.drill.exec.vector.accessor.writer.BaseScalarWriter.overflowed(BaseScalarWriter.java:214)
>         at 
> org.apache.drill.exec.vector.accessor.writer.AbstractFixedWidthWriter.resize(AbstractFixedWidthWriter.java:249)
>         at 
> org.apache.drill.exec.vector.accessor.writer.BitColumnWriter.prepareWrite(BitColumnWriter.java:77)
>         at 
> org.apache.drill.exec.vector.accessor.writer.BitColumnWriter.setValueCount(BitColumnWriter.java:87)
>         at 
> org.apache.drill.exec.vector.accessor.writer.AbstractFixedWidthWriter.endWrite(AbstractFixedWidthWriter.java:299)
>         at 
> org.apache.drill.exec.vector.accessor.writer.NullableScalarWriter.endWrite(NullableScalarWriter.java:298)
>         at 
> org.apache.drill.exec.vector.accessor.writer.AbstractTupleWriter.endWrite(AbstractTupleWriter.java:366)
>         at 
> org.apache.drill.exec.physical.resultSet.impl.RowSetLoaderImpl.endBatch(RowSetLoaderImpl.java:101)
>         at 
> org.apache.drill.exec.physical.resultSet.impl.ResultSetLoaderImpl.harvestNormalBatch(ResultSetLoaderImpl.java:730)
>         at 
> org.apache.drill.exec.physical.resultSet.impl.ResultSetLoaderImpl.harvest(ResultSetLoaderImpl.java:700)
>         at 
> org.apache.drill.exec.physical.impl.scan.project.ReaderSchemaOrchestrator.endBatch(ReaderSchemaOrchestrator.java:137)
>         at 
> org.apache.drill.exec.physical.impl.scan.framework.ShimBatchReader.next(ShimBatchReader.java:148)
>         at 
> org.apache.drill.exec.physical.impl.scan.ReaderState.readBatch(ReaderState.java:400)
>         at 
> org.apache.drill.exec.physical.impl.scan.ReaderState.next(ReaderState.java:361)
>         at 
> org.apache.drill.exec.physical.impl.scan.ScanOperatorExec.nextAction(ScanOperatorExec.java:270)
>         at 
> org.apache.drill.exec.physical.impl.scan.ScanOperatorExec.next(ScanOperatorExec.java:242)
>         at 
> org.apache.drill.exec.physical.impl.protocol.OperatorDriver.doNext(OperatorDriver.java:201)
>         at 
> org.apache.drill.exec.physical.impl.protocol.OperatorDriver.start(OperatorDriver.java:179)
>         at 
> org.apache.drill.exec.physical.impl.protocol.OperatorDriver.next(OperatorDriver.java:129)
>         at 
> org.apache.drill.exec.physical.impl.protocol.OperatorRecordBatch.next(OperatorRecordBatch.java:149)
>         at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:109)
>         at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:101)
>         at 
> org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext(AbstractUnaryRecordBatch.java:59)
>         at 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext(ProjectRecordBatch.java:93)
>         at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:161)
>         at 
> org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:103)
>         at 
> org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext(ScreenCreator.java:81)
>         at 
> org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:93)
>         at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.lambda$run$0(FragmentExecutor.java:324)
>         at .......(:0)
>         at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:2012)
>         at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:313)
>         at 
> org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38)
>         at .......(:0) {code}
> The overflow appeared when the batch reached the row limit in the JSON 
> reader. To reproduce the issue, execute the following query against the 
> attached file:
>  
> {code:java}
> SELECT id, 
>          gbyi, 
>          gbyt, 
>          fl, 
>          nul, 
>          bool, 
>          str, 
>          sia, 
>          sfa, 
>          soa, 
>          ooa, 
>          oooi, 
>          ooof, 
>          ooos, 
>          oooa 
>   FROM   dfs.tmp.`complex.json` {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
