Hi Charles,

Looks like I forgot the extra newlines needed to make my e-mail provider work with Apache's mailer. Let me try again.

In the "classic" readers, each reader picks some number of rows per batch, often 1K, 4K, 4000, etc. The idea is that, on average, this row count will give us a decently-sized record batch.

EVF uses a different approach. It tallies the total space used by the batch, and the space taken by each vector, and it decides when the batch is full. (You can also set a row count limit. By default, the limit is 64K, the maximum row count allowed in a Drill batch.)

So, if you are converting from a "classic" reader to EVF, you will want to remove the reader-imposed row count limit. (Or, if there is reason to do so, you can pass that limit to EVF and let EVF enforce it.)

For example, convert code that looks like this:

    for (int rowCount = 0; rowCount < MAX_BATCH_SIZE; rowCount++) {
      // Load a row
    }

Into code that looks like this:

    RowSetLoader rowWriter = // get from EVF per examples
    while (! rowWriter.isFull()) {
      // Load a row
      rowWriter.save();
    }

Note that the "save" is needed because EVF lets you discard a row: you can load it, check it, and decide to skip it. This will, eventually, allow us to push filtering down to the reader. For now, you just need to call save().

Thanks,
- Paul

On Tuesday, September 24, 2019, 09:56:34 AM PDT, Paul Rogers <par0...@yahoo.com.INVALID> wrote:

So the usual pattern is:

    while (! rowWriter.isFull()) {
      // Load the row
      rowWriter.save();
    }

Is it the case that PCAP is trying to force the row count to, say, 4K or 8K or whatever? If so, ignore that count. The error is telling you that at least one vector has reached 16 MB in size (or that you've reached the row count limit, if you set one).

Thanks,
- Paul

On Monday, September 23, 2019, 08:09:38 PM PDT, Charles Givre <cgi...@gmail.com> wrote:

Ok... so I have yet another question relating to the EVF. I'm working on a project to improve (hopefully) the PCAP plugin with the ultimate goal being to include parsed PCAP packet data. In any event, I've run into a snag.
In one unit test, I'm getting the error below when I call rowWriter.save(). I suspect what is happening is that the TupleWriter is not full when it starts writing the row, but by the time it is finished writing the fields, the batch is full. Does that even make sense?

Here's a link to the offending code:
<https://github.com/cgivre/drill/blob/caa69e7f27f68aeedaa28902596a2250ab32cd84/exec/java-exec/src/main/java/org/apache/drill/exec/store/pcap/PcapBatchReader.java#L232>

Any suggestions?
Thanks!
--C

[Error Id: bf6446fe-95d5-4ba7-bf7f-32e8a5dd4d1e on 192.168.1.21:31010]

  (java.lang.IllegalStateException) Unexpected state: FULL_BATCH
    org.apache.drill.exec.physical.resultSet.impl.ResultSetLoaderImpl.saveRow():530
    org.apache.drill.exec.physical.resultSet.impl.RowSetLoaderImpl.save():73
    org.apache.drill.exec.store.pcap.PcapBatchReader.addDataToTable():232
    org.apache.drill.exec.store.pcap.PcapBatchReader.parsePcapFilesAndPutItToTable():174
    org.apache.drill.exec.store.pcap.PcapBatchReader.next():102
    org.apache.drill.exec.physical.impl.scan.framework.ShimBatchReader.next():132
    org.apache.drill.exec.physical.impl.scan.ReaderState.readBatch():412
    org.apache.drill.exec.physical.impl.scan.ReaderState.next():369
    org.apache.drill.exec.physical.impl.scan.ScanOperatorExec.nextAction():261
    org.apache.drill.exec.physical.impl.scan.ScanOperatorExec.next():232
    org.apache.drill.exec.physical.impl.protocol.OperatorDriver.doNext():196
    org.apache.drill.exec.physical.impl.protocol.OperatorDriver.start():174
    org.apache.drill.exec.physical.impl.protocol.OperatorDriver.next():124
    org.apache.drill.exec.physical.impl.protocol.OperatorRecordBatch.next():148
    org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next():237
    org.apache.drill.exec.record.AbstractRecordBatch.next():126
    org.apache.drill.exec.record.AbstractRecordBatch.next():116
    org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext():63
    org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():141
    org.apache.drill.exec.record.AbstractRecordBatch.next():186
    org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next():237
    org.apache.drill.exec.physical.impl.BaseRootExec.next():104
    org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext():83
    org.apache.drill.exec.physical.impl.BaseRootExec.next():94
    org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():296
    org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():283
    java.security.AccessController.doPrivileged():-2
    javax.security.auth.Subject.doAs():422
    org.apache.hadoop.security.UserGroupInformation.doAs():1746
    org.apache.drill.exec.work.fragment.FragmentExecutor.run():283
    org.apache.drill.common.SelfCleaningRunnable.run():38
    java.util.concurrent.ThreadPoolExecutor.runWorker():1149
    java.util.concurrent.ThreadPoolExecutor$Worker.run():624
    java.lang.Thread.run():748

    at org.apache.drill.exec.rpc.RpcException.mapException(RpcException.java:60) ~[classes/:na]
    at org.apache.drill.exec.client.DrillClient$ListHoldingResultsListener.getResults(DrillClient.java:881) ~[classes/:na]
    at org.apache.drill.exec.client.DrillClient.runQuery(DrillClient.java:583) ~[classes/:na]
    at org.apache.drill.test.BaseTestQuery.testRunAndReturn(BaseTestQuery.java:340) ~[test-classes/:na]
    at org.apache.drill.test.BaseTestQuery.testSqlWithResults(BaseTestQuery.java:321) ~[test-classes/:na]
    at org.apache.drill.exec.store.pcap.TestPcapRecordReader.runSQLWithResults(TestPcapRecordReader.java:90) ~[test-classes/:na]
    at org.apache.drill.exec.store.pcap.TestPcapRecordReader.runSQLVerifyCount(TestPcapRecordReader.java:85) ~[test-classes/:na]
    at org.apache.drill.exec.store.pcap.TestPcapRecordReader.testCorruptPCAPQuery(TestPcapRecordReader.java:47) ~[test-classes/:na]
    at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_211]
Caused by: org.apache.drill.common.exceptions.UserRemoteException: EXECUTION_ERROR ERROR: Unexpected state: FULL_BATCH

Read failed for reader PcapBatchReader
Fragment 0:0

[Error Id: bf6446fe-95d5-4ba7-bf7f-32e8a5dd4d1e on 192.168.1.21:31010]

  (java.lang.IllegalStateException) Unexpected state: FULL_BATCH
    org.apache.drill.exec.physical.resultSet.impl.ResultSetLoaderImpl.saveRow():530
    org.apache.drill.exec.physical.resultSet.impl.RowSetLoaderImpl.save():73
    org.apache.drill.exec.store.pcap.PcapBatchReader.addDataToTable():232
    org.apache.drill.exec.store.pcap.PcapBatchReader.parsePcapFilesAndPutItToTable():174
    org.apache.drill.exec.store.pcap.PcapBatchReader.next():102
    org.apache.drill.exec.physical.impl.scan.framework.ShimBatchReader.next():132
    org.apache.drill.exec.physical.impl.scan.ReaderState.readBatch():412
    org.apache.drill.exec.physical.impl.scan.ReaderState.next():369
    org.apache.drill.exec.physical.impl.scan.ScanOperatorExec.nextAction():261
    org.apache.drill.exec.physical.impl.scan.ScanOperatorExec.next():232
    org.apache.drill.exec.physical.impl.protocol.OperatorDriver.doNext():196
    org.apache.drill.exec.physical.impl.protocol.OperatorDriver.start():174
    org.apache.drill.exec.physical.impl.protocol.OperatorDriver.next():124
    org.apache.drill.exec.physical.impl.protocol.OperatorRecordBatch.next():148
    org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next():237
    org.apache.drill.exec.record.AbstractRecordBatch.next():126
    org.apache.drill.exec.record.AbstractRecordBatch.next():116
    org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext():63
    org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():141
    org.apache.drill.exec.record.AbstractRecordBatch.next():186
    org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next():237
    org.apache.drill.exec.physical.impl.BaseRootExec.next():104
    org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext():83
    org.apache.drill.exec.physical.impl.BaseRootExec.next():94
    org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():296
    org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():283
    java.security.AccessController.doPrivileged():-2
    javax.security.auth.Subject.doAs():422
    org.apache.hadoop.security.UserGroupInformation.doAs():1746
    org.apache.drill.exec.work.fragment.FragmentExecutor.run():283
    org.apache.drill.common.SelfCleaningRunnable.run():38
    java.util.concurrent.ThreadPoolExecutor.runWorker():1149
    java.util.concurrent.ThreadPoolExecutor$Worker.run():624
    java.lang.Thread.run():748
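
As a rough illustration of the pattern discussed in this thread (check isFull() before loading each row, and call save() only for rows you want to keep), here is a minimal, self-contained Java sketch. Note that ToyRowWriter and BatchFillSketch are hypothetical stand-ins invented for this example. They are not Drill's RowSetLoader, and the byte budget, the row limit, and the start() and setString() methods are assumptions chosen only so the loop can run without Drill on the classpath.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Iterator;

/** Drives the toy writer with the "while not full" loop from the thread. */
final class BatchFillSketch {

  /** Loads rows until the writer reports full or the source runs dry. */
  static int fillBatch(ToyRowWriter writer, Iterator<String> source) {
    while (!writer.isFull() && source.hasNext()) {
      String value = source.next();
      writer.start();              // begin a new row (toy API)
      writer.setString(value);     // load the row
      if (value.isEmpty()) {
        continue;                  // skip this row: it is never saved
      }
      writer.save();               // commit the row to the batch
    }
    return writer.rowCount();
  }

  public static void main(String[] args) {
    // Hypothetical limits: 10-byte batch budget, 100-row cap.
    ToyRowWriter writer = new ToyRowWriter(10, 100);
    Iterator<String> rows =
        Arrays.asList("aaaa", "", "bbbb", "cccc", "dddd").iterator();
    int saved = fillBatch(writer, rows);
    System.out.println("rows saved: " + saved + ", full: " + writer.isFull());
  }
}

/** Toy stand-in for an EVF row writer. NOT Drill's RowSetLoader. */
final class ToyRowWriter {
  private final int byteBudget;  // assumed per-batch size budget
  private final int rowLimit;    // assumed row count cap
  private int savedBytes;        // bytes held by rows already saved
  private int savedRows;         // number of rows already saved
  private int pendingBytes;      // bytes of the row currently being loaded

  ToyRowWriter(int byteBudget, int rowLimit) {
    this.byteBudget = byteBudget;
    this.rowLimit = rowLimit;
  }

  /** The writer, not the reader, decides when the batch is full. */
  boolean isFull() {
    return savedBytes >= byteBudget || savedRows >= rowLimit;
  }

  /** Starts a new row, dropping any loaded-but-unsaved data. */
  void start() {
    pendingBytes = 0;
  }

  void setString(String value) {
    pendingBytes += value.getBytes(StandardCharsets.UTF_8).length;
  }

  /** Commits the pending row; refuses if the batch is already full. */
  void save() {
    if (isFull()) {
      throw new IllegalStateException("Unexpected state: FULL_BATCH");
    }
    savedBytes += pendingBytes;
    savedRows++;
    pendingBytes = 0;
  }

  int rowCount() {
    return savedRows;
  }
}
```

In this toy, the empty row is loaded but never saved, and the loop stops as soon as the writer reports that it is full, leaving the remaining input for the next batch. A reader that keeps its own fixed row count and calls save() without checking isFull() would hit the toy's IllegalStateException, which is meant to resemble the failure reported above, though the real EVF logic is more involved than this sketch.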