Hi Charles,

Looks like I forgot the extra newlines needed to make my e-mail provider work 
with Apache's mailer. Let me try again.

In the "classic" readers, each reader picks some number of rows per batch, 
often 1K, 4K, 4000, etc. The idea is that, on average, this row count will give 
us a decently-sized record batch.

EVF uses a different approach. It tallies the total space used by the batch, 
and the space taken by each vector, and it will decide when the batch is full. 
(You can also set a row count limit. By default, the limit is 64K, the maximum 
row count allowed in a Drill batch.)
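
To see why a byte-based limit matters more than a row count, here's a rough back-of-the-envelope example (the 2 KB row size is just an assumption for illustration; the 16 MB per-vector figure is the one that shows up in the error quoted below):

// Back-of-the-envelope: why EVF watches bytes, not just rows.
// Assume a hypothetical row that carries about 2 KB of VARCHAR data.
int maxRows = 65_536;                            // Drill's 64K row limit per batch
int bytesPerRow = 2 * 1024;                      // assumed row width
long batchBytes = (long) maxRows * bytesPerRow;  // = 134,217,728 bytes, ~128 MB
// A single vector would blow far past the 16 MB mark, so EVF ends the
// batch on size long before the 64K row limit is ever reached.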

So, if you are converting from a "classic" reader to EVF, you will want to 
remove the reader-imposed row count limit. (Or, if there is reason to do so, 
you can pass that limit to EVF and let EVF enforce it.)

For example, convert code that looks like this:

for (int rowCount = 0; rowCount < MAX_BATCH_SIZE; rowCount++) {
  // Load a row
}

Into code that looks like this:

RowSetLoader rowWriter = ... // get from EVF, per the examples
while (! rowWriter.isFull()) {
  // Load a row
  rowWriter.save();
}

Note that the save() is needed because EVF lets you discard a row: you can 
load it, check it, and decide to skip it. This will, eventually, allow us to 
push filtering down to the reader. For now, you just need to call save().
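
For context, here's a minimal sketch of how that loop typically sits inside an 
EVF reader's next() method (nextPacket() is a hypothetical helper for whatever 
source you are reading; the column writes are elided):

public boolean next() {
  while (! rowWriter.isFull()) {
    if (! nextPacket()) {
      return false;        // end of input: this is the last batch
    }
    rowWriter.start();     // begin the row
    // ... set column values via the column writers ...
    rowWriter.save();      // commit the row (skip save() to discard it)
  }
  return true;             // batch is full; EVF will call next() again
}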


Thanks,
- Paul

 

    On Tuesday, September 24, 2019, 09:56:34 AM PDT, Paul Rogers 
<par0...@yahoo.com.INVALID> wrote:  
 
 So the usual pattern is:

while (! rowWriter.isFull()) {
  // Load the row
  rowWriter.save();
}
Is it the case that PCAP is trying to force the row count to, say, 4K or 8K or 
whatever? If so, ignore that count.
The error is telling you that at least one vector has reached 16 MB in size 
(or that you've reached the row count limit, if you set one).
Thanks,
- Paul

 

    On Monday, September 23, 2019, 08:09:38 PM PDT, Charles Givre 
<cgi...@gmail.com> wrote:  
 
 Ok... so I have yet another question relating to the EVF.  I'm working on a 
project to improve (hopefully) the PCAP plugin with the ultimate goal being to 
include parsed PCAP packet data.  In any event, I've run into a snag.  In one 
unit test, I'm getting the error below when I call rowWriter.save().  I suspect 
what is happening is that the TupleWriter is not full when it starts writing 
the row, but by the time it is finished writing the fields, the batch is full.  
Does that even make sense?
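
If it helps, I think the sequence boils down to something like this (a 
simplified sketch of my suspicion, not the actual code):

if (! rowWriter.isFull()) {   // not full when the row starts
  rowWriter.start();
  // ... write the packet's fields; I suspect the batch fills up in here ...
  rowWriter.save();           // throws "Unexpected state: FULL_BATCH"
}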

Here's a link to the offending code:
<https://github.com/cgivre/drill/blob/caa69e7f27f68aeedaa28902596a2250ab32cd84/exec/java-exec/src/main/java/org/apache/drill/exec/store/pcap/PcapBatchReader.java#L232>


Any suggestions?
Thanks!
--C




[Error Id: bf6446fe-95d5-4ba7-bf7f-32e8a5dd4d1e on 192.168.1.21:31010]

  (java.lang.IllegalStateException) Unexpected state: FULL_BATCH
    org.apache.drill.exec.physical.resultSet.impl.ResultSetLoaderImpl.saveRow():530
    org.apache.drill.exec.physical.resultSet.impl.RowSetLoaderImpl.save():73
    org.apache.drill.exec.store.pcap.PcapBatchReader.addDataToTable():232
    org.apache.drill.exec.store.pcap.PcapBatchReader.parsePcapFilesAndPutItToTable():174
    org.apache.drill.exec.store.pcap.PcapBatchReader.next():102
    org.apache.drill.exec.physical.impl.scan.framework.ShimBatchReader.next():132
    org.apache.drill.exec.physical.impl.scan.ReaderState.readBatch():412
    org.apache.drill.exec.physical.impl.scan.ReaderState.next():369
    org.apache.drill.exec.physical.impl.scan.ScanOperatorExec.nextAction():261
    org.apache.drill.exec.physical.impl.scan.ScanOperatorExec.next():232
    org.apache.drill.exec.physical.impl.protocol.OperatorDriver.doNext():196
    org.apache.drill.exec.physical.impl.protocol.OperatorDriver.start():174
    org.apache.drill.exec.physical.impl.protocol.OperatorDriver.next():124
    org.apache.drill.exec.physical.impl.protocol.OperatorRecordBatch.next():148
    org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next():237
    org.apache.drill.exec.record.AbstractRecordBatch.next():126
    org.apache.drill.exec.record.AbstractRecordBatch.next():116
    org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext():63
    org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():141
    org.apache.drill.exec.record.AbstractRecordBatch.next():186
    org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next():237
    org.apache.drill.exec.physical.impl.BaseRootExec.next():104
    org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext():83
    org.apache.drill.exec.physical.impl.BaseRootExec.next():94
    org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():296
    org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():283
    java.security.AccessController.doPrivileged():-2
    javax.security.auth.Subject.doAs():422
    org.apache.hadoop.security.UserGroupInformation.doAs():1746
    org.apache.drill.exec.work.fragment.FragmentExecutor.run():283
    org.apache.drill.common.SelfCleaningRunnable.run():38
    java.util.concurrent.ThreadPoolExecutor.runWorker():1149
    java.util.concurrent.ThreadPoolExecutor$Worker.run():624
    java.lang.Thread.run():748

    at org.apache.drill.exec.rpc.RpcException.mapException(RpcException.java:60) ~[classes/:na]
    at org.apache.drill.exec.client.DrillClient$ListHoldingResultsListener.getResults(DrillClient.java:881) ~[classes/:na]
    at org.apache.drill.exec.client.DrillClient.runQuery(DrillClient.java:583) ~[classes/:na]
    at org.apache.drill.test.BaseTestQuery.testRunAndReturn(BaseTestQuery.java:340) ~[test-classes/:na]
    at org.apache.drill.test.BaseTestQuery.testSqlWithResults(BaseTestQuery.java:321) ~[test-classes/:na]
    at org.apache.drill.exec.store.pcap.TestPcapRecordReader.runSQLWithResults(TestPcapRecordReader.java:90) ~[test-classes/:na]
    at org.apache.drill.exec.store.pcap.TestPcapRecordReader.runSQLVerifyCount(TestPcapRecordReader.java:85) ~[test-classes/:na]
    at org.apache.drill.exec.store.pcap.TestPcapRecordReader.testCorruptPCAPQuery(TestPcapRecordReader.java:47) ~[test-classes/:na]
    at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_211]
Caused by: org.apache.drill.common.exceptions.UserRemoteException: EXECUTION_ERROR ERROR: Unexpected state: FULL_BATCH

Read failed for reader PcapBatchReader
Fragment 0:0

[Error Id: bf6446fe-95d5-4ba7-bf7f-32e8a5dd4d1e on 192.168.1.21:31010]

