vdiravka commented on pull request #2364: URL: https://github.com/apache/drill/pull/2364#issuecomment-1030429509
Hi @paul-rogers I have rebased the branch to master branch. And in separate new commit removed the hack, which hid the schema change in the `HashAggTemplate` (and actually one row is missing in query result, just actually test case doesn't check it). Thanks for explanation how vectors is working, it helped me. It is clear now, that schema is changing due to `RepeatedMapVector` [can't be obtained from the cache](https://github.com/apache/drill/blob/317f164791bbbe8f937eb452b49e92c34f1c0333/exec/java-exec/src/main/java/org/apache/drill/exec/physical/resultSet/impl/ColumnBuilder.java#L220): ``` // Don't get the map vector from the vector cache. Map vectors may // have content that varies from batch to batch. Only the leaf // vectors can be cached. ``` Obtaining vector from cache here leads to errors in this and other test cases: `mapVector = (RepeatedMapVector) parent.vectorCache().vectorFor(mapColSchema.schema());` ``` org.apache.drill.common.exceptions.UserRemoteException: EXECUTION_ERROR ERROR: null Read failed for reader: JsonBatchReader .... Caused by: java.lang.AssertionError: at org.apache.drill.exec.physical.resultSet.impl.TupleState$MapState.addOutputColumn(TupleState.java:475) at org.apache.drill.exec.physical.resultSet.impl.ColumnState.buildOutput(ColumnState.java:321) at org.apache.drill.exec.physical.resultSet.impl.TupleState.updateOutput(TupleState.java:206) at org.apache.drill.exec.physical.resultSet.impl.TupleState.updateOutput(TupleState.java:217) at org.apache.drill.exec.physical.resultSet.impl.TupleState$RowState.updateOutput(TupleState.java:430) at org.apache.drill.exec.physical.resultSet.impl.ResultSetLoaderImpl.harvest(ResultSetLoaderImpl.java:716) ``` So as for me looks like we need to implement supporting schema change for hashAgg operator or obtaining `RepeatedMapVector` from the cache. I lean towards the latter. What do you think? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
