I spent some time over the weekend altering Drill's storage-kudu to use Kudu's predicate pushdown API. Everything worked great as long as I performed flat filtered selects (e.g. "SELECT .. FROM .. WHERE .."), but whenever I tested aggregate queries, they would sometimes succeed and sometimes fail, using the exact same queries.
The failures always looked like the errors below. After searching around, I came across a number of JIRAs, like https://issues.apache.org/jira/browse/DRILL-2602, that imply Drill can't handle sort/aggregate queries on "changing schemas". This was confusing to me because I was testing with a single table and a single schema, which leaves me wondering whether "changing schema" refers to the unknown type of the aggregate itself. Meaning, for SELECT SUM(a), b FROM t GROUP BY b; where field a is an INT64, Drill can't figure out how to deal with SUM(a) because the result may exceed the range of INT64?

If someone could clarify this for me I'd really appreciate it. I'm really hoping my understanding above is not correct and that it's just a problem with the vector handling in storage-kudu, because otherwise it seems that Drill's aggregation capabilities are rather limited.

Errors:

java.lang.IllegalStateException: Failure while reading vector. Expected vector class of org.apache.drill.exec.vector.NullableIntVector but was holding vector class org.apache.drill.exec.vector.BigIntVector, field= campaign_id(BIGINT:REQUIRED)
    at org.apache.drill.exec.record.VectorContainer.getValueAccessorById(VectorContainer.java:321)
    at org.apache.drill.exec.record.RecordBatchLoader.getValueAccessorById(RecordBatchLoader.java:179)

OR

Error: UNSUPPORTED_OPERATION ERROR: Sort doesn't currently support sorts with changing schemas.
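
For what it's worth, here is the other way I can read the first error, as a minimal Java sketch. None of this is Drill code: Field and sameSchema are hypothetical names I made up for illustration, and the idea that two batches count as the same schema only when every column matches in name, type, and nullability mode is my assumption, not something I have confirmed. The two field descriptions are taken from the error message itself: a reader expecting a nullable INT for campaign_id while the batch holds a required BIGINT.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

// Hypothetical stand-in types for illustration; these are NOT Drill classes.
public class SchemaChangeSketch {

    // One column as a batch might describe it: name, value type, nullability mode.
    static final class Field {
        final String name;
        final String type; // e.g. "INT", "BIGINT"
        final String mode; // e.g. "REQUIRED", "OPTIONAL"

        Field(String name, String type, String mode) {
            this.name = name;
            this.type = type;
            this.mode = mode;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Field)) return false;
            Field f = (Field) o;
            return name.equals(f.name) && type.equals(f.type) && mode.equals(f.mode);
        }

        @Override
        public int hashCode() {
            return Objects.hash(name, type, mode);
        }

        @Override
        public String toString() {
            return name + "(" + type + ":" + mode + ")";
        }
    }

    // Assumption: schemas match only if every field agrees in name, type, and mode.
    static boolean sameSchema(List<Field> a, List<Field> b) {
        return a.equals(b);
    }

    public static void main(String[] args) {
        // What the batch actually holds, per the error message.
        List<Field> held = Arrays.asList(new Field("campaign_id", "BIGINT", "REQUIRED"));
        // What the reader expected (NullableIntVector, i.e. a nullable INT).
        List<Field> expected = Arrays.asList(new Field("campaign_id", "INT", "OPTIONAL"));
        System.out.println(held + " vs " + expected + " same schema: " + sameSchema(held, expected));
    }
}
```

If that reading is right, a single table could still look like a "changing schema" whenever two batches (or a batch and its consumer) disagree on the type or nullability of one column, which would point at storage-kudu rather than at the aggregate itself.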