I spent some time over the weekend altering Drill's storage-kudu to use
Kudu's predicate pushdown API. Everything worked great as long as I
performed flat filtered selects (e.g. "SELECT .. FROM .. WHERE ..") but
whenever I tested aggregate queries, they would succeed sometimes, then
fail other times -- using the exact same queries.

The failures always looked like the errors below. After searching around,
I came across a number of JIRAs, like
https://issues.apache.org/jira/browse/DRILL-2602, that imply Drill can't
handle sort/aggregate queries on "changing schemas". This was confusing to
me because I was testing with a single table and a single schema, which
leaves me wondering whether "changing schema" refers to the unknown type of
the aggregate itself. Meaning, for SELECT SUM(a), b FROM t GROUP BY b;
where field a is an INT64, Drill can't figure out how to deal with SUM(a)
because it may exceed the range of INT64?

If someone could clarify this for me I'd really appreciate it. I'm really
hoping my above understanding is not correct and it's just a problem with
the Vector handling in storage-kudu, because otherwise it seems that
Drill's aggregation capabilities are rather limited.

Errors:

java.lang.IllegalStateException: Failure while reading vector.  Expected vector class of org.apache.drill.exec.vector.NullableIntVector but was holding vector class org.apache.drill.exec.vector.BigIntVector, field= campaign_id(BIGINT:REQUIRED)
    at org.apache.drill.exec.record.VectorContainer.getValueAccessorById(VectorContainer.java:321)
    at org.apache.drill.exec.record.RecordBatchLoader.getValueAccessorById(RecordBatchLoader.java:179)

OR

Error: UNSUPPORTED_OPERATION ERROR: Sort doesn't currently support sorts with changing schemas.
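
To make the first error concrete: the mismatch it reports can be modelled as
the same column being described in two different ways, nullable INT in one
place and required BIGINT in another. The sketch below is a toy model in
plain Java, not Drill code; the Field class, the Mode enum and the
isSchemaChange helper are hypothetical names used only for illustration. It
only shows one possible reading, under which a single table could still look
like a "changing schema" if a column's type or nullability is reported
inconsistently. Whether that is what actually happens in storage-kudu is
exactly the open question.

```java
import java.util.Objects;

// Toy model only: none of these types are Drill's own classes.
public class SchemaChangeSketch {

    // Hypothetical nullability flag, mirroring the REQUIRED seen in the error.
    enum Mode { REQUIRED, OPTIONAL }

    // Hypothetical column description: name, SQL type and nullability.
    static final class Field {
        final String name;
        final String type;
        final Mode mode;

        Field(String name, String type, Mode mode) {
            this.name = name;
            this.type = type;
            this.mode = mode;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Field)) {
                return false;
            }
            Field other = (Field) o;
            return name.equals(other.name)
                    && type.equals(other.type)
                    && mode == other.mode;
        }

        @Override
        public int hashCode() {
            return Objects.hash(name, type, mode);
        }

        @Override
        public String toString() {
            return name + "(" + type + ":" + mode + ")";
        }
    }

    // In this model a "schema change" is the same column name arriving
    // with a different type or a different nullability.
    static boolean isSchemaChange(Field previous, Field current) {
        return previous.name.equals(current.name) && !previous.equals(current);
    }

    public static void main(String[] args) {
        // What the reader of the vector expected (NullableIntVector).
        Field expected = new Field("campaign_id", "INT", Mode.OPTIONAL);
        // What it actually found (BigIntVector, BIGINT:REQUIRED).
        Field actual = new Field("campaign_id", "BIGINT", Mode.REQUIRED);

        System.out.println(expected + " -> " + actual);
        System.out.println(isSchemaChange(expected, actual)); // prints "true"
    }
}
```

Under this reading the aggregate's result type plays no part: the check
fails purely because the two descriptions of campaign_id disagree.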
