> I'm getting an error in Hive when executing a query on a table in ORC format.

This is not an ORC bug; it looks like a vectorization issue.

Can you try comparing both query plans ("explain <query>") for the Execution mode: vectorized markers?

TextFile queries are not vectorized today, since you cannot tell whether any column is marked as isRepeating=true in a row-major format.

> SELECT CONCAT(TO_DATE(datetime), '-'), SUM(gpa) FROM students_orc
> GROUP BY CONCAT(TO_DATE(datetime), '-');
...
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unsuported vector output type: StringGroup
>   at org.apache.hadoop.hive.ql.exec.vector.VectorColumnSetInfo.addKey(VectorColumnSetInfo.java:139)
>   at org.apache.hadoop.hive.ql.exec.vector.VectorHashKeyWrapperBatch.compileKeyWrapperBatch(VectorHashKeyWrapperBatch.java:521)
>   at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator.initializeOp(VectorGroupByOperator.java:786)

The correct fix would be to handle this query pattern for vectorization (or to automatically disable vectorization, as it has to for Unions).

Can you log a bug on Apache JIRA against the version of Hive that threw this error?

Cheers,
Gopal
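
P.S. A rough sketch of the plan comparison suggested above, in HiveQL. The table students_text is a hypothetical TextFile copy of students_orc, used here only for illustration. The last statement is a possible interim workaround rather than a fix: it assumes that turning off vectorization for the session (hive.vectorized.execution.enabled) lets the query fall back to the row-mode group-by, which is worth confirming on your version.

```sql
-- Plan for the ORC table: look for "Execution mode: vectorized" in the output.
EXPLAIN
SELECT CONCAT(TO_DATE(datetime), '-'), SUM(gpa)
FROM students_orc
GROUP BY CONCAT(TO_DATE(datetime), '-');

-- Plan for a TextFile table (students_text is a hypothetical name);
-- this plan should not carry the vectorized marker.
EXPLAIN
SELECT CONCAT(TO_DATE(datetime), '-'), SUM(gpa)
FROM students_text
GROUP BY CONCAT(TO_DATE(datetime), '-');

-- Possible interim workaround (an assumption, not a fix):
-- disable vectorized execution for this session, then rerun the query.
SET hive.vectorized.execution.enabled=false;

SELECT CONCAT(TO_DATE(datetime), '-'), SUM(gpa)
FROM students_orc
GROUP BY CONCAT(TO_DATE(datetime), '-');
```

If the ORC plan shows the vectorized marker and the TextFile plan does not, that difference is worth including in the JIRA report.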