All,

Moving this forward, I'll submit a resolution to the Apache board for the next meeting.
One of the concerns that has been raised is how to deal with the vectorization and SARG APIs. I'd like to propose that we pull the minimal set of classes into a new Hive module named "storage-api". This module will include VectorizedRowBatch, the various ColumnVector classes, and the SARG classes. It will form the start of an API that high-performance storage formats can use to integrate with Hive. Both ORC and Parquet can use the new API to support vectorization and SARGs without performance-destroying shims.

I'll create a JIRA to discuss the idea.

Thanks!
Owen
