All,

Moving this forward, I'll submit a resolution to the Apache board for the
next meeting.

One of the concerns that has been mentioned is how to deal with the
vectorization and SARG APIs. I'd like to propose that we pull the minimal
set of classes into a new Hive module named "storage-api". This module will
include VectorizedRowBatch, the various ColumnVector classes, and the SARG
classes. It will form the start of an API that high-performance storage
formats can use to integrate with Hive. Both ORC and Parquet can use the
new API to support vectorization and SARGs without performance-destroying
shims. I'll create a JIRA to discuss the idea.

Thanks!
   Owen
