+1 for making it a subproject with a separate (preferably shorter) release 
cycle. The module by itself is too small for a separate project. A faster 
release cycle would also resolve the circular dependency and help other 
projects make use of vectorization, sarg, bloom filters, etc.

For version management, how about adding another component after the patch 
version, i.e., a sub-project version? 
Example: 2.2.0.[0] would be the storage-api release version, while Hive would 
always depend on 2.2.0-SNAPSHOT. I think Maven will let us release modules 
with different versions: 
https://dev.c-ware.de/confluence/display/PUBLIC/Releasing+modules+of+a+multi-module+project+with+independent+version+numbers
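
To make the proposed numbering concrete, here is a rough sketch of how a 
four-part version such as 2.2.0.0 could be split into the Hive base version 
and the sub-project release number. The class name StorageApiVersion and its 
methods are hypothetical and do not exist in Hive or storage-api. It also 
assumes that the first three components track the Hive version, which is only 
one reading of the scheme above.

```java
import java.util.Arrays;

/** Hypothetical helper for the proposed scheme; not part of Hive or storage-api. */
final class StorageApiVersion {
    private static final String SNAPSHOT_SUFFIX = "-SNAPSHOT";

    private final String hiveBase; // first three components, e.g. "2.2.0"
    private final int subRelease;  // fourth component, e.g. 0

    private StorageApiVersion(String hiveBase, int subRelease) {
        this.hiveBase = hiveBase;
        this.subRelease = subRelease;
    }

    /** Parses "major.minor.patch.sub"; rejects any other shape. */
    static StorageApiVersion parse(String version) {
        String[] parts = version.split("\\.");
        if (parts.length != 4) {
            throw new IllegalArgumentException(
                "Expected major.minor.patch.sub but got: " + version);
        }
        for (String part : parts) {
            Integer.parseInt(part); // throws NumberFormatException if not numeric
        }
        String base = String.join(".", Arrays.copyOf(parts, 3));
        return new StorageApiVersion(base, Integer.parseInt(parts[3]));
    }

    String hiveBase() {
        return hiveBase;
    }

    int subRelease() {
        return subRelease;
    }

    /** Assumption: a release lines up with a Hive version sharing its first three components. */
    boolean matchesHive(String hiveVersion) {
        String stripped = hiveVersion.endsWith(SNAPSHOT_SUFFIX)
            ? hiveVersion.substring(0, hiveVersion.length() - SNAPSHOT_SUFFIX.length())
            : hiveVersion;
        return hiveBase.equals(stripped);
    }
}
```

With this, StorageApiVersion.parse("2.2.0.0") would report a Hive base of 
2.2.0 and sub-project release 0, and matchesHive("2.2.0-SNAPSHOT") would 
hold.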

Thanks
Prasanth 

> On Aug 17, 2016, at 10:46 AM, Alan Gates <alanfga...@gmail.com> wrote:
> 
> +1 for making the API clean and easy for other projects to work with.  A few 
> questions:
> 
> 1) Would this also make it easier for Parquet and others to implement Hive’s 
> ACID interfaces?
> 
> 2) Would we make any attempt to coordinate version numbers between Hive and 
> the storage module, or would a given version of Hive just depend on a given 
> version of the storage module?
> 
> Alan.
> 
>> On Aug 15, 2016, at 17:01, Owen O'Malley <omal...@apache.org> wrote:
>> 
>> All,
>> 
>> As part of moving ORC out of Hive, we pulled all of the vectorization
>> storage and sarg classes into a separate module, which is named
>> storage-api.  Although it is currently only used by ORC, it could be used
>> by Parquet or Avro if they wanted to make a fast vectorized reader that
>> read directly into Hive's VectorizedRowBatch without needing a shim or
>> data copy. Note that this is in many ways similar to pulling the Arrow
>> project out of Drill.
>> 
>> This unfortunately still leaves us with a circular dependency between Hive
>> and ORC. I'd hoped that storage-api wouldn't change that much, but that
>> doesn't seem to be happening. As a result, ORC ends up shipping its own
>> fork of storage-api.
>> 
>> Although we could make a new project for just the storage-api, I think it
>> would be better to make it a subproject of Hive that is released
>> independently.
>> 
>> What do others think?
>> 
>>  Owen
> 
> 
