If I understood Alan's point #2 correctly, we are moving the existing ORC
code out of Hive and making it a separate project, which I definitely missed.
Since the existing Hive PMC has governance over the code, I would expect that
to still be the case even after the spinoff. Obviously the proposal doesn't
reflect this.

Thanks,
Xuefu

On Fri, Apr 3, 2015 at 12:51 PM, Alan Gates <alanfga...@gmail.com> wrote:

> A couple of points:
>
> 1) ORC isn't going into the incubator.  The proposal before the board is
> for it to go straight to TLP.  There's no graduation to depend on.
> 2) As currently proposed Hive would not depend on ORC to build.  Hive
> users who wished to use ORC would obviously need to pull in ORC artifacts
> in addition to Hive.  Given this I don't think it makes any sense to fork
> ORC and have it in both places.  This actually seems the worse outcome, as
> the two will inevitably diverge.
>
> Alan.
>
>   Xuefu Zhang <xzh...@cloudera.com>
>  April 3, 2015 at 6:41
> I actually have a different thought to share along the same line.
>
> ORC is not a subproject in Hive. I'm not sure that performing surgery on
> Hive in order to make ORC a TLP is the best we can do. Not only may this
> bring instability to Hive, but it also makes Hive depend on an incubating
> project. Not every project graduates; some of them fail (though I do wish
> ORC success as a TLP).
>
> Instead, I like the idea of forking Hive's ORC as a TLP while Hive keeps
> whatever it has. This way, the new project can do whatever it wants, and the
> Hive community probably doesn't care and has no say in it. Once ORC as a TLP
> graduates, the Hive community can decide whether to go along with it and, if
> so, how to integrate with it.
>
> I think this will let the current controversy subside, help ORC proceed
> faster as a TLP, and leave the decision to the near future.
>
> Thanks,
> Xuefu
>
>
>   Szehon Ho <sze...@cloudera.com>
>  April 2, 2015 at 23:54
> I also agree with this goal.
>
> As such, I think we should first see the proposal (JIRA?) for the
> storage-api refactoring and other work related to ORC separating as a TLP
> before the actual separation happens, to make sure the separation is not
> done in a way that takes us further from this goal. It may very well be that
> this refactoring moves us closer to the goal, but seeing the proposal first
> would give a lot of clarity.
>
> Thanks
> Szehon
>
>
>   Edward Capriolo <edlinuxg...@gmail.com>
>  April 2, 2015 at 22:20
> To reiterate, one thing I want to avoid is having Hive rely on code that
> sits in several tiny silos across Apache projects, or Apache Licensed but
> not ASF projects. Hive is a mature TLP with a large number of committers
> and it would not be a good situation if work often gets bottlenecked
> because changes had to be made across two projects simultaneously to commit
> a feature. Especially if the two projects do not share the same committer
> list.
>
> I think, if it could be done perfectly, things like ORC, Parquet, whatever
> would be <provided> scope dependencies, meaning the project can be built
> without a particular piece but as a whole the project still works. (That
> might be easier said than done :)
>
>
>   Nick Dimiduk <ndimi...@gmail.com>
>  April 1, 2015 at 11:51
> I think the storage-api would be very helpful for HBase integration as
> well.
>
>
>   Owen O'Malley <omal...@apache.org>
>  April 1, 2015 at 11:22
>
>
>
>>
>> What I'd like to see here is well defined interfaces in Hive so that any
>> storage format that wants can implement them.  Hopefully that means things
>> like interfaces and utility classes for acid, sargs, and vectorization move
>> into this new Hive module storage-api.  Then Orc, Parquet, etc. can depend
>> on this module without needing to pull in all of Hive.
>>
>> Then Hive contributors would only be forced to make changes in Orc when
>> they want to implement something in Orc.
>>
>
> Agreed. The goal of the new module is to keep a clean separation between
> the code for ORC and Hive so that vectorization, sargs, and acid are kept in
> Hive and are not moved to or duplicated in the ORC project.
>
> .. Owen
>
>
