+1 for Howl as an incubator project. 

-----Original Message-----
From: Alan Gates [mailto:ga...@yahoo-inc.com] 
Sent: Wednesday, February 02, 2011 9:17 PM
To: user@pig.apache.org
Cc: u...@hive.apache.org
Subject: Re: [VOTE] Sponsoring Howl as an Apache Incubator project

Edward,

I understand your concern with having a copy of the metastore code in Howl.  
However, let's separate code from governance.  The reason Howl has a copy of 
Hive's metastore is not because we're proposing it for the Incubator; it is 
because in the course of developing it over the last six months we've found 
that Howl development needs to move much faster than Hive development can.  
This is appropriate, since Hive is a mature product and has at least one large 
customer that runs code in production very soon after it is checked in.  Thus 
the Hive community is rightly cautious about checking in changes to the 
metastore.  Howl, on the other hand, is new and innovating quickly, so it likes 
to get things checked in quickly.  Over the last six months every patch Howl  
has made to the Hive metastore code has made it back into Hive code.   
But it generally takes a few weeks or more to get in.

Whether Howl is a Hive subproject or an Incubator project it faces the same 
dilemma. The only other alternative that was suggested was to have Howl extern 
the metastore code from Hive and keep its patches in its build and apply them 
at build time.  But this is very fragile, since any changes in the Hive 
metastore code could invalidate all those patches.  We know that this is not 
sustainable in the long run, which is why the proposal calls out the need to 
resolve this one way or another as the project matures.

As far as reaching an end state where Hive and Howl are not compatible, we 
would view that as a failure for Howl.  The goal for Howl is to be a metastore 
for Pig, MapReduce, and Hive, not just 2 out of 3.  So we have a strong motivation 
to maintain that compatibility.

In terms of governance, given that we have significant contributions coming 
from members of the Pig team, the Hive team, and the core Hadoop team it seemed 
that giving Howl its own space in the Incubator made more sense than adding it 
as a subproject of any one of those teams.

Alan.

On Feb 2, 2011, at 3:11 PM, Edward Capriolo wrote:

> On Wed, Feb 2, 2011 at 5:08 PM, Jeff Hammerbacher 
> <ham...@cloudera.com> wrote:
>> Awesome! Huge +1.
>>
>> On Wed, Feb 2, 2011 at 1:18 PM, Alan Gates <ga...@yahoo-inc.com>
>> wrote:
>>
>>> Howl is a table management system built to provide metadata and 
>>> storage management across data processing tools in Hadoop (Pig, 
>>> Hive, MapReduce, ...).  You can learn more details at 
>>> http://wiki.apache.org/pig/Howl.  For the last six months the code 
>>> has been hosted at github.  The Howl team would like to move the 
>>> project into the Apache Incubator.  You can see the proposal for the 
>>> project at http://wiki.apache.org/incubator/HowlProposal
>>> .
>>>
>>> In order to be accepted as an Incubator project Howl needs a 
>>> Sponsoring project.  I propose that we, the Pig project, sponsor 
>>> Howl.  By sponsoring Howl we are saying that we believe it is a good 
>>> fit for the ASF and that we will assist the Howl project to succeed.  
>>> You can read full details of sponsoring a project at 
>>> http://incubator.apache.org/incubation/Roles_and_Responsibilities.html#Sponsor
>>> .
>>>
>>> Our bylaws don't explicitly cover such a vote, but I think lazy 
>>> majority should be reasonable.  All votes are welcome, PMC member 
>>> votes will be binding.
>>>
>>> Clearly I'm +1.
>>>
>>> Alan.
>>>
>>
>
> I do think it is a great idea that Hive, Pig, and MapReduce share a 
> metastore. However, I am not sure I agree with the approach. IMHO Howl 
> should be a Hive subproject.
>
> "The initial release of Howl will allow interoperability of data 
> between Pig, Map Reduce, and Hive"
> I believe that "the initial release of Howl should support Hive";
> at that point Hive should remove the /metastore code from inside Hive 
> and depend on Howl.
>
> I say this because Hive is very actively reworking the metastore right 
> now for security, a new type of views, and indexes. I feel that if the 
> metastore branches from Hive as Howl, getting the two entities back 
> together will be difficult. My fear is having 99% of the same code base 
> shared between Hive and Howl but no compatibility between the two.
