Re: [VOTE] Move Chukwa to incubator

2010-06-22 Thread Bernd Fondermann
On Mon, Jun 21, 2010 at 19:29, Eric Yang ey...@yahoo-inc.com wrote:
 Please vote as to whether you think Chukwa should move to Apache incubator.

 The proposal is posted at:

 http://wiki.apache.org/incubator/ChukwaProposal

It's best practice to post the full proposal to the list, to have a
snapshot archived.

Chukwa Proposal

Abstract

Chukwa is a log collection and analysis framework based on Hadoop Map/Reduce.

Proposal

Chukwa will develop an open source data collection system for
monitoring large distributed systems. Chukwa is built on top of the
Hadoop Distributed File System (HDFS) and Map/Reduce framework and
inherits Hadoop’s scalability and robustness. Chukwa also includes a
flexible and powerful toolkit for displaying, monitoring and analyzing
results to make the best use of the collected data.

Background

Apache Hadoop lacks a good procedure to monitor and troubleshoot
large distributed systems. Chukwa was initially developed in 2008 at
Yahoo Inc. in Sunnyvale, headed by Mac Yang. Chukwa was designed as a
reference implementation for monitoring large distributed systems on
top of Hadoop. Since 2009, major parts of the development have come
from Internet community contributions. Chukwa is currently a Hadoop
subproject.

Rationale

The maintainers and developers of Chukwa are interested in becoming an
Apache Software Foundation top level project for several reasons:

* Apache provides a great community and environment for open source
software development.
* It might open the door for sharing ideas or cooperation with
other Apache projects, such as Avro and Hadoop.
* Chukwa would like to benefit from Apache's infrastructure.

Initial Goals

Though the bulk of Chukwa's initial development is complete and the
framework is stable, there are still some large areas for
future development. Some areas we hope to focus on in Apache:

* Improve Chukwa Demux map/reduce Job
* Refine automated log analysis algorithms
* Remove dependency on relational database for reporting

Current Status

Meritocracy

The initial developers are very familiar with meritocratic open source
development, both at Apache and elsewhere. Apache was chosen
specifically because the initial developers want to encourage this
style of development for the project.

Community

Chukwa is used in many organizations that are interested in the
advancement of Chukwa development. Many of these have at least one
developer who has joined the Chukwa mailing list, so the mailing list
is the most important communication platform. The Chukwa community
encourages suggestions and contributions from any potential user or
developer.

Core Developers

The initial set of Chukwa committers includes folks from the Hadoop
communities. We have varying degrees of experience with Apache-style
open source development.

Alignment

Chukwa is a framework for Apache Hadoop. This is why Apache Hadoop is
the most important dependency for Chukwa. And Chukwa is also a
particularly good fit for Apache due to integration potential with
other projects, specifically Avro and Log4j.

Known Risks

Orphaned products

Most of the active developers would like to become Chukwa Committers
or PMC Members and have a long-term interest in developing,
maintaining, and using the code.

Inexperience with Open Source

Chukwa was started as an open source contrib project in Hadoop in
2008. Many of the committers have experience working on open source
projects, and there is also at least one developer who has
experience as a committer on other Apache projects.

Homogenous Developers

As mentioned above, the current list of committers includes developers
from at least two different companies plus many independent
volunteers.

Reliance on Salaried Developers

At this time, much of the code comes from different companies like RAD
Lab. Because RAD Lab is a research facility, much of the work is done
by students working on their diploma theses.

Relationships with Other Apache Products

At this time, the only dependency on other Apache projects is Apache
Hadoop. When the dependency on a relational database is removed, Avro
will become the standard serialization framework for Chukwa.

An Excessive Fascination with the Apache Brand

The Chukwa project exists quite successfully on its own and could
continue on that path with no problems at all. We expect that the
Apache top level project brand could help increase the visibility of
the project, so that more developers might become interested in it.

Documentation

* The existing project page can be found here:
http://hadoop.apache.org/chukwa
* The Chukwa Architecture:
http://hadoop.apache.org/chukwa/docs/current/design.html
* The Chukwa mailing list with archive:
http://hadoop.apache.org/chukwa/mailing_lists.html

Initial Source

Source and Intellectual Property Submission Plan

The complete Chukwa code is under the Apache License, Version 2.0. The
complete codebase is already hosted in the ASF repository.

External 

Re: [VOTE] Move Chukwa to incubator

2010-06-22 Thread Bernd Fondermann
On Mon, Jun 21, 2010 at 23:37, William A. Rowe Jr. wr...@rowe-clan.net wrote:
 On 6/21/2010 12:29 PM, Eric Yang wrote:
 Please vote as to whether you think Chukwa should move to Apache incubator.

 The proposal is posted at:

 http://wiki.apache.org/incubator/ChukwaProposal

 +1

+1

Added myself as a mentor.

  Bernd

-
To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
For additional commands, e-mail: general-h...@incubator.apache.org



Re: [VOTE] Move Chukwa to incubator

2010-06-22 Thread William A. Rowe Jr.
On 6/22/2010 2:42 AM, ant elder wrote:
 On Mon, Jun 21, 2010 at 8:09 PM, William A. Rowe Jr.
 wr...@rowe-clan.net wrote:
 On 6/21/2010 1:31 PM, Owen O'Malley wrote:

 On Jun 21, 2010, at 11:06 AM, Mattmann, Chris A (388J) wrote:

 Chukwa has been around for a while now and from my (albeit limited)
 impression, pretty successful. What's the rationale for going the
 Incubator route rather than putting up a Board TLP resolution? Just
 wanted to check, thanks guys!

 The problem is that none of the Chukwa PMC members have been on any
 Apache PMCs before. My belief is that having training wheels for a bit
 would be a good thing.

 And the podling's committee itself seeks the extra guidance as they become
 a self-managing committee, so the mentors all agreed with this proposal.
 If anything, it makes checking off the graduation matrix much simpler as
 they are already committers, we already have the IP vetting when the code
 came into Hadoop.  We should obviously re-review the grants and trademark
 assignments during incubation.

 
 I'm not totally convinced by that reasoning, wouldn't it be simpler to
 just go directly to TLP and have those listed here as mentors agree to
 help out by being on the initial PMC?
 
 If it does incubate what would be delaying its graduation? It's already
 got everything we list in the incubator docs - diverse committers,
 done several releases etc.
 
 The current proposal doesn't use the incubator naming for the mailing
 lists and svn location, from past discussions here it should really be
 using the incubator naming unless it's a very special case. Is this a
 special case?

If that is the desire of the incubator, to refer this project to a TLP,
I'm happy to serve on that pmc for the first 6 mos - 1 yr, whatever it
takes for the project to become comfortable with all of the aspects.

Note these committers were not hadoop pmc members, in fact it was this
disjoint arrangement that pointed out the need to fork this subproject
into its own entity, under the direction of its own community.




Re: [VOTE] Move Chukwa to incubator

2010-06-22 Thread Bernd Fondermann
On Tue, Jun 22, 2010 at 09:42, ant elder ant.el...@gmail.com wrote:
 On Mon, Jun 21, 2010 at 8:09 PM, William A. Rowe Jr.
 wr...@rowe-clan.net wrote:
 On 6/21/2010 1:31 PM, Owen O'Malley wrote:

 On Jun 21, 2010, at 11:06 AM, Mattmann, Chris A (388J) wrote:

 Chukwa has been around for a while now and from my (albeit limited)
 impression, pretty successful. What's the rationale for going the
 Incubator route rather than putting up a Board TLP resolution? Just
 wanted to check, thanks guys!

 The problem is that none of the Chukwa PMC members have been on any
 Apache PMCs before. My belief is that having training wheels for a bit
 would be a good thing.

 And the podling's committee itself seeks the extra guidance as they become
 a self-managing committee, so the mentors all agreed with this proposal.
 If anything, it makes checking off the graduation matrix much simpler as
 they are already committers, we already have the IP vetting when the code
 came into Hadoop.  We should obviously re-review the grants and trademark
 assignments during incubation.


 I'm not totally convinced by that reasoning, wouldn't it be simpler to
 just go directly to TLP and have those listed here as mentors agree to
 help out by being on the initial PMC?

Maybe simpler, but better?
I've only been involved in this process since yesterday or so, but I
trust those who set out going the Incubator road.
And the project doesn't lose anything by following it, even if it's a detour.
The Incubator has more eyeballs than any other PMC and has tools to
prepare projects to go TLP.
We have far more projects eager to go TLP ASAP without properly
preparing their PMCness than those openly saying we want to learn how
to do it right first.

 If it does incubate what would be delaying its graduation? It's already
 got everything we list in the incubator docs - diverse committers,
 done several releases etc.

IIUC, the only issue right now is that the committers are hesitant to
go TLP because they've never been on a PMC before.

 The current proposal doesn't use the incubator naming for the mailing
 lists and svn location, from past discussions here it should really be
 using the incubator naming unless it's a very special case. Is this a
 special case?

Good catch. I think the Incubator nomenclature should apply to Chukwa as well.

  Bernd




Re: [VOTE] Move Chukwa to incubator

2010-06-22 Thread Eric Yang
Besides the DOAP file and the incubator nomenclature, I may need help
identifying the additional responsibilities of an Apache PMC.  One problem:
the Chukwa community did not hold a vote for PMC Chair because we are not
sure what the right process is.  Meanwhile, I have been writing quarterly
reports like any other Apache project; only the recipient of the report is
different.

Chukwa releases have been voted on by the Chukwa community, similar to
Hadoop releases, and incremental changes have been managed through patches
and committers.  Code audits have been performed by the committers to ensure
we don't bring license-incompatible libraries into Chukwa.

Owen O'Malley trained us in these procedures roughly two years ago, and we
have been following the same process ever since.

Chukwa has an existing user base of 35 people.  It would be nice to
make Chukwa a special case and skip the incubator nomenclature.  This would
ease the migration path for the existing Chukwa community.

Regards,
Eric


On 6/22/10 7:11 AM, Greg Reddin gred...@gmail.com wrote:

 On Tue, Jun 22, 2010 at 3:40 AM, Bernd Fondermann
 bernd.fonderm...@googlemail.com wrote:
 IIUC, the only issue right now is that the committers are hesitant to
 go TLP because they've never been on a PMC before.
 
 The current proposal doesn't use the incubator naming for the mailing
 lists and svn location, from past discussions here it should really be
 using the incubator naming unless it's a very special case. Is this a
 special case?
 
 Good catch. I think the Incubator nomenclature should apply to Chukwa as
 well.
 
 It seems to me that it would save everyone some work if they went
 straight with the TLP nomenclature. If they only need a short time in
 the Incubator to learn how to be a PMC, then maybe the Incubator
 nomenclature is not necessary and just creates more work for infra,
 PMC, and users when they graduate.
 
 Greg
 
 

