Re: [VOTE] Move Chukwa to incubator
On Mon, Jun 21, 2010 at 19:29, Eric Yang ey...@yahoo-inc.com wrote:
> Please vote as to whether you think Chukwa should move to Apache incubator.
> The proposal is posted at: http://wiki.apache.org/incubator/ChukwaProposal

It's best practice to post the full proposal to the list, to have a snapshot archived.

Chukwa Proposal

Abstract

Chukwa is a log collection and analysis framework based on Hadoop Map/Reduce.

Proposal

Chukwa will develop an open source data collection system for monitoring large distributed systems. Chukwa is built on top of the Hadoop Distributed File System (HDFS) and Map/Reduce framework and inherits Hadoop’s scalability and robustness. Chukwa also includes a flexible and powerful toolkit for displaying, monitoring and analyzing results to make the best use of the collected data.

Background

Apache Hadoop lacks a good procedure to monitor and troubleshoot large distributed systems. Chukwa was initially developed in 2008 at Yahoo Inc. in Sunnyvale, by a team headed by Mac Yang. Chukwa was designed as a reference implementation for monitoring large distributed systems on top of Hadoop. Since 2009, major parts of the development have come from Internet community contributions. Chukwa is currently a Hadoop subproject.

Rationale

The maintainers and developers of Chukwa are interested in joining the Apache Software Foundation as a top level project for several reasons:

* Apache provides a great community and environment for open source software development.
* It might open the door for sharing ideas or cooperation with other Apache projects, such as Avro and Hadoop.
* Chukwa would like to benefit from Apache's infrastructure.

Initial Goals

Though the bulk of Chukwa's initial development is complete and the framework is running stably, there are still some large areas for future development.
Some areas we hope to focus on in Apache:

* Improve Chukwa Demux map/reduce Job
* Refine automated log analysis algorithms
* Remove dependency on relational database for reporting

Current Status

Meritocracy

The initial developers are very familiar with meritocratic open source development, both at Apache and elsewhere. Apache was chosen specifically because the initial developers want to encourage this style of development for the project.

Community

Chukwa is used in many organizations that are interested in the advancement of Chukwa development. Many of these have at least one developer who has joined the Chukwa mailing list, and so the mailing list is the most important communication platform. The Chukwa community encourages suggestions and contributions from any potential user and developer.

Core Developers

The initial set of Chukwa committers includes folks from the Hadoop communities. We have varying degrees of experience with Apache-style open source development.

Alignment

Chukwa is a framework for Apache Hadoop. This is why Apache Hadoop is the most important dependency for Chukwa. Chukwa is also a particularly good fit for Apache due to integration potential with other projects, specifically Avro and Log4j.

Known Risks

Orphaned products

Most of the active developers would like to become Chukwa Committers or PMC Members and have a long-term interest in developing, maintaining, and using the code.

Inexperience with Open Source

Chukwa was started as an open source contrib project to Hadoop in 2008. Many of the committers have experience working on open source projects, and there is also at least one developer who has experience as a committer on other Apache projects.

Homogenous Developers

As mentioned above, the current list of committers includes developers from at least two different companies plus many independent volunteers.

Reliance on Salaried Developers

At this time, much of the code comes from different companies like RAD Lab.
Because RAD Lab is a research facility, much of the work is done by students working on their diploma theses.

Relationships with Other Apache Products

At this time, the only dependency on other Apache projects is Apache Hadoop. When the dependency on a relational database is removed, Avro will become the standard serialization framework for Chukwa.

An Excessive Fascination with the Apache Brand

The Chukwa project exists quite successfully on its own and could continue on that path with no problems at all. We expect the Apache top level project brand could help to increase the visibility of the project, so that more developers may become interested in it.

Documentation

* The existing project page can be found here: http://hadoop.apache.org/chukwa
* The Chukwa Architecture: http://hadoop.apache.org/chukwa/docs/current/design.html
* The Chukwa mailing list with archive: http://hadoop.apache.org/chukwa/mailing_lists.html

Initial Source

Source and Intellectual Property Submission Plan

The complete Chukwa code is under Apache Software License 2. The complete codebase is already hosted in the ASF repository.

External
Re: [VOTE] Move Chukwa to incubator
On Mon, Jun 21, 2010 at 23:37, William A. Rowe Jr. wr...@rowe-clan.net wrote:
> On 6/21/2010 12:29 PM, Eric Yang wrote:
>> Please vote as to whether you think Chukwa should move to Apache incubator.
>> The proposal is posted at: http://wiki.apache.org/incubator/ChukwaProposal
>
> +1

+1

Added myself as a mentor.

Bernd
Re: [VOTE] Move Chukwa to incubator
On 6/22/2010 2:42 AM, ant elder wrote:
> On Mon, Jun 21, 2010 at 8:09 PM, William A. Rowe Jr. wr...@rowe-clan.net wrote:
>> On 6/21/2010 1:31 PM, Owen O'Malley wrote:
>>> On Jun 21, 2010, at 11:06 AM, Mattmann, Chris A (388J) wrote:
>>>> Chukwa has been around for a while now and from my (albeit limited) impression, pretty successful. What's the rationale for going the Incubator route rather than putting up a Board TLP resolution? Just wanted to check, thanks guys!
>>>
>>> The problem is that none of the Chukwa PMC members have been on any Apache PMCs before. My belief is that having training wheels for a bit would be a good thing.
>>
>> And the podling's committee itself seeks the extra guidance as they become a self-managing committee, so the mentors all agreed with this proposal. If anything, it makes checking off the graduation matrix much simpler, as they are already committers and we already have the IP vetting from when the code came into Hadoop. We should obviously re-review the grants and trademark assignments during incubation.
>
> I'm not totally convinced by that reasoning. Wouldn't it be simpler to just go directly to TLP and have those listed here as mentors agree to help out by being on the initial PMC?
>
> If it does incubate, what would be delaying its graduation? It's already got everything we list in the incubator docs: diverse committers, several releases done, etc.
>
> The current proposal doesn't use the incubator naming for the mailing lists and svn location. From past discussions here, it should really be using the incubator naming unless it's a very special case. Is this a special case?

If that is the desire of the incubator, to refer this project to a TLP, I'm happy to serve on that pmc for the first 6 mos - 1 yr, whatever it takes for the project to become comfortable with all of the aspects.

Note these committers were not hadoop pmc members; in fact it was this disjoint arrangement that pointed out the need to fork this subproject into its own entity, under the direction of its own community.
Re: [VOTE] Move Chukwa to incubator
On Tue, Jun 22, 2010 at 09:42, ant elder ant.el...@gmail.com wrote:
> On Mon, Jun 21, 2010 at 8:09 PM, William A. Rowe Jr. wr...@rowe-clan.net wrote:
>> On 6/21/2010 1:31 PM, Owen O'Malley wrote:
>>> On Jun 21, 2010, at 11:06 AM, Mattmann, Chris A (388J) wrote:
>>>> Chukwa has been around for a while now and from my (albeit limited) impression, pretty successful. What's the rationale for going the Incubator route rather than putting up a Board TLP resolution? Just wanted to check, thanks guys!
>>>
>>> The problem is that none of the Chukwa PMC members have been on any Apache PMCs before. My belief is that having training wheels for a bit would be a good thing.
>>
>> And the podling's committee itself seeks the extra guidance as they become a self-managing committee, so the mentors all agreed with this proposal. If anything, it makes checking off the graduation matrix much simpler, as they are already committers and we already have the IP vetting from when the code came into Hadoop. We should obviously re-review the grants and trademark assignments during incubation.
>
> I'm not totally convinced by that reasoning. Wouldn't it be simpler to just go directly to TLP and have those listed here as mentors agree to help out by being on the initial PMC?

Maybe simpler, but better? I've only been involved in this process since yesterday or so, but I trust those who set out going the Incubator road. And the project doesn't lose anything by following it, even if it's a detour. The Incubator has more eyeballs than any other PMC and has tools to prepare projects to go TLP. We have far more projects eager to go TLP ASAP without properly preparing their PMCness than those openly saying "we want to learn how to do it right first".

> If it does incubate, what would be delaying its graduation? It's already got everything we list in the incubator docs: diverse committers, several releases done, etc.

IIUC, the only issue right now is that the committers are hesitant to go TLP because they've never been on a PMC before.

> The current proposal doesn't use the incubator naming for the mailing lists and svn location. From past discussions here, it should really be using the incubator naming unless it's a very special case. Is this a special case?

Good catch. I think the Incubator nomenclature should apply to Chukwa as well.

Bernd
Re: [VOTE] Move Chukwa to incubator
Besides the DOAP file and the incubator nomenclature, I may need help identifying the additional responsibilities of an Apache PMC. One problem: the Chukwa community did not hold a vote for PMC Chair because we are not sure what the right process for this is. Meanwhile, I have been writing quarterly reports like any other Apache project; only the recipient of the report is different. Chukwa releases have been voted on by the Chukwa community, similar to Hadoop releases, and incremental changes have been managed using patches and committers. Code audits have been performed by the committers to ensure we don't bring license-incompatible libraries into Chukwa. Owen O'Malley trained us in these procedures roughly two years ago, and we have been executing the same process ever since.

Chukwa has an existing user base of 35 people. It would be nice to make Chukwa a special case and skip the incubator nomenclature. This would ease the migration path for the existing Chukwa community.

Regards,
Eric

On 6/22/10 7:11 AM, Greg Reddin gred...@gmail.com wrote:
> On Tue, Jun 22, 2010 at 3:40 AM, Bernd Fondermann bernd.fonderm...@googlemail.com wrote:
>> IIUC, the only issue right now is that the committers are hesitant to go TLP because they've never been on a PMC before.
>>
>>> The current proposal doesn't use the incubator naming for the mailing lists and svn location. From past discussions here, it should really be using the incubator naming unless it's a very special case. Is this a special case?
>>
>> Good catch. I think the Incubator nomenclature should apply to Chukwa as well.
>
> It seems to me that it would save everyone some work if they went straight with the TLP nomenclature. If they only need a short time in the Incubator to learn how to be a PMC, then maybe the Incubator nomenclature is not necessary and just creates more work for infra, PMC, and users when they graduate.
>
> Greg