Fwd: [VOTE] Apache OODT 0.1-incubating release

2010-10-19 Thread David M Woollard
Apologies for cross-posting. 

Please see below for the Apache OODT 0.1-incubating vote currently being held 
on the oodt-dev list.  I'd like to call a VOTE.

[ ] +1 Release the packages as Apache OODT 0.1-incubating
[ ] -1 Do not release the packages because...

Thanks!

Dave Woollard



Begin forwarded message:

 From: David M Woollard wooll...@jpl.nasa.gov
 Date: October 19, 2010 5:07:13 PM PDT
 To: oodt-...@incubator.apache.org oodt-...@incubator.apache.org
 Subject: [VOTE] Apache OODT 0.1-incubating release
 Reply-To: oodt-...@incubator.apache.org oodt-...@incubator.apache.org
 
 Hi Folks,
 
 I am proud to announce the first candidate for the Apache OODT 0.1-incubating 
 release. The source code is at:
 
 http://people.apache.org/~woollard/apache-oodt-0.1-incubating/rc1/
 
 For details on release contents and the latest changes, see the included
 CHANGES.txt file. The release was made using the OODT release process,
 documented on the wiki here:
 
 http://s.apache.org/05
 
 The release was made from the OODT 0.1-incubating branch (r1024310) at:
 
 https://svn.apache.org/repos/asf/incubator/oodt/branches/0.1-incubating
 
 Please vote on releasing these packages as Apache OODT 0.1-incubating. The
 vote is open for the next 72 hours.
 
 Only votes from Incubator PMC members are binding, but folks are welcome to
 check the release candidate and voice their approval or disapproval. The vote
 passes if at least three binding +1 votes are cast.
 
 [ ] +1 Release the packages as Apache OODT 0.1-incubating
 
 [ ] -1 Do not release the packages because...
 
 Thanks!
 
 Dave Woollard
 
 P.S. Here is my +1.
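
The passing rule stated in the message above (at least three binding +1 votes, with only Incubator PMC votes counting as binding) can be sketched as a small tally. This is only an illustration under stated assumptions: the `Vote` record, the `vote_passes` function, and the voter names are hypothetical and are not part of any Apache tooling, and the sketch encodes only the rule as worded in the email.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vote:
    """One ballot on the thread (hypothetical record, for illustration only)."""
    voter: str
    value: int     # +1, 0 or -1
    binding: bool  # True only for Incubator PMC members


def vote_passes(votes, required_binding=3):
    """Return True if at least `required_binding` binding +1 votes were cast."""
    binding_plus_ones = sum(1 for v in votes if v.binding and v.value == 1)
    return binding_plus_ones >= required_binding


# Hypothetical ballots: two binding +1 votes and one non-binding +1.
votes = [
    Vote("ipmc-member-a", 1, True),
    Vote("ipmc-member-b", 1, True),
    Vote("contributor-c", 1, False),
]
print(vote_passes(votes))  # False: only two binding +1 votes so far
```

Non-binding votes are kept in the list but ignored by the count, which mirrors the email's point that anyone may voice approval or disapproval without affecting the threshold.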
 



Re: [VOTE] Accept Gora into the Apache Incubator

2010-09-19 Thread David M Woollard
+1 (non-binding)
-
David M. Woollard, Software Engineer
Data Management Systems and Technologies Group (388J)
NASA Jet Propulsion Laboratory, Pasadena, CA, 91109, USA
Office: 171-243D  Phone: (818) 354-4291

Anybody who wants to make a revolution shouldn't grab a gun.
Just go and start working to change the world by using science
and technology. -Stanford Ovshinsky





On Sep 19, 2010, at 8:21 PM, Mattmann, Chris A (388J) wrote:

 Hi Folks,
 
 Over the past week or so we've been discussing the Gora project and bringing
 it into the Apache Incubator [1]. It's time to call a VOTE thread on the
 issue. Please VOTE below:
 
 [ ] +1 Accept Gora into the Apache Incubator.
 [ ] +0 Don't care.
 [ ] -1 Don't accept Gora into the Apache Incubator because...
 
 I'll leave the VOTE open for the remainder of the week (ending on 9/24).
 Here's my +1 (IPMC binding).
 
 [1] http://s.apache.org/MPw
 
 Cheers,
 Chris
 
 P.S. The wiki text for the proposal is pasted below.
 
 --
 Gora Proposal for Apache Incubation
 
 Abstract
 Gora is an ORM framework for column stores such as Apache HBase and Apache
 Cassandra with a specific focus on Hadoop.
 
 Proposal
 Although there are various excellent ORM frameworks for relational
 databases, data modeling in NoSQL data stores differs profoundly from that in
 their relational cousins. Moreover, data-model-agnostic frameworks such as
 JDO are not sufficient for use cases where one needs the full power of the
 data models in column stores. Gora fills this gap by giving the user an
 easy-to-use ORM framework with data-store-specific mappings and built-in
 Apache Hadoop support.
 
 The overall goal for Gora is to become the standard data representation and
 persistence framework for big data. The roadmap of Gora can be grouped as
 follows.
 * Data Persistence : Persisting objects to column stores such as HBase,
 Cassandra, Hypertable; key-value stores such as Voldemort, Redis, etc.; SQL
 databases such as MySQL, HSQLDB; flat files in the local file system or
 Hadoop HDFS.
 * Data Access : An easy-to-use, Java-friendly common API for accessing the
 data regardless of its location.
 * Indexing : Persisting objects to Lucene and Solr indexes,
 accessing/querying the data with the Gora API.
 * Analysis : Accessing and analyzing the data through adapters for
 Apache Pig, Apache Hive and Cascading.
 * MapReduce support : Out-of-the-box and extensive MapReduce (Apache
 Hadoop) support for data in the data store.
 
 Background
 ORM stands for Object-Relational Mapping. It is a technology which abstracts
 the persistence layer (mostly relational databases) so that plain domain
 level objects can be used, without the cumbersome effort to save/load the
 data to and from the database. Gora differs from current solutions in that:
 * Gora is specifically focused on NoSQL data stores, but also has limited
 support for SQL databases.
 * The main use case for Gora is to access/analyze big data using Hadoop.
 * Gora uses Avro for bean definition, not byte code enhancement or
 annotations.
 * Object-to-data store mappings are backend specific, so that the full data
 model can be utilized.
 * Gora is simple since it ignores complex SQL mappings.
 * Gora will support persistence, indexing and analysis of data, using Pig,
 Lucene, Hive, etc.
 
 Rationale
 ORM frameworks are nothing new. But with the explosion of data generated in
 terabytes and even petabytes, NoSQL data stores are gaining ever-increasing
 popularity. Coupled with the limited support for the already-proven Apache
 Hadoop in current ORM frameworks, there was a need for a new project.
 
 Gora is currently hosted at GitHub. However, Gora has ties to the ASF in many
 ways. As detailed in the proposal section, Gora will be a high-level client
 for many Apache projects and subprojects including Hadoop (Common, HDFS, and
 MapReduce), HBase, Cassandra, Avro, Lucene, Solr, Pig, and Hive. Gora
 already uses Hadoop, HBase, Cassandra and Avro. Moreover, Gora started its
 life inside the Apache Nutch project, and now Nutch trunk uses Gora as a
 library. Even more, the initial set of committers are all ASF members.
 Therefore, we think that Apache will be an excellent home for Gora.
 
 Initial Goals
 Initial goals for Gora can be summarized as:
 * Iron out the remaining issues with HBase, Cassandra and SQL support.
 * Make the first release before the end of the year.
 * Improve documentation
 * Support for Cascading
 
 Current Status
 Meritocracy
 Current commit rights belong to the initial list of committers, four of whom
 are also ASF members. All the developers have extensive experience with
 Apache projects. We honor the meritocracy policy of the ASF.
 
 Community
 Gora’s community mostly overlaps with that of Nutch, Hadoop, HBase, Avro and
 Cassandra. We have a small community for now (5 initial committers, 18

Re: [PROPOSAL] Gora to enter Incubator

2010-09-14 Thread David M Woollard
+1 non-binding. 

As someone who has wrestled with this before, sounds like a worthwhile 
abstraction layer... I'm happy to help. 

-Dave



On Sep 13, 2010, at 6:10 AM, Enis Soztutar wrote:

 Hi all,
 
 We would like to announce the proposal for Gora, an ORM for column stores,
 for Apache incubation. We believe that Gora can find a nice home at
 Apache.
 
 The wiki page for the proposal can be found at
 http://wiki.apache.org/incubator/GoraProposal
 
 The proposal is as below.
 
 
 = Gora Proposal for Apache Incubation =
 
 == Abstract ==
 Gora is an ORM framework for column stores such as Apache HBase and Apache
 Cassandra with a specific focus on Hadoop.
 
 == Proposal ==
 Although there are various excellent ORM frameworks for relational
 databases, data modeling in NoSQL data stores differs profoundly from that in
 their relational cousins. Moreover, data-model-agnostic frameworks such as
 JDO are not sufficient for use cases where one needs the full power of the
 data models in column stores. Gora fills this gap by giving the user an
 easy-to-use ORM framework with data-store-specific mappings and built-in
 Apache Hadoop support.
 
 The overall goal for Gora is to become the standard data representation and
 persistence framework for big data. The roadmap of Gora can be grouped as
 follows.
 
 * Data Persistence : Persisting objects to column stores such as HBase,
 Cassandra, Hypertable; key-value stores such as Voldemort, Redis, etc.; SQL
 databases such as MySQL, HSQLDB; flat files in the local file system or
 Hadoop HDFS.
 * Data Access : An easy-to-use, Java-friendly common API for accessing the
 data regardless of its location.
 * Indexing : Persisting objects to Lucene and Solr indexes,
 accessing/querying the data with the Gora API.
 * Analysis : Accessing and analyzing the data through adapters for
 Apache Pig, Apache Hive and Cascading.
 * MapReduce support : Out-of-the-box and extensive MapReduce (Apache
 Hadoop) support for data in the data store.
 
 == Background ==
 ORM stands for Object-Relational Mapping. It is a technology which abstracts
 the persistence layer (mostly relational databases) so that plain domain
 level objects can be used, without the cumbersome effort to save/load the
 data to and from the database. Gora differs from current solutions in that:
 * Gora is specifically focused on NoSQL data stores, but also has limited
 support for SQL databases.
 * The main use case for Gora is to access/analyze big data using Hadoop.
 * Gora uses Avro for bean definition, not byte code enhancement or
 annotations.
 * Object-to-data store mappings are backend specific, so that the full data
 model can be utilized.
 * Gora is simple since it ignores complex SQL mappings.
 * Gora will support persistence, indexing and analysis of data, using Pig,
 Lucene, Hive, etc.
 
 == Rationale ==
 ORM frameworks are nothing new. But with the explosion of data generated in
 terabytes and even petabytes, NoSQL data stores are gaining ever-increasing
 popularity. Coupled with the limited support for the already-proven Apache
 Hadoop in current ORM frameworks, there was a need for a new project.
 
 Gora is currently hosted at GitHub. However, Gora has ties to the ASF in many
 ways. As detailed in the proposal section, Gora will be a high-level client
 for many Apache projects and subprojects including Hadoop (Common, HDFS, and
 MapReduce), HBase, Cassandra, Avro, Lucene, Solr, Pig, and Hive. Gora
 already uses Hadoop, HBase, Cassandra and Avro. Moreover, Gora started its
 life inside the Apache Nutch project, and now Nutch trunk uses Gora as a
 library. Even more, the initial set of committers are all ASF members.
 Therefore, we think that Apache will be an excellent home for Gora.
 
 == Initial Goals ==
 Initial goals for Gora can be summarized as:
 * Iron out the remaining issues with HBase, Cassandra and SQL support.
 * Make the first release before the end of the year.
 * Improve documentation
 * Support for Cascading
 
 == Current Status ==
 === Meritocracy ===
 Current commit rights belong to the initial list of committers, four of whom
 are also ASF members. All the developers have extensive experience with
 Apache projects. We honor the meritocracy policy of the ASF.
 
 === Community ===
 Gora’s community mostly overlaps with that of Nutch, Hadoop, HBase, Avro and
 Cassandra. We have a small community for now (5 initial committers, 18 people
 tracking the project at GitHub), but have been piggybacking on the Nutch
 community for a while. If Gora is accepted into the Apache Incubator, we
 expect more traction. Moreover, with the increasing popularity of NoSQL
 databases, we expect more users.
 
 === Core Developers ===
 Gora started from the initial code base written inside Apache Nutch by
 Doğacan Güney. Enis Söztutar then refactored and re-architected the project
 out of Nutch. Later, Julien Nioche, Andrzej Bialecki and Doğacan ported Nutch
 to use the newly formed project, and Sertan Alkan joined after that. Doğacan
 and Julien are 

Re: [DISCUSS] OODT Podling Incubator Experiment (was Re: Radical revamp (was: an experiment))

2010-08-17 Thread David M Woollard
Sorry if I'm late to the party, but my 2 cents...

The more I read about this, the more I latch onto Justin's Observers notion. 
As a non-Apache Member, non-IPMC, PPMC member for OODT, I feel like I am 
qualified to vote on a release in the sense that I am closer to the code than 
Justin (sorry to pick on you, but I think I'm just parroting what you have been 
saying), but I also would love to have more experienced hands looking at other 
aspects (most notably in my mind are the various legal aspects). 

In the end, I think that it takes both of these types of input to get what I'd 
call an informed vote. But all of this discussion in my mind hinges on the 
fundamental problems... good mentors and the notion of etiquette, both of which 
have previously been mentioned on all of these intertwined threads. 

Realistically, as long as an individual podling is open to the entire 
incubation community, you will find some rules hawk who really believes that by 
invoking article 237 of document XYZ they are helping to instruct in the Apache 
way. It's in cases where this happens that I would ask a mentor (someone I know 
who has even a slight investment in my podling's success), to sort the wheat 
from the chaff. Also MHO, but it strikes me that being part of the community, 
rather than in some sort of over-lord position, is more in line with the flat 
structure that is an important part of the Apache way.

 If we ran it with the intention that the PMC is there to solely
 provide non-technical oversight and that the PPMC does the actual
 work, I think that's something I could live with and address my
 concerns in the overall process.


+1. Being the kind of person who likes to trust people, I'm fine with an 
informal agreement. If you feel like you can contribute technically, then I 
would love to hear what you have to say, and if you just want to comment on 
process, I think that's A-OK too. IMO, as long as you have taken some step to 
be part of the specific podling, then you get to say anything you want (you are 
part of the community). 

Like Chris, I would be up for trying something with OODT. Any proposal that we 
can work, even if just by general agreement, where we can logically divide 
technical oversight from non-technical and also protect us from random 
drive-bys gets my vote.  

-Dave


On Aug 16, 2010, at 10:37 PM, Justin Erenkrantz wrote:

 On Mon, Aug 16, 2010 at 10:20 PM, Greg Stein gst...@gmail.com wrote:
 You know when to vote and *how* to vote. I see no reason to deny your vote.
 
 Of course.  It's always seemed awkward if you can't contribute
 technically to suddenly have a binding vote.  I'm sure if I *wanted*
 to learn how to build something with Maven, I could.  But, why?  =)
 
 So, it makes me leery of being forced to cast a vote on a release - on
 par with those who have actually tested it and know something about
 the codebase.  The standard that I force myself to adhere to on
 Subversion and httpd for example would be something that I'd fall
 short with in OODT.
 
 The (only) problem to arise would be if OODT was at the minimum of (3)
 ASF Members, and your vote was required. With Chris becoming a Member,
 OODT is at 5 Members that could comprise the mini/pseudo TLP that I
 propose. (maybe there are others interested, but I have zero insight
 into this community)
 
 Sure.  It's just that Chris and I have discussed the pain points in
 the Incubation process, so we're on the lookout for making it easier
 on us.  =)  Plus, the experience with Subversion also showed me where
 things break down too.
 
 I'm not sure that I'm reading the above properly, but... whatevs.
 Under my proposed TLP-based approach, the PMC would be comprised of:
 justin, jean, ross, ian, chris. The current committers (who are also
 on the PPMC, presumably) would be invited to the private@ list, but
 would not be on the PMC. Thus, they would have non-binding votes
 across all project decisions. But that should not be a problem as
 those PMC members also understand how to build and listen to
 consensus. If there are issues in the community, then the difference
 between binding and non-binding votes makes *zero* difference.
 
 If we ran it with the intention that the PMC is there to solely
 provide non-technical oversight and that the PPMC does the actual
 work, I think that's something I could live with and address my
 concerns in the overall process.
 
 I don't think this is at odds with what you are saying nor would it
 run afoul of any corporate structures.  It could just be the informal
 agreement between the PMC members that the PPMC should be the ones
 making the technical decisions.  (If some other set of mentors wanted
 to run it differently, they could.  But, this separation is one I
 could live with myself.)
 
 The (podling) project/PMC would report directly to the Board. No more
 peanut gallery, or a second-guessing group.
 
 Right.  The listed members of the PMC are on the line dealing with the
 Board.  (Hmm...would