RE: [PROPOSAL] MetaModel for the Apache Incubator

2013-06-07 Thread Kasper Sørensen
Okay, got it!

We looked at it and it looks fine. We would need to make explicit in exhibit A 
that it's "excluding xBaseJ code", since that will be removed from the project 
when it moves to Apache. Other than that we're happy to sign the SGA.

Kasper


From: Matt Franklin [m.ben.frank...@gmail.com]
Sent: 06 June 2013 17:05
To: general@incubator.apache.org
Subject: Re: [PROPOSAL] MetaModel for the Apache Incubator

On Thursday, June 6, 2013, Kasper Sørensen wrote:

> Human Inference confirms that it is entitled to transfer MetaModel to the
> Apache Software Foundation.
>
> One potentially stupid question though: What is an SGA? Our legal dept.
> was not aware at least :-)


Software Grant Agreement [1].  Essentially, the legal framework for
granting shared copyright to the ASF for the donated code.

[1]:http://www.apache.org/licenses/software-grant.txt



>
> -----Original Message-----
> From: Kasper Sørensen [mailto:kasper.soren...@humaninference.com]
> Sent: 30 May 2013 21:34
> To: general@incubator.apache.org
> Subject: RE: [PROPOSAL] MetaModel for the Apache Incubator
>
> We'll make sure the legal implications are covered. As I am not our
> legal representative, I would rather not make an official statement
> on the matter, but I suspect it will not be an issue: since the decision
> to donate the project to Apache has gone through our management team, these
> topics are covered from our legal side as well. I will of course push back
> into the organization to get a more formal statement for the incubation.
>
> Kasper
> 
> From: Matt Franklin [m.ben.frank...@gmail.com]
> Sent: 30 May 2013 16:33
> To: general@incubator.apache.org
> Subject: Re: [PROPOSAL] MetaModel for the Apache Incubator
>
> On Thu, May 30, 2013 at 1:44 AM, Henry Saputra wrote:
>
> > Hi Arvind,
> >
> > For concern 1 I will let the lead engineer, Kasper, answer. I believe
> > that if the contributors have already signed a copyright agreement with
> > Human Inference then it should be fine, since it means all code
> > contributions belong to Human Inference and will then be transferred to
> > the ASF.
> >
> > Someone might want to help clarify this if I am mistaken.
> >
>
> As long as Human Inference holds the copyright for ALL code, I am pretty
> sure this is fine.  A single SGA from Human Inference should suffice.
>
>
> >
> > As for concern 2, I have scanned the master pom.xml's
> > dependencyManagement section and it looks like all dependencies are
> > Apache 2.0 friendly. Again, Kasper could help verify whether this is
> > the case.
> >
> >
> > - Henry
> >
> >
> >
> > On Wed, May 29, 2013 at 10:31 PM, Arvind Prabhakar wrote:
> >
> > > Henry,
> > >
> > > Thank you for submitting this proposal. I am very glad to be a
> > > mentor for this project and look forward to working with you and the
> > > broader community. I have a couple of comments with regard to the
> > > stated proposal -
> > >
> > > First - as noted in the proposal, MetaModel has been an open source
> > > project with contributions coming from various corners of the world.
> > > Given this development model, do the individual contributors hold
> > > copyright over their contributed code? If so, you will likely need
> > > their consent in order to provide this code to the Incubator for the
> > > purposes of starting this project.
> > >
> > > Second - I noticed that the proposal calls out the LGPL dependency
> > > that will be removed before sourcing the initial drop. Along the
> > > same lines, I urge you to go through the legal FAQ [1] to make
> > > sure that there are no other dependencies that merit removal or
> > > special handling.
> > >
> > > [1] http://www.apache.org/legal/resolved.html
> > >
> > > Regards,
> > > Arvind Prabhakar
> > >
> > >
> > > On Tue, May 28, 2013 at 11:20 AM, Henry Saputra wrote:
> > >
> > > > Dear ASF members,
> > > >
> > > > We would like to propose MetaModel for the incubator.
> > > >
> > > > Matt Franklin will be the Champion for this project and the
> > > > proposal draft is available at:
> > > >
> > > > https://wiki.apache.org/incubator/MetaModelProposal
> > > >
> > > > Looking forward to all of your suggestions and feedback.
> > > >
> > > > Thanks,
> > > >
> > > > Henry Saputra
> > > >
>


--
Sent from a mobile device. Please excuse typos or brevity.

-
To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
For additional commands, e-mail: general-h...@incubator.apache.org



Re: [VOTE] Accept Apache MetaModel into the Apache incubator

2013-06-07 Thread Noah Slater
+1 (binding)


On 6 June 2013 23:30, Henry Saputra  wrote:

> Hi All,
>
> I'd like to call a VOTE for acceptance of MetaModel into the Apache
> incubator.
> The vote will close on June 12, 2013 at 6:00 PM (PST).
>
> [] +1 Accept MetaModel into the Apache incubator
> [] +0 Don't care.
> [] -1 Don't accept MetaModel into the incubator because...
>
> Full proposal is pasted at the bottom of this email, and the corresponding
> wiki is:
> http://wiki.apache.org/incubator/MetaModelProposal.
>
> Only VOTEs from Incubator PMC members are binding, but all are welcome to
> express their thoughts.
>
> Thanks,
>
> Henry Saputra
> Champion for Apache MetaModel
>
>
> P.S. Here's my +1 (binding)
>
>
> -
>
> = MetaModel – uniform data access across datastores =
>
> Proposal for Apache Incubator
>
> == Abstract ==
>
> MetaModel is a data access framework, providing a common interface for
> exploration and querying of different types of datastores.
>
> == Proposal ==
>
> MetaModel provides a uniform meta-model for exploring and querying the
> structure of datastores, covering but not limited to relational databases,
> various data file formats, NoSQL databases, Salesforce.com, SugarCRM and
> more. The scope of the project is to stay domain-agnostic, so the
> meta-model will be concerned with schemas, tables, columns, rows,
> relationships etc.
>
> On top of this meta-model a rich querying API is provided which resembles
> SQL, but built using compiler-checked Java language constructs. For
> datastores that do not have a native SQL-compatible query engine, the
> MetaModel project also includes an abstract Java-based query engine
> implementation which individual datastore-modules can adapt to fit the
> concrete datastore.
>
> === Background ===
>
> The MetaModel project was initially developed by eobjects.dk to service the
> DataCleaner application (http://datacleaner.org). The main requirement was
> to perform data querying and modification operations on a wide range of
> quite different datastores. Furthermore a programmatic query model was
> needed in order to allow different components to influence the query plan.
>
> In 2009, Human Inference acquired the eobjects projects including
> MetaModel. Since then MetaModel has been put to extensive use in the Human
> Inference products. The open source nature of the project was reinforced,
> leading to a significant growth in the community.
>
> MetaModel has successfully been used in a number of other open source
> projects as well as mission critical commercial software from Human
> Inference. Currently MetaModel is hosted at http://metamodel.eobjects.org.
>
> === Rationale ===
>
> Different types of datastores have different characteristics, which always
> lead to the interfaces for these being different from one another.
> Standards like JDBC and the SQL language attempt to standardize data
> access, but for some datastore types like flat files, spreadsheets, NoSQL
> databases and more, such standards are not even implementable.
>
> Specialization in interfaces obviously has merit for optimized usage, but
> for integration tools, batch applications and/or generic data modification
> tools, this myriad of specialized interfaces is a big pain. Furthermore,
> being able to query every datastore with a basic set of SQL-like features
> can be a great productivity boost for a wide range of applications.
>
> === Initial goals ===
>
> MetaModel is already a stable project, so initial goals are more oriented
> towards an adaptation to the Apache ecosystem than towards functional changes.
>
> We are constantly adding more datastore types to the portfolio, but the
> core modules have not had drastic changes for some time.
>
> Our focus will be on making ties with other Apache projects (such as POI,
> Gora, HBase and CouchDB) and potentially renaming the ‘MetaModel’ project
> to something more memorable.
> This includes complying with the Apache Software Foundation licensing
> policy for third-party dependencies.
>
> == Current status ==
>
> === Meritocracy ===
>
> We intend to do everything we can to encourage a meritocracy in the
> development of MetaModel. Currently most important development and design
> decisions have been made at Human Inference, but with an open window for
> anyone to participate on mailing lists and discussion forums. We believe
> that the approach going forward should be more encouraging by sharing all
> the design ideas and discussions in the open, not just the topics that
> have been “dragged” into the open by third parties. We believe that
> meritocracy will be further stimulated by granting the control of the
> project to an independent committee.
>
> === Community ===
>
> The community around MetaModel already exists, but we believe it will grow
> substantially by becoming an Apache project. With MetaModel used in a wide
> range of both open and closed source applications, both at Human Inference
> (HIquality MDM), it’s

RE: [VOTE] Accept Apache MetaModel into the Apache incubator

2013-06-07 Thread Manuel van den Berg
[X] +1 Accept MetaModel into the Apache incubator

Looking forward to taking MetaModel to the next level.

 - Manuel


Re: [VOTE] Accept Apache MetaModel into the Apache incubator

2013-06-07 Thread Joe Brockmeier
On Thu, Jun 6, 2013, at 05:30 PM, Henry Saputra wrote:
> [] +1 Accept MetaModel into the Apache incubator
> [] +0 Don't care.
> [] -1 Don't accept MetaModel into the incubator because...

+1 (binding)

Best,

jzb
-- 
Joe Brockmeier
j...@zonker.net
Twitter: @jzb
http://www.dissociatedpress.net/




Re: [VOTE] Accept Apache MetaModel into the Apache incubator

2013-06-07 Thread Suresh Marru
+ 1 (binding)

All the best,
Suresh

Re: [VOTE] Accept Apache MetaModel into the Apache incubator

2013-06-07 Thread Ted Dunning
+1 (binding)


On Fri, Jun 7, 2013 at 7:47 PM, Suresh Marru  wrote:

> + 1 (binding)
>
> All the best,
> Suresh

Re: [VOTE] Accept Apache MetaModel into the Apache incubator

2013-06-07 Thread Rich Bowen

On 06/06/2013 06:30 PM, Henry Saputra wrote:

> Hi All,
>
> I'd like to call a VOTE for acceptance of MetaModel into the Apache
> incubator.
> The vote will close on June 12, 2013 at 6:00 PM (PST).
>
> [] +1 Accept MetaModel into the Apache incubator
> [] +0 Don't care.
> [] -1 Don't accept MetaModel into the incubator because...

+1

--
Rich Bowen
rbo...@rcbowen.com
Shosholoza





June report

2013-06-07 Thread Benson Margulies
I'll button it up in the middle of tomorrow some time.




[VOTE] Apache Spark for the Incubator

2013-06-07 Thread Mattmann, Chris A (398J)
Hi Folks,

OK discussion has died down, time to VOTE to accept Spark into the
Apache Incubator. I'll let the VOTE run for at least a week.

So far I've heard +1s from the following folks, so no need for them
to VOTE again unless they want to change their VOTE:

+1

Chris Mattmann*
Konstantin Boudnik
Henry Saputra*
Reynold Xin
Pei Chen
Roman Shaposhnik*
Suresh Marru*

* -indicates IPMC

[ ] +1 Accept Spark into the Apache Incubator.
[ ] +0 Don't care.
[ ] -1 Don't accept Spark into the Apache Incubator because..

Proposal text is below.

=== Abstract ===
Spark is an open source system for large-scale data analysis on clusters.

=== Proposal ===
Spark is an open source system for fast and flexible large-scale data
analysis. Spark provides a general purpose runtime that supports
low-latency execution in several forms. These include interactive
exploration of very large datasets, near real-time stream processing, and
ad-hoc SQL analytics (through higher layer extensions). Spark interfaces
with HDFS, HBase, Cassandra and several other storage layers, and
exposes APIs in Scala, Java and Python.

=== Background ===
Spark started as a U.C. Berkeley research project, designed to efficiently
run machine learning algorithms on large datasets. Over time, it has
evolved into a general computing engine as outlined above. Spark's
developer community has also grown to include additional institutions,
such as universities, research labs, and corporations. Funding has been
provided by various institutions including the U.S. National Science
Foundation, DARPA, and a number of industry sponsors. See:
https://amplab.cs.berkeley.edu/sponsors/ for full details.

=== Rationale ===
As the number of contributors to Spark has grown, we have sought a
long-term home for the project, and we believe the Apache foundation would
be a great fit. Spark is a natural fit for the Apache foundation: Spark
already interoperates with several existing Apache projects (HDFS, HBase,
Hive, Cassandra, Avro and Flume to name a few). The Spark team is familiar
with the Apache process and subscribes to the Apache mission - the
team includes multiple Apache committers already. Finally, joining Apache
will help coordinate the development effort of the growing number of
organizations which contribute to Spark.

== Initial Goals ==
The initial goals will most likely be to move the existing codebase to
Apache and integrate with the Apache development process. Furthermore, we
plan for incremental development, and releases in line with the Apache
guidelines.

=== Current Status ===
== Meritocracy ==
The Spark project already operates on meritocratic principles. Today,
Spark has several developers and has accepted multiple major patches from
outside of U.C. Berkeley. While this process has remained mostly informal
(we do not have an official committer list), an implicit organization
exists in which individuals who contribute major components act as
maintainers for those modules. If accepted, the Spark project would
include several of these participants as committers from the outset. We
will work to identify all committers and PPMC members for the project and
to operate under the ASF meritocratic principles.

=== Community ===
Acceptance into the Apache foundation would bolster the already strong
user and developer community around Spark. That community includes dozens
of contributors from several institutions, a meetup group with several
hundred members, and an active mailing list composed of hundreds of users.

=== Core Developers ===
The core developers of our project are listed in our contributors and
initial PPMC below. Though many are at UC Berkeley, there is a
representative cross-section of other organizations including Quantifind,
Microsoft, Yahoo!, ClearStory Data, Bizo, Intel, Tagged and Webtrends.


=== Alignment ===
Our proposed effort aligns with several ongoing BIGDATA and U.S. National
priority funding interests including the NSF and its Expeditions program,
and the DARPA XDATA project. Our industry partners and collaborators are
well aligned with our code base.

There are also a number of related Apache projects and dependencies that
will be mentioned in the Relationships with Other Apache products section.

== Known Risks ==

=== Orphaned Products ===
Given the current level of investment in Spark, the risk of the project
being abandoned is minimal. There are several constituents who are highly
incentivized to continue development. The U.C. Berkeley AMPLab relies on
Spark as a platform for a large number of long-term research projects.
Several companies have built verticalized products which are tightly
dependent on Spark. Other companies have devoted significant internal
infrastructure investment to Spark.

=== Inexperience with Open Source ===
Spark has existed as a healthy open source project for several years.
During that time, Matei and others have curated an open-source community
successfully, attracting developers from a diverse group of co

Re: [VOTE] Apache Spark for the Incubator

2013-06-07 Thread Ted Dunning
+1


On Sat, Jun 8, 2013 at 7:34 AM, Mattmann, Chris A (398J) <
chris.a.mattm...@jpl.nasa.gov> wrote:

> Hi Folks,
>
> OK discussion has died down, time to VOTE to accept Spark into the
> Apache Incubator. I'll let the VOTE run for at least a week.
>
> So far I've heard +1s from the following folks, so no need for them
> to VOTE again unless they want to change their VOTE:
>
> +1
>
> Chris Mattmann*
> Konstantin Boudnik
> Henry Saputra*
> Reynold Xin
> Pei Chen
> Roman Shaposhnik*
> Suresh Marru*
>
> * -indicates IPMC
>
> [ ] +1 Accept Spark into the Apache Incubator.
> [ ] +0 Don't care.
> [ ] -1 Don't accept Spark into the Apache Incubator because..
>
> Proposal text is below.
>
> === Abstract ===
> Spark is an open source system for large-scale data analysis on clusters.
>
> === Proposal ===
> Spark is an open source system for fast and flexible large-scale data
> analysis. Spark provides a general purpose runtime that supports
> low-latency execution in several forms. These include interactive
> exploration of very large datasets, near real-time stream processing, and
> ad-hoc SQL analytics (through higher layer extensions). Spark interfaces
> with HDFS, HBase, Cassandra and several other storage layers, and
> exposes APIs in Scala, Java and Python.
>
> === Background ===
> Spark started as a U.C. Berkeley research project, designed to efficiently
> run machine learning algorithms on large datasets. Over time, it has
> evolved into a general computing engine as outlined above. Spark's
> developer community has also grown to include additional institutions,
> such as universities, research labs, and corporations. Funding has been
> provided by various institutions including the U.S. National Science
> Foundation, DARPA, and a number of industry sponsors. See:
> https://amplab.cs.berkeley.edu/sponsors/ for full details.
>
> === Rationale ===
> As the number of contributors to Spark has grown, we have sought a
> long-term home for the project, and we believe the Apache foundation would
> be a great fit. Spark is a natural fit for the Apache foundation: Spark
> already interoperates with several existing Apache projects (HDFS, HBase,
> Hive, Cassandra, Avro and Flume to name a few). The Spark team is familiar
> with the Apache process and subscribes to the Apache mission - the
> team includes multiple Apache committers already. Finally, joining Apache
> will help coordinate the development effort of the growing number of
> organizations which contribute to Spark.
>
> == Initial Goals ==
> The initial goals will most likely be to move the existing codebase to
> Apache and integrate with the Apache development process. Furthermore, we
> plan for incremental development and releases in line with the Apache
> guidelines.
>
> === Current Status ===
> == Meritocracy ==
> The Spark project already operates on meritocratic principles. Today,
> Spark has several developers and has accepted multiple major patches from
> outside of U.C. Berkeley. While this process has remained mostly informal
> (we do not have an official committer list), an implicit organization
> exists in which individuals who contribute major components act as
> maintainers for those modules. If accepted, the Spark project would
> include several of these participants as committers from the outset. We
> will work to identify all committers and PPMC members for the project and
> to operate under the ASF meritocratic principles.
>
> === Community ===
> Acceptance into the Apache foundation would bolster the already strong
> user and developer community around Spark. That community includes dozens
> of contributors from several institutions, a meetup group with several
> hundred members, and an active mailing list composed of hundreds of users.
>
> === Core Developers ===
> The core developers of our project are listed in our contributors and
> initial PPMC below. Though many are at U.C. Berkeley, there is a
> representative cross sampling of other organizations including Quantifind,
> Microsoft, Yahoo!, ClearStory Data, Bizo, Intel, Tagged and Webtrends.
>
>
> === Alignment ===
> Our proposed effort aligns with several ongoing BIGDATA and U.S. National
> priority funding interests including the NSF and its Expeditions program,
> and the DARPA XDATA project. Our industry partners and collaborators are
> well aligned with our code base.
>
> There are also a number of related Apache projects and dependencies that
> will be mentioned in the Relationships with Other Apache products section.
>
> == Known Risks ==
>
> === Orphaned Products ===
> Given the current level of investment in Spark, the risk of the project
> being abandoned is minimal. There are several constituents who are highly
> incentivized to continue development. The U.C. Berkeley AMPLab relies on
> Spark as a platform for a large number of long-term research projects.
> Several companies have built verticalized products which are tightly
> dependent on Spark. Other companies have devoted sign

Re: [VOTE] Apache Spark for the Incubator

2013-06-07 Thread Scott Deboy
+1

On 6/7/13, Ted Dunning wrote:
> +1
>
>
> On Sat, Jun 8, 2013 at 7:34 AM, Mattmann, Chris A (398J) <
> chris.a.mattm...@jpl.nasa.gov> wrote:
>
>> Hi Folks,
>>
>> OK discussion has died down, time to VOTE to accept Spark into the
>> Apache Incubator. I'll let the VOTE run for at least a week.
>>
>> So far I've heard +1s from the following folks, so no need for them
>> to VOTE again unless they want to change their VOTE:
>>
>> +1
>>
>> Chris Mattmann*
>> Konstantin Boudnik
>> Henry Saputra*
>> Reynold Xin
>> Pei Chen
>> Roman Shaposhnik*
>> Suresh Marru*
>>
>> * -indicates IPMC
>>
>> [ ] +1 Accept Spark into the Apache Incubator.
>> [ ] +0 Don't care.
>> [ ] -1 Don't accept Spark into the Apache Incubator because..
>>
>> Proposal text is below.
>>
>> === Abstract ===
>> Spark is an open source system for large-scale data analysis on clusters.
>>
>> === Proposal ===
>> Spark is an open source system for fast and flexible large-scale data
>> analysis. Spark provides a general purpose runtime that supports
>> low-latency execution in several forms. These include interactive
>> exploration of very large datasets, near real-time stream processing, and
>> ad-hoc SQL analytics (through higher layer extensions). Spark interfaces
>> with HDFS, HBase, Cassandra and several other storage layers, and
>> exposes APIs in Scala, Java and Python.
>>
>> === Background ===
>> Spark started as a U.C. Berkeley research project, designed to efficiently
>> run machine learning algorithms on large datasets. Over time, it has
>> evolved into a general computing engine as outlined above. Spark's
>> developer community has also grown to include additional institutions,
>> such as universities, research labs, and corporations. Funding has been
>> provided by various institutions including the U.S. National Science
>> Foundation, DARPA, and a number of industry sponsors. See:
>> https://amplab.cs.berkeley.edu/sponsors/ for full details.
>>
>> === Rationale ===
>> As the number of contributors to Spark has grown, we have sought a
>> long-term home for the project, and we believe the Apache foundation
>> would
>> be a great fit. Spark is a natural fit for the Apache foundation: Spark
>> already interoperates with several existing Apache projects (HDFS, HBase,
>> Hive, Cassandra, Avro and Flume to name a few). The Spark team is
>> familiar
>> with the Apache process and subscribes to the Apache mission - the
>> team includes multiple Apache committers already. Finally, joining Apache
>> will help coordinate the development effort of the growing number of
>> organizations which contribute to Spark.
>>
>> == Initial Goals ==
>> The initial goals will most likely be to move the existing codebase to
>> Apache and integrate with the Apache development process. Furthermore, we
>> plan for incremental development and releases in line with the Apache
>> guidelines.
>>
>> === Current Status ===
>> == Meritocracy ==
>> The Spark project already operates on meritocratic principles. Today,
>> Spark has several developers and has accepted multiple major patches from
>> outside of U.C. Berkeley. While this process has remained mostly informal
>> (we do not have an official committer list), an implicit organization
>> exists in which individuals who contribute major components act as
>> maintainers for those modules. If accepted, the Spark project would
>> include several of these participants as committers from the outset. We
>> will work to identify all committers and PPMC members for the project and
>> to operate under the ASF meritocratic principles.
>>
>> === Community ===
>> Acceptance into the Apache foundation would bolster the already strong
>> user and developer community around Spark. That community includes dozens
>> of contributors from several institutions, a meetup group with several
>> hundred members, and an active mailing list composed of hundreds of
>> users.
>>
>> === Core Developers ===
>> The core developers of our project are listed in our contributors and
>> initial PPMC below. Though many are at U.C. Berkeley, there is a
>> representative cross sampling of other organizations including
>> Quantifind,
>> Microsoft, Yahoo!, ClearStory Data, Bizo, Intel, Tagged and Webtrends.
>>
>>
>> === Alignment ===
>> Our proposed effort aligns with several ongoing BIGDATA and U.S. National
>> priority funding interests including the NSF and its Expeditions program,
>> and the DARPA XDATA project. Our industry partners and collaborators are
>> well aligned with our code base.
>>
>> There are also a number of related Apache projects and dependencies that
>> will be mentioned in the Relationships with Other Apache products
>> section.
>>
>> == Known Risks ==
>>
>> === Orphaned Products ===
>> Given the current level of investment in Spark, the risk of the project
>> being abandoned is minimal. There are several constituents who are highly
>> incentivized to continue development. The U.C. Berkeley AMPLab relies on
>> Spark as a platform for a large nu