[RESULT] [VOTE] Accept MADlib into the Apache Incubator

2015-09-15 Thread Roman Shaposhnik
On Wed, Sep 9, 2015 at 7:37 PM, Roman Shaposhnik  wrote:
> Following the discussion earlier:
> http://s.apache.org/TE6
>
> I would like to call a VOTE for accepting
> MADlib community as a new ASF incubator
> project.
>
> The proposal is available at:
> https://wiki.apache.org/incubator/MADlibProposal
> and is also included at the bottom of this email.
>
> Vote is open until at least Mon, 14 September 2015, 23:59:00 PST
>
>  [ ] +1 accept MADlib into the Apache Incubator
>  [ ] ±0
>  [ ] -1 because...

This vote is now closed and passes with 4 binding +1 votes,
15 non-binding +1 votes and no 0 or -1 votes.

Thanks to all who helped with the proposal and cast the vote!

Here's a vote tally:

Non-binding +1s:
  Atri Sharma
  Christian Tzolov
  Rahul Iyer
  Caleb Welton
  Frank McQuillan
  Srivatsan Ramanujam
  Chris Rawles
  Gregory Chase
  Gautam Muralidhar
  Don Bosco Durai
  dpop...@uvic.ca
  Kee Siong Ng
  Sarah Aerni
  Michael West
  AJ Welch

Binding +1s:
  Konstantin Boudnik
  Roman Shaposhnik
  Julian Hyde
  Ted Dunning

Thanks,
Roman.

-
To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
For additional commands, e-mail: general-h...@incubator.apache.org



Re: [VOTE] Accept MADlib into the Apache Incubator

2015-09-14 Thread West, Michael
+1

Mike West
503.276.1815

Count what is countable, measure what is measurable,
and what is not measurable, make measurable - Galileo




Re: [VOTE] Accept MADlib into the Apache Incubator

2015-09-14 Thread Sarah Aerni
+1 (non-binding)

Re: [VOTE] Accept MADlib into the Apache Incubator

2015-09-14 Thread AJ Welch
+1


Re: [VOTE] Accept MADlib into the Apache Incubator

2015-09-13 Thread Ted Dunning
+1



On Fri, Sep 11, 2015 at 9:11 PM, Skip Intro  wrote:

> +1 (non-binding)

Re: [VOTE] Accept MADlib into the Apache Incubator

2015-09-11 Thread Gautam Muralidhar
+1 nonbinding

> On Sep 11, 2015, at 9:43 PM, Gregory Chase  wrote:
> 
> +1 nonbinding
> 
>> On Fri, Sep 11, 2015 at 8:12 AM, Chris Rawles  wrote:
>> 
>> --
>> Chris




Re: [VOTE] Accept MADlib into the Apache Incubator

2015-09-11 Thread Gregory Chase
+1 nonbinding

On Fri, Sep 11, 2015 at 8:12 AM, Chris Rawles  wrote:

> --
> Chris
>



-- 
Greg Chase

Director of Big Data Communities
http://www.pivotal.io/big-data

Pivotal Software
http://www.pivotal.io/

650-215-0477
@GregChase
Blog: http://geekmarketing.biz/


Re: [VOTE] Accept MADlib into the Apache Incubator

2015-09-11 Thread Don Bosco Durai
+1 (non binding)


On 9/11/15, 10:02 AM, "Gautam Muralidhar" wrote:

>+1 nonbinding
>
>> On Sep 11, 2015, at 9:43 PM, Gregory Chase  wrote:
>> 
>> +1 nonbinding
>> 
>>> On Fri, Sep 11, 2015 at 8:12 AM, Chris Rawles wrote:
>>> 
>>> --
>>> Chris
>
>






Re: [VOTE] Accept MADlib into the Apache Incubator

2015-09-11 Thread Chris Rawles
+1 (non-binding)

Re: [VOTE] Accept MADlib into the Apache Incubator

2015-09-11 Thread dpopova
+1





Re: [VOTE] Accept MADlib into the Apache Incubator

2015-09-11 Thread Kee Siong Ng
+1


> On 12 Sep 2015, at 5:00 am, Don Bosco Durai  wrote:
> 
> +1 (non binding)
> 
> 
> On 9/11/15, 10:02 AM, "Gautam Muralidhar" wrote:
> 
>> +1 nonbinding
>> 
>>> On Sep 11, 2015, at 9:43 PM, Gregory Chase  wrote:
>>> 
>>> +1 nonbinding
>>> 
>>>> On Fri, Sep 11, 2015 at 8:12 AM, Chris Rawles wrote:
>>>>
>>>> --
>>>> Chris
>> 
> 
> 
> 
> 




Re: [VOTE] Accept MADlib into the Apache Incubator

2015-09-11 Thread Skip Intro
+1 (non-binding)

Re: [VOTE] Accept MADlib into the Apache Incubator

2015-09-11 Thread Chris Rawles
-- 
Chris


Re: [VOTE] Accept MADlib into the Apache Incubator

2015-09-10 Thread Rahul Iyer
+1 (non-binding)

On Wed, Sep 09, 2015 at 07:37PM, Roman Shaposhnik wrote:
>
> Following the discussion earlier:
> http://s.apache.org/TE6
>
> I would like to call a VOTE for accepting
> MADlib community as a new ASF incubator
> project.
>
> The proposal is available at:
> https://wiki.apache.org/incubator/MADlibProposal
> and is also included at the bottom of this email.
>
> Vote is open until at least Mon, 14 September 2015, 23:59:00 PST
>
>  [ ] +1 accept MADlib into the Apache Incubator
>  [ ] ±0
>  [ ] -1 because...
>
> Thanks,
> Roman.
>
> == Abstract ==
> MADlib is an open-source library (licensed under 2-clause BSD license)
> for scalable in-database analytics. It provides data-parallel
> implementations of mathematical, statistical and machine learning
> methods for structured and unstructured data. The MADlib mission is to
> foster widespread development of scalable analytic skills, by
> harnessing efforts from commercial practice, academic research, and
> open source development.
>
> MADlib occupies a unique niche in the realm of data science and
> machine learning libraries since its SQL APIs can allow it to work on
> a wide range of data stores and SQL engines.
>
> == Proposal ==
> The current open source community behind MADlib feels that aligning
> itself with HAWQ's community, governance model, infrastructure and
> roadmap will allow the project to accelerate adoption and community
> growth. Given HAWQ's trajectory of entering Apache Software Foundation
> family as an Incubating project, we feel that the best course of
> action for MADlib is to follow a similar route.
>
> MADlib and HAWQ are complementary technologies in that MADlib
> in-database analytical functions can run within the HAWQ execution
> engine. (MADlib also runs on Greenplum Database and PostgreSQL today.)
> It is expected that contributors to MADlib will be cognizant of the
> HAWQ ASF project and may contribute to it as well.  In short,
> collaboration between the two communities will make both projects more
> vibrant and advance the respective technologies in potentially novel
> directions.
>
> Contributors may also look at the HAWQ project as a starting point for
> ports to other parallel database engines. This proposal highly
> encourages this type of work as it would help to further realize the
> original cross-platform goal of MADlib as envisioned by its
> originators.
>
> Thus, the goal of this proposal is to bring the existing MADlib open
> source community into ASF, change the project's governance model to
> the "Apache Way" and transition the project's codebase and
> infrastructure into ASF INFRA. The community has agreed to transfer
> the brand name "MADlib" to Apache Software Foundation as well.
>
> Pivotal Inc. on behalf of the MADlib open source community is
> submitting this proposal to transition source code and associated
> artifacts (documentation, web site content, wiki, etc.) to the Apache
> Software Foundation Incubator under the Apache License, Version 2.0
> and is asking the Incubator PMC to establish a MADlib incubating
> project.
>
> Currently MADlib uses a few category X licensed software tools during
> its build (mostly for generating documentation):
> * doxypy 0.4.2 (GPL)
> * doxygen 1.8.4 (GPL)
> * TikZ-UML
> * bison 2.4 (GPL, with an exception for generated output)
> We feel that this usage is compatible with an overall project licensed
> under the ALv2 and don't anticipate any changes.
> Our usage of LGPL library cern_root-5.34 is expected to go away since
> the 2 cern modules used are being entirely rewritten in MADlib.
>
> Finally, MADlib's inclusion of an MPL-licensed library (eigen 3.2.2) in
> its binary artifact seems to be consistent with the
> ASF recommendation for managing "weak copyleft" dependencies.
>
>
> == Background ==
> MADlib grew out of discussions between database engine developers,
> data scientists, IT architects and academics interested in new
> approaches to scalable, sophisticated in-database analytics. These
> discussions were written up in a paper in VLDB 2009 that coined the
> term “MAD Skills” for data analysis
> (http://dl.acm.org/citation.cfm?id=1687576). The MADlib software
> project began the following year as a collaboration between
> researchers at UC Berkeley and engineers and data scientists at
> Pivotal (formerly EMC/Greenplum).
>
> The initial MADlib codebase came from EMC/Greenplum, UC Berkeley, the
> University of Wisconsin, and the University of Florida.  The project
> was publicly documented in a paper at VLDB 2012
> (http://vldb.org/pvldb/vol5/p1700_joehellerstein_vldb2012.pdf).  Today
> MADlib has contributors from around the world including both
> individuals and institutions.  For example, recent contributions have
> come from Pivotal, Stanford University, and the University of Illinois
> at Chicago.
>
> MADlib was conceived from the outset as a free, open source library
> for all to use and contribute to.  Since its inception, the 

Re: [VOTE] Accept MADlib into the Apache Incubator

2015-09-10 Thread Roman Shaposhnik
On Wed, Sep 9, 2015 at 7:37 PM, Roman Shaposhnik  wrote:
> Following the discussion earlier:
> http://s.apache.org/TE6
>
> I would like to call a VOTE for accepting
> MADlib community as a new ASF incubator
> project.
>
> The proposal is available at:
> https://wiki.apache.org/incubator/MADlibProposal
> and is also included at the bottom of this email.
>
> Vote is open until at least Mon, 14 September 2015, 23:59:00 PST
>
>  [ ] +1 accept MADlib into the Apache Incubator
>  [ ] ±0
>  [ ] -1 because...

+1 (binding)

Thanks,
Roman.




Re: [VOTE] Accept MADlib into the Apache Incubator

2015-09-10 Thread Caleb Welton
+1 (non-binding)

On Thu, Sep 10, 2015 at 12:53 PM, Rahul Iyer  wrote:

> +1 (non-binding)
>
> On Wed, Sep 09, 2015 at 07:37PM, Roman Shaposhnik wrote:
> >
> > Following the discussion earlier:
> > http://s.apache.org/TE6
> >
> > I would like to call a VOTE for accepting
> > MADlib community as a new ASF incubator
> > project.
> >
> > The proposal is available at:
> > https://wiki.apache.org/incubator/MADlibProposal
> > and is also included at the bottom of this email.
> >
> > Vote is open until at least Mon, 14 September 2015, 23:59:00 PST
> >
> >  [ ] +1 accept MADlib into the Apache Incubator
> >  [ ] ±0
> >  [ ] -1 because...
> >
> > Thanks,
> > Roman.

Re: [VOTE] Accept MADlib into the Apache Incubator

2015-09-10 Thread Srivatsan Ramanujam
+1 (non-binding).


On Wed, Sep 09, 2015 at 07:37PM, Roman Shaposhnik wrote:
>
> Following the discussion earlier:
>    http://s.apache.org/TE6
>
> I would like to call a VOTE for accepting
> MADlib community as a new ASF incubator
> project.
>
> The proposal is available at:
> https://wiki.apache.org/incubator/MADlibProposal
> and is also included at the bottom of this email.
>
> Vote is open until at least Mon, 14 September 2015, 23:59:00 PST
>
>  [ ] +1 accept MADlib into the Apache Incubator
>  [ ] ±0
>  [ ] -1 because...
>
> Thanks,
> Roman.

Re: [VOTE] Accept MADlib into the Apache Incubator

2015-09-10 Thread Frank McQuillan
+1 (non-binding)

On Thu, Sep 10, 2015 at 3:57 PM, Caleb Welton  wrote:

> +1 (non-binding)
>
> On Thu, Sep 10, 2015 at 12:53 PM, Rahul Iyer  wrote:
>
> > +1 (non-binding)
> >
> > On Wed, Sep 09, 2015 at 07:37PM, Roman Shaposhnik wrote:
> > >
> > > Following the discussion earlier:
> > >    http://s.apache.org/TE6
> > >
> > > I would like to call a VOTE for accepting
> > > MADlib community as a new ASF incubator
> > > project.
> > >
> > > The proposal is available at:
> > > https://wiki.apache.org/incubator/MADlibProposal
> > > and is also included at the bottom of this email.
> > >
> > > Vote is open until at least Mon, 14 September 2015, 23:59:00 PST
> > >
> > >  [ ] +1 accept MADlib into the Apache Incubator
> > >  [ ] ±0
> > >  [ ] -1 because...
> > >
> > > Thanks,
> > > Roman.
> > >
> > > == Abstract ==
> > > MADlib is an open-source library (licensed under 2-clause BSD license)
> > > for scalable in-database analytics. It provides data-parallel
> > > implementations of mathematical, statistical and machine learning
> > > methods for structured and unstructured data. The MADlib mission is to
> > > foster widespread development of scalable analytic skills, by
> > > harnessing efforts from commercial practice, academic research, and
> > > open source development.
> > >
> > > MADlib occupies a unique niche in the realm of data science and
> > > machine learning libraries since its SQL APIs can allow it to work on
> > > a wide range of data stores and SQL engines.
> > >
> > > == Proposal ==
> > > The current open source community behind MADlib feels that aligning
> > > itself with HAWQ's community, governance model, infrastructure and
> > > roadmap will allow the project to accelerate adoption and community
> > > growth. Given HAWQ's trajectory of entering Apache Software Foundation
> > > family as an Incubating project, we feel that the best course of
> > > action for MADlib is to follow a similar route.
> > >
> > > MADlib and HAWQ are complementary technologies in that MADlib
> > > in-database analytical functions can run within the HAWQ execution
> > > engine. (MADlib also runs on Greenplum Database and PostgreSQL today.)
> > > It is expected that contributors to MADlib will be cognizant of the
> > > HAWQ ASF project and may contribute to it as well.  In short,
> > > collaboration between the two communities will make both projects more
> > > vibrant and advance the respective technologies in potentially novel
> > > directions.
> > >
> > > Contributors may also look at the HAWQ project as a starting port for
> > > ports to other parallel database engines. This proposal highly
> > > encourages this type of work as it would help to further realize the
> > > original cross-platform goal of MADlib as envisioned by its
> > > originators.
> > >
> > > Thus, the goal of this proposal is to bring the existing MADlib open
> > > source community into ASF, change the project's governance model to
> > > the "Apache Way" and transition the project's codebase and
> > > infrastructure into ASF INFRA. The community has agreed to transfer
> > > the brand name "MADlib" to Apache Software Foundation as well.
> > >
> > > Pivotal Inc. on behalf of the MADlib open source community is
> > > submitting this proposal to transition source code and associated
> > > artifacts (documentation, web site content, wiki, etc.) to the Apache
> > > Software Foundation Incubator under the Apache License, Version 2.0
> > > and is asking Incubator PMC to established a MADlib incubating
> > > project.
> > >
> > > Currently MADlib uses a few category X licensed software tools during
> > > its build (mostly for generating documentation):
> > >* doxypy 0.4.2 (GPL)
> > >* doxygen 1.8.4 (GPL)
> > >* TikZ-UML
> > >* bison 2.4 (GPL, with an exception for generated output)
> > > We feel that this usage is compatible with an overall project licensed
> > > under the ALv2 and don't anticipate any changes.
> > > Our usage of LGPL library cern_root-5.34 is expected to go away since
> > > the 2 cern modules used are being entirely re-written
> > > in MADlib
> > >
> > > Finally, MADlib inclusion of MPL licensed library (eigen 3.2.2) into
> > > its binary artifact seems to be consistent with
> > > ASF recommendation for managing "weak copyleft" dependencies.
> > >
> > >
> > > == Background ==
> > > MADlib grew out of discussions between database engine developers,
> > > data scientists, IT architects and academics interested in new
> > > approaches to scalable, sophisticated in-database analytics. These
> > > discussions were written up in a paper in VLDB 2009 that coined the
> > > term “MAD Skills” for data analysis
> > > (http://dl.acm.org/citation.cfm?id=1687576). The MADlib software
> > > project began the following year as a collaboration between
> > > researchers at UC Berkeley and engineers and data scientists at
> > > Pivotal (former EMC/Greenplum).
> > >
> > > The initial MADlib codebase 

Re: [VOTE] Accept MADlib into the Apache Incubator

2015-09-10 Thread Julian Hyde
+1 (binding)

Julian





Re: [VOTE] Accept MADlib into the Apache Incubator

2015-09-09 Thread Christian Tzolov
+1 (not binding)

On Thu, Sep 10, 2015 at 5:19 AM, Atri Sharma  wrote:

> +1
> On 10 Sep 2015 08:11, "Konstantin Boudnik"  wrote:
>
> > +1 (binding)
> >
> > On Wed, Sep 09, 2015 at 07:37PM, Roman Shaposhnik wrote:
> > > Following the discussion earlier:
> > >    http://s.apache.org/TE6
> > >
> > > I would like to call a VOTE for accepting
> > > MADlib community as a new ASF incubator
> > > project.
> > >
> > > The proposal is available at:
> > > https://wiki.apache.org/incubator/MADlibProposal
> > > and is also included at the bottom of this email.
> > >
> > > Vote is open until at least Mon, 14 September 2015, 23:59:00 PST
> > >
> > >  [ ] +1 accept MADlib into the Apache Incubator
> > >  [ ] ±0
> > >  [ ] -1 because...
> > >
> > > Thanks,
> > > Roman.
> > >
> > > == Abstract ==
> > > MADlib is an open-source library (licensed under 2-clause BSD license)
> > > for scalable in-database analytics. It provides data-parallel
> > > implementations of mathematical, statistical and machine learning
> > > methods for structured and unstructured data. The MADlib mission is to
> > > foster widespread development of scalable analytic skills, by
> > > harnessing efforts from commercial practice, academic research, and
> > > open source development.
> > >
> > > MADlib occupies a unique niche in the realm of data science and
> > > machine learning libraries since its SQL APIs can allow it to work on
> > > a wide range of data stores and SQL engines.
> > >
> > > == Proposal ==
> > > The current open source community behind MADlib feels that aligning
> > > itself with HAWQ's community, governance model, infrastructure and
> > > roadmap will allow the project to accelerate adoption and community
> > > growth. Given HAWQ's trajectory of entering Apache Software Foundation
> > > family as an Incubating project, we feel that the best course of
> > > action for MADlib is to follow a similar route.
> > >
> > > MADlib and HAWQ are complementary technologies in that MADlib
> > > in-database analytical functions can run within the HAWQ execution
> > > engine. (MADlib also runs on Greenplum Database and PostgreSQL today.)
> > > It is expected that contributors to MADlib will be cognizant of the
> > > HAWQ ASF project and may contribute to it as well.  In short,
> > > collaboration between the two communities will make both projects more
> > > vibrant and advance the respective technologies in potentially novel
> > > directions.
> > >
> > > Contributors may also look at the HAWQ project as a starting port for
> > > ports to other parallel database engines. This proposal highly
> > > encourages this type of work as it would help to further realize the
> > > original cross-platform goal of MADlib as envisioned by its
> > > originators.
> > >
> > > Thus, the goal of this proposal is to bring the existing MADlib open
> > > source community into ASF, change the project's governance model to
> > > the "Apache Way" and transition the project's codebase and
> > > infrastructure into ASF INFRA. The community has agreed to transfer
> > > the brand name "MADlib" to Apache Software Foundation as well.
> > >
> > > Pivotal Inc. on behalf of the MADlib open source community is
> > > submitting this proposal to transition source code and associated
> > > artifacts (documentation, web site content, wiki, etc.) to the Apache
> > > Software Foundation Incubator under the Apache License, Version 2.0
> > > and is asking Incubator PMC to established a MADlib incubating
> > > project.
> > >
> > > Currently MADlib uses a few category X licensed software tools during
> > > its build (mostly for generating documentation):
> > >* doxypy 0.4.2 (GPL)
> > >* doxygen 1.8.4 (GPL)
> > >* TikZ-UML
> > >* bison 2.4 (GPL, with an exception for generated output)
> > > We feel that this usage is compatible with an overall project licensed
> > > under the ALv2 and don't anticipate any changes.
> > > Our usage of LGPL library cern_root-5.34 is expected to go away since
> > > the 2 cern modules used are being entirely re-written
> > > in MADlib
> > >
> > > Finally, MADlib inclusion of MPL licensed library (eigen 3.2.2) into
> > > its binary artifact seems to be consistent with
> > > ASF recommendation for managing "weak copyleft" dependencies.
> > >
> > >
> > > == Background ==
> > > MADlib grew out of discussions between database engine developers,
> > > data scientists, IT architects and academics interested in new
> > > approaches to scalable, sophisticated in-database analytics. These
> > > discussions were written up in a paper in VLDB 2009 that coined the
> > > term “MAD Skills” for data analysis
> > > (http://dl.acm.org/citation.cfm?id=1687576). The MADlib software
> > > project began the following year as a collaboration between
> > > researchers at UC Berkeley and engineers and data scientists at
> > > Pivotal (former EMC/Greenplum).
> > >
> > > The initial MADlib codebase came from EMC/Greenplum, UC 

[VOTE] Accept MADlib into the Apache Incubator

2015-09-09 Thread Roman Shaposhnik
Following the discussion earlier:
   http://s.apache.org/TE6

I would like to call a VOTE for accepting
MADlib community as a new ASF incubator
project.

The proposal is available at:
https://wiki.apache.org/incubator/MADlibProposal
and is also included at the bottom of this email.

Vote is open until at least Mon, 14 September 2015, 23:59:00 PST

 [ ] +1 accept MADlib into the Apache Incubator
 [ ] ±0
 [ ] -1 because...

Thanks,
Roman.

== Abstract ==
MADlib is an open-source library (licensed under 2-clause BSD license)
for scalable in-database analytics. It provides data-parallel
implementations of mathematical, statistical and machine learning
methods for structured and unstructured data. The MADlib mission is to
foster widespread development of scalable analytic skills, by
harnessing efforts from commercial practice, academic research, and
open source development.

MADlib occupies a unique niche in the realm of data science and
machine learning libraries since its SQL APIs can allow it to work on
a wide range of data stores and SQL engines.

== Proposal ==
The current open source community behind MADlib feels that aligning
itself with HAWQ's community, governance model, infrastructure and
roadmap will allow the project to accelerate adoption and community
growth. Given HAWQ's trajectory of entering Apache Software Foundation
family as an Incubating project, we feel that the best course of
action for MADlib is to follow a similar route.

MADlib and HAWQ are complementary technologies in that MADlib
in-database analytical functions can run within the HAWQ execution
engine. (MADlib also runs on Greenplum Database and PostgreSQL today.)
It is expected that contributors to MADlib will be cognizant of the
HAWQ ASF project and may contribute to it as well.  In short,
collaboration between the two communities will make both projects more
vibrant and advance the respective technologies in potentially novel
directions.

Contributors may also look at the HAWQ project as a starting point for
ports to other parallel database engines. This proposal highly
encourages this type of work as it would help to further realize the
original cross-platform goal of MADlib as envisioned by its
originators.

Thus, the goal of this proposal is to bring the existing MADlib open
source community into ASF, change the project's governance model to
the "Apache Way" and transition the project's codebase and
infrastructure into ASF INFRA. The community has agreed to transfer
the brand name "MADlib" to Apache Software Foundation as well.

Pivotal Inc. on behalf of the MADlib open source community is
submitting this proposal to transition source code and associated
artifacts (documentation, web site content, wiki, etc.) to the Apache
Software Foundation Incubator under the Apache License, Version 2.0
and is asking the Incubator PMC to establish a MADlib incubating
project.

Currently MADlib uses a few category X licensed software tools during
its build (mostly for generating documentation):
   * doxypy 0.4.2 (GPL)
   * doxygen 1.8.4 (GPL)
   * TikZ-UML
   * bison 2.4 (GPL, with an exception for generated output)
We feel that this usage is compatible with an overall project licensed
under the ALv2 and don't anticipate any changes.
Our usage of the LGPL library cern_root-5.34 is expected to go away, since
the two cern modules used are being entirely rewritten in MADlib.

Finally, MADlib inclusion of MPL licensed library (eigen 3.2.2) into
its binary artifact seems to be consistent with
ASF recommendation for managing "weak copyleft" dependencies.


== Background ==
MADlib grew out of discussions between database engine developers,
data scientists, IT architects and academics interested in new
approaches to scalable, sophisticated in-database analytics. These
discussions were written up in a paper in VLDB 2009 that coined the
term “MAD Skills” for data analysis
(http://dl.acm.org/citation.cfm?id=1687576). The MADlib software
project began the following year as a collaboration between
researchers at UC Berkeley and engineers and data scientists at
Pivotal (formerly EMC/Greenplum).

The initial MADlib codebase came from EMC/Greenplum, UC Berkeley, the
University of Wisconsin, and the University of Florida.  The project
was publicly documented in a paper at VLDB 2012
(http://vldb.org/pvldb/vol5/p1700_joehellerstein_vldb2012.pdf).  Today
MADlib has contributors from around the world including both
individuals and institutions.  For example, recent contributions have
come from Pivotal, Stanford University, and the University of Illinois
at Chicago.

MADlib was conceived from the outset as a free, open source library
for all to use and contribute to.  Since its inception, the community
has steadily added new methods in the areas of mathematics,
statistics, machine learning, and data transformation.  The current
library includes over 30 principal algorithms as well as many
additional operators and utility functions.

The methods in MADlib are designed 

Re: [VOTE] Accept MADlib into the Apache Incubator

2015-09-09 Thread Konstantin Boudnik
+1 (binding)

On Wed, Sep 09, 2015 at 07:37PM, Roman Shaposhnik wrote:
> Following the discussion earlier:
>    http://s.apache.org/TE6
> 
> I would like to call a VOTE for accepting
> MADlib community as a new ASF incubator
> project.
> 
> The proposal is available at:
> https://wiki.apache.org/incubator/MADlibProposal
> and is also included at the bottom of this email.
> 
> Vote is open until at least Mon, 14 September 2015, 23:59:00 PST
> 
>  [ ] +1 accept MADlib into the Apache Incubator
>  [ ] ±0
>  [ ] -1 because...
> 
> Thanks,
> Roman.

Re: [VOTE] Accept MADlib into the Apache Incubator

2015-09-09 Thread Atri Sharma
+1
On 10 Sep 2015 08:11, "Konstantin Boudnik"  wrote:

> +1 (binding)
>
> On Wed, Sep 09, 2015 at 07:37PM, Roman Shaposhnik wrote:
> > Following the discussion earlier:
> >    http://s.apache.org/TE6
> >
> > I would like to call a VOTE for accepting
> > MADlib community as a new ASF incubator
> > project.
> >
> > The proposal is available at:
> > https://wiki.apache.org/incubator/MADlibProposal
> > and is also included at the bottom of this email.
> >
> > Vote is open until at least Mon, 14 September 2015, 23:59:00 PST
> >
> >  [ ] +1 accept MADlib into the Apache Incubator
> >  [ ] ±0
> >  [ ] -1 because...
> >
> > Thanks,
> > Roman.
> >
> > == Abstract ==
> > MADlib is an open-source library (licensed under 2-clause BSD license)
> > for scalable in-database analytics. It provides data-parallel
> > implementations of mathematical, statistical and machine learning
> > methods for structured and unstructured data. The MADlib mission is to
> > foster widespread development of scalable analytic skills, by
> > harnessing efforts from commercial practice, academic research, and
> > open source development.
> >
> > MADlib occupies a unique niche in the realm of data science and
> > machine learning libraries since its SQL APIs can allow it to work on
> > a wide range of data stores and SQL engines.
> >
> > == Proposal ==
> > The current open source community behind MADlib feels that aligning
> > itself with HAWQ's community, governance model, infrastructure and
> > roadmap will allow the project to accelerate adoption and community
> > growth. Given HAWQ's trajectory of entering Apache Software Foundation
> > family as an Incubating project, we feel that the best course of
> > action for MADlib is to follow a similar route.
> >
> > MADlib and HAWQ are complementary technologies in that MADlib
> > in-database analytical functions can run within the HAWQ execution
> > engine. (MADlib also runs on Greenplum Database and PostgreSQL today.)
> > It is expected that contributors to MADlib will be cognizant of the
> > HAWQ ASF project and may contribute to it as well.  In short,
> > collaboration between the two communities will make both projects more
> > vibrant and advance the respective technologies in potentially novel
> > directions.
> >
> > Contributors may also look at the HAWQ project as a starting point for
> > ports to other parallel database engines. This proposal highly
> > encourages this type of work as it would help to further realize the
> > original cross-platform goal of MADlib as envisioned by its
> > originators.
> >
> > Thus, the goal of this proposal is to bring the existing MADlib open
> > source community into ASF, change the project's governance model to
> > the "Apache Way" and transition the project's codebase and
> > infrastructure into ASF INFRA. The community has agreed to transfer
> > the brand name "MADlib" to Apache Software Foundation as well.
> >
> > Pivotal Inc. on behalf of the MADlib open source community is
> > submitting this proposal to transition source code and associated
> > artifacts (documentation, web site content, wiki, etc.) to the Apache
> > Software Foundation Incubator under the Apache License, Version 2.0
> > and is asking the Incubator PMC to establish a MADlib incubating
> > project.
> >
> > Currently MADlib uses a few category X licensed software tools during
> > its build (mostly for generating documentation):
> >* doxypy 0.4.2 (GPL)
> >* doxygen 1.8.4 (GPL)
> >* TikZ-UML
> >* bison 2.4 (GPL, with an exception for generated output)
> > We feel that this usage is compatible with an overall project licensed
> > under the ALv2 and don't anticipate any changes.
> > Our usage of the LGPL library cern_root-5.34 is expected to go away
> > since the 2 cern modules used are being entirely rewritten in
> > MADlib.
> >
> > Finally, MADlib's inclusion of an MPL-licensed library (eigen 3.2.2)
> > in its binary artifact seems to be consistent with the
> > ASF recommendation for managing "weak copyleft" dependencies.
> >
> >
> > == Background ==
> > MADlib grew out of discussions between database engine developers,
> > data scientists, IT architects and academics interested in new
> > approaches to scalable, sophisticated in-database analytics. These
> > discussions were written up in a paper in VLDB 2009 that coined the
> > term “MAD Skills” for data analysis
> > (http://dl.acm.org/citation.cfm?id=1687576). The MADlib software
> > project began the following year as a collaboration between
> > researchers at UC Berkeley and engineers and data scientists at
> > Pivotal (formerly EMC/Greenplum).
> >
> > The initial MADlib codebase came from EMC/Greenplum, UC Berkeley, the
> > University of Wisconsin, and the University of Florida.  The project
> > was publicly documented in a paper at VLDB 2012
> > (http://vldb.org/pvldb/vol5/p1700_joehellerstein_vldb2012.pdf).  Today
> > MADlib has contributors from around the world including both
> > individuals and