[RESULT] [VOTE] Accept MADlib into the Apache Incubator
On Wed, Sep 9, 2015 at 7:37 PM, Roman Shaposhnik wrote:
> Following the discussion earlier:
> http://s.apache.org/TE6
>
> I would like to call a VOTE for accepting
> MADlib community as a new ASF incubator
> project.
>
> The proposal is available at:
> https://wiki.apache.org/incubator/MADlibProposal
> and is also included at the bottom of this email.
>
> Vote is open until at least Mon, 14 September 2015, 23:59:00 PST
>
> [ ] +1 accept MADlib into the Apache Incubator
> [ ] ±0
> [ ] -1 because...

This vote is now closed and passes with 4 binding +1 votes,
15 non-binding +1 votes and no 0 or -1 votes.

Thanks to all who helped with the proposal and cast the vote!

Here's a vote tally:

Non-binding +1s:
    Atri Sharma
    Christian Tzolov
    Rahul Iyer
    Caleb Welton
    Frank McQuillan
    Srivatsan Ramanujam
    Chris Rawles
    Gregory Chase
    Gautam Muralidhar
    Don Bosco Durai
    dpop...@uvic.ca
    Kee Siong Ng
    Sarah Aerni
    Michael West
    AJ Welch

Binding +1s:
    Konstantin Boudnik
    Roman Shaposhnik
    Julian Hyde
    Ted Dunning

Thanks,
Roman.

-
To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
For additional commands, e-mail: general-h...@incubator.apache.org
Re: [VOTE] Accept MADlib into the Apache Incubator
+1

Mike West
Re: [VOTE] Accept MADlib into the Apache Incubator
+1 (non-binding)

On Wed, Sep 09, 2015 at 07:37PM, Roman Shaposhnik wrote:
> Following the discussion earlier:
> http://s.apache.org/TE6
>
> I would like to call a VOTE for accepting
> MADlib community as a new ASF incubator
> project.
>
> The proposal is available at:
> https://wiki.apache.org/incubator/MADlibProposal
> and is also included at the bottom of this email.
>
> Vote is open until at least Mon, 14 September 2015, 23:59:00 PST
>
> [ ] +1 accept MADlib into the Apache Incubator
> [ ] ±0
> [ ] -1 because...
>
> Thanks,
> Roman.
>
> == Abstract ==
> MADlib is an open-source library (licensed under 2-clause BSD license)
> for scalable in-database analytics. It provides data-parallel
> implementations of mathematical, statistical and machine learning
> methods for structured and unstructured data. The MADlib mission is to
> foster widespread development of scalable analytic skills, by
> harnessing efforts from commercial practice, academic research, and
> open source development.
>
> MADlib occupies a unique niche in the realm of data science and
> machine learning libraries since its SQL APIs can allow it to work on
> a wide range of data stores and SQL engines.
>
> == Proposal ==
> The current open source community behind MADlib feels that aligning
> itself with HAWQ's community, governance model, infrastructure and
> roadmap will allow the project to accelerate adoption and community
> growth. Given HAWQ's trajectory of entering Apache Software Foundation
> family as an Incubating project, we feel that the best course of
> action for MADlib is to follow a similar route.
>
> MADlib and HAWQ are complementary technologies in that MADlib
> in-database analytical functions can run within the HAWQ execution
> engine. (MADlib also runs on Greenplum Database and PostgreSQL today.)
> It is expected that contributors to MADlib will be cognizant of the
> HAWQ ASF project and may contribute to it as well. In short,
> collaboration between the two communities will make both projects more
> vibrant and advance the respective technologies in potentially novel
> directions.
>
> Contributors may also look at the HAWQ project as a starting point for
> ports to other parallel database engines. This proposal highly
> encourages this type of work as it would help to further realize the
> original cross-platform goal of MADlib as envisioned by its
> originators.
>
> Thus, the goal of this proposal is to bring the existing MADlib open
> source community into ASF, change the project's governance model to
> the "Apache Way" and transition the project's codebase and
> infrastructure into ASF INFRA. The community has agreed to transfer
> the brand name "MADlib" to Apache Software Foundation as well.
>
> Pivotal Inc. on behalf of the MADlib open source community is
> submitting this proposal to transition source code and associated
> artifacts (documentation, web site content, wiki, etc.) to the Apache
> Software Foundation Incubator under the Apache License, Version 2.0
> and is asking Incubator PMC to establish a MADlib incubating
> project.
>
> Currently MADlib uses a few category X licensed software tools during
> its build (mostly for generating documentation):
>   * doxypy 0.4.2 (GPL)
>   * doxygen 1.8.4 (GPL)
>   * TikZ-UML
>   * bison 2.4 (GPL, with an exception for generated output)
> We feel that this usage is compatible with an overall project licensed
> under the ALv2 and don't anticipate any changes.
> Our usage of LGPL library cern_root-5.34 is expected to go away since
> the 2 cern modules used are being entirely re-written
> in MADlib
>
> Finally, MADlib inclusion of MPL licensed library (eigen 3.2.2) into
> its binary artifact seems to be consistent with
> ASF recommendation for managing "weak copyleft" dependencies.
>
>
> == Background ==
> MADlib grew out of discussions between database engine developers,
> data scientists, IT architects and academics interested in new
> approaches to scalable, sophisticated in-database analytics. These
> discussions were written up in a paper in VLDB 2009 that coined the
> term “MAD Skills” for data analysis
> (http://dl.acm.org/citation.cfm?id=1687576). The MADlib software
> project began the following year as a collaboration between
> researchers at UC Berkeley and engineers and data scientists at
> Pivotal (former EMC/Greenplum).
>
> The initial MADlib codebase came from EMC/Greenplum, UC Berkeley, the
> University of Wisconsin, and the University of Florida. The project
> was publicly documented in a paper at VLDB 2012
> (http://vldb.org/pvldb/vol5/p1700_joehellerstein_vldb2012.pdf). Today
> MADlib has contributors from around the world including both
> individuals and institutions. For example, recent contributions have
> come from Pivotal, Stanford University, and the University of Illinois
> at Chicago.
>
> MADlib was conceived from the outset as a free, open source library
> for all to use and contribute to. Since its inception, the community
> has steadily added new
Re: [VOTE] Accept MADlib into the Apache Incubator
+1
Re: [VOTE] Accept MADlib into the Apache Incubator
+1

On Fri, Sep 11, 2015 at 9:11 PM, Skip Intro wrote:
> +1 (non-binding)
Re: [VOTE] Accept MADlib into the Apache Incubator
+1 nonbinding

> On Sep 11, 2015, at 9:43 PM, Gregory Chase wrote:
>
> +1 nonbinding
Re: [VOTE] Accept MADlib into the Apache Incubator
+1 nonbinding

On Fri, Sep 11, 2015 at 8:12 AM, Chris Rawles wrote:
> --
> Chris

--
Greg Chase

Director of Big Data Communities
http://www.pivotal.io/big-data

Pivotal Software
http://www.pivotal.io/

650-215-0477
@GregChase
Blog: http://geekmarketing.biz/
Re: [VOTE] Accept MADlib into the Apache Incubator
+1 (non binding)

On 9/11/15, 10:02 AM, "Gautam Muralidhar" wrote:
> +1 nonbinding
Re: [VOTE] Accept MADlib into the Apache Incubator
+1 (non-binding)
Re: [VOTE] Accept MADlib into the Apache Incubator
+1
Re: [VOTE] Accept MADlib into the Apache Incubator
+1

> On 12 Sep 2015, at 5:00 am, Don Bosco Durai wrote:
>
> +1 (non binding)
Re: [VOTE] Accept MADlib into the Apache Incubator
+1 (non-binding)
Re: [VOTE] Accept MADlib into the Apache Incubator
-- Chris
Re: [VOTE] Accept MADlib into the Apache Incubator
+1 (non-binding)
Re: [VOTE] Accept MADlib into the Apache Incubator
On Wed, Sep 9, 2015 at 7:37 PM, Roman Shaposhnik wrote:
> Following the discussion earlier:
> http://s.apache.org/TE6
>
> I would like to call a VOTE for accepting
> MADlib community as a new ASF incubator
> project.
>
> The proposal is available at:
> https://wiki.apache.org/incubator/MADlibProposal
> and is also included at the bottom of this email.
>
> Vote is open until at least Mon, 14 September 2015, 23:59:00 PST
>
> [ ] +1 accept MADlib into the Apache Incubator
> [ ] ±0
> [ ] -1 because...

+1 (binding)

Thanks,
Roman.
Re: [VOTE] Accept MADlib into the Apache Incubator
+1 (non-binding)

On Thu, Sep 10, 2015 at 12:53 PM, Rahul Iyer wrote:
> +1 (non-binding)
Re: [VOTE] Accept MADlib into the Apache Incubator
+1 (non-binding).
Re: [VOTE] Accept MADlib into the Apache Incubator
+1 (non-binding)

On Thu, Sep 10, 2015 at 3:57 PM, Caleb Welton wrote:
> +1 (non-binding)
>
> On Thu, Sep 10, 2015 at 12:53 PM, Rahul Iyer wrote:
> > +1 (non-binding)
Re: [VOTE] Accept MADlib into the Apache Incubator
+1 (binding)

Julian
Re: [VOTE] Accept MADlib into the Apache Incubator
+1 (not binding)

On Thu, Sep 10, 2015 at 5:19 AM, Atri Sharma wrote:
> +1
> On 10 Sep 2015 08:11, "Konstantin Boudnik" wrote:
> > +1 (binding)
[VOTE] Accept MADlib into the Apache Incubator
Following the discussion earlier:
   http://s.apache.org/TE6

I would like to call a VOTE for accepting
MADlib community as a new ASF incubator
project.

The proposal is available at:
   https://wiki.apache.org/incubator/MADlibProposal
and is also included at the bottom of this email.

Vote is open until at least Mon, 14 September 2015, 23:59:00 PST

[ ] +1 accept MADlib into the Apache Incubator
[ ] ±0
[ ] -1 because...

Thanks,
Roman.

== Abstract ==
MADlib is an open-source library (licensed under 2-clause BSD license)
for scalable in-database analytics. It provides data-parallel
implementations of mathematical, statistical and machine learning
methods for structured and unstructured data. The MADlib mission is to
foster widespread development of scalable analytic skills, by
harnessing efforts from commercial practice, academic research, and
open source development.

MADlib occupies a unique niche in the realm of data science and
machine learning libraries since its SQL APIs can allow it to work on
a wide range of data stores and SQL engines.

== Proposal ==
The current open source community behind MADlib feels that aligning
itself with HAWQ's community, governance model, infrastructure and
roadmap will allow the project to accelerate adoption and community
growth. Given HAWQ's trajectory of entering Apache Software Foundation
family as an Incubating project, we feel that the best course of
action for MADlib is to follow a similar route.

MADlib and HAWQ are complementary technologies in that MADlib
in-database analytical functions can run within the HAWQ execution
engine. (MADlib also runs on Greenplum Database and PostgreSQL today.)
It is expected that contributors to MADlib will be cognizant of the
HAWQ ASF project and may contribute to it as well. In short,
collaboration between the two communities will make both projects more
vibrant and advance the respective technologies in potentially novel
directions.

Contributors may also look at the HAWQ project as a starting point for
ports to other parallel database engines. This proposal highly
encourages this type of work as it would help to further realize the
original cross-platform goal of MADlib as envisioned by its
originators.

Thus, the goal of this proposal is to bring the existing MADlib open
source community into ASF, change the project's governance model to
the "Apache Way" and transition the project's codebase and
infrastructure into ASF INFRA. The community has agreed to transfer
the brand name "MADlib" to Apache Software Foundation as well.

Pivotal Inc. on behalf of the MADlib open source community is
submitting this proposal to transition source code and associated
artifacts (documentation, web site content, wiki, etc.) to the Apache
Software Foundation Incubator under the Apache License, Version 2.0
and is asking Incubator PMC to establish a MADlib incubating
project.

Currently MADlib uses a few category X licensed software tools during
its build (mostly for generating documentation):
   * doxypy 0.4.2 (GPL)
   * doxygen 1.8.4 (GPL)
   * TikZ-UML
   * bison 2.4 (GPL, with an exception for generated output)
We feel that this usage is compatible with an overall project licensed
under the ALv2 and don't anticipate any changes.
Our usage of LGPL library cern_root-5.34 is expected to go away since
the 2 cern modules used are being entirely re-written
in MADlib.

Finally, MADlib's inclusion of an MPL-licensed library (eigen 3.2.2) in
its binary artifact seems to be consistent with the
ASF recommendation for managing "weak copyleft" dependencies.

== Background ==
MADlib grew out of discussions between database engine developers,
data scientists, IT architects and academics interested in new
approaches to scalable, sophisticated in-database analytics. These
discussions were written up in a paper in VLDB 2009 that coined the
term “MAD Skills” for data analysis
(http://dl.acm.org/citation.cfm?id=1687576).
The MADlib software
project began the following year as a collaboration between
researchers at UC Berkeley and engineers and data scientists at
Pivotal (formerly EMC/Greenplum).

The initial MADlib codebase came from EMC/Greenplum, UC Berkeley, the
University of Wisconsin, and the University of Florida. The project
was publicly documented in a paper at VLDB 2012
(http://vldb.org/pvldb/vol5/p1700_joehellerstein_vldb2012.pdf). Today
MADlib has contributors from around the world including both
individuals and institutions. For example, recent contributions have
come from Pivotal, Stanford University, and the University of Illinois
at Chicago.

MADlib was conceived from the outset as a free, open source library
for all to use and contribute to. Since its inception, the
community has steadily added new methods in the areas of mathematics,
statistics, machine learning, and data transformation. The current
library includes over 30 principal algorithms as well as many
additional operators and utility functions. The methods in MADlib are designed
Re: [VOTE] Accept MADlib into the Apache Incubator
+1 (binding)
Re: [VOTE] Accept MADlib into the Apache Incubator
+1

On 10 Sep 2015 08:11, "Konstantin Boudnik" wrote:
> +1 (binding)