+1 (non-binding)

On 2024/03/01 18:08:26 Daniël Heres wrote:
> +1 (binding)
> 
> On Fri, Mar 1, 2024, 19:05 Chao Sun <sunc...@apache.org> wrote:
> 
> > +1 (non-binding)
> >
> > On Fri, Mar 1, 2024 at 9:59 AM QP Hou <q...@neuralink.com> wrote:
> >
> > > +1 (binding)
> > >
> > > exciting milestone :)
> > >
> > > On Fri, Mar 1, 2024 at 9:49 AM David Li <lidav...@apache.org> wrote:
> > > >
> > > > +1
> > > >
> > > > On Fri, Mar 1, 2024, at 12:06, Jorge Cardoso Leitão wrote:
> > > > > +1 - great work!!!
> > > > >
> > > > > On Fri, Mar 1, 2024 at 5:49 PM Micah Kornfield <
> > emkornfi...@gmail.com>
> > > > > wrote:
> > > > >
> > > > >> +1 (binding)
> > > > >>
> > > > >> On Friday, March 1, 2024, Uwe L. Korn <uw...@xhochy.com> wrote:
> > > > >>
> > > > >> > +1 (binding)
> > > > >> >
> > > > >> > On Fri, Mar 1, 2024, at 2:37 PM, Andy Grove wrote:
> > > > >> > > +1 (binding)
> > > > >> > >
> > > > >> > > On Fri, Mar 1, 2024 at 6:20 AM Weston Pace <
> > weston.p...@gmail.com
> > > >
> > > > >> > wrote:
> > > > >> > >
> > > > >> > >> +1 (binding)
> > > > >> > >>
> > > > >> > >> On Fri, Mar 1, 2024 at 3:33 AM Andrew Lamb <
> > al...@influxdata.com
> > > >
> > > > >> > wrote:
> > > > >> > >>
> > > > >> > >> > Hello,
> > > > >> > >> >
> > > > >> > >> > As we have discussed[1][2] I would like to vote on the
> > > proposal to
> > > > >> > >> > create a new Apache Top Level Project for DataFusion. The
> > text
> > > of
> > > > >> the
> > > > >> > >> > proposed resolution and background document is copy/pasted
> > > below
> > > > >> > >> >
> > > > >> > >> > If the community is in favor of this, we plan to submit the
> > > > >> resolution
> > > > >> > >> > to the ASF board for approval with the next Arrow report (for
> > > the
> > > > >> > >> > April 2024 board meeting).
> > > > >> > >> >
> > > > >> > >> > The vote will be open for at least 7 days.
> > > > >> > >> >
> > > > >> > >> > [ ] +1 Accept this Proposal
> > > > >> > >> > [ ] +0
> > > > >> > >> > [ ] -1 Do not accept this proposal because...
> > > > >> > >> >
> > > > >> > >> > Andrew
> > > > >> > >> >
> > > > >> > >> > [1]
> > > > >> https://lists.apache.org/thread/c150t1s1x0kcb3r03cjyx31kqs5oc341
> > > > >> > >> > [2]
> > > https://github.com/apache/arrow-datafusion/discussions/6475
> > > > >> > >> >
> > > > >> > >> > ---------- Proposed Resolution ---------
> > > > >> > >> >
> > > > >> > >> > Resolution to Create the Apache DataFusion Project from the
> > > Apache
> > > > >> > >> > Arrow DataFusion Sub Project
> > > > >> > >> >
> > > > >> > >> > =============================================================
> > > > >> > >> >
> > > > >> > >> > X. Establish the Apache DataFusion Project
> > > > >> > >> >
> > > > >> > >> > WHEREAS, the Board of Directors deems it to be in the best
> > > > >> > >> > interests of the Foundation and consistent with the
> > > > >> > >> > Foundation's purpose to establish a Project Management
> > > > >> > >> > Committee charged with the creation and maintenance of
> > > > >> > >> > open-source software related to an extensible query engine
> > > > >> > >> > for distribution at no charge to the public.
> > > > >> > >> >
> > > > >> > >> > NOW, THEREFORE, BE IT RESOLVED, that a Project Management
> > > > >> > >> > Committee (PMC), to be known as the "Apache DataFusion
> > > Project",
> > > > >> > >> > be and hereby is established pursuant to Bylaws of the
> > > > >> > >> > Foundation; and be it further
> > > > >> > >> >
> > > > >> > >> > RESOLVED, that the Apache DataFusion Project be and hereby is
> > > > >> > >> > responsible for the creation and maintenance of software
> > > > >> > >> > related to an extensible query engine; and be it further
> > > > >> > >> >
> > > > >> > >> > RESOLVED, that the office of "Vice President, Apache
> > > DataFusion" be
> > > > >> > >> > and hereby is created, the person holding such office to
> > > > >> > >> > serve at the direction of the Board of Directors as the chair
> > > > >> > >> > of the Apache DataFusion Project, and to have primary
> > > responsibility
> > > > >> > >> > for management of the projects within the scope of
> > > > >> > >> > responsibility of the Apache DataFusion Project; and be it
> > > further
> > > > >> > >> >
> > > > >> > >> > RESOLVED, that the persons listed immediately below be and
> > > > >> > >> > hereby are appointed to serve as the initial members of the
> > > > >> > >> > Apache DataFusion Project:
> > > > >> > >> >
> > > > >> > >> > * Andy Grove (agr...@apache.org)
> > > > >> > >> > * Andrew Lamb (al...@apache.org)
> > > > >> > >> > * Daniël Heres (dhe...@apache.org)
> > > > >> > >> > * Jie Wen (jake...@apache.org)
> > > > >> > >> > * Kun Liu (liu...@apache.org)
> > > > >> > >> > * Liang-Chi Hsieh (vii...@apache.org)
> > > > > >> > >> > * Qingping Hou (ho...@apache.org)
> > > > > >> > >> > * Wes McKinney (w...@apache.org)
> > > > >> > >> > * Will Jones (wjones...@apache.org)
> > > > >> > >> >
> > > > >> > >> > RESOLVED, that the Apache DataFusion Project be and hereby
> > > > >> > >> > is tasked with the migration and rationalization of the
> > Apache
> > > > >> > >> > Arrow DataFusion sub-project; and be it further
> > > > >> > >> >
> > > > >> > >> > RESOLVED, that all responsibilities pertaining to the Apache
> > > > >> > >> > Arrow DataFusion sub-project encumbered upon the
> > > > >> > >> > Apache Arrow Project are hereafter discharged.
> > > > >> > >> >
> > > > >> > >> > NOW, THEREFORE, BE IT FURTHER RESOLVED, that Andrew Lamb
> > > > >> > >> > be appointed to the office of Vice President, Apache
> > > DataFusion, to
> > > > >> > >> > serve in accordance with and subject to the direction of the
> > > > >> > >> > Board of Directors and the Bylaws of the Foundation until
> > > > >> > >> > death, resignation, retirement, removal or disqualification,
> > > > >> > >> > or until a successor is appointed.
> > > > >> > >> > =============================================================
> > > > >> > >> >
> > > > >> > >> >
> > > > >> > >> > -------
> > > > >> > >> >
> > > > >> > >> >
> > > > >> > >> > Summary:
> > > > >> > >> >
> > > > >> > >> > We propose creating a new top level project, Apache
> > > DataFusion, from
> > > > >> > >> > an existing sub project of Apache Arrow to facilitate
> > > additional
> > > > >> > >> > community and project growth.
> > > > >> > >> >
> > > > >> > >> > Abstract
> > > > >> > >> >
> > > > >> > >> > Apache Arrow DataFusion[1]  is a very fast, extensible query
> > > engine
> > > > >> > >> > for building high-quality data-centric systems in Rust, using
> > > the
> > > > >> > >> > Apache Arrow in-memory format. DataFusion offers SQL and
> > > Dataframe
> > > > >> > >> > APIs, excellent performance, built-in support for CSV,
> > Parquet,
> > > > >> JSON,
> > > > >> > >> > and Avro, extensive customization, and a great community.
> > > > >> > >> >
> > > > >> > >> > [1] https://arrow.apache.org/datafusion/
> > > > >> > >> >
> > > > >> > >> >
> > > > >> > >> > Proposal
> > > > >> > >> >
> > > > >> > >> > We propose creating a new top level ASF project, Apache
> > > DataFusion,
> > > > >> > >> > governed initially by a subset of the Apache Arrow project’s
> > > PMC and
> > > > >> > >> > committers. The project’s code is in five existing git
> > > repositories,
> > > > >> > >> > currently governed by Apache Arrow which would transfer to
> > the
> > > new
> > > > >> top
> > > > >> > >> > level project.
> > > > >> > >> >
> > > > >> > >> > Background
> > > > >> > >> >
> > > > >> > >> > When DataFusion was initially donated to the Arrow project,
> > it
> > > did
> > > > >> not
> > > > >> > >> > have a strong enough community to stand on its own. It has
> > > since
> > > > >> grown
> > > > >> > >> > significantly, and benefited immensely from being part of
> > > Arrow and
> > > > >> > >> > nurturing of the Apache Way, and now has a community strong
> > > enough
> > > > >> to
> > > > >> > >> > stand on its own and that would benefit from focused
> > governance
> > > > >> > >> > attention.
> > > > >> > >> >
> > > > >> > >> > The community has discussed this idea publicly for more than
> > 6
> > > > >> months
> > > > >> > >> > https://github.com/apache/arrow-datafusion/discussions/6475
> > > and
> > > > >> > >> > briefly on the Arrow PMC mailing list
> > > > >> > >> >
> > > https://lists.apache.org/thread/thv2jdm6640l6gm88hy8jhk5prjww0cs.
> > > > >> As
> > > > >> > >> > of the time of this writing both had exclusively positive
> > > reactions.
> > > > >> > >> >
> > > > >> > >> > Several current members of the Arrow PMC are both active
> > > > >> contributors
> > > > >> > >> > to DataFusion and understand and believe deeply in the Apache
> > > Way,
> > > > >> and
> > > > >> > >> > play active governance roles in the Arrow project as PMC
> > > members and
> > > > >> > >> > PMC chairs, guiding the community, and releasing software
> > > versions.
> > > > >> > >> > With this existing governance experience and structure, the
> > > new top
> > > > >> > >> > level project will be able to function well immediately and
> > > > >> > >> > independently.
> > > > >> > >> >
> > > > >> > >> > Overview of DataFusion
> > > > >> > >> >
> > > > >> > >> > Current Status
> > > > >> > >> >
> > > > >> > >> > Meritocracy
> > > > >> > >> >
> > > > >> > >> > DataFusion has been developed as part of Apache Arrow and
> > thus
> > > has
> > > > >> > >> > been operating as a meritocracy. Many of the developers of
> > > > >> DataFusion
> > > > >> > >> > are Arrow PMC members or committers. The DataFusion project
> > > plans to
> > > > > >> > >> > continue adding new PMC members and committers as the project matures
> > > and
> > > > >> > >> > grows.
> > > > >> > >> >
> > > > >> > >> > Community
> > > > >> > >> >
> > > > >> > >> > The DataFusion development team seeks to foster the
> > > development and
> > > > >> > >> > user communities. We hope that becoming a separate project
> > > will help
> > > > >> > >> > both Arrow and DataFusion communities by being more focused.
> > > > >> Focused
> > > > >> > >> > governance will make it easier to grow the community of
> > > committers
> > > > >> and
> > > > >> > >> > PMC members and make the organization more clear to others.
> > > > >> > >> >
> > > > >> > >> > Alignment
> > > > >> > >> >
> > > > >> > >> > The ASF is a natural host for DataFusion given that it is
> > > already
> > > > >> the
> > > > > >> > >> > home of Arrow, Parquet, and other related distributed systems,
> > > > >> storage
> > > > >> > >> > and query execution systems.
> > > > >> > >> >
> > > > >> > >> > Project Leadership
> > > > >> > >> >
> > > > >> > >> > Proposed Initial PMC
> > > > >> > >> >
> > > > >> > >> > We propose the following people as the initial DataFusion PMC
> > > > >> members.
> > > > >> > >> > This is a subset of the existing Arrow PMC members who
> > > contribute to
> > > > >> > >> > DataFusion
> > https://people.apache.org/phonebook.html?unix=arrow
> > > > >> > >> >
> > > > > >> > >> > Andy Grove (agrove): Arrow PMC Chair
> > > > > >> > >> > Andrew Lamb (alamb): Arrow PMC, past Arrow PMC Chair
> > > > > >> > >> > Daniël Heres (dheres): Arrow PMC
> > > > > >> > >> > Jie Wen (jakevin): Arrow PMC, Doris Committer
> > > > > >> > >> > Kun Liu (liukun): Arrow PMC, IoTDB PMC, TSFile PMC
> > > > > >> > >> > Liang-Chi Hsieh (viirya): Arrow PMC, Spark PMC
> > > > > >> > >> > Qingping Hou (houqp): Arrow PMC
> > > > > >> > >> > Wes McKinney (wesm): Arrow PMC, ASF Member
> > > > > >> > >> > Will Jones (wjones127): Arrow PMC
> > > > >> > >> >
> > > > > >> > >> > We’d like to propose Andrew Lamb as the initial Chair (and
> > > thus ASF
> > > > >> > >> > VP) for the DataFusion project.
> > > > >> > >> >
> > > > >> > >> > Affiliations
> > > > >> > >> >
> > > > > >> > >> > Andy Grove (agrove): NVidia
> > > > > >> > >> > Andrew Lamb (alamb): InfluxData
> > > > > >> > >> > Daniël Heres (dheres): Coralogix
> > > > > >> > >> > Jie Wen (jakevin): SelectDB
> > > > > >> > >> > Kun Liu (liukun): Ebay
> > > > > >> > >> > Liang-Chi Hsieh (viirya): Apple
> > > > > >> > >> > Qingping Hou (houqp): Scribd
> > > > > >> > >> > Wes McKinney (wesm): Posit
> > > > > >> > >> > Will Jones (wjones127): LanceDB
> > > > >> > >> >
> > > > >> > >> > Proposed Initial Committers
> > > > >> > >> >
> > > > >> > >> > In addition to the PMC, we propose the following people as
> > the
> > > > >> initial
> > > > >> > >> > DataFusion committers. This is a subset of the existing Arrow
> > > > >> > >> > committers who contribute to DataFusion
> > > > >> > >> > https://people.apache.org/phonebook.html?unix=arrow
> > > > >> > >> >
> > > > >> > >> > akurmustafa Mustafa Akur (Synnada)
> > > > >> > >> > avantgardner Brent Gardner (Coralogix)
> > > > >> > >> > comphead Oleks V. (Unaffiliated)
> > > > >> > >> > jayzhan Jay Zhan (Unaffiliated)
> > > > > >> > >> > jeffreyvo Jeffrey Vo (Unaffiliated)
> > > > >> > >> > jiayuliu Liu Jiayu (Airbnb)
> > > > >> > >> > mete Metehan Yildirim (Synnada)
> > > > >> > >> > mingmwang Wang Mingming (Ebay)
> > > > >> > >> > mneumann Marco Neumann (InfluxData)
> > > > >> > >> > nju_yaho Zhong Yanghong (Ebay)
> > > > >> > >> > ozankabak Mehmet Ozan Kabak (Synnada)
> > > > >> > >> > paddyhoran Paddy Horan (Assured Allies)
> > > > >> > >> > rdettai Rémi Dettai (Cloudfuse)
> > > > >> > >> > sunchao Chao Sun (Apple)
> > > > >> > >> > thinkharderdev Daniel Harris (Coralogix)
> > > > >> > >> > tustvold Raphael Taylor-Davies (InfluxData)
> > > > >> > >> > wayne Ruihang Xia (Greptime)
> > > > >> > >> > xudong963 Xudong Wang (ByteDance)
> > > > >> > >> > yjshen Yijie Shen (Space and Time)
> > > > > >> > >> > yangjiang Yang Jiang (Ebay)
> > > > >> > >> >
> > > > >> > >> >
> > > > >> > >> > Risk Assessments
> > > > >> > >> >
> > > > >> > >> > Naming / Trademarks
> > > > >> > >> >
> > > > >> > >> > As a sub-project of Arrow, the DataFusion name has been used
> > > for
> > > > >> over
> > > > >> > >> > 4 years without any known issues. A podling name search did
> > > not turn
> > > > >> > >> > up any concerns and was approved:
> > > > >> > >> > https://issues.apache.org/jira/browse/PODLINGNAMESEARCH-219
> > > > >> > >> >
> > > > >> > >> > Legal / IP Clearance
> > > > >> > >> >
> > > > >> > >> > All DataFusion code has either been donated to the Arrow
> > > project
> > > > >> with
> > > > > >> > >> > appropriate IP clearance or has been developed directly
> > under
> > > ASF
> > > > >> > >> > processes and procedures. Thus creating a new top level
> > project
> > > > >> poses
> > > > >> > >> > no new Legal or IP risks.
> > > > >> > >> >
> > > > >> > >> > Code Extraction
> > > > >> > >> >
> > > > >> > >> > The relevant code is already in 5 separate repositories:
> > > > >> > >> > https://github.com/apache/arrow-datafusion/
> > > > >> > >> > https://github.com/apache/arrow-datafusion-python
> > > > >> > >> > https://github.com/apache/arrow-ballista
> > > > >> > >> > https://github.com/apache/arrow-ballista-python
> > > > >> > >> > https://github.com/apache/arrow-datafusion-comet
> > > > >> > >> >
> > > > >> > >> > We foresee no issues with code extraction and propose these
> > > > > >> > >> > repositories be renamed to reflect the new top level project.
> > > > >> > >> >
> > > > >> > >> > Note:  https://github.com/apache/arrow-rs, the Rust
> > > implementation
> > > > >> of
> > > > >> > >> > Arrow, would remain part of the Arrow project.
> > > > >> > >> >
> > > > >> > >> > Orphaned Products
> > > > >> > >> >
> > > > >> > >> > DataFusion is known to be used in many open source and
> > > commercial
> > > > >> > >> > projects
> > > > >> > >> >
> > > > >> > >> https://arrow.apache.org/datafusion/user-guide/
> > > > >> > introduction.html#known-users
> > > > >> > >> > ,
> > > > >> > >> > has had multiple commits daily for several years, and its
> > > adoption
> > > > >> and
> > > > > >> > >> > number of contributors appear to be growing. We do not
> > > foresee the
> > > > >> > >> > project being orphaned in the next several years.
> > > > >> > >> >
> > > > >> > >> > Inexperience with Open Source
> > > > >> > >> >
> > > > >> > >> > The proposed PMC has extensive experience with Apache Arrow
> > and
> > > > >> other
> > > > >> > >> > Apache projects, and includes PMC members, PMC chairs and an
> > > ASF
> > > > >> > >> > Member. The DataFusion PMC and more experienced committers
> > will
> > > > >> > >> > continue to coach new community members who may be less
> > > familiar
> > > > >> with
> > > > >> > >> > the Apache Way.
> > > > >> > >> >
> > > > >> > >> > Homogeneous Developers
> > > > >> > >> >
> > > > >> > >> > The 9 proposed PMC members are from 9 different employers and
> > > the
> > > > >> > >> > proposed committers are similarly distributed across
> > > affiliations.
> > > > >> No
> > > > >> > >> > specific entity employs more than 3 total proposed
> > developers.
> > > > >> > >> >
> > > > >> > >> > Reliance on Salaried Developers
> > > > >> > >> >
> > > > >> > >> > A substantial amount of work on DataFusion has been by
> > salaried
> > > > >> > >> > developers, but it also has a long tradition of attracting
> > > > >> > >> > contributions from students and hobbyists and we plan no
> > > changes in
> > > > >> > >> > contribution structure.
> > > > >> > >> >
> > > > >> > >> > Relationships with Other Apache Products
> > > > >> > >> >
> > > > >> > >> > DataFusion will obviously have a strong relationship with the
> > > Arrow
> > > > >> > >> > project given the overlap in people. We don’t foresee close
> > > > >> > >> > collaboration with other projects at this time.
> > > > >> > >> >
> > > > >> > >> > Cryptography
> > > > >> > >> >
> > > > >> > >> > DataFusion does not directly support encryption and there are
> > > no
> > > > >> > >> > near-term plans to add support for encryption. Users who need
> > > this
> > > > >> > >> > functionality can use the extension APIs.
> > > > >> > >> >
> > > > >> > >> > Required Resources
> > > > >> > >> >
> > > > >> > >> > Mailing Lists
> > > > >> > >> >
> > > > >> > >> > - priv...@datafusion.apache.org for private PMC discussions
> > > (with
> > > > >> > >> > moderated subscriptions)
> > > > >> > >> > - d...@datafusion.apache.org
> > > > >> > >> > - comm...@datafusion.apache.org
> > > > >> > >> > - u...@datafusion.apache.org
> > > > >> > >> >
> > > > >> > >> > Version Control
> > > > >> > >> >
> > > > >> > >> > We propose to continue to use git for source control and
> > > github for
> > > > >> > >> > hosting and testing resources.
> > > > >> > >> >
> > > > >> > >> > We also need to rename the github repositories to reflect the
> > > new
> > > > >> top
> > > > >> > >> > level names:
> > > > >> > >> >
> > > > >> > >> > https://github.com/apache/arrow-datafusion/ →
> > > apache/datafusion
> > > > >> > >> > https://github.com/apache/arrow-datafusion-python →
> > > > >> > >> > apache/datafusion-python
> > > > >> > >> > https://github.com/apache/arrow-ballista →
> > > > >> apache/datafusion-ballista
> > > > >> > >> > https://github.com/apache/arrow-ballista-python  →
> > > > >> > >> > apache/datafusion-ballista-python
> > > > >> > >> > https://github.com/apache/arrow-datafusion-comet →
> > > > >> > >> apache/datafusion-comet
> > > > >> > >> >
> > > > >> > >> >
> > > > >> > >> >
> > > > >> > >> > Issue Tracking
> > > > >> > >> >
> > > > >> > >> > DataFusion would continue to use github for its issue
> > tracking
> > > and
> > > > >> > >> > communications
> > > > >> > >> >
> > > > >> > >> > Other Resources
> > > > >> > >> >
> > > > >> > >> > The existing repositories already make use of existing Apache
> > > > >> > >> > infrastructure, and we expect no change in the initial
> > resource
> > > > >> usage.
> > > > >> > >> > As the project continues to grow, we expect continued
> > > infrastructure
> > > > >> > >> > demand growth.
> > > > >> > >> >
> > > > >> > >> >
> > > > >> > >> > FAQ: Has a sub project been promoted to a top level project
> > > before?
> > > > >> > >> >
> > > > >> > >> > Yes, and it appears to happen commonly. The Arrow project
> > > itself was
> > > > >> > >> > created as a top level project from work that started in
> > Apache
> > > > >> Drill,
> > > > >> > >> > and there are many sub projects of Hadoop that spun out as
> > > their own
> > > > >> > >> > top level projects such as Mahout, Avro and HBase:
> > > > >> > >> >
> > > > >> > >> >
> > > > > >> > >> > https://news.apache.org/foundation/entry/the_apache_software_foundation_announces4
> > > > >> > >> >
> > > > >> > >> >
> > > > >> > >> >
> > > > >> > >> > Related material:
> > > > >> > >> > Name search request / research for DataFusion:
> > > > >> > >> > https://issues.apache.org/jira/browse/PODLINGNAMESEARCH-219
> > > > >> > >> > Discussion about this proposal on the arrow mailing list:
> > > > >> > >> >
> > > https://lists.apache.org/thread/c150t1s1x0kcb3r03cjyx31kqs5oc341
> > > > >> > >> > Discussion about which repositories on the arrow mailing
> > list:
> > > > >> > >> >
> > > https://lists.apache.org/thread/ob3n0d9ky0bgrryl3xn39w9k566bq00q
> > > > >> > >> > Discussion about initial PMC on the arrow mailing list:
> > > > >> > >> >
> > > https://lists.apache.org/thread/pymrzcdw4qdptvby85f69rg3pcckl15b
> > > > >> > >> > Discussion in github about creating a new DataFusion top
> > level
> > > > >> > >> > project:
> > > > >> https://github.com/apache/arrow-datafusion/discussions/6475
> > > > >> > >> > Discussion about graduating on incubator list:
> > > > >> > >> >
> > > https://lists.apache.org/thread/r4n73pmms1lv0jbohyx1o1z13d615t99
> > > > >> > >> > Original Proposal for the Arrow project:
> > > > >> > >> >
> > > https://lists.apache.org/thread/x2qzdwglm8pkqp9gv03bbgw17khl7pq3
> > > > >> > >> >
> > > > >> > >>
> > > > >> >
> > > > >>
> > >
> >
> 
