+1 (binding)

On Sun, Mar 3, 2024 at 09:43 Wayne Xia <waynest...@gmail.com> wrote:

> +1 (non-binding)
>
> Regards,
> Wayne
>
> On Mon, Mar 4, 2024 at 1:38 AM Julian Hyde <jhyde.apa...@gmail.com> wrote:
>
> > +1 (binding)
> >
> > > On Mar 2, 2024, at 2:28 PM, Dewey Dunnington <de...@voltrondata.com.invalid> wrote:
> > >
> > > +1 (binding)
> > >
> > >> On Sat, Mar 2, 2024 at 8:08 AM vin jake <jakevin...@gmail.com> wrote:
> > >>
> > >> +1 (binding)
> > >>
> > >>> On Fri, Mar 1, 2024 at 7:33 PM Andrew Lamb <al...@influxdata.com> wrote:
> > >>>
> > >>> Hello,
> > >>>
> > >>> As we have discussed[1][2] I would like to vote on the proposal to
> > >>> create a new Apache Top Level Project for DataFusion. The text of the
> > >>> proposed resolution and background document is copy/pasted below.
> > >>>
> > >>> If the community is in favor of this, we plan to submit the
> > >>> resolution to the ASF board for approval with the next Arrow report
> > >>> (for the April 2024 board meeting).
> > >>>
> > >>> The vote will be open for at least 7 days.
> > >>>
> > >>> [ ] +1 Accept this Proposal
> > >>> [ ] +0
> > >>> [ ] -1 Do not accept this proposal because...
> > >>>
> > >>> Andrew
> > >>>
> > >>> [1] https://lists.apache.org/thread/c150t1s1x0kcb3r03cjyx31kqs5oc341
> > >>> [2] https://github.com/apache/arrow-datafusion/discussions/6475
> > >>>
> > >>> ---------- Proposed Resolution ---------
> > >>>
> > >>> Resolution to Create the Apache DataFusion Project from the Apache
> > >>> Arrow DataFusion Sub Project
> > >>>
> > >>> =============================================================
> > >>>
> > >>> X. Establish the Apache DataFusion Project
> > >>>
> > >>> WHEREAS, the Board of Directors deems it to be in the best
> > >>> interests of the Foundation and consistent with the
> > >>> Foundation's purpose to establish a Project Management
> > >>> Committee charged with the creation and maintenance of
> > >>> open-source software related to an extensible query engine
> > >>> for distribution at no charge to the public.
> > >>>
> > >>> NOW, THEREFORE, BE IT RESOLVED, that a Project Management
> > >>> Committee (PMC), to be known as the "Apache DataFusion Project",
> > >>> be and hereby is established pursuant to Bylaws of the
> > >>> Foundation; and be it further
> > >>>
> > >>> RESOLVED, that the Apache DataFusion Project be and hereby is
> > >>> responsible for the creation and maintenance of software
> > >>> related to an extensible query engine; and be it further
> > >>>
> > >>> RESOLVED, that the office of "Vice President, Apache DataFusion" be
> > >>> and hereby is created, the person holding such office to
> > >>> serve at the direction of the Board of Directors as the chair
> > >>> of the Apache DataFusion Project, and to have primary responsibility
> > >>> for management of the projects within the scope of
> > >>> responsibility of the Apache DataFusion Project; and be it further
> > >>>
> > >>> RESOLVED, that the persons listed immediately below be and
> > >>> hereby are appointed to serve as the initial members of the
> > >>> Apache DataFusion Project:
> > >>>
> > >>> * Andy Grove (agr...@apache.org)
> > >>> * Andrew Lamb (al...@apache.org)
> > >>> * Daniël Heres (dhe...@apache.org)
> > >>> * Jie Wen (jake...@apache.org)
> > >>> * Kun Liu (liu...@apache.org)
> > >>> * Liang-Chi Hsieh (vii...@apache.org)
> > >>> * Qingping Hou (ho...@apache.org)
> > >>> * Wes McKinney (w...@apache.org)
> > >>> * Will Jones (wjones...@apache.org)
> > >>>
> > >>> RESOLVED, that the Apache DataFusion Project be and hereby
> > >>> is tasked with the migration and rationalization of the Apache
> > >>> Arrow DataFusion sub-project; and be it further
> > >>>
> > >>> RESOLVED, that all responsibilities pertaining to the Apache
> > >>> Arrow DataFusion sub-project encumbered upon the
> > >>> Apache Arrow Project are hereafter discharged.
> > >>>
> > >>> NOW, THEREFORE, BE IT FURTHER RESOLVED, that Andrew Lamb
> > >>> be appointed to the office of Vice President, Apache DataFusion, to
> > >>> serve in accordance with and subject to the direction of the
> > >>> Board of Directors and the Bylaws of the Foundation until
> > >>> death, resignation, retirement, removal or disqualification,
> > >>> or until a successor is appointed.
> > >>> =============================================================
> > >>>
> > >>>
> > >>> -------
> > >>>
> > >>>
> > >>> Summary:
> > >>>
> > >>> We propose creating a new top level project, Apache DataFusion, from
> > >>> an existing sub project of Apache Arrow to facilitate additional
> > >>> community and project growth.
> > >>>
> > >>> Abstract
> > >>>
> > >>> Apache Arrow DataFusion[1] is a very fast, extensible query engine
> > >>> for building high-quality data-centric systems in Rust, using the
> > >>> Apache Arrow in-memory format. DataFusion offers SQL and Dataframe
> > >>> APIs, excellent performance, built-in support for CSV, Parquet, JSON,
> > >>> and Avro, extensive customization, and a great community.
> > >>>
> > >>> [1] https://arrow.apache.org/datafusion/
> > >>>
> > >>>
> > >>> Proposal
> > >>>
> > >>> We propose creating a new top level ASF project, Apache DataFusion,
> > >>> governed initially by a subset of the Apache Arrow project’s PMC and
> > >>> committers. The project’s code is in five existing git repositories,
> > >>> currently governed by Apache Arrow, which would transfer to the new
> > >>> top level project.
> > >>>
> > >>> Background
> > >>>
> > >>> When DataFusion was initially donated to the Arrow project, it did
> > >>> not have a strong enough community to stand on its own. It has since
> > >>> grown significantly and benefited immensely from being part of Arrow
> > >>> and from the nurturing of the Apache Way. It now has a community
> > >>> strong enough to stand on its own, one that would benefit from
> > >>> focused governance attention.
> > >>>
> > >>> The community has discussed this idea publicly for more than 6 months
> > >>> on GitHub (https://github.com/apache/arrow-datafusion/discussions/6475) and
> > >>> briefly on the Arrow PMC mailing list
> > >>> (https://lists.apache.org/thread/thv2jdm6640l6gm88hy8jhk5prjww0cs). As
> > >>> of this writing, both have received exclusively positive reactions.
> > >>>
> > >>> Several current members of the Arrow PMC are active contributors to
> > >>> DataFusion, understand and believe deeply in the Apache Way, and
> > >>> play active governance roles in the Arrow project as PMC members and
> > >>> PMC chairs, guiding the community and releasing software versions.
> > >>> With this existing governance experience and structure, the new top
> > >>> level project will be able to function well immediately and
> > >>> independently.
> > >>>
> > >>> Overview of DataFusion
> > >>>
> > >>> Current Status
> > >>>
> > >>> Meritocracy
> > >>>
> > >>> DataFusion has been developed as part of Apache Arrow and thus has
> > >>> been operating as a meritocracy. Many of the developers of DataFusion
> > >>> are Arrow PMC members or committers. The DataFusion project plans to
> > >>> continue adding new PMC members and committers as the project matures and
> > >>> grows.
> > >>>
> > >>> Community
> > >>>
> > >>> The DataFusion development team seeks to foster the development and
> > >>> user communities. We hope that becoming a separate project will help
> > >>> both the Arrow and DataFusion communities by letting each be more
> > >>> focused. Focused governance will make it easier to grow the community
> > >>> of committers and PMC members and make the organization clearer to
> > >>> others.
> > >>>
> > >>> Alignment
> > >>>
> > >>> The ASF is a natural host for DataFusion given that it is already the
> > >>> home of Arrow, Parquet, and other related distributed systems, storage,
> > >>> and query execution projects.
> > >>>
> > >>> Project Leadership
> > >>>
> > >>> Proposed Initial PMC
> > >>>
> > >>> We propose the following people as the initial DataFusion PMC
> > >>> members. This is a subset of the existing Arrow PMC members who
> > >>> contribute to DataFusion:
> > >>> https://people.apache.org/phonebook.html?unix=arrow
> > >>>
> > >>> Andy Grove (agrove): Arrow PMC Chair
> > >>> Andrew Lamb (alamb): Arrow PMC, past Arrow PMC Chair
> > >>> Daniël Heres (dheres): Arrow PMC
> > >>> Jie Wen (jakevin): Arrow PMC, Doris Committer
> > >>> Kun Liu (liukun): Arrow PMC, IoTDB PMC, TSFile PMC
> > >>> Liang-Chi Hsieh (viirya): Arrow PMC, Spark PMC
> > >>> Qingping Hou (houqp): Arrow PMC
> > >>> Wes McKinney (wesm): Arrow PMC, ASF Member
> > >>> Will Jones (wjones127): Arrow PMC
> > >>>
> > >>> We’d like to propose Andrew Lamb as the initial Chair (and thus ASF
> > >>> VP) for the DataFusion project.
> > >>>
> > >>> Affiliations
> > >>>
> > >>> Andy Grove (agrove): NVIDIA
> > >>> Andrew Lamb (alamb): InfluxData
> > >>> Daniël Heres (dheres): Coralogix
> > >>> Jie Wen (jakevin): SelectDB
> > >>> Kun Liu (liukun): eBay
> > >>> Liang-Chi Hsieh (viirya): Apple
> > >>> Qingping Hou (houqp): Scribd
> > >>> Wes McKinney (wesm): Posit
> > >>> Will Jones (wjones127): LanceDB
> > >>>
> > >>> Proposed Initial Committers
> > >>>
> > >>> In addition to the PMC, we propose the following people as the
> > >>> initial DataFusion committers. This is a subset of the existing Arrow
> > >>> committers who contribute to DataFusion:
> > >>> https://people.apache.org/phonebook.html?unix=arrow
> > >>>
> > >>> akurmustafa Mustafa Akur (Synnada)
> > >>> avantgardner Brent Gardner (Coralogix)
> > >>> comphead Oleks V. (Unaffiliated)
> > >>> jayzhan Jay Zhan (Unaffiliated)
> > >>> jeffreyvo Jeffrey Vo (Unaffiliated)
> > >>> jiayuliu Liu Jiayu (Airbnb)
> > >>> mete Metehan Yildirim (Synnada)
> > >>> mingmwang Wang Mingming (eBay)
> > >>> mneumann Marco Neumann (InfluxData)
> > >>> nju_yaho Zhong Yanghong (eBay)
> > >>> ozankabak Mehmet Ozan Kabak (Synnada)
> > >>> paddyhoran Paddy Horan (Assured Allies)
> > >>> rdettai Rémi Dettai (Cloudfuse)
> > >>> sunchao Chao Sun (Apple)
> > >>> thinkharderdev Daniel Harris (Coralogix)
> > >>> tustvold Raphael Taylor-Davies (InfluxData)
> > >>> wayne Ruihang Xia (Greptime)
> > >>> xudong963 Xudong Wang (ByteDance)
> > >>> yjshen Yijie Shen (Space and Time)
> > >>> yangjiang Yang Jiang (eBay)
> > >>>
> > >>>
> > >>> Risk Assessments
> > >>>
> > >>> Naming / Trademarks
> > >>>
> > >>> As a sub-project of Arrow, the DataFusion name has been used for over
> > >>> 4 years without any known issues. A podling name search did not turn
> > >>> up any concerns and was approved:
> > >>> https://issues.apache.org/jira/browse/PODLINGNAMESEARCH-219
> > >>>
> > >>> Legal / IP Clearance
> > >>>
> > >>> All DataFusion code has either been donated to the Arrow project with
> > >>> appropriate IP clearance or has been developed directly under ASF
> > >>> processes and procedures. Thus creating a new top level project poses
> > >>> no new Legal or IP risks.
> > >>>
> > >>> Code Extraction
> > >>>
> > >>> The relevant code is already in 5 separate repositories:
> > >>> https://github.com/apache/arrow-datafusion/
> > >>> https://github.com/apache/arrow-datafusion-python
> > >>> https://github.com/apache/arrow-ballista
> > >>> https://github.com/apache/arrow-ballista-python
> > >>> https://github.com/apache/arrow-datafusion-comet
> > >>>
> > >>> We foresee no issues with code extraction and propose that these
> > >>> repositories be renamed to reflect the new top level project.
> > >>>
> > >>> Note: https://github.com/apache/arrow-rs, the Rust implementation of
> > >>> Arrow, would remain part of the Arrow project.
> > >>>
> > >>> Orphaned Products
> > >>>
> > >>> DataFusion is known to be used in many open source and commercial
> > >>> projects
> > >>> (https://arrow.apache.org/datafusion/user-guide/introduction.html#known-users),
> > >>> has had multiple commits daily for several years, and its adoption
> > >>> and number of contributors appear to be growing. We do not foresee the
> > >>> project being orphaned in the next several years.
> > >>>
> > >>> Inexperience with Open Source
> > >>>
> > >>> The proposed PMC has extensive experience with Apache Arrow and other
> > >>> Apache projects, and includes PMC members, PMC chairs and an ASF
> > >>> Member. The DataFusion PMC and more experienced committers will
> > >>> continue to coach new community members who may be less familiar with
> > >>> the Apache Way.
> > >>>
> > >>> Homogeneous Developers
> > >>>
> > >>> The 9 proposed PMC members are from 9 different employers and the
> > >>> proposed committers are similarly distributed across affiliations. No
> > >>> specific entity employs more than 3 total proposed developers.
> > >>>
> > >>> Reliance on Salaried Developers
> > >>>
> > >>> A substantial amount of work on DataFusion has been done by salaried
> > >>> developers, but it also has a long tradition of attracting
> > >>> contributions from students and hobbyists and we plan no changes in
> > >>> contribution structure.
> > >>>
> > >>> Relationships with Other Apache Products
> > >>>
> > >>> DataFusion will obviously have a strong relationship with the Arrow
> > >>> project given the overlap in people. We don’t foresee close
> > >>> collaboration with other projects at this time.
> > >>>
> > >>> Cryptography
> > >>>
> > >>> DataFusion does not directly support encryption and there are no
> > >>> near-term plans to add support for encryption. Users who need this
> > >>> functionality can use the extension APIs.
> > >>>
> > >>> Required Resources
> > >>>
> > >>> Mailing Lists
> > >>>
> > >>> - priv...@datafusion.apache.org for private PMC discussions (with
> > >>> moderated subscriptions)
> > >>> - d...@datafusion.apache.org
> > >>> - comm...@datafusion.apache.org
> > >>> - u...@datafusion.apache.org
> > >>>
> > >>> Version Control
> > >>>
> > >>> We propose to continue to use git for source control and GitHub for
> > >>> hosting and testing resources.
> > >>>
> > >>> We also need to rename the GitHub repositories to reflect the new top
> > >>> level names:
> > >>>
> > >>> https://github.com/apache/arrow-datafusion/ → apache/datafusion
> > >>> https://github.com/apache/arrow-datafusion-python → apache/datafusion-python
> > >>> https://github.com/apache/arrow-ballista → apache/datafusion-ballista
> > >>> https://github.com/apache/arrow-ballista-python → apache/datafusion-ballista-python
> > >>> https://github.com/apache/arrow-datafusion-comet → apache/datafusion-comet
> > >>>
> > >>>
> > >>>
> > >>> Issue Tracking
> > >>>
> > >>> DataFusion would continue to use GitHub for its issue tracking and
> > >>> communications.
> > >>>
> > >>> Other Resources
> > >>>
> > >>> The existing repositories already make use of existing Apache
> > >>> infrastructure, and we expect no change in the initial resource
> > >>> usage. As the project continues to grow, we expect infrastructure
> > >>> demand to grow with it.
> > >>>
> > >>>
> > >>> FAQ: Has a sub project been promoted to a top level project before?
> > >>>
> > >>> Yes, and it appears to happen commonly. The Arrow project itself was
> > >>> created as a top level project from work that started in Apache
> > >>> Drill, and there are many sub projects of Hadoop that spun out as
> > >>> their own top level projects, such as Mahout, Avro and HBase:
> > >>>
> > >>> https://news.apache.org/foundation/entry/the_apache_software_foundation_announces4
> > >>>
> > >>>
> > >>>
> > >>> Related material:
> > >>> Name search request / research for DataFusion:
> > >>> https://issues.apache.org/jira/browse/PODLINGNAMESEARCH-219
> > >>> Discussion about this proposal on the arrow mailing list:
> > >>> https://lists.apache.org/thread/c150t1s1x0kcb3r03cjyx31kqs5oc341
> > >>> Discussion about which repositories to include, on the arrow mailing list:
> > >>> https://lists.apache.org/thread/ob3n0d9ky0bgrryl3xn39w9k566bq00q
> > >>> Discussion about initial PMC on the arrow mailing list:
> > >>> https://lists.apache.org/thread/pymrzcdw4qdptvby85f69rg3pcckl15b
> > >>> Discussion in github about creating a new DataFusion top level
> > >>> project: https://github.com/apache/arrow-datafusion/discussions/6475
> > >>> Discussion about graduating on incubator list:
> > >>> https://lists.apache.org/thread/r4n73pmms1lv0jbohyx1o1z13d615t99
> > >>> Original Proposal for the Arrow project:
> > >>> https://lists.apache.org/thread/x2qzdwglm8pkqp9gv03bbgw17khl7pq3
> > >>>
> >
>
