Thank you very much

On Fri, Jan 5, 2024 at 11:17 AM Jean-Baptiste Onofré <j...@nanthrax.net>
wrote:

> Hi Andrew,
>
> The PODLINGNAMESEARCH is not yet completed: the VP Brand Management
> (Mark Thomas) should comment in the Jira to approve or reject the name.
>
> I added a comment in the Jira to ping Mark. He should get back to us soon.
>
> Regards
> JB
>
> On Fri, Jan 5, 2024 at 3:38 PM Andrew Lamb <al...@influxdata.com> wrote:
> >
> > Thanks JB,
> >
> > I did do a name search and posted the results here [1]
> >
> > However, I am not sure what the next steps for that particular process are
> > (like does someone have to approve it, for example?)
> >
> > Any insight you could provide would be greatly appreciated
> >
> > Andrew
> >
> > [1] https://issues.apache.org/jira/browse/PODLINGNAMESEARCH-219
> >
> >
> > On Fri, Jan 5, 2024 at 7:55 AM Jean-Baptiste Onofré <j...@nanthrax.net>
> wrote:
> >
> > > Hi Andrew,
> > >
> > > I did a quick review on the doc and it looks good to me. I just added
> > > a question about name search (DataFusion will probably work as TLP,
> > > but we have to check as we have a new Apache name moving from Arrow
> > > DataFusion to DataFusion).
> > >
> > > Please let me know if I can help on that.
> > >
> > > Thanks !
> > > Regards
> > > JB
> > >
> > > On Fri, Jan 5, 2024 at 12:26 PM Andrew Lamb <al...@influxdata.com>
> wrote:
> > > >
> > > > Upon reviewing the board report template, I am planning on the
> following
> > > > schedule:
> > > > 1. I'll leave this proposal for another few weeks to gather any
> > > additional
> > > > input
> > > > 2. In early February 2024 I'll start a formal vote thread on the dev@
> > > > mailing list for this proposal
> > > > 3. If the vote passes, I'll submit a proposed resolution to the ASF
> board
> > > > for their meeting in April 2024 using the pre-existing template[1]
> > > >
> > > >
> > > > [1]
> > > >
> > >
> https://svn.apache.org/repos/private/committers/board/templates/subproject-tlp-resolution.txt
> > > >
> > > > On Wed, Dec 27, 2023 at 6:32 PM L. C. Hsieh <vii...@gmail.com>
> wrote:
> > > >
> > > > > Thanks for writing the proposal. It looks great to me too.
> > > > > I added a few comments on it.
> > > > >
> > > > > On Wed, Dec 27, 2023 at 3:05 PM Andy Grove <andygrov...@gmail.com>
> > > wrote:
> > > > > >
> > > > > > Thank you for creating the draft proposal, Andrew. I have
> reviewed
> > > this
> > > > > and
> > > > > > I think it looks great.
> > > > > >
> > > > > > Andy.
> > > > > >
> > > > > > On Wed, Dec 27, 2023 at 3:19 PM Andrew Lamb <
> al...@influxdata.com>
> > > > > wrote:
> > > > > >
> > > > > > > I have created a draft proposal [1] to break DataFusion out to
> its
> > > own
> > > > > top
> > > > > > > level project. Please provide your feedback and suggestions.
> > > > > > >
> > > > > > > The proposal is included at the end of this email and in this
> > > Google
> > > > > Doc:
> > > > > > >
> > > > > > >
> > > > >
> > >
> https://docs.google.com/document/d/11WTNYS8KWScOt3ySTX39WVS6krPhUvHsuJRY9PZQx4g
> > > > > > > .
> > > > > > >
> > > > > > > Feel free to respond to this email or comment / make
> suggestions
> > > > > directly
> > > > > > > on the document.
> > > > > > >
> > > > > > > I would be especially grateful if people could review and
> comment
> > > on
> > > > > the
> > > > > > > proposed list of committers and PMC members.
> > > > > > >
> > > > > > > I hope everyone is not getting sick of hearing about this, but
> I
> > > think
> > > > > in
> > > > > > > this case it is better to over communicate than risk surprises.
> > > > > > >
> > > > > > > Andrew
> > > > > > >
> > > > > > > [1] https://github.com/apache/arrow-datafusion/issues/8491
> > > > > > >
> > > > > > >
> > > > > > > ----------
> > > > > > >
> > > > > > > DataFusion Top Level Project Proposal
> > > > > > > Dec 27, 2023
> > > > > > >
> > > > > > > [Editor’s note: This document is based on the proposal to the
> ASF
> > > > > board to
> > > > > > > > create the Arrow project. Once it has been reviewed, we plan to
> send
> > > it
> > > > > to
> > > > > > > the ASF board sometime in January or February 2024 for their
> > > > > consideration]
> > > > > > >
> > > > > > > To: The ASF (bo...@apache.org)
> > > > > > >
> > > > > > > Summary:
> > > > > > >
> > > > > > > We propose creating a new top level project, Apache DataFusion,
> > > from an
> > > > > > > existing sub project of Apache Arrow to facilitate additional
> > > > > community and
> > > > > > > project growth.
> > > > > > >
> > > > > > > ----
> > > > > > > Apache DataFusion for Apache Top Level Project
> > > > > > >
> > > > > > > Abstract
> > > > > > >
> > > > > > > > Apache Arrow DataFusion[1] is a very fast, extensible query
> > > engine for
> > > > > > > building high-quality data-centric systems in Rust, using the
> > > Apache
> > > > > Arrow
> > > > > > > in-memory format. DataFusion offers SQL and Dataframe APIs,
> > > excellent
> > > > > > > performance, built-in support for CSV, Parquet, JSON, and Avro,
> > > > > extensive
> > > > > > > customization, and a great community.
> > > > > > >
> > > > > > > [1] https://arrow.apache.org/datafusion/
> > > > > > >
> > > > > > >
> > > > > > > Proposal
> > > > > > >
> > > > > > > We propose creating a new top level ASF project, Apache
> DataFusion,
> > > > > > > governed initially by a subset of the Arrow project’s PMC and
> > > > > committers.
> > > > > > > The project’s code is in four existing git repositories,
> currently
> > > > > governed
> > > > > > > by Apache Arrow which would transfer to the new top level
> project.
> > > > > > >
> > > > > > > Background
> > > > > > >
> > > > > > > When DataFusion was initially donated to the Arrow project, it
> did
> > > not
> > > > > have
> > > > > > > a strong enough community to stand on its own. It has since
> grown
> > > > > > > significantly, and benefited immensely from being part of
> Arrow and
> > > > > > > nurturing of the Apache Way, and now has a community strong
> enough
> > > to
> > > > > stand
> > > > > > > on its own and that would benefit from focused governance
> > > attention.
> > > > > > >
> > > > > > > The community has discussed this idea publicly for more than 6
> > > months
> > > > > > > https://github.com/apache/arrow-datafusion/discussions/6475
> and
> > > > > briefly
> > > > > > > on
> > > > > > > the Arrow PMC mailing list
> > > > > > >
> https://lists.apache.org/thread/thv2jdm6640l6gm88hy8jhk5prjww0cs.
> > > As
> > > > > of
> > > > > > > the
> > > > > > > time of this writing both had exclusively positive reactions.
> > > > > > >
> > > > > > > Several current members of the Arrow PMC are both active
> > > contributors
> > > > > to
> > > > > > > DataFusion and understand and believe deeply in the Apache
> Way, and
> > > > > play
> > > > > > > active governance roles in the Arrow project as PMC members
> and PMC
> > > > > chairs,
> > > > > > > guiding the community, and releasing software versions. With
> this
> > > > > existing
> > > > > > > governance experience and structure, the new top level project
> > > will be
> > > > > able
> > > > > > > to function well immediately and independently.
> > > > > > >
> > > > > > > Overview of DataFusion
> > > > > > >
> > > > > > > Current Status
> > > > > > >
> > > > > > > Meritocracy
> > > > > > >
> > > > > > > DataFusion has been developed as part of Apache Arrow and thus
> has
> > > been
> > > > > > > operating as a meritocracy. Many of the developers of
> DataFusion
> > > are
> > > > > Arrow
> > > > > > > PMC members or committers. The DataFusion project plans to
> continue
> > > > > adding
> > > > > > > > new PMC members and committers as the project matures and grows.
> > > > > > >
> > > > > > > Community
> > > > > > >
> > > > > > > The DataFusion development team seeks to foster the
> development and
> > > > > user
> > > > > > > communities. We hope that becoming a separate project will help
> > > both
> > > > > Arrow
> > > > > > > and DataFusion communities by being more focused.  Focused
> > > governance
> > > > > will
> > > > > > > make it easier to grow the community of committers and PMC
> members
> > > and
> > > > > make
> > > > > > > the organization more clear to others.
> > > > > > >
> > > > > > > Alignment
> > > > > > >
> > > > > > > The ASF is a natural host for DataFusion given that it is
> already
> > > the
> > > > > home
> > > > > > > > of Arrow, Parquet, and other related distributed systems,
> storage
> > > and
> > > > > query
> > > > > > > execution systems.
> > > > > > >
> > > > > > > Project Leadership
> > > > > > >
> > > > > > > Proposed Initial PMC
> > > > > > >
> > > > > > > We propose the following people as the initial DataFusion PMC
> > > members.
> > > > > This
> > > > > > > is a subset of the existing Arrow PMC members who contribute to
> > > > > DataFusion
> > > > > > > https://people.apache.org/phonebook.html?unix=arrow
> > > > > > >
> > > > > > > Andy Grove (agrove):  Arrow PMC Chair
> > > > > > > Andrew Lamb (alamb): Arrow PMC, past Arrow PMC Chair
> > > > > > > > Daniël Heres (dheres): Arrow PMC
> > > > > > > Jie Wen (jakevin):  Arrow PMC, Doris Committer
> > > > > > > Kun Liu (liukun): Arrow PMC, IoTDB PMC, TSFile PMC
> > > > > > > Liang-Chi Hsieh (viirya): Arrow PMC, Spark PMC
> > > > > > > > Qingping Hou (houqp): Arrow PMC, Doris Committer
> > > > > > > Will Jones (wjones127): Arrow PMC
> > > > > > >
> > > > > > > > We’d like to propose Andrew Lamb as the initial Chair (and
> thus
> > > ASF
> > > > > VP)
> > > > > > > for the DataFusion project.
> > > > > > >
> > > > > > > Affiliations
> > > > > > >
> > > > > > > > Andy Grove (agrove): NVIDIA
> > > > > > > Andrew Lamb (alamb): InfluxData
> > > > > > > Daniël Heres (dheres): Coralogix
> > > > > > > Jie Wen (jakevin): SelectDB
> > > > > > > > Kun Liu (liukun): eBay
> > > > > > > Liang-Chi Hsieh (viirya): Apple
> > > > > > > > Qingping Hou (houqp): Scribd
> > > > > > > Will Jones (wjones127): VoltronData
> > > > > > >
> > > > > > > Proposed Initial Committers
> > > > > > >
> > > > > > > In addition to the PMC, we propose the following people as the
> > > initial
> > > > > > > DataFusion committers. This is a subset of the existing Arrow
> > > > > committers
> > > > > > > who contribute to DataFusion
> > > > > > > https://people.apache.org/phonebook.html?unix=arrow
> > > > > > >
> > > > > > > akurmustafa Mustafa Akur (Synnada)
> > > > > > > avantgardner Brent Gardner (Coralogix)
> > > > > > > comphead Oleks V. (Unaffiliated)
> > > > > > > jiayuliu Liu Jiayu (Airbnb)
> > > > > > > mete Metehan Yildirim (Synnada)
> > > > > > > > mingmwang Wang Mingming (eBay)
> > > > > > > > mneumann Marco Neumann (InfluxData)
> > > > > > > > nju_yaho Zhong Yanghong (eBay)
> > > > > > > ozankabak Mehmet Ozan Kabak (Synnada)
> > > > > > > paddyhoran Paddy Horan (Assured Allies)
> > > > > > > rdettai Rémi Dettai (Cloudfuse)
> > > > > > > sunchao Sun Chao (Apple)
> > > > > > > thinkharderdev Daniel Harris (Coralogix)
> > > > > > > tustvold Raphael Taylor-Davies (InfluxData)
> > > > > > > viirya L. C. Hsieh (Apple)
> > > > > > > wayne Ruihang Xia (Greptime)
> > > > > > > xudong963 Xudong Wang (ByteDance)
> > > > > > > yjshen Yijie Shen (Space and Time)
> > > > > > >
> > > > > > >
> > > > > > > Risk Assessments
> > > > > > >
> > > > > > > Naming / Trademarks
> > > > > > >
> > > > > > > As a sub-project of Arrow, the DataFusion name has been used
> for
> > > over 4
> > > > > > > years without any known issues. A podling name search has thus
> far
> > > not
> > > > > > > turned up any concerns:
> > > > > > > https://issues.apache.org/jira/browse/PODLINGNAMESEARCH-219
> > > > > > >
> > > > > > > Legal / IP Clearance
> > > > > > >
> > > > > > > All DataFusion code has either been donated to the Arrow
> project
> > > with
> > > > > > > > appropriate IP clearance or has been developed directly under
> ASF
> > > > > > > processes and procedures. Thus creating a new top level project
> > > poses
> > > > > no
> > > > > > > new Legal or IP risks.
> > > > > > >
> > > > > > > Code Extraction
> > > > > > >
> > > > > > > The relevant code is already in 4 separate repositories:
> > > > > > > https://github.com/apache/arrow-datafusion/
> > > > > > > https://github.com/apache/arrow-datafusion-python
> > > > > > > https://github.com/apache/arrow-ballista
> > > > > > > https://github.com/apache/arrow-ballista-python
> > > > > > >
> > > > > > > We foresee no issues with code extraction and propose these
> > > > > repositories be
> > > > > > > > respectively renamed to reflect top level projects:
> > > > > > > https://github.com/apache/datafusion/
> > > > > > > https://github.com/apache/datafusion-python
> > > > > > > https://github.com/apache/datafusion-ballista
> > > > > > > https://github.com/apache/datafusion-ballista-python
> > > > > > >
> > > > > > > Note:  https://github.com/apache/arrow-rs, the Rust
> > > implementation of
> > > > > > > Arrow, would remain part of the Arrow project.
> > > > > > >
> > > > > > > Orphaned Products
> > > > > > >
> > > > > > > DataFusion is known to be used in many open source and
> commercial
> > > > > projects
> > > > > > >
> > > > > > >
> > > > >
> > >
> https://arrow.apache.org/datafusion/user-guide/introduction.html#known-users
> > > > > > > ,
> > > > > > > has had multiple commits daily for several years, and its
> adoption
> > > and
> > > > > > > > number of contributors appear to be growing.
> > > > > > >
> > > > > > > Inexperience with Open Source
> > > > > > >
> > > > > > > The proposed PMC has extensive experience with Apache Arrow and
> > > other
> > > > > > > Apache projects, and includes PMC members and PMC chairs. The
> > > > > DataFusion
> > > > > > > PMC and more experienced committers will continue to coach new
> > > > > community
> > > > > > > members who may be less familiar with the Apache Way.
> > > > > > >
> > > > > > > Homogeneous Developers
> > > > > > >
> > > > > > > The 8 proposed PMC members are from 8 different employers and
> the
> > > > > proposed
> > > > > > > committers are similarly distributed across affiliations. No
> > > specific
> > > > > > > entity employs more than 3 total proposed developers.
> > > > > > >
> > > > > > > Reliance on Salaried Developers
> > > > > > >
> > > > > > > A substantial amount of work on DataFusion has been by salaried
> > > > > developers,
> > > > > > > but it also has a long tradition of attracting contributions
> from
> > > > > students
> > > > > > > and hobbyists and we plan no changes in contribution structure.
> > > > > > >
> > > > > > > Relationships with Other Apache Products
> > > > > > >
> > > > > > > DataFusion will obviously have a strong relationship with the
> Arrow
> > > > > project
> > > > > > > given the overlap in people. We don’t foresee close
> collaboration
> > > with
> > > > > > > other projects at this time.
> > > > > > >
> > > > > > > Cryptography
> > > > > > >
> > > > > > > DataFusion does not directly support encryption and there are
> no
> > > > > near-term
> > > > > > > plans to add support for encryption. Users who need this
> > > functionality
> > > > > can
> > > > > > > use the extension APIs.
> > > > > > >
> > > > > > > Required Resources
> > > > > > >
> > > > > > > Mailing Lists
> > > > > > >
> > > > > > > - private@datafusion for private PMC discussions (with
> moderated
> > > > > > > subscriptions)
> > > > > > > - dev@datafusion
> > > > > > > - commits@datafusion
> > > > > > >
> > > > > > > Version Control
> > > > > > >
> > > > > > > > We propose to continue to use git for source control and GitHub
> for
> > > > > hosting
> > > > > > > and testing resources.
> > > > > > >
> > > > > > > Issue Tracking
> > > > > > >
> > > > > > > > DataFusion would continue to use GitHub for its issue tracking
> and
> > > > > > > > communications.
> > > > > > >
> > > > > > > Other Resources
> > > > > > >
> > > > > > > The existing repositories already make use of existing Apache
> > > > > > > infrastructure, and we expect no change in the initial resource
> > > usage.
> > > > > As
> > > > > > > the project continues to grow, we expect continued
> infrastructure
> > > > > demand
> > > > > > > growth.
> > > > > > >
> > > > > > >
> > > > > > > FAQ: Has a sub project been promoted to a top level project
> before?
> > > > > > >
> > > > > > > Yes, and it appears to happen commonly. The Arrow project
> itself
> > > was
> > > > > > > created as a top level project from work that started in Apache
> > > Drill,
> > > > > and
> > > > > > > there are many sub projects of Hadoop that spun out as their
> own
> > > top
> > > > > level
> > > > > > > projects such as Mahout, Avro and HBase:
> > > > > > >
> > > > > > >
> > > > >
> > >
> https://news.apache.org/foundation/entry/the_apache_software_foundation_announces4
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > Related material:
> > > > > > > Name search request / research for DataFusion:
> > > > > > > https://issues.apache.org/jira/browse/PODLINGNAMESEARCH-219
> > > > > > > Discussion about which repositories on the arrow mailing list:
> > > > > > >
> https://lists.apache.org/thread/ob3n0d9ky0bgrryl3xn39w9k566bq00q
> > > > > > > Discussion about initial PMC on the arrow mailing list:
> > > > > > >
> https://lists.apache.org/thread/pymrzcdw4qdptvby85f69rg3pcckl15b
> > > > > > > Discussion about creating a new DataFusion top level project:
> > > > > > > https://github.com/apache/arrow-datafusion/discussions/6475
> > > > > > > Discussion about graduating on incubator list:
> > > > > > >
> https://lists.apache.org/thread/r4n73pmms1lv0jbohyx1o1z13d615t99
> > > > > > > Original Proposal for the Arrow project:
> > > > > > >
> https://lists.apache.org/thread/x2qzdwglm8pkqp9gv03bbgw17khl7pq3
> > > > > > >
> > > > >
> > >
>
