Thank you (again!) to everyone who helped write this report.

I just wanted to share some feedback from Rich Bowen:

> Just a quick note about your July board report.
>
> Just ... wow.
>
> This is one of the best, most thorough, more introspective board reports
> that I have read so far this year. Thank you for the time and work that
> *clearly* went into this.
>
> I'm especially impressed with the DataFusion project's introspection into
> its goals and mission. That kind of clarity is so important, and we don't
> see many projects taking the time to have that conversation.



On Fri, Jul 14, 2023 at 4:22 PM Andrew Lamb <al...@influxdata.com> wrote:

> Thanks to everyone who contributed to the board update under such short
> notice.  I always enjoy reading what the rest of the project has been up to
> -- if anyone else is interested, the final report that was submitted can be
> found below.
>
> Thanks again and have a nice weekend,
> Andrew
>
>
>
>
> ## Description:
> The mission of Apache Arrow is the creation and maintenance of software
> related to columnar in-memory processing and data interchange. More
> information can be found at https://arrow.apache.org/overview/
>
> ## Project Status:
>
> Current project status: Ongoing (high activity)
>
> Issues for the board: None
>
> ## Membership Data:
> Apache Arrow was founded 2016-01-19 (7 years ago)
> There are currently 97 committers and 50 PMC members in this project.
> The Committer-to-PMC ratio is roughly 7:4.
>
> Community changes, past quarter:
> - Ben Baumgold was added to the PMC on 2023-06-19
> - Jie Wen was added to the PMC on 2023-06-10
> - Dewey Dunnington was added to the PMC on 2023-06-22
> - Matthew Topol was added to the PMC on 2023-05-02
> - Gang Wu was added as committer on 2023-05-15
> - Kevin Gurney was added as committer on 2023-07-04
> - Marco Neumann was added as committer on 2023-05-11
> - Mehmet Ozan Kabak was added as committer on 2023-06-10
> - Ruihang Xia was added as committer on 2023-04-15
>
> ## Project Activity:
>
> There has been healthy debate about adding new formats, [StringArray] and
> [ListView], focused on increasing Arrow’s appeal in high-performance
> computation engines.
>
> We have completed the transition from JIRA to using GitHub issues for the
> mono repo and that appears to be going well.
>
> The DataFusion subproject is considering applying to become its own top
> level Apache project (see DataFusion update below).
>
> [StringArray]:
> https://lists.apache.org/thread/c6frlr9gcxy8qdhbmv8cn3rdjbrqxb1v
> [ListView]:
> https://lists.apache.org/thread/r28rw5n39jwtvn08oljl09d4q2c1ysvb
>
>
> ## Sub Project Updates
> Arrow has several subprojects, as listed on https://arrow.apache.org/
>
> ### ADBC
>
> We have released 2 new minor versions. They include new drivers and new
> implementations.
>
> ### Arrow Flight
>
> We have added new features to the Arrow Flight specification:
>
> 1. Ordered data support: https://github.com/apache/arrow/issues/34852
> 2. Resultset expiration support:
> https://github.com/apache/arrow/issues/35500
>
> ### Arrow Flight SQL
>
> We have updated the Arrow Flight SQL specifications based on the above
> Arrow Flight update.
>
> ### DataFusion
>
> DataFusion continues to grow and mature. The community added many new
> features, as described in the latest [blog] post, reached consensus on the
> [goals] of the project, and is discussing a [move to its own top level
> Apache project]. Current development focus is on performance and adding
> better support for structured types such as Lists and Structs. We expect
> more work on improving documentation and communicating externally over the
> next quarter.
>
> [blog]: https://arrow.apache.org/blog/2023/06/24/datafusion-25.0.0/
> [goals]: https://github.com/apache/arrow-datafusion/discussions/6441
> [move to its own top level Apache project]:
>  https://github.com/apache/arrow-datafusion/discussions/6475
>
>
> ## Language Area Updates
>
> Arrow has at least 12 different language implementations, as explained in
> https://arrow.apache.org/overview/
>
> Arrow 12.0.0 was released from the monorepo:
> https://arrow.apache.org/blog/2023/05/02/12.0.0-release/
>
> ### C++
>
> PRs have been created with example implementations of two new layouts,
> Array View and String View.  These layouts are motivated by
> Arrow-compatible engines which found these layouts to be more efficient
> for their workflows.
>
> As mentioned in the previous report, the C++ compute engine Acero was
> broken out into a separate module and Arrow-C++ can now be built without
> it, allowing for more modular feature configuration.
>
> ### C#
>
> C# now has a complete implementation of the C data interface, allowing for
> efficient intra-process communication between C# and other languages.  In
> addition, there has been some early discussion
>
> ### Go
>
> PRs were created with an example implementation of StringView for Go, to
> serve as the second implementation needed in order to vote on the layout.
> Changes were introduced to improve compatibility with x86 (32-bit) systems
> and with TinyGo builds targeting WebAssembly, along with corresponding CI
> builds.
>
> A default Arrow Flight middleware was added for handling Cookies via gRPC
> headers.
>
> Usage of the Go implementation continues to grow and expand in the
> community.
>
> ### Java
>
> Ongoing maintenance of the Arrow Java implementation remains steady.
>
> ### JavaScript
>
> ### Julia
> We have released new versions rapidly whenever we fixed a problem.
>
> A new PMC member who focuses on Julia has joined. There are now 2 PMC
> members who focus on Julia.
>
> ### nanoarrow
>
> The 0.2.0 release of nanoarrow featured support for decoding the Arrow IPC
> format and included a number of interface improvements and bugfixes
> resulting from early usage. Ongoing work includes support for non-CPU data
> via the Arrow C Device interface and documentation improvements suggested
> by early users of the library.
>
>
> ### Rust
>
> The Rust implementation has been focused on improving the UX of the API
> and the speed, consistency, and correctness (timezones!) of the kernels.
>
> ### C (GLib)
>
> We have continued to add new bindings, as usual.
>
> ### MATLAB
>
> A new committer who focuses on MATLAB has joined, the first committer with
> that focus. We plan to expand the MATLAB community.
>
> Integrated support for mathworks/libmexclass, enabling streamlined
> development of the MATLAB interface. As a result, significant progress has
> been made on public MATLAB APIs, including support for Array and
> RecordBatch construction from equivalent MATLAB types (e.g. table).
>
> Recently merged Windows and ccache CI support, bridging the platform gap
> for MATLAB qualification. This will help ensure quality of PRs and improve
> developer confidence when making changes.
>
> Next steps for the MATLAB interface include working on compound / nested
> data types and tabular file I/O workflows.
>
> ### Python
>
> The Python community is embracing “protocols”, which allow for
> library-agnostic interchange and duck-typing.  PyArrow has added support
> for the dataframe interchange protocol, which maps to PyArrow’s Table
> class.  In addition, some early discussion has begun around a dataset
> protocol based on PyArrow’s datasets API.
>
> ### R
>
> The R bindings now support JSON Datasets and continue to benefit from
> ongoing performance enhancements and feature additions in the C++ library.
>
> ### Ruby
>
> Ruby-related questions and issue reports increased, which suggests that
> the user base of the Ruby bindings is growing.
>
> ### Swift
>
> We have started implementing Arrow Flight.
>
> ## Community Health:
> Community communication continues to be strong.
>
> There have been 9 blog posts published to https://arrow.apache.org/blog/
> in the last 3 months, including two from community members on their use of
> Arrow.
>
> The mailing lists are active
>
> * dev@arrow.apache.org had a 10% decrease in traffic in the past quarter
>   (779 emails compared to 858)
> * j...@arrow.apache.org had a 100% decrease in traffic in the past quarter
>   (0 emails compared to 10778)
>
> For the mono repo:
>
> * 2275 commits in the past quarter (5% increase)
> * 254 code contributors in the past quarter (1% increase)
> * 1986 PRs opened on GitHub, past quarter (-6% change)
> * 1954 PRs closed on GitHub, past quarter (-11% change)
> * 1573 issues opened on GitHub, past quarter (-11% change)
> * 1342 issues closed on GitHub, past quarter (-5% change)
>
>
>
>
> On Thu, Jul 13, 2023 at 1:28 PM Kevin Gurney <kgur...@mathworks.com>
> wrote:
>
>> Hi All,
>>
>> Thanks for putting this together, Andrew!
>>
>> Sarah, Fiona, and I added some notes about the MATLAB interface.
>>
>> Best Regards,
>>
>> Kevin Gurney
>> ________________________________
>> From: Sutou Kouhei <k...@clear-code.com>
>> Sent: Wednesday, July 12, 2023 9:36 PM
>> To: dev@arrow.apache.org <dev@arrow.apache.org>
>> Subject: Re: [CROWDSOURCING] Board Report -- 2 DAYS -- Please provide
>> feedback
>>
>> Hi,
>>
>> Thanks! I've added something.
>>
>> --
>> kou
>>
>> In <CAFhtnRxMnFoX5EJ8_7P9XtrmPT7n8AgPSkGTS7cju=vzyqb...@mail.gmail.com>
>> "Re: [CROWDSOURCING] Board Report -- 2 DAYS -- Please provide feedback"
>> on Wed, 12 Jul 2023 16:32:23 -0400,
>> Andrew Lamb <al...@influxdata.com> wrote:
>>
>> > I apologize, I sent the link out for the last board report
>> >
>> > The correct link is [1]
>> >
>> > [1]
>> > https://docs.google.com/document/d/1-VRSKq6xeBdg8uvZPLk-aMW8XzuwI4-AEnUSK1vssDQ/edit#heading=h.gv1c2bcucuam
>> >
>> > On Wed, Jul 12, 2023 at 5:49 AM Andrew Lamb <al...@influxdata.com>
>> wrote:
>> >
>> >> Hello Arrow Community,
>> >>
>> >> TLDR: Please add any comments or board content directly to [2] or reply
>> >> to this email and I will incorporate your comments. You can see what we
>> >> currently have at the end of this email.
>> >>
>> >> In an epic scheduling fail, I forgot to organize this report a few weeks
>> >> ago, so now the deadline is tight.
>> >>
>> >> One of the responsibilities of being part of the Apache Software
>> >> Foundation (ASF) is to regularly summarize the state of the project in a
>> >> quarterly update to the ASF board. I plan to submit the next report on
>> >> July 14, 2023 (in 2 days' time -- I am sorry for the late notice).
>> >>
>> >> Historically[1], Arrow has crowdsourced the content, which has worked
>> >> well. While this is partly an administrative reporting exercise, I think
>> >> it is also valuable to reflect on the past and think about goals for the
>> >> future.
>> >>
>> >> It would be especially interesting if anyone from the various language
>> >> implementation communities could provide an update of a sentence or two.
>> >>
>> >> Andrew
>> >>
>> >> [1]: https://lists.apache.org/thread/xg7pgj4stt4l2sblyt81y9s6h0cl8hw5
>> >>
>> >> [2]:
>> >> https://docs.google.com/document/d/13FSDydEVXT2UUFdy4XKjVKNJW-WR8ylvG3aI6lD-dNI/edit#
>> >>
>> >>
>> >>
>> >> ## Description:
>> >> The mission of Apache Arrow is the creation and maintenance of software
>> >> related to columnar in-memory processing and data interchange. More
>> >> information can be found at https://arrow.apache.org/overview/
>> >>
>> >> ## Issues:
>> >>
>> >>
>> >> ## Membership Data:
>> >> Apache Arrow was founded 2016-01-19 (7 years ago)
>> >> There are currently 97 committers and 50 PMC members in this project.
>> >> The Committer-to-PMC ratio is roughly 7:4.
>> >>
>> >> Community changes, past quarter:
>> >> - Ben Baumgold was added to the PMC on 2023-06-19
>> >> - Jie Wen was added to the PMC on 2023-06-10
>> >> - Dewey Dunnington was added to the PMC on 2023-06-22
>> >> - Matthew Topol was added to the PMC on 2023-05-02
>> >> - Gang Wu was added as committer on 2023-05-15
>> >> - Kevin Gurney was added as committer on 2023-07-04
>> >> - Marco Neumann was added as committer on 2023-05-11
>> >> - Mehmet Ozan Kabak was added as committer on 2023-06-10
>> >> - Ruihang Xia was added as committer on 2023-04-15
>> >>
>> >>
>> >>
>> >> ## Project Activity:
>> >>
>> >> There has been healthy debate about adding new formats, [StringArray]
>> >> and [ListView], focused on increasing Arrow’s appeal in high-performance
>> >> computation engines.
>> >>
>> >> We have completed the transition from JIRA to using GitHub issues for
>> >> the mono repo and that appears to be going well.
>> >>
>> >> The DataFusion subproject is considering applying to become its own top
>> >> level Apache project (see DataFusion update below)
>> >>
>> >> [StringArray]:
>> >> https://lists.apache.org/thread/c6frlr9gcxy8qdhbmv8cn3rdjbrqxb1v
>> >> [ListView]:
>> >> https://lists.apache.org/thread/r28rw5n39jwtvn08oljl09d4q2c1ysvb
>> >>
>> >>
>> >>
>> >> ## Community Health:
>> >>
>> >>
>> >> There have been 9 blog posts published to https://arrow.apache.org/blog/
>> >> in the last 3 months, including two from community members on their use
>> >> of Arrow
>> >>
>> >>
>> >> ## Sub Project Updates
>> >> Arrow has several subprojects, as listed on https://arrow.apache.org/
>> >>
>> >> ### ADBC
>> >>
>> >> ### Arrow Flight
>> >>
>> >> ### Arrow Flight SQL
>> >>
>> >> ### DataFusion
>> >>
>> >> DataFusion continues to grow and mature. The community added many new
>> >> features as described in the latest [blog] post, and discussed and came
>> >> to consensus on the [goals] of the project and is discussing a [move to
>> >> its own top level Apache project]. Current development focus is on
>> >> performance and adding better support for structured types such as Lists
>> >> and Structs. We expect more work on improving documentation and
>> >> communicating externally over the next quarter.
>> >>
>> >> [blog]: https://arrow.apache.org/blog/2023/06/24/datafusion-25.0.0/
>> >> [goals]: https://github.com/apache/arrow-datafusion/discussions/6441
>> >> [move to its own top level Apache project]:
>> >> https://github.com/apache/arrow-datafusion/discussions/6475
>> >>
>> >>
>> >> ## Language Area Updates
>> >>
>> >>
>> >> Arrow has at least 12 different language implementations, as explained
>> >> in https://arrow.apache.org/overview/
>> >>
>> >> Arrow 12.0.0 was released from the monorepo:
>> >> https://arrow.apache.org/blog/2023/05/02/12.0.0-release/
>> >>
>> >>
>> >>
>> >> ### C++
>> >>
>> >>
>> >>
>> >> ### C#
>> >>
>> >>
>> >>
>> >> ### Go
>> >>
>> >>
>> >> ### Java
>> >>
>> >>
>> >>
>> >>
>> >> ### JavaScript
>> >>
>> >> ### Julia
>> >>
>> >> ### nanoarrow
>> >>
>> >>
>> >>
>> >> ### Rust
>> >>
>> >>
>> >> ### C (GLib)
>> >>
>> >>
>> >> ### MATLAB
>> >>
>> >>
>> >>
>> >>
>> >> ### Python
>> >>
>> >>
>> >>
>> >> ### R
>> >>
>> >>
>> >>
>> >> ### Ruby
>> >>
>> >>
>> >> ### Swift
>> >>
>> >>
>> >> ## Release activity
>> >>
>> >> (This is automatically generated):
>> >>
>> >> RS-DATAFUSION-PYTHON-27.0.0 was released on 2023-07-08.
>> >> RS-43.0.0 was released on 2023-07-03.
>> >> RS-DATAFUSION-27.0.0 was released on 2023-06-30.
>> >> ADBC-0.5.1 was released on 2023-06-26.
>> >> NANOARROW-0.2.0 was released on 2023-06-22.
>> >> ADBC-0.5.0 was released on 2023-06-20.
>> >> RS-42.0.0 was released on 2023-06-20.
>> >> 12.0.1 was released on 2023-06-13.
>> >> JULIA-2.6.2 was released on 2023-06-12.
>> >> JULIA-2.6.1 was released on 2023-06-08.
>> >> RS-DATAFUSION-26.0.0 was released on 2023-06-07.
>> >> RS-41.0.0 was released on 2023-06-06.
>> >> RS-OS-0.6.1 was released on 2023-06-06.
>> >> JULIA-2.6.0 was released on 2023-06-05.
>> >> RS-DATAFUSION-25.0.0 was released on 2023-05-23.
>> >> RS-40.0.0 was released on 2023-05-22.
>> >> RS-OS-0.6.0 was released on 2023-05-22.
>> >> ADBC-0.4.0 was released on 2023-05-12.
>> >> RS-39.0.0 was released on 2023-05-09.
>> >> RS-DATAFUSION-24.0.0 was released on 2023-05-09.
>> >> 12.0.0 was released on 2023-05-01.
>> >> RS-DATAFUSION-PYTHON-23.0.0 was released on 2023-04-28.
>> >> RS-38.0.0 was released on 2023-04-25.
>> >> RS-DATAFUSION-23.0.0 was released on 2023-04-24.
>> >> JULIA-2.5.2 was released on 2023-04-19.
>> >> JULIA-2.5.1 was released on 2023-04-16.
>> >> RS-DATAFUSION-PYTHON-22.0.0 was released on 2023-04-14.
>> >>
>> >>
>>
>
