Thank you for all the collaboration on the board report. I have now submitted this. I have attached the submitted content below.
---

## Description:

The mission of Apache Arrow is the creation and maintenance of software related to columnar in-memory processing and data interchange.

## Project Status:

Current project status: Ongoing (high activity)

Issues for the board: None

## Membership Data:

Apache Arrow was founded 2016-01-19 (8 years ago).

There are currently 103 committers and 52 PMC members in this project.

The Committer-to-PMC ratio is roughly 7:4.

Community changes, past quarter:

- Jonathan Keane was added to the PMC on 2023-10-13
- Raúl Cumplido was added to the PMC on 2023-11-12
- Curt Hagenlocher was added as committer on 2023-10-14
- Felipe Oliveira Carvalho was added as committer on 2023-12-06
- James Duong was added as committer on 2023-11-16
- Xuwei Fu was added as committer on 2023-10-22

## Project Activity:

### Sub Project Updates

Arrow has several subprojects, as listed on https://arrow.apache.org/

### ADBC

ADBC 0.8.0 was released on 9 November 2023. Highlights:

- C#/.NET has added a BigQuery driver. Also, it now can be used through ADO.NET.
- The R bindings now use the ADBC 1.1.0 specification, and more packages are now available on CRAN.
- The Snowflake driver now has an option to control whether to return decimal types or attempt to convert to integers/floats.
- The PostgreSQL driver now uses COPY for bulk ingestion, which Pandas has found is approximately 35x faster than their previous method.
- The SQLite driver can now load extensions and supports more data types, including Arrow binary (SQL BLOB), and supports binding some dictionary-encoded types (which will be unpacked).

### Arrow Flight

No update

### Arrow Flight SQL

There is community interest in implementing and extending an ODBC driver for Arrow Flight SQL, and we are in the process of accepting an ODBC driver[1] to support that interest.
[1] https://lists.apache.org/thread/t109wsn86cnk5kbc390snco0s751qhpv

### Arrow Flight SQL adapter for PostgreSQL

There was some feedback after the first release, and we have addressed most of it. We’ll release the next version in a few months.

### DataFusion & Ballista

DataFusion continues releasing regularly. We have submitted a proposed paper describing the system to the ACM SIGMOD conference, and in general are trying to scale the project as it grows in popularity. There is a draft proposal for promoting DataFusion (including Ballista) to a top-level ASF project, and we hope to have a proposed board resolution for the April board meeting.

Ballista is not very active but continues to receive occasional contributions.

### nanoarrow

Arrow nanoarrow continues its roughly quarterly release cadence, with active improvements to the C, R, and Python implementations scheduled for release in mid-January as nanoarrow 0.4.

### Language Area Updates

Arrow has at least 13 different language implementations, as explained in https://arrow.apache.org/overview/

Arrow 14.0.0 was released from the monorepo: https://arrow.apache.org/blog/2023/11/01/14.0.0-release/

### C++

The fixed shape tensor extension type was added as a new canonical extension type in release 12.0.0, and the variable shape tensor extension is now under review for the next release (16.0.0).

#### Dataset & Parquet

Added support for reading and writing the newly added Parquet float16 logical type. Added support for Parquet modular encryption.

#### Acero & Compute

Improvements were made to several compute functions. Added support for serializing and deserializing compute expressions using Substrait.

#### Gandiva

Migrated LLVM JIT engine from MCJIT to ORC v2/LLJIT. Added support for the latest LLVM (17). Added support for registering external function registries. Added support for registering external C functions.

### C#

The C# implementation has been steadily improving its compatibility with the standard.
Since the last report, it has gained support for duration and interval types, as well as the new types utf8 view, binary view and list view. Dictionaries now work correctly in file and memory implementations, and there are only four explicit exclusions for C# in the Archery integration tests.

### Go

Integration testing for the C Data API has been added to the CI for Go and other implementations. The Parquet implementation continues to gain fixes for different encoding/decoding types and bug fixes. Also fixed Go release verification for arm64.

### Java

We are working on adding nullability annotations and enabling module support for Java 9+.

### JavaScript

No update

### Julia

Version 2.7.0 was released on 10 December 2023.

### Rust

The parquet implementation continues to mature, for example supporting new statistics metadata. The FFI bindings have been improved as well, and are now integration tested against arrow-cpp (among others).

The object_store module, developed as part of this project, allows for generically interacting with object store systems such as AWS S3, Google Cloud Storage, and Azure Blob Storage. This crate has seen significant adoption outside of the Arrow community, for example by the crates.io service itself.

### C (GLib)

No update

### MATLAB

We are currently working on integrating with the project release tooling to make it possible to distribute pre-built MLTBX files for easy installation of the MATLAB interface.

### Python

There has been ongoing work on improving interoperability with other Python projects, for example adding the C Data Interface PyCapsule protocol and implementing the usage of capsules in ADBC and nanoarrow-python. We have also implemented the DLPack protocol on Arrow Arrays, which is used to move data to ML libraries.

A critical security vulnerability was discovered in PyArrow versions 0.14.0 to 14.0.0 that allowed arbitrary code execution when loading a malicious Arrow IPC, Feather, or Parquet data file (CVE-2023-47248).
The vulnerability was patched in PyArrow version 14.0.1. A hotfix package was released to patch the vulnerability in all other versions of PyArrow for users unable to immediately upgrade.

### R

Completed large parts of a major rework of the Arrow R package build system. These changes aim to reduce the maintenance burden and streamline the new-contributor experience, e.g. by automating the use of nightly builds, which enables contributions to the R package without having to set up a C++ development environment.

### Ruby

Added some convenient APIs.

### Swift

Improved Flight SQL implementation.

## Community Health:

Community communication continues to be strong.

There have been 5 blog posts published to https://arrow.apache.org/blog/ in the last 3 months.

The mailing lists are active.

On Sun, Jan 7, 2024 at 12:16 PM Andy Grove <andygrov...@gmail.com> wrote:

> Thanks for the updates so far. There are still no updates for many of the
> language implementations, and it would be good to get 1-2 lines for each of
> them if possible.
>
> We are currently missing updates for the C (GLib), C++, Go, Java,
> JavaScript, Julia, MATLAB, Python, R, Ruby, and Swift Arrow
> implementations, as well as the Acero, nanoarrow, Arrow Flight, and Arrow
> Flight SQL adapter for PostgreSQL subprojects.
>
> The report is due this Wednesday, January 10.
>
> I will endeavor to start this process for future board reports earlier so
> we have more time to complete this.
>
> Thanks,
>
> Andy.
>
> On Thu, Jan 4, 2024 at 4:08 PM Andy Grove <andygrov...@gmail.com> wrote:
>
>> Hello Arrow Community,
>>
>> Please add any comments or board content directly to [1] or reply to
>> this email and I will incorporate your comments. You can see what we
>> currently have at the end of this email.
>>
>> One of the responsibilities of being part of the Apache Software Foundation
>> (ASF) is to regularly summarize the state of the project in a quarterly
>> update to the ASF board.
>> I plan to submit the next report on January 10, 2024.
>>
>> While this is partly an administrative reporting exercise, I think it is
>> also valuable to reflect on past accomplishments and think about goals for
>> the future.
>>
>> Historically, Arrow has crowd sourced the content, which has worked well.
>> It would be especially interesting and valuable if members of the various
>> language implementation communities and subprojects could provide a
>> sentence or two of updates.
>>
>> Thank you,
>> Andy
>>
>> [1] https://docs.google.com/document/d/1wZkDTcaR-fZwT5QUd6sFeYmmNKZXEhSrTu5U0glLAQg/edit?usp=sharing
>>
>> ---
>>
>> 2024-01-10 Arrow ASF Board Report
>>
>> Arrow PMC Chair Note: Please add any relevant comments / content to this
>> document. Andy Grove will submit to the ASF board on January 10, 2024
>> (about one week prior to the scheduled
>> <https://svn.apache.org/repos/private/committers/board/calendar.txt>
>> board meeting).
>>
>> The rationale and process for this report:
>> https://www.apache.org/foundation/board/reporting
>>
>> Past report: 2023-10-11 Arrow ASF Board Report
>> <https://docs.google.com/document/d/1MU5cxzVuAIuDb6KXOAkwT4ze7IBGHKks_l92gxZeTbg/edit>
>>
>> The metrics in this report are derived from
>> https://reporter.apache.org/wizard/?arrow
>>
>> ## Description:
>>
>> The mission of Apache Arrow is the creation and maintenance of software
>> related to columnar in-memory processing and data interchange. More
>> information can be found at https://arrow.apache.org/overview/
>>
>> ## Project Status:
>>
>> Current project status: Ongoing (high activity)
>>
>> Issues for the board: None
>>
>> ## Membership Data:
>>
>> There are currently 103 committers and 52 PMC members in this project.
>>
>> The Committer-to-PMC ratio is roughly 7:4.
>>
>> Community changes, past quarter:
>>
>> - Jonathan Keane was added to the PMC on 2023-10-13
>> - Raúl Cumplido was added to the PMC on 2023-11-12
>> - Curt Hagenlocher was added as committer on 2023-10-14
>> - Felipe Oliveira Carvalho was added as committer on 2023-12-06
>> - James Duong was added as committer on 2023-11-16
>> - Xuwei Fu was added as committer on 2023-10-22
>>
>> ## Project Activity:
>>
>> ## Sub Project Updates
>>
>> Arrow has several subprojects, as listed on https://arrow.apache.org/
>>
>> ### ADBC
>>
>> ### Arrow Flight
>>
>> ### Arrow Flight SQL
>>
>> ### Arrow Flight SQL adapter for PostgreSQL
>>
>> ### DataFusion & Ballista
>>
>> DataFusion continues releasing regularly. We are working on a paper
>> describing the system for ACM SIGMOD, and in general are trying to scale
>> the project as it grows in popularity.
>>
>> Ballista is not very active but continues to receive occasional
>> contributions.
>>
>> ### Acero
>>
>> ### nanoarrow
>>
>> ## Language Area Updates
>>
>> Arrow has at least 13 different language implementations, as explained in
>> https://arrow.apache.org/overview/
>>
>> Arrow 14.0.0 was released from the monorepo:
>> https://arrow.apache.org/blog/2023/11/01/14.0.0-release/
>>
>> ### C++
>>
>> ### C#
>>
>> ### Go
>>
>> ### Java
>>
>> ### JavaScript
>>
>> ### Julia
>>
>> ### Rust
>>
>> ### C (GLib)
>>
>> ### MATLAB
>>
>> ### Python
>>
>> ### R
>>
>> ### Ruby
>>
>> ### Swift
>>
>> ## Recent Releases:
>>
>> Recent releases:
>>
>> - 14.0.2 was released on 2023-12-18.
>> - RS-DATAFUSION-34.0.0 was released on 2023-12-17.
>> - JULIA-2.7.0 was released on 2023-12-10.
>> - RS-DATAFUSION-PYTHON-33.0.0 was released on 2023-11-19.
>> - RS-DATAFUSION-33.0.0 was released on 2023-11-16.
>> - RS-48.0.1 was released on 2023-11-13.
>> - RS-49.0.0 was released on 2023-11-13.
>> - ADBC-0.8.0 was released on 2023-11-09.
>> - 14.0.1 was released on 2023-11-08.
>> - RS-OS-0.8.0 was released on 2023-11-06.
>> - 14.0.0 was released on 2023-11-01.
>> - RS-DATAFUSION-PYTHON-32.0.0 was released on 2023-10-25.
>> - RS-48.0.0 was released on 2023-10-23.
>> - RS-DATAFUSION-32.0.0 was released on 2023-10-12.
>>
>> ## Community Health:
>>
>> Community communication continues to be strong.
>>
>> There have been 5 blog posts published to https://arrow.apache.org/blog/
>> in the last 3 months.
>>
>> The mailing lists are active.
>>
>> dev@arrow.apache.org had a 51% increase in traffic in the past quarter
>> (715 emails compared to 472)
>>
>> For the mono repo:
>>
>> 2374 commits in the past quarter (-2% change)
>>
>> 249 code contributors in the past quarter (-2% change)
>>
>> 1752 PRs opened on GitHub, past quarter (-7% change)
>>
>> 1698 PRs closed on GitHub, past quarter (-5% change)
>>
>> 1364 issues opened on GitHub, past quarter (-14% change)
>>
>> 1052 issues closed on GitHub, past quarter (-18% change)