This is an automated email from the ASF dual-hosted git repository. wesm pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/arrow.git
The following commit(s) were added to refs/heads/master by this push: new 07e1042 ARROW-5826: [Website] Blog post for 0.14.0 release announcement 07e1042 is described below commit 07e104260b3366cc97900abd6aa7999f75969791 Author: Sutou Kouhei <k...@clear-code.com> AuthorDate: Mon Jul 8 12:15:41 2019 -0500 ARROW-5826: [Website] Blog post for 0.14.0 release announcement I plan to publish by noon US/Eastern time on Monday July 8 -- please post commits with any edits or push directly to my branch and I will include the changes in the post. Author: Sutou Kouhei <k...@clear-code.com> Author: Wes McKinney <wesm+...@apache.org> Author: Wes McKinney <w...@users.noreply.github.com> Closes #4819 from wesm/ARROW-5826 and squashes the following commits: 079c88ccc <Wes McKinney> Add discussion links f4f208a42 <Wes McKinney> Update site/_posts/2019-07-08-0.14.0-release.md 4dc02a214 <Sutou Kouhei> Add Packaging section c5d9cdcb0 <Sutou Kouhei> Add Sparse representation and compression e47d69a94 <Sutou Kouhei> Fix a typo 205f222f6 <Sutou Kouhei> Fix markup 1544da180 <Wes McKinney> Attribute community f9653fa17 <Wes McKinney> Finish tweaking formatting, links 6b82e825f <Wes McKinney> Start markdownifying 0.14.0 blog post --- site/README.md | 7 + site/_data/contributors.yml | 5 +- site/_posts/2019-07-08-0.14.0-release.md | 300 +++++++++++++++++++++++++++++++ site/_release/0.14.0.md | 40 ++--- 4 files changed, 331 insertions(+), 21 deletions(-) diff --git a/site/README.md b/site/README.md index 73fd185..33758e9 100644 --- a/site/README.md +++ b/site/README.md @@ -36,6 +36,13 @@ gem install jekyll bundler bundle install ``` +On some platforms, the Ruby `nokogiri` library may fail to build, in +such cases the following configuration option may help: + +``` +bundle config build.nokogiri --use-system-libraries +``` + If you are planning to publish the website, you must clone the arrow-site git repository. 
Run this command from the `site` directory so that `asf-site` is a subdirectory of `site`. diff --git a/site/_data/contributors.yml b/site/_data/contributors.yml index 95b18b2..185a565 100644 --- a/site/_data/contributors.yml +++ b/site/_data/contributors.yml @@ -16,10 +16,13 @@ # Database of contributors to Apache Arrow (WIP) # Blogs and other pages use this data # +- name: Apache Arrow Community + githubId: apache + homepage: https://arrow.apache.org - name: Wes McKinney apacheId: wesm githubId: wesm - homepage: http://wesmckinney.com + homepage: https://wesmckinney.com role: PMC - name: Uwe Korn apacheId: uwe diff --git a/site/_posts/2019-07-08-0.14.0-release.md b/site/_posts/2019-07-08-0.14.0-release.md new file mode 100644 index 0000000..1c1f1c8 --- /dev/null +++ b/site/_posts/2019-07-08-0.14.0-release.md @@ -0,0 +1,300 @@ +--- +layout: post +title: "Apache Arrow 0.14.0 Release" +date: "2019-07-02 00:00:00 -0600" +author: apache +categories: [release] +--- +<!-- +{% comment %} +Licensed to the Apache Software Foundation (ASF) under one or more +contributor license agreements. See the NOTICE file distributed with +this work for additional information regarding copyright ownership. +The ASF licenses this file to you under the Apache License, Version 2.0 +(the "License"); you may not use this file except in compliance with +the License. You may obtain a copy of the License at + +http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. +{% endcomment %} +--> + +The Apache Arrow team is pleased to announce the 0.14.0 release. This +covers 3 months of development work and includes [**602 resolved +issues**][1] from [**75 distinct contributors**][2]. 
See the Install +Page to learn how to get the libraries for your platform. The +[complete changelog][3] is also available. + +This post will give some brief highlights in the project since the +0.13.0 release from April. + +## New committers + +Since the 0.13.0 release, the following have been added: + +* [Neville Dipale][5] was added as a committer +* [François Saint-Jacques][6] was added as a committer +* [Praveen Kumar][7] was added as a committer + +Thank you for all your contributions! + +## Upcoming 1.0.0 Format Stability Release + +We are planning for our next major release to move from 0.14.0 to +1.0.0. The major version number will indicate stability of the Arrow +columnar format and binary protocol. While the format has already been +stable since December 2017, we believe it is a good idea to make this +stability official and to indicate that it is safe to persist +serialized Arrow data in applications. This means that applications +will be able to safely upgrade to new Arrow versions without having to +worry about backwards incompatibilities. We will write in a future +blog post about the stability guarantees we intend to provide to help +application developers plan accordingly. + +## Packaging + +We added support for the following platforms: + +* Debian GNU/Linux buster +* Ubuntu 19.04 + +We dropped support for Ubuntu 14.04. + +## Development Infrastructure and Tooling + +As the project has grown larger and more diverse, we are increasingly +outgrowing what we can test in public continuous integration services +like Travis CI and Appveyor. In addition, we share these resources +with the entire Apache Software Foundation, and given the high volume +of pull requests into Apache Arrow, maintainers are frequently waiting +many hours for the green light to merge patches. 
+ +The complexity of our testing is driven by the number of different +components and programming languages as well as increasingly long +compilation and test execution times as individual libraries grow +larger. The 50 minute time limit of public CI services is simply too +limited to comprehensively test the project. Additionally, the CI host +machines are constrained in their features and memory limits, +preventing us from testing features that are only relevant on large +amounts of data (10GB or more) or functionality that requires a +CUDA-enabled GPU. + +Organizations that contribute to Apache Arrow are working on physical +build infrastructure and tools to improve build times and build +scalability. One such new tool is `ursabot`, a GitHub-enabled bot +that can be used to trigger builds either on physical build or in the +cloud. It can also be used to trigger benchmark timing comparisons. If +you are contributing to the project, you may see Ursabot being +employed to trigger tests in pull requests. + +To help assist with migrating away from Travis CI, we are also working +to make as many of our builds reproducible with Docker and not reliant +on Travis CI-specific configuration details. This will also help +contributors reproduce build failures locally without having to wait +for Travis CI. + +## Columnar Format Notes + +* User-defined "extension" types have been formalized in the Arrow + format, enabling library users to embed custom data types in the + Arrow columnar format. Initial support is available in C++, Java, + and Python. +* A new Duration logical type was added to represent absolute lengths + of time. + +## Arrow Flight notes + +Flight now supports many of the features of a complete RPC +framework. 
+ +* Authentication APIs are now supported across all languages (ARROW-5137) +* Encrypted communication using OpenSSL is supported (ARROW-5643, + ARROW-5529) +* Clients can specify timeouts on remote calls (ARROW-5136) +* On the protocol level, endpoints are now identified with URIs, to + support an open-ended number of potential transports (including TLS + and Unix sockets, and perhaps even non-gRPC-based transports in the + future) (ARROW-4651) +* Application-defined metadata can be sent alongside data (ARROW-4626, + ARROW-4627). + +Windows is now a supported platform for Flight in C++ and Python +(ARROW-3294), and Python wheels are shipped for all languages +(ARROW-3150, ARROW-5656). C++, Python, and Java have been brought to +parity, now that actions can return streaming results in Java +(ARROW-5254). + +## C++ notes + +188 resolved issues related to the C++ implementation, so we summarize +some of the work here. + +### General platform improvements + +* A FileSystem abstraction (ARROW-767) has been added, which paves the + way for a future Arrow Datasets library allowing to access sharded + data on arbitrary storage systems, including remote or cloud + storage. A first draft of the Datasets API was committed in + ARROW-5512. Right now, this comes with no implementation, but we + expect to slowly build it up in the coming weeks or months. Early + feedback is welcome on this API. +* The dictionary API has been reworked in ARROW-3144. The dictionary + values used to be tied to the DictionaryType instance, which ended + up too inflexible. Since dictionary-encoding is more often an + optimization than a semantic property of the data, we decided to + move the dictionary values to the ArrayData structure, making it + natural for dictionary-encoded arrays to share the same DataType + instance, regardless of the encoding details. +* The FixedSizeList and Map types have been implemented, including in + integration tests. 
The Map type is akin to a List of Struct(key, + value) entries, but making it explicit that the underlying data has + key-value mapping semantics. Also, map entries are always non-null. +* A `Result<T>` class has been introduced in ARROW-4800. The aim is to + allow to return an error as w ell as a function's logical result + without resorting to pointer-out arguments. +* The Parquet C++ library has been refactored to use common Arrow IO + classes for improved C++ platform interoperability. + +### Line-delimited JSON reader + +A multithreaded line-delimited JSON reader (powered internally by +RapidJSON) is now available for use (also in Python and R via +bindings) . This will likely be expanded to support more kinds of JSON +storage in the future. + +### New computational kernels + +A number of new computational kernels have been developed + +* Compare filter for logical comparisons yielding boolean arrays +* Filter kernel for selecting elements of an input array according to + a boolean selection array. +* Take kernel, which selects elements by integer index, has been + expanded to support nested types + +## C# Notes + +The native C# implementation has continued to mature since 0.13. This +release includes a number of performance, memory use, and usability +improvements. + +## Go notes + +Go's support for the Arrow columnar format continues to expand. Go now +supports reading and writing the Arrow columnar binary protocol, and +it has also been **added to the cross language integration +tests**. There are now four languages (C++, Go, Java, and JavaScript) +included in our integration tests to verify cross-language +interoperability. + +## Java notes + +* Support for referencing arbitrary memory using `ArrowBuf` has been + implemented, paving the way for memory map support in Java +* A number of performance improvements around vector value access were + added (see ARROW-5264, ARROW-5290). 
+* The Map type has been implemented in Java and integration tested + with C++ +* Several microbenchmarks have been added and improved. Including a + significant speed-up of zeroing out buffers. +* A new algorithms package has been started to contain reference + implementations of common algorithms. The initial contribution is + for Array/Vector sorting. + +## JavaScript Notes + +A new incremental [array builder API][4] is available. + +## MATLAB Notes + +Version 0.14.0 features improved Feather file support in the MEX bindings. + +## Python notes + +* We fixed a problem with the Python wheels causing the Python wheels + to be much larger in 0.13.0 than they were in 0.12.0. Since the + introduction of LLVM into our build toolchain, the wheels are going + to still be significantly bigger. We are interested in approaches to + enable pyarrow to be installed in pieces with pip or conda rather + than monolithically. +* It is now possible to define ExtensionTypes with a Python + implementation (ARROW-840). Those ExtensionTypes can survive a + roundtrip through C++ and serialization. +* The Flight improvements highlighted above (see C++ notes) are all + available from Python. Furthermore, Flight is now bundled in our + binary wheels and conda packages for Linux, Windows and macOS + (ARROW-3150, ARROW-5656). +* We will build "manylinux2010" binary wheels for Linux systems, in + addition to "manylinux1" wheels (ARROW-2461). Manylinux2010 is a + newer standard for more recent systems, with less limiting toolchain + constraints. Installing manylinux2010 wheels requires an up-to-date + version of pip. +* Various bug fixes for CSV reading in Python and C++ including the + ability to parse Decimal(x, y) columns. 
+ +### Parquet improvements + +* Column statistics for logical types like unicode strings, unsigned + integers, and timestamps are casted to compatible Python types (see + ARROW-4139) +* It's now possible to configure "data page" sizes when writing a file + from Python + +## Ruby and C GLib notes + +The GLib and Ruby bindings have been tracking features in the C++ +project. This release includes bindings for Gandiva, JSON reader, and +other C++ features. + +## Rust notes + +There is ongoing work in Rust happening on Parquet file support, +computational kernels, and the DataFusion query engine. See the full +changelog for details. + +## R notes + +We have been working on build and packaging for R so that community +members can hopefully release the project to CRAN in the near +future. Feature development for R has continued to follow the upstream +C++ project. + +## Community Discussions Ongoing + +There are a number of active discussions ongoing on the developer +d...@arrow.apache.org mailing list. We look forward to hearing from the +community there: + +* [Timing and scope of 1.0.0 release][15] +* [Solutions to increase continuous integration capacity][13] +* [A proposal for versioning and forward/backward compatibility + guarantees for the 1.0.0 release][8] was shared, not much discussion has + occurred yet. 
+* [Addressing possible unaligned access and undefined behavior concerns][9] + in the Arrow binary protocol +* [Supporting smaller than 128-bit encoding of fixed width decimals][10] +* [Forking the Avro C++ implementation][11] so as to adapt it to Arrow's + needs +* [Sparse representation and compression in Arrow][12] +* [Flight extensions: middleware API and generalized Put operations][14] + +[1]: https://issues.apache.org/jira/issues/?jql=project%20%3D%20ARROW%20AND%20status%20%3D%20Resolved%20AND%20fixVersion%20%3D%200.13.0 +[2]: https://arrow.apache.org/release/0.14.0.html#contributors +[3]: https://arrow.apache.org/release/0.14.0.html +[4]: https://github.com/apache/arrow/tree/master/js/src/builder +[5]: https://github.com/nevi-me +[6]: https://github.com/fsaintjacques +[7]: https://github.com/praveenbingo +[8]: https://lists.apache.org/thread.html/5715a4d402c835d22d929a8069c5c0cf232077a660ee98639d544af8@%3Cdev.arrow.apache.org%3E +[9]: https://lists.apache.org/thread.html/8440be572c49b7b2ffb76b63e6d935ada9efd9c1c2021369b6d27786@%3Cdev.arrow.apache.org%3E +[10]: https://lists.apache.org/thread.html/31b00086c2991104bd71fb1a2173f32b4a2f569d8e7b5b41e836f3a3@%3Cdev.arrow.apache.org%3E +[11]: https://lists.apache.org/thread.html/97d78112ab583eecb155a7d78342c1063df65d64ec3ccfa0b18737c3@%3Cdev.arrow.apache.org%3E +[12]: https://lists.apache.org/thread.html/a99124e57c14c3c9ef9d98f3c80cfe1dd25496bf3ff7046778add937@%3Cdev.arrow.apache.org%3E +[13]: https://lists.apache.org/thread.html/96b2e22606e8a7b0ad7dc4aae16f232724d1059b34636676ed971d40@%3Cdev.arrow.apache.org%3E +[14]: https://lists.apache.org/thread.html/82a7c026ad18dbe9fdbcffa3560979aff6fd86dd56a49f40d9cfb46e@%3Cdev.arrow.apache.org%3E +[15]: https://lists.apache.org/thread.html/44a7a3d256ab5dbd62da6fe45b56951b435697426bf4adedb6520907@%3Cdev.arrow.apache.org%3E \ No newline at end of file diff --git a/site/_release/0.14.0.md b/site/_release/0.14.0.md index 8bf84c8..ed191d9 100644 --- a/site/_release/0.14.0.md +++ 
b/site/_release/0.14.0.md @@ -192,13 +192,13 @@ $ git shortlog -csn apache-arrow-0.13.0..apache-arrow-0.14.0 * [ARROW-3166](https://issues.apache.org/jira/browse/ARROW-3166) - [C++] Consolidate IO interfaces used in arrow/io and parquet-cpp * [ARROW-3191](https://issues.apache.org/jira/browse/ARROW-3191) - [Java] Add support for ArrowBuf to point to arbitrary memory. * [ARROW-3200](https://issues.apache.org/jira/browse/ARROW-3200) - [C++] Add support for reading Flight streams with dictionaries -* [ARROW-3290](https://issues.apache.org/jira/browse/ARROW-3290) - [C++] Toolchain support for secure gRPC +* [ARROW-3290](https://issues.apache.org/jira/browse/ARROW-3290) - [C++] Toolchain support for secure gRPC * [ARROW-3294](https://issues.apache.org/jira/browse/ARROW-3294) - [C++] Test Flight RPC on Windows / Appveyor * [ARROW-3314](https://issues.apache.org/jira/browse/ARROW-3314) - [R] Set -rpath using pkg-config when building * [ARROW-3419](https://issues.apache.org/jira/browse/ARROW-3419) - [C++] Run include-what-you-use checks as nightly build * [ARROW-3459](https://issues.apache.org/jira/browse/ARROW-3459) - [C++][Gandiva] Add support for variable length output vectors * [ARROW-3475](https://issues.apache.org/jira/browse/ARROW-3475) - [C++] Int64Builder.Finish(NumericArray<Int64Type>) -* [ARROW-3572](https://issues.apache.org/jira/browse/ARROW-3572) - [Packaging] Correctly handle ssh origin urls for crossbow +* [ARROW-3572](https://issues.apache.org/jira/browse/ARROW-3572) - [Packaging] Correctly handle ssh origin urls for crossbow * [ARROW-3671](https://issues.apache.org/jira/browse/ARROW-3671) - [Go] implement Interval array * [ARROW-3676](https://issues.apache.org/jira/browse/ARROW-3676) - [Go] implement Decimal128 array * [ARROW-3679](https://issues.apache.org/jira/browse/ARROW-3679) - [Go] implement IPC protocol @@ -213,7 +213,7 @@ $ git shortlog -csn apache-arrow-0.13.0..apache-arrow-0.14.0 * [ARROW-3791](https://issues.apache.org/jira/browse/ARROW-3791) - 
[C++] Add type inference for boolean values in CSV files * [ARROW-3794](https://issues.apache.org/jira/browse/ARROW-3794) - [R] Consider mapping INT8 to integer() not raw() * [ARROW-3804](https://issues.apache.org/jira/browse/ARROW-3804) - [R] Consider lowering required R runtime -* [ARROW-3810](https://issues.apache.org/jira/browse/ARROW-3810) - [R] type= argument for Array and ChunkedArray +* [ARROW-3810](https://issues.apache.org/jira/browse/ARROW-3810) - [R] type= argument for Array and ChunkedArray * [ARROW-3811](https://issues.apache.org/jira/browse/ARROW-3811) - [R] struct arrays inference * [ARROW-3814](https://issues.apache.org/jira/browse/ARROW-3814) - [R] RecordBatch$from\_arrays() * [ARROW-3815](https://issues.apache.org/jira/browse/ARROW-3815) - [R] refine record batch factory @@ -226,7 +226,7 @@ $ git shortlog -csn apache-arrow-0.13.0..apache-arrow-0.14.0 * [ARROW-4047](https://issues.apache.org/jira/browse/ARROW-4047) - [Python] Document use of int96 timestamps and options in Parquet docs * [ARROW-4086](https://issues.apache.org/jira/browse/ARROW-4086) - [Java] Add apis to debug alloc failures * [ARROW-4121](https://issues.apache.org/jira/browse/ARROW-4121) - [C++] Refactor memory allocation from InvertKernel -* [ARROW-4159](https://issues.apache.org/jira/browse/ARROW-4159) - [C++] Check for -Wdocumentation issues +* [ARROW-4159](https://issues.apache.org/jira/browse/ARROW-4159) - [C++] Check for -Wdocumentation issues * [ARROW-4194](https://issues.apache.org/jira/browse/ARROW-4194) - [Format] Metadata.rst does not specify timezone for Timestamp type * [ARROW-4302](https://issues.apache.org/jira/browse/ARROW-4302) - [C++] Add OpenSSL to C++ build toolchain * [ARROW-4337](https://issues.apache.org/jira/browse/ARROW-4337) - [C#] Array / RecordBatch Builder Fluent API @@ -246,7 +246,7 @@ $ git shortlog -csn apache-arrow-0.13.0..apache-arrow-0.14.0 * [ARROW-4627](https://issues.apache.org/jira/browse/ARROW-4627) - [Flight] Add application metadata field 
to DoPut * [ARROW-4701](https://issues.apache.org/jira/browse/ARROW-4701) - [C++] Add JSON chunker benchmarks * [ARROW-4702](https://issues.apache.org/jira/browse/ARROW-4702) - [C++] Upgrade dependency versions -* [ARROW-4708](https://issues.apache.org/jira/browse/ARROW-4708) - [C++] Add multithreaded JSON reader +* [ARROW-4708](https://issues.apache.org/jira/browse/ARROW-4708) - [C++] Add multithreaded JSON reader * [ARROW-4714](https://issues.apache.org/jira/browse/ARROW-4714) - [C++][Java] Providing JNI interface to Read ORC file via Arrow C++ * [ARROW-4717](https://issues.apache.org/jira/browse/ARROW-4717) - [C#] Consider exposing ValueTask instead of Task * [ARROW-4719](https://issues.apache.org/jira/browse/ARROW-4719) - [C#] Implement ChunkedArray, Column and Table in C# @@ -262,7 +262,7 @@ $ git shortlog -csn apache-arrow-0.13.0..apache-arrow-0.14.0 * [ARROW-4904](https://issues.apache.org/jira/browse/ARROW-4904) - [C++] Move implementations in arrow/ipc/test-common.h into libarrow\_testing * [ARROW-4911](https://issues.apache.org/jira/browse/ARROW-4911) - [R] Support for building package for Windows * [ARROW-4912](https://issues.apache.org/jira/browse/ARROW-4912) - [C++, Python] Allow specifying column names to CSV reader -* [ARROW-4913](https://issues.apache.org/jira/browse/ARROW-4913) - [Java][Memory] Limit number of ledgers and arrowbufs +* [ARROW-4913](https://issues.apache.org/jira/browse/ARROW-4913) - [Java][Memory] Limit number of ledgers and arrowbufs * [ARROW-4945](https://issues.apache.org/jira/browse/ARROW-4945) - [Flight] Enable Flight integration tests in Travis * [ARROW-4956](https://issues.apache.org/jira/browse/ARROW-4956) - [C#] Allow ArrowBuffers to wrap external Memory in C# * [ARROW-4959](https://issues.apache.org/jira/browse/ARROW-4959) - [Gandiva][Crossbow] Builds broken @@ -274,7 +274,7 @@ $ git shortlog -csn apache-arrow-0.13.0..apache-arrow-0.14.0 * [ARROW-4990](https://issues.apache.org/jira/browse/ARROW-4990) - [C++] Kernel to 
compare array with array * [ARROW-4993](https://issues.apache.org/jira/browse/ARROW-4993) - [C++] Display summary at the end of CMake configuration * [ARROW-5000](https://issues.apache.org/jira/browse/ARROW-5000) - [Python] Fix deprecation warning from setup.py -* [ARROW-5007](https://issues.apache.org/jira/browse/ARROW-5007) - [C++] Move DCHECK out of sse-utils +* [ARROW-5007](https://issues.apache.org/jira/browse/ARROW-5007) - [C++] Move DCHECK out of sse-utils * [ARROW-5020](https://issues.apache.org/jira/browse/ARROW-5020) - [C++][Gandiva] Split Gandiva-related conda packages for builds into separate .yml conda env file * [ARROW-5027](https://issues.apache.org/jira/browse/ARROW-5027) - [Python] Add JSON Reader * [ARROW-5038](https://issues.apache.org/jira/browse/ARROW-5038) - [Rust] [DataFusion] Implement AVG aggregate function @@ -282,7 +282,7 @@ $ git shortlog -csn apache-arrow-0.13.0..apache-arrow-0.14.0 * [ARROW-5045](https://issues.apache.org/jira/browse/ARROW-5045) - [Rust] Code coverage silently failing in CI * [ARROW-5053](https://issues.apache.org/jira/browse/ARROW-5053) - [Rust] [DataFusion] Use env var for location of arrow test data * [ARROW-5054](https://issues.apache.org/jira/browse/ARROW-5054) - [C++][Release] Test Flight in verify-release-candidate.sh -* [ARROW-5056](https://issues.apache.org/jira/browse/ARROW-5056) - [Packaging] Adjust conda recipes to use ORC conda-forge package on unix systems +* [ARROW-5056](https://issues.apache.org/jira/browse/ARROW-5056) - [Packaging] Adjust conda recipes to use ORC conda-forge package on unix systems * [ARROW-5061](https://issues.apache.org/jira/browse/ARROW-5061) - [Release] Improve 03-binary performance * [ARROW-5062](https://issues.apache.org/jira/browse/ARROW-5062) - [Java] Shade Java Guava dependency for Flight * [ARROW-5063](https://issues.apache.org/jira/browse/ARROW-5063) - [Java] FlightClient should not create a child allocator @@ -337,7 +337,7 @@ $ git shortlog -csn 
apache-arrow-0.13.0..apache-arrow-0.14.0 * [ARROW-5203](https://issues.apache.org/jira/browse/ARROW-5203) - [GLib] Add support for Compare filter * [ARROW-5204](https://issues.apache.org/jira/browse/ARROW-5204) - [C++] Improve BufferBuilder performance * [ARROW-5212](https://issues.apache.org/jira/browse/ARROW-5212) - [Go] Array BinaryBuilder in Go library has no access to resize the values buffer -* [ARROW-5218](https://issues.apache.org/jira/browse/ARROW-5218) - [C++] Improve build when third-party library locations are specified +* [ARROW-5218](https://issues.apache.org/jira/browse/ARROW-5218) - [C++] Improve build when third-party library locations are specified * [ARROW-5219](https://issues.apache.org/jira/browse/ARROW-5219) - [C++] Build protobuf\_ep in parallel when using Ninja * [ARROW-5222](https://issues.apache.org/jira/browse/ARROW-5222) - [Python] Issues with installing pyarrow for development on MacOS * [ARROW-5225](https://issues.apache.org/jira/browse/ARROW-5225) - [Java] Improve performance of BaseValueVector#getValidityBufferSizeFromCount @@ -362,7 +362,7 @@ $ git shortlog -csn apache-arrow-0.13.0..apache-arrow-0.14.0 * [ARROW-5288](https://issues.apache.org/jira/browse/ARROW-5288) - [Documentation] Enrich the contribution guidelines * [ARROW-5289](https://issues.apache.org/jira/browse/ARROW-5289) - [C++] Move arrow/util/concatenate.h to arrow/array/ * [ARROW-5290](https://issues.apache.org/jira/browse/ARROW-5290) - [Java] Provide a flag to enable/disable null-checking in vectors' get methods -* [ARROW-5291](https://issues.apache.org/jira/browse/ARROW-5291) - [Python] Add wrapper for "take" kernel on Array +* [ARROW-5291](https://issues.apache.org/jira/browse/ARROW-5291) - [Python] Add wrapper for "take" kernel on Array * [ARROW-5298](https://issues.apache.org/jira/browse/ARROW-5298) - [Rust] Add debug implementation for Buffer * [ARROW-5299](https://issues.apache.org/jira/browse/ARROW-5299) - [C++] ListArray comparison is incorrect * 
[ARROW-5309](https://issues.apache.org/jira/browse/ARROW-5309) - [Python] Add clarifications to Python "append" methods that return new objects @@ -441,7 +441,7 @@ $ git shortlog -csn apache-arrow-0.13.0..apache-arrow-0.14.0 * [ARROW-5488](https://issues.apache.org/jira/browse/ARROW-5488) - [R] Workaround when C++ lib not available * [ARROW-5490](https://issues.apache.org/jira/browse/ARROW-5490) - [C++] Remove ARROW\_BOOST\_HEADER\_ONLY * [ARROW-5491](https://issues.apache.org/jira/browse/ARROW-5491) - [C++] Remove unecessary semicolons following MACRO definitions -* [ARROW-5492](https://issues.apache.org/jira/browse/ARROW-5492) - [R] Add "col\_select" argument to read\_\* functions to read subset of columns +* [ARROW-5492](https://issues.apache.org/jira/browse/ARROW-5492) - [R] Add "col\_select" argument to read\_\* functions to read subset of columns * [ARROW-5495](https://issues.apache.org/jira/browse/ARROW-5495) - [C++] Use HTTPS consistently for downloading dependencies * [ARROW-5496](https://issues.apache.org/jira/browse/ARROW-5496) - [R][CI] Fix relative paths in R codecov.io reporting * [ARROW-5498](https://issues.apache.org/jira/browse/ARROW-5498) - [C++] Build failure with Flatbuffers 1.11.0 and MinGW @@ -453,7 +453,7 @@ $ git shortlog -csn apache-arrow-0.13.0..apache-arrow-0.14.0 * [ARROW-5512](https://issues.apache.org/jira/browse/ARROW-5512) - [C++] Draft initial public APIs for Datasets project * [ARROW-5513](https://issues.apache.org/jira/browse/ARROW-5513) - [Java] Refactor method name for getstartOffset to use camel case * [ARROW-5516](https://issues.apache.org/jira/browse/ARROW-5516) - [Python] Development page for pyarrow has a missing dependency in using pip -* [ARROW-5518](https://issues.apache.org/jira/browse/ARROW-5518) - [Java] Set VectorSchemaRoot rowCount to 0 on allocateNew and clear +* [ARROW-5518](https://issues.apache.org/jira/browse/ARROW-5518) - [Java] Set VectorSchemaRoot rowCount to 0 on allocateNew and clear * 
[ARROW-5524](https://issues.apache.org/jira/browse/ARROW-5524) - [C++] Turn off PARQUET\_BUILD\_ENCRYPTION in CMake if OpenSSL not found
 * [ARROW-5526](https://issues.apache.org/jira/browse/ARROW-5526) - [Developer] Add more prominent notice to GitHub issue template to direct bug reports to JIRA
 * [ARROW-5529](https://issues.apache.org/jira/browse/ARROW-5529) - [Flight] Allow serving with multiple TLS certificates
@@ -532,7 +532,7 @@ $ git shortlog -csn apache-arrow-0.13.0..apache-arrow-0.14.0
 * [ARROW-5718](https://issues.apache.org/jira/browse/ARROW-5718) - [R] auto splice data frames in record\_batch() and table()
 * [ARROW-5721](https://issues.apache.org/jira/browse/ARROW-5721) - [Rust] Move array related code into a separate module
 * [ARROW-5724](https://issues.apache.org/jira/browse/ARROW-5724) - [R] [CI] AppVeyor build should use ccache
-* [ARROW-5725](https://issues.apache.org/jira/browse/ARROW-5725) - [Crossbow] Port conda recipes to azure pipelines
+* [ARROW-5725](https://issues.apache.org/jira/browse/ARROW-5725) - [Crossbow] Port conda recipes to azure pipelines
 * [ARROW-5726](https://issues.apache.org/jira/browse/ARROW-5726) - [Java] Implement a common interface for int vectors
 * [ARROW-5727](https://issues.apache.org/jira/browse/ARROW-5727) - [Python] [CI] Install pytest-faulthandler before running tests
 * [ARROW-5748](https://issues.apache.org/jira/browse/ARROW-5748) - [Packaging][deb] Add support for Debian GNU/Linux buster
@@ -570,7 +570,7 @@ $ git shortlog -csn apache-arrow-0.13.0..apache-arrow-0.14.0
 * [ARROW-2461](https://issues.apache.org/jira/browse/ARROW-2461) - [Python] Build wheels for manylinux2010 tag
 * [ARROW-3344](https://issues.apache.org/jira/browse/ARROW-3344) - [Python] test\_plasma.py fails (in test\_plasma\_list)
 * [ARROW-3399](https://issues.apache.org/jira/browse/ARROW-3399) - [Python] Cannot serialize numpy matrix object
-* [ARROW-3650](https://issues.apache.org/jira/browse/ARROW-3650) - [Python] Mixed column indexes are read back as strings
+* [ARROW-3650](https://issues.apache.org/jira/browse/ARROW-3650) - [Python] Mixed column indexes are read back as strings
 * [ARROW-3762](https://issues.apache.org/jira/browse/ARROW-3762) - [C++] Parquet arrow::Table reads error when overflowing capacity of BinaryArray
 * [ARROW-4021](https://issues.apache.org/jira/browse/ARROW-4021) - [Ruby] Error building red-arrow on msys2
 * [ARROW-4076](https://issues.apache.org/jira/browse/ARROW-4076) - [Python] schema validation and filters
@@ -592,7 +592,7 @@ $ git shortlog -csn apache-arrow-0.13.0..apache-arrow-0.14.0
 * [ARROW-4885](https://issues.apache.org/jira/browse/ARROW-4885) - [Python] read\_csv() can't handle decimal128 columns
 * [ARROW-4886](https://issues.apache.org/jira/browse/ARROW-4886) - [Rust] Inconsistent behaviour with casting sliced primitive array to list array
 * [ARROW-4923](https://issues.apache.org/jira/browse/ARROW-4923) - Expose setters for Decimal vector that take long and double inputs
-* [ARROW-4934](https://issues.apache.org/jira/browse/ARROW-4934) - [Python] Address deprecation notice that will be a bug in Python 3.8
+* [ARROW-4934](https://issues.apache.org/jira/browse/ARROW-4934) - [Python] Address deprecation notice that will be a bug in Python 3.8
 * [ARROW-5019](https://issues.apache.org/jira/browse/ARROW-5019) - [C#] ArrowStreamWriter doesn't work on a non-seekable stream
 * [ARROW-5049](https://issues.apache.org/jira/browse/ARROW-5049) - [Python] org/apache/hadoop/fs/FileSystem class not found when pyarrow FileSystem used in spark
 * [ARROW-5051](https://issues.apache.org/jira/browse/ARROW-5051) - [GLib][Gandiva] Test failure in release verification script
@@ -614,7 +614,7 @@ $ git shortlog -csn apache-arrow-0.13.0..apache-arrow-0.14.0
 * [ARROW-5142](https://issues.apache.org/jira/browse/ARROW-5142) - [CI] Fix conda calls in AppVeyor scripts
 * [ARROW-5144](https://issues.apache.org/jira/browse/ARROW-5144) - [Python] ParquetDataset and ParquetPiece not serializable
 * [ARROW-5146](https://issues.apache.org/jira/browse/ARROW-5146) - [Dev] Merge script imposes directory name
-* [ARROW-5147](https://issues.apache.org/jira/browse/ARROW-5147) - [C++] get an error in building: Could NOT find DoubleConversion
+* [ARROW-5147](https://issues.apache.org/jira/browse/ARROW-5147) - [C++] get an error in building: Could NOT find DoubleConversion
 * [ARROW-5148](https://issues.apache.org/jira/browse/ARROW-5148) - [CI] [C++] LLVM-related compile errors
 * [ARROW-5149](https://issues.apache.org/jira/browse/ARROW-5149) - [Packaging][Wheel] Pin LLVM to version 7 in windows builds
 * [ARROW-5152](https://issues.apache.org/jira/browse/ARROW-5152) - [Python] CMake warnings when building
@@ -622,7 +622,7 @@ $ git shortlog -csn apache-arrow-0.13.0..apache-arrow-0.14.0
 * [ARROW-5160](https://issues.apache.org/jira/browse/ARROW-5160) - [C++] ABORT\_NOT\_OK evalutes expression twice
 * [ARROW-5166](https://issues.apache.org/jira/browse/ARROW-5166) - [Python][Parquet] Statistics for uint64 columns may overflow
 * [ARROW-5167](https://issues.apache.org/jira/browse/ARROW-5167) - [C++] Upgrade string-view-light to latest
-* [ARROW-5169](https://issues.apache.org/jira/browse/ARROW-5169) - [Python] non-nullable fields are converted to nullable in {{Table.from\_pandas}}
+* [ARROW-5169](https://issues.apache.org/jira/browse/ARROW-5169) - [Python] non-nullable fields are converted to nullable in Table.from\_pandas
 * [ARROW-5173](https://issues.apache.org/jira/browse/ARROW-5173) - [Go] handle multiple concatenated streams back-to-back
 * [ARROW-5174](https://issues.apache.org/jira/browse/ARROW-5174) - [Go] implement Stringer for DataTypes
 * [ARROW-5177](https://issues.apache.org/jira/browse/ARROW-5177) - [Python] ParquetReader.read\_column() doesn't check bounds
@@ -655,13 +655,13 @@ $ git shortlog -csn apache-arrow-0.13.0..apache-arrow-0.14.0
 * [ARROW-5301](https://issues.apache.org/jira/browse/ARROW-5301) - [Python] parquet documentation outdated on nthreads argument
 * [ARROW-5306](https://issues.apache.org/jira/browse/ARROW-5306) - [CI] [GLib] Disable GTK-Doc
 * [ARROW-5308](https://issues.apache.org/jira/browse/ARROW-5308) - [Go] remove deprecated Feather format
-* [ARROW-5314](https://issues.apache.org/jira/browse/ARROW-5314) - [Go] Incorrect Printing for String Arrays with Offsets
+* [ARROW-5314](https://issues.apache.org/jira/browse/ARROW-5314) - [Go] Incorrect Printing for String Arrays with Offsets
 * [ARROW-5325](https://issues.apache.org/jira/browse/ARROW-5325) - [Archery][Benchmark] Output properly formatted jsonlines from benchmark diff cli command
 * [ARROW-5330](https://issues.apache.org/jira/browse/ARROW-5330) - [Python] [CI] Run Python Flight tests on Travis-CI
 * [ARROW-5332](https://issues.apache.org/jira/browse/ARROW-5332) - [R] R package fails to build/install: error in dyn.load()
 * [ARROW-5348](https://issues.apache.org/jira/browse/ARROW-5348) - [CI] [Java] Gandiva checkstyle failure
 * [ARROW-5360](https://issues.apache.org/jira/browse/ARROW-5360) - [Rust] Builds are broken by rustyline on nightly 2019-05-16+
-* [ARROW-5362](https://issues.apache.org/jira/browse/ARROW-5362) - [C++] Compression round trip test can cause some sanitizers to to fail
+* [ARROW-5362](https://issues.apache.org/jira/browse/ARROW-5362) - [C++] Compression round trip test can cause some sanitizers to to fail
 * [ARROW-5371](https://issues.apache.org/jira/browse/ARROW-5371) - [Release] Add tests for dev/release/00-prepare.sh
 * [ARROW-5373](https://issues.apache.org/jira/browse/ARROW-5373) - [Java] Add missing details for Gandiva Java Build
 * [ARROW-5376](https://issues.apache.org/jira/browse/ARROW-5376) - [C++] Compile failure on gcc 5.4.0
@@ -669,7 +669,7 @@ $ git shortlog -csn apache-arrow-0.13.0..apache-arrow-0.14.0
 * [ARROW-5387](https://issues.apache.org/jira/browse/ARROW-5387) - [Go] properly handle sub-slice of List
 * [ARROW-5388](https://issues.apache.org/jira/browse/ARROW-5388) - [Go] use arrow.TypeEqual in array.NewChunked
 * [ARROW-5390](https://issues.apache.org/jira/browse/ARROW-5390) - [CI] Job time limit exceeded on Travis
-* [ARROW-5397](https://issues.apache.org/jira/browse/ARROW-5397) - Test Flight TLS support
+* [ARROW-5397](https://issues.apache.org/jira/browse/ARROW-5397) - Test Flight TLS support
 * [ARROW-5398](https://issues.apache.org/jira/browse/ARROW-5398) - [Python] Flight tests broken by URI changes
 * [ARROW-5403](https://issues.apache.org/jira/browse/ARROW-5403) - [C++] Test failures not propagated in Windows shared builds
 * [ARROW-5411](https://issues.apache.org/jira/browse/ARROW-5411) - [C++][Python] Build error building on Mac OS Mojave