Messages by Thread
-
Re: [I] Support async CustomOpen in FileSource [arrow]
via GitHub
-
Re: [I] [R][CI] Re-enable centos binary test job in r-binary-packages [arrow]
via GitHub
-
Re: [I] [C++] Allow non destructive finalize method on aggregation kernels [arrow]
via GitHub
-
Re: [I] [R] Support joining using `NA` as join key [arrow]
via GitHub
-
Re: [I] [C++][FlightRPC] Memory tracking for arrow flight over grpc [arrow]
via GitHub
-
Re: [I] [C++] Concatenating a single array is a compaction utility [arrow]
via GitHub
-
Re: [I] [MATLAB] Consider renaming `field()`, `column()` and `chunk()` methods to `getField()`, `getColumn()`, and `getChunk()` [arrow]
via GitHub
-
Re: [I] [C++] S3 stress tests can fail to delete a temporary directory [arrow]
via GitHub
-
Re: [I] [C++] Add sorting and hashing fast paths for string view [arrow]
via GitHub
-
Re: [I] [C++][Compute] Checked arithmetic functions are slow-ish [arrow]
via GitHub
-
Re: [I] [C++][Parquet] Dataset: Parquet Writer Supports user memory-pool [arrow]
via GitHub
-
Re: [I] [C++] Add software implementation of PDEP and reenable BMI2 code paths on all AVX2 CPUs [arrow]
via GitHub
-
Re: [I] [Docs][FlightRPC] Update documentation to note that `FlightInfo::schema` may be null [arrow]
via GitHub
-
Re: [I] [C++][Parquet] Using BMI to implement filter pushdown [arrow]
via GitHub
-
Re: [I] [R] Add Way to Track and Expose User Defined Functions [arrow]
via GitHub
-
Re: [I] [FlightSQL] Support `DoExchange` (in addition to `DoPut`) to bind parameters and execute prepared statements [arrow]
via GitHub
-
Re: [I] [Docs][FlightSQL] Clarify some questions about PollFlightInfo [arrow]
via GitHub
-
Re: [I] [C++] Add support for system static RE2 [arrow]
via GitHub
-
Re: [I] [C++][Parquet] add statistics for better estimating unencoded/uncompressed sizes and finer grained filtering [arrow]
via GitHub
-
Re: [I] [Python] Add documentation for Dataset FileWriteOptions classes [arrow]
via GitHub
-
Re: [I] [C++] Reduce template parameters for aggregate functions [arrow]
via GitHub
-
Re: [I] [C++][Flight] Improve error message when port cannot be bound with gRPC [arrow]
via GitHub
-
Re: [I] Add more boolean string maps to cpp/src/arrow/csv/options.cc [arrow]
via GitHub
-
Re: [I] [Python] Define a Dataset protocol based on Substrait and C Data Interface [arrow]
via GitHub
-
Re: [I] [C++] Improve adjoin_as_list for struct types [arrow]
via GitHub
-
Re: [I] [C++][Integration] Install executables for integration test and use it [arrow]
via GitHub
-
Re: [I] [Integration] Refactor datagen.py [arrow]
via GitHub
-
Re: [I] [C++] Stop plan early to improve hashjoin performance. [arrow]
via GitHub
-
Re: [I] Add more NULL mappings to arrow/cpp/src/arrow/csv/options.cc [arrow]
via GitHub
-
Re: [I] [Python] Expose additional Cython wrap/unwrap helpers [arrow]
via GitHub
-
Re: [I] [C++] Support Nested Loop Join node. [arrow]
via GitHub
-
Re: [I] [C++] clang-format result may be invalid for cpplint.py [arrow]
via GitHub
-
Re: [I] [Python] Expose StreamDecoder to pyarrow python API [arrow]
via GitHub
-
Re: [I] [Packaging][Release] Use Debian/RPM type Artifactory repositories instead of General type Artifactory repository [arrow]
via GitHub
-
Re: [I] [C++][FlightRPC][Python] Unit-test and expose TransportStatusDetail [arrow]
via GitHub
-
Re: [I] [C++][FlightRPC] Implement async versions of other metadata methods [arrow]
via GitHub
-
Re: [I] [C++] Implement REE support in ArrayFromJSON [arrow]
via GitHub
-
Re: [I] [C++] Support IS DISTINCT and IS NOT DISTINCT expression. [arrow]
via GitHub
-
Re: [I] [C++][Python] pyarrow.ChunkedArray.combine_chunks is slow [arrow]
via GitHub
-
Re: [I] [Python][CI] Enable warnings as errors in pytests for CI jobs [arrow]
via GitHub
-
Re: [I] [CI] Curate Crossbow nightly jobs [arrow]
via GitHub
-
Re: [I] [C++] IO: Can we support extra IO tag in RandomAccessFile? [arrow]
via GitHub
-
Re: [I] [Python] In pyarrow len(ListScalar()) may have performance issues [arrow]
via GitHub
-
Re: [I] [Doc] Enhancement the document for dataset and s3 [arrow]
via GitHub
-
Re: [I] [C++][Python] Add `CastOptions` to `csv.ConvertOptions` for usage in `read_csv`. [arrow]
via GitHub
-
Re: [I] [C++] Util: Compression supports a Compression/Decompression Context [arrow]
via GitHub
-
Re: [I] [C++] Add concrete ArraySpan classes [arrow]
via GitHub
-
Re: [I] [MATLAB] Vectorize the `field` method of `arrow.tabular.Schema` [arrow]
via GitHub
-
Re: [I] [R][C++] Add ability to trim whitespace to CSV reading options [arrow]
via GitHub
-
Re: [I] Missing kernels for ordering with struct types [arrow]
via GitHub
-
Re: [I] [Python] Provide pybind11 type casters [arrow]
via GitHub
-
Re: [I] [Format] Add wording for alternative layouts [arrow]
via GitHub
-
Re: [I] [R] Explicitly enumerate the `ParquetReaderProperties` and `ParquetArrowReaderProperties` arguments in `write_parquet()` [arrow]
via GitHub
-
Re: [I] [Python][Skyhook] pyarrow library not include the "SkyhookFileFormat" function [arrow]
via GitHub
-
Re: [I] Point empty buffers to kNonNullFiller [arrow]
via GitHub
-
Re: [I] [C++] Refactor scan_node to introduce support for scan tasks [arrow]
via GitHub
-
Re: [I] [C++][SIMD] Avoid one-definition-rule violation of `arrow::internal::BitmapWriter` without depending on `-O2` [arrow]
via GitHub
-
Re: [I] [C++] Support parquet field-id resolution in the substrait consumer [arrow]
via GitHub
-
Re: [I] [C++] [Acero] Enhance aggregate kernel API's for intermediate state [arrow]
via GitHub
-
Re: [I] [R] Any support for rolling windows functions? [arrow]
via GitHub
-
Re: [I] [C++][Parquet] Add an asynchronous version of ReadRowGroup/ReadRowGroups [arrow]
via GitHub
-
Re: [I] [C++][Python] Remove integer compatibility for trunc, floor and ceil [arrow]
via GitHub
-
Re: [I] [Python] Instantiate `pa.Table` from a `Generator`/`Iterator` [arrow]
via GitHub
-
Re: [I] [R] Create a wrapper function around Dataset's `$files` method [arrow]
via GitHub
-
Re: [I] [C++] Provide way for extension array to provide its own value pretty printer [arrow]
via GitHub
-
Re: [I] [R] Update `write_csv_arrow()` implementation to match `readr::write_csv()` [arrow]
via GitHub
-
Re: [I] [C++] Clean up build warning for BaseBinaryScalar [arrow]
via GitHub
-
Re: [I] [Python] Chained projections [arrow]
via GitHub
-
Re: [I] [C++] Dictionary sorting not implemented on chunked array and table [arrow]
via GitHub
-
Re: [I] [Parquet][C++] Support month_day_nano_interval type in Parquet [arrow]
via GitHub
-
Re: [I] [CI][Python] Disable Dataset in "minimal" builds [arrow]
via GitHub
-
Re: [I] [Docs] Create a "New Committer Guide" [arrow]
via GitHub
-
Re: [I] [C++][Dataset] Should we make `ParquetFileWriter::Write` using `WriteRecordBatch`? [arrow]
via GitHub
-
Re: [I] [C++] Dictionary support for ordering-based compute functions. [arrow]
via GitHub
-
Re: [I] [C++] Support reading Hadoop-snappy File Format Directly [arrow]
via GitHub
-
Re: [I] [Python] Add rename_columns to DataSet [arrow]
via GitHub
-
Re: [I] [Dev] Disable review label bot on closed/merged PRs [arrow]
via GitHub
-
Re: [I] [C++] Convert `SingularOrList` to `is_in` when possible [arrow]
via GitHub
-
Re: [I] [R] Bump the default format version from 2.4 -> 2.6 [arrow]
via GitHub
-
Re: [I] [C++][Parquet] Support `num_distinct_value` in Statistics [arrow]
via GitHub
-
Re: [I] [Dev] Curate CODEOWNERS [arrow]
via GitHub
-
Re: [I] [Ruby] Add Arrow::RunEndEncodedArray#{raw_records,values} [arrow]
via GitHub
-
Re: [I] [C++][Python] Add an option to specify encoding in WriteOptions (CSV Writer) [arrow]
via GitHub
-
Re: [I] [C++][Python] Wrap / Unwrap Functions for Other PyArrow Objects [arrow]
via GitHub
-
Re: [I] [Python] pyarrow.compute.Expression is in datasets API instead of compute API [arrow]
via GitHub
-
Re: [I] [C++] Add arrow::compute::Expression::Equals option that does not compare kernels [arrow]
via GitHub
-
Re: [I] [Python] Add missing APIs to RecordBatch class [arrow]
via GitHub
-
Re: [I] [Python] Add missing APIs to Table class [arrow]
via GitHub
-
Re: [I] [R] Documentation for threading could be improved [arrow]
via GitHub
-
Re: [I] [Python][Parquet] Allow `write_batch` to directly write batch [arrow]
via GitHub
-
Re: [I] [Python] Add `exclude_invalid_files` to `ParquetDataset` [arrow]
via GitHub
-
Re: [I] [C++] Support Parquet BloomFilter in dataset [arrow]
via GitHub
-
Re: [I] [CI] Refactor docker push steps into a reusable action [arrow]
via GitHub
-
Re: [I] [C++] Fix and re-enable Asof Join Backpressure test flakiness [arrow]
via GitHub
-
Re: [I] [C++] Investigate scalar.h usage and reduce cost of function.h include [arrow]
via GitHub
-
Re: [I] [Release] Add performance comparison between Release Candidate and previous release [arrow]
via GitHub
-
Re: [I] [CI][C++] Should we run IWYU check as CI [arrow]
via GitHub
-
Re: [I] [C++] Improve future as-of-join algorithmic complexity [arrow]
via GitHub
-
Re: [I] [C++][Parquet] Allow Truncate min-max Statistics [arrow]
via GitHub
-
Re: [I] [R] Improve test coverage of dataset option plumbing [arrow]
via GitHub
-
Re: [I] [C++] Check for duplicate buffers in IPC writer [arrow]
via GitHub
-
Re: [I] Memory leak using pyarrow.Table and to_pandas method [arrow]
via GitHub
-
Re: [I] [PYTHON] Add `nan_count` to `RowGroupMetaData` [arrow]
via GitHub
-
Re: [I] Preserve unit from Pandas timestamp when creating arrow scalar [arrow]
via GitHub
-
Re: [I] [Release] Improve post-11-bump-versions.sh to avoid possible stray commits to reaching main [arrow]
via GitHub
-
Re: [I] [C++] Support linking to libcurl statically [arrow]
via GitHub
-
Re: [I] [Ruby][Parquet] Implement the bindings of parquet::ReaderProperties [arrow]
via GitHub
-
Re: [I] [R][C++] Calling bucket$ls on GCS bucket without `recursive = TRUE` doesn't list full contents [arrow]
via GitHub
-
Re: [I] [C++] [Parquet] StreamWriter should support writing nanosecond timestamps [arrow]
via GitHub
-
Re: [I] [Packaging][Release] Reduce disk requirements for linux packaging jobs [arrow]
via GitHub
-
Re: [I] [C++] Add run-end-encoding benchmark [arrow]
via GitHub
-
Re: [I] [Release][CI] Improve versions used on jobs executed on maintenance branch [arrow]
via GitHub
-
Re: [I] [C++][CI] Enable CMAKE_UNITY_BUILD on more debug builds [arrow]
via GitHub
-
Re: [I] [CI] Refactor ccache steps into reusable action [arrow]
via GitHub
-
Re: [I] [R] Consider using the `patrick` package to reduce duplicate test cases [arrow]
via GitHub
-
Re: [I] [C++][Compute] Function 'utf8_trim' has no kernel matching input types (dictionary<values=string, indices=int64, ordered=0>) [arrow]
via GitHub
-
Re: [I] [R] Developer setup guides need more context on SSL versions [arrow]
via GitHub
-
[PR] WIP: [R] Verify CRAN release 23.0.1 [arrow]
via GitHub
-
Re: [I] [Python] Support pandas dtype_backend=pyarrow in to_pandas [arrow]
via GitHub
-
Re: [I] [R][C++] Support upcasting NULL columns in Dataset CSV reader [arrow]
via GitHub
-
Re: [I] [C++] MemoryPool::Allocate returns an error when the user provides small values for alignment [arrow]
via GitHub
-
Re: [I] [C++] Can InputType support nested types in the future? [arrow]
via GitHub
-
Re: [I] [C++] Enable alignment checks in UBSAN [arrow]
via GitHub
-
Re: [I] [R] Implement binding for cut() [arrow]
via GitHub
-
Re: [I] [C++][Parquet] Should we support PARQUET_2_8 version? [arrow]
via GitHub
-
Re: [I] [Python] Add from_numpy_ndarray and to_numpy_ndarray to ListArray types [arrow]
via GitHub
-
Re: [I] [Python] Convert multidimensional arrays into FixedShapeTensorArrays automatically in `pyarrow.array` [arrow]
via GitHub
-
Re: [I] [Python][Documentation] Clarify language around use_deprecated_int96_timestamps [arrow]
via GitHub
-
Re: [I] [MATLAB] Add test verifying `arrow.array.<Type>Array` constructors accept all supported dimensions: `Nx1`, `1xM`, `1x0`, `0x1`, `0x0` [arrow]
via GitHub
-
Re: [I] [R] Expose AWS SDK retry strategies to s3_bucket/S3FileSystem [arrow]
via GitHub
-
Re: [I] [Python] Fixed size lists of numeric types without nulls could be converted to numpy with zero-copy [arrow]
via GitHub
-
Re: [I] [Java] JDBC Flight Stream Result Set asynchronous VectorSchemaRoot Producer [arrow]
via GitHub
-
Re: [I] [C++] Allow safe cast from int64 to float64 in compute kernel for numerics that can be accurately represented above 2^53 [arrow]
via GitHub
-
Re: [I] [R] Implement summary.ArrowTabular/Dataset [arrow]
via GitHub
-
Re: [I] [R] Split out the docs for our R6 objects from the user-facing APIs [arrow]
via GitHub
-
Re: [I] [C#] Make BitUtility class internal [arrow]
via GitHub
-
Re: [I] [C++] Refactor debug-logging facilities in `AsofJoinNode` [arrow]
via GitHub
-
Re: [I] [GLib] Add GArrowFixedShapeTensorArray [arrow]
via GitHub
-
Re: [I] [MATLAB] Use Arrow::arrow_shared instead of arrow_shared for ExternalProject_Add()-ed Apache Arrow C++ [arrow]
via GitHub
-
Re: [I] [R][C++] Upcasting from int32 to int64 when joining two tables [arrow]
via GitHub
-
Re: [I] [Python] Filter on `__row_index` [arrow]
via GitHub
-
Re: [I] [C++] OrderBy with spillover [arrow]
via GitHub
-
Re: [I] [Python] Pyarrow table conversion from pandas fails for categorical fields with arrow dtypes [arrow]
via GitHub
-
Re: [I] [Docs][C++] Add warning to the "Building on Windows" documentation for the Arrow C++ libraries about potential MSVC Runtime compatibility issues with `Debug` builds [arrow]
via GitHub
-
Re: [I] [MATLAB] Consider lowering the minimum CMake version requirement for building the MATLAB interface [arrow]
via GitHub
-
Re: [I] [CI][C++] Add clang-tidy and IWYU C++ CI jobs [arrow]
via GitHub
-
Re: [I] [C++] Simplify "FileFormat options" vs. "Fragment scan/write options" [arrow]
via GitHub
-
Re: [I] Add a type alias for `pa.dictionary(pa.int32(), pa.string())` [arrow]
via GitHub
-
Re: [I] [Python] `flight.Location(Location)` should be a no-op [arrow]
via GitHub
-
Re: [I] [C++] Add versions of IsNull/IsValid that take an ArrowType tparam so implementation can be statically dispatched [arrow]
via GitHub
-
Re: [I] [R][C++] Let the JSON reader accept a document which is an array at top level in addition to line delimited JSON [arrow]
via GitHub
-
Re: [I] [R] write_parquet expose similar options as python parquet.write_table ? [arrow]
via GitHub
-
Re: [I] [C++][Dev] JNI code is not linted/formatted [arrow]
via GitHub
-
Re: [I] [Python] Allow `pyarrow.compute.cast` to coerce errors to null values [arrow]
via GitHub
-
Re: [I] [R] Improve interface for working with schemas [arrow]
via GitHub
-
Re: [I] [C++] Non-deterministic FetchNode [arrow]
via GitHub
-
Re: [I] Relax pyarrow.compute.is_in type requirement [arrow]
via GitHub
-
Re: [I] [C++] Should all vendored libraries be in private namespaces? [arrow]
via GitHub
-
Re: [I] [R][Doc] Remove discussion of Scanner from vignettes [arrow]
via GitHub
-
Re: [I] [Python] Bindings for FixedShapeTensorType.FromTensor/ToTensor and FixedShapeTensorArray.strides [arrow]
via GitHub
-
Re: [I] [Python] Provide a way to restore a schema from its string representation [arrow]
via GitHub
-
Re: [I] [C++][Python] Allow utf8_slice_codeunits to support default start value of None to support strings of different length [arrow]
via GitHub
-
Re: [I] [C++] Mechanism for throttling remote filesystems to avoid rate limiting [arrow]
via GitHub
-
Re: [I] [Python][Docs] Update/rearrange Data Types section and add FixedShapeTensorType [arrow]
via GitHub
-
Re: [I] [R] Improve configure script for contributor experience [arrow]
via GitHub
-
Re: [I] [R] Add an argument to `open_csv_dataset()` to repair duplicated column names or ignore them? [arrow]
via GitHub
-
Re: [I] [C++]: Support tail in FetchNode [arrow]
via GitHub
-
Re: [I] [C++] utf8_slice_codeunits doesn't support stop/step array type [arrow]
via GitHub
-
Re: [I] [R] What should compute look like in an R minimal build? [arrow]
via GitHub
-
Re: [I] [C++][Gandiva] Support function for average [arrow]
via GitHub
-
Re: [I] [C++][Gandiva] Support function for round with rounding modes [arrow]
via GitHub
-
Re: [I] [C++][Gandiva] Support function for current timestamp [arrow]
via GitHub
-
Re: [I] [C++] Expand test coverage for FieldPath and FieldRef Get/Find methods [arrow]
via GitHub
-
Re: [I] [R] Modernize error handling [arrow]
via GitHub
-
Re: [I] [C++][Python] Parity between `Dataset`, `Table`, and `Scanner` methods which load data: `sort_by`, `join`. [arrow]
via GitHub
-
Re: [I] [C++] Simplified header/inclusion in Acero substrait consumer introduced by segmented aggregation [arrow]
via GitHub
-
Re: [I] [R] write_* methods can't have socketConnection as a sink [arrow]
via GitHub
-
Re: [I] [R] Create feature to read in specific nested columns in Newline-delimited JSON file [arrow]
via GitHub
-
Re: [I] Cast kernel between int32, date and string for partition columns [arrow]
via GitHub
-
Re: [I] [CI] Evaluate new GHA backend for sccache [arrow]
via GitHub
-
Re: [I] [C++] Utilities to estimate average (de)serialized row size [arrow]
via GitHub
-
Re: [I] [C++][Parquet] Decoder: support more DecodeArrow with nulls for `DeltaBitPack` and other decoder [arrow]
via GitHub
-
Re: [I] Support separate null_values per column in pyarrow.csv.ConvertOptions [arrow]
via GitHub
-
Re: [I] [C++][Gandiva] Support functions for average, current timestamp, round with rounding modes [arrow]
via GitHub
-
Re: [I] [C++] Add ToStringExtra to scan2 node. [arrow]
via GitHub
-
Re: [I] [R][CI] Bump the R versions we test to include 4.3 [arrow]
via GitHub
-
Re: [I] DOC: to_pylist returns a pandas.Timestamp instead of datetime.datetime when the type is timestamp[ns] [arrow]
via GitHub
-
Re: [I] [C++][FlightRPC] Flight SQL: make it easier for servers to handle unrecognized messages [arrow]
via GitHub