Re: [PR] Cleanup CSV WriterBuilder, Default to AutoSI Second Precision (#4735) [arrow-rs]

2023-10-10 Thread via GitHub
tustvold merged PR #4909: URL: https://github.com/apache/arrow-rs/pull/4909 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@arrow.apa

Re: [PR] feat: expose csvbuilder has_headers [arrow-rs]

2023-10-10 Thread via GitHub
tustvold closed pull request #4906: feat: expose csvbuilder has_headers URL: https://github.com/apache/arrow-rs/pull/4906

Re: [I] Add read access to settings in `csv::WriterBuilder` [arrow-rs]

2023-10-10 Thread via GitHub
tustvold closed issue #4735: Add read access to settings in `csv::WriterBuilder` URL: https://github.com/apache/arrow-rs/issues/4735

Re: [PR] Cleanup `object_store::retry` client error handling [arrow-rs]

2023-10-10 Thread via GitHub
tustvold merged PR #4915: URL: https://github.com/apache/arrow-rs/pull/4915

Re: [PR] feat: add method for async read bloom filter [arrow-rs]

2023-10-10 Thread via GitHub
tustvold commented on code in PR #4917: URL: https://github.com/apache/arrow-rs/pull/4917#discussion_r1354276366 ## parquet/src/arrow/async_reader/mod.rs: ## @@ -302,6 +301,46 @@ impl ParquetRecordBatchStreamBuilder { Self::new_builder(AsyncReader(input), metadata)

Re: [PR] Minor: Clarify rationale for `FlightDataEncoder` API, add examples [arrow-rs]

2023-10-10 Thread via GitHub
tustvold merged PR #4916: URL: https://github.com/apache/arrow-rs/pull/4916

Re: [PR] DataSink Dynamic Execution Time Demux [arrow-datafusion]

2023-10-10 Thread via GitHub
metesynnada commented on PR #7791: URL: https://github.com/apache/arrow-datafusion/pull/7791#issuecomment-1756962477 I will review this PR at my earliest convenience. Your explanations have been very helpful, making the review process much smoother. Thank you!

Re: [I] SQL Filter casting problem [arrow-datafusion]

2023-10-10 Thread via GitHub
metesynnada commented on issue #7601: URL: https://github.com/apache/arrow-datafusion/issues/7601#issuecomment-1756950433 I'm unable to reproduce the issue either. This code appears to be functioning as expected. Consequently, I'm closing the issue for now, as I'm uncertain about the natur

Re: [I] SQL Filter casting problem [arrow-datafusion]

2023-10-10 Thread via GitHub
metesynnada closed issue #7601: SQL Filter casting problem URL: https://github.com/apache/arrow-datafusion/issues/7601

Re: [I] SQL Filter casting problem [arrow-datafusion]

2023-10-10 Thread via GitHub
metesynnada commented on issue #7601: URL: https://github.com/apache/arrow-datafusion/issues/7601#issuecomment-1756931728 I am on it.

Re: [PR] GH-38043: [R] Enable cloud support by default [arrow]

2023-10-10 Thread via GitHub
github-actions[bot] commented on PR #38195: URL: https://github.com/apache/arrow/pull/38195#issuecomment-1756861600 Revision: 9d4018473900bf9127c6f52fbd18ab2b0d47bf35 Submitted crossbow builds: [ursacomputing/crossbow @ actions-3b4a613597](https://github.com/ursacomputing/crossbow/bra

[PR] fix: preserve column qualifier for `DataFrame::with_column` [arrow-datafusion]

2023-10-10 Thread via GitHub
jonahgao opened a new pull request, #7792: URL: https://github.com/apache/arrow-datafusion/pull/7792 ## Which issue does this PR close? Closes #7790. ## Rationale for this change The join operation produced two columns with identical names, but they belong to dif

Re: [PR] GH-38043: [R] Enable cloud support by default [arrow]

2023-10-10 Thread via GitHub
assignUser commented on PR #38195: URL: https://github.com/apache/arrow/pull/38195#issuecomment-1756846285 @github-actions crossbow submit r-binary-packages

Re: [PR] GH-37753: [C++][Gandiva] Add external function registry support [arrow]

2023-10-10 Thread via GitHub
kou commented on code in PR #38116: URL: https://github.com/apache/arrow/pull/38116#discussion_r1354124526 ## cpp/src/gandiva/function_registry.h: ## @@ -33,6 +33,9 @@ class GANDIVA_EXPORT FunctionRegistry { /// Lookup a pre-compiled function by its signature. const Native

Re: [PR] GH-37753: [C++][Gandiva] Add external function registry support [arrow]

2023-10-10 Thread via GitHub
kou commented on code in PR #38116: URL: https://github.com/apache/arrow/pull/38116#discussion_r1354124148 ## cpp/src/gandiva/engine.cc: ## @@ -152,6 +153,7 @@ Status Engine::LoadFunctionIRs() { if (!functions_loaded_) { ARROW_RETURN_NOT_OK(LoadPreCompiledIR()); ARR

Re: [PR] MINOR: [C#] Bump BenchmarkDotNet from 0.13.8 to 0.13.9 in /csharp [arrow]

2023-10-10 Thread via GitHub
conbench-apache-arrow[bot] commented on PR #38161: URL: https://github.com/apache/arrow/pull/38161#issuecomment-1756839218 After merging your PR, Conbench analyzed the 5 benchmarking runs that have been run so far on merge-commit fae16c850a206d97ff23df73ba54bb8991fb35bc. There were no

Re: [PR] GH-36594: [C++] Don't use MSVC_VERSION to determin -fms-compatibility-version [arrow]

2023-10-10 Thread via GitHub
github-actions[bot] commented on PR #36595: URL: https://github.com/apache/arrow/pull/36595#issuecomment-1756829188 Revision: 25621eb5a309bce99cb00fdb04807e76820fd8c2 Submitted crossbow builds: [ursacomputing/crossbow @ actions-34034fb076](https://github.com/ursacomputing/crossbow/bra

Re: [PR] GH-36594: [C++] Don't use MSVC_VERSION to determin -fms-compatibility-version [arrow]

2023-10-10 Thread via GitHub
kou commented on PR #36595: URL: https://github.com/apache/arrow/pull/36595#issuecomment-1756827420 @github-actions crossbow submit conda-win-x64-cpu-py3

Re: [PR] GH-38159: [CI][Release] Run only integration tests on integration test mode [arrow]

2023-10-10 Thread via GitHub
kou merged PR #38177: URL: https://github.com/apache/arrow/pull/38177

Re: [PR] GH-38193: [CI][Java] Free up disk space for "AMD64 manylinux2014 Java JNI" [arrow]

2023-10-10 Thread via GitHub
kou merged PR #38194: URL: https://github.com/apache/arrow/pull/38194

Re: [PR] GH-38193: [CI][Java] Free up disk space for "AMD64 manylinux2014 Java JNI" [arrow]

2023-10-10 Thread via GitHub
kou commented on PR #38194: URL: https://github.com/apache/arrow/pull/38194#issuecomment-1756823669 +1

Re: [PR] GH-37510: [C++] Don't install bundled Azure SDK for C++ [arrow]

2023-10-10 Thread via GitHub
kou commented on PR #38176: URL: https://github.com/apache/arrow/pull/38176#issuecomment-1756823405 > A future improvement might be installing the sdk into the build tree instead of the host system by modifing: Does this approach apply a patch to Azure SDK for C++? I want to avoid ha

Re: [PR] GH-37510: [C++] Don't install bundled Azure SDK for C++ [arrow]

2023-10-10 Thread via GitHub
kou commented on code in PR #38176: URL: https://github.com/apache/arrow/pull/38176#discussion_r1354107080 ## cpp/cmake_modules/ThirdpartyToolchain.cmake: ## @@ -5070,8 +5070,20 @@ endif() # Azure SDK for C++ function(build_azure_sdk) + if(CMAKE_VERSION VERSION_LESS 3.22) +

Re: [PR] GH-38043: [R] Enable cloud support by default [arrow]

2023-10-10 Thread via GitHub
github-actions[bot] commented on PR #38195: URL: https://github.com/apache/arrow/pull/38195#issuecomment-1756786002 Revision: 10ea3a31a863f9bcc2f8a7c9c20e082d344d927a Submitted crossbow builds: [ursacomputing/crossbow @ actions-8ad4975dfd](https://github.com/ursacomputing/crossbow/bra

Re: [PR] GH-38074: [C++] Support uint64_t Types in Slice Function to Address Specific Inner Join Bug [arrow]

2023-10-10 Thread via GitHub
llama90 commented on PR #38147: URL: https://github.com/apache/arrow/pull/38147#issuecomment-1756769643 > @llama90 could you please run the linter? Instructions at https://arrow.apache.org/docs/developers/cpp/development.html#code-style-linting-and-ci Did I apply the lint correctly as

Re: [PR] GH-37510: [C++] Don't install bundled Azure SDK for C++ [arrow]

2023-10-10 Thread via GitHub
assignUser commented on code in PR #38176: URL: https://github.com/apache/arrow/pull/38176#discussion_r1354008015 ## cpp/cmake_modules/ThirdpartyToolchain.cmake: ## @@ -5070,8 +5070,20 @@ endif() # Azure SDK for C++ function(build_azure_sdk) + if(CMAKE_VERSION VERSION_LESS

Re: [PR] MINOR: [R] Add Jacob Wujciak-Jens as author [arrow]

2023-10-10 Thread via GitHub
assignUser merged PR #38188: URL: https://github.com/apache/arrow/pull/38188

Re: [PR] GH-38043: [R] Enable cloud support by default [arrow]

2023-10-10 Thread via GitHub
github-actions[bot] commented on PR #38195: URL: https://github.com/apache/arrow/pull/38195#issuecomment-1756735707 Revision: 08b463f2a1d3ac90725651ca53abde7eecd12aed Submitted crossbow builds: [ursacomputing/crossbow @ actions-1af1d91145](https://github.com/ursacomputing/crossbow/bra

Re: [PR] GH-38043: [R] Enable cloud support by default [arrow]

2023-10-10 Thread via GitHub
github-actions[bot] commented on PR #38195: URL: https://github.com/apache/arrow/pull/38195#issuecomment-1756729634 Revision: 845e6070a0441487f7d4208d017c64c41b503581 Submitted crossbow builds: [ursacomputing/crossbow @ actions-8cb947668c](https://github.com/ursacomputing/crossbow/bra

Re: [PR] GH-38043: [R] Enable cloud support by default [arrow]

2023-10-10 Thread via GitHub
assignUser commented on PR #38195: URL: https://github.com/apache/arrow/pull/38195#issuecomment-1756727930 @github-actions crossbow submit r-binary-packages

[PR] GH-38043: [R] Enable cloud support by default [arrow]

2023-10-10 Thread via GitHub
assignUser opened a new pull request, #38195: URL: https://github.com/apache/arrow/pull/38195 ### Rationale for this change Previously GCS/S3 support would need to be explicitly enabled in source builds (when they are build without `NOT_CRAN`). As we want the macos binaries to be ful

Re: [PR] GH-38159: [CI][Release] Run only integration tests on integration test mode [arrow]

2023-10-10 Thread via GitHub
github-actions[bot] commented on PR #38177: URL: https://github.com/apache/arrow/pull/38177#issuecomment-175561 Revision: df361e3cdc550461874d39e7e49bd1c6050e1f22 Submitted crossbow builds: [ursacomputing/crossbow @ actions-90438eae1c](https://github.com/ursacomputing/crossbow/bra

Re: [PR] GH-38063: [C++] Use absolute path for external project's ar/ranlib [arrow]

2023-10-10 Thread via GitHub
conbench-apache-arrow[bot] commented on PR #38064: URL: https://github.com/apache/arrow/pull/38064#issuecomment-1756664898 After merging your PR, Conbench analyzed the 6 benchmarking runs that have been run so far on merge-commit 34eb21df5aa5799bf5ecca42a0542ae0124a439e. There were no

Re: [PR] GH-38159: [CI][Release] Run only integration tests on integration test mode [arrow]

2023-10-10 Thread via GitHub
kou commented on PR #38177: URL: https://github.com/apache/arrow/pull/38177#issuecomment-1756664847 @github-actions crossbow submit verify-rc-source-*-linux-ubuntu-22.04-amd64

Re: [PR] GH-38193: [CI][Java] Free up disk space for "AMD64 manylinux2014 Java JNI" [arrow]

2023-10-10 Thread via GitHub
github-actions[bot] commented on PR #38194: URL: https://github.com/apache/arrow/pull/38194#issuecomment-1756659162 :warning: GitHub issue #38193 **has been automatically assigned in GitHub** to PR creator.

[PR] GH-38193: [CI][Java] Free up disk space for "AMD64 manylinux2014 Java JNI" [arrow]

2023-10-10 Thread via GitHub
kou opened a new pull request, #38194: URL: https://github.com/apache/arrow/pull/38194 ### Rationale for this change We don't have enough disk space. ### What changes are included in this PR? Remove unused pre-installed software. ### Are these changes tested?

Re: [I] [R] CRAN packaging checklist for 14.0.0 [arrow]

2023-10-10 Thread via GitHub
assignUser commented on issue #38141: URL: https://github.com/apache/arrow/issues/38141#issuecomment-1756657888 We can remove all of the autobrew related stuff right? As well as the rtools-packages PRs as we are never looking for a local version of the win library unless the envvar is set..

Re: [PR] GH-37907: [R] Setting rosetta variable is missing [arrow]

2023-10-10 Thread via GitHub
assignUser commented on PR #37961: URL: https://github.com/apache/arrow/pull/37961#issuecomment-1756646162 It looks like this would also fix a nightly fail/cran warning: https://github.com/ursacomputing/crossbow/actions/runs/6463099167/job/17545762151#step:6:3794

Re: [PR] GH-37312: [Python][Docs] Update Python docstrings to reflect new parquet encoding option [arrow]

2023-10-10 Thread via GitHub
mapleFU commented on PR #38070: URL: https://github.com/apache/arrow/pull/38070#issuecomment-1756623086 @pitrou @jorisvandenbossche Would you mind take a look? This is just a 15line doc change

Re: [I] [C++][Parquet] Using BMI to implement filter pushdown [arrow]

2023-10-10 Thread via GitHub
SZn5489 commented on issue #37559: URL: https://github.com/apache/arrow/issues/37559#issuecomment-1756616551 @mapleFU Thank you for your guidance. I'm going to try to understand the code and try to implement Selection vector.

Re: [PR] GH-37941: [R][CI][Release] Add checksum verification for pre-compiled binaries [arrow]

2023-10-10 Thread via GitHub
paleolimbot commented on code in PR #38115: URL: https://github.com/apache/arrow/pull/38115#discussion_r1353825799 ## r/tools/nixlibs.R: ## @@ -103,6 +104,42 @@ download_binary <- function(lib) { } libfile <- NULL } + # Explicitly setting the env var to "false" wil

Re: [PR] GH-37941: [R][CI][Release] Add checksum verification for pre-compiled binaries [arrow]

2023-10-10 Thread via GitHub
assignUser commented on code in PR #38115: URL: https://github.com/apache/arrow/pull/38115#discussion_r1353824530 ## r/tools/nixlibs.R: ## @@ -103,6 +104,42 @@ download_binary <- function(lib) { } libfile <- NULL } + # Explicitly setting the env var to "false" will

Re: [PR] GH-37941: [R][CI][Release] Add checksum verification for pre-compiled binaries [arrow]

2023-10-10 Thread via GitHub
paleolimbot commented on code in PR #38115: URL: https://github.com/apache/arrow/pull/38115#discussion_r1353823420 ## r/tools/nixlibs.R: ## @@ -103,6 +104,42 @@ download_binary <- function(lib) { } libfile <- NULL } + # Explicitly setting the env var to "false" wil

Re: [PR] DataSink Dynamic Execution Time Demux [arrow-datafusion]

2023-10-10 Thread via GitHub
devinjdangelo commented on code in PR #7791: URL: https://github.com/apache/arrow-datafusion/pull/7791#discussion_r1353775705 ## datafusion/common/src/config.rs: ## @@ -254,6 +254,24 @@ config_namespace! { /// Number of files to read in parallel when inferring schema

Re: [PR] GH-37767: [C++][CMake] Don't touch .git/index [arrow]

2023-10-10 Thread via GitHub
assignUser merged PR #38003: URL: https://github.com/apache/arrow/pull/38003

Re: [PR] GH-37767: [C++][CMake] Don't touch .git/index [arrow]

2023-10-10 Thread via GitHub
assignUser commented on code in PR #38003: URL: https://github.com/apache/arrow/pull/38003#discussion_r1353821127 ## cpp/cmake_modules/DefineOptions.cmake: ## @@ -747,7 +747,7 @@ if(NOT ARROW_GIT_ID) OUTPUT_STRIP_TRAILING_WHITESPACE) endif() if(NOT ARROW_GIT

Re: [PR] fix(r): Build with __USE_MINGW_ANSI_STDIO to enable use of lld in format strings [arrow-adbc]

2023-10-10 Thread via GitHub
paleolimbot commented on PR #1180: URL: https://github.com/apache/arrow-adbc/pull/1180#issuecomment-1756596519 Just bumping this (it's holding up a submission of 0.7.0 for adbcsqlite, which is currently failing a CRAN check that will be fixed by this PR)

Re: [PR] GH-37753: [C++][Gandiva] Add external function registry support [arrow]

2023-10-10 Thread via GitHub
niyue commented on code in PR #38116: URL: https://github.com/apache/arrow/pull/38116#discussion_r1353820364 ## cpp/src/gandiva/engine.cc: ## @@ -152,6 +153,7 @@ Status Engine::LoadFunctionIRs() { if (!functions_loaded_) { ARROW_RETURN_NOT_OK(LoadPreCompiledIR()); A

Re: [I] Hi there, Which git commit of flatbuffer is used in arrow? will it change often? [arrow]

2023-10-10 Thread via GitHub
felipecrv commented on issue #13891: URL: https://github.com/apache/arrow/issues/13891#issuecomment-1756595966 Being upgraded here https://github.com/apache/arrow/pull/38192

[PR] GH-35497: [C++] Use the latest tagged version of flatbuffers [arrow]

2023-10-10 Thread via GitHub
felipecrv opened a new pull request, #38192: URL: https://github.com/apache/arrow/pull/38192 ### Rationale for this change To use a more modern version of `flatc` in Arrow. ### What changes are included in this PR? 1) Re-generating the the C++ files with a `flatc` based o

Re: [I] [Python] CSV Generation of pyarrow table [arrow]

2023-10-10 Thread via GitHub
assignUser commented on issue #38035: URL: https://github.com/apache/arrow/issues/38035#issuecomment-1756595779 [5.0.0](https://arrow.apache.org/docs/5.0/python/generated/pyarrow.csv.write_csv.html?highlight=write_csv) but I would recommend upgrading the newest version possible.

Re: [PR] feat(c/driver/postgresql): INSERT benchmark for postgres [arrow-adbc]

2023-10-10 Thread via GitHub
paleolimbot commented on code in PR #1189: URL: https://github.com/apache/arrow-adbc/pull/1189#discussion_r1353811809 ## c/driver/postgresql/postgresql_benchmark.cc: ## @@ -0,0 +1,171 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor licen

Re: [PR] Change ScalarValue::List to store ArrayRef [arrow-datafusion]

2023-10-10 Thread via GitHub
jayzhan211 commented on code in PR #7629: URL: https://github.com/apache/arrow-datafusion/pull/7629#discussion_r1353775489 ## datafusion/common/src/scalar.rs: ## @@ -1653,42 +1619,66 @@ impl ScalarValue { Ok(array) } -fn iter_to_array_list( -scalars:

Re: [PR] Change ScalarValue::List to store ArrayRef [arrow-datafusion]

2023-10-10 Thread via GitHub
jayzhan211 commented on code in PR #7629: URL: https://github.com/apache/arrow-datafusion/pull/7629#discussion_r1353814368 ## datafusion/proto/tests/cases/roundtrip_logical_plan.rs: ## @@ -424,59 +427,6 @@ fn scalar_values_error_serialization() { Some(vec![]),

Re: [PR] Change ScalarValue::List to store ArrayRef [arrow-datafusion]

2023-10-10 Thread via GitHub
jayzhan211 commented on code in PR #7629: URL: https://github.com/apache/arrow-datafusion/pull/7629#discussion_r1353803958 ## datafusion/physical-expr/src/aggregate/bit_and_or_xor.rs: ## @@ -637,13 +638,14 @@ where // 1. Stores aggregate state in `ScalarValue::List`

Re: [PR] Change ScalarValue::List to store ArrayRef [arrow-datafusion]

2023-10-10 Thread via GitHub
jayzhan211 commented on code in PR #7629: URL: https://github.com/apache/arrow-datafusion/pull/7629#discussion_r1353797274 ## datafusion/common/src/scalar.rs: ## @@ -2093,18 +2203,29 @@ impl ScalarValue { DataType::Utf8 => typed_cast!(array, index, StringArray, Utf8

Re: [PR] GH-37941: [R][CI][Release] Add checksum verification for pre-compiled binaries [arrow]

2023-10-10 Thread via GitHub
github-actions[bot] commented on PR #38115: URL: https://github.com/apache/arrow/pull/38115#issuecomment-1756585018 Revision: dd292b667b56a45a171276b1cb5d8dd0f8f076b3 Submitted crossbow builds: [ursacomputing/crossbow @ actions-d4c5399398](https://github.com/ursacomputing/crossbow/bra

Re: [PR] GH-37941: [R][CI][Release] Add checksum verification for pre-compiled binaries [arrow]

2023-10-10 Thread via GitHub
assignUser commented on PR #38115: URL: https://github.com/apache/arrow/pull/38115#issuecomment-1756580102 @github-action crossbow submit -g r

Re: [PR] GH-37941: [R][CI][Release] Add checksum verification for pre-compiled binaries [arrow]

2023-10-10 Thread via GitHub
assignUser commented on PR #38115: URL: https://github.com/apache/arrow/pull/38115#issuecomment-1756577888 > Also, it is probably worth rebasing to clear up the CI. I merged to keep the crossbow job shas valid. I will run another round of validation but then this should be merge ready

Re: [PR] Change ScalarValue::List to store ArrayRef [arrow-datafusion]

2023-10-10 Thread via GitHub
jayzhan211 commented on code in PR #7629: URL: https://github.com/apache/arrow-datafusion/pull/7629#discussion_r1353775489 ## datafusion/common/src/scalar.rs: ## @@ -1653,42 +1619,66 @@ impl ScalarValue { Ok(array) } -fn iter_to_array_list( -scalars:

Re: [PR] [GH-37751] [C++][Gandiva] Avoid registering exported functions multiple times in gandiva [arrow]

2023-10-10 Thread via GitHub
niyue commented on code in PR #37752: URL: https://github.com/apache/arrow/pull/37752#discussion_r1353773935 ## cpp/src/gandiva/exported_funcs_registry.h: ## @@ -21,34 +21,31 @@ #include #include +#include namespace gandiva { class ExportedFuncsBase; /// Registry

Re: [PR] GH-37941: [R][CI][Release] Add checksum verification for pre-compiled binaries [arrow]

2023-10-10 Thread via GitHub
assignUser commented on code in PR #38115: URL: https://github.com/apache/arrow/pull/38115#discussion_r1353772670 ## r/tools/nixlibs.R: ## @@ -103,6 +104,39 @@ download_binary <- function(lib) { } libfile <- NULL } + + # validate binary checksum for CRAN release on

Re: [I] Allow Inserts to Partitioned Listing Table [arrow-datafusion]

2023-10-10 Thread via GitHub
devinjdangelo commented on issue #7744: URL: https://github.com/apache/arrow-datafusion/issues/7744#issuecomment-1756559764 > Perhaps filesink could consume a `Stream`. I.e. unlike `Vec` we don't know a fixed number of partitions up front, but rather consume some unknown number of streams

[PR] DataSink Dynamic Execution Time Demux [arrow-datafusion]

2023-10-10 Thread via GitHub
devinjdangelo opened a new pull request, #7791: URL: https://github.com/apache/arrow-datafusion/pull/7791 ## Which issue does this PR close? Closes #5383 Closes #7767 Progresses towards #7744 ## Rationale for this change Currently, we use the partitioning of t

Re: [PR] GH-38174: [C++] Update bundled Azure SDK for C++ to 1.10.3 [arrow]

2023-10-10 Thread via GitHub
kou merged PR #38175: URL: https://github.com/apache/arrow/pull/38175

Re: [PR] GH-38174: [C++] Update bundled Azure SDK for C++ to 1.10.3 [arrow]

2023-10-10 Thread via GitHub
kou commented on PR #38175: URL: https://github.com/apache/arrow/pull/38175#issuecomment-1756541982 +1

Re: [PR] GH-37812: [MATLAB] Add `arrow.type.ListType` MATLAB class [arrow]

2023-10-10 Thread via GitHub
kou commented on code in PR #38189: URL: https://github.com/apache/arrow/pull/38189#discussion_r1353727208 ## matlab/src/cpp/arrow/matlab/type/proxy/list_type.h: ## @@ -0,0 +1,36 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license ag

Re: [I] Unnest `UNION` plans (in addition to UNION ALL) [arrow-datafusion]

2023-10-10 Thread via GitHub
alamb commented on issue #7786: URL: https://github.com/apache/arrow-datafusion/issues/7786#issuecomment-1756516332 > combine this rule with the previous one i This seems natural to me if possible

Re: [PR] Cleanup `object_store::retry` client error handling [arrow-rs]

2023-10-10 Thread via GitHub
alamb commented on code in PR #4915: URL: https://github.com/apache/arrow-rs/pull/4915#discussion_r1353699068 ## object_store/src/client/retry.rs: ## @@ -39,8 +39,8 @@ pub enum Error { body: Option, }, -#[snafu(display("Response error after {retries} retries:

[PR] feat: add method for async read bloom filter [arrow-rs]

2023-10-10 Thread via GitHub
hengfeiyang opened a new pull request, #4917: URL: https://github.com/apache/arrow-rs/pull/4917 # Which issue does this PR close? Impl #3851 We want to filter `row_groups` in Datafusion but there is no async API for reading `bloom filter`. # What changes are included in

Re: [PR] GH-38159: [CI][Release] Run only integration tests on integration test mode [arrow]

2023-10-10 Thread via GitHub
github-actions[bot] commented on PR #38177: URL: https://github.com/apache/arrow/pull/38177#issuecomment-1756492246 Revision: 6560f8beb553483ca9da0fe99f5ee73526e73cf6 Submitted crossbow builds: [ursacomputing/crossbow @ actions-60c5443a01](https://github.com/ursacomputing/crossbow/bra

Re: [PR] GH-38159: [CI][Release] Run only integration tests on integration test mode [arrow]

2023-10-10 Thread via GitHub
kou commented on code in PR #38177: URL: https://github.com/apache/arrow/pull/38177#discussion_r1353678642 ## dev/release/verify-release-candidate.sh: ## @@ -649,15 +651,17 @@ test_and_install_cpp() { export CMAKE_BUILD_PARALLEL_LEVEL=${CMAKE_BUILD_PARALLEL_LEVEL:-${NPROC}}

Re: [PR] GH-38159: [CI][Release] Run only integration tests on integration test mode [arrow]

2023-10-10 Thread via GitHub
kou commented on PR #38177: URL: https://github.com/apache/arrow/pull/38177#issuecomment-1756490307 @github-actions crossbow submit verify-rc-source-*-linux-ubuntu-22.04-amd64

Re: [PR] GH-35531: [Python] C Data Interface PyCapsule Protocol [arrow]

2023-10-10 Thread via GitHub
wjones127 commented on code in PR #37797: URL: https://github.com/apache/arrow/pull/37797#discussion_r1353674880 ## python/pyarrow/table.pxi: ## @@ -2983,6 +2985,100 @@ cdef class RecordBatch(_Tabular): c_ptr, c_schema)) return pyarrow_wrap_batch(c

Re: [PR] GH-38077: [C++] Output bundled GoogleTest to ${BUILD_DIR}/${CONFIG} [arrow]

2023-10-10 Thread via GitHub
conbench-apache-arrow[bot] commented on PR #38132: URL: https://github.com/apache/arrow/pull/38132#issuecomment-1756479381 After merging your PR, Conbench analyzed the 6 benchmarking runs that have been run so far on merge-commit d602d401569697a61e0dc1ea1fabcf829fc09142. There were no

Re: [PR] GH-36594: [C++] Don't use MSVC_VERSION to determin -fms-compatibility-version [arrow]

2023-10-10 Thread via GitHub
github-actions[bot] commented on PR #36595: URL: https://github.com/apache/arrow/pull/36595#issuecomment-1756479247 Revision: 42580ebaed7df28529f26cfb2e655da72932e951 Submitted crossbow builds: [ursacomputing/crossbow @ actions-19246c81d4](https://github.com/ursacomputing/crossbow/bra

Re: [PR] GH-36594: [C++] Don't use MSVC_VERSION to determin -fms-compatibility-version [arrow]

2023-10-10 Thread via GitHub
kou commented on PR #36595: URL: https://github.com/apache/arrow/pull/36595#issuecomment-1756476968 @github-actions crossbow submit conda-win-x64-cpu-py3

Re: [PR] GH-36594: [C++] Don't use MSVC_VERSION to determin -fms-compatibility-version [arrow]

2023-10-10 Thread via GitHub
kou commented on PR #36595: URL: https://github.com/apache/arrow/pull/36595#issuecomment-1756476786 I revisited this. We can use LLVM as a library with conda. So we can test this case with `conda-win-x64-cpu-py3`. I think that we don't need to use different `-fms-compatibility-

Re: [PR] GH-26685: [Python] use IPC for pickle serialisation [arrow]

2023-10-10 Thread via GitHub
anjakefala commented on PR #37683: URL: https://github.com/apache/arrow/pull/37683#issuecomment-1756461068 Confirmed that this approach would break support for [pickle protocol 5 for out of band](https://peps.python.org/pep-0574/) data.

[I] Ambiguous reference error for named columns [arrow-datafusion]

2023-10-10 Thread via GitHub
Blajda opened a new issue, #7790: URL: https://github.com/apache/arrow-datafusion/issues/7790 ### Describe the bug I'm using the dataframe API to perform a join. I can build a join without issue however attempting to add an additional column results in a failure. This is the logical

Re: [I] [Bug] docker compose up -d error on building failed to calculate checksum [arrow-ballista]

2023-10-10 Thread via GitHub
ehenry2 commented on issue #891: URL: https://github.com/apache/arrow-ballista/issues/891#issuecomment-1756411907 Thanks, yeah looks like the latest commit. I tried to replicate by cloning the repo and running docker compose but wasn't able to replicate the issue unfortunately. That error l

Re: [PR] feat(c/driver/postgresql): INSERT benchmark for postgres [arrow-adbc]

2023-10-10 Thread via GitHub
WillAyd commented on code in PR #1189: URL: https://github.com/apache/arrow-adbc/pull/1189#discussion_r1353512273 ## c/driver/postgresql/postgresql_benchmark.cc: ## @@ -0,0 +1,171 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license a

[PR] feat(c/driver/postgresql): INSERT benchmark for postgres [arrow-adbc]

2023-10-10 Thread via GitHub
WillAyd opened a new pull request, #1189: URL: https://github.com/apache/arrow-adbc/pull/1189 Pretty hacked together but I think is a workable foundation

Re: [PR] GH-37907: [R] Setting rosetta variable is missing [arrow]

2023-10-10 Thread via GitHub
jonkeane commented on code in PR #37961: URL: https://github.com/apache/arrow/pull/37961#discussion_r1353475618 ## r/R/install-arrow.R: ## @@ -267,3 +268,8 @@ wslify_path <- function(path) { end_path <- strsplit(path, drive_expr)[[1]][-1] file.path(wslified_drive, end_path

[PR] docs(java): First step into adding Java docs (quickstart, driver manager summary, etc.) [arrow-adbc]

2023-10-10 Thread via GitHub
ywc88 opened a new pull request, #1188: URL: https://github.com/apache/arrow-adbc/pull/1188 Initially I wanted to use `javasphinx` to use `.. java:`, but ran into several build issues that mostly stem from it not being updated in over 5 years (one of several issues I ran into trying to `mak

Re: [I] [Python] pyarrow 13.0.0 converted `datetime64[ns]` to `datetime64[us]` when using `pd.read_parquet` [arrow]

2023-10-10 Thread via GitHub
seanslma commented on issue #38171: URL: https://github.com/apache/arrow/issues/38171#issuecomment-1756316790 Now understand that this is something related to pyarrow 12.0.0. After upgrading to pyarrow 13.0.0, this will not happen. I will close the issue.

Re: [I] [C++][Format] Draft an implementation of the LIST_VIEW array format [arrow]

2023-10-10 Thread via GitHub
zeroshade commented on issue #35344: URL: https://github.com/apache/arrow/issues/35344#issuecomment-1756312861 Should be closed when the C++ impl is merged

Re: [I] [C++][Format] Draft an implementation of the LIST_VIEW array format [arrow]

2023-10-10 Thread via GitHub
zeroshade commented on issue #35344: URL: https://github.com/apache/arrow/issues/35344#issuecomment-1756312598 Should not have been closed when the Go PR was merged

Re: [PR] GH-35344: [Go][Format] Implementation of the LIST_VIEW and LARGE_LIST_VIEW array formats [arrow]

2023-10-10 Thread via GitHub
zeroshade merged PR #37468: URL: https://github.com/apache/arrow/pull/37468

Re: [PR] GH-35344: [Go][Format] Implementation of the LIST_VIEW and LARGE_LIST_VIEW array formats [arrow]

2023-10-10 Thread via GitHub
zeroshade commented on code in PR #37468: URL: https://github.com/apache/arrow/pull/37468#discussion_r1353454400 ## go/arrow/array/list.go: ## @@ -618,19 +634,1061 @@ func (b *baseListBuilder) UnmarshalJSON(data []byte) error { return b.Unmarshal(dec) } +// ListView

Re: [PR] GH-35344: [Go][Format] Implementation of the LIST_VIEW and LARGE_LIST_VIEW array formats [arrow]

2023-10-10 Thread via GitHub
felipecrv commented on code in PR #37468: URL: https://github.com/apache/arrow/pull/37468#discussion_r1353449264 ## go/arrow/array/list.go: ## @@ -618,19 +634,1061 @@ func (b *baseListBuilder) UnmarshalJSON(data []byte) error { return b.Unmarshal(dec) } +// ListView

Re: [PR] GH-37907: [R] Setting rosetta variable is missing [arrow]

2023-10-10 Thread via GitHub
thisisnic commented on PR #37961: URL: https://github.com/apache/arrow/pull/37961#issuecomment-1756288146 > @thisisnic should we merge this before the release? Yes, but it'll need a final change before merging

Re: [PR] GH-37907: [R] Setting rosetta variable is missing [arrow]

2023-10-10 Thread via GitHub
thisisnic commented on PR #37961: URL: https://github.com/apache/arrow/pull/37961#issuecomment-1756286287 @jonkeane You most recently worked on the code that this PR touches and I'm not sure where is best for this function - I don't suppose you'd mind giving this a quick look over?

Re: [PR] MINOR: [R] Add Jacob Wujciak-Jens as author [arrow]

2023-10-10 Thread via GitHub
assignUser commented on code in PR #38188: URL: https://github.com/apache/arrow/pull/38188#discussion_r1353437990 ## r/DESCRIPTION: ## @@ -10,6 +10,7 @@ Authors@R: c( person("Jonathan", "Keane", email = "jke...@gmail.com", role = c("aut")), person("Drago\u0219", "Moldo

Re: [PR] GH-35531: [Python] C Data Interface PyCapsule Protocol [arrow]

2023-10-10 Thread via GitHub
jorisvandenbossche commented on code in PR #37797: URL: https://github.com/apache/arrow/pull/37797#discussion_r1353434379 ## python/pyarrow/table.pxi: ## @@ -2983,6 +2985,100 @@ cdef class RecordBatch(_Tabular): c_ptr, c_schema)) return pyarrow_wra

Re: [I] [R] read_parquet performs to slow [arrow]

2023-10-10 Thread via GitHub
amoeba commented on issue #38032: URL: https://github.com/apache/arrow/issues/38032#issuecomment-1756263462 Right, running `pyarrow._s3fs.initialize_s3(pyarrow._s3fs.S3LogLevel.Debug)` should immediately start printing to stdout. For reference, with pyarrow 13 I see: ``` Python 3.

Re: [PR] GH-37812: [MATLAB] Add `arrow.type.ListType` MATLAB class [arrow]

2023-10-10 Thread via GitHub
kevingurney commented on code in PR #38189: URL: https://github.com/apache/arrow/pull/38189#discussion_r1353427672 ## matlab/src/cpp/arrow/matlab/type/proxy/list_type.cc: ## @@ -0,0 +1,64 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor l

Re: [PR] GH-37812: [MATLAB] Add `arrow.type.ListType` MATLAB class [arrow]

2023-10-10 Thread via GitHub
kevingurney commented on code in PR #38189: URL: https://github.com/apache/arrow/pull/38189#discussion_r1353414070 ## matlab/test/arrow/type/tListType.m: ## @@ -0,0 +1,124 @@ +% TLISTTYPE Tests for arrow.type.ListType + +% Licensed to the Apache Software Foundation (ASF) under o

Re: [PR] GH-37812: [MATLAB] Add `arrow.type.ListType` MATLAB class [arrow]

2023-10-10 Thread via GitHub
kevingurney commented on code in PR #38189: URL: https://github.com/apache/arrow/pull/38189#discussion_r1353413093 ## matlab/test/arrow/type/tListType.m: ## @@ -0,0 +1,124 @@ +% TLISTTYPE Tests for arrow.type.ListType + +% Licensed to the Apache Software Foundation (ASF) under o

Re: [I] [R] open_dataset() behavior with incorrectly quoted input data [arrow]

2023-10-10 Thread via GitHub
amoeba commented on issue #37908: URL: https://github.com/apache/arrow/issues/37908#issuecomment-1756253850 Hi @angela-li, when I've run into situations like yours in the past, I've resorted to adding a cleanup step in between the raw data and the less flexible system (in this case, arrow)
