Re: [PR] GH-41813: [C++] Fix avx2 gather offset larger than 2GB in `CompareColumnsToRows` [arrow]

2024-06-21 Thread via GitHub
zanmato1984 commented on PR #42188: URL: https://github.com/apache/arrow/pull/42188#issuecomment-2182124858 @pitrou @felipecrv @ZhangHuiGui @mapleFU Would you please help to take a look? Thanks. -- This is an automated message from the Apache Git Service. To respond to the message, please

Re: [PR] GH-42102: [C++] Add binary that extracts a footer from a parquet file [arrow]

2024-06-21 Thread via GitHub
alkis commented on PR #42174: URL: https://github.com/apache/arrow/pull/42174#issuecomment-2182110596 Given that tools are packages in deb and rpm this covers the majority of linux.

Re: [I] [C++][Python] pyarrow table group_by/aggregate results in multiple rows with the same group_by key [arrow]

2024-06-21 Thread via GitHub
FreekPaans commented on issue #42231: URL: https://github.com/apache/arrow/issues/42231#issuecomment-2182094896 Great to hear it'll be fixed in the next release, thanks for looking into it! For future and cross reference, it was fixed in this PR: https://github.com/apache/arrow/issues

Re: [PR] GH-42228: [CI][Java] Suppress transfer progress log in java-jars [arrow]

2024-06-21 Thread via GitHub
vibhatha commented on PR #42230: URL: https://github.com/apache/arrow/pull/42230#issuecomment-2182095081 Related https://github.com/apache/arrow/issues/42149 ?

Re: [PR] MINOR: [Java] Bump com.google.flatbuffers:flatbuffers-java from 23.5.26 to 24.3.25 in /java MANUAL [arrow]

2024-06-21 Thread via GitHub
github-actions[bot] commented on PR #42204: URL: https://github.com/apache/arrow/pull/42204#issuecomment-2182134327 Revision: c5b48a39e879c966b8f2889df92a1d53a6f86a5c Submitted crossbow builds: [ursacomputing/crossbow @ actions-8ab80f7e97](https://github.com/ursacomputing/crossbow/bra

Re: [PR] MINOR: [Java] Bump com.google.flatbuffers:flatbuffers-java from 23.5.26 to 24.3.25 in /java MANUAL [arrow]

2024-06-21 Thread via GitHub
vibhatha commented on PR #42204: URL: https://github.com/apache/arrow/pull/42204#issuecomment-2182130922 @github-actions crossbow submit -g java

Re: [PR] MINOR: [Java] Bump com.google.flatbuffers:flatbuffers-java from 23.5.26 to 24.3.25 in /java MANUAL [arrow]

2024-06-21 Thread via GitHub
vibhatha commented on PR #42204: URL: https://github.com/apache/arrow/pull/42204#issuecomment-2182130614 @lidavidm rebased.

Re: [PR] GH-42228: [CI][Java] Suppress transfer progress log in java-jars [arrow]

2024-06-21 Thread via GitHub
vibhatha commented on PR #42230: URL: https://github.com/apache/arrow/pull/42230#issuecomment-2182092009 Same error in both failing CIs. Cannot find protobuf?

Re: [PR] MINOR: enhance document for RecordBatchReader::ReadNext [arrow]

2024-06-21 Thread via GitHub
wgtmac commented on code in PR #42239: URL: https://github.com/apache/arrow/pull/42239#discussion_r1648426172 ## cpp/src/arrow/record_batch.h: ## @@ -310,7 +310,20 @@ class ARROW_EXPORT RecordBatchReader { /// \brief Read the next record batch in the stream. Return null for b

Re: [PR] GH-34785: [C++][Parquet] Parquet Bloom Filter Writer Implementation [arrow]

2024-06-21 Thread via GitHub
mapleFU commented on code in PR #37400: URL: https://github.com/apache/arrow/pull/37400#discussion_r1648418819 ## cpp/src/parquet/bloom_filter_builder.cc: ## @@ -0,0 +1,158 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreemen

Re: [PR] GH-34785: [C++][Parquet] Parquet Bloom Filter Writer Implementation [arrow]

2024-06-21 Thread via GitHub
emkornfield commented on code in PR #37400: URL: https://github.com/apache/arrow/pull/37400#discussion_r1648419112 ## cpp/src/parquet/bloom_filter_builder.cc: ## @@ -0,0 +1,158 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agre

Re: [PR] GH-42193: [Java] Update dependency to maintain JUnit 5 only [arrow]

2024-06-21 Thread via GitHub
lidavidm merged PR #42206: URL: https://github.com/apache/arrow/pull/42206

Re: [PR] GH-34785: [C++][Parquet] Parquet Bloom Filter Writer Implementation [arrow]

2024-06-21 Thread via GitHub
emkornfield commented on PR #37400: URL: https://github.com/apache/arrow/pull/37400#issuecomment-2182032386 Sorry for the delay, I didn't get a chance to review all tests in detail.

Re: [PR] GH-41952: [R] Turn S3, GCS, and ZSTD on by default for macOS [arrow]

2024-06-21 Thread via GitHub
nealrichardson commented on PR #42210: URL: https://github.com/apache/arrow/pull/42210#issuecomment-2180582378 > > I also wonder whether GCS is worth including. > > I agree + though similar when I checked if S3 alone was sufficient for quieting the message. I'm happy to pull GCS out o

Re: [PR] GH-41952: [R] Turn S3, GCS, and ZSTD on by default for macOS [arrow]

2024-06-21 Thread via GitHub
jonkeane commented on code in PR #42210: URL: https://github.com/apache/arrow/pull/42210#discussion_r1647508460 ## r/tools/nixlibs.R: ## @@ -574,6 +574,8 @@ build_libarrow <- function(src_dir, dst_dir) { env_var_list <- c(env_var_list, setNames("BUNDLED", env_var))

Re: [PR] GH-41952: [R] Turn S3, GCS, and ZSTD on by default for macOS [arrow]

2024-06-21 Thread via GitHub
jonkeane commented on PR #42210: URL: https://github.com/apache/arrow/pull/42210#issuecomment-2180574740 > I also wonder whether GCS is worth including. I agree + though similar when I checked if S3 alone was sufficient for quieting the message. I'm happy to pull GCS out of this, thou

Re: [PR] GH-41952: [R] Turn S3, GCS, and ZSTD on by default for macOS [arrow]

2024-06-21 Thread via GitHub
nealrichardson commented on code in PR #42210: URL: https://github.com/apache/arrow/pull/42210#discussion_r1647512602 ## r/tools/nixlibs.R: ## @@ -814,8 +816,14 @@ set_thirdparty_urls <- function(env_var_list) { env_var_list } -is_feature_requested <- function(env_varname,

Re: [PR] GH-28866: [Java] Java Dataset API ScanOptions expansion [arrow]

2024-06-21 Thread via GitHub
jinchengchenghh commented on code in PR #41646: URL: https://github.com/apache/arrow/pull/41646#discussion_r1645284541 ## cpp/src/arrow/engine/substrait/serde.h: ## @@ -183,6 +184,13 @@ ARROW_ENGINE_EXPORT Result DeserializeExpressions( const ConversionOptions& conversion_

Re: [PR] GH-28866: [Java] Java Dataset API ScanOptions expansion [arrow]

2024-06-21 Thread via GitHub
vibhatha commented on code in PR #41646: URL: https://github.com/apache/arrow/pull/41646#discussion_r1645283246 ## cpp/src/arrow/engine/substrait/serde.h: ## @@ -183,6 +184,13 @@ ARROW_ENGINE_EXPORT Result DeserializeExpressions( const ConversionOptions& conversion_options

Re: [PR] GH-42197: [CI][Packaging][Java] Ensure updating "python@*" formulae on macOS [arrow]

2024-06-21 Thread via GitHub
assignUser commented on PR #42202: URL: https://github.com/apache/arrow/pull/42202#issuecomment-2177370896 https://github.com/actions/runner-images/issues/9966

Re: [PR] GH-42197: [CI][Packaging][Java] Ensure updating "python@*" formulae on macOS [arrow]

2024-06-21 Thread via GitHub
assignUser commented on PR #42202: URL: https://github.com/apache/arrow/pull/42202#issuecomment-2177369322 I feel like I have seen an issue about this in the runner repo...

Re: [PR] fix: Fix symbol export visibility [arrow-nanoarrow]

2024-06-21 Thread via GitHub
paleolimbot commented on code in PR #531: URL: https://github.com/apache/arrow-nanoarrow/pull/531#discussion_r1645259233 ## src/nanoarrow/nanoarrow_types.h: ## @@ -32,6 +32,22 @@ extern "C" { #endif +#if defined _WIN32 || defined __CYGWIN__ +#if defined(NANOARROW_BUILD_DLL)

Re: [PR] fix: Fix Meson include directories [arrow-nanoarrow]

2024-06-21 Thread via GitHub
paleolimbot commented on code in PR #532: URL: https://github.com/apache/arrow-nanoarrow/pull/532#discussion_r1645270591 ## src/nanoarrow/nanoarrow_types.h: ## @@ -21,7 +21,7 @@ #include #include -#include "nanoarrow_config.h" +#include Review Comment: ```suggestion

Re: [PR] Add user defined metadata [arrow-rs]

2024-06-21 Thread via GitHub
Xuanwo commented on PR #5915: URL: https://github.com/apache/arrow-rs/pull/5915#issuecomment-2177407329 > The one caveat here is I'm not sure how Rust handles enums (open vs. closed). Rust handles this by [non_exhaustive](https://doc.rust-lang.org/reference/attributes/type_system.htm

Re: [PR] Add user defined metadata [arrow-rs]

2024-06-21 Thread via GitHub
Xuanwo commented on code in PR #5915: URL: https://github.com/apache/arrow-rs/pull/5915#discussion_r1645291141 ## object_store/src/client/get.rs: ## @@ -221,6 +221,23 @@ fn get_result( ) ); +// Add attributes that match the user-defined metadata prefix (e.g.

Re: [PR] GH-28866: [Java] Java Dataset API ScanOptions expansion [arrow]

2024-06-21 Thread via GitHub
vibhatha commented on code in PR #41646: URL: https://github.com/apache/arrow/pull/41646#discussion_r1645275950 ## java/dataset/src/main/cpp/jni_wrapper.cc: ## @@ -363,6 +364,64 @@ std::shared_ptr LoadArrowBufferFromByteBuffer(JNIEnv* env, jobjec return buffer; } +inline

Re: [PR] MINOR: [Java] Bump org.cyclonedx:cyclonedx-maven-plugin from 2.7.11 to 2.8.0 in /java [arrow]

2024-06-21 Thread via GitHub
github-actions[bot] commented on PR #42184: URL: https://github.com/apache/arrow/pull/42184#issuecomment-2177318558 Revision: 6bc6e7bc4acfae31e5342229c3f5558bab171f7f Submitted crossbow builds: [ursacomputing/crossbow @ actions-04668f4784](https://github.com/ursacomputing/crossbow/bra

Re: [PR] MINOR: [Java] Bump org.cyclonedx:cyclonedx-maven-plugin from 2.7.11 to 2.8.0 in /java [arrow]

2024-06-21 Thread via GitHub
lidavidm commented on PR #42184: URL: https://github.com/apache/arrow/pull/42184#issuecomment-2177316898 @github-actions crossbow submit -g java

Re: [PR] GH-42124: [Swift] Add methods for loading and validating builder by type [arrow]

2024-06-21 Thread via GitHub
conbench-apache-arrow[bot] commented on PR #42195: URL: https://github.com/apache/arrow/pull/42195#issuecomment-2177321766 After merging your PR, Conbench analyzed the 8 benchmarking runs that have been run so far on merge-commit 1857e54a689001142a70a828558e00a2cae808ab. There were no

Re: [PR] MINOR: [Java] Bump io.netty:netty-bom from 4.1.110.Final to 4.1.111.Final in /java [arrow]

2024-06-21 Thread via GitHub
lidavidm commented on PR #42186: URL: https://github.com/apache/arrow/pull/42186#issuecomment-2177316599 Please check the grpc-java issues, maybe there is something about it already

Re: [PR] MINOR: [Java] Bump com.google.flatbuffers:flatbuffers-java from 23.5.26 to 24.3.25 in /java [arrow]

2024-06-21 Thread via GitHub
lidavidm commented on PR #42185: URL: https://github.com/apache/arrow/pull/42185#issuecomment-2177316399 We would have to regenerate the format code

Re: [PR] MINOR: [Java] Bump com.diffplug.spotless:spotless-maven-plugin from 2.30.0 to 2.43.0 in /java [arrow]

2024-06-21 Thread via GitHub
vibhatha commented on PR #42183: URL: https://github.com/apache/arrow/pull/42183#issuecomment-2177290414 cc @lidavidm @laurentgo I think we cannot do this at the moment as we decided to keep spotless to support both JDK8 and later, right?

Re: [PR] MINOR: [Java] Bump checker.framework.version from 3.43.0 to 3.44.0 in /java [arrow]

2024-06-21 Thread via GitHub
github-actions[bot] commented on PR #42180: URL: https://github.com/apache/arrow/pull/42180#issuecomment-2177293669 Revision: e63bab95096c8d0d37f4182659bd29fcad304812 Submitted crossbow builds: [ursacomputing/crossbow @ actions-822956ca2a](https://github.com/ursacomputing/crossbow/bra

Re: [PR] MINOR: [Java] Bump org.cyclonedx:cyclonedx-maven-plugin from 2.7.11 to 2.8.0 in /java [arrow]

2024-06-21 Thread via GitHub
lidavidm commented on PR #42184: URL: https://github.com/apache/arrow/pull/42184#issuecomment-2177289515 @dependabot rebase

Re: [PR] GH-42045: [Java] Update Unit Tests for Flight Module [arrow]

2024-06-21 Thread via GitHub
lidavidm merged PR #42158: URL: https://github.com/apache/arrow/pull/42158

Re: [PR] MINOR: [Java] Bump io.netty:netty-bom from 4.1.110.Final to 4.1.111.Final in /java [arrow]

2024-06-21 Thread via GitHub
vibhatha commented on PR #42186: URL: https://github.com/apache/arrow/pull/42186#issuecomment-2177295842 @lidavidm for the moment should we not do this upgrade?

Re: [PR] MINOR: [Java] Bump com.google.flatbuffers:flatbuffers-java from 23.5.26 to 24.3.25 in /java [arrow]

2024-06-21 Thread via GitHub
vibhatha commented on PR #42185: URL: https://github.com/apache/arrow/pull/42185#issuecomment-2177295969 @lidavidm for the moment should we not do this upgrade?

Re: [PR] MINOR: [Java] Bump org.apache.maven.plugin-tools:maven-plugin-annotations from 3.12.0 to 3.13.1 in /java [arrow]

2024-06-21 Thread via GitHub
vibhatha commented on PR #42179: URL: https://github.com/apache/arrow/pull/42179#issuecomment-2177293054 cc @lidavidm the crossbow failure seems to be due to the flaky one. Shall we re-run? Otherwise this seems good.

Re: [PR] MINOR: [Java] Bump checker.framework.version from 3.43.0 to 3.44.0 in /java [arrow]

2024-06-21 Thread via GitHub
vibhatha commented on PR #42180: URL: https://github.com/apache/arrow/pull/42180#issuecomment-2177291484 @github-actions crossbow submit -g java

Re: [PR] MINOR: [Java] Bump checker.framework.version from 3.43.0 to 3.44.0 in /java [arrow]

2024-06-21 Thread via GitHub
vibhatha commented on PR #42180: URL: https://github.com/apache/arrow/pull/42180#issuecomment-2177291957 @lidavidm could we re-run the CI (not crossbows) there is one flaky failure?

Re: [PR] MINOR: [Java] Bump org.cyclonedx:cyclonedx-maven-plugin from 2.7.11 to 2.8.0 in /java [arrow]

2024-06-21 Thread via GitHub
vibhatha commented on PR #42184: URL: https://github.com/apache/arrow/pull/42184#issuecomment-2177288672 @lidavidm The same two failed again, but both are expected. Re-run again?

Re: [PR] GH-42045: [Java] Update Unit Tests for Flight Module [arrow]

2024-06-21 Thread via GitHub
vibhatha commented on code in PR #42158: URL: https://github.com/apache/arrow/pull/42158#discussion_r1645234886 ## java/flight/flight-sql-jdbc-driver/src/test/java/org/apache/arrow/driver/jdbc/ITDriverJarValidation.java: ## @@ -113,7 +105,7 @@ public void validateShadedJar() thr

Re: [PR] GH-41974: [C++][Compute] Support more precise pre-allocation and more pre-allocated types for ScalarExecutor and VectorExecutor [arrow]

2024-06-21 Thread via GitHub
felipecrv commented on code in PR #41975: URL: https://github.com/apache/arrow/pull/41975#discussion_r1645229813 ## cpp/src/arrow/compute/exec.cc: ## @@ -1034,9 +1034,23 @@ class VectorExecutor : public KernelExecutorImpl { output_num_buffers_ = static_cast(output_type_.t

Re: [PR] GH-41909: PoC: [C++] Add arrow::ArrayStatistics [arrow]

2024-06-21 Thread via GitHub
felipecrv commented on code in PR #42133: URL: https://github.com/apache/arrow/pull/42133#discussion_r1645224429 ## cpp/src/arrow/array/array_base.h: ## @@ -232,6 +232,14 @@ class ARROW_EXPORT Array { /// \return DeviceAllocationType DeviceAllocationType device_type() cons

Re: [PR] chore(dev/release): include C# in RC verification [arrow-adbc]

2024-06-21 Thread via GitHub
lidavidm merged PR #1916: URL: https://github.com/apache/arrow-adbc/pull/1916

Re: [PR] GH-15053: [C++] Add option to string 'center' kernel to control left/right alignment on odd number of padding [arrow]

2024-06-21 Thread via GitHub
felipecrv commented on PR #41449: URL: https://github.com/apache/arrow/pull/41449#issuecomment-2177276169 > But the issue is that "align" points to what the actual function does in its entirety (left/right/center padding) already? Yes. It's not really aligning left/right when central

Re: [PR] GH-41974: [C++][Compute] Support more precise pre-allocation and more pre-allocated types for ScalarExecutor and VectorExecutor [arrow]

2024-06-21 Thread via GitHub
felipecrv commented on PR #41975: URL: https://github.com/apache/arrow/pull/41975#issuecomment-2177267969 > @felipecrv I just think about is there some techniques to denote an `ArrayData` is "owned and writable", like the output of `PrepareOutput`. This can do some techniques, like: >

Re: [PR] MINOR: [Java] Bump pl.project13.maven:git-commit-id-plugin from 4.0.5 to 4.9.10 in /java [arrow]

2024-06-21 Thread via GitHub
vibhatha commented on PR #42181: URL: https://github.com/apache/arrow/pull/42181#issuecomment-2177265978 @kou seems like CIs and Crossbows are passing.

Re: [PR] MINOR: [Java] Bump org.codehaus.mojo:exec-maven-plugin from 3.2.0 to 3.3.0 in /java [arrow]

2024-06-21 Thread via GitHub
vibhatha commented on PR #42187: URL: https://github.com/apache/arrow/pull/42187#issuecomment-2177266534 @kou Looks like CIs and crossbows are passing.

Re: [PR] MINOR: [Java] Bump org.codehaus.mojo:exec-maven-plugin from 3.2.0 to 3.3.0 in /java [arrow]

2024-06-21 Thread via GitHub
lidavidm merged PR #42187: URL: https://github.com/apache/arrow/pull/42187

Re: [PR] MINOR: [Java] Bump pl.project13.maven:git-commit-id-plugin from 4.0.5 to 4.9.10 in /java [arrow]

2024-06-21 Thread via GitHub
lidavidm merged PR #42181: URL: https://github.com/apache/arrow/pull/42181

Re: [PR] MINOR: [Java] Bump org.cyclonedx:cyclonedx-maven-plugin from 2.7.11 to 2.8.0 in /java [arrow]

2024-06-21 Thread via GitHub
lidavidm commented on PR #42184: URL: https://github.com/apache/arrow/pull/42184#issuecomment-2177270486 I re-ran the pipelines above

Re: [PR] MINOR: [Java] Bump org.cyclonedx:cyclonedx-maven-plugin from 2.7.11 to 2.8.0 in /java [arrow]

2024-06-21 Thread via GitHub
vibhatha commented on PR #42184: URL: https://github.com/apache/arrow/pull/42184#issuecomment-2177267988 @kou The same reason as mentioned [here](https://github.com/apache/arrow/pull/42181#issuecomment-2176536343) and the other failure is flaky but expected. Shall we re-run the crossbows?

Re: [PR] fix: Fix Meson include directories [arrow-nanoarrow]

2024-06-21 Thread via GitHub
paleolimbot commented on code in PR #532: URL: https://github.com/apache/arrow-nanoarrow/pull/532#discussion_r1645208963 ## meson.build: ## @@ -62,18 +38,20 @@ else libtype = 'library' endif +subdir('src') Review Comment: > Another approach would be to change the inc

Re: [PR] GH-41884: [Python] Fix RecordBatchReader.cast to support casting to equal schema for all types [arrow]

2024-06-21 Thread via GitHub
conbench-apache-arrow[bot] commented on PR #42098: URL: https://github.com/apache/arrow/pull/42098#issuecomment-2177229484 After merging your PR, Conbench analyzed the 8 benchmarking runs that have been run so far on merge-commit f2ce331d1df414e2a484d976dbea8871e63636af. There were no