Re: [PR] chore: Fix auto documentation update [arrow-js]

2025-06-22 Thread via GitHub
kou commented on PR #173: URL: https://github.com/apache/arrow-js/pull/173#issuecomment-2995138423 Worked: https://github.com/kou/arrow-js/actions/runs/15817003928/job/44577615157 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitH

Re: [I] Release 21.0.0 [arrow-js]

2025-06-22 Thread via GitHub
raulcd commented on issue #167: URL: https://github.com/apache/arrow-js/issues/167#issuecomment-2995106727 Given the upgrade to node and other dependencies, I think it's a good idea.

Re: [I] [CI][C++][R] R Sanitizer for M1 fails on S3FileSystem [arrow]

2025-06-22 Thread via GitHub
kou commented on issue #46863: URL: https://github.com/apache/arrow/issues/46863#issuecomment-2995070415 Hmm... I don't understand why this happens... Can someone help us?

[PR] GH-46879: [CI][Packaging][Linux] Don't check example build with old CMake [arrow]

2025-06-22 Thread via GitHub
kou opened a new pull request, #46880: URL: https://github.com/apache/arrow/pull/46880 ### Rationale for this change #46834 required CMake 3.25 or later for an example CMake project. ### What changes are included in this PR? Don't build the example CMake project with CMak

Re: [I] [CI][Dev] Fix shellcheck errors in the ci/scripts/install_dask.sh [arrow]

2025-06-22 Thread via GitHub
kou commented on issue #46846: URL: https://github.com/apache/arrow/issues/46846#issuecomment-2994986293 Issue resolved by pull request 46847 https://github.com/apache/arrow/pull/46847

Re: [PR] GH-46879: [CI][Packaging][Linux] Don't check example build with old CMake [arrow]

2025-06-22 Thread via GitHub
github-actions[bot] commented on PR #46880: URL: https://github.com/apache/arrow/pull/46880#issuecomment-2994984489 Revision: 894f98d893c4acc58ac491a3fc7f496dd6150897 Submitted crossbow builds: [ursacomputing/crossbow @ actions-08dc997b01](https://github.com/ursacomputing/crossbow/bra

Re: [PR] GH-46846: [CI][Dev] Fix shellcheck errors in the ci/scripts/install_dask.sh [arrow]

2025-06-22 Thread via GitHub
hiroyuki-sato commented on PR #46847: URL: https://github.com/apache/arrow/pull/46847#issuecomment-2994977290 I'd appreciate it if someone could review this.

Re: [I] Release 21.0.0 [arrow-swift]

2025-06-22 Thread via GitHub
kou commented on issue #49: URL: https://github.com/apache/arrow-swift/issues/49#issuecomment-2994950298 Should we release a new version after #40?

Re: [PR] GH-788: Bump Windows GitHub hosted runner to windows-2022 on release workflow [arrow-java]

2025-06-22 Thread via GitHub
lidavidm commented on PR #789: URL: https://github.com/apache/arrow-java/pull/789#issuecomment-2994681666 We may just have to merge this for now though and follow up on the other CI issues, given windows-2019 will be deprecated in a week. Though, quite possibly the Windows/Boost issue

Re: [PR] Perf: Optimize comparison kernels for inlined views [arrow-rs]

2025-06-22 Thread via GitHub
alamb commented on PR #7731: URL: https://github.com/apache/arrow-rs/pull/7731#issuecomment-2994157996 🤖 `./gh_compare_arrow.sh` [Benchmark Script](https://github.com/alamb/datafusion-benchmarking/blob/main/gh_compare_arrow.sh) Running Linux aal-dev 6.11.0-1015-gcp #15~24.04.1-Ubuntu SMP

Re: [PR] GH-46094: [C++][Docs] Add note to RleDecoder::Get's doc comment [arrow]

2025-06-22 Thread via GitHub
kou merged PR #46874: URL: https://github.com/apache/arrow/pull/46874

Re: [PR] GH-46777: [C++] Use SimplifyIsIn only when the value_set of the expression is lower than a threshold [arrow]

2025-06-22 Thread via GitHub
zanmato1984 commented on code in PR #46859: URL: https://github.com/apache/arrow/pull/46859#discussion_r2160603949 ## cpp/src/arrow/compute/expression.cc: ## @@ -1331,7 +1331,13 @@ struct Inequality { return call->function_name == "is_valid" ? literal(true) : literal(fal

Re: [I] [JAVA] Client is able to connect to GRPC_TLS flight server with GRPC_INSECURE [arrow-java]

2025-06-22 Thread via GitHub
lidavidm commented on issue #790: URL: https://github.com/apache/arrow-java/issues/790#issuecomment-2994676485 Can you validate this with an http2 client like cURL? I think grpc-java may be negotiating TLS regardless

Re: [PR] GH-725: Added ExtensionReader [arrow-java]

2025-06-22 Thread via GitHub
lidavidm commented on code in PR #726: URL: https://github.com/apache/arrow-java/pull/726#discussion_r2160579010 ## vector/src/test/java/org/apache/arrow/vector/complex/impl/UuidReaderImpl.java: ## @@ -0,0 +1,51 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under o

Re: [PR] GH-759: Get length of byte[] in TryCopyLastError [arrow-java]

2025-06-22 Thread via GitHub
lidavidm commented on PR #760: URL: https://github.com/apache/arrow-java/pull/760#issuecomment-2994640852 Regardless, there are Checkstyle failures ``` Error: /build/c/src/test/java/org/apache/arrow/c/ExceptionTest.java:30:8: Unused import: java.util.Objects. [UnusedImports]

Re: [PR] GH-759: Get length of byte[] in TryCopyLastError [arrow-java]

2025-06-22 Thread via GitHub
lidavidm commented on PR #760: URL: https://github.com/apache/arrow-java/pull/760#issuecomment-2994637005 ah right, CI is broken...

Re: [PR] Perf: Optimize comparison kernels for inlined views [arrow-rs]

2025-06-22 Thread via GitHub
zhuqi-lucas commented on code in PR #7731: URL: https://github.com/apache/arrow-rs/pull/7731#discussion_r2160529686 ## arrow-ord/src/cmp.rs: ## @@ -565,24 +565,46 @@ impl<'a, T: ByteViewType> ArrayOrd for &'a GenericByteViewArray { /// Item.0 is the array, Item.1 is the in

Re: [PR] Use `Arc<[Buffer]>` instead of raw `Vec` in `GenericByteViewArray` for faster `slice` [arrow-rs]

2025-06-22 Thread via GitHub
ctsk commented on PR #6427: URL: https://github.com/apache/arrow-rs/pull/6427#issuecomment-2994418150 May I have a go at it?

Re: [I] [Release] Build and publish Rust docs [arrow]

2025-06-22 Thread via GitHub
kou commented on issue #22005: URL: https://github.com/apache/arrow/issues/22005#issuecomment-2994453359 Rust was moved to arrow-rs.

Re: [PR] Perf: Optimize comparison kernels for inlined views [arrow-rs]

2025-06-22 Thread via GitHub
alamb commented on code in PR #7731: URL: https://github.com/apache/arrow-rs/pull/7731#discussion_r2160319037 ## arrow-ord/src/cmp.rs: ## @@ -566,12 +566,18 @@ impl<'a, T: ByteViewType> ArrayOrd for &'a GenericByteViewArray { type Item = (&'a GenericByteViewArray, usize);

Re: [PR] fix: Do not add null buffer for `NullArray` [arrow-rs]

2025-06-22 Thread via GitHub
alamb commented on code in PR #7726: URL: https://github.com/apache/arrow-rs/pull/7726#discussion_r2160318478 ## arrow-array/src/array/null_array.rs: ## @@ -201,4 +204,32 @@ mod tests { let array = NullArray::new(1024 * 1024); assert_eq!(format!("{array:?}"), "

Re: [PR] Perf: Optimize comparison kernels for inlined views [arrow-rs]

2025-06-22 Thread via GitHub
zhuqi-lucas commented on code in PR #7731: URL: https://github.com/apache/arrow-rs/pull/7731#discussion_r2160330993 ## arrow-ord/src/cmp.rs: ## @@ -566,12 +566,18 @@ impl<'a, T: ByteViewType> ArrayOrd for &'a GenericByteViewArray { type Item = (&'a GenericByteViewArray, us

Re: [PR] GH-46094: [C++][Docs] Add note to RleDecoder::Get's doc comment [arrow]

2025-06-22 Thread via GitHub
github-actions[bot] commented on PR #46874: URL: https://github.com/apache/arrow/pull/46874#issuecomment-2994130493 Revision: d82cf3854b115612fb59b57e0e9674da17c0eb23 Submitted crossbow builds: [ursacomputing/crossbow @ actions-bbae524876](https://github.com/ursacomputing/crossbow/bra

Re: [PR] [Variant] Use `BTreeMap` for `VariantBuilder.dict` and `ObjectBuilder.fields` to maintain invariants upon entry writes [arrow-rs]

2025-06-22 Thread via GitHub
alamb commented on PR #7720: URL: https://github.com/apache/arrow-rs/pull/7720#issuecomment-2994194379 I am not sure about returning an error on `append_value` Also, while typing this it seems like `ObjectBuilder::append_value` is a somewhat strange name -- maybe it would be better to

Re: [PR] fix: Do not add null buffer for `NullArray` in MutableArrayData [arrow-rs]

2025-06-22 Thread via GitHub
alamb merged PR #7726: URL: https://github.com/apache/arrow-rs/pull/7726

Re: [PR] Perf: Optimize comparison kernels for inlined views [arrow-rs]

2025-06-22 Thread via GitHub
zhuqi-lucas commented on code in PR #7731: URL: https://github.com/apache/arrow-rs/pull/7731#discussion_r2160529045 ## arrow-ord/src/cmp.rs: ## @@ -565,24 +565,46 @@ impl<'a, T: ByteViewType> ArrayOrd for &'a GenericByteViewArray { /// Item.0 is the array, Item.1 is the in

Re: [PR] GH-45098 [R] Provide a translation for data.table::fcase [arrow]

2025-06-22 Thread via GitHub
github-actions[bot] commented on PR #46878: URL: https://github.com/apache/arrow/pull/46878#issuecomment-2994498851 :warning: GitHub issue #45098 **has been automatically assigned in GitHub** to PR creator.

Re: [PR] GH-44502: [R] Negative fractional dates must be converted to integers by floor, not trunc [arrow]

2025-06-22 Thread via GitHub
jonkeane commented on code in PR #46873: URL: https://github.com/apache/arrow/pull/46873#discussion_r2160515139 ## r/tests/testthat/test-Array.R: ## @@ -1397,3 +1397,9 @@ test_that("Can convert R integer/double to decimal (ARROW-11631)", { "Conversion to decimal from non-i

[PR] GH-45098 [R] Provide a translation for data.table::fcase [arrow]

2025-06-22 Thread via GitHub
MichaelChirico opened a new pull request, #46878: URL: https://github.com/apache/arrow/pull/46878 Closes #45098. A number of notes: - I basically had Gemini write this (on my free personal account). It did 95% of the work, from one prompt, then I tidied up the results and fixe

Re: [PR] Perf: Optimize comparison kernels for inlined views [arrow-rs]

2025-06-22 Thread via GitHub
Dandandan commented on code in PR #7731: URL: https://github.com/apache/arrow-rs/pull/7731#discussion_r2160383718 ## arrow-ord/src/cmp.rs: ## @@ -565,24 +565,46 @@ impl<'a, T: ByteViewType> ArrayOrd for &'a GenericByteViewArray { /// Item.0 is the array, Item.1 is the inde

Re: [I] [C++] Review hardcoded "lib" paths in Find$PACKAGE.cmake related to endogenous libraries [arrow]

2025-06-22 Thread via GitHub
kou commented on issue #21518: URL: https://github.com/apache/arrow/issues/21518#issuecomment-2994454484 We migrated to `${PACKAGE}Config.cmake` from `Find${PACKAGE}.cmake`.

Re: [I] `test_to_pyarrow` tests fail during release verification [arrow-rs]

2025-06-22 Thread via GitHub
brunal commented on issue #7736: URL: https://github.com/apache/arrow-rs/issues/7736#issuecomment-2994314152 This is likely caused by a "cargo test": that will run the tests of all crates in the workspace, including "pyarrow". However "pyarrow" needs specific setup (pip install pyarrow). In

Re: [PR] Perf: Optimize comparison kernels for inlined views [arrow-rs]

2025-06-22 Thread via GitHub
zhuqi-lucas commented on code in PR #7731: URL: https://github.com/apache/arrow-rs/pull/7731#discussion_r2160331203 ## arrow-ord/src/cmp.rs: ## @@ -566,12 +566,18 @@ impl<'a, T: ByteViewType> ArrayOrd for &'a GenericByteViewArray { type Item = (&'a GenericByteViewArray, us

Re: [I] Do not populate nulls for `NullArray` for `MutableArrayData` [arrow-rs]

2025-06-22 Thread via GitHub
alamb closed issue #7725: Do not populate nulls for `NullArray` for `MutableArrayData` URL: https://github.com/apache/arrow-rs/issues/7725

Re: [I] [Crossbow] Consider removing artifact patterns [arrow]

2025-06-22 Thread via GitHub
kou commented on issue #23872: URL: https://github.com/apache/arrow/issues/23872#issuecomment-2994146100 #13740 implemented validation. #46743 may remove Crossbow.

Re: [PR] Perf: Optimize comparison kernels for inlined views [arrow-rs]

2025-06-22 Thread via GitHub
Dandandan commented on code in PR #7731: URL: https://github.com/apache/arrow-rs/pull/7731#discussion_r2160334767 ## arrow-ord/src/cmp.rs: ## @@ -626,6 +645,16 @@ pub fn compare_byte_view( ) -> std::cmp::Ordering { assert!(left_idx < left.len()); assert!(right_idx < r

Re: [PR] GH-45713: [GLib] Add garrow_chunked_array_(import|export)() [arrow]

2025-06-22 Thread via GitHub
conbench-apache-arrow[bot] commented on PR #46876: URL: https://github.com/apache/arrow/pull/46876#issuecomment-2994418061 After merging your PR, Conbench analyzed the 3 benchmarking runs that have been run so far on merge-commit 118ca35440345ebf2dfa4bf40721d8a58c9fb5b8. There were 12

Re: [PR] GH-45713: [GLib] Add garrow_chunked_array_(import|export)() [arrow]

2025-06-22 Thread via GitHub
kou merged PR #46876: URL: https://github.com/apache/arrow/pull/46876

Re: [PR] Perf: Optimize comparison kernels for inlined views [arrow-rs]

2025-06-22 Thread via GitHub
Dandandan commented on PR #7731: URL: https://github.com/apache/arrow-rs/pull/7731#issuecomment-2994174237 My results match the 3x / 10x improvement 🚀 ``` eq StringViewArray StringViewArray inlined bytes time: [6.5763 ms 6.6128 ms 6.6497 ms]

Re: [PR] Add fallible versions of temporal functions that may panic [arrow-rs]

2025-06-22 Thread via GitHub
adriangb commented on code in PR #7737: URL: https://github.com/apache/arrow-rs/pull/7737#discussion_r2160358275 ## arrow-arith/src/numeric.rs: ## @@ -510,49 +510,122 @@ fn timestamp_op( } /// Arithmetic trait for date arrays -/// -/// Note: these should be fallible (#4456)

[I] Parquet LevelEncoder is much too eager to write short rle runs [arrow-rs]

2025-06-22 Thread via GitHub
jhorstmann opened a new issue, #7739: URL: https://github.com/apache/arrow-rs/issues/7739 **Is your feature request related to a problem or challenge? Please describe what you are trying to do.** While looking into parquet loading performance, I noticed that the definition levels for

Re: [PR] Perf: Optimize comparison kernels for inlined views [arrow-rs]

2025-06-22 Thread via GitHub
Dandandan commented on code in PR #7731: URL: https://github.com/apache/arrow-rs/pull/7731#discussion_r2160394616 ## arrow-ord/src/cmp.rs: ## @@ -565,24 +565,46 @@ impl<'a, T: ByteViewType> ArrayOrd for &'a GenericByteViewArray { /// Item.0 is the array, Item.1 is the inde

Re: [I] [Variant] `Variant::Object` can contain two fields with the same field name [arrow-rs]

2025-06-22 Thread via GitHub
friendlymatthew commented on issue #7730: URL: https://github.com/apache/arrow-rs/issues/7730#issuecomment-2994267927 https://github.com/apache/arrow-rs/compare/main...pydantic:arrow-rs:friendlymatthew/check-duplicate-field-name?expand=1

Re: [I] [R] timestamp support? [arrow-adbc]

2025-06-22 Thread via GitHub
eitsupi commented on issue #1587: URL: https://github.com/apache/arrow-adbc/issues/1587#issuecomment-2994266387 Seems fixed by #1707

[PR] [WIP] feat: Add Validation for Variant Decimal [arrow-rs]

2025-06-22 Thread via GitHub
Weijun-H opened a new pull request, #7738: URL: https://github.com/apache/arrow-rs/pull/7738 # Which issue does this PR close? We generally require a GitHub issue to be filed for all bug fixes and enhancements and this helps us generate change logs for our releases. You can link an i

Re: [PR] Perf: Optimize comparison kernels for inlined views [arrow-rs]

2025-06-22 Thread via GitHub
zhuqi-lucas commented on PR #7731: URL: https://github.com/apache/arrow-rs/pull/7731#issuecomment-2994254034 The benchmark result, i run locally for latest PR: ```rust critcmp main fast_path_view --filter "iew" group

Re: [I] [Variant] Support Nested Data in `VariantBuilder` [arrow-rs]

2025-06-22 Thread via GitHub
friendlymatthew commented on issue #7696: URL: https://github.com/apache/arrow-rs/issues/7696#issuecomment-2994255999 take

Re: [PR] Perf: Optimize comparison kernels for inlined views [arrow-rs]

2025-06-22 Thread via GitHub
zhuqi-lucas commented on PR #7731: URL: https://github.com/apache/arrow-rs/pull/7731#issuecomment-2994245549 > Promising results! I think we need some test to show it is correct with null characters and different lengths, I think we need to modify the implementation a bit. Thank
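The correctness concern quoted above (null characters and differing lengths) can be illustrated with a minimal sketch. It assumes the Arrow view layout in which strings of at most 12 bytes are stored inline and zero-padded; `pad_inline` and `compare_inlined` are hypothetical names for illustration and are not the PR's implementation. Comparing only the padded bytes would treat `"a"` and `"a\0"` as equal, so the length is used as a tiebreak.

```rust
use std::cmp::Ordering;

/// Assumed maximum number of bytes a view stores inline.
const MAX_INLINE_LEN: usize = 12;

/// Hypothetical helper: copy a short byte string into a zero-padded buffer,
/// mimicking how an inlined view stores its data.
fn pad_inline(bytes: &[u8]) -> [u8; MAX_INLINE_LEN] {
    assert!(bytes.len() <= MAX_INLINE_LEN);
    let mut buf = [0u8; MAX_INLINE_LEN];
    buf[..bytes.len()].copy_from_slice(bytes);
    buf
}

/// Hypothetical comparison: order by the padded bytes first, then break
/// ties by length so that "a" sorts before "a\0".
fn compare_inlined(left: &[u8], right: &[u8]) -> Ordering {
    pad_inline(left)
        .cmp(&pad_inline(right))
        .then_with(|| left.len().cmp(&right.len()))
}
```

A test along these lines would compare the result against plain slice ordering for inputs such as `b""`, `b"\0"`, `b"a"`, `b"a\0"`, and `b"ab"`.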

Re: [I] docs: Is ADBC support in DuckDB still in progress? [arrow-adbc]

2025-06-22 Thread via GitHub
eitsupi commented on issue #2216: URL: https://github.com/apache/arrow-adbc/issues/2216#issuecomment-2994215601 > Hmm, but there never was an 'ADBC 0.7.0'. The title seems to be saying 0.7.0 because the test started failing with the release of Python's adbc_driver_manager package 0.7.

Re: [PR] Add fallible versions of temporal functions that may panic [arrow-rs]

2025-06-22 Thread via GitHub
adriangb commented on PR #7737: URL: https://github.com/apache/arrow-rs/pull/7737#issuecomment-2994230624 cc @tustvold @alamb

[PR] Add fallible versions of temporal functions that may panic [arrow-rs]

2025-06-22 Thread via GitHub
adriangb opened a new pull request, #7737: URL: https://github.com/apache/arrow-rs/pull/7737 Fixes #4456

Re: [PR] Define a "arrow-pyrarrow" crate to implement the "pyarrow" feature. [arrow-rs]

2025-06-22 Thread via GitHub
alamb commented on PR #7694: URL: https://github.com/apache/arrow-rs/pull/7694#issuecomment-2994226563 I hit some error probably related to this PR when making a release candidate: https://github.com/apache/arrow-rs/issues/7736

Re: [I] `test_to_pyarrow` tests fail during release verification [arrow-rs]

2025-06-22 Thread via GitHub
alamb commented on issue #7736: URL: https://github.com/apache/arrow-rs/issues/7736#issuecomment-2994226307 It fails locally with this command: ```shell cargo test -p arrow-pyarrow --test pyarrow ```

Re: [I] Release arrow-rs / parquet Minor version `55.2.0` (June 2025) [arrow-rs]

2025-06-22 Thread via GitHub
alamb commented on issue #7394: URL: https://github.com/apache/arrow-rs/issues/7394#issuecomment-2994225859 I hit a problem when testing the RC: - https://github.com/apache/arrow-rs/issues/7736 I will investigate

[I] `test_to_pyarrow` tests fail during release verification [arrow-rs]

2025-06-22 Thread via GitHub
alamb opened a new issue, #7736: URL: https://github.com/apache/arrow-rs/issues/7736 **Describe the bug** When verifying the release cand ``` running 2 tests test test_to_pyarrow_byte_view ... FAILED test test_to_pyarrow ... FAILED failures: test_to_pyarr

Re: [PR] Prepare for `55.2.0` release [arrow-rs]

2025-06-22 Thread via GitHub
alamb merged PR #7722: URL: https://github.com/apache/arrow-rs/pull/7722

Re: [PR] Perf: Optimize comparison kernels for inlined views [arrow-rs]

2025-06-22 Thread via GitHub
alamb commented on PR #7731: URL: https://github.com/apache/arrow-rs/pull/7731#issuecomment-2994192858 🤖: Benchmark completed Details ``` group fast_path_view

Re: [I] rust: split workspaces so that dependencies of adbc_core aren't tied to what the datafusion driver requires [arrow-adbc]

2025-06-22 Thread via GitHub
lidavidm commented on issue #2739: URL: https://github.com/apache/arrow-adbc/issues/2739#issuecomment-2994130448 Ok. It sounds like we don't need to split workspaces, at least. We can just relax `adbc_core`'s version bound (https://github.com/apache/arrow-adbc/issues/2524) and have `adbc_da

Re: [I] [Variant] Add input validation in `VariantBuilder` [arrow-rs]

2025-06-22 Thread via GitHub
alamb commented on issue #7697: URL: https://github.com/apache/arrow-rs/issues/7697#issuecomment-2994200543 > It would be better to provide two append functions: one that includes validation and returns an error, and another without validation. 🤔 Another approach that @friendlymatthew

Re: [I] In arrow_json, Decoder::decode can panic if it encounters two high surrogates in a row. [arrow-rs]

2025-06-22 Thread via GitHub
alamb closed issue #7712: In arrow_json, Decoder::decode can panic if it encounters two high surrogates in a row. URL: https://github.com/apache/arrow-rs/issues/7712

Re: [I] [Archery] lint sub-command should provide a --fail-fast option [arrow]

2025-06-22 Thread via GitHub
kou commented on issue #23158: URL: https://github.com/apache/arrow/issues/23158#issuecomment-2994148535 #40417 removed `archery lint`.

Re: [PR] [Variant] Use `BTreeMap` for `VariantBuilder.dict` and `ObjectBuilder.fields` to maintain invariants upon entry writes [arrow-rs]

2025-06-22 Thread via GitHub
alamb commented on PR #7720: URL: https://github.com/apache/arrow-rs/pull/7720#issuecomment-2994196744 > I got too excited and decided to add the duplicate field name check to this PR. I'm happy to roll that commit back and merge this PR as strictly a BTreeMap change, and then push up a fol

Re: [PR] Perf: Optimize comparison kernels for inlined views [arrow-rs]

2025-06-22 Thread via GitHub
Dandandan commented on code in PR #7731: URL: https://github.com/apache/arrow-rs/pull/7731#discussion_r2160332637 ## arrow-ord/src/cmp.rs: ## @@ -626,6 +645,16 @@ pub fn compare_byte_view( ) -> std::cmp::Ordering { assert!(left_idx < left.len()); assert!(right_idx < r

Re: [PR] [Variant] Use `BTreeMap` for `VariantBuilder.dict` and `ObjectBuilder.fields` to maintain invariants upon entry writes [arrow-rs]

2025-06-22 Thread via GitHub
friendlymatthew commented on PR #7720: URL: https://github.com/apache/arrow-rs/pull/7720#issuecomment-2994196078 > I am not sure about returning an error on `append_value` > > Also, while typing this it seems like `ObjectBuilder::append_value` is a somewhat strange name -- maybe it wo

Re: [PR] [Variant] Use `BTreeMap` for `VariantBuilder.dict` and `ObjectBuilder.fields` to maintain invariants upon entry writes [arrow-rs]

2025-06-22 Thread via GitHub
friendlymatthew commented on code in PR #7720: URL: https://github.com/apache/arrow-rs/pull/7720#discussion_r2160335027 ## parquet-variant/src/builder.rs: ## @@ -484,13 +486,33 @@ impl<'a> ObjectBuilder<'a> { } } +fn check_duplicate_field_name(&self, key: &st

Re: [PR] Perf: Optimize comparison kernels for inlined views [arrow-rs]

2025-06-22 Thread via GitHub
zhuqi-lucas commented on code in PR #7731: URL: https://github.com/apache/arrow-rs/pull/7731#discussion_r2160334834 ## arrow-ord/src/cmp.rs: ## @@ -626,6 +645,16 @@ pub fn compare_byte_view( ) -> std::cmp::Ordering { assert!(left_idx < left.len()); assert!(right_idx <

Re: [PR] [Variant] Use `BTreeMap` for `VariantBuilder.dict` and `ObjectBuilder.fields` to maintain invariants upon entry writes [arrow-rs]

2025-06-22 Thread via GitHub
alamb commented on code in PR #7720: URL: https://github.com/apache/arrow-rs/pull/7720#discussion_r2160334082 ## parquet-variant/src/builder.rs: ## @@ -484,13 +486,33 @@ impl<'a> ObjectBuilder<'a> { } } +fn check_duplicate_field_name(&self, key: &str) -> Resu

Re: [PR] parquet_derive: update in working example for ParquetRecordWriter [arrow-rs]

2025-06-22 Thread via GitHub
alamb commented on code in PR #7733: URL: https://github.com/apache/arrow-rs/pull/7733#discussion_r2160332949 ## parquet_derive/src/lib.rs: ## @@ -50,45 +50,52 @@ mod parquet_field; /// Example: /// /// ```no_run Review Comment: Is there any way we can remove the `no_run`

Re: [PR] fix JSON decoder error checking for UTF16 / surrogate parsing panic [arrow-rs]

2025-06-22 Thread via GitHub
alamb commented on PR #7721: URL: https://github.com/apache/arrow-rs/pull/7721#issuecomment-2994190667 If anything this seems faster than what is on main, so it has an added bonus

Re: [PR] fix JSON decoder error checking for UTF16 / surrogate parsing panic [arrow-rs]

2025-06-22 Thread via GitHub
alamb merged PR #7721: URL: https://github.com/apache/arrow-rs/pull/7721

Re: [PR] fix JSON decoder error checking for UTF16 / surrogate parsing panic [arrow-rs]

2025-06-22 Thread via GitHub
alamb commented on PR #7721: URL: https://github.com/apache/arrow-rs/pull/7721#issuecomment-2994190714 Thanks again @nicklan and @scovich

Re: [PR] Perf: Optimize comparison kernels for inlined views [arrow-rs]

2025-06-22 Thread via GitHub
alamb commented on PR #7731: URL: https://github.com/apache/arrow-rs/pull/7731#issuecomment-2994190179 🤖 `./gh_compare_arrow.sh` [Benchmark Script](https://github.com/alamb/datafusion-benchmarking/blob/main/gh_compare_arrow.sh) Running Linux aal-dev 6.11.0-1015-gcp #15~24.04.1-Ubuntu SMP

Re: [PR] Perf: Optimize comparison kernels for inlined views [arrow-rs]

2025-06-22 Thread via GitHub
Dandandan commented on code in PR #7731: URL: https://github.com/apache/arrow-rs/pull/7731#discussion_r2160322025 ## arrow-ord/src/cmp.rs: ## @@ -566,12 +566,18 @@ impl<'a, T: ByteViewType> ArrayOrd for &'a GenericByteViewArray { type Item = (&'a GenericByteViewArray, usiz

Re: [PR] Perf: Optimize comparison kernels for inlined views [arrow-rs]

2025-06-22 Thread via GitHub
alamb commented on PR #7731: URL: https://github.com/apache/arrow-rs/pull/7731#issuecomment-2994160678 🤖: Benchmark completed Details ``` group fast_path_view

Re: [I] [Release] Document to use SNAPSHOT versions in pom.xml files for patch releases [arrow]

2025-06-22 Thread via GitHub
kou commented on issue #22413: URL: https://github.com/apache/arrow/issues/22413#issuecomment-2994151030 We moved Java to apache/arrow-java.

Re: [I] [Archery] Consider to use archery with or instead of the pre-commit hooks [arrow]

2025-06-22 Thread via GitHub
kou commented on issue #23447: URL: https://github.com/apache/arrow/issues/23447#issuecomment-2994148062 #40417 removed `archery lint` because we use `pre-commit`.

Re: [I] [GLib] Add Duration type support [arrow]

2025-06-22 Thread via GitHub
kou commented on issue #23970: URL: https://github.com/apache/arrow/issues/23970#issuecomment-2994144400 This should be implemented.

Re: [I] [GLib] Add `garrow_chunked_array_import()` and `garrow_chunked_array_export()` [arrow]

2025-06-22 Thread via GitHub
kou commented on issue #45713: URL: https://github.com/apache/arrow/issues/45713#issuecomment-2994132932 Issue resolved by pull request 46876 https://github.com/apache/arrow/pull/46876

Re: [PR] GH-46094: [C++][Docs] Add note to RleDecoder::Get's doc comment [arrow]

2025-06-22 Thread via GitHub
kou commented on PR #46874: URL: https://github.com/apache/arrow/pull/46874#issuecomment-2994129246 @github-actions crossbow submit preview-docs

Re: [I] docs: Is ADBC support in DuckDB still in progress? [arrow-adbc]

2025-06-22 Thread via GitHub
lidavidm commented on issue #2216: URL: https://github.com/apache/arrow-adbc/issues/2216#issuecomment-2994128905 Hmm, but there never was an 'ADBC 0.7.0'.

Re: [PR] arrow-row: Refactor arrow-row REE roundtrip tests [arrow-rs]

2025-06-22 Thread via GitHub
alamb commented on code in PR #7729: URL: https://github.com/apache/arrow-rs/pull/7729#discussion_r2160302652 ## arrow-row/src/run.rs: ## @@ -297,24 +262,7 @@ mod tests { .into_iter() .collect(); -let converter = RowConverter::new(vec![SortFi

Re: [I] [Variant] Add input validation in `VariantBuilder` [arrow-rs]

2025-06-22 Thread via GitHub
Weijun-H commented on issue #7697: URL: https://github.com/apache/arrow-rs/issues/7697#issuecomment-2994093254 It would be better to provide two append functions: one that includes validation and returns an error, and another without validation. 🤔
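The two-function idea in the comment above might look roughly like the following sketch. Everything here is hypothetical and for illustration only: `SketchBuilder`, `ValidationError`, `MAX_ABS`, `try_append`, and `append_unchecked` are assumed names, not the `VariantBuilder` API, and the range check merely stands in for whatever validation the real builder would perform.

```rust
/// Hypothetical error type returned by the validating append.
#[derive(Debug, PartialEq)]
struct ValidationError(String);

/// Hypothetical limit standing in for a real validation rule.
const MAX_ABS: i64 = 999_999_999;

/// Hypothetical builder that collects values.
#[derive(Default)]
struct SketchBuilder {
    values: Vec<i64>,
}

impl SketchBuilder {
    /// Validating variant: rejects out-of-range input and leaves the
    /// builder unchanged on error.
    fn try_append(&mut self, value: i64) -> Result<(), ValidationError> {
        if !(-MAX_ABS..=MAX_ABS).contains(&value) {
            return Err(ValidationError(format!("value {value} out of range")));
        }
        self.values.push(value);
        Ok(())
    }

    /// Non-validating variant: the caller promises the input is valid.
    fn append_unchecked(&mut self, value: i64) {
        self.values.push(value);
    }
}
```

In this shape, callers on a hot path with pre-validated data could use `append_unchecked`, while callers handling untrusted input would use `try_append` and propagate the error.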

Re: [I] [Variant] Add input validation in `VariantBuilder` [arrow-rs]

2025-06-22 Thread via GitHub
Weijun-H commented on issue #7697: URL: https://github.com/apache/arrow-rs/issues/7697#issuecomment-2994085111 take

Re: [I] docs: Is ADBC support in DuckDB still in progress? [arrow-adbc]

2025-06-22 Thread via GitHub
lidavidm commented on issue #2216: URL: https://github.com/apache/arrow-adbc/issues/2216#issuecomment-2994025682 Hmm...I think they must've written "DuckDB 0.7.0 supports ADBC" wrongly

Re: [I] rust: split workspaces so that dependencies of adbc_core aren't tied to what the datafusion driver requires [arrow-adbc]

2025-06-22 Thread via GitHub
eitsupi commented on issue #2739: URL: https://github.com/apache/arrow-adbc/issues/2739#issuecomment-2994031457 I think we can close this. (at least at this time the latest datafusion and adbc are based on the same arrow-rs version, no additional work needed)

Re: [I] docs: Is ADBC support in DuckDB still in progress? [arrow-adbc]

2025-06-22 Thread via GitHub
eitsupi commented on issue #2216: URL: https://github.com/apache/arrow-adbc/issues/2216#issuecomment-2994030347 > I think they must've written "DuckDB 0.7.0 supports ADBC" wrongly This document was updated at the time of the DuckDB 0.10.0 release, so I believe they are referring to th

Re: [I] rust: split workspaces so that dependencies of adbc_core aren't tied to what the datafusion driver requires [arrow-adbc]

2025-06-22 Thread via GitHub
lidavidm commented on issue #2739: URL: https://github.com/apache/arrow-adbc/issues/2739#issuecomment-2994028082 @eitsupi so do you think we can just close this (now that you've added an MSRV test)?

Re: [I] docs: Is ADBC support in DuckDB still in progress? [arrow-adbc]

2025-06-22 Thread via GitHub
eitsupi commented on issue #2216: URL: https://github.com/apache/arrow-adbc/issues/2216#issuecomment-2994023152 I was wondering if there is a possibility that ADBC may not support API 1.1.0, since the DuckDB documentation states the following. > DuckDB's ADBC driver currently supports