Re: [PR] chore: remove panics in datafusion-common::scalar [arrow-datafusion]

2023-10-26 Thread via GitHub
junjunjd commented on code in PR #7901: URL: https://github.com/apache/arrow-datafusion/pull/7901#discussion_r1374142935 ## datafusion/common/src/scalar.rs: ## @@ -3419,8 +3492,8 @@ mod tests { { let scalar_result = left.add_checked(&right); -let left_arr

Re: [PR] chore: remove panics in datafusion-common::scalar [arrow-datafusion]

2023-10-26 Thread via GitHub
junjunjd commented on code in PR #7901: URL: https://github.com/apache/arrow-datafusion/pull/7901#discussion_r1374142052 ## datafusion/common/src/scalar.rs: ## @@ -3419,8 +3492,8 @@ mod tests { { let scalar_result = left.add_checked(&right); -let left_arr

Re: [I] [C++] Should we add a arrow::RecordBatchWriter and make it parent of arrow::ipc::RecordBatchWriter [arrow]

2023-10-26 Thread via GitHub
mapleFU commented on issue #38487: URL: https://github.com/apache/arrow/issues/38487#issuecomment-1782384623 @lidavidm Is this expected 🤔? Or how could we bypass this? It's a bit weird here. -- This is an automated message from the Apache Git Service. To respond to the message, pl

Re: [PR] MINOR: [Python] Deduplicate `ensure_s3_initialized()` call [arrow]

2023-10-26 Thread via GitHub
conbench-apache-arrow[bot] commented on PR #38451: URL: https://github.com/apache/arrow/pull/38451#issuecomment-1782367222 After merging your PR, Conbench analyzed the 6 benchmarking runs that have been run so far on merge-commit ff7b3bba3262c38ba94041aaf0143c0217e41457. There were no

Re: [I] [R] Inconsistent naming [arrow]

2023-10-26 Thread via GitHub
thisisnic commented on issue #38456: URL: https://github.com/apache/arrow/issues/38456#issuecomment-1782360840 The thing is though, they're almost but not directly interchangeable. There are some params in readr::read_csv which arrow doesn't have, and reasons why folks might want both in 1

Re: [PR] GH-38346: [C++][Parquet] Use new encrypted files for page index encryption test [arrow]

2023-10-26 Thread via GitHub
conbench-apache-arrow[bot] commented on PR #38347: URL: https://github.com/apache/arrow/pull/38347#issuecomment-1782358128 After merging your PR, Conbench analyzed the 6 benchmarking runs that have been run so far on merge-commit e8360615adf6c5a9bb76b81267d08388c7cfc3a9. There were 9

Re: [PR] chore: remove panics in datafusion-common::scalar [arrow-datafusion]

2023-10-26 Thread via GitHub
junjunjd commented on code in PR #7901: URL: https://github.com/apache/arrow-datafusion/pull/7901#discussion_r1374109902 ## datafusion/common/src/scalar.rs: ## @@ -3468,22 +3541,30 @@ mod tests { } // decimal scalar to array -let array = decimal_value

Re: [PR] chore: remove panics in datafusion-common::scalar [arrow-datafusion]

2023-10-26 Thread via GitHub
junjunjd commented on code in PR #7901: URL: https://github.com/apache/arrow-datafusion/pull/7901#discussion_r1374109734 ## datafusion/common/src/scalar.rs: ## @@ -3468,22 +3541,30 @@ mod tests { } // decimal scalar to array -let array = decimal_value

Re: [PR] chore: remove panics in datafusion-common::scalar [arrow-datafusion]

2023-10-26 Thread via GitHub
junjunjd commented on code in PR #7901: URL: https://github.com/apache/arrow-datafusion/pull/7901#discussion_r1374109555 ## datafusion/common/src/scalar.rs: ## @@ -3468,22 +3541,30 @@ mod tests { } // decimal scalar to array -let array = decimal_value

Re: [I] python: How `adbc_driver_manager.AdbcConnection` gets outgoing call headers [arrow-adbc]

2023-10-26 Thread via GitHub
xinyiZzz commented on issue #1079: URL: https://github.com/apache/arrow-adbc/issues/1079#issuecomment-1782243600 Solved, impl auth2 in Arrow 13.0. Close this issue.

Re: [PR] GH-38452: [C++][Benchmark] Adding benchmark for LZ4/Snappy Compression [arrow]

2023-10-26 Thread via GitHub
conbench-apache-arrow[bot] commented on PR #38453: URL: https://github.com/apache/arrow/pull/38453#issuecomment-1782223260 After merging your PR, Conbench analyzed the 6 benchmarking runs that have been run so far on merge-commit 73589ddd60e4cbcd860102871692541989ea38c6. There were no

Re: [PR] GH-38438: [C++] Dataset: Trying to fix the async bug in Parquet dataset [arrow]

2023-10-26 Thread via GitHub
mapleFU commented on PR #38466: URL: https://github.com/apache/arrow/pull/38466#issuecomment-178746 @bkietz I've tried, but so far I cannot reproduce it without the sample file...

Re: [PR] GH-38438: [C++] Dataset: Trying to fix the async bug in Parquet dataset [arrow]

2023-10-26 Thread via GitHub
mapleFU commented on PR #38466: URL: https://github.com/apache/arrow/pull/38466#issuecomment-1782212775 Also cannot re-produce by: ``` class DelayedBufferReader : public ::arrow::io::BufferReader { public: explicit DelayedBufferReader(const std::shared_ptr<::arrow::Buff

Re: [PR] GH-38447: [CI][Release] Don't use "|| {exit,continue}" [arrow]

2023-10-26 Thread via GitHub
github-actions[bot] commented on PR #38486: URL: https://github.com/apache/arrow/pull/38486#issuecomment-1782160298 Revision: c3335b62977e331a8eda889596f91ff14764ab3f Submitted crossbow builds: [ursacomputing/crossbow @ actions-8541aa5596](https://github.com/ursacomputing/crossbow/bra

Re: [PR] GH-38447: [CI][Release] Don't use "|| {exit,continue}" [arrow]

2023-10-26 Thread via GitHub
github-actions[bot] commented on PR #38486: URL: https://github.com/apache/arrow/pull/38486#issuecomment-1782157207 :warning: GitHub issue #38447 **has been automatically assigned in GitHub** to PR creator.

Re: [PR] GH-38447: [CI][Release] Don't use "|| {exit,continue}" [arrow]

2023-10-26 Thread via GitHub
kou commented on PR #38486: URL: https://github.com/apache/arrow/pull/38486#issuecomment-1782157184 @github-actions crossbow submit -g verify-rc

[PR] [CI][Release] Don't use "|| {exit,continue}" [arrow]

2023-10-26 Thread via GitHub
kou opened a new pull request, #38486: URL: https://github.com/apache/arrow/pull/38486 ### Rationale for this change If we use "|| {exit,continue}", "set -x" doesn't work. With "|| exit" ("false" in "a()" doesn't stop the shell execution): ```console $ cat /tmp/with-or

Re: [PR] GH-37753: [C++][Gandiva] Add external function registry support [arrow]

2023-10-26 Thread via GitHub
kou commented on PR #38116: URL: https://github.com/apache/arrow/pull/38116#issuecomment-1782139747 (I'll review this in the next week.)

Re: [PR] GH-37429: [C++] Add arrow::ipc::StreamDecoder::Reset() [arrow]

2023-10-26 Thread via GitHub
kou commented on PR #37970: URL: https://github.com/apache/arrow/pull/37970#issuecomment-1782137506 I'll merge this in a few days if nobody objects to it.

Re: [PR] GH-38339: [C++][CMake] Use transitive dependency for system GoogleTest [arrow]

2023-10-26 Thread via GitHub
github-actions[bot] commented on PR #38340: URL: https://github.com/apache/arrow/pull/38340#issuecomment-1782134003 Revision: 50e94952e80a10eb1a0e77a18c1291b88dc4a4f3 Submitted crossbow builds: [ursacomputing/crossbow @ actions-a89b41bb9d](https://github.com/ursacomputing/crossbow/bra

Re: [PR] GH-38339: [C++][CMake] Use transitive dependency for system GoogleTest [arrow]

2023-10-26 Thread via GitHub
kou commented on PR #38340: URL: https://github.com/apache/arrow/pull/38340#issuecomment-1782132376 @github-actions crossbow submit -g cpp

Re: [PR] GH-38339: [C++][CMake] Use transitive dependency for system GoogleTest [arrow]

2023-10-26 Thread via GitHub
kou commented on PR #38340: URL: https://github.com/apache/arrow/pull/38340#issuecomment-1782128714 I'll merge this after I resolve a conflict.

Re: [PR] GH-38378: [C++][Parquet] Don't initialize OpenSSL explicitly with OpenSSL 1.1 [arrow]

2023-10-26 Thread via GitHub
kou merged PR #38379: URL: https://github.com/apache/arrow/pull/38379

Re: [PR] GH-38378: [C++][Parquet] Don't initialize OpenSSL explicitly with OpenSSL 1.1 [arrow]

2023-10-26 Thread via GitHub
kou commented on PR #38379: URL: https://github.com/apache/arrow/pull/38379#issuecomment-1782127857 No objection. I'll merge this.

Re: [PR] GH-38424: [CI][C++] Use Fedora 38 instead of 35 [arrow]

2023-10-26 Thread via GitHub
kou merged PR #38425: URL: https://github.com/apache/arrow/pull/38425

Re: [PR] Add HashMap for searching fields in DFSchema [arrow-datafusion]

2023-10-26 Thread via GitHub
maruschin closed pull request #7878: Add HashMap for searching fields in DFSchema URL: https://github.com/apache/arrow-datafusion/pull/7878

Re: [PR] GH-33984: [Python] __dlpack__ implementation (producer) [arrow]

2023-10-26 Thread via GitHub
kkraus14 commented on code in PR #38472: URL: https://github.com/apache/arrow/pull/38472#discussion_r1373958835 ## python/pyarrow/_dlpack.pxi: ## @@ -0,0 +1,126 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the N

Re: [PR] GH-33984: [Python] __dlpack__ implementation (producer) [arrow]

2023-10-26 Thread via GitHub
kkraus14 commented on code in PR #38472: URL: https://github.com/apache/arrow/pull/38472#discussion_r1373958622 ## python/pyarrow/_dlpack.pxi: ## @@ -0,0 +1,126 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the N

[PR] Add Ruff CI action [arrow-datafusion-python]

2023-10-26 Thread via GitHub
andygrove opened a new pull request, #529: URL: https://github.com/apache/arrow-datafusion-python/pull/529 # Which issue does this PR close? N/A # Rationale for this change Experiment. # What changes are included in this PR? Ruff CI check

Re: [I] Panic with queries with multiple `COUNT DISTINCT` aggregates on dictionary values, and a group by [arrow-datafusion]

2023-10-26 Thread via GitHub
jayzhan211 commented on issue #7938: URL: https://github.com/apache/arrow-datafusion/issues/7938#issuecomment-1782112411 In this case we may need to change the logic of state() https://github.com/apache/arrow-datafusion/blob/a9d66e2b492843c2fb335a7dfe27fed073629b09/datafusion/physical-expr/s

Re: [PR] GH-38449: [Release][Go][macOS] Use local test data if possible [arrow]

2023-10-26 Thread via GitHub
conbench-apache-arrow[bot] commented on PR #38450: URL: https://github.com/apache/arrow/pull/38450#issuecomment-1782111914 After merging your PR, Conbench analyzed the 5 benchmarking runs that have been run so far on merge-commit 25e5d19b98511310e617c8b6e3c6ce6af39dfcde. There were no

Re: [PR] Initial implementation of array union without deduplication [arrow-datafusion]

2023-10-26 Thread via GitHub
edmondop commented on code in PR #7897: URL: https://github.com/apache/arrow-datafusion/pull/7897#discussion_r1373952693 ## datafusion/physical-expr/src/array_expressions.rs: ## @@ -1478,6 +1480,53 @@ macro_rules! to_string { }}; } + +/// Array_union SQL function +pub fn

Re: [PR] GH-38460: [Java][FlightRPC] Add mTLS support for Flight SQL JDBC driver [arrow]

2023-10-26 Thread via GitHub
prmoore77 commented on code in PR #38461: URL: https://github.com/apache/arrow/pull/38461#discussion_r1373947747 ## java/flight/flight-core/src/main/java/org/apache/arrow/flight/FlightServer.java: ## @@ -328,6 +360,15 @@ public Builder useTls(final InputStream certChain, final

Re: [PR] GH-38460: [Java][FlightRPC] Add mTLS support for Flight SQL JDBC driver [arrow]

2023-10-26 Thread via GitHub
prmoore77 commented on code in PR #38461: URL: https://github.com/apache/arrow/pull/38461#discussion_r1373945638 ## java/flight/flight-core/src/main/java/org/apache/arrow/flight/FlightServer.java: ## @@ -317,6 +340,15 @@ public Builder useTls(final File certChain, final File ke

Re: [PR] GH-38460: [Java][FlightRPC] Add mTLS support for Flight SQL JDBC driver [arrow]

2023-10-26 Thread via GitHub
prmoore77 commented on code in PR #38461: URL: https://github.com/apache/arrow/pull/38461#discussion_r1373942791 ## java/flight/flight-sql-jdbc-core/src/test/java/org/apache/arrow/driver/jdbc/ConnectionMutualTlsTest.java: ## @@ -0,0 +1,436 @@ +/* + * Licensed to the Apache Softw

Re: [I] Panic with queries with multiple `COUNT DISTINCT` aggregates on dictionary values, and a group by [arrow-datafusion]

2023-10-26 Thread via GitHub
jayzhan211 commented on issue #7938: URL: https://github.com/apache/arrow-datafusion/issues/7938#issuecomment-1782096338 btw I forgot to fix this "Unsupported data type {:?} for ScalarValue::list_to_array" to "Unsupported data type {:?} for ScalarValue::new_list"

Re: [I] Panic with queries with multiple `COUNT DISTINCT` aggregates on dictionary values, and a group by [arrow-datafusion]

2023-10-26 Thread via GitHub
jayzhan211 commented on issue #7938: URL: https://github.com/apache/arrow-datafusion/issues/7938#issuecomment-1782083281 Other than Dictionary, there are also many types I did not support yet, since I am not sure whether we need them all, so I only supported the type that is used (aka pass

Re: [PR] GH-38460: [Java][FlightRPC] Add mTLS support for Flight SQL JDBC driver [arrow]

2023-10-26 Thread via GitHub
jduo commented on code in PR #38461: URL: https://github.com/apache/arrow/pull/38461#discussion_r1373923037 ## java/flight/flight-core/src/main/java/org/apache/arrow/flight/FlightServer.java: ## @@ -328,6 +360,15 @@ public Builder useTls(final InputStream certChain, final Input

Re: [PR] GH-38460: [Java][FlightRPC] Add mTLS support for Flight SQL JDBC driver [arrow]

2023-10-26 Thread via GitHub
jduo commented on code in PR #38461: URL: https://github.com/apache/arrow/pull/38461#discussion_r1373920916 ## java/flight/flight-core/src/main/java/org/apache/arrow/flight/FlightServer.java: ## @@ -317,6 +340,15 @@ public Builder useTls(final File certChain, final File key) th

Re: [I] Can't read a partitioned dataset using directory partitioning in windows - "C:" drive letter [arrow]

2023-10-26 Thread via GitHub
davlee1972 commented on issue #38485: URL: https://github.com/apache/arrow/issues/38485#issuecomment-1782068671 Seems like there should be some sort of reverse_path() function to capture partitioned directory values.

Re: [PR] [GH-38381][C++][Acero] Create a sorted merge node [arrow]

2023-10-26 Thread via GitHub
JerAguilon commented on PR #38380: URL: https://github.com/apache/arrow/pull/38380#issuecomment-1782024162 Ready for another round. TY for the feedback

Re: [PR] GH-38460: [Java][FlightRPC] Add mTLS support for Flight SQL JDBC driver [arrow]

2023-10-26 Thread via GitHub
danepitkin commented on code in PR #38461: URL: https://github.com/apache/arrow/pull/38461#discussion_r1373896544 ## java/flight/flight-sql-jdbc-core/src/test/java/org/apache/arrow/driver/jdbc/ConnectionMutualTlsTest.java: ## @@ -0,0 +1,436 @@ +/* + * Licensed to the Apache Soft

Re: [I] [R] How to replace NA with another value using arrow or R libraries [arrow]

2023-10-26 Thread via GitHub
eitsupi commented on issue #38473: URL: https://github.com/apache/arrow/issues/38473#issuecomment-1781987483 @TPDeramus I think you are using `summarise` and `across` incorrectly; check the dplyr documentation.

Re: [PR] [GH-38381][C++][Acero] Create a sorted merge node [arrow]

2023-10-26 Thread via GitHub
JerAguilon commented on code in PR #38380: URL: https://github.com/apache/arrow/pull/38380#discussion_r1373771336 ## cpp/src/arrow/acero/unmaterialized_table.h: ## @@ -0,0 +1,279 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license ag

Re: [PR] [GH-38381][C++][Acero] Create a sorted merge node [arrow]

2023-10-26 Thread via GitHub
JerAguilon commented on code in PR #38380: URL: https://github.com/apache/arrow/pull/38380#discussion_r1373695176 ## cpp/src/arrow/acero/unmaterialized_table.h: ## @@ -0,0 +1,234 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license ag

Re: [I] [Java] Remove Netty dependency from arrow-vector [arrow]

2023-10-26 Thread via GitHub
lidavidm commented on issue #14936: URL: https://github.com/apache/arrow/issues/14936#issuecomment-1781951650 Vendoring is fine, yes (just update LICENSE to reflect that). Or we could possibly pull in Eclipse or Apache Commons equivalents if they exist.

Re: [I] [C++] Simplify type_traits.h [arrow]

2023-10-26 Thread via GitHub
WillAyd commented on issue #38204: URL: https://github.com/apache/arrow/issues/38204#issuecomment-1781951538 @felipecrv is that visual available in the source or documentation today? It is a fantastic reference

Re: [I] [Java] Remove Netty dependency from arrow-vector [arrow]

2023-10-26 Thread via GitHub
jduo commented on issue #14936: URL: https://github.com/apache/arrow/issues/14936#issuecomment-1781949279 We probably don't want to re-implement IntObjectHashMap. Can we simply take the Netty implementation and add it to the Arrow project? Netty has an Apache 2.0 license.

Re: [I] [Java] Remove Netty dependency from arrow-vector [arrow]

2023-10-26 Thread via GitHub
jduo commented on issue #14936: URL: https://github.com/apache/arrow/issues/14936#issuecomment-1781945018 take

Re: [PR] MINOR: [C++] Fix a maybe-uninitialized warning [arrow]

2023-10-26 Thread via GitHub
conbench-apache-arrow[bot] commented on PR #38433: URL: https://github.com/apache/arrow/pull/38433#issuecomment-1781928475 After merging your PR, Conbench analyzed the 6 benchmarking runs that have been run so far on merge-commit c655c33fdf72d86e24cc4a1c1f977c48ea868039. There were no

Re: [PR] GH-34569: [C++] Diffing of Run-End Encoded arrays [arrow]

2023-10-26 Thread via GitHub
bkietz merged PR #35003: URL: https://github.com/apache/arrow/pull/35003

Re: [PR] GH-37710: [C++][Integration] Add C++ Utf8View implementation [arrow]

2023-10-26 Thread via GitHub
bkietz merged PR #37792: URL: https://github.com/apache/arrow/pull/37792

Re: [PR] GH-37710: [C++][Integration] Add C++ Utf8View implementation [arrow]

2023-10-26 Thread via GitHub
bkietz commented on PR #37792: URL: https://github.com/apache/arrow/pull/37792#issuecomment-1781919864 Thanks @paleolimbot! I'll merge now

Re: [PR] `WIP` Prototype new DFSchema implementation [arrow-datafusion]

2023-10-26 Thread via GitHub
alamb commented on code in PR #7944: URL: https://github.com/apache/arrow-datafusion/pull/7944#discussion_r1373834408 ## datafusion/common/src/dfschema.rs: ## @@ -33,17 +33,36 @@ use crate::{ use arrow::compute::can_cast_types; use arrow::datatypes::{DataType, Field, FieldRe

[PR] `WIP` Prototype new DFSchema implementation [arrow-datafusion]

2023-10-26 Thread via GitHub
alamb opened a new pull request, #7944: URL: https://github.com/apache/arrow-datafusion/pull/7944 I have long been bothered by the amount of copying required in DFSchema and how awkward it is to use. The current setup also makes adding additional indexes such as described on https://g

Re: [PR] Use btree to search fields in DFSchema [arrow-datafusion]

2023-10-26 Thread via GitHub
oleggator commented on PR #7870: URL: https://github.com/apache/arrow-datafusion/pull/7870#issuecomment-1781894246 > Is there a reason to use a b-tree ( O(log n) ) vs a hash map ( O(1) )? Using a b-tree, we can query all fields matching a "prefix" in one O(log n) hop (`column.*.*.*`,
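
The prefix-query argument in this comment can be illustrated with a minimal sketch. It is written in Python rather than the PR's Rust, and it is not the PR's implementation: a sorted list searched with `bisect` stands in for the b-tree, and the function name and the field names are hypothetical. The point it shows is that in an ordered index every name sharing a prefix sits in one contiguous run, so a single O(log n) search finds the start of the run, whereas a hash map would need a scan over all keys.

```python
from bisect import bisect_left


def fields_with_prefix(sorted_names, prefix):
    """Return every name in `sorted_names` that starts with `prefix`.

    `sorted_names` must already be sorted. One binary search locates the
    first candidate; the matches then form a contiguous run.
    """
    start = bisect_left(sorted_names, prefix)  # O(log n) seek
    matches = []
    for name in sorted_names[start:]:
        if not name.startswith(prefix):
            break  # past the end of the contiguous run
        matches.append(name)
    return matches


# Hypothetical qualified field names, kept in sorted order.
names = sorted(["t1.a", "t1.b", "t10.a", "t2.a"])
t1_fields = fields_with_prefix(names, "t1.")
```

Here `t1_fields` holds only the two `t1.` fields; `t10.a` sorts after them and is excluded because it does not start with `t1.`.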

Re: [PR] GH-38462: [Go][Parquet] Handle Boolean RLE encoding/decoding [arrow]

2023-10-26 Thread via GitHub
zeroshade commented on PR #38367: URL: https://github.com/apache/arrow/pull/38367#issuecomment-1781872845 @pitrou added the test for the rle_boolean_encoding.parquet file

Re: [PR] [GH-38381][C++][Acero] Create a sorted merge node [arrow]

2023-10-26 Thread via GitHub
JerAguilon commented on PR #38380: URL: https://github.com/apache/arrow/pull/38380#issuecomment-1781871517 > What's your thinking around the long term approach? > * Migrate this node and asof join node to act more like the other nodes (no independent threads, works even if plan is multi-t

Re: [PR] [GH-38381][C++][Acero] Create a sorted merge node [arrow]

2023-10-26 Thread via GitHub
JerAguilon commented on code in PR #38380: URL: https://github.com/apache/arrow/pull/38380#discussion_r1373792311 ## cpp/src/arrow/acero/sorted_merge_node.cc: ## @@ -0,0 +1,606 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agre

Re: [PR] build(csharp): add Arrow as submodule proper for CSharp project [arrow-adbc]

2023-10-26 Thread via GitHub
lidavidm merged PR #1229: URL: https://github.com/apache/arrow-adbc/pull/1229

Re: [PR] build(csharp): add Arrow as submodule proper for CSharp project [arrow-adbc]

2023-10-26 Thread via GitHub
lidavidm commented on PR #1229: URL: https://github.com/apache/arrow-adbc/pull/1229#issuecomment-1781865612 @CurtHagenlocher @davidhcoe I assume this is fine? Once Arrow-C# stabilizes a bit we can go back to depending on releases

Re: [PR] GH-38351: [C#] Add SqlDecimal support to Decimal128Array [arrow]

2023-10-26 Thread via GitHub
CurtHagenlocher commented on code in PR #38481: URL: https://github.com/apache/arrow/pull/38481#discussion_r1373791284 ## csharp/src/Apache.Arrow/Arrays/Decimal128Array.cs: ## @@ -61,6 +64,31 @@ public Builder AppendRange(IEnumerable values) return Instance;

Re: [PR] GH-38166: [MATLAB] [MATLAB] Improve tabular object display [arrow]

2023-10-26 Thread via GitHub
github-actions[bot] commented on PR #38482: URL: https://github.com/apache/arrow/pull/38482#issuecomment-1781860474 :warning: GitHub issue #38166 **has been automatically assigned in GitHub** to PR creator.

Re: [PR] [GH-38381][C++][Acero] Create a sorted merge node [arrow]

2023-10-26 Thread via GitHub
JerAguilon commented on code in PR #38380: URL: https://github.com/apache/arrow/pull/38380#discussion_r1373787258 ## cpp/src/arrow/acero/sorted_merge_node.cc: ## @@ -0,0 +1,606 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agre

[PR] GH-38166: [MATLAB] [MATLAB] Improve tabular object display [arrow]

2023-10-26 Thread via GitHub
sgilmore10 opened a new pull request, #38482: URL: https://github.com/apache/arrow/pull/38482 ### Rationale for this change Currently, the display for `arrow.tabular.RecordBatch` and `arrow.tabular.Table` are not very MATLAB-like. ### What changes are included in this

Re: [PR] build(csharp): add Arrow as submodule proper for CSharp project [arrow-adbc]

2023-10-26 Thread via GitHub
vleslief-ms commented on PR #1229: URL: https://github.com/apache/arrow-adbc/pull/1229#issuecomment-1781859258 > Just for my own edification, is there a way to have NuGet or the C# build/packaging system do this? For instance for Go we can reference a development build of Arrow when needed

Re: [PR] [GH-38381][C++][Acero] Create a sorted merge node [arrow]

2023-10-26 Thread via GitHub
JerAguilon commented on code in PR #38380: URL: https://github.com/apache/arrow/pull/38380#discussion_r1373784091 ## cpp/src/arrow/acero/sorted_merge_node.cc: ## @@ -0,0 +1,606 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agre

Re: [PR] [GH-38381][C++][Acero] Create a sorted merge node [arrow]

2023-10-26 Thread via GitHub
JerAguilon commented on code in PR #38380: URL: https://github.com/apache/arrow/pull/38380#discussion_r1373780764 ## cpp/src/arrow/acero/sorted_merge_node.cc: ## @@ -0,0 +1,606 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agre

Re: [PR] GH-38351: [C#] Add SqlDecimal support to Decimal128Array [arrow]

2023-10-26 Thread via GitHub
github-actions[bot] commented on PR #38481: URL: https://github.com/apache/arrow/pull/38481#issuecomment-1781850155 :warning: GitHub issue #38351 **has been automatically assigned in GitHub** to PR creator.

[PR] GH-38351: [C#] Add SqlDecimal support to Decimal128Array [arrow]

2023-10-26 Thread via GitHub
CurtHagenlocher opened a new pull request, #38481: URL: https://github.com/apache/arrow/pull/38481 ### What changes are included in this PR? Adds support for reading and writing System.Data.SqlTypes.SqlDecimal against Decimal128Array. ### Are these changes tested? Yes.

Re: [I] [Python] IPC error using Python GeneratorStream for tables containing Categorical / DictionaryArray [arrow]

2023-10-26 Thread via GitHub
phoebey01 commented on issue #38480: URL: https://github.com/apache/arrow/issues/38480#issuecomment-1781848966 Thanks a lot David!

Re: [I] [Python] IPC error using Python GeneratorStream for tables containing Categorical / DictionaryArray [arrow]

2023-10-26 Thread via GitHub
lidavidm commented on issue #38480: URL: https://github.com/apache/arrow/issues/38480#issuecomment-1781842640 That also means that if you need a workaround, you should try: ```python reader = pyarrow.RecordBatchReader.from_batches(table.schema, table.to_batches())

Re: [I] [Python] IPC error using Python GeneratorStream for tables containing Categorical / DictionaryArray [arrow]

2023-10-26 Thread via GitHub
lidavidm commented on issue #38480: URL: https://github.com/apache/arrow/issues/38480#issuecomment-1781837791 We don't handle dictionaries here, that should probably be fixed: https://github.com/apache/arrow/blob/818f71d085b6f820903afc6b1f1e577d8e45ff47/python/pyarrow/_flight.pyx#L2002-L2089

Re: [PR] refactor(r): Improve testing for ADBC 1.1 features in R bindings [arrow-adbc]

2023-10-26 Thread via GitHub
paleolimbot merged PR #1214: URL: https://github.com/apache/arrow-adbc/pull/1214

Re: [PR] [GH-38381][C++][Acero] Create a sorted merge node [arrow]

2023-10-26 Thread via GitHub
JerAguilon commented on code in PR #38380: URL: https://github.com/apache/arrow/pull/38380#discussion_r1373762959 ## cpp/src/arrow/acero/concurrent_queue.h: ## @@ -0,0 +1,150 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreem

Re: [PR] build(csharp): add Arrow as submodule proper for CSharp project [arrow-adbc]

2023-10-26 Thread via GitHub
lidavidm commented on PR #1229: URL: https://github.com/apache/arrow-adbc/pull/1229#issuecomment-1781826776 Just for my own edification, is there a way to have NuGet or the C# build/packaging system do this? For instance for Go we can reference a development build of Arrow when needed witho

Re: [PR] Adding Arrow as submodule proper for CSharp project [arrow-adbc]

2023-10-26 Thread via GitHub
github-actions[bot] commented on PR #1229: URL: https://github.com/apache/arrow-adbc/pull/1229#issuecomment-1781822051 :warning: Please follow the [Conventional Commits format in CONTRIBUTING.md](https://github.com/apache/arrow-adbc/blob/main/CONTRIBUTING.md) for PR titles.

[PR] Adding Arrow as submodule proper for CSharp project [arrow-adbc]

2023-10-26 Thread via GitHub
vleslief-ms opened a new pull request, #1229: URL: https://github.com/apache/arrow-adbc/pull/1229 Can't build the C# project checking out from main directly at the moment. README mentioned the submodule for Arrow and it's included in .gitmodules, but none of it works unless the csharp

Re: [PR] [GH-38381][C++][Acero] Create a sorted merge node [arrow]

2023-10-26 Thread via GitHub
JerAguilon commented on code in PR #38380: URL: https://github.com/apache/arrow/pull/38380#discussion_r1373757021 ## cpp/build-support/lint_cpp_cli.py: ## @@ -77,6 +77,7 @@ def lint_file(path): EXCLUSIONS = _paths('''\ +arrow/acero/concurrent_queue.h Review Comment:

Re: [PR] GH-38418: [MATLAB] Add method for extracting one row of an `arrow.tabular.Table` as a string [arrow]

2023-10-26 Thread via GitHub
kevingurney merged PR #38463: URL: https://github.com/apache/arrow/pull/38463

Re: [PR] GH-38462: [Go][Parquet] Handle Boolean RLE encoding/decoding [arrow]

2023-10-26 Thread via GitHub
pitrou commented on PR #38367: URL: https://github.com/apache/arrow/pull/38367#issuecomment-1781810511 > As for an individual file, we've confirmed that the updated `parquet-testing` files use the RLE encoding for boolean columns, so I believe i need to add a separate test for that particul

Re: [PR] GH-38418: [MATLAB] Add method for extracting one row of an `arrow.tabular.Table` as a string [arrow]

2023-10-26 Thread via GitHub
kevingurney commented on PR #38463: URL: https://github.com/apache/arrow/pull/38463#issuecomment-1781809173 +1

Re: [PR] GH-38462: [Go][Parquet] Handle Boolean RLE encoding/decoding [arrow]

2023-10-26 Thread via GitHub
zeroshade commented on PR #38367: URL: https://github.com/apache/arrow/pull/38367#issuecomment-1781809143 I'd like to get this in relatively soon as all CI for go that runs the parquet tests is going to fail on the main branch until this is merged

Re: [PR] refactor(r): Improve testing for ADBC 1.1 features in R bindings [arrow-adbc]

2023-10-26 Thread via GitHub
zeroshade commented on code in PR #1214: URL: https://github.com/apache/arrow-adbc/pull/1214#discussion_r1373746728 ## r/adbcdrivermanager/src/driver_base.h: ## @@ -0,0 +1,564 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agree

Re: [PR] Minor: extend ci to test each package [arrow-datafusion]

2023-10-26 Thread via GitHub
Weijun-H commented on PR #7940: URL: https://github.com/apache/arrow-datafusion/pull/7940#issuecomment-1781801352 > #7933 is fixed I tried to merge #7934, it didn't solve #7933

Re: [PR] GH-38462: [Go][Parquet] Handle Boolean RLE encoding/decoding [arrow]

2023-10-26 Thread via GitHub
zeroshade commented on PR #38367: URL: https://github.com/apache/arrow/pull/38367#issuecomment-1781801116 @pitrou I don't think we need a higher level roundtrip test than the tests we currently have and use for this. As for an individual file, we've confirmed that the updated `parque

Re: [PR] feat(c/driver/sqlite): Support binding dictionary-encoded string and binary types [arrow-adbc]

2023-10-26 Thread via GitHub
lidavidm merged PR #1224: URL: https://github.com/apache/arrow-adbc/pull/1224

Re: [PR] GH-37657: [JS] Run bin scripts with ts-node [arrow]

2023-10-26 Thread via GitHub
conbench-apache-arrow[bot] commented on PR #37668: URL: https://github.com/apache/arrow/pull/37668#issuecomment-1781790381 Thanks for your patience. Conbench analyzed the 0 benchmarking runs that have been run so far on PR commit 2ee36ead9f67e034ac6c73d752e5c74cb7d75e89. None of the s

Re: [PR] Add HashMap for searching fields in DFSchema [arrow-datafusion]

2023-10-26 Thread via GitHub
alamb commented on PR #7878: URL: https://github.com/apache/arrow-datafusion/pull/7878#issuecomment-1781787797 Related comment: https://github.com/apache/arrow-datafusion/issues/7698#issuecomment-1781787244

Re: [PR] Use btree to search fields in DFSchema [arrow-datafusion]

2023-10-26 Thread via GitHub
alamb commented on PR #7870: URL: https://github.com/apache/arrow-datafusion/pull/7870#issuecomment-1781787734 Related comment: https://github.com/apache/arrow-datafusion/issues/7698#issuecomment-1781787244

Re: [I] Bad performance on wide tables (1000+ columns) [arrow-datafusion]

2023-10-26 Thread via GitHub
alamb commented on issue #7698: URL: https://github.com/apache/arrow-datafusion/issues/7698#issuecomment-1781787244 I have reviewed https://github.com/apache/arrow-datafusion/pull/7870 and https://github.com/apache/arrow-datafusion/pull/7878 Here are my thoughts: 1. I think some s

Re: [I] [Docs] How to get data type of a logical expression [arrow-datafusion]

2023-10-26 Thread via GitHub
alamb commented on issue #7725: URL: https://github.com/apache/arrow-datafusion/issues/7725#issuecomment-1781779174 I think a good place to put these docs is as a docstring on `Expr` itself, if possible.

Re: [PR] GH-38460: [Java][FlightRPC] Add mTLS support for Flight SQL JDBC driver [arrow]

2023-10-26 Thread via GitHub
prmoore77 commented on code in PR #38461: URL: https://github.com/apache/arrow/pull/38461#discussion_r1373719110 ## docs/source/java/flight_sql_jdbc_driver.rst: ## @@ -114,6 +114,21 @@ parameters are: - null - When TLS is enabled, the password for the certificate sto

Re: [I] [Java] Gandiva fails to build on M1 Mac [arrow]

2023-10-26 Thread via GitHub
jbonofre commented on issue #36918: URL: https://github.com/apache/arrow/issues/36918#issuecomment-1781767398 I fixed this issue on a fork. I will propose a corresponding PR.
