[I] Attempted optimizations in arrow_buffer::util::bit_util do more harm than good [arrow-rs]

2024-05-15 Thread via GitHub
HadrienG2 opened a new issue, #5771: URL: https://github.com/apache/arrow-rs/issues/5771 **Describe the bug** The bitmap access and manipulation functions in `arrow_buffer::util::bit_util` do not follow the most obvious implementation strategy... ```rust pub fn get_bit(data: &[

Re: [I] soversion bumps on minor releases [arrow]

2024-05-15 Thread via GitHub
pitrou commented on issue #41659: URL: https://github.com/apache/arrow/issues/41659#issuecomment-2114194460 We're not supposed to break the ABI in bugfix releases, so I don't know why the soversion was bumped. -- This is an automated message from the Apache Git Service. To respond to the

Re: [PR] GH-40361: [C++] Make flatbuffers serialization more deterministic [arrow]

2024-05-15 Thread via GitHub
github-actions[bot] commented on PR #40392: URL: https://github.com/apache/arrow/pull/40392#issuecomment-2114193299 Revision: 4ebe53bd21fd1f28ce6babebf6802488283c692c Submitted crossbow builds: [ursacomputing/crossbow @ actions-d1985bc136](https://github.com/ursacomputing/crossbow/bra

Re: [PR] GH-40361: [C++] Make flatbuffers serialization more deterministic [arrow]

2024-05-15 Thread via GitHub
pitrou commented on code in PR #40392: URL: https://github.com/apache/arrow/pull/40392#discussion_r1602710047 ## cpp/src/arrow/ipc/message_internal_test.cc: ## @@ -0,0 +1,81 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreeme

Re: [PR] GH-40361: [C++] Make flatbuffers serialization more deterministic [arrow]

2024-05-15 Thread via GitHub
pitrou commented on PR #40392: URL: https://github.com/apache/arrow/pull/40392#issuecomment-2114185667 @github-actions crossbow submit -g cpp -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the s

Re: [PR] GH-39204: [Format][FlightRPC][Docs] Stabilize Flight SQL [arrow]

2024-05-15 Thread via GitHub
conbench-apache-arrow[bot] commented on PR #41657: URL: https://github.com/apache/arrow/pull/41657#issuecomment-2114146653 After merging your PR, Conbench analyzed the 5 benchmarking runs that have been run so far on merge-commit 084387c56e45bf7e8335c28e14a2e61b16515ad5. There were no

Re: [PR] GH-41287: [Java] ListViewVector Implementation [arrow]

2024-05-15 Thread via GitHub
vibhatha commented on PR #41285: URL: https://github.com/apache/arrow/pull/41285#issuecomment-2114145757 @lidavidm CIs are passing! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific co

Re: [PR] GH-41560: [C++] ChunkResolver: Implement ResolveMany and add unit tests [arrow]

2024-05-15 Thread via GitHub
conbench-apache-arrow[bot] commented on PR #41561: URL: https://github.com/apache/arrow/pull/41561#issuecomment-2113992811 After merging your PR, Conbench analyzed the 7 benchmarking runs that have been run so far on merge-commit e04f5b4b905cfc37b5eaeea2c34e51349ae562b9. There were 7

Re: [PR] GH-41287: [Java] ListViewVector Implementation [arrow]

2024-05-15 Thread via GitHub
lidavidm commented on PR #41285: URL: https://github.com/apache/arrow/pull/41285#issuecomment-2113989211 let's just rebase and see -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific com

Re: [PR] GH-41287: [Java] ListViewVector Implementation [arrow]

2024-05-15 Thread via GitHub
vibhatha commented on PR #41285: URL: https://github.com/apache/arrow/pull/41285#issuecomment-2113986395 @lidavidm should we rebase and run the CIs again? It seems that there was a PR to fix the breaking Gandiva issue. Are there any other changes you would expect here? -- This is an automate

Re: [I] [Java][C] sliced RecordBatch offset info is lost when imported from c-data [arrow]

2024-05-15 Thread via GitHub
vibhatha commented on issue #41682: URL: https://github.com/apache/arrow/issues/41682#issuecomment-2113985057 I will try to reproduce this and evaluate the approach suggested by @lidavidm. I will probably need some time to try this out. -- This is an automated message from the Apache Git

Re: [I] Report / blog on parquet metadata sizes for "large" (1000+) numbers of columns [arrow-rs]

2024-05-15 Thread via GitHub
mapleFU commented on issue #5770: URL: https://github.com/apache/arrow-rs/issues/5770#issuecomment-2113971815 I don't have specific data right now. In my use case, row groups are usually limited by size rather than row count, and we disable statistics for most of these columns. And only

Re: [I] Report / blog on parquet metadata sizes for "large" (1000+) numbers of columns [arrow-rs]

2024-05-15 Thread via GitHub
westonpace commented on issue #5770: URL: https://github.com/apache/arrow-rs/issues/5770#issuecomment-2113963336 Also, I did do some experimentation here (what feels like a long time ago now but was at least a year ago). Two things I observed: * The in-memory representation of the p

Re: [I] [Java][C] sliced RecordBatch offset info is lost when imported from c-data [arrow]

2024-05-15 Thread via GitHub
lidavidm commented on issue #41682: URL: https://github.com/apache/arrow/issues/41682#issuecomment-2113961717 CC @vibhatha We also have to traverse recursively since child arrays can have their own offset -- This is an automated message from the Apache Git Service. To respond to t

Re: [I] [Java][C] sliced RecordBatch offset info is lost when imported from c-data [arrow]

2024-05-15 Thread via GitHub
hellishfire commented on issue #41682: URL: https://github.com/apache/arrow/issues/41682#issuecomment-2113948678 > Yes, but that's different in intent than the C++ version (which just tracks an offset). Here it means we can't do a 0 copy import. I understand that, but IMO data correct

Re: [I] [Java][C] sliced RecordBatch offset info is lost when imported from c-data [arrow]

2024-05-15 Thread via GitHub
lidavidm commented on issue #41682: URL: https://github.com/apache/arrow/issues/41682#issuecomment-2113934843 Yes, but that's different in intent than the C++ version (which just tracks an offset). Here it means we can't do a 0 copy import. -- This is an automated message from the Apache

Re: [PR] GH-40361: [C++] Make flatbuffers serialization more deterministic [arrow]

2024-05-15 Thread via GitHub
amoeba commented on PR #40392: URL: https://github.com/apache/arrow/pull/40392#issuecomment-2113930475 Parquet tests were failing all over on CI due to needing to bump submodules, so I pushed 4ebe53bd21fd1f28ce6babebf6802488283c692c so we could see CI. -- This is an automated message

Re: [I] [Java][C] sliced RecordBatch offset info is lost when imported from c-data [arrow]

2024-05-15 Thread via GitHub
hellishfire commented on issue #41682: URL: https://github.com/apache/arrow/issues/41682#issuecomment-2113930191 > Java doesn't really have a concept of slicing, so either this needs to copy the data (at least for things that can't easily be sliced, like the validity buffer) or it should er

Re: [PR] GH-40934: [Java] TypeLayout enhancement to support StringView [arrow]

2024-05-15 Thread via GitHub
vibhatha commented on code in PR #41676: URL: https://github.com/apache/arrow/pull/41676#discussion_r1602528886 ## java/vector/src/main/java/org/apache/arrow/vector/VectorLoader.java: ## @@ -98,7 +98,7 @@ private void loadBuffers( CompressionCodec codec) { checkArgum

Re: [I] [Java][C] sliced RecordBatch offset info is lost when imported from c-data [arrow]

2024-05-15 Thread via GitHub
lidavidm commented on issue #41682: URL: https://github.com/apache/arrow/issues/41682#issuecomment-2113920997 Java doesn't really have a concept of slicing, so either this needs to copy the data (at least for things that can't easily be sliced, like the validity buffer) or it should error o

Re: [PR] GH-40934: [Java] TypeLayout enhancement to support StringView [arrow]

2024-05-15 Thread via GitHub
lidavidm commented on code in PR #41676: URL: https://github.com/apache/arrow/pull/41676#discussion_r1602517022 ## java/vector/src/main/java/org/apache/arrow/vector/VectorLoader.java: ## @@ -98,7 +98,7 @@ private void loadBuffers( CompressionCodec codec) { checkArgum

Re: [PR] GH-41547: [C++] Thirdparty: Upgrade xsimd to 13.0.0 [arrow]

2024-05-15 Thread via GitHub
mapleFU commented on PR #41548: URL: https://github.com/apache/arrow/pull/41548#issuecomment-2113898961 Ah I mean all the third-party libraries about xsimd -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above

Re: [PR] GH-40934: [Java] TypeLayout enhancement to support StringView [arrow]

2024-05-15 Thread via GitHub
vibhatha commented on code in PR #41676: URL: https://github.com/apache/arrow/pull/41676#discussion_r1602508993 ## java/vector/src/main/java/org/apache/arrow/vector/VectorLoader.java: ## @@ -98,7 +98,7 @@ private void loadBuffers( CompressionCodec codec) { checkArgum

Re: [I] Report / blog on parquet metadata sizes for "large" (1000+) numbers of columns [arrow-rs]

2024-05-15 Thread via GitHub
westonpace commented on issue #5770: URL: https://github.com/apache/arrow-rs/issues/5770#issuecomment-2113848324 10K columns by 10 row groups by 1M rows is 100B values (400GB with int32). I don't think anyone has data like that. My experience has been either: * The files they

Re: [PR] GH-41541: [Go][Parquet] Fix writer performance regression [arrow]

2024-05-15 Thread via GitHub
conbench-apache-arrow[bot] commented on PR #41638: URL: https://github.com/apache/arrow/pull/41638#issuecomment-2113802044 After merging your PR, Conbench analyzed the 5 benchmarking runs that have been run so far on merge-commit e1de9c52d5a60b2e2a314b8589170467fe36415d. There were no

Re: [PR] GH-41134: [GLib] Support building arrow-glib with MSVC [arrow]

2024-05-15 Thread via GitHub
adamreeve commented on code in PR #41599: URL: https://github.com/apache/arrow/pull/41599#discussion_r1602432722 ## ci/scripts/c_glib_build.sh: ## @@ -28,17 +28,38 @@ build_root=${2} : ${BUILD_DOCS_C_GLIB:=OFF} with_doc=$([ "${BUILD_DOCS_C_GLIB}" == "ON" ] && echo "true" || ec

Re: [PR] GH-41134: [GLib] Support building arrow-glib with MSVC [arrow]

2024-05-15 Thread via GitHub
adamreeve commented on code in PR #41599: URL: https://github.com/apache/arrow/pull/41599#discussion_r1602431330 ## c_glib/arrow-glib/array-builder.h: ## @@ -22,70 +22,97 @@ #include #include #include +#include G_BEGIN_DECLS #define GARROW_TYPE_ARRAY_BUILDER (garrow

Re: [I] [Java] Implement `RangeEqualsVisitor` for StringView [arrow]

2024-05-15 Thread via GitHub
lidavidm commented on issue #40943: URL: https://github.com/apache/arrow/issues/40943#issuecomment-2113723557 Issue resolved by pull request 41636 https://github.com/apache/arrow/pull/41636 -- This is an automated message from the Apache Git Service. To respond to the message, please lo

Re: [PR] GH-40943: [Java] Implement RangeEqualsVisitor for StringView [arrow]

2024-05-15 Thread via GitHub
lidavidm merged PR #41636: URL: https://github.com/apache/arrow/pull/41636 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@arrow.apac

Re: [PR] GH-40934: [Java] TypeLayout enhancement to support StringView [arrow]

2024-05-15 Thread via GitHub
lidavidm commented on code in PR #41676: URL: https://github.com/apache/arrow/pull/41676#discussion_r1602413649 ## java/vector/src/main/java/org/apache/arrow/vector/VectorLoader.java: ## @@ -98,7 +98,7 @@ private void loadBuffers( CompressionCodec codec) { checkArgum

Re: [PR] GH-40934: [Java] TypeLayout enhancement to support StringView [arrow]

2024-05-15 Thread via GitHub
vibhatha commented on PR #41676: URL: https://github.com/apache/arrow/pull/41676#issuecomment-2113714065 @lidavidm could you please take a look? I am not sure about the best practices for deprecation. Should we also add docs? I mean not only Java docs, but also documentation (rst)? -- This is

Re: [I] [R] CRAN packaging checklist for version 16.1.0 [arrow]

2024-05-15 Thread via GitHub
amoeba commented on issue #41647: URL: https://github.com/apache/arrow/issues/41647#issuecomment-2113707760 Thanks for starting this off @thisisnic. I noticed that https://github.com/apache/arrow/issues/40388 is still open and I think at least one or two things are still undone (tagging com

Re: [I] apache-arrow-apt-source-latest-bullseye.deb.asc is signed by a key not in apache-arrow-keyring.gpg [arrow]

2024-05-15 Thread via GitHub
assignUser commented on issue #41678: URL: https://github.com/apache/arrow/issues/41678#issuecomment-2113701840 Supply chain attacks are of course a valid concern but we can't avoid changes to the KEYS file, Arrow is a large project and individuals sign our releases, sometimes someone new t

Re: [I] [Docs][CI] Enable more sphinx-lint rules for documentation [arrow]

2024-05-15 Thread via GitHub
amoeba commented on issue #41611: URL: https://github.com/apache/arrow/issues/41611#issuecomment-2113699697 Thanks @AlenkaF. I marked the PR as ready for review and I tagged you for review in addition to @jorisvandenbossche since he saw the last PR. -- This is an automated message from th

Re: [I] Report / blog on parquet metadata sizes for "large" (1000+) numbers of columns [arrow-rs]

2024-05-15 Thread via GitHub
alamb commented on issue #5770: URL: https://github.com/apache/arrow-rs/issues/5770#issuecomment-2113697968 One idea would be to create a dataset with `1000` columns, `2000` columns, `5000` columns and `10000` columns with say 10 row groups of 1M rows Measure the size of the metadata

Re: [I] Report / blog on parquet metadata sizes for "large" (1000+) numbers of columns [arrow-rs]

2024-05-15 Thread via GitHub
alamb commented on issue #5770: URL: https://github.com/apache/arrow-rs/issues/5770#issuecomment-2113695611 I have hopes that I will find time to work on this next week but am not sure. If anyone knows of prior art in this area (measuring the actual metadata size) it would be great if

Re: [PR] GH-40361: [C++] Make flatbuffers serialization more deterministic [arrow]

2024-05-15 Thread via GitHub
amoeba commented on PR #40392: URL: https://github.com/apache/arrow/pull/40392#issuecomment-2113695113 Thanks @pitrou for the review. I rebased from apache/main, accepted your three changes as-is, and added one extra change I caught while dealing with conflicts on rebase, 4bd0339e596d410bd5

[I] Report / blog on parquet metadata sizes for "large" (1000+) numbers of columns [arrow-rs]

2024-05-15 Thread via GitHub
alamb opened a new issue, #5770: URL: https://github.com/apache/arrow-rs/issues/5770 **Is your feature request related to a problem or challenge? Please describe what you are trying to do.** There are several proposals for remedying perceived issues with parquet which generally propose n

Re: [PR] GH-40361: [C++] Make flatbuffers serialization more deterministic [arrow]

2024-05-15 Thread via GitHub
amoeba commented on code in PR #40392: URL: https://github.com/apache/arrow/pull/40392#discussion_r1602388341 ## cpp/src/arrow/ipc/message_internal_test.cc: ## @@ -0,0 +1,80 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreeme

Re: [PR] GH-40361: [C++] Make flatbuffers serialization more deterministic [arrow]

2024-05-15 Thread via GitHub
amoeba commented on code in PR #40392: URL: https://github.com/apache/arrow/pull/40392#discussion_r1602387512 ## cpp/src/arrow/ipc/CMakeLists.txt: ## @@ -41,6 +41,7 @@ add_arrow_test(feather_test) add_arrow_ipc_test(json_simple_test) add_arrow_ipc_test(read_write_test EXTRA_LI

Re: [I] [Format][FlightRPC][Docs] Stabilize Flight SQL [arrow]

2024-05-15 Thread via GitHub
kou commented on issue #39204: URL: https://github.com/apache/arrow/issues/39204#issuecomment-2113690167 Issue resolved by pull request 41657 https://github.com/apache/arrow/pull/41657 -- This is an automated message from the Apache Git Service. To respond to the message, please log on

Re: [PR] GH-39204: [Format][FlightRPC][Docs] Stabilize Flight SQL [arrow]

2024-05-15 Thread via GitHub
kou merged PR #41657: URL: https://github.com/apache/arrow/pull/41657 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@arrow.apache.or

Re: [I] soversion bumps on minor releases [arrow]

2024-05-15 Thread via GitHub
kou commented on issue #41659: URL: https://github.com/apache/arrow/issues/41659#issuecomment-2113688894 @pitrou What do you think about this? FYI: #4801 for the current versioning schema. -- This is an automated message from the Apache Git Service. To respond to the message, please log o

Re: [I] IOError: Invalid: SCRAM auth method isn't supported yet [arrow-flight-sql-postgresql]

2024-05-15 Thread via GitHub
kou commented on issue #187: URL: https://github.com/apache/arrow-flight-sql-postgresql/issues/187#issuecomment-2113683676 Right. Challenge-response style auths aren't implemented yet. -- This is an automated message from the Apache Git Service. To respond to the message, please log on

Re: [PR] GH-40361: [C++] Make flatbuffers serialization more deterministic [arrow]

2024-05-15 Thread via GitHub
amoeba commented on code in PR #40392: URL: https://github.com/apache/arrow/pull/40392#discussion_r1602388341 ## cpp/src/arrow/ipc/message_internal_test.cc: ## @@ -0,0 +1,80 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreeme

Re: [PR] GH-40361: [C++] Make flatbuffers serialization more deterministic [arrow]

2024-05-15 Thread via GitHub
amoeba commented on code in PR #40392: URL: https://github.com/apache/arrow/pull/40392#discussion_r1602387673 ## cpp/src/arrow/ipc/message_internal_test.cc: ## @@ -0,0 +1,80 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreeme

Re: [PR] GH-41547: [C++] Thirdparty: Upgrade xsimd to 13.0.0 [arrow]

2024-05-15 Thread via GitHub
kou commented on PR #41548: URL: https://github.com/apache/arrow/pull/41548#issuecomment-2113680997 Sorry. What does the "3rd" refer to? (Do you want to check the xsimd version in our code?) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub a

Re: [PR] GH-40361: [C++] Make flatbuffers serialization more deterministic [arrow]

2024-05-15 Thread via GitHub
amoeba commented on code in PR #40392: URL: https://github.com/apache/arrow/pull/40392#discussion_r1602387512 ## cpp/src/arrow/ipc/CMakeLists.txt: ## @@ -41,6 +41,7 @@ add_arrow_test(feather_test) add_arrow_ipc_test(json_simple_test) add_arrow_ipc_test(read_write_test EXTRA_LI

Re: [PR] GH-40361: [C++] Make flatbuffers serialization more deterministic [arrow]

2024-05-15 Thread via GitHub
amoeba commented on code in PR #40392: URL: https://github.com/apache/arrow/pull/40392#discussion_r1602388341 ## cpp/src/arrow/ipc/message_internal_test.cc: ## @@ -0,0 +1,80 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreeme

Re: [PR] GH-40361: [C++] Make flatbuffers serialization more deterministic [arrow]

2024-05-15 Thread via GitHub
amoeba commented on code in PR #40392: URL: https://github.com/apache/arrow/pull/40392#discussion_r1602387673 ## cpp/src/arrow/ipc/message_internal_test.cc: ## @@ -0,0 +1,80 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreeme

Re: [PR] GH-40361: [C++] Make flatbuffers serialization more deterministic [arrow]

2024-05-15 Thread via GitHub
amoeba commented on code in PR #40392: URL: https://github.com/apache/arrow/pull/40392#discussion_r1602387512 ## cpp/src/arrow/ipc/CMakeLists.txt: ## @@ -41,6 +41,7 @@ add_arrow_test(feather_test) add_arrow_ipc_test(json_simple_test) add_arrow_ipc_test(read_write_test EXTRA_LI

Re: [I] [C++] Add ChunkResolver::ResolveMany() API [arrow]

2024-05-15 Thread via GitHub
felipecrv commented on issue #41560: URL: https://github.com/apache/arrow/issues/41560#issuecomment-2113670321 Issue resolved by pull request 41561 https://github.com/apache/arrow/pull/41561 -- This is an automated message from the Apache Git Service. To respond to the message, please l

Re: [PR] GH-41560: [C++] ChunkResolver: Implement ResolveMany and add unit tests [arrow]

2024-05-15 Thread via GitHub
felipecrv merged PR #41561: URL: https://github.com/apache/arrow/pull/41561 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@arrow.apa

Re: [PR] GH-41660: [CI][Java] Restore devtoolset relatead GANDIVA_CXX_FLAGS [arrow]

2024-05-15 Thread via GitHub
conbench-apache-arrow[bot] commented on PR #41661: URL: https://github.com/apache/arrow/pull/41661#issuecomment-2113659077 After merging your PR, Conbench analyzed the 5 benchmarking runs that have been run so far on merge-commit 63fddd7b2f12fb65ed5feff820a1913931773968. There were no

Re: [PR] GH-41262: [Java][FlightSQL] Implement stateless prepared statements [arrow]

2024-05-15 Thread via GitHub
stevelorddremio commented on PR #41237: URL: https://github.com/apache/arrow/pull/41237#issuecomment-2113654413 > > It looks like my fix in @afterall for TestFlightSql and TestFlightSqlStateless was not successful. > > I see that the tests use the same filename and the tests try to de

Re: [PR] GH-41673: [Format][Docs] Add arrow format introductory page [arrow]

2024-05-15 Thread via GitHub
amoeba commented on code in PR #41593: URL: https://github.com/apache/arrow/pull/41593#discussion_r1602361549 ## docs/source/format/FormatIntro.rst: ## @@ -0,0 +1,511 @@ +.. Licensed to the Apache Software Foundation (ASF) under one +.. or more contributor license agreements. S

Re: [PR] GH-41673: [Format][Docs] Add arrow format introductory page [arrow]

2024-05-15 Thread via GitHub
amoeba commented on code in PR #41593: URL: https://github.com/apache/arrow/pull/41593#discussion_r1602361348 ## docs/source/format/FormatIntro.rst: ## @@ -0,0 +1,511 @@ +.. Licensed to the Apache Software Foundation (ASF) under one +.. or more contributor license agreements. S

Re: [PR] GH-41673: [Format][Docs] Add arrow format introductory page [arrow]

2024-05-15 Thread via GitHub
amoeba commented on code in PR #41593: URL: https://github.com/apache/arrow/pull/41593#discussion_r1602360474 ## docs/source/format/FormatIntro.rst: ## @@ -0,0 +1,511 @@ +.. Licensed to the Apache Software Foundation (ASF) under one +.. or more contributor license agreements. S

Re: [PR] GH-41673: [Format][Docs] Add arrow format introductory page [arrow]

2024-05-15 Thread via GitHub
amoeba commented on code in PR #41593: URL: https://github.com/apache/arrow/pull/41593#discussion_r1602360886 ## docs/source/format/FormatIntro.rst: ## @@ -0,0 +1,511 @@ +.. Licensed to the Apache Software Foundation (ASF) under one +.. or more contributor license agreements. S

Re: [PR] GH-41673: [Format][Docs] Add arrow format introductory page [arrow]

2024-05-15 Thread via GitHub
amoeba commented on code in PR #41593: URL: https://github.com/apache/arrow/pull/41593#discussion_r1602359136 ## docs/source/format/FormatIntro.rst: ## @@ -0,0 +1,511 @@ +.. Licensed to the Apache Software Foundation (ASF) under one +.. or more contributor license agreements. S

Re: [PR] GH-41673: [Format][Docs] Add arrow format introductory page [arrow]

2024-05-15 Thread via GitHub
amoeba commented on code in PR #41593: URL: https://github.com/apache/arrow/pull/41593#discussion_r1602358811 ## docs/source/format/FormatIntro.rst: ## @@ -0,0 +1,511 @@ +.. Licensed to the Apache Software Foundation (ASF) under one +.. or more contributor license agreements. S

Re: [PR] GH-41673: [Format][Docs] Add arrow format introductory page [arrow]

2024-05-15 Thread via GitHub
amoeba commented on code in PR #41593: URL: https://github.com/apache/arrow/pull/41593#discussion_r1602358585 ## docs/source/format/FormatIntro.rst: ## @@ -0,0 +1,511 @@ +.. Licensed to the Apache Software Foundation (ASF) under one +.. or more contributor license agreements. S

Re: [PR] GH-41673: [Format][Docs] Add arrow format introductory page [arrow]

2024-05-15 Thread via GitHub
amoeba commented on code in PR #41593: URL: https://github.com/apache/arrow/pull/41593#discussion_r1602357914 ## docs/source/format/FormatIntro.rst: ## @@ -0,0 +1,511 @@ +.. Licensed to the Apache Software Foundation (ASF) under one +.. or more contributor license agreements. S

Re: [PR] GH-41673: [Format][Docs] Add arrow format introductory page [arrow]

2024-05-15 Thread via GitHub
amoeba commented on code in PR #41593: URL: https://github.com/apache/arrow/pull/41593#discussion_r1602358138 ## docs/source/format/FormatIntro.rst: ## @@ -0,0 +1,511 @@ +.. Licensed to the Apache Software Foundation (ASF) under one +.. or more contributor license agreements. S

Re: [PR] GH-41673: [Format][Docs] Add arrow format introductory page [arrow]

2024-05-15 Thread via GitHub
amoeba commented on code in PR #41593: URL: https://github.com/apache/arrow/pull/41593#discussion_r1602357914 ## docs/source/format/FormatIntro.rst: ## @@ -0,0 +1,511 @@ +.. Licensed to the Apache Software Foundation (ASF) under one +.. or more contributor license agreements. S

Re: [PR] GH-41673: [Format][Docs] Add arrow format introductory page [arrow]

2024-05-15 Thread via GitHub
amoeba commented on code in PR #41593: URL: https://github.com/apache/arrow/pull/41593#discussion_r1602355813 ## docs/source/format/FormatIntro.rst: ## @@ -0,0 +1,511 @@ +.. Licensed to the Apache Software Foundation (ASF) under one +.. or more contributor license agreements. S

Re: [PR] GH-41673: [Format][Docs] Add arrow format introductory page [arrow]

2024-05-15 Thread via GitHub
amoeba commented on code in PR #41593: URL: https://github.com/apache/arrow/pull/41593#discussion_r1602352183 ## docs/source/format/FormatIntro.rst: ## @@ -0,0 +1,511 @@ +.. Licensed to the Apache Software Foundation (ASF) under one +.. or more contributor license agreements. S

Re: [PR] GH-41673: [Format][Docs] Add arrow format introductory page [arrow]

2024-05-15 Thread via GitHub
amoeba commented on PR #41593: URL: https://github.com/apache/arrow/pull/41593#issuecomment-2113624589 Hey @AlenkaF, this is so great to see. I think the text and diagrams will be useful. I left some suggestions for style and: - Did an editing pass over the text. Feel free to ignore a

Re: [PR] GH-41673: [Format][Docs] Add arrow format introductory page [arrow]

2024-05-15 Thread via GitHub
amoeba commented on code in PR #41593: URL: https://github.com/apache/arrow/pull/41593#discussion_r1602337052 ## docs/source/format/FormatIntro.rst: ## @@ -0,0 +1,511 @@ +.. Licensed to the Apache Software Foundation (ASF) under one +.. or more contributor license agreements. S

Re: [PR] GH-41673: [Format][Docs] Add arrow format introductory page [arrow]

2024-05-15 Thread via GitHub
amoeba commented on code in PR #41593: URL: https://github.com/apache/arrow/pull/41593#discussion_r1602337052 ## docs/source/format/FormatIntro.rst: ## @@ -0,0 +1,511 @@ +.. Licensed to the Apache Software Foundation (ASF) under one +.. or more contributor license agreements. S

Re: [PR] GH-41673: [Format][Docs] Add arrow format introductory page [arrow]

2024-05-15 Thread via GitHub
amoeba commented on code in PR #41593: URL: https://github.com/apache/arrow/pull/41593#discussion_r1602336913 ## docs/source/format/FormatIntro.rst: ## @@ -0,0 +1,511 @@ +.. Licensed to the Apache Software Foundation (ASF) under one +.. or more contributor license agreements. S

Re: [PR] feat(csharp/src/Drivers/Apache): extend capability of GetInfo for Spark driver [arrow-adbc]

2024-05-15 Thread via GitHub
birschick-bq commented on code in PR #1863: URL: https://github.com/apache/arrow-adbc/pull/1863#discussion_r1602334285 ## csharp/src/Drivers/Apache/Hive2/HiveServer2Connection.cs: ## @@ -103,64 +116,23 @@ protected Schema GetSchema() return SchemaParser.GetArrowSche

Re: [PR] GH-41673: [Format][Docs] Add arrow format introductory page [arrow]

2024-05-15 Thread via GitHub
amoeba commented on code in PR #41593: URL: https://github.com/apache/arrow/pull/41593#discussion_r1602333969 ## docs/source/format/FormatIntro.rst: ## @@ -0,0 +1,511 @@ +.. Licensed to the Apache Software Foundation (ASF) under one +.. or more contributor license agreements. S

Re: [PR] GH-41673: [Format][Docs] Add arrow format introductory page [arrow]

2024-05-15 Thread via GitHub
amoeba commented on code in PR #41593: URL: https://github.com/apache/arrow/pull/41593#discussion_r1602333742 ## docs/source/format/FormatIntro.rst: ## @@ -0,0 +1,511 @@ +.. Licensed to the Apache Software Foundation (ASF) under one +.. or more contributor license agreements. S

Re: [PR] GH-41673: [Format][Docs] Add arrow format introductory page [arrow]

2024-05-15 Thread via GitHub
amoeba commented on code in PR #41593: URL: https://github.com/apache/arrow/pull/41593#discussion_r1602333610 ## docs/source/format/FormatIntro.rst: ## @@ -0,0 +1,511 @@ +.. Licensed to the Apache Software Foundation (ASF) under one +.. or more contributor license agreements. S

Re: [PR] GH-41673: [Format][Docs] Add arrow format introductory page [arrow]

2024-05-15 Thread via GitHub
amoeba commented on code in PR #41593: URL: https://github.com/apache/arrow/pull/41593#discussion_r1602333610 ## docs/source/format/FormatIntro.rst: ## @@ -0,0 +1,511 @@ +.. Licensed to the Apache Software Foundation (ASF) under one +.. or more contributor license agreements. S

Re: [PR] feat(csharp/src/Drivers/Apache): extend capability of GetInfo for Spark driver [arrow-adbc]

2024-05-15 Thread via GitHub
birschick-bq commented on code in PR #1863: URL: https://github.com/apache/arrow-adbc/pull/1863#discussion_r1602331203 ## csharp/src/Drivers/Apache/Hive2/HiveServer2Connection.cs: ## @@ -35,6 +36,8 @@ public abstract class HiveServer2Connection : AdbcConnection internal

Re: [PR] feat(csharp/src/Drivers/Apache): extend capability of GetInfo for Spark driver [arrow-adbc]

2024-05-15 Thread via GitHub
birschick-bq commented on code in PR #1863: URL: https://github.com/apache/arrow-adbc/pull/1863#discussion_r1602330324 ## csharp/src/Drivers/Apache/Hive2/HiveServer2Connection.cs: ## @@ -46,6 +49,30 @@ internal TCLIService.Client Client get { return this.client ?? t

Re: [PR] GH-41673: [Format][Docs] Add arrow format introductory page [arrow]

2024-05-15 Thread via GitHub
amoeba commented on code in PR #41593: URL: https://github.com/apache/arrow/pull/41593#discussion_r1602329769 ## docs/source/format/FormatIntro.rst: ## @@ -0,0 +1,511 @@ +.. Licensed to the Apache Software Foundation (ASF) under one +.. or more contributor license agreements. S

Re: [PR] GH-41673: [Format][Docs] Add arrow format introductory page [arrow]

2024-05-15 Thread via GitHub
amoeba commented on code in PR #41593: URL: https://github.com/apache/arrow/pull/41593#discussion_r1602329769 ## docs/source/format/FormatIntro.rst: ## @@ -0,0 +1,511 @@ +.. Licensed to the Apache Software Foundation (ASF) under one +.. or more contributor license agreements. S

Re: [I] Support list_with_offset for Azure using continuation token [arrow-rs]

2024-05-15 Thread via GitHub
alexwilcoxson-rel commented on issue #5653: URL: https://github.com/apache/arrow-rs/issues/5653#issuecomment-2113588239 👍 we tried to implement this and what they suggested to us is not working anyway, so we're going to pursue other options. -- This is an automated message from the Apache

Re: [I] Support list_with_offset for Azure using continuation token [arrow-rs]

2024-05-15 Thread via GitHub
alexwilcoxson-rel closed issue #5653: Support list_with_offset for Azure using continuation token URL: https://github.com/apache/arrow-rs/issues/5653 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to

Re: [PR] GH-41673: [Format][Docs] Add arrow format introductory page [arrow]

2024-05-15 Thread via GitHub
amoeba commented on code in PR #41593: URL: https://github.com/apache/arrow/pull/41593#discussion_r1602327937 ## docs/source/format/FormatIntro.rst: ## @@ -0,0 +1,511 @@ +.. Licensed to the Apache Software Foundation (ASF) under one +.. or more contributor license agreements. S

Re: [PR] GH-41673: [Format][Docs] Add arrow format introductory page [arrow]

2024-05-15 Thread via GitHub
amoeba commented on code in PR #41593: URL: https://github.com/apache/arrow/pull/41593#discussion_r1602327554 ## docs/source/format/FormatIntro.rst: ## @@ -0,0 +1,511 @@ +.. Licensed to the Apache Software Foundation (ASF) under one +.. or more contributor license agreements. S

Re: [PR] GH-41673: [Format][Docs] Add arrow format introductory page [arrow]

2024-05-15 Thread via GitHub
amoeba commented on code in PR #41593: URL: https://github.com/apache/arrow/pull/41593#discussion_r1602327104 ## docs/source/format/FormatIntro.rst: ## @@ -0,0 +1,511 @@ +.. Licensed to the Apache Software Foundation (ASF) under one +.. or more contributor license agreements. S

Re: [I] Snowflake driver report invalid JWT token for the keypair auth with privateLink accountIdentifer [arrow-adbc]

2024-05-15 Thread via GitHub
lidavidm commented on issue #1777: URL: https://github.com/apache/arrow-adbc/issues/1777#issuecomment-2113565778 This still sounds like an upstream issue, then. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL a

Re: [I] [R] poor R performance for arrow_fixed_size_list types [arrow]

2024-05-15 Thread via GitHub
amoeba commented on issue #41644: URL: https://github.com/apache/arrow/issues/41644#issuecomment-2113520306 Hi @kendonB, I'm not sure if performance of either R or Python's routines here is a high priority. That said, I think PRs improving their performance without making breaking ch

Re: [I] apache-arrow-apt-source-latest-bullseye.deb.asc is signed by a key not in apache-arrow-keyring.gpg [arrow]

2024-05-15 Thread via GitHub
andersk commented on issue #41678: URL: https://github.com/apache/arrow/issues/41678#issuecomment-2113519633 That doesn’t solve the issue. There are 16 distinct keys in that new keyring file, 5 of which are expired, so presumably it’s changing quite frequently. Like I said, downloading a ne

Re: [PR] Use interval type constructors in integration test (#5654) [arrow-rs]

2024-05-15 Thread via GitHub
tustvold closed pull request #5765: Use interval type constructors in integration test (#5654) URL: https://github.com/apache/arrow-rs/pull/5765 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the s

Re: [I] adbc_ingest() is dropping rows in Snowflake [arrow-adbc]

2024-05-15 Thread via GitHub
joellubi commented on issue #1847: URL: https://github.com/apache/arrow-adbc/issues/1847#issuecomment-2113506797 Sure @zeroshade, I ported the repro to a failing test case and pushed it up to #1866 -- This is an automated message from the Apache Git Service. To respond to the message, ple

[PR] fix(go/driver/snowflake): Records dropped on ingestion when empty batch is present [arrow-adbc]

2024-05-15 Thread via GitHub
joellubi opened a new pull request, #1866: URL: https://github.com/apache/arrow-adbc/pull/1866 Just a failing test to reproduce the bug for now -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

Re: [PR] GH-39858: [C++][Device] Add Copy/View slice functions to MemoryManager [arrow]

2024-05-15 Thread via GitHub
alanstoate commented on code in PR #41477: URL: https://github.com/apache/arrow/pull/41477#discussion_r1602286157 ## cpp/src/arrow/c/bridge.cc: ## @@ -1868,24 +1868,15 @@ struct ArrayImporter { template Status ImportStringValuesBuffer(int32_t offsets_buffer_id, int32_t bu

Re: [I] apache-arrow-apt-source-latest-bullseye.deb.asc is signed by a key not in apache-arrow-keyring.gpg [arrow]

2024-05-15 Thread via GitHub
kou commented on issue #41678: URL: https://github.com/apache/arrow/issues/41678#issuecomment-2113496926 Ah, https://apache.jfrog.io/artifactory/arrow/debian/apache-arrow-keyring.gpg is deprecated by `apache-arrow-apt-source-latest*.deb` and it's not updated. You need to use https:/

Re: [PR] GH-41581: [C++][CMake] correctly use Protobuf_PROTOC_EXECUTABLE [arrow]

2024-05-15 Thread via GitHub
conbench-apache-arrow[bot] commented on PR #41582: URL: https://github.com/apache/arrow/pull/41582#issuecomment-2113483061 After merging your PR, Conbench analyzed the 7 benchmarking runs that have been run so far on merge-commit edd62f75326c86edc22e705e13b0674acd7cc1c1. There was 1 b

Re: [PR] Add Meson build with Werror [arrow-nanoarrow]

2024-05-15 Thread via GitHub
WillAyd commented on code in PR #448: URL: https://github.com/apache/arrow-nanoarrow/pull/448#discussion_r1602267855 ## src/nanoarrow/buffer_test.cc: ## @@ -141,9 +141,6 @@ TEST(BufferTest, BufferTestFill) { } ArrowBufferReset(&buffer); - Review Comment: The BufferTe

Re: [I] adbc_ingest() is dropping rows in Snowflake [arrow-adbc]

2024-05-15 Thread via GitHub
zeroshade commented on issue #1847: URL: https://github.com/apache/arrow-adbc/issues/1847#issuecomment-2113471775 Awesome. So now we just gotta figure out if the issue is in the Parquet writer, or on snowflake's side :) if you don't get the time to dig deeper, I should be able to poke it to

Re: [I] adbc_ingest() is dropping rows in Snowflake [arrow-adbc]

2024-05-15 Thread via GitHub
joellubi commented on issue #1847: URL: https://github.com/apache/arrow-adbc/issues/1847#issuecomment-2113468212 @zeroshade Yes just got a pure go reproduction With a record reader that produces 1 empty batch and then 10 batches of 100 (i.e. expecting 1000 rows): ``` joel@Joels-

Re: [PR] GH-41262: [Java][FlightSQL] Implement stateless prepared statements [arrow]

2024-05-15 Thread via GitHub
jduo commented on PR #41237: URL: https://github.com/apache/arrow/pull/41237#issuecomment-2113453410 > It looks like my fix in @afterall for TestFlightSql and TestFlightSqlStateless was not successful. I see that the tests use the same filename and the tests try to delete the file.

Re: [I] adbc_ingest() is dropping rows in Snowflake [arrow-adbc]

2024-05-15 Thread via GitHub
zeroshade commented on issue #1847: URL: https://github.com/apache/arrow-adbc/issues/1847#issuecomment-2113409097 @joellubi can you replicate it with pure go? Or only through pyarrow with some table chunks set to be empty? Just trying to narrow down where the issue might be -- This is an
