Re: [I] [C++] get different total rows answer for one dataset [arrow]

2023-10-06 Thread via GitHub
litao3rd commented on issue #37840: URL: https://github.com/apache/arrow/issues/37840#issuecomment-1751622884 The original problem has been resolved. Close this issue. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use th

Re: [I] [C++] examples for self-defined compute function [arrow]

2023-10-06 Thread via GitHub
litao3rd commented on issue #37924: URL: https://github.com/apache/arrow/issues/37924#issuecomment-1751622454 Thank you for your response. The provided information is valuable to me. Hope it will be merged into the master branch soon.

Re: [PR] Updated sort.rs to show `TopK` [arrow-datafusion]

2023-10-06 Thread via GitHub
Night-Amber3301 commented on PR #7751: URL: https://github.com/apache/arrow-datafusion/pull/7751#issuecomment-1751613313 I've merged it up from main, please check it out. Thanks!

[PR] Updated sort.rs to show `TopK` [arrow-datafusion]

2023-10-06 Thread via GitHub
Night-Amber3301 opened a new pull request, #7763: URL: https://github.com/apache/arrow-datafusion/pull/7763 solves: #7750 Replaced `SortExec: fetch={fetch}, expr=[{}]` with `SortExec: TopK(fetch={fetch}), expr=[{}]` in [sort.rs](https://github.com/apache/arrow-datafusion/blob/main/da

Re: [PR] GH-37939: [C++] Use signed arithmetic for frame of reference when encoding DELTA_BINARY_PACKED [arrow]

2023-10-06 Thread via GitHub
mapleFU commented on PR #37940: URL: https://github.com/apache/arrow/pull/37940#issuecomment-1751612340 I generate data using the code below, and use zstd default to compress it with 1 values. ``` if i % 4 == 0: return i * -400; return i * 4; ``` The plain siz

Re: [PR] GH-37939: [C++] Use signed arithmetic for frame of reference when encoding DELTA_BINARY_PACKED [arrow]

2023-10-06 Thread via GitHub
etseidl commented on PR #37940: URL: https://github.com/apache/arrow/pull/37940#issuecomment-1751610584 > lol, I tried this in my own dataset, the page size after encoding becomes smaller, but the size after compression even grows twice than before😂 Oh no 😮

Re: [I] [Python] ERROR: Failed building wheel for pyarrow [arrow]

2023-10-06 Thread via GitHub
dss010101 commented on issue #34757: URL: https://github.com/apache/arrow/issues/34757#issuecomment-1751605396 > You might be using Python 3.12? In that case, there are no wheels yet for pyarrow for Python 3.12, the solution is to wait a month with using Python 3.12. yes that's correc

Re: [PR] GH-37941: [R][CI][Release] Add checksum verification for pre-compiled binaries [arrow]

2023-10-06 Thread via GitHub
github-actions[bot] commented on PR #38115: URL: https://github.com/apache/arrow/pull/38115#issuecomment-1751603600 Revision: 924286e2ba25072a0b1f211cac79688c0b0b3a3b Submitted crossbow builds: [ursacomputing/crossbow @ actions-e05753dd99](https://github.com/ursacomputing/crossbow/bra

Re: [PR] GH-38011: [C++][Dataset] Change force close to tend to close on write [arrow]

2023-10-06 Thread via GitHub
conbench-apache-arrow[bot] commented on PR #38030: URL: https://github.com/apache/arrow/pull/38030#issuecomment-1751602763 After merging your PR, Conbench analyzed the 6 benchmarking runs that have been run so far on merge-commit 050ccee9ff4e967a9e604f8e4829a2b400688ee1. There were 3

Re: [PR] GH-37941: [R][CI][Release] Add checksum verification for pre-compiled binaries [arrow]

2023-10-06 Thread via GitHub
github-actions[bot] commented on PR #38115: URL: https://github.com/apache/arrow/pull/38115#issuecomment-1751597823 Revision: 3aaecf6ae983d03e9dc1d9ca9921389a2ac2c0a2 Submitted crossbow builds: [ursacomputing/crossbow @ actions-3103086aa9](https://github.com/ursacomputing/crossbow/bra

Re: [PR] GH-37941: [R][CI][Release] Add checksum verification for pre-compiled binaries [arrow]

2023-10-06 Thread via GitHub
assignUser commented on PR #38115: URL: https://github.com/apache/arrow/pull/38115#issuecomment-1751597378 @github-actions crossbow submit r-binary-packages

Re: [PR] GH-38034: [Python] DataFrame Interchange Protocol - correct dtype information for categorical columns [arrow]

2023-10-06 Thread via GitHub
AlenkaF commented on PR #38065: URL: https://github.com/apache/arrow/pull/38065#issuecomment-1751596545 Oh, bummer. Thanks for pinging me Dane!

Re: [PR] GH-37941: [R][CI][Release] Add checksum verification for pre-compiled binaries [arrow]

2023-10-06 Thread via GitHub
github-actions[bot] commented on PR #38115: URL: https://github.com/apache/arrow/pull/38115#issuecomment-1751590077 Revision: 88dde984514318dfa89e302299724550d85743d0 Submitted crossbow builds: [ursacomputing/crossbow @ actions-9dd4c97371](https://github.com/ursacomputing/crossbow/bra

Re: [PR] GH-37939: [C++] Use signed arithmetic for frame of reference when encoding DELTA_BINARY_PACKED [arrow]

2023-10-06 Thread via GitHub
mapleFU commented on PR #37940: URL: https://github.com/apache/arrow/pull/37940#issuecomment-1751586297 lol, I tried this in my own dataset, the page size after encoding becomes smaller, but the size after compression even grows twice than before😂

Re: [I] [C++] [Acero] Incorrect results in inner join [arrow]

2023-10-06 Thread via GitHub
llama90 commented on issue #38074: URL: https://github.com/apache/arrow/issues/38074#issuecomment-1751582986 It seems correct. I've noticed that when modifying the `kMiniBatchLength` value arbitrarily to be larger than the number of match records, the results come out fine. I s

Re: [PR] [feat] Support cache ListFiles result cache in session level [arrow-datafusion]

2023-10-06 Thread via GitHub
Ted-Jiang commented on PR #7620: URL: https://github.com/apache/arrow-datafusion/pull/7620#issuecomment-1751580979 > Thanks @Ted-Jiang and @suremarc > > Do you have any performance measurements you can share @Ted-Jiang about how much this feature increases performance for your usecase

Re: [PR] GH-37753: [C++][Gandiva] Add external function registry support [arrow]

2023-10-06 Thread via GitHub
niyue commented on PR #37787: URL: https://github.com/apache/arrow/pull/37787#issuecomment-1751580122 After discussing this proposal on the mailing list, I will close this PR in favor of https://github.com/apache/arrow/pull/38116 Thank you all for the help!

Re: [PR] GH-37753: [C++][Gandiva] Add external function registry support [arrow]

2023-10-06 Thread via GitHub
niyue closed pull request #37787: GH-37753: [C++][Gandiva] Add external function registry support URL: https://github.com/apache/arrow/pull/37787

[PR] GH-37753: [C++][Gandiva] Add external function registry support [arrow]

2023-10-06 Thread via GitHub
niyue opened a new pull request, #38116: URL: https://github.com/apache/arrow/pull/38116 # Rationale for this change This PR tries to enhance Gandiva by supporting external function registry, so that developers can author third party functions without modifying Gandiva's core codebas

Re: [PR] GH-37941: [R][CI][Release] Add checksum verification for pre-compiled binaries [arrow]

2023-10-06 Thread via GitHub
github-actions[bot] commented on PR #38115: URL: https://github.com/apache/arrow/pull/38115#issuecomment-1751578677 Revision: 3aad00200f3df4c6061d66f7f5ffa5ab8466ab8c Submitted crossbow builds: [ursacomputing/crossbow @ actions-32180a4493](https://github.com/ursacomputing/crossbow/bra

Re: [PR] GH-37941: [R][CI][Release] Add checksum verification for pre-compiled binaries [arrow]

2023-10-06 Thread via GitHub
github-actions[bot] commented on PR #38115: URL: https://github.com/apache/arrow/pull/38115#issuecomment-1751566505 Revision: 28ce9e70dbd9dbc112458d6640580cabb2101011 Submitted crossbow builds: [ursacomputing/crossbow @ actions-12f8605306](https://github.com/ursacomputing/crossbow/bra

Re: [PR] GH-38005: [Java] disable the debug log when running Java tests [arrow]

2023-10-06 Thread via GitHub
conbench-apache-arrow[bot] commented on PR #38006: URL: https://github.com/apache/arrow/pull/38006#issuecomment-1751566347 After merging your PR, Conbench analyzed the 5 benchmarking runs that have been run so far on merge-commit 47f547749a524086c592d87d3c1f48f12730d74e. There were no

Re: [PR] GH-37941: [R][CI][Release] Add checksum verification for pre-compiled binaries [arrow]

2023-10-06 Thread via GitHub
assignUser commented on PR #38115: URL: https://github.com/apache/arrow/pull/38115#issuecomment-1751565755 @github-actions crossbow submit r-binary-packages

Re: [PR] GH-38088: [R] Remove outdated references to brew and autobrew [arrow]

2023-10-06 Thread via GitHub
github-actions[bot] commented on PR #38089: URL: https://github.com/apache/arrow/pull/38089#issuecomment-1751555947 ``` No such command 'crossbow-submit'. The Archery job run can be found at: https://github.com/apache/arrow/actions/runs/6438276069 ```

Re: [PR] GH-38088: [R] Remove outdated references to brew and autobrew [arrow]

2023-10-06 Thread via GitHub
paleolimbot commented on PR #38089: URL: https://github.com/apache/arrow/pull/38089#issuecomment-1751555728 @github-actions crossbow-submit --group r

Re: [PR] GH-37941: [R][CI][Release] Add checksum verification for pre-compiled binaries [arrow]

2023-10-06 Thread via GitHub
github-actions[bot] commented on PR #38115: URL: https://github.com/apache/arrow/pull/38115#issuecomment-1751546256 Revision: 271574cbdd6c689fdf2fb288005e503ca48e90ea Submitted crossbow builds: [ursacomputing/crossbow @ actions-954d9e9a04](https://github.com/ursacomputing/crossbow/bra

Re: [PR] GH-37941: [R][CI][Release] Add checksum verification for pre-compiled binaries [arrow]

2023-10-06 Thread via GitHub
github-actions[bot] commented on PR #38115: URL: https://github.com/apache/arrow/pull/38115#issuecomment-1751544406 Revision: 9fbd82a698dca8b4ccdccd75c7f4d0a413b5a76d Submitted crossbow builds: [ursacomputing/crossbow @ actions-fcd6a592d1](https://github.com/ursacomputing/crossbow/bra

Re: [PR] GH-37941: [R][CI][Release] Add checksum verification for pre-compiled binaries [arrow]

2023-10-06 Thread via GitHub
github-actions[bot] commented on PR #38115: URL: https://github.com/apache/arrow/pull/38115#issuecomment-1751539688 Revision: 16ae2618d7976fccae8b663d7be5df2c366d20d7 Submitted crossbow builds: [ursacomputing/crossbow @ actions-8328a73b02](https://github.com/ursacomputing/crossbow/bra

Re: [PR] GH-37941: [R][CI][Release] Add checksum verification for pre-compiled binaries [arrow]

2023-10-06 Thread via GitHub
assignUser commented on PR #38115: URL: https://github.com/apache/arrow/pull/38115#issuecomment-1751538859 @github-actions crossbow submit r-binary-packages

[PR] GH-37941: [R][CI][Release] Add checksum verification for pre-compiled binaries [arrow]

2023-10-06 Thread via GitHub
assignUser opened a new pull request, #38115: URL: https://github.com/apache/arrow/pull/38115 ### Rationale for this change This change is to restore parity with the previous solution on macOS (brew does cs validation) and improve security for Windows and Linux. This also aligns with

Re: [PR] GH-36760: [Go] Add Avro OCF reader [arrow]

2023-10-06 Thread via GitHub
loicalleyne commented on code in PR #37115: URL: https://github.com/apache/arrow/pull/37115#discussion_r1349420060 ## go/arrow/avro/schema.go: ## @@ -0,0 +1,409 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the

Re: [PR] GH-36760: [Go] Add Avro OCF reader [arrow]

2023-10-06 Thread via GitHub
loicalleyne commented on code in PR #37115: URL: https://github.com/apache/arrow/pull/37115#discussion_r1349418679 ## go/arrow/avro/schema.go: ## @@ -0,0 +1,409 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the

Re: [I] Support for WITH Statements in DataFusion (they are currently silently ignored) [arrow-datafusion]

2023-10-06 Thread via GitHub
alamb commented on issue #7760: URL: https://github.com/apache/arrow-datafusion/issues/7760#issuecomment-1751513222 > Sorry for the false claim. I couldn't find it in the docs and I had a syntax error on my end! No worries -- any chance you can add it to the docs @Yacobolo 🙏

[I] [bug] 241.9 error: failed to select a version for the requirement `parquet = "^22.0.0"` [arrow-datafusion]

2023-10-06 Thread via GitHub
YuriyGavrilov opened a new issue, #7762: URL: https://github.com/apache/arrow-datafusion/issues/7762 ### Describe the bug `=> ERROR [builder 6/6] RUN cargo build --release

Re: [PR] GH-37876: [Format] Add list-view specification to arrow format [arrow]

2023-10-06 Thread via GitHub
conbench-apache-arrow[bot] commented on PR #37877: URL: https://github.com/apache/arrow/pull/37877#issuecomment-1751500157 After merging your PR, Conbench analyzed the 6 benchmarking runs that have been run so far on merge-commit 6d551aa88237325978f51dbe02773a0aec2bfe9d. There was 1 b

Re: [PR] GH-36760: [Go] Add Avro OCF reader [arrow]

2023-10-06 Thread via GitHub
loicalleyne commented on code in PR #37115: URL: https://github.com/apache/arrow/pull/37115#discussion_r1349382380 ## go/arrow/avro/reader_types.go: ## @@ -0,0 +1,892 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. S

Re: [PR] GH-36760: [Go] Add Avro OCF reader [arrow]

2023-10-06 Thread via GitHub
loicalleyne commented on code in PR #37115: URL: https://github.com/apache/arrow/pull/37115#discussion_r1349378306 ## go/arrow/avro/reader_types.go: ## @@ -0,0 +1,892 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. S

Re: [PR] GH-36760: [Go] Add Avro OCF reader [arrow]

2023-10-06 Thread via GitHub
loicalleyne commented on code in PR #37115: URL: https://github.com/apache/arrow/pull/37115#discussion_r1349371528 ## go/arrow/avro/schema.go: ## @@ -0,0 +1,409 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the

Re: [PR] GH-36760: [Go] Add Avro OCF reader [arrow]

2023-10-06 Thread via GitHub
loicalleyne commented on code in PR #37115: URL: https://github.com/apache/arrow/pull/37115#discussion_r1349371714 ## go/arrow/avro/schema.go: ## @@ -0,0 +1,409 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the

Re: [PR] GH-36760: [Go] Add Avro OCF reader [arrow]

2023-10-06 Thread via GitHub
loicalleyne commented on code in PR #37115: URL: https://github.com/apache/arrow/pull/37115#discussion_r1349371327 ## go/arrow/avro/schema.go: ## @@ -0,0 +1,409 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the

Re: [PR] GH-36760: [Go] Adding avro ocf reader - schema converter [arrow]

2023-10-06 Thread via GitHub
loicalleyne closed pull request #36796: GH-36760: [Go] Adding avro ocf reader - schema converter URL: https://github.com/apache/arrow/pull/36796

Re: [PR] GH-36760: [Go] Add Avro OCF reader [arrow]

2023-10-06 Thread via GitHub
loicalleyne commented on code in PR #37115: URL: https://github.com/apache/arrow/pull/37115#discussion_r1349354201 ## go/arrow/avro/schema.go: ## @@ -0,0 +1,409 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the

Re: [I] Support for WITH Statements in DataFusion (they are currently silently ignored) [arrow-datafusion]

2023-10-06 Thread via GitHub
Yacobolo closed issue #7760: Support for WITH Statements in DataFusion (they are currently silently ignored) URL: https://github.com/apache/arrow-datafusion/issues/7760

[PR] GH-38090: [C++][Emscripten] dataset: Suppress shorten-64-to-32 warnings [arrow]

2023-10-06 Thread via GitHub
kou opened a new pull request, #38114: URL: https://github.com/apache/arrow/pull/38114 ### Rationale for this change We need explicit cast to use `int64_t` for `size_t` on Emscripten. ### What changes are included in this PR? Explicit casts. ### Are these changes t
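
The rationale shared by the GH-38090 pull requests in this listing ("explicit cast to use `int64_t` for `size_t`") can be illustrated with a minimal sketch. It assumes a target with a 32-bit `size_t`, such as wasm32, where passing an `int64_t` length to an API that takes `size_t` is a narrowing conversion that Clang's `-Wshorten-64-to-32` reports. The helper names `ToSize` and `MakeZeroBuffer` are hypothetical and are not taken from the Arrow codebase or from these pull requests.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not from Arrow): convert a 64-bit length to size_t
// with an explicit cast, so the narrowing on a 32-bit size_t target is
// stated in the code instead of happening implicitly.
inline std::size_t ToSize(std::int64_t length) {
  // Assumption: callers only pass non-negative lengths that fit in size_t.
  return static_cast<std::size_t>(length);
}

// Hypothetical example: build a zero-filled buffer whose length is tracked
// as int64_t.
inline std::vector<std::uint8_t> MakeZeroBuffer(std::int64_t length) {
  std::vector<std::uint8_t> buffer;
  // Writing buffer.resize(length) would convert int64_t to size_t implicitly.
  buffer.resize(ToSize(length));
  return buffer;
}
```

On a 64-bit target the cast changes nothing at run time; its only purpose is to make the conversion explicit where `size_t` is narrower than `int64_t`.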

Re: [PR] GH-38096: [Java] FlightStream with metadata can cause error when closing [arrow]

2023-10-06 Thread via GitHub
BryanCutler commented on PR #38110: URL: https://github.com/apache/arrow/pull/38110#issuecomment-1751445924 @lidavidm I was playing around with the DoExchange call and noticed a couple of issues. At first I was a little confused as to why `getWriter().getResult()` wasn't being called until

Re: [I] Support for WITH Statements in DataFusion (they are currently silently ignored) [arrow-datafusion]

2023-10-06 Thread via GitHub
Yacobolo commented on issue #7760: URL: https://github.com/apache/arrow-datafusion/issues/7760#issuecomment-1751445491 Sorry for the false claim. I couldn't find it in the docs and I had a syntax error on my end!

Re: [I] [C++] [Acero] Incorrect results in inner join [arrow]

2023-10-06 Thread via GitHub
benibus commented on issue #38074: URL: https://github.com/apache/arrow/issues/38074#issuecomment-1751445025 I haven't managed to find the exact issue yet (I'm not too familiar with this code in particular), but this section is fairly suspicious: https://github.com/apache/arrow/blob/3697bcd

[PR] GH-38090: [C++][Emscripten] compute/row: Suppress shorten-64-to-32 warnings [arrow]

2023-10-06 Thread via GitHub
kou opened a new pull request, #38113: URL: https://github.com/apache/arrow/pull/38113 ### Rationale for this change We need explicit cast to use `int64_t` for `size_t` on Emscripten. ### What changes are included in this PR? Explicit casts. ### Are these changes t

Re: [PR] GH-38059: [Python][CI] Upgrade CUDA to 11.2.2 [arrow]

2023-10-06 Thread via GitHub
github-actions[bot] commented on PR #38081: URL: https://github.com/apache/arrow/pull/38081#issuecomment-1751443836 Revision: 14b7cfcff89983a2c86f34a0aa7cf948c086aaa6 Submitted crossbow builds: [ursacomputing/crossbow @ actions-b62f8c70c6](https://github.com/ursacomputing/crossbow/bra

[PR] GH-38090: [C++][Emscripten] compute: Suppress shorten-64-to-32 warnings [arrow]

2023-10-06 Thread via GitHub
kou opened a new pull request, #38112: URL: https://github.com/apache/arrow/pull/38112 ### Rationale for this change We need explicit cast to use `int64_t` for `size_t` on Emscripten. ### What changes are included in this PR? Explicit casts. ### Are these changes t

Re: [PR] GH-38059: [Python][CI] Upgrade CUDA to 11.2.2 [arrow]

2023-10-06 Thread via GitHub
danepitkin commented on PR #38081: URL: https://github.com/apache/arrow/pull/38081#issuecomment-1751441764 @github-actions crossbow submit ubuntu-focal-amd64

Re: [PR] GH-36760: [Go] Add Avro OCF reader [arrow]

2023-10-06 Thread via GitHub
loicalleyne commented on code in PR #37115: URL: https://github.com/apache/arrow/pull/37115#discussion_r1349346821 ## go/arrow/avro/schema.go: ## @@ -0,0 +1,409 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the

[PR] GH-38090: [C++][Emscripten] compute/kernels/scalar_if_else: Suppress shorten-64-to-32 warnings [arrow]

2023-10-06 Thread via GitHub
kou opened a new pull request, #38111: URL: https://github.com/apache/arrow/pull/38111 ### Rationale for this change We need explicit cast to use `int64_t` for `size_t` on Emscripten. ### What changes are included in this PR? Explicit casts. ### Are these changes t

Re: [PR] GH-38059: [Python][CI] Upgrade CUDA to 11.2.2 [arrow]

2023-10-06 Thread via GitHub
danepitkin commented on PR #38081: URL: https://github.com/apache/arrow/pull/38081#issuecomment-1751441368 Well.. it probably is added, but needs to be included somewhere. Either way, I don't have much time to dig into before the Arrow v14 release.

[PR] GH-38096: [Java] FlightStream with metadata can cause error when closing [arrow]

2023-10-06 Thread via GitHub
BryanCutler opened a new pull request, #38110: URL: https://github.com/apache/arrow/pull/38110 ### Rationale for this change The Java FlightStream can raise an error if metadata is transferred and ends up being closed twice. ``` java.lang.IllegalStateException: RefCnt has go

[PR] GH-38090: [C++][Emscripten] compute/kernels/scalar: Suppress shorten-64-to-32 warnings [arrow]

2023-10-06 Thread via GitHub
kou opened a new pull request, #38109: URL: https://github.com/apache/arrow/pull/38109 ### Rationale for this change We need explicit cast to use `int64_t` for `size_t` on Emscripten. ### What changes are included in this PR? Explicit casts. ### Are these changes t

Re: [PR] GH-38059: [Python][CI] Upgrade CUDA to 11.2.2 [arrow]

2023-10-06 Thread via GitHub
danepitkin commented on PR #38081: URL: https://github.com/apache/arrow/pull/38081#issuecomment-1751440331 Good catch, looks like one of the CI jobs still doesn't like the enum added (I thought 11.2.2 had added it). Reverted the change. This is the CI error: ``` /build/apache-a

Re: [PR] Encode all join conditions in a single expression field [arrow-datafusion]

2023-10-06 Thread via GitHub
nseekhao commented on PR #7612: URL: https://github.com/apache/arrow-datafusion/pull/7612#issuecomment-1751439292 @alamb I extracted the predicate splits from the `from_substrait_rel()` so the code is easier to read. Unfortunately, I couldn't use the function `split_eq_and_noneq_join_predi

[PR] GH-38090: [C++][Emscripten] compute/kernels/row_encoder: Suppress shorten-64-to-32 warnings [arrow]

2023-10-06 Thread via GitHub
kou opened a new pull request, #38108: URL: https://github.com/apache/arrow/pull/38108 ### Rationale for this change We need explicit cast to use `int64_t` for `size_t` on Emscripten. ### What changes are included in this PR? Explicit casts. ### Are these changes t

Re: [PR] Minor: Improve TableProvider document, and add ascii art [arrow-datafusion]

2023-10-06 Thread via GitHub
comphead commented on code in PR #7759: URL: https://github.com/apache/arrow-datafusion/pull/7759#discussion_r1349344404 ## datafusion/core/src/datasource/provider.rs: ## @@ -54,24 +54,87 @@ pub trait TableProvider: Sync + Send { None } -/// Get the Logical P

[PR] GH-38090: [C++][Emscripten] compute/kernels/hash_aggregate: Suppress shorten-64-to-32 warnings [arrow]

2023-10-06 Thread via GitHub
kou opened a new pull request, #38107: URL: https://github.com/apache/arrow/pull/38107 ### Rationale for this change We need explicit cast to use `int64_t` for `size_t` on Emscripten. ### What changes are included in this PR? Explicit casts. ### Are these changes t

[PR] GH-38090: [C++][Emscripten] compute/kernels/internal: Suppress shorten-64-to-32 warnings [arrow]

2023-10-06 Thread via GitHub
kou opened a new pull request, #38106: URL: https://github.com/apache/arrow/pull/38106 ### Rationale for this change We need explicit cast to use `int64_t` for `size_t` on Emscripten. ### What changes are included in this PR? Explicit casts. ### Are these changes t

[PR] GH-38090: [C++][Emscripten] compute/kernels/aggregate: Suppress shorten-64-to-32 warnings [arrow]

2023-10-06 Thread via GitHub
kou opened a new pull request, #38105: URL: https://github.com/apache/arrow/pull/38105 ### Rationale for this change We need explicit cast to use `int64_t` for `size_t` on Emscripten. ### What changes are included in this PR? Explicit casts. ### Are these changes t

[PR] GH-38090: [C++][Emscripten] compute/kernel: Suppress shorten-64-to-32 warnings [arrow]

2023-10-06 Thread via GitHub
kou opened a new pull request, #38104: URL: https://github.com/apache/arrow/pull/38104 ### Rationale for this change We need explicit cast to use `int64_t` for `size_t` on Emscripten. ### What changes are included in this PR? Explicit casts. ### Are these changes t

[PR] GH-38090: [C++][Emscripten] compare: Suppress shorten-64-to-32 warnings [arrow]

2023-10-06 Thread via GitHub
kou opened a new pull request, #38103: URL: https://github.com/apache/arrow/pull/38103 ### Rationale for this change We need explicit cast to use `int64_t` for `size_t` on Emscripten. ### What changes are included in this PR? Explicit casts. ### Are these changes t

[PR] GH-38090: [C++][Emscripten] chunked_array: Suppress shorten-64-to-32 warnings [arrow]

2023-10-06 Thread via GitHub
kou opened a new pull request, #38102: URL: https://github.com/apache/arrow/pull/38102 ### Rationale for this change We need explicit cast to use `int64_t` for `size_t` on Emscripten. ### What changes are included in this PR? Explicit casts. ### Are these changes t

[PR] GH-38090: [C++][Emscripten] chunk_resolver: Suppress shorten-64-to-32 warnings [arrow]

2023-10-06 Thread via GitHub
kou opened a new pull request, #38101: URL: https://github.com/apache/arrow/pull/38101 ### Rationale for this change We need explicit cast to use `int64_t` for `size_t` on Emscripten. ### What changes are included in this PR? Explicit casts. ### Are these changes t

[PR] GH-38090: [C++][Emscripten] c: Suppress shorten-64-to-32 warnings [arrow]

2023-10-06 Thread via GitHub
kou opened a new pull request, #38100: URL: https://github.com/apache/arrow/pull/38100 ### Rationale for this change We need explicit cast to use `int64_t` for `size_t` on Emscripten. ### What changes are included in this PR? Explicit casts. ### Are these changes t

Re: [PR] Updated sort.rs to show `TopK` [arrow-datafusion]

2023-10-06 Thread via GitHub
Night-Amber3301 commented on PR #7751: URL: https://github.com/apache/arrow-datafusion/pull/7751#issuecomment-1751427174 Yeah sure, thanks for consideration, I would love to work on it, will go through it in a while.

[PR] GH-38090: [C++][Emscripten] buffer: Suppress shorten-64-to-32 warnings [arrow]

2023-10-06 Thread via GitHub
kou opened a new pull request, #38099: URL: https://github.com/apache/arrow/pull/38099 ### Rationale for this change We need explicit cast to use `int64_t` for `size_t` on Emscripten. ### What changes are included in this PR? Explicit casts. ### Are these changes t

[PR] GH-38090: [C++][Emscripten] array: Suppress shorten-64-to-32 warnings [arrow]

2023-10-06 Thread via GitHub
kou opened a new pull request, #38098: URL: https://github.com/apache/arrow/pull/38098 ### Rationale for this change We need explicit cast to use `int64_t` for `size_t` on Emscripten. ### What changes are included in this PR? Explicit casts. ### Are these changes t

Re: [PR] Add AWS presigned URL support [arrow-rs]

2023-10-06 Thread via GitHub
carols10cents commented on PR #4876: URL: https://github.com/apache/arrow-rs/pull/4876#issuecomment-1751425604 Ok, I think I'm happy-ish with this? I have [an iox branch working](https://github.com/influxdata/influxdb_iox/pull/8927) that I think isn't terrible, maybe? CI is failing w

Re: [PR] Support InsertInto Sorted ListingTable [arrow-datafusion]

2023-10-06 Thread via GitHub
alamb commented on PR #7743: URL: https://github.com/apache/arrow-datafusion/pull/7743#issuecomment-1751421496 > I actually think the above is correct behavior. The table is not globally sorted, but rather each individual file is sorted. Each time you insert, at least one new file is inser

[PR] GH-38090: [C++][Emscripten] orc: Suppress shorten-64-to-32 warnings [arrow]

2023-10-06 Thread via GitHub
kou opened a new pull request, #38097: URL: https://github.com/apache/arrow/pull/38097 ### Rationale for this change We need explicit cast to use `int64_t` for `size_t` on Emscripten. ### What changes are included in this PR? Explicit casts. ### Are these changes t

Re: [I] Support for WITH Statements in DataFusion (they are currently silently ignored) [arrow-datafusion]

2023-10-06 Thread via GitHub
devinjdangelo commented on issue #7760: URL: https://github.com/apache/arrow-datafusion/issues/7760#issuecomment-1751419505 ``` ❯ create table t(x int) as values (1), (2), (3), (2), (5), (1); 0 rows in set. Query took 0.004 seconds. ❯ with t2 as (select * from t where x > 2) sel

[PR] GH-38090: [C++][Emscripten] acero/tpch_node: Suppress shorten-64-to-32 warnings [arrow]

2023-10-06 Thread via GitHub
kou opened a new pull request, #38095: URL: https://github.com/apache/arrow/pull/38095 ### Rationale for this change We need explicit cast to use `int64_t` for `size_t` on Emscripten. ### What changes are included in this PR? Explicit casts. ### Are these changes t

Re: [I] Support for WITH Statements in DataFusion (they are currently silently ignored) [arrow-datafusion]

2023-10-06 Thread via GitHub
alamb commented on issue #7760: URL: https://github.com/apache/arrow-datafusion/issues/7760#issuecomment-1751418389 ``` ❯ create table table1 (a int) as values (1); 0 rows in set. Query took 0.002 seconds. ❯ WITH table2 AS ( SELECT a, count(1) as cnt FROM table1 GROUP

[PR] GH-38090: [C++][Emscripten] acero/swiss_join: Suppress shorten-64-to-32 warnings [arrow]

2023-10-06 Thread via GitHub
kou opened a new pull request, #38094: URL: https://github.com/apache/arrow/pull/38094 ### Rationale for this change We need explicit cast to use `int64_t` for `size_t` on Emscripten. ### What changes are included in this PR? Explicit casts. ### Are these changes t

Re: [I] Support for WITH Statements in DataFusion (they are currently silently ignored) [arrow-datafusion]

2023-10-06 Thread via GitHub
devinjdangelo commented on issue #7760: URL: https://github.com/apache/arrow-datafusion/issues/7760#issuecomment-1751415697 I might be missing something, but I thought Datafusion did support [with statements](https://arrow.apache.org/datafusion/user-guide/sql/select.html) I tested th

[PR] GH-38090: [C++][Emscripten] acero/hash_join: Suppress shorten-64-to-32 warnings [arrow]

2023-10-06 Thread via GitHub
kou opened a new pull request, #38093: URL: https://github.com/apache/arrow/pull/38093 ### Rationale for this change We need explicit cast to use `int64_t` for `size_t` on Emscripten. ### What changes are included in this PR? Explicit casts. ### Are these changes t

[PR] GH-38090: [C++][Emscripten] acero/bloom_filter: Suppress shorten-64-to-32 warnings [arrow]

2023-10-06 Thread via GitHub
kou opened a new pull request, #38092: URL: https://github.com/apache/arrow/pull/38092 ### Rationale for this change We need explicit cast to use `int64_t` for `size_t` on Emscripten. ### What changes are included in this PR? Explicit casts. ### Are these changes t

Re: [PR] GH-38090: [C++][Emscripten] acero/asof_join_node: Suppress shorten-64-to-32 warnings [arrow]

2023-10-06 Thread via GitHub
github-actions[bot] commented on PR #38091: URL: https://github.com/apache/arrow/pull/38091#issuecomment-1751411065 :warning: GitHub issue #38090 **has been automatically assigned in GitHub** to PR creator.

[PR] GH-38090: [C++][Emscripten] acero/asof_join_node: Suppress shorten-64-to-32 warnings [arrow]

2023-10-06 Thread via GitHub
kou opened a new pull request, #38091: URL: https://github.com/apache/arrow/pull/38091 ### Rationale for this change We need explicit cast to use `int64_t` for `size_t` on Emscripten. ### What changes are included in this PR? Explicit casts. ### Are these changes t

Re: [PR] Support InsertInto Sorted ListingTable [arrow-datafusion]

2023-10-06 Thread via GitHub
devinjdangelo commented on PR #7743: URL: https://github.com/apache/arrow-datafusion/pull/7743#issuecomment-1751409949 > I found a few other issues, but I don't think they are caused by this PR > > ```shell > $ mkdir /tmp/output > $ datafusion-cli > DataFusion CLI v31.0.0

Re: [PR] GH-23221: [C++] Add support for building with Emscripten [arrow]

2023-10-06 Thread via GitHub
kou commented on code in PR #37821: URL: https://github.com/apache/arrow/pull/37821#discussion_r1348205281 ## cpp/cmake_modules/SetupCxxFlags.cmake: ## @@ -700,3 +702,30 @@ if(MSVC) set(CMAKE_SHARED_LINKER_FLAGS "${CMAKE_SHARED_LINKER_FLAGS} ${MSVC_LINKER_FLAGS}") endif

Re: [PR] Updated sort.rs to show `TopK` [arrow-datafusion]

2023-10-06 Thread via GitHub
alamb commented on PR #7751: URL: https://github.com/apache/arrow-datafusion/pull/7751#issuecomment-1751404949 This looks really close @Night-Amber3301 -- thank you. I think some of the clippy errors will be resolved if you merge up from main. And it looks like there are just a few more te

Re: [PR] GH-37978: [C++] Add support for specifying custom Array element delimiter to `arrow::PrettyPrintOptions` [arrow]

2023-10-06 Thread via GitHub
kevingurney commented on PR #37981: URL: https://github.com/apache/arrow/pull/37981#issuecomment-1751403686 One of the benchmarks that may have regressed is [`dataframe-to-table`](https://github.com/voltrondata-labs/benchmarks/blob/main/benchmarks/dataframe_to_table_benchmark.py). However,

Re: [PR] Support InsertInto Sorted ListingTable [arrow-datafusion]

2023-10-06 Thread via GitHub
devinjdangelo commented on code in PR #7743: URL: https://github.com/apache/arrow-datafusion/pull/7743#discussion_r1349322356 ## datafusion/physical-plan/src/insert.rs: ## @@ -73,6 +73,8 @@ pub struct FileSinkExec { sink_schema: SchemaRef, /// Schema describing the str

Re: [I] [Java][CI][Java-Jars][MacOS] C++ libraries for MacOS AARCH 64 [arrow]

2023-10-06 Thread via GitHub
assignUser commented on issue #38076: URL: https://github.com/apache/arrow/issues/38076#issuecomment-1751401405 Thanks for investigating. We will need to upgrade those machines.

Re: [I] Support for WITH Statements in DataFusion [arrow-datafusion]

2023-10-06 Thread via GitHub
alamb commented on issue #7760: URL: https://github.com/apache/arrow-datafusion/issues/7760#issuecomment-1751400922 Thanks @Yacobolo -- I think this would be a relatively straightforward feature to add to the SQL planner It actually looks like the sql parser already supports this an

Re: [PR] GH-37978: [C++] Add support for specifying custom Array element delimiter to `arrow::PrettyPrintOptions` [arrow]

2023-10-06 Thread via GitHub
kevingurney commented on PR #37981: URL: https://github.com/apache/arrow/pull/37981#issuecomment-1751400679 This PR appears to have triggered a regression, but I am not clear why that would be. I wouldn't have expected the overhead of accessing a nested property on `PrettyPrintOptions` to b

Re: [PR] Make parquet an option [arrow-datafusion]

2023-10-06 Thread via GitHub
alamb commented on PR #7745: URL: https://github.com/apache/arrow-datafusion/pull/7745#issuecomment-1751399359 Thanks @ongchi 🙏

Re: [PR] GH-38060: [Python][CI] Upgrade Spark versions [arrow]

2023-10-06 Thread via GitHub
danepitkin commented on PR #38082: URL: https://github.com/apache/arrow/pull/38082#issuecomment-1751395455 Thanks @kou !

[I] [Feature] Add support iceberg table [arrow-ballista]

2023-10-06 Thread via GitHub
YuriyGavrilov opened a new issue, #890: URL: https://github.com/apache/arrow-ballista/issues/890 **Is your feature request related to a problem or challenge? Please describe what you are trying to do.** It would be good to have Iceberg table support (https://github.com/apache/icebe

Re: [PR] GH-38060: [Python][CI] Upgrade Spark versions [arrow]

2023-10-06 Thread via GitHub
kou merged PR #38082: URL: https://github.com/apache/arrow/pull/38082

Re: [PR] Support InsertInto Sorted ListingTable [arrow-datafusion]

2023-10-06 Thread via GitHub
alamb commented on PR #7743: URL: https://github.com/apache/arrow-datafusion/pull/7743#issuecomment-1751391242 > However, if I insert the same data again, now the data is not sorted! I couldn't reproduce this locally with a smaller reproducer and I am out of time. I'll investig

Re: [PR] GH-38059: [Python][CI] Upgrade CUDA to 11.2.2 [arrow]

2023-10-06 Thread via GitHub
github-actions[bot] commented on PR #38081: URL: https://github.com/apache/arrow/pull/38081#issuecomment-1751387988 Revision: 834b29e835d62dbeae8fd1b41224347372f867e7 Submitted crossbow builds: [ursacomputing/crossbow @ actions-0381279f14](https://github.com/ursacomputing/crossbow/bra

Re: [PR] GH-35328: [Go][FlightSQL] Fix flaky test for FlightSql driver [arrow]

2023-10-06 Thread via GitHub
conbench-apache-arrow[bot] commented on PR #38044: URL: https://github.com/apache/arrow/pull/38044#issuecomment-1751387761 After merging your PR, Conbench analyzed the 5 benchmarking runs that have been run so far on merge-commit 1cad7a787847add45261abe5acdd3ba080e54a75. There were no

Re: [PR] GH-38059: [Python][CI] Upgrade CUDA to 11.2.2 [arrow]

2023-10-06 Thread via GitHub
kou commented on PR #38081: URL: https://github.com/apache/arrow/pull/38081#issuecomment-1751385447 @github-actions crossbow submit debian-* ubuntu-*
