Re: [PR] feat: add support for fixed list wildcard in type signature [arrow-datafusion]

2024-02-23 Thread via GitHub
gruuya commented on code in PR #9312: URL: https://github.com/apache/arrow-datafusion/pull/9312#discussion_r1501357779 ## datafusion/expr/src/type_coercion/functions.rs: ## @@ -372,13 +374,20 @@ fn coerced_from<'a>( List(_) if matches!(type_from, FixedSizeList(_, _)) =>

Re: [PR] GH-39759: [Docs] Update pydata-sphinx-theme to 0.15.x [arrow]

2024-02-23 Thread via GitHub
Divyansh200102 commented on PR #39879: URL: https://github.com/apache/arrow/pull/39879#issuecomment-1962276504 @kou Are there any more changes needed in this pr? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL

Re: [I] move Array related function to datafusion-functions-array crate [arrow-datafusion]

2024-02-23 Thread via GitHub
jayzhan211 commented on issue #9322: URL: https://github.com/apache/arrow-datafusion/issues/9322#issuecomment-1962274235 I plan to work on `make_array`. Not sure which one you are working on, but I think it is better to move the functions one by one unless they are similar. -- This is an

Re: [PR] feat: add support for fixed list wildcard in type signature [arrow-datafusion]

2024-02-23 Thread via GitHub
jayzhan211 commented on PR #9312: URL: https://github.com/apache/arrow-datafusion/pull/9312#issuecomment-1962271571 If I understand correctly, it should pass the test like ```rust let inner = Arc::new(Field::new("item", DataType::Int32, false)); let current_types

Re: [PR] Move abs to datafusion_functions [arrow-datafusion]

2024-02-23 Thread via GitHub
jayzhan211 commented on code in PR #9313: URL: https://github.com/apache/arrow-datafusion/pull/9313#discussion_r1501352875 ## datafusion/substrait/tests/cases/roundtrip_logical_plan.rs: ## @@ -304,10 +304,11 @@ async fn not_between_integers() -> Result<()> { .await } -#[

Re: [PR] feat: support `FixedSizeList` Type Coercion [arrow-datafusion]

2024-02-23 Thread via GitHub
jayzhan211 commented on code in PR #9108: URL: https://github.com/apache/arrow-datafusion/pull/9108#discussion_r1501352016 ## datafusion/physical-expr/src/array_expressions.rs: ## @@ -433,6 +435,7 @@ pub fn array_element(args: &[ArrayRef]) -> Result { let indexes =

Re: [PR] feat: support `FixedSizeList` Type Coercion [arrow-datafusion]

2024-02-23 Thread via GitHub
jayzhan211 commented on code in PR #9108: URL: https://github.com/apache/arrow-datafusion/pull/9108#discussion_r1501348925 ## datafusion/physical-expr/src/array_expressions.rs: ## @@ -433,6 +435,7 @@ pub fn array_element(args: &[ArrayRef]) -> Result { let indexes =

Re: [PR] GH-40209: [C++][CMake] Use "RapidJSON" CMake target for RapidJSON [arrow]

2024-02-23 Thread via GitHub
conbench-apache-arrow[bot] commented on PR #40210: URL: https://github.com/apache/arrow/pull/40210#issuecomment-1962266060 After merging your PR, Conbench analyzed the 5 benchmarking runs that have been run so far on merge-commit 193e39cad4d8e1a01376d6b5199077e401484838. There were no

Re: [PR] feat: Support EscapedStringLiteral [arrow-datafusion]

2024-02-23 Thread via GitHub
Jefffrey commented on code in PR #9268: URL: https://github.com/apache/arrow-datafusion/pull/9268#discussion_r1501345793 ## datafusion-cli/Cargo.lock: ## Review Comment: Maybe omit these changes from PR, now that no new dependency is added? ## datafusion/sql/test

[I] support more date inferences in the csv reader [arrow-datafusion]

2024-02-23 Thread via GitHub
universalmind303 opened a new issue, #9331: URL: https://github.com/apache/arrow-datafusion/issues/9331 ### Is your feature request related to a problem or challenge? given a csv ```csv mm/dd/,mm-dd-,dd/mm/,dd-mm-,/mm/dd,-mm-dd 01/01/2012,01-01-2012,01

Re: [PR] feat: support `FixedSizeList` Type Coercion [arrow-datafusion]

2024-02-23 Thread via GitHub
jayzhan211 commented on code in PR #9108: URL: https://github.com/apache/arrow-datafusion/pull/9108#discussion_r1501346257 ## datafusion/physical-expr/src/array_expressions.rs: ## @@ -433,6 +435,7 @@ pub fn array_element(args: &[ArrayRef]) -> Result { let indexes =

Re: [PR] feat: issue #9282 not creating page_pruning_predicate when pruning is… [arrow-datafusion]

2024-02-23 Thread via GitHub
Jefffrey commented on PR #9314: URL: https://github.com/apache/arrow-datafusion/pull/9314#issuecomment-1962249351 > Off-topic warning: the below behavior is not related to this PR > > It seems that if prune is disabled, then filter pushdown will also be disabled as the predicate argum

Re: [PR] GH-40023: [Python] Use Cast() instead of CastTo [arrow]

2024-02-23 Thread via GitHub
llama90 commented on PR #40116: URL: https://github.com/apache/arrow/pull/40116#issuecomment-1962247561 @AlenkaF Hello. If I would like to get this part reviewed, who should I ask? Thank you. -- This is an automated message from the Apache Git Service. To respond to the message, ple

Re: [I] ParquetExec: don't evalute predicates unless corresponding pushdowns are enabled [arrow-datafusion]

2024-02-23 Thread via GitHub
SteveLauC closed issue #9282: ParquetExec: don't evalute predicates unless corresponding pushdowns are enabled URL: https://github.com/apache/arrow-datafusion/issues/9282 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use th

Re: [I] ParquetExec: don't evalute predicates unless corresponding pushdowns are enabled [arrow-datafusion]

2024-02-23 Thread via GitHub
SteveLauC commented on issue #9282: URL: https://github.com/apache/arrow-datafusion/issues/9282#issuecomment-1962244156 Close as this issue does not exist. For more info, see [this comment](https://github.com/apache/arrow-datafusion/pull/9314#issuecomment-1962240794). -- This is an autom

Re: [PR] feat: issue #9282 not creating page_pruning_predicate when pruning is… [arrow-datafusion]

2024-02-23 Thread via GitHub
SteveLauC commented on PR #9314: URL: https://github.com/apache/arrow-datafusion/pull/9314#issuecomment-1962243526 For this PR and #9282, I am sorry that I opened an issue that actually does not exist:D - > Off-topic warning: the below behavior is not related to this PR

Re: [PR] fix: issue #9327 throw error when incursion happen in dataframe api [arrow-datafusion]

2024-02-23 Thread via GitHub
Jefffrey commented on code in PR #9330: URL: https://github.com/apache/arrow-datafusion/pull/9330#discussion_r1501335629 ## datafusion/core/src/dataframe/mod.rs: ## @@ -1044,6 +1044,9 @@ impl DataFrame { /// # } /// ``` pub fn explain(self, verbose: bool, analyze:

Re: [PR] feat: issue #9282 not creating page_pruning_predicate when pruning is… [arrow-datafusion]

2024-02-23 Thread via GitHub
Lordworms commented on PR #9314: URL: https://github.com/apache/arrow-datafusion/pull/9314#issuecomment-1962242136 > It seems that if `prune` is disabled, then filter pushdown will also be disabled as the `predicate` argument of `ParquetExec::new()` will be `None`. I know that. But I

Re: [PR] feat: issue #9282 not creating page_pruning_predicate when pruning is… [arrow-datafusion]

2024-02-23 Thread via GitHub
SteveLauC commented on PR #9314: URL: https://github.com/apache/arrow-datafusion/pull/9314#issuecomment-1962241407 It seems that if `prune` is disabled, then filter pushdown will also be disabled as the `predicate` argument of `ParquetExec::new()` will be `None`. -- This is an automated

Re: [PR] feat: issue #9282 not creating page_pruning_predicate when pruning is… [arrow-datafusion]

2024-02-23 Thread via GitHub
SteveLauC commented on PR #9314: URL: https://github.com/apache/arrow-datafusion/pull/9314#issuecomment-1962240794 Well, centralizing the configuration to ParquetExec is not what I want, I simply don't want these 2 predicates to be created if the pushdowns are disabled. Actually

Re: [PR] object_store: (GCP) Add support for Workload Identity Federation from AWS [arrow-rs]

2024-02-23 Thread via GitHub
tustvold closed pull request #3802: object_store: (GCP) Add support for Workload Identity Federation from AWS URL: https://github.com/apache/arrow-rs/pull/3802 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above

Re: [PR] object_store: (GCP) Add support for Workload Identity Federation from AWS [arrow-rs]

2024-02-23 Thread via GitHub
tustvold commented on PR #3802: URL: https://github.com/apache/arrow-rs/pull/3802#issuecomment-1962238777 Whilst we don't have first-party support for this, the CredentialProvider abstractions allow users to extend the credential support out-of-tree. As such, and given this PR hasn't had an

Re: [PR] feat: issue #9282 not creating page_pruning_predicate when pruning is… [arrow-datafusion]

2024-02-23 Thread via GitHub
Lordworms commented on PR #9314: URL: https://github.com/apache/arrow-datafusion/pull/9314#issuecomment-1962238226 > Could you help me to understand the impact of this change? From what I can tell, `ParquetFormat::create_physical_plan()` already checks this config before passing in the opt

Re: [I] Parquet: Match generated RecordBatch number of rows to Parquet row group size [arrow-rs]

2024-02-23 Thread via GitHub
tustvold closed issue #5356: Parquet: Match generated RecordBatch number of rows to Parquet row group size URL: https://github.com/apache/arrow-rs/issues/5356 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above

Re: [I] Parquet: Match generated RecordBatch number of rows to Parquet row group size [arrow-rs]

2024-02-23 Thread via GitHub
tustvold commented on issue #5356: URL: https://github.com/apache/arrow-rs/issues/5356#issuecomment-1962237652 Closing as I believe the question has been answered -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL

Re: [I] fails to build with latest nightly [arrow-rs]

2024-02-23 Thread via GitHub
tustvold commented on issue #5416: URL: https://github.com/apache/arrow-rs/issues/5416#issuecomment-1962237527 Closing as I believe this has been resolved -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above t

Re: [I] fails to build with latest nightly [arrow-rs]

2024-02-23 Thread via GitHub
tustvold closed issue #5416: fails to build with latest nightly URL: https://github.com/apache/arrow-rs/issues/5416 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscri

Re: [I] FlightSQL: consume self in client close [arrow-rs]

2024-02-23 Thread via GitHub
tustvold commented on issue #5421: URL: https://github.com/apache/arrow-rs/issues/5421#issuecomment-1962237215 This appears to have been present in the initial version added by @avantgardnerio, perhaps he might be able to shed some light on its purpose. It doesn't appear to do anything curr

[I] Dyn Comparison of Nested Arrays [arrow-rs]

2024-02-23 Thread via GitHub
tustvold opened a new issue, #5426: URL: https://github.com/apache/arrow-rs/issues/5426 **Is your feature request related to a problem or challenge? Please describe what you are trying to do.** Currently `build_compare` can be used to construct a `DynComparator` that can be used

Re: [PR] GH-40221: [C++][CMake] Use arrow/util/config.h.cmake instead of add_definitions() [arrow]

2024-02-23 Thread via GitHub
github-actions[bot] commented on PR #40222: URL: https://github.com/apache/arrow/pull/40222#issuecomment-1962230409 Revision: ac45e334bb46a922cb0fae6d70917146acf9822f Submitted crossbow builds: [ursacomputing/crossbow @ actions-6cc9fbba1b](https://github.com/ursacomputing/crossbow/bra

[PR] fix: issue #9327 throw error when incursion happen in dataframe api [arrow-datafusion]

2024-02-23 Thread via GitHub
Lordworms opened a new pull request, #9330: URL: https://github.com/apache/arrow-datafusion/pull/9330 ## Which issue does this PR close? #9327 Closes #. ## Rationale for this change ## What changes are included in this PR? ## Are these chang

Re: [PR] GH-39823: [C++] Allow building cpp/src/arrow/**/*.cc without waiting bundled libraries [arrow]

2024-02-23 Thread via GitHub
kou commented on PR #39824: URL: https://github.com/apache/arrow/pull/39824#issuecomment-1962229562 I extracted the `arrow/util/config.h.cmake` part to GH-40221/GH-40222. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use

Re: [PR] GH-31538: [Python][Docs] Documents ParquetWriteOptions class [arrow]

2024-02-23 Thread via GitHub
Divyansh200102 commented on PR #38279: URL: https://github.com/apache/arrow/pull/38279#issuecomment-1962229390 > Could you show the exact command line your used for formatting? Sure ```powershell Divyansh@LAPTOP-8JH824JM MINGW64 ~/Documents/GitHub/arrow (Document-ParquetWrit

Re: [PR] GH-40221: [C++][CMake] Use arrow/util/config.h.cmake instead of add_definitions() [arrow]

2024-02-23 Thread via GitHub
kou commented on PR #40222: URL: https://github.com/apache/arrow/pull/40222#issuecomment-1962229355 @github-actions crossbow submit -g cpp -g r -g linux -g python -g r -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use th

Re: [PR] GH-40221: [C++][CMake] Use arrow/util/config.h.cmake instead of add_definitions() [arrow]

2024-02-23 Thread via GitHub
github-actions[bot] commented on PR #40222: URL: https://github.com/apache/arrow/pull/40222#issuecomment-1962229341 :warning: GitHub issue #40221 **has been automatically assigned in GitHub** to PR creator. -- This is an automated message from the Apache Git Service. To respond to the mes

[PR] GH-40221: [C++][CMake] Use arrow/util/config.h.cmake instead of add_definitions() [arrow]

2024-02-23 Thread via GitHub
kou opened a new pull request, #40222: URL: https://github.com/apache/arrow/pull/40222 ### Rationale for this change It's easy to maintain. ### What changes are included in this PR? Use `#cmakedefine` in `cpp/src/arrow/util/config.h.cmake` and `#include "arrow/util/confi

Re: [I] [C++][CMake] Use "RapidJSON" CMake target for RapidJSON [arrow]

2024-02-23 Thread via GitHub
kou commented on issue #40209: URL: https://github.com/apache/arrow/issues/40209#issuecomment-1962225806 Issue resolved by pull request 40210 https://github.com/apache/arrow/pull/40210 -- This is an automated message from the Apache Git Service. To respond to the message, please log on

Re: [PR] GH-40209: [C++][CMake] Use "RapidJSON" CMake target for RapidJSON [arrow]

2024-02-23 Thread via GitHub
kou commented on PR #40210: URL: https://github.com/apache/arrow/pull/40210#issuecomment-1962225727 +1 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mai

Re: [PR] GH-40209: [C++][CMake] Use "RapidJSON" CMake target for RapidJSON [arrow]

2024-02-23 Thread via GitHub
kou merged PR #40210: URL: https://github.com/apache/arrow/pull/40210 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@arrow.apache.or

Re: [PR] fix: use `JoinSet` to make spawned tasks cancel-safe [arrow-datafusion]

2024-02-23 Thread via GitHub
DDtKey commented on PR #9318: URL: https://github.com/apache/arrow-datafusion/pull/9318#issuecomment-1962221660 The naming is definitely negotiable, I'm not making any claims to the truth with the current version. Subjectively, it looks like this: we `spawn` a task and then we have

Re: [PR] GH-40215: [Format][Docs] Arrow Columnar Format version history docs page [arrow]

2024-02-23 Thread via GitHub
ianmcook commented on PR #40219: URL: https://github.com/apache/arrow/pull/40219#issuecomment-1962221412 Docs preview: http://crossbow.voltrondata.com/pr_docs/40219/format/Columnar.html -- This is an automated message from the Apache Git Service. To respond to the message, please log on t

Re: [PR] arrow-ord: `lt` and `eq` for nested list [arrow-rs]

2024-02-23 Thread via GitHub
tustvold commented on PR #5408: URL: https://github.com/apache/arrow-rs/pull/5408#issuecomment-1962221266 Actually I remember now why DynComparator lacks support for nested types, the ordering of nulls is not well defined. Tricky... 🤔 -- This is an automated message from the Apache Git Se

Re: [PR] feat: support `FixedSizeList` Type Coercion [arrow-datafusion]

2024-02-23 Thread via GitHub
Weijun-H commented on code in PR #9108: URL: https://github.com/apache/arrow-datafusion/pull/9108#discussion_r1501320634 ## datafusion/physical-expr/src/array_expressions.rs: ## @@ -433,6 +435,7 @@ pub fn array_element(args: &[ArrayRef]) -> Result { let indexes = a

Re: [PR] Refactor integer type inference logic to fit smallest type [arrow-rs]

2024-02-23 Thread via GitHub
tustvold commented on PR #5406: URL: https://github.com/apache/arrow-rs/pull/5406#issuecomment-1962219389 Marking as draft as not waiting on review, please feel free to mark as ready for review when you would like me to take another look -- This is an automated message from the Apache Git Serv

Re: [PR] `eq` for struct [arrow-rs]

2024-02-23 Thread via GitHub
tustvold commented on PR #5423: URL: https://github.com/apache/arrow-rs/pull/5423#issuecomment-1962218897 See https://github.com/apache/arrow-rs/pull/5408#issuecomment-1962218269 I think we should probably start by adding support to DynComparator. In the case of StructArray there is likely

Re: [PR] arrow-ord: `lt` and `eq` for nested list [arrow-rs]

2024-02-23 Thread via GitHub
tustvold commented on PR #5408: URL: https://github.com/apache/arrow-rs/pull/5408#issuecomment-1962218269 Ok, so I have had a play with this and I think I would recommend the following course of action: * Add support to DynComparator for List / LargeList / etc... * Add a fallback m

Re: [PR] arrow-ord: `lt` and `eq` for nested list [arrow-rs]

2024-02-23 Thread via GitHub
tustvold commented on code in PR #5408: URL: https://github.com/apache/arrow-rs/pull/5408#discussion_r1501313910 ## arrow-ord/src/cmp.rs: ## @@ -702,4 +784,216 @@ mod tests { neq(&col.slice(0, col.len() - 1), &col.slice(1, col.len() - 1)).unwrap(); } + Review C

Re: [PR] arrow-ord: `lt` and `eq` for nested list [arrow-rs]

2024-02-23 Thread via GitHub
jayzhan211 commented on code in PR #5408: URL: https://github.com/apache/arrow-rs/pull/5408#discussion_r1501315441 ## arrow-ord/src/cmp.rs: ## @@ -702,4 +784,216 @@ mod tests { neq(&col.slice(0, col.len() - 1), &col.slice(1, col.len() - 1)).unwrap(); } + Review

Re: [PR] fix: use `JoinSet` to make spawned tasks cancel-safe [arrow-datafusion]

2024-02-23 Thread via GitHub
DDtKey commented on PR #9318: URL: https://github.com/apache/arrow-datafusion/pull/9318#issuecomment-1962205925 But a load of places were already refactored here https://github.com/apache/arrow-datafusion/pull/6750 🤔 We have a clippy warning and can specify to use SpawnedTask in case

Re: [PR] `eq` for struct [arrow-rs]

2024-02-23 Thread via GitHub
jayzhan211 commented on code in PR #5423: URL: https://github.com/apache/arrow-rs/pull/5423#discussion_r1501312449 ## arrow-ord/src/cmp.rs: ## @@ -198,7 +198,43 @@ fn compare_op(op: Op, lhs: &dyn Datum, rhs: &dyn Datum) -> Result { +return (0..l.num_columns()).f

Re: [PR] `eq` for struct [arrow-rs]

2024-02-23 Thread via GitHub
tustvold commented on code in PR #5423: URL: https://github.com/apache/arrow-rs/pull/5423#discussion_r1501310961 ## arrow-ord/src/cmp.rs: ## @@ -198,7 +198,43 @@ fn compare_op(op: Op, lhs: &dyn Datum, rhs: &dyn Datum) -> Result { +return (0..l.num_columns()).fol

Re: [PR] fix: use `JoinSet` to make spawned tasks cancel-safe [arrow-datafusion]

2024-02-23 Thread via GitHub
tustvold commented on PR #9318: URL: https://github.com/apache/arrow-datafusion/pull/9318#issuecomment-1962201346 > . AbortOnDrop didn't provide the same guarantees, and easily can be misused. How about adding an AbortOnDrop::spawn method that handles this, and potentially deprecate

Re: [PR] `eq` for struct [arrow-rs]

2024-02-23 Thread via GitHub
jayzhan211 commented on code in PR #5423: URL: https://github.com/apache/arrow-rs/pull/5423#discussion_r1501308440 ## arrow-ord/src/cmp.rs: ## @@ -198,7 +198,38 @@ fn compare_op(op: Op, lhs: &dyn Datum, rhs: &dyn Datum) -> Result { +let mut res = vec![true; len]

Re: [I] Remove usages of lexical_core for parsing integers [arrow-rs]

2024-02-23 Thread via GitHub
tustvold commented on issue #5422: URL: https://github.com/apache/arrow-rs/issues/5422#issuecomment-1962199766 Switching to atoi seems sensible to me, this will likely regress performance, but we could potentially explore an option to switch to an unchecked version (I believe `atoi` support

Re: [I] Parquet: Bring ArrowWriter methods over to AsyncArrowWriter [arrow-rs]

2024-02-23 Thread via GitHub
tustvold closed issue #5099: Parquet: Bring ArrowWriter methods over to AsyncArrowWriter URL: https://github.com/apache/arrow-rs/issues/5099 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the speci

Re: [PR] Bring some methods over from ArrowWriter to the async version [arrow-rs]

2024-02-23 Thread via GitHub
tustvold merged PR #5251: URL: https://github.com/apache/arrow-rs/pull/5251 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@arrow.apa

Re: [PR] `eq` for struct [arrow-rs]

2024-02-23 Thread via GitHub
tustvold commented on code in PR #5423: URL: https://github.com/apache/arrow-rs/pull/5423#discussion_r1501306029 ## arrow-ord/src/cmp.rs: ## @@ -210,17 +210,19 @@ fn compare_op(op: Op, lhs: &dyn Datum, rhs: &dyn Datum) -> Result { -let mut res = BooleanArray::fr

Re: [PR] feat(r): Add bindings for IPC reader [arrow-nanoarrow]

2024-02-23 Thread via GitHub
paleolimbot commented on PR #390: URL: https://github.com/apache/arrow-nanoarrow/pull/390#issuecomment-1962195050 The ADBC drivers also use that pattern ( https://github.com/apache/arrow-adbc/tree/main/r )...if there's a good place to PR to either add those to the Apache universe or enable

Re: [PR] feat(r): Add bindings for IPC reader [arrow-nanoarrow]

2024-02-23 Thread via GitHub
paleolimbot commented on PR #390: URL: https://github.com/apache/arrow-nanoarrow/pull/390#issuecomment-1962190709 Thanks for the heads up! I don't see the output of `Rscript bootstrap.R` in the log...after this PR there are too many files (flatcc includes) to `curl` as a fallback, so

Re: [PR] fix: use `JoinSet` to make spawned tasks cancel-safe [arrow-datafusion]

2024-02-23 Thread via GitHub
DDtKey commented on PR #9318: URL: https://github.com/apache/arrow-datafusion/pull/9318#issuecomment-1962176561 I actually found the current interface to have less boilerplate, but anyway, that's not the main point. Safety is more important. `AbortOnDrop` didn't provide the same guarantees, we eve

Re: [PR] [object_store] Enables anonymous access for MicrosoftAzure store [arrow-rs]

2024-02-23 Thread via GitHub
tustvold commented on code in PR #5425: URL: https://github.com/apache/arrow-rs/pull/5425#discussion_r1501283438 ## object_store/src/azure/credential.rs: ## @@ -127,6 +127,10 @@ pub enum AzureCredential { /// ///

Re: [I] OOB access in `Buffer::from_iter` [arrow-rs]

2024-02-23 Thread via GitHub
tustvold closed issue #5412: OOB access in `Buffer::from_iter` URL: https://github.com/apache/arrow-rs/issues/5412 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscrib

Re: [PR] Ensure addition/multiplications in when allocating buffers don't overflow [arrow-rs]

2024-02-23 Thread via GitHub
tustvold merged PR #5417: URL: https://github.com/apache/arrow-rs/pull/5417 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@arrow.apa

Re: [PR] `eq` for struct [arrow-rs]

2024-02-23 Thread via GitHub
tustvold commented on code in PR #5423: URL: https://github.com/apache/arrow-rs/pull/5423#discussion_r1501284691 ## arrow-ord/src/cmp.rs: ## @@ -198,7 +198,38 @@ fn compare_op(op: Op, lhs: &dyn Datum, rhs: &dyn Datum) -> Result { +let mut res = vec![true; len];

Re: [PR] fix: use `JoinSet` to make spawned tasks cancel-safe [arrow-datafusion]

2024-02-23 Thread via GitHub
tustvold commented on PR #9318: URL: https://github.com/apache/arrow-datafusion/pull/9318#issuecomment-1962154301 I might be missing something, but what is the issue with the AbortOnDrop interfaces? They seem like less boilerplate than the proposed solution in this PR? -- This is an aut

Re: [I] Comet returns different results for 3 TPCDS queries compared with Spark [arrow-datafusion-comet]

2024-02-23 Thread via GitHub
viirya commented on issue #74: URL: https://github.com/apache/arrow-datafusion-comet/issues/74#issuecomment-1962153753 The latest excluded TPCDS query list: "q34", "q66", "q64", "q71", "q88", "q90", "q96" -- This is an automated message from the Apache Git Service. To respond to the mess

Re: [PR] GH-40215: [Format][Docs] Arrow Columnar Format version history docs page [arrow]

2024-02-23 Thread via GitHub
github-actions[bot] commented on PR #40219: URL: https://github.com/apache/arrow/pull/40219#issuecomment-1962146570 Revision: 29ee31e02f57d3e2cb87d029228ff52017d5d79b Submitted crossbow builds: [ursacomputing/crossbow @ actions-5c8393ebe7](https://github.com/ursacomputing/crossbow/bra

Re: [PR] GH-40215: [Format][Docs] Arrow Columnar Format version history docs page [arrow]

2024-02-23 Thread via GitHub
ianmcook commented on PR #40219: URL: https://github.com/apache/arrow/pull/40219#issuecomment-1962145358 @github-actions crossbow submit preview-docs -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go

Re: [PR] fix: use `JoinSet` to make spawned tasks cancel-safe [arrow-datafusion]

2024-02-23 Thread via GitHub
devinjdangelo commented on PR #9318: URL: https://github.com/apache/arrow-datafusion/pull/9318#issuecomment-1962142430 The `SpawnedTask` abstraction looks great! Agreed that your API is more intuitive and `Vec` is sufficient without an additional wrapper. Thanks again for knocking this out

Re: [PR] GH-40215: [Format][Docs] Arrow Columnar Format version history docs page [arrow]

2024-02-23 Thread via GitHub
ianmcook commented on PR #40219: URL: https://github.com/apache/arrow/pull/40219#issuecomment-1962141152 Docs preview: http://crossbow.voltrondata.com/pr_docs/40219/format/Columnar.html -- This is an automated message from the Apache Git Service. To respond to the message, please log on t

Re: [PR] feat(r): Add bindings for IPC reader [arrow-nanoarrow]

2024-02-23 Thread via GitHub
jeroen commented on PR #390: URL: https://github.com/apache/arrow-nanoarrow/pull/390#issuecomment-1962140841 I think this is causing some issues: https://github.com/r-universe/apache/actions/runs/8024701816/job/21923807244 -- This is an automated message from the Apache Git Service. To re

Re: [I] [JS] The package.json file incorrectly specifies sideEffects: false [arrow]

2024-02-23 Thread via GitHub
domoritz commented on issue #38936: URL: https://github.com/apache/arrow/issues/38936#issuecomment-1962135857 All arrow packages in this repo are released together every three months. This is an important issue but I haven't found a solution yet. -- This is an automated message fr

Re: [I] Nested explain possible via DataFrame API [arrow-datafusion]

2024-02-23 Thread via GitHub
Lordworms commented on issue #9327: URL: https://github.com/apache/arrow-datafusion/issues/9327#issuecomment-1962135372 take -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

[PR] feat: issue_9285: port builtin reg function into datafusion-function-* (1/3 regexpmatch) [arrow-datafusion]

2024-02-23 Thread via GitHub
Lordworms opened a new pull request, #9329: URL: https://github.com/apache/arrow-datafusion/pull/9329 …* crate (1/3: RegexpMatch part) ## Which issue does this PR close? #9328 Closes #. ## Rationale for this change ## What changes are included in th

Re: [I] port reg_related function [arrow-datafusion]

2024-02-23 Thread via GitHub
Lordworms commented on issue #9328: URL: https://github.com/apache/arrow-datafusion/issues/9328#issuecomment-1962130664 take -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

[I] port reg_related function [arrow-datafusion]

2024-02-23 Thread via GitHub
Lordworms opened a new issue, #9328: URL: https://github.com/apache/arrow-datafusion/issues/9328 ### Is your feature request related to a problem or challenge? related to #9285 ### Describe the solution you'd like _No response_ ### Describe alternatives you've con

Re: [PR] build: Add CI for TPCDS queries [arrow-datafusion-comet]

2024-02-23 Thread via GitHub
viirya commented on code in PR #99: URL: https://github.com/apache/arrow-datafusion-comet/pull/99#discussion_r1501264547 ## spark/src/test/scala/org/apache/spark/sql/CometTPCDSQuerySuite.scala: ## @@ -26,7 +26,8 @@ import org.apache.comet.CometConf class CometTPCDSQuerySuite

Re: [PR] build: Add CI for TPCDS queries [arrow-datafusion-comet]

2024-02-23 Thread via GitHub
viirya commented on code in PR #99: URL: https://github.com/apache/arrow-datafusion-comet/pull/99#discussion_r1501264301 ## spark/src/test/scala/org/apache/spark/sql/CometTPCDSQuerySuite.scala: ## @@ -26,7 +26,8 @@ import org.apache.comet.CometConf class CometTPCDSQuerySuite

Re: [PR] GH-40089: [Go] Concurrent Recordset for receiving huge recordset [arrow]

2024-02-23 Thread via GitHub
conbench-apache-arrow[bot] commented on PR #40090: URL: https://github.com/apache/arrow/pull/40090#issuecomment-1962106358 After merging your PR, Conbench analyzed the 5 benchmarking runs that have been run so far on merge-commit 036a22eaff16165dbb8ddbcf4156766079ec7577. There were no

Re: [PR] feat: add support for fixed list wildcard in type signature [arrow-datafusion]

2024-02-23 Thread via GitHub
gruuya commented on code in PR #9312: URL: https://github.com/apache/arrow-datafusion/pull/9312#discussion_r1501242477 ## datafusion/expr/src/type_coercion/functions.rs: ## @@ -372,13 +374,20 @@ fn coerced_from<'a>( List(_) if matches!(type_from, FixedSizeList(_, _)) =>

Re: [PR] GH-40215: [Format][Docs] Arrow Columnar Format version history docs page [arrow]

2024-02-23 Thread via GitHub
github-actions[bot] commented on PR #40219: URL: https://github.com/apache/arrow/pull/40219#issuecomment-1962090961 Revision: 1f79f267a421e36dfbedb5440f0295c09f88c1d3 Submitted crossbow builds: [ursacomputing/crossbow @ actions-101cf35369](https://github.com/ursacomputing/crossbow/bra

Re: [PR] GH-40215: [Format][Docs] Arrow Columnar Format version history docs page [arrow]

2024-02-23 Thread via GitHub
ianmcook commented on PR #40219: URL: https://github.com/apache/arrow/pull/40219#issuecomment-1962089341 @github-actions crossbow submit preview-docs -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go

Re: [I] sql("explain select 1") raise an error [arrow-datafusion]

2024-02-23 Thread via GitHub
l1t1 closed issue #9319: sql("explain select 1") raise an error URL: https://github.com/apache/arrow-datafusion/issues/9319 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To

Re: [PR] GH-40215: [Format][Docs] Arrow Columnar Format version history docs page [arrow]

2024-02-23 Thread via GitHub
github-actions[bot] commented on PR #40219: URL: https://github.com/apache/arrow/pull/40219#issuecomment-1962079225 Revision: 1f79f267a421e36dfbedb5440f0295c09f88c1d3 Submitted crossbow builds: [ursacomputing/crossbow @ actions-3d478d7c91](https://github.com/ursacomputing/crossbow/bra

Re: [PR] feat: Introduce `CometTaskMemoryManager` and native side memory pool [arrow-datafusion-comet]

2024-02-23 Thread via GitHub
viirya commented on PR #83: URL: https://github.com/apache/arrow-datafusion-comet/pull/83#issuecomment-1962078518 Thanks @sunchao. I will review this in the next few days (or week). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and

Re: [PR] GH-40215: [Format][Docs] Arrow Columnar Format version history docs page [arrow]

2024-02-23 Thread via GitHub
ianmcook commented on PR #40219: URL: https://github.com/apache/arrow/pull/40219#issuecomment-1962077158 @github-actions crossbow submit preview-docs -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go

[PR] GH-40215: [Format][Docs] Arrow Columnar Format version history docs page [arrow]

2024-02-23 Thread via GitHub
ianmcook opened a new pull request, #40219: URL: https://github.com/apache/arrow/pull/40219 This adds a **Version History** section near the top of the Arrow Columnar Format docs page. It also adds a few associated link anchors and "New in version" info boxes. * GitHub Issue: #40215 --

[I] Add CI for TPCDS queries [arrow-datafusion-comet]

2024-02-23 Thread via GitHub
viirya opened a new issue, #101: URL: https://github.com/apache/arrow-datafusion-comet/issues/101 ### What is the problem the feature request solves? We need a CI pipeline that could help us verify Comet query correctness. ### Describe the potential solution Add a CI

Re: [PR] feat: Introduce `CometTaskMemoryManager` and native side memory pool [arrow-datafusion-comet]

2024-02-23 Thread via GitHub
sunchao commented on PR #83: URL: https://github.com/apache/arrow-datafusion-comet/pull/83#issuecomment-1962074048 cc @viirya -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific commen
