Re: [I] Remove git-commit-id-maven-plugin [arrow-datafusion-comet]

2024-03-11 Thread via GitHub
SemyonSinchenko commented on issue #191: URL: https://github.com/apache/arrow-datafusion-comet/issues/191#issuecomment-1990890240 Can I make a PR? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to

[PR] Update the comment and Add a check [arrow-datafusion]

2024-03-11 Thread via GitHub
colommar opened a new pull request, #9571: URL: https://github.com/apache/arrow-datafusion/pull/9571 Update the comment on TableProvider::supports_filters_pushdown() to say that the returned vector must have the same size as the filters argument. Add a check at the callsite that the vec ret
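
The kind of call-site check described in this PR can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code from the PR: the `PushDown` enum and the `check_pushdown_len` function are hypothetical stand-ins, and no DataFusion types are used.

```rust
// Hypothetical stand-in for the per-filter answer a table provider returns.
#[derive(Debug, Clone, PartialEq)]
enum PushDown {
    Unsupported,
    Inexact,
    Exact,
}

/// Hypothetical call-site check: the provider must return exactly one
/// answer per filter. On a mismatch, return an error message instead of
/// indexing into a vector of the wrong length later on.
fn check_pushdown_len(
    num_filters: usize,
    results: Vec<PushDown>,
) -> Result<Vec<PushDown>, String> {
    if results.len() != num_filters {
        return Err(format!(
            "supports_filters_pushdown returned {} results for {} filters",
            results.len(),
            num_filters
        ));
    }
    Ok(results)
}
```

Calling `check_pushdown_len(2, vec![PushDown::Exact, PushDown::Inexact])` yields `Ok`, while passing a single result for two filters yields `Err`. Failing early with a message is one possible design; the real check may report the error differently.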

Re: [PR] Systematic Configuration in 'Create External Table' and 'Copy To' Options [arrow-datafusion]

2024-03-11 Thread via GitHub
metesynnada commented on PR #9382: URL: https://github.com/apache/arrow-datafusion/pull/9382#issuecomment-1990867925 I visited @ozankabak review.

Re: [I] Support sum(Null) [arrow-datafusion]

2024-03-11 Thread via GitHub
Ted-Jiang commented on issue #9566: URL: https://github.com/apache/arrow-datafusion/issues/9566#issuecomment-1990862541 DataFusion keeps the same behavior as pg ``` DataFusion CLI v36.0.0 ❯ select sum(Null); Error during planning: No function matches the given name and argument types 'SUM(Null)'.

Re: [I] Add a runtime error check for `TableProvider::supports_filters_pushdown()` (was for "AND" operators does not work) [arrow-datafusion]

2024-03-11 Thread via GitHub
colommar commented on issue #9405: URL: https://github.com/apache/arrow-datafusion/issues/9405#issuecomment-1990860199 Hi! I am a newbie at DataFusion. I want to give it a try! :)

Re: [I] Support sum(Null) [arrow-datafusion]

2024-03-11 Thread via GitHub
Ted-Jiang closed issue #9566: Support sum(Null) URL: https://github.com/apache/arrow-datafusion/issues/9566

Re: [I] Support sum(Null) [arrow-datafusion]

2024-03-11 Thread via GitHub
Ted-Jiang commented on issue #9566: URL: https://github.com/apache/arrow-datafusion/issues/9566#issuecomment-1990855227 But PostgreSQL does not support this ``` Output: psql:commands.sql:17: ERROR: function sum(unknown) is not unique LINE 1: select sum(Null) ^

Re: [I] Enable recursive CTE support by default [arrow-datafusion]

2024-03-11 Thread via GitHub
l1t1 commented on issue #9554: URL: https://github.com/apache/arrow-datafusion/issues/9554#issuecomment-1990789938 it's a bit slow. ``` with recursive t(a) as (select 1 as a union all select 1+a from t where a<1) select count(*) from t; +--+ | COUNT(*) |

Re: [PR] feat: use effective memory size for memory management purpose [arrow-datafusion]

2024-03-11 Thread via GitHub
yjshen closed pull request #9481: feat: use effective memory size for memory management purpose URL: https://github.com/apache/arrow-datafusion/pull/9481

Re: [I] Port `ArrayResize` to `function-arrays` subcrate [arrow-datafusion]

2024-03-11 Thread via GitHub
erenavsarogullari commented on issue #9569: URL: https://github.com/apache/arrow-datafusion/issues/9569#issuecomment-1990685494 take

[PR] Port `ArrayResize` to `function-arrays` subcrate [arrow-datafusion]

2024-03-11 Thread via GitHub
erenavsarogullari opened a new pull request, #9570: URL: https://github.com/apache/arrow-datafusion/pull/9570 ## Which issue does this PR close? Closes #9569. ## What changes are included in this PR? This PR aims to make the following changes in terms of Epic https://github.com/apache

Re: [PR] feat(9493): provide access to FileMetaData for files written with ParquetSink [arrow-datafusion]

2024-03-11 Thread via GitHub
wiedld commented on code in PR #9548: URL: https://github.com/apache/arrow-datafusion/pull/9548#discussion_r1520870685 ## datafusion/core/src/datasource/file_format/parquet.rs: ## @@ -541,6 +542,8 @@ async fn fetch_statistics( pub struct ParquetSink { /// Config options fo

[PR] Add simple Ruby server example [arrow-experiments]

2024-03-11 Thread via GitHub
kou opened a new pull request, #17: URL: https://github.com/apache/arrow-experiments/pull/17 fix apache/arrow#40479

[I] Port `ArrayResize` to `function-arrays` subcrate [arrow-datafusion]

2024-03-11 Thread via GitHub
erenavsarogullari opened a new issue, #9569: URL: https://github.com/apache/arrow-datafusion/issues/9569 ### Is your feature request related to a problem or challenge? The `ArrayResize` function needs to be ported to the new `function-arrays` subcrate in terms of https://github.com/apache/ar

Re: [PR] GH-39444: [C++][Parquet] Fix Segment Fault in Modular Encryption [arrow]

2024-03-11 Thread via GitHub
wgtmac commented on PR #39623: URL: https://github.com/apache/arrow/pull/39623#issuecomment-1990420179 @adamreeve @westonpace Do you want to take another pass?

Re: [PR] GH-40328: [C++][Parquet] Allow use of FileDecryptionProperties after the CryptoFactory is destroyed [arrow]

2024-03-11 Thread via GitHub
wgtmac commented on code in PR #40329: URL: https://github.com/apache/arrow/pull/40329#discussion_r1520846640 ## cpp/src/parquet/encryption/key_management_test.cc: ## @@ -324,6 +324,37 @@ TEST_F(TestEncryptionKeyManagement, KeyRotationWithInternalMaterial) { EXPECT_THROW(thi

Re: [PR] MINOR: [Java] Bump commons-io:commons-io from 2.7 to 2.15.1 in /java [arrow]

2024-03-11 Thread via GitHub
conbench-apache-arrow[bot] commented on PR #40463: URL: https://github.com/apache/arrow/pull/40463#issuecomment-1990382434 After merging your PR, Conbench analyzed the 6 benchmarking runs that have been run so far on merge-commit 10d141ce586245c319d72766f4e16d8dd0b46845. There were no

Re: [PR] build: Run Spark SQL tests for 3.4 [arrow-datafusion-comet]

2024-03-11 Thread via GitHub
sunchao commented on PR #166: URL: https://github.com/apache/arrow-datafusion-comet/pull/166#issuecomment-1990380938 (the CI failure is unrelated)

Re: [PR] build: Run Spark SQL tests for 3.4 [arrow-datafusion-comet]

2024-03-11 Thread via GitHub
sunchao commented on PR #166: URL: https://github.com/apache/arrow-datafusion-comet/pull/166#issuecomment-1990379843 @advancedxy there are still some issues when enabling shuffle in Spark SQL tests. I'll address them separately later in a follow-up. Let me know what you think of the latest

Re: [PR] MINOR: [Java] Bump org.apache.maven.plugins:maven-shade-plugin from 3.2.4 to 3.5.2 in /java [arrow]

2024-03-11 Thread via GitHub
conbench-apache-arrow[bot] commented on PR #40462: URL: https://github.com/apache/arrow/pull/40462#issuecomment-1990368445 After merging your PR, Conbench analyzed the 6 benchmarking runs that have been run so far on merge-commit 8ee9679d401183220a4566681ca7ef9e887ba4d2. There were no

Re: [I] Remove git-commit-id-maven-plugin [arrow-datafusion-comet]

2024-03-11 Thread via GitHub
sunchao commented on issue #191: URL: https://github.com/apache/arrow-datafusion-comet/issues/191#issuecomment-1990260048 `git-commit-id-maven-plugin` is an optional feature so we can consider removing it, cc @snmvaughan

Re: [I] Support sum(Null) [arrow-datafusion]

2024-03-11 Thread via GitHub
Ted-Jiang commented on issue #9566: URL: https://github.com/apache/arrow-datafusion/issues/9566#issuecomment-1990142219 @alamb some situation like will block ``` ❯ SELECT sum(t) FROM (SELECT null AS t) ; Error during planning: No function matches the given name and argu

[PR] Port `ArrayRepeat` to `function-arrays` subcrate [arrow-datafusion]

2024-03-11 Thread via GitHub
erenavsarogullari opened a new pull request, #9568: URL: https://github.com/apache/arrow-datafusion/pull/9568 ## Which issue does this PR close? Closes #9565. ## What changes are included in this PR? This PR aims to make the following changes in terms of Epic https://github.com/apache

Re: [PR] fix: Try to convert a static list into a set in Rust [arrow-datafusion-comet]

2024-03-11 Thread via GitHub
sunchao commented on PR #184: URL: https://github.com/apache/arrow-datafusion-comet/pull/184#issuecomment-1990071870 Merged, thanks

Re: [I] Perform InSet optimization in the native side [arrow-datafusion-comet]

2024-03-11 Thread via GitHub
sunchao closed issue #183: Perform InSet optimization in the native side URL: https://github.com/apache/arrow-datafusion-comet/issues/183

Re: [PR] fix: Try to convert a static list into a set in Rust [arrow-datafusion-comet]

2024-03-11 Thread via GitHub
sunchao merged PR #184: URL: https://github.com/apache/arrow-datafusion-comet/pull/184

Re: [I] Support sum(Null) [arrow-datafusion]

2024-03-11 Thread via GitHub
Ted-Jiang commented on issue #9566: URL: https://github.com/apache/arrow-datafusion/issues/9566#issuecomment-1990071200 I originally thought this would affect subquery-pattern SQL like ``` ❯ create table test as values (1),(2),(3); 0 rows in set. Query took 0.021 seconds. ❯ sel

[PR] fix: incorrect null handling in `range` and `generate_series` [arrow-datafusion]

2024-03-11 Thread via GitHub
jonahgao opened a new pull request, #9567: URL: https://github.com/apache/arrow-datafusion/pull/9567 ## Which issue does this PR close? Closes #9553. ## Rationale for this change If any of the arguments is NULL, these two functions should return NULL. Maintain the
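
The rule stated in this PR description (any NULL argument makes the result NULL) can be illustrated with a small standalone sketch. It assumes `None` models SQL NULL and an exclusive upper bound; `range_or_null` is a hypothetical helper, not DataFusion's implementation, and its zero-step handling is an assumption made only to keep the sketch total.

```rust
/// Hypothetical helper in which `None` models SQL NULL.
/// Returns `None` if any argument is NULL; otherwise returns the values
/// from `start` towards `stop` (exclusive), advancing by `step`.
fn range_or_null(
    start: Option<i64>,
    stop: Option<i64>,
    step: Option<i64>,
) -> Option<Vec<i64>> {
    // `?` propagates NULL: one missing argument makes the whole result NULL.
    let (start, stop, step) = (start?, stop?, step?);
    if step == 0 {
        // Assumption for this sketch only: a zero step yields an empty list.
        return Some(Vec::new());
    }
    let mut out = Vec::new();
    let mut v = start;
    while (step > 0 && v < stop) || (step < 0 && v > stop) {
        out.push(v);
        // Stop instead of overflowing near the i64 limits.
        v = match v.checked_add(step) {
            Some(next) => next,
            None => break,
        };
    }
    Some(out)
}
```

With these assumptions, `range_or_null(None, Some(5), Some(1))` is `None`, whereas fully specified arguments produce a list.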

[I] Support sum(Null) [arrow-datafusion]

2024-03-11 Thread via GitHub
Ted-Jiang opened a new issue, #9566: URL: https://github.com/apache/arrow-datafusion/issues/9566 ### Is your feature request related to a problem or challenge? in spark-sql ``` spark-sql (default)> select sum(Null); NULL Time taken: 1.525 seconds, Fetched 1 row(s) ```
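
The Spark behavior quoted in this issue, where `sum` over only NULL input yields NULL, can be modeled with a short sketch. It assumes `None` stands for SQL NULL; `sum_nullable` is a hypothetical name and is not part of Spark or DataFusion.

```rust
/// Hypothetical model of a NULL-aware sum: NULL inputs are skipped, and the
/// result is NULL (`None`) when no non-NULL value was seen at all.
fn sum_nullable(values: &[Option<i64>]) -> Option<i64> {
    values
        .iter()
        .flatten() // skip NULLs, keep the non-NULL values
        .fold(None, |acc, &v| Some(acc.unwrap_or(0) + v))
}
```

Under this model, `sum_nullable(&[None])` is `None`, while a mix of NULL and non-NULL inputs sums only the non-NULL values.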

Re: [I] [Java][FlightRPC] Flight SQL JDBC driver parameter getting an exception: parameter ordinal 1 out of range [arrow]

2024-03-11 Thread via GitHub
MaoMiMao commented on issue #40118: URL: https://github.com/apache/arrow/issues/40118#issuecomment-1989953937 I encountered the same problem https://github.com/apache/arrow/assets/23352189/7a47ca96-042d-4029-8b97-1f948c810747

Re: [I] [Java][FlightRPC] Flight SQL JDBC driver parameter getting an exception: parameter ordinal 1 out of range [arrow]

2024-03-11 Thread via GitHub
MaoMiMao commented on issue #40118: URL: https://github.com/apache/arrow/issues/40118#issuecomment-1989951147 I encountered the same problem https://github.com/apache/arrow/assets/23352189/bffebd71-937c-4540-94c0-e92c76000ebb

Re: [PR] refactor: unify some plan optimization in CommonSubexprEliminate [arrow-datafusion]

2024-03-11 Thread via GitHub
waynexia commented on code in PR #9556: URL: https://github.com/apache/arrow-datafusion/pull/9556#discussion_r1520779234 ## datafusion/optimizer/src/common_subexpr_eliminate.rs: ## @@ -44,13 +42,13 @@ use datafusion_expr::{col, Expr, ExprSchemable}; /// - DataType of this expre

Re: [PR] fix: Try to convert a static list into a set in rust [arrow-datafusion-comet]

2024-03-11 Thread via GitHub
advancedxy commented on code in PR #184: URL: https://github.com/apache/arrow-datafusion-comet/pull/184#discussion_r1520753574 ## core/src/execution/datafusion/planner.rs: ## @@ -496,7 +493,16 @@ impl PhysicalPlanner { .iter() .map(|x|

Re: [PR] Feat/make dfschema wrap schemaref [arrow-datafusion]

2024-03-11 Thread via GitHub
haohuaijin commented on PR #8905: URL: https://github.com/apache/arrow-datafusion/pull/8905#issuecomment-1989869443 @matthewmturner, thank you very much for your reply and openness to my participation in this PR. I'm excited for the opportunity to help move this project forward. In the

Re: [PR] Support ORDER BY in AggregateUDF [arrow-datafusion]

2024-03-11 Thread via GitHub
jayzhan211 commented on PR #9249: URL: https://github.com/apache/arrow-datafusion/pull/9249#issuecomment-1989859176 Short summary for review: I think the issue currently under discussion is how we can have a flexible design for `create_accumulator`. FirstAccumulator introduces `ignore_nul

Re: [PR] Port `ArraySort` to `function-arrays` subcrate [arrow-datafusion]

2024-03-11 Thread via GitHub
jayzhan211 commented on PR #9551: URL: https://github.com/apache/arrow-datafusion/pull/9551#issuecomment-1989835737 Thanks @erenavsarogullari

Re: [PR] Port `ArraySort` to `function-arrays` subcrate [arrow-datafusion]

2024-03-11 Thread via GitHub
jayzhan211 merged PR #9551: URL: https://github.com/apache/arrow-datafusion/pull/9551

Re: [I] Port `ArraySort` to `function-arrays` subcrate [arrow-datafusion]

2024-03-11 Thread via GitHub
jayzhan211 closed issue #9550: Port `ArraySort` to `function-arrays` subcrate URL: https://github.com/apache/arrow-datafusion/issues/9550

Re: [PR] GH-40205: [Python] ListView arrow-to-pandas conversion [arrow]

2024-03-11 Thread via GitHub
danepitkin commented on code in PR #40482: URL: https://github.com/apache/arrow/pull/40482#discussion_r1520728304 ## python/pyarrow/src/arrow/python/arrow_to_pandas.cc: ## @@ -826,6 +796,145 @@ Status ConvertListsLike(PandasOptions options, const ChunkedArray& data, return S

Re: [PR] GH-40205: [Python] ListView arrow-to-pandas conversion [arrow]

2024-03-11 Thread via GitHub
danepitkin commented on code in PR #40482: URL: https://github.com/apache/arrow/pull/40482#discussion_r1520729931 ## cpp/src/arrow/array/array_nested.cc: ## @@ -351,122 +481,21 @@ Result> FlattenListViewArray(const ListViewArrayT& list_v return Concatenate(slices, memory_poo

Re: [PR] GH-40205: [Python] ListView arrow-to-pandas conversion [arrow]

2024-03-11 Thread via GitHub
danepitkin commented on code in PR #40482: URL: https://github.com/apache/arrow/pull/40482#discussion_r1520727478 ## cpp/src/arrow/array/array_nested.h: ## @@ -58,6 +58,16 @@ void SetListData(VarLengthListLikeArray* self, const std::shared_ptr& data,

Re: [PR] GH-40205: [Python] ListView arrow-to-pandas conversion [arrow]

2024-03-11 Thread via GitHub
github-actions[bot] commented on PR #40482: URL: https://github.com/apache/arrow/pull/40482#issuecomment-1989801966 :warning: GitHub issue #40205 **has been automatically assigned in GitHub** to PR creator.

Re: [I] Port `ArrayRepeat` to `function-arrays` subcrate [arrow-datafusion]

2024-03-11 Thread via GitHub
erenavsarogullari commented on issue #9565: URL: https://github.com/apache/arrow-datafusion/issues/9565#issuecomment-1989801147 take

[PR] GH-40205: [Python] ListView arrow-to-pandas conversion [arrow]

2024-03-11 Thread via GitHub
danepitkin opened a new pull request, #40482: URL: https://github.com/apache/arrow/pull/40482 # WIP Seeking early feedback. ### Rationale for this change ListView should support converting to pandas/numpy in pyarrow. ### What changes are included in this PR?

[I] Port `ArrayRepeat` to `function-arrays` subcrate [arrow-datafusion]

2024-03-11 Thread via GitHub
erenavsarogullari opened a new issue, #9565: URL: https://github.com/apache/arrow-datafusion/issues/9565 ### Is your feature request related to a problem or challenge? The `ArrayRepeat` function needs to be ported to the new `function-arrays` subcrate in terms of https://github.com/apache/ar

Re: [PR] [task #9539] Move starts_with, to_hex, trim, upper to datafusion-func… [arrow-datafusion]

2024-03-11 Thread via GitHub
Tangruilin commented on code in PR #9541: URL: https://github.com/apache/arrow-datafusion/pull/9541#discussion_r1520706223 ## datafusion/functions/src/string/starts_with.rs: ## @@ -0,0 +1,92 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributo

Re: [PR] feat(python): Add user-facing `Array` class [arrow-nanoarrow]

2024-03-11 Thread via GitHub
codecov-commenter commented on PR #396: URL: https://github.com/apache/arrow-nanoarrow/pull/396#issuecomment-1989757380 ## [Codecov](https://app.codecov.io/gh/apache/arrow-nanoarrow/pull/396?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_t

Re: [PR] feat(python): Add user-facing `Array` class [arrow-nanoarrow]

2024-03-11 Thread via GitHub
paleolimbot commented on code in PR #396: URL: https://github.com/apache/arrow-nanoarrow/pull/396#discussion_r1520691818 ## python/src/nanoarrow/array.py: ## @@ -0,0 +1,243 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements

[I] Support Rust UDF [arrow-ballista]

2024-03-11 Thread via GitHub
yongda-fan opened a new issue, #993: URL: https://github.com/apache/arrow-ballista/issues/993 **Is your feature request related to a problem or challenge? Please describe what you are trying to do.** A clear and concise description of what the problem is. Ex. I'm always frustrated when [

Re: [PR] feat(python): Add user-facing `Array` class [arrow-nanoarrow]

2024-03-11 Thread via GitHub
paleolimbot commented on code in PR #396: URL: https://github.com/apache/arrow-nanoarrow/pull/396#discussion_r1520689687 ## python/src/nanoarrow/_lib.pyx: ## @@ -1066,10 +1068,15 @@ cdef class CArray: return out def __getitem__(self, k): +self._assert_val

Re: [PR] feat(python): Add user-facing `Array` class [arrow-nanoarrow]

2024-03-11 Thread via GitHub
paleolimbot commented on code in PR #396: URL: https://github.com/apache/arrow-nanoarrow/pull/396#discussion_r1520689186 ## python/src/nanoarrow/array.py: ## @@ -0,0 +1,243 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements

Re: [PR] feat(9493): provide access to FileMetaData for files written with ParquetSink [arrow-datafusion]

2024-03-11 Thread via GitHub
devinjdangelo commented on code in PR #9548: URL: https://github.com/apache/arrow-datafusion/pull/9548#discussion_r1520647339 ## datafusion/core/src/datasource/file_format/parquet.rs: ## @@ -1789,4 +1816,182 @@ mod tests { let format = ParquetFormat::default();

Re: [PR] Port `ArrayDistinct` to `function-arrays` subcrate [arrow-datafusion]

2024-03-11 Thread via GitHub
erenavsarogullari commented on code in PR #9549: URL: https://github.com/apache/arrow-datafusion/pull/9549#discussion_r1520674606 ## docs/source/user-guide/sql/scalar_functions.md: ## @@ -2204,6 +2204,34 @@ array_dims(array) - list_dims +### `array_distinct` Review Comment

Re: [PR] Port `ArraySort` to `function-arrays` subcrate [arrow-datafusion]

2024-03-11 Thread via GitHub
erenavsarogullari commented on code in PR #9551: URL: https://github.com/apache/arrow-datafusion/pull/9551#discussion_r1520650527 ## datafusion/functions-array/src/udf.rs: ## @@ -286,6 +286,70 @@ impl ScalarUDFImpl for ArrayDims { } } +make_udf_function!( +ArraySort,

Re: [PR] Port `ArraySort` to `function-arrays` subcrate [arrow-datafusion]

2024-03-11 Thread via GitHub
erenavsarogullari commented on code in PR #9551: URL: https://github.com/apache/arrow-datafusion/pull/9551#discussion_r1520650021 ## datafusion/functions-array/src/udf.rs: ## @@ -286,6 +286,70 @@ impl ScalarUDFImpl for ArrayDims { } } +make_udf_function!( +ArraySort,

[I] Parallel read for json compressed files when it should not [arrow-datafusion]

2024-03-11 Thread via GitHub
yongda-fan opened a new issue, #9564: URL: https://github.com/apache/arrow-datafusion/issues/9564 ### Describe the bug When reading a compressed JSON file with `repartition_file_scans = true` (the default value), DataFusion tries to decompress the file with parallel reads. This will cause
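
One way to picture the intended behavior (no parallel byte-range reads for compressed files) is the sketch below. It is an illustration under stated assumptions and not DataFusion's planner code: the `Compression` enum, the `byte_ranges` function, and the even-split strategy are all hypothetical.

```rust
// Hypothetical compression marker for a single input file.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Compression {
    Uncompressed,
    Gzip,
}

/// Hypothetical planner helper: split a file into byte ranges for parallel
/// reading only when it is uncompressed. A compressed stream generally
/// cannot be decoded from an arbitrary offset, so it stays as one range.
fn byte_ranges(
    file_len: u64,
    target_partitions: u64,
    compression: Compression,
) -> Vec<(u64, u64)> {
    if compression != Compression::Uncompressed
        || target_partitions <= 1
        || file_len == 0
    {
        return vec![(0, file_len)];
    }
    // Ceiling division so that the ranges cover the whole file.
    let chunk = (file_len + target_partitions - 1) / target_partitions;
    let mut ranges = Vec::new();
    let mut start = 0;
    while start < file_len {
        let end = (start + chunk).min(file_len);
        ranges.push((start, end));
        start = end;
    }
    ranges
}
```

In this sketch a 100-byte `Gzip` file always maps to the single range `(0, 100)`, while an uncompressed file of the same size is split across the requested partitions.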

Re: [PR] MINOR: [Java] Bump org.codehaus.mojo:properties-maven-plugin from 1.1.0 to 1.2.1 in /java [arrow]

2024-03-11 Thread via GitHub
conbench-apache-arrow[bot] commented on PR #40461: URL: https://github.com/apache/arrow/pull/40461#issuecomment-1989692438 After merging your PR, Conbench analyzed the 5 benchmarking runs that have been run so far on merge-commit 49b5619eed7556fcb3c7625cafc80e92d9d3f6c5. There were no

Re: [PR] MINOR: [Java] Bump org.apache.commons:commons-dbcp2 from 2.9.0 to 2.12.0 in /java [arrow]

2024-03-11 Thread via GitHub
conbench-apache-arrow[bot] commented on PR #40460: URL: https://github.com/apache/arrow/pull/40460#issuecomment-1989690173 After merging your PR, Conbench analyzed the 6 benchmarking runs that have been run so far on merge-commit 8ed45601f906a5c1b7ff8b5bff53d4cf193ce526. There were no

Re: [PR] GH-40454: [CI][Debian] Update Debian to 12 from 11 [arrow]

2024-03-11 Thread via GitHub
github-actions[bot] commented on PR #40455: URL: https://github.com/apache/arrow/pull/40455#issuecomment-1989672750 Revision: 429fcda9dc0cef01e659d5e9e4735ca1c603f334 Submitted crossbow builds: [ursacomputing/crossbow @ actions-81621861bb](https://github.com/ursacomputing/crossbow/bra

Re: [PR] GH-40454: [CI][Debian] Update Debian to 12 from 11 [arrow]

2024-03-11 Thread via GitHub
kou commented on PR #40455: URL: https://github.com/apache/arrow/pull/40455#issuecomment-1989669555 @github-actions crossbow submit -g python

Re: [PR] Systematic Configuration in 'Create External Table' and 'Copy To' Options [arrow-datafusion]

2024-03-11 Thread via GitHub
ozankabak commented on code in PR #9382: URL: https://github.com/apache/arrow-datafusion/pull/9382#discussion_r1520574998 ## datafusion/sqllogictest/test_files/copy.slt: ## @@ -469,8 +474,8 @@ select * from validate_arrow; # Error cases: # Copy from table with options -query

Re: [I] [C++] Crashed at TempStack alloc when use Hashing32::HashBatch independently [arrow]

2024-03-11 Thread via GitHub
kou commented on issue #40431: URL: https://github.com/apache/arrow/issues/40431#issuecomment-1989663465 Sorry. I missed this. Thanks. I could run the code: ```diff diff --git a/cpp/src/arrow/compute/key_hash_test.cc b/cpp/src/arrow/compute/key_hash_test.cc index c998df71

Re: [I] Use file statistics in query planning [arrow-datafusion]

2024-03-11 Thread via GitHub
matthewmturner commented on issue #7490: URL: https://github.com/apache/arrow-datafusion/issues/7490#issuecomment-1989650318 I can look into this. @alamb since this issue has been created are you aware of any work that has been completed that would impact this / the proposed solution

[PR] Remove physical expr of NamedStructField, convert to struct function [arrow-datafusion]

2024-03-11 Thread via GitHub
yyy1000 opened a new pull request, #9563: URL: https://github.com/apache/arrow-datafusion/pull/9563 ## Which issue does this PR close? Closes #9532 . ## Rationale for this change ## What changes are included in this PR? ## Are these changes

Re: [PR] build: Run Spark SQL tests for 3.4 [arrow-datafusion-comet]

2024-03-11 Thread via GitHub
viirya commented on code in PR #166: URL: https://github.com/apache/arrow-datafusion-comet/pull/166#discussion_r1520540137 ## .github/workflows/spark_sql_test.yml: ## @@ -0,0 +1,217 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license

Re: [PR] build: Run Spark SQL tests for 3.4 [arrow-datafusion-comet]

2024-03-11 Thread via GitHub
sunchao commented on code in PR #166: URL: https://github.com/apache/arrow-datafusion-comet/pull/166#discussion_r1520535347 ## dev/diffs/3.4.2.diff: ## @@ -0,0 +1,1306 @@ +diff --git a/pom.xml b/pom.xml +index fab98342498..f2156d790d1 100644 +--- a/pom.xml b/pom.xml +@@ -1

Re: [PR] build: Run Spark SQL tests for 3.4 [arrow-datafusion-comet]

2024-03-11 Thread via GitHub
sunchao commented on code in PR #166: URL: https://github.com/apache/arrow-datafusion-comet/pull/166#discussion_r1520534259 ## .github/workflows/spark_sql_test.yml: ## @@ -0,0 +1,217 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license

Re: [PR] build: Run Spark SQL tests for 3.4 [arrow-datafusion-comet]

2024-03-11 Thread via GitHub
sunchao commented on code in PR #166: URL: https://github.com/apache/arrow-datafusion-comet/pull/166#discussion_r1520533983 ## dev/diffs/3.4.2.diff: ## @@ -0,0 +1,1306 @@ +diff --git a/pom.xml b/pom.xml +index fab98342498..f2156d790d1 100644 +--- a/pom.xml b/pom.xml +@@ -1

Re: [PR] GH-40333: [Docs] Improve env var docs for ARROW_USER_SIMD_LEVEL [arrow]

2024-03-11 Thread via GitHub
github-actions[bot] commented on PR #40374: URL: https://github.com/apache/arrow/pull/40374#issuecomment-1989615930 Revision: 471126f28376bdcf2793199527fe4112ac2a74e4 Submitted crossbow builds: [ursacomputing/crossbow @ actions-6fac2d8e3b](https://github.com/ursacomputing/crossbow/bra

Re: [PR] GH-40333: [Docs] Improve env var docs for ARROW_USER_SIMD_LEVEL [arrow]

2024-03-11 Thread via GitHub
amoeba commented on PR #40374: URL: https://github.com/apache/arrow/pull/40374#issuecomment-1989613662 @github-actions crossbow submit preview-docs

Re: [PR] feat: initial support string_view and binary_view, supports layout and basic construction + tests [arrow-rs]

2024-03-11 Thread via GitHub
tustvold commented on code in PR #5481: URL: https://github.com/apache/arrow-rs/pull/5481#discussion_r1520462566 ## arrow-array/src/types.rs: ## @@ -1544,6 +1546,101 @@ pub type BinaryType = GenericBinaryType; /// An arrow binary array with i64 offsets pub type LargeBinaryType

Re: [PR] chore(csharp): bump Google.Cloud.BigQuery.V2 from 3.6.0 to 3.7.0 in /csharp [arrow-adbc]

2024-03-11 Thread via GitHub
lidavidm merged PR #1607: URL: https://github.com/apache/arrow-adbc/pull/1607

Re: [PR] chore(csharp): bump Apache.Arrow from 15.0.0 to 15.0.1 in /csharp [arrow-adbc]

2024-03-11 Thread via GitHub
lidavidm merged PR #1606: URL: https://github.com/apache/arrow-adbc/pull/1606

Re: [PR] Add stateless prepared statements [arrow-rs]

2024-03-11 Thread via GitHub
matthewmturner commented on PR #5497: URL: https://github.com/apache/arrow-rs/pull/5497#issuecomment-1989518393 Apologies, accidentally created this here instead of our fork

Re: [PR] Add stateless prepared statements [arrow-rs]

2024-03-11 Thread via GitHub
matthewmturner commented on PR #5497: URL: https://github.com/apache/arrow-rs/pull/5497#issuecomment-1989514912 This is based on a diff from https://github.com/apache/arrow-rs/pull/5433

Re: [PR] Add stateless prepared statements [arrow-rs]

2024-03-11 Thread via GitHub
matthewmturner closed pull request #5497: Add stateless prepared statements URL: https://github.com/apache/arrow-rs/pull/5497

[PR] Add stateless prepared statements [arrow-rs]

2024-03-11 Thread via GitHub
matthewmturner opened a new pull request, #5497: URL: https://github.com/apache/arrow-rs/pull/5497 # Which issue does this PR close? Closes #. # Rationale for this change # What changes are included in this PR? # Are there any user-facing c

Re: [PR] Change port number to 8008 [arrow-experiments]

2024-03-11 Thread via GitHub
ianmcook merged PR #16: URL: https://github.com/apache/arrow-experiments/pull/16

Re: [I] [C++][FS][Azure] Add AzureFileSystem support to FileSystemFromUri() [arrow]

2024-03-11 Thread via GitHub
kou commented on issue #40028: URL: https://github.com/apache/arrow/issues/40028#issuecomment-1989486778 Issue resolved by pull request 40325 https://github.com/apache/arrow/pull/40325

Re: [PR] GH-40028: [C++][FS][Azure] Add AzureFileSystem support to FileSystemFromUri() [arrow]

2024-03-11 Thread via GitHub
kou merged PR #40325: URL: https://github.com/apache/arrow/pull/40325

Re: [I] [R] Create simple example of R HTTP GET Arrow client [arrow]

2024-03-11 Thread via GitHub
ianmcook commented on issue #40477: URL: https://github.com/apache/arrow/issues/40477#issuecomment-1989481317 Completed in https://github.com/apache/arrow-experiments/pull/1

Re: [PR] GH-40454: [CI][Debian] Update Debian to 12 from 11 [arrow]

2024-03-11 Thread via GitHub
github-actions[bot] commented on PR #40455: URL: https://github.com/apache/arrow/pull/40455#issuecomment-1989479376 Revision: 3a5b56801dc1d1d87eb88cf1a0409f0cd1c9980a Submitted crossbow builds: [ursacomputing/crossbow @ actions-01f19214ef](https://github.com/ursacomputing/crossbow/bra

Re: [I] [Python] Create simple example of Python HTTP GET Arrow server [arrow]

2024-03-11 Thread via GitHub
ianmcook commented on issue #40476: URL: https://github.com/apache/arrow/issues/40476#issuecomment-1989478581 Added support for HTTP/1.1 and chunked transfer encoding in https://github.com/apache/arrow-experiments/pull/12

Re: [I] [Python] Create simple example of Python HTTP GET Arrow server [arrow]

2024-03-11 Thread via GitHub
ianmcook commented on issue #40476: URL: https://github.com/apache/arrow/issues/40476#issuecomment-1989477463 Completed in https://github.com/apache/arrow-experiments/pull/1

Re: [I] [Python] Create simple example of Python HTTP GET Arrow client [arrow]

2024-03-11 Thread via GitHub
ianmcook commented on issue #40475: URL: https://github.com/apache/arrow/issues/40475#issuecomment-1989476319 Completed in https://github.com/apache/arrow-experiments/pull/1

Re: [PR] GH-40454: [CI][Debian] Update Debian to 12 from 11 [arrow]

2024-03-11 Thread via GitHub
kou commented on PR #40455: URL: https://github.com/apache/arrow/pull/40455#issuecomment-1989475457 @github-actions crossbow submit test-debian-12-python-3-amd64 -g python

Re: [PR] refactor: Skipping slicing on shuffle arrays in shuffle reader [arrow-datafusion-comet]

2024-03-11 Thread via GitHub
viirya commented on PR #189: URL: https://github.com/apache/arrow-datafusion-comet/pull/189#issuecomment-1989464629 Merged. Thanks.

Re: [PR] refactor: Skipping slicing on shuffle arrays in shuffle reader [arrow-datafusion-comet]

2024-03-11 Thread via GitHub
viirya merged PR #189: URL: https://github.com/apache/arrow-datafusion-comet/pull/189

Re: [PR] feat(9493): provide access to FileMetaData for files written with ParquetSink [arrow-datafusion]

2024-03-11 Thread via GitHub
alamb commented on code in PR #9548: URL: https://github.com/apache/arrow-datafusion/pull/9548#discussion_r1520436449 ## datafusion/core/src/datasource/file_format/parquet.rs: ## @@ -717,7 +734,18 @@ impl DataSink for ParquetSink { while let Some(result) = file_write_ta

Re: [PR] GH-40376: [Python] Update for NumPy 2.0 ABI change in PyArray_Descr->elsize [arrow]

2024-03-11 Thread via GitHub
jorisvandenbossche commented on PR #40418: URL: https://github.com/apache/arrow/pull/40418#issuecomment-1989453066 ``` C:\Python310\Lib\site-packages\numpy\_core\include\numpy\ndarraytypes.h(1250,38): warning C4200: nonstandard extension used: zero-sized array in struct/union [C:\arrow\

Re: [PR] Support Serde for ScalarUDF in Physical Expressions [arrow-datafusion]

2024-03-11 Thread via GitHub
thinkharderdev commented on PR #9436: URL: https://github.com/apache/arrow-datafusion/pull/9436#issuecomment-1989449071 > > I think it is fine to replace `ScalarFunctionImplementation` with `ScalarUDF` and end up with something like: > > ``` > > pub struct ScalarFunctionExpr { > >

Re: [PR] GH-40376: [Python] Update for NumPy 2.0 ABI change in PyArray_Descr->elsize [arrow]

2024-03-11 Thread via GitHub
jorisvandenbossche commented on PR #40418: URL: https://github.com/apache/arrow/pull/40418#issuecomment-1989447919 @seberg indeed there are windows 2.0.0b1 wheels, and those should be used in the failing build at https://github.com/ursacomputing/crossbow/actions/runs/8235147762/job/22518586

Re: [PR] GH-40333: [Docs] Improve env var docs for ARROW_USER_SIMD_LEVEL [arrow]

2024-03-11 Thread via GitHub
github-actions[bot] commented on PR #40374: URL: https://github.com/apache/arrow/pull/40374#issuecomment-1989447756 Revision: 46bcac8bec4896cec739ebb1151e82f1e54362f2 Submitted crossbow builds: [ursacomputing/crossbow @ actions-25dc790af5](https://github.com/ursacomputing/crossbow/bra

[PR] chore(csharp): bump Apache.Arrow from 15.0.0 to 15.0.1 in /csharp [arrow-adbc]

2024-03-11 Thread via GitHub
dependabot[bot] opened a new pull request, #1606: URL: https://github.com/apache/arrow-adbc/pull/1606 Bumps [Apache.Arrow](https://github.com/apache/arrow) from 15.0.0 to 15.0.1. Commits [5ce6ff4](https://github.com/apache/arrow/commit/5ce6ff434c1e7daaa2d7f134349f3ce4c22683da) M

Re: [PR] GH-40333: [Docs] Improve env var docs for ARROW_USER_SIMD_LEVEL [arrow]

2024-03-11 Thread via GitHub
amoeba commented on PR #40374: URL: https://github.com/apache/arrow/pull/40374#issuecomment-1989444389 Thanks for taking another look at this @pitrou. I think the new paragraph you added addresses both of your most recent comments. With that, I think this is very close if not ready to be me

[PR] chore(csharp): bump Google.Cloud.BigQuery.V2 from 3.6.0 to 3.7.0 in /csharp [arrow-adbc]

2024-03-11 Thread via GitHub
dependabot[bot] opened a new pull request, #1607: URL: https://github.com/apache/arrow-adbc/pull/1607 Bumps [Google.Cloud.BigQuery.V2](https://github.com/googleapis/google-cloud-dotnet) from 3.6.0 to 3.7.0. Commits https://github.com/googleapis/google-cloud-dotnet/commit/bf2b9

Re: [PR] GH-40376: [Python] Update for NumPy 2.0 ABI change in PyArray_Descr->elsize [arrow]

2024-03-11 Thread via GitHub
kou commented on PR #40418: URL: https://github.com/apache/arrow/pull/40418#issuecomment-1989445066 FYI: We can see uploaded wheels at https://anaconda.org/scientific-python-nightly-wheels/numpy/files

Re: [PR] GH-40333: [Docs] Improve env var docs for ARROW_USER_SIMD_LEVEL [arrow]

2024-03-11 Thread via GitHub
amoeba commented on PR #40374: URL: https://github.com/apache/arrow/pull/40374#issuecomment-1989444528 @github-actions crossbow submit preview-docs

Re: [PR] GH-40376: [Python] Update for NumPy 2.0 ABI change in PyArray_Descr->elsize [arrow]

2024-03-11 Thread via GitHub
kou commented on code in PR #40418: URL: https://github.com/apache/arrow/pull/40418#discussion_r1520432018 ## python/pyarrow/src/arrow/python/numpy_interop.h: ## @@ -67,6 +67,13 @@ #define NPY_INT32_IS_INT 0 #endif +// Backported NumPy 2 API (can be removed if numpy 2 is req

Re: [PR] GH-40376: [Python] Update for NumPy 2.0 ABI change in PyArray_Descr->elsize [arrow]

2024-03-11 Thread via GitHub
seberg commented on PR #40418: URL: https://github.com/apache/arrow/pull/40418#issuecomment-1989443829 > numpy-2.0.0b1 should also be good enough Yes for sure. But it might be that the previous `.dev` wheels didn't upload successfully for windows? It seems like there should be `b1`
