Re: [PR] GH-20127: [Python] Remove deprecated pyarrow.filesystem legacy implementations [arrow]

2024-01-29 Thread via GitHub
jorisvandenbossche commented on code in PR #39825: URL: https://github.com/apache/arrow/pull/39825#discussion_r1470718399 ## python/pyarrow/fs.py: ## @@ -123,15 +123,6 @@ def _ensure_filesystem( return LocalFileSystem(use_mmap=use_mmap) return PyFil

Re: [I] [Python] Failed to build pyarrow using python:3.10-alpine docker image [arrow]

2024-01-29 Thread via GitHub
hteeyeoh commented on issue #39846: URL: https://github.com/apache/arrow/issues/39846#issuecomment-1916255800 Hi @kou, below are the example of Dockerfile and requirement.txt that I have [Dockerfile.txt](https://github.com/apache/arrow/files/14094462/Dockerfile.txt) [requirement.txt](

Re: [I] [Python] Failed to build pyarrow using python:3.10-alpine docker image [arrow]

2024-01-29 Thread via GitHub
kou commented on issue #39846: URL: https://github.com/apache/arrow/issues/39846#issuecomment-1916243718 Could you provide a `Dockerfile` that reproduces this error? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the

[PR] Handle invalid types for negation [arrow-datafusion]

2024-01-29 Thread via GitHub
trungda opened a new pull request, #9066: URL: https://github.com/apache/arrow-datafusion/pull/9066 ## Which issue does this PR close? https://github.com/apache/arrow-datafusion/issues/9060 Closes #. ## Rationale for this change For non-numeric/non-interval/non-timesta

Re: [PR] [MINOR] Alter a SHJ test for relaxing "on" condition [arrow-datafusion]

2024-01-29 Thread via GitHub
jackwener merged PR #9065: URL: https://github.com/apache/arrow-datafusion/pull/9065

Re: [PR] Relax join keys constraint from Column to any physical expression for physical join operators [arrow-datafusion]

2024-01-29 Thread via GitHub
jackwener commented on PR #8991: URL: https://github.com/apache/arrow-datafusion/pull/8991#issuecomment-1916225662 I think use `expr` instead of `column` in `condition` is ok for me. Some databases materialize `expressions` into `columns` in projects mainly to facilitate the calculat

Re: [PR] GH-39845: [C++][Parquet] Minor: avoid creating a new Reader object in Decoder::SetData [arrow]

2024-01-29 Thread via GitHub
github-actions[bot] commented on PR #39847: URL: https://github.com/apache/arrow/pull/39847#issuecomment-1916211268 :warning: GitHub issue #39845 **has been automatically assigned in GitHub** to PR creator.

[PR] GH-39845: [C++][Parquet] Minor: avoid creating a new Reader object in Decoder::SetData [arrow]

2024-01-29 Thread via GitHub
mapleFU opened a new pull request, #39847: URL: https://github.com/apache/arrow/pull/39847 ### Rationale for this change avoid creating a new Reader object in Decoder::SetData ### What changes are included in this PR? avoid creating a new Reader ob

Re: [PR] MINOR: [Java] Bump io.netty:netty-bom from 4.1.105.Final to 4.1.106.Final in /java [arrow]

2024-01-29 Thread via GitHub
conbench-apache-arrow[bot] commented on PR #39834: URL: https://github.com/apache/arrow/pull/39834#issuecomment-1916199425 After merging your PR, Conbench analyzed the 5 benchmarking runs that have been run so far on merge-commit 91d65b79f71a1be6a0bf7426e0ee91dd2e65a852. There were no

[PR] [MINOR] Alter a SHJ test for relaxing "on" condition [arrow-datafusion]

2024-01-29 Thread via GitHub
metesynnada opened a new pull request, #9065: URL: https://github.com/apache/arrow-datafusion/pull/9065 ## Which issue does this PR close? Alter a test to check the functionality of https://github.com/apache/arrow-datafusion/pull/8991#issuecomment-1915736893 ## Rational

Re: [PR] Return proper number of expressions for nth_value_agg [arrow-datafusion]

2024-01-29 Thread via GitHub
mustafasrepo commented on code in PR #9044: URL: https://github.com/apache/arrow-datafusion/pull/9044#discussion_r1470641737 ## datafusion/proto/tests/cases/roundtrip_physical_plan.rs: ## @@ -337,16 +338,16 @@ fn rountrip_aggregate() -> Result<()> { DataType::Float6

Re: [PR] Support Copy with Remote Object Stores in datafusion-cli [arrow-datafusion]

2024-01-29 Thread via GitHub
manoj-inukolunu commented on PR #9064: URL: https://github.com/apache/arrow-datafusion/pull/9064#issuecomment-1916134127 Hello @alamb , Here is a draft for https://github.com/apache/arrow-datafusion/issues/8907 . I am yet to add tests. Currently `DefaultObjectStoreRegistry` ``` pub

Re: [I] Parquet: How to do concurrent decoding over columns? [arrow-rs]

2024-01-29 Thread via GitHub
Liyixin95 commented on issue #5120: URL: https://github.com/apache/arrow-rs/issues/5120#issuecomment-1916130932 > Ok I'll have a think about how to approach this and get back to you in a couple of days hello, have you found any approach?

Re: [PR] GH-39843: [C++][Parquet] Parquet binary length overflow exception should contain the length of binary [arrow]

2024-01-29 Thread via GitHub
wgtmac commented on code in PR #39844: URL: https://github.com/apache/arrow/pull/39844#discussion_r1470625648 ## cpp/src/parquet/encoding.cc: ## @@ -2411,7 +2414,11 @@ class DeltaBitPackDecoder : public DecoderImpl, virtual public TypedDecodernum_values_ = num_values; -deco

[PR] Support Copy with Remote Object Stores in datafusion-cli [arrow-datafusion]

2024-01-29 Thread via GitHub
manoj-inukolunu opened a new pull request, #9064: URL: https://github.com/apache/arrow-datafusion/pull/9064 ## Which issue does this PR close? Closes #. ## Rationale for this change ## What changes are included in this PR? ## Are these chang

Re: [PR] GH-39823: [C++] Allow building cpp/src/arrow/**/*.cc without waiting bundled libraries [arrow]

2024-01-29 Thread via GitHub
kou commented on PR #39824: URL: https://github.com/apache/arrow/pull/39824#issuecomment-1916122900 `cpp/src/arrow/**/*.cc` are built before jemalloc is built: https://github.com/apache/arrow/actions/runs/7705131604/job/20998591461?pr=39824#step:6:1465 ```text [48/1143] Bui

Re: [PR] Minor: support `FixedSizeList` type coercion [arrow-datafusion]

2024-01-29 Thread via GitHub
Weijun-H commented on PR #8902: URL: https://github.com/apache/arrow-datafusion/pull/8902#issuecomment-1916094521 @jayzhan211 Thank you for your review

Re: [PR] MINOR: [Java] Bump org.apache.hadoop:hadoop-common from 2.7.1 to 3.3.6 in /java [arrow]

2024-01-29 Thread via GitHub
conbench-apache-arrow[bot] commented on PR #39833: URL: https://github.com/apache/arrow/pull/39833#issuecomment-1916059299 After merging your PR, Conbench analyzed the 5 benchmarking runs that have been run so far on merge-commit 3b8b700348f5d73fa4cfdb2780b0bde5d83a7f22. There were no

Re: [PR] Minor: support `FixedSizeList` type coercion [arrow-datafusion]

2024-01-29 Thread via GitHub
Weijun-H commented on code in PR #8902: URL: https://github.com/apache/arrow-datafusion/pull/8902#discussion_r1470567514 ## datafusion/optimizer/src/analyzer/type_coercion.rs: ## @@ -577,19 +578,14 @@ fn coerce_arguments_for_fun( let mut expressions: Vec = expressions.to_

Re: [PR] GH-39823: [C++] Allow building cpp/src/arrow/**/*.cc without waiting bundled libraries [arrow]

2024-01-29 Thread via GitHub
github-actions[bot] commented on PR #39824: URL: https://github.com/apache/arrow/pull/39824#issuecomment-1916025311 Revision: 9e0a933f7ae607bcd54572505b8add19d6e89227 Submitted crossbow builds: [ursacomputing/crossbow @ actions-72cbf5639a](https://github.com/ursacomputing/crossbow/bra

Re: [PR] GH-39823: [C++] Allow building cpp/src/arrow/**/*.cc without waiting bundled libraries [arrow]

2024-01-29 Thread via GitHub
kou commented on PR #39824: URL: https://github.com/apache/arrow/pull/39824#issuecomment-1916023543 @github-actions crossbow submit -g cpp

Re: [I] Eliminating multi-column sort when major column is a one-to-one and monotonic expression [arrow-datafusion]

2024-01-29 Thread via GitHub
Lordworms commented on issue #8838: URL: https://github.com/apache/arrow-datafusion/issues/8838#issuecomment-1916013954 This issue should come with three demands, two new function and one new bug - [x] support disable single expr function (like CAST(x) BIGINT) - [] support disable comp

Re: [PR] GH-39843: [C++][Parquet] Parquet binary length overflow exception should contain the length of binary [arrow]

2024-01-29 Thread via GitHub
github-actions[bot] commented on PR #39844: URL: https://github.com/apache/arrow/pull/39844#issuecomment-1915993355 :warning: GitHub issue #39843 **has been automatically assigned in GitHub** to PR creator.

[PR] GH-39843: [C++][Parquet] Parquet binary length overflow exception should contain the length of binary [arrow]

2024-01-29 Thread via GitHub
mapleFU opened a new pull request, #39844: URL: https://github.com/apache/arrow/pull/39844 ### Rationale for this change See https://github.com/apache/arrow/issues/39843 It will be great to contain a string length in decoder. ### What changes are included

Re: [PR] GH-39841: [GLib] Add support for GLib 2.56 again [arrow]

2024-01-29 Thread via GitHub
kou merged PR #39842: URL: https://github.com/apache/arrow/pull/39842

Re: [PR] GH-39841: [GLib] Add support for GLib 2.56 again [arrow]

2024-01-29 Thread via GitHub
kou commented on PR #39842: URL: https://github.com/apache/arrow/pull/39842#issuecomment-1915982588 +1

Re: [I] [Python] Parsing timestamp with microsecond in CSV fails even with %f [arrow]

2024-01-29 Thread via GitHub
kou commented on issue #39839: URL: https://github.com/apache/arrow/issues/39839#issuecomment-1915981816 The `strptime()` format is processed by C's `strptime()` not Python's `strptime`. So we can't use `%f` in it...

Re: [PR] GH-39712: [Java] Enable code review and formatting code through Spotless Maven plugin [arrow]

2024-01-29 Thread via GitHub
davisusanibar commented on PR #39713: URL: https://github.com/apache/arrow/pull/39713#issuecomment-1915978100 > The documentation has been added, please review if more details need to be added. > > CC: @vibhatha @lidavidm @danepitkin File at: https://github.com/apache/arrow/bl

Re: [PR] GH-39712: [Java] Enable code review and formatting code through Spotless Maven plugin [arrow]

2024-01-29 Thread via GitHub
davisusanibar commented on PR #39713: URL: https://github.com/apache/arrow/pull/39713#issuecomment-1915975328 The documentation has been added, please review if more details need to be added. CC: @vibhatha @lidavidm @danepitkin

Re: [PR] Minor: support `FixedSizeList` type coercion [arrow-datafusion]

2024-01-29 Thread via GitHub
jayzhan211 commented on code in PR #8902: URL: https://github.com/apache/arrow-datafusion/pull/8902#discussion_r1470525879 ## datafusion/optimizer/src/analyzer/type_coercion.rs: ## @@ -577,19 +578,14 @@ fn coerce_arguments_for_fun( let mut expressions: Vec = expressions.t

Re: [I] [Python][FlightRPC] Feature request: Add some sort of fork protection to PyArrow Flight [arrow]

2024-01-29 Thread via GitHub
amoeba commented on issue #38617: URL: https://github.com/apache/arrow/issues/38617#issuecomment-1915973767 > When you say "Forking then using a FlightClient from the child process" above, was the FlightClient created in the parent before forking or in the child? Yeah, I should've ma

Re: [PR] doc: update Rust version compatibility [arrow-datafusion]

2024-01-29 Thread via GitHub
Weijun-H commented on code in PR #9062: URL: https://github.com/apache/arrow-datafusion/pull/9062#discussion_r1470525269 ## README.md: ## @@ -94,4 +94,4 @@ Optional features: ## Rust Version Compatibility -This crate is tested with the latest stable version of Rust. We do n

[I] Read row groups with random access from parquet files [arrow-rs]

2024-01-29 Thread via GitHub
xmakro opened a new issue, #5343: URL: https://github.com/apache/arrow-rs/issues/5343 **Is your feature request related to a problem or challenge? Please describe what you are trying to do.** I'd like to read row groups from a parquet file with random access. Ideally both for async a

Re: [PR] MINOR: [Java] Bump org.apache.maven.plugins:maven-gpg-plugin from 1.5 to 3.1.0 in /java [arrow]

2024-01-29 Thread via GitHub
conbench-apache-arrow[bot] commented on PR #39832: URL: https://github.com/apache/arrow/pull/39832#issuecomment-1915929245 After merging your PR, Conbench analyzed the 4 benchmarking runs that have been run so far on merge-commit 7fd59739fddf4b614c68d57e24068542b4cf2884. There were no

[PR] Add extension context to SessionState. [arrow-datafusion]

2024-01-29 Thread via GitHub
yuyang-ok opened a new pull request, #9063: URL: https://github.com/apache/arrow-datafusion/pull/9063 This trying to fix #8926.

Re: [PR] Minor: improve scalar functions document [arrow-datafusion]

2024-01-29 Thread via GitHub
comphead merged PR #9029: URL: https://github.com/apache/arrow-datafusion/pull/9029

Re: [PR] GH-39841: [GLib] Add support for GLib 2.56 again [arrow]

2024-01-29 Thread via GitHub
github-actions[bot] commented on PR #39842: URL: https://github.com/apache/arrow/pull/39842#issuecomment-1915905716 Revision: a43d1ea814c08fb3dc1980053717e48aafa41774 Submitted crossbow builds: [ursacomputing/crossbow @ actions-c00f143729](https://github.com/ursacomputing/crossbow/bra

Re: [PR] doc: update Rust version compatibility [arrow-datafusion]

2024-01-29 Thread via GitHub
comphead commented on PR #9062: URL: https://github.com/apache/arrow-datafusion/pull/9062#issuecomment-1915902379 Related to https://github.com/apache/arrow-datafusion/pull/8997

Re: [PR] GH-39841: [GLib] Add support for GLib 2.56 again [arrow]

2024-01-29 Thread via GitHub
github-actions[bot] commented on PR #39842: URL: https://github.com/apache/arrow/pull/39842#issuecomment-1915900961 :warning: GitHub issue #39841 **has been automatically assigned in GitHub** to PR creator.

Re: [PR] GH-39841: [GLib] Add support for GLib 2.56 again [arrow]

2024-01-29 Thread via GitHub
kou commented on PR #39842: URL: https://github.com/apache/arrow/pull/39842#issuecomment-1915900817 @github-actions crossbow submit -g linux

[PR] doc: update Rust version compatibility [arrow-datafusion]

2024-01-29 Thread via GitHub
comphead opened a new pull request, #9062: URL: https://github.com/apache/arrow-datafusion/pull/9062 ## Which issue does this PR close? Closes #. ## Rationale for this change Updates the Rust version compatibility section to point to minimum supported Rust version ex

[PR] GH-39841: [GLib] Add support for GLib 2.56 again [arrow]

2024-01-29 Thread via GitHub
kou opened a new pull request, #39842: URL: https://github.com/apache/arrow/pull/39842 ### Rationale for this change It's still used in CentOS 7 and AlmaLinux 8. ### What changes are included in this PR? Don't use `g_time_zone_get_identifier()` with GLib < 2.58. ##

Re: [I] How to provide a custom `Context` in `TableProvider`'s `scan`? [arrow-datafusion]

2024-01-29 Thread via GitHub
yuyang-ok commented on issue #8926: URL: https://github.com/apache/arrow-datafusion/issues/8926#issuecomment-1915892318 sure. @alamb

Re: [I] `NestedLoopsJoin` memory tracking may be insufficient [arrow-datafusion]

2024-01-29 Thread via GitHub
yyy1000 commented on issue #8952: URL: https://github.com/apache/arrow-datafusion/issues/8952#issuecomment-1915867842 I'd like a try to help it. :)

Re: [I] How to call a asynchronous function in a synchronous function? [arrow-datafusion]

2024-01-29 Thread via GitHub
xiaobai721 commented on issue #9042: URL: https://github.com/apache/arrow-datafusion/issues/9042#issuecomment-1915848628 @alamb Thanks for your formatting code. Actually I have to spawn several threads because workers have to handle zmq requests, router & dealer module. So I'd like to know

[PR] chore(date_trunc): turn panic into unimplemented error for ambiguous time [arrow-datafusion]

2024-01-29 Thread via GitHub
appletreeisyellow opened a new pull request, #9061: URL: https://github.com/apache/arrow-datafusion/pull/9061 ## Which issue does this PR close? Part of #8899 ## Rationale for this change `date_trunc` panics when time is ambiguous between "daylight saving time" and stand

Re: [PR] GH-39823: [C++] Allow building cpp/src/arrow/**/*.cc without waiting bundled libraries [arrow]

2024-01-29 Thread via GitHub
github-actions[bot] commented on PR #39824: URL: https://github.com/apache/arrow/pull/39824#issuecomment-1915826618 Revision: 6ac50a97caf45580c25bb41b30df5d6b439f86dd Submitted crossbow builds: [ursacomputing/crossbow @ actions-ba496ab751](https://github.com/ursacomputing/crossbow/bra

Re: [PR] GH-39823: [C++] Allow building cpp/src/arrow/**/*.cc without waiting bundled libraries [arrow]

2024-01-29 Thread via GitHub
kou commented on PR #39824: URL: https://github.com/apache/arrow/pull/39824#issuecomment-1915824227 @github-actions crossbow submit -g cpp

Re: [I] [C++] Don't wait for building all thirdparty depenedencies to build Arrow itself [arrow]

2024-01-29 Thread via GitHub
kou commented on issue #39823: URL: https://github.com/apache/arrow/issues/39823#issuecomment-1915812165 `arrow_shared`/`arrow_static` CMake targets depend on all thirdparty libraries (e.g.: `target_link_libraries(arrow_shared PUBLIC Azure::azure-storage-files-datalake)`). If we do it, `cpp

Re: [PR] MINOR: [Java] Bump org.immutables:value from 2.8.2 to 2.10.0 in /java [arrow]

2024-01-29 Thread via GitHub
conbench-apache-arrow[bot] commented on PR #39831: URL: https://github.com/apache/arrow/pull/39831#issuecomment-1915762591 After merging your PR, Conbench analyzed the 4 benchmarking runs that have been run so far on merge-commit fc3278ffb78e6f4f79cd20160bf911efa5a09ba1. There were no

Re: [PR] Relax join keys constraint from Column to any physical expression for physical join operators [arrow-datafusion]

2024-01-29 Thread via GitHub
viirya commented on PR #8991: URL: https://github.com/apache/arrow-datafusion/pull/8991#issuecomment-1915736893 > Yeah, but presumably you will have to update the Join operators in DataFusion to take Expressions (rather than `Column`s) and evaluate them on the inputs I think this is

Re: [PR] Add a make_date function #9024 [arrow-datafusion]

2024-01-29 Thread via GitHub
Omega359 commented on code in PR #9040: URL: https://github.com/apache/arrow-datafusion/pull/9040#discussion_r1470329669 ## datafusion/physical-expr/src/datetime_expressions.rs: ## @@ -497,6 +502,99 @@ pub fn make_current_time( move |_arg| Ok(ColumnarValue::Scalar(ScalarVal

Re: [PR] Add a make_date function #9024 [arrow-datafusion]

2024-01-29 Thread via GitHub
alamb commented on code in PR #9040: URL: https://github.com/apache/arrow-datafusion/pull/9040#discussion_r1470319653 ## datafusion/physical-expr/src/datetime_expressions.rs: ## @@ -497,6 +502,99 @@ pub fn make_current_time( move |_arg| Ok(ColumnarValue::Scalar(ScalarValue:

Re: [PR] Add a make_date function #9024 [arrow-datafusion]

2024-01-29 Thread via GitHub
Omega359 commented on code in PR #9040: URL: https://github.com/apache/arrow-datafusion/pull/9040#discussion_r1470315212 ## datafusion/physical-expr/src/datetime_expressions.rs: ## @@ -497,6 +502,99 @@ pub fn make_current_time( move |_arg| Ok(ColumnarValue::Scalar(ScalarVal

Re: [PR] Relax join keys constraint from Column to any physical expression for physical join operators [arrow-datafusion]

2024-01-29 Thread via GitHub
alamb commented on PR #8991: URL: https://github.com/apache/arrow-datafusion/pull/8991#issuecomment-1915696558 > As this PR relaxes the join key constraints for DataFusion Join operators, we don't need to add such Projection during translating Spark query plan in Comet. Yeah, but pr

Re: [PR] Remove some recursive cloning from logical planning [arrow-datafusion]

2024-01-29 Thread via GitHub
alamb commented on PR #9050: URL: https://github.com/apache/arrow-datafusion/pull/9050#issuecomment-1915694810 Here are the results of running the planning benchmark: Command: ```shell cargo bench --bench sql_planner ``` I compared to 92104a54469c8401b799fb3fa855f3f1cb

Re: [PR] Relax join keys constraint from Column to any physical expression for physical join operators [arrow-datafusion]

2024-01-29 Thread via GitHub
viirya commented on PR #8991: URL: https://github.com/apache/arrow-datafusion/pull/8991#issuecomment-1915694340 So in Spark the physical query plan doesn't have additional Projection under Join operator for the join key expressions, in generally we would like to keep it as it without chang

Re: [PR] feat(go/adbc/driver/flightsql): propagate cookies to sub-clients [arrow-adbc]

2024-01-29 Thread via GitHub
paleolimbot commented on PR #1497: URL: https://github.com/apache/arrow-adbc/pull/1497#issuecomment-1915688418 Thanks for the R changes...before 0.10.0 / Arrow 16 I will try to update arrow/go so that `go mod vendor` doesn't skip those file!

Re: [PR] Remove some recursive cloning from logical planning [arrow-datafusion]

2024-01-29 Thread via GitHub
simonvandel commented on PR #9050: URL: https://github.com/apache/arrow-datafusion/pull/9050#issuecomment-1915686537 > what comes to my mind if we have a `let x = vec![0, 1, 2, 3]` > and call `s.swap_remove(0)` the remaining collection can be either `vec![3, 1, 2]` or `vec![3, 2, 1]`. So
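
Editor's note on the `swap_remove` question quoted above: a minimal sketch, using only the Rust standard library, of what `Vec::swap_remove` leaves behind compared with the order-preserving `Vec::remove`. The helper names `swap_remove_first` and `remove_first` are hypothetical and are not taken from the PR; the input vector mirrors the one in the quoted comment.

```rust
// Illustrative only: helper names are hypothetical, not from the DataFusion PR.

/// Removes index 0 with `swap_remove`; returns the removed element and the rest.
fn swap_remove_first(mut v: Vec<i32>) -> (i32, Vec<i32>) {
    // O(1): the last element is moved into the vacated slot, so order changes.
    let removed = v.swap_remove(0);
    (removed, v)
}

/// Removes index 0 with `remove`; returns the removed element and the rest.
fn remove_first(mut v: Vec<i32>) -> (i32, Vec<i32>) {
    // O(n): every later element shifts left by one, so order is preserved.
    let removed = v.remove(0);
    (removed, v)
}

fn main() {
    let (a, rest_swap) = swap_remove_first(vec![0, 1, 2, 3]);
    let (b, rest_shift) = remove_first(vec![0, 1, 2, 3]);
    println!("swap_remove: removed {a}, remaining {rest_swap:?}");
    println!("remove:      removed {b}, remaining {rest_shift:?}");
}
```

For this input the result of `swap_remove(0)` is deterministic: the last element takes the place of the removed one, so only the relative order of the remaining elements differs from `remove(0)`.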

Re: [PR] Update minimum rust version to 1.71 [arrow-datafusion]

2024-01-29 Thread via GitHub
alamb commented on code in PR #8997: URL: https://github.com/apache/arrow-datafusion/pull/8997#discussion_r1470305188 ## datafusion/core/Cargo.toml: ## @@ -27,7 +27,7 @@ homepage = { workspace = true } repository = { workspace = true } license = { workspace = true } authors =

Re: [PR] GH-34865: [C++][Java][Flight RPC] Add Session management messages [arrow]

2024-01-29 Thread via GitHub
indigophox commented on PR #34817: URL: https://github.com/apache/arrow/pull/34817#issuecomment-1915684555 @lidavidm I think that's about it, then. Saw you set yourself up for another review, should all be final now and good to go.

Re: [PR] ScalarUDF with zero arguments should be provided with one null array as parameter [arrow-datafusion]

2024-01-29 Thread via GitHub
viirya commented on code in PR #9031: URL: https://github.com/apache/arrow-datafusion/pull/9031#discussion_r1470303689 ## datafusion/core/tests/user_defined/user_defined_scalar_functions.rs: ## @@ -392,6 +396,112 @@ async fn test_user_defined_functions_with_alias() -> Result<()

Re: [PR] ScalarUDF with zero arguments should be provided with one null array as parameter [arrow-datafusion]

2024-01-29 Thread via GitHub
viirya commented on code in PR #9031: URL: https://github.com/apache/arrow-datafusion/pull/9031#discussion_r1470303456 ## datafusion/core/tests/user_defined/user_defined_scalar_functions.rs: ## @@ -392,6 +396,112 @@ async fn test_user_defined_functions_with_alias() -> Result<()

[PR] GH-38703: [C++][FS][Azure] Implement DeleteFile() [arrow]

2024-01-29 Thread via GitHub
av8or1 opened a new pull request, #39840: URL: https://github.com/apache/arrow/pull/39840 Modifications to support the deletion of a file. To support the deletion of files on Azure. Resolution of the conflict in ~cpp/src/arrow/filesystem/azurefs_test.cc. Her

Re: [PR] ScalarUDF with zero arguments should be provided with one null array as parameter [arrow-datafusion]

2024-01-29 Thread via GitHub
viirya commented on code in PR #9031: URL: https://github.com/apache/arrow-datafusion/pull/9031#discussion_r1470303232 ## datafusion/core/tests/user_defined/user_defined_scalar_functions.rs: ## @@ -392,6 +396,112 @@ async fn test_user_defined_functions_with_alias() -> Result<()

Re: [PR] ScalarUDF with zero arguments should be provided with one null array as parameter [arrow-datafusion]

2024-01-29 Thread via GitHub
viirya commented on code in PR #9031: URL: https://github.com/apache/arrow-datafusion/pull/9031#discussion_r1470302950 ## datafusion/physical-expr/src/scalar_function.rs: ## @@ -149,6 +153,11 @@ impl PhysicalExpr for ScalarFunctionExpr { { vec![Colu

Re: [PR] Add a make_date function #9024 [arrow-datafusion]

2024-01-29 Thread via GitHub
Omega359 commented on code in PR #9040: URL: https://github.com/apache/arrow-datafusion/pull/9040#discussion_r1470302230 ## datafusion/physical-expr/src/datetime_expressions.rs: ## @@ -497,6 +502,99 @@ pub fn make_current_time( move |_arg| Ok(ColumnarValue::Scalar(ScalarVal

Re: [PR] Remove some recursive cloning from logical planning [arrow-datafusion]

2024-01-29 Thread via GitHub
comphead commented on PR #9050: URL: https://github.com/apache/arrow-datafusion/pull/9050#issuecomment-191567 > We basically drain the vector so there shouldn't be any leftover collection if I am not misunderstanding your concern (we make N calls to swap_remove when dealing with an ope

Re: [PR] Remove custom doubling strategy + add examples to `VecAllocEx` [arrow-datafusion]

2024-01-29 Thread via GitHub
alamb commented on code in PR #9058: URL: https://github.com/apache/arrow-datafusion/pull/9058#discussion_r1470295968 ## datafusion/execution/src/memory_pool/proxy.rs: ## @@ -36,24 +55,37 @@ pub trait VecAllocExt { /// Note this calculation is not recursive, and does not in

Re: [PR] Relax join keys constraint from Column to any physical expression for physical join operators [arrow-datafusion]

2024-01-29 Thread via GitHub
alamb commented on PR #8991: URL: https://github.com/apache/arrow-datafusion/pull/8991#issuecomment-1915670650 > Hmm, except for joining keys, I think you still can list other columns (e.g., the original 4 columns) into selection list? So they are not always able to be removed from shuffle

Re: [I] Internal error when negating strings [arrow-datafusion]

2024-01-29 Thread via GitHub
yyy1000 commented on issue #9060: URL: https://github.com/apache/arrow-datafusion/issues/9060#issuecomment-1915668419 > > Can I help it ? (Or left it as an issue for a new comer) 😃 > > > > > > how about we wait a day or two to see if anyone else is interested ? >

Re: [PR] Add a make_date function #9024 [arrow-datafusion]

2024-01-29 Thread via GitHub
Omega359 commented on code in PR #9040: URL: https://github.com/apache/arrow-datafusion/pull/9040#discussion_r1470289270 ## datafusion/physical-expr/src/datetime_expressions.rs: ## @@ -497,6 +502,99 @@ pub fn make_current_time( move |_arg| Ok(ColumnarValue::Scalar(ScalarVal

Re: [PR] ScalarUDF with zero arguments should be provided with one null array as parameter [arrow-datafusion]

2024-01-29 Thread via GitHub
alamb commented on code in PR #9031: URL: https://github.com/apache/arrow-datafusion/pull/9031#discussion_r1470288307 ## datafusion/core/tests/user_defined/user_defined_scalar_functions.rs: ## @@ -392,6 +396,112 @@ async fn test_user_defined_functions_with_alias() -> Result<()>

Re: [I] Internal error when negating strings [arrow-datafusion]

2024-01-29 Thread via GitHub
alamb commented on issue #9060: URL: https://github.com/apache/arrow-datafusion/issues/9060#issuecomment-1915663223 > Can I help it ? (Or left it as an issue for a new comer) 😃 how about we wait a day or two to see if anyone else is interested ? Thank you !

Re: [I] Internal error when negating strings [arrow-datafusion]

2024-01-29 Thread via GitHub
yyy1000 commented on issue #9060: URL: https://github.com/apache/arrow-datafusion/issues/9060#issuecomment-1915656016 Can I help it ? (Or left it as an issue for a new comer) 😃

Re: [PR] Add a make_date function #9024 [arrow-datafusion]

2024-01-29 Thread via GitHub
alamb commented on code in PR #9040: URL: https://github.com/apache/arrow-datafusion/pull/9040#discussion_r1470271635 ## datafusion/physical-expr/src/datetime_expressions.rs: ## @@ -497,6 +502,99 @@ pub fn make_current_time( move |_arg| Ok(ColumnarValue::Scalar(ScalarValue:

Re: [PR] Remove single_file_output option from FileSinkConfig and Copy statement [arrow-datafusion]

2024-01-29 Thread via GitHub
yyy1000 commented on PR #9041: URL: https://github.com/apache/arrow-datafusion/pull/9041#issuecomment-1915652448 Haha, I'm happy that I could help it. :)

Re: [PR] Remove custom doubling strategy + add examples to `VecAllocEx` [arrow-datafusion]

2024-01-29 Thread via GitHub
kallisti-dev commented on code in PR #9058: URL: https://github.com/apache/arrow-datafusion/pull/9058#discussion_r1470269524 ## datafusion/execution/src/memory_pool/proxy.rs: ## @@ -36,24 +55,37 @@ pub trait VecAllocExt { /// Note this calculation is not recursive, and does

Re: [PR] Feat/make dfschema wrap schemaref [arrow-datafusion]

2024-01-29 Thread via GitHub
alamb commented on code in PR #8905: URL: https://github.com/apache/arrow-datafusion/pull/8905#discussion_r1470266137 ## datafusion/common/src/dfschema.rs: ## @@ -106,10 +107,11 @@ pub type DFSchemaRef = Arc; /// ``` #[derive(Debug, Clone, PartialEq, Eq)] pub struct DFSchema

Re: [PR] Remove custom doubling strategy + add examples to `VecAllocEx` [arrow-datafusion]

2024-01-29 Thread via GitHub
kallisti-dev commented on code in PR #9058: URL: https://github.com/apache/arrow-datafusion/pull/9058#discussion_r1470259744 ## datafusion/execution/src/memory_pool/proxy.rs: ## @@ -36,24 +55,37 @@ pub trait VecAllocExt { /// Note this calculation is not recursive, and does

Re: [PR] Speedup `DFSchema::merge` using HashSet indices [arrow-datafusion]

2024-01-29 Thread via GitHub
alamb commented on PR #9020: URL: https://github.com/apache/arrow-datafusion/pull/9020#issuecomment-1915632345 > Speeds up benchmarks in sql_planner.rs by 18-75%. However, physical planning benchmarks seem to regress by 1.8%. I wonder if that is an acceptable trade? I think

[I] Unnecessary ownership makes it harder to use `RecordBatch::schema` and `Schema::try_merge` than it needs to be [arrow-rs]

2024-01-29 Thread via GitHub
carols10cents opened a new issue, #5342: URL: https://github.com/apache/arrow-rs/issues/5342 **Is your feature request related to a problem or challenge? Please describe what you are trying to do.** I have a bunch of record batches, and I'd like to unify their schemas. I would like t

Re: [PR] Remove some recursive cloning from logical planning [arrow-datafusion]

2024-01-29 Thread via GitHub
alamb commented on code in PR #9050: URL: https://github.com/apache/arrow-datafusion/pull/9050#discussion_r1470254179 ## datafusion/optimizer/src/push_down_filter.rs: ## @@ -519,28 +497,19 @@ fn push_down_all_join( // it always will be the last element, otherwise resul

Re: [PR] GH-39823: [C++] Allow building cpp/src/arrow/**/*.cc without waiting bundled libraries [arrow]

2024-01-29 Thread via GitHub
kou commented on code in PR #39824: URL: https://github.com/apache/arrow/pull/39824#discussion_r1470258915 ## cpp/src/arrow/compute/kernels/CMakeLists.txt: ## @@ -126,7 +130,8 @@ add_arrow_compute_test(aggregate_test SOURCES aggreg

Re: [PR] Remove custom doubling strategy + add examples to `VecAllocEx` [arrow-datafusion]

2024-01-29 Thread via GitHub
kallisti-dev commented on code in PR #9058: URL: https://github.com/apache/arrow-datafusion/pull/9058#discussion_r1470256270 ## datafusion/execution/src/memory_pool/proxy.rs: ## @@ -36,24 +55,37 @@ pub trait VecAllocExt { /// Note this calculation is not recursive, and does

Re: [PR] Remove custom doubling strategy + add examples to `VecAllocEx` [arrow-datafusion]

2024-01-29 Thread via GitHub
kallisti-dev commented on code in PR #9058: URL: https://github.com/apache/arrow-datafusion/pull/9058#discussion_r1470249302 ## datafusion/execution/src/memory_pool/proxy.rs: ## @@ -36,24 +55,37 @@ pub trait VecAllocExt { /// Note this calculation is not recursive, and does

Re: [PR] MINOR: [CI] Bump matlab-actions/setup-matlab from 1 to 2 [arrow]

2024-01-29 Thread via GitHub
sgilmore10 commented on PR #39829: URL: https://github.com/apache/arrow/pull/39829#issuecomment-1915615888 > @kevingurney @sgilmore10 Should we update this and #39830 at once? Hi @kou, Yes, I think it would make sense to bump the versions of both `setup-matlab` and `run-tests

Re: [PR] Split physical_plan_tpch into separate benchmarks [arrow-datafusion]

2024-01-29 Thread via GitHub
matthewmturner commented on code in PR #9043: URL: https://github.com/apache/arrow-datafusion/pull/9043#discussion_r1470244954 ## datafusion/core/benches/sql_planner.rs: ## @@ -224,53 +224,17 @@ fn criterion_benchmark(c: &mut Criterion) { }) }); -let q1_sql =

Re: [I] Document streaming usecase (like `UNBOUNDED` tables) [arrow-datafusion]

2024-01-29 Thread via GitHub
trungda commented on issue #9016: URL: https://github.com/apache/arrow-datafusion/issues/9016#issuecomment-1915607969 I've realized that `StreamEncoding` is not supported for Parquet. Is it intentional?

Re: [PR] ScalarUDF with zero arguments should be provided with one null array as parameter [arrow-datafusion]

2024-01-29 Thread via GitHub
viirya commented on code in PR #9031: URL: https://github.com/apache/arrow-datafusion/pull/9031#discussion_r1470242498 ## datafusion/physical-expr/src/scalar_function.rs: ## @@ -58,6 +58,8 @@ pub struct ScalarFunctionExpr { // and it specifies the effect of an increase or d

Re: [PR] Remove custom doubling strategy + add examples to `VecAllocEx` [arrow-datafusion]

2024-01-29 Thread via GitHub
kallisti-dev commented on code in PR #9058: URL: https://github.com/apache/arrow-datafusion/pull/9058#discussion_r1470242305 ## datafusion/execution/src/memory_pool/proxy.rs: ## @@ -36,24 +55,37 @@ pub trait VecAllocExt { /// Note this calculation is not recursive, and does

Re: [PR] Remove some recursive cloning from logical planning [arrow-datafusion]

2024-01-29 Thread via GitHub
alamb commented on PR #9050: URL: https://github.com/apache/arrow-datafusion/pull/9050#issuecomment-1915605735 > It would be interesting see what the effects are on the planning benchmarks in sql_planning. I will run these and report my findings here

Re: [PR] ScalarUDF with zero arguments should be provided with one null array as parameter [arrow-datafusion]

2024-01-29 Thread via GitHub
viirya commented on code in PR #9031: URL: https://github.com/apache/arrow-datafusion/pull/9031#discussion_r1470241743 ## datafusion/core/tests/user_defined/user_defined_scalar_functions.rs: ## @@ -392,6 +396,112 @@ async fn test_user_defined_functions_with_alias() -> Result<()

Re: [PR] ScalarUDF with zero arguments should be provided with one null array as parameter [arrow-datafusion]

2024-01-29 Thread via GitHub
viirya commented on code in PR #9031: URL: https://github.com/apache/arrow-datafusion/pull/9031#discussion_r1470240693 ## datafusion/core/tests/user_defined/user_defined_scalar_functions.rs: ## @@ -392,6 +396,112 @@ async fn test_user_defined_functions_with_alias() -> Result<()

Re: [PR] ScalarUDF with zero arguments should be provided with one null array as parameter [arrow-datafusion]

2024-01-29 Thread via GitHub
viirya commented on code in PR #9031: URL: https://github.com/apache/arrow-datafusion/pull/9031#discussion_r1470241531 ## datafusion/core/tests/user_defined/user_defined_scalar_functions.rs: ## @@ -392,6 +396,112 @@ async fn test_user_defined_functions_with_alias() -> Result<()

Re: [PR] Relax join keys constraint from Column to any physical expression for physical join operators [arrow-datafusion]

2024-01-29 Thread via GitHub
viirya commented on PR #8991: URL: https://github.com/apache/arrow-datafusion/pull/8991#issuecomment-1915602228 > wouldn't it actually make more sense to compute the expressions prior to the networked shuffle so only 2 columns of data (`lcol_1 + lcol_2` and `rcol_1 + rcol_2`) need to be se

Re: [PR] ScalarUDF with zero arguments should be provided with one null array as parameter [arrow-datafusion]

2024-01-29 Thread via GitHub
alamb commented on code in PR #9031: URL: https://github.com/apache/arrow-datafusion/pull/9031#discussion_r1470226144 ## datafusion/core/tests/user_defined/user_defined_scalar_functions.rs: ## @@ -392,6 +396,112 @@ async fn test_user_defined_functions_with_alias() -> Result<()>

Re: [PR] GH-39837: [Go][Flight] Allow cloning existing cookies in middleware [arrow]

2024-01-29 Thread via GitHub
conbench-apache-arrow[bot] commented on PR #39838: URL: https://github.com/apache/arrow/pull/39838#issuecomment-1915593791 After merging your PR, Conbench analyzed the 1 benchmarking run that has been run so far on merge-commit c2ca9bcedeb004f9d7f5d3e1aafc7b83ce6c1e3f. There were no b
