Re: [PR] GH-39823: [C++] Allow building cpp/src/arrow/**/*.cc without waiting bundled libraries [arrow]

2024-02-03 Thread via GitHub
kou commented on PR #39824: URL: https://github.com/apache/arrow/pull/39824#issuecomment-1925618665 @github-actions crossbow submit -g cpp -g r -g linux -g python -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL

Re: [PR] GH-39182: [C++] Remove Legacy CastTo function [arrow]

2024-02-03 Thread via GitHub
llama90 commented on code in PR #39192: URL: https://github.com/apache/arrow/pull/39192#discussion_r1477194281 ## cpp/src/arrow/scalar.cc: ## @@ -884,9 +885,24 @@ std::string Scalar::ToString() const { return dict_scalar->value.dictionary->ToString() + "[" + dic

Re: [PR] GH-38309: [C++] build filesystems as separate modules [arrow]

2024-02-03 Thread via GitHub
kou commented on code in PR #39067: URL: https://github.com/apache/arrow/pull/39067#discussion_r1477193180 ## cpp/src/arrow/filesystem/filesystem.cc: ## @@ -738,6 +761,109 @@ Result> FileSystemFromUriReal(const Uri& uri, } // namespace +Status RegisterFileSystemFactory(st

Re: [PR] GH-38309: [C++] build filesystems as separate modules [arrow]

2024-02-03 Thread via GitHub
kou commented on PR #39067: URL: https://github.com/apache/arrow/pull/39067#issuecomment-1925609044 It makes sense.

Re: [PR] GH-39621: [CI][Packaging] Update vcpkg to 2024.01.12 release [arrow]

2024-02-03 Thread via GitHub
kou commented on PR #39622: URL: https://github.com/apache/arrow/pull/39622#issuecomment-1925608175 It seems that `arm64-linux` on aarch64 is treated as cross-compiling... And the guessed compiler command names (`aarch64-linux-gnu-{gcc,g++}`) are wrong (`aarch64-redhat-linux-{gcc,g++}` exist instead)

Re: [I] [GLib] GLib Installation Failure: Errors in gtkdoc Helper Script [arrow]

2024-02-03 Thread via GitHub
llama90 commented on issue #39935: URL: https://github.com/apache/arrow/issues/39935#issuecomment-1925602938 Aha... `/opt/homebrew/etc/xml/catalog` does exist. ```bash ls /opt/homebrew/etc/xml/catalog /opt/homebrew/etc/xml/catalog cat /opt/homebrew/etc/xml/catalog

Re: [PR] GH-39928: [C++][Gandiva] Accept LLVM 18 [arrow]

2024-02-03 Thread via GitHub
kou commented on PR #39934: URL: https://github.com/apache/arrow/pull/39934#issuecomment-1925599908 @niyue Do you want to review this?

Re: [PR] GH-39909: [Java][CI] Update reference to Float16 testing file reference on Testing submodule [arrow]

2024-02-03 Thread via GitHub
kou merged PR #39911: URL: https://github.com/apache/arrow/pull/39911

Re: [PR] GH-39182: [C++] Remove Legacy CastTo function [arrow]

2024-02-03 Thread via GitHub
kou commented on code in PR #39192: URL: https://github.com/apache/arrow/pull/39192#discussion_r1477187843 ## cpp/src/arrow/scalar.cc: ## @@ -884,9 +885,24 @@ std::string Scalar::ToString() const { return dict_scalar->value.dictionary->ToString() + "[" + dict_sc

Re: [PR] GH-39759: [Docs] Update pydata-sphinx-theme to 0.15.x [arrow]

2024-02-03 Thread via GitHub
kou commented on code in PR #39879: URL: https://github.com/apache/arrow/pull/39879#discussion_r1477187035 ## docs/requirements.txt: ## @@ -5,7 +5,7 @@ breathe ipython numpydoc -pydata-sphinx-theme~=0.14 +pydata-sphinx-theme~=0.15.2 Review Comment: Could you try this?

Re: [PR] MINOR: [DOCS] Updated "struct_field" kernel documentation [arrow]

2024-02-03 Thread via GitHub
kou commented on PR #39851: URL: https://github.com/apache/arrow/pull/39851#issuecomment-1925596897 Could you use `GH-33745: ` instead of `MINOR: `?

Re: [PR] GH-39759: [Docs] Update pydata-sphinx-theme to 0.15.x [arrow]

2024-02-03 Thread via GitHub
github-actions[bot] commented on PR #39879: URL: https://github.com/apache/arrow/pull/39879#issuecomment-1925595666 Revision: 102094668f7677aac543c17cd8947330203d17ff Submitted crossbow builds: [ursacomputing/crossbow @ actions-37f7ce3b94](https://github.com/ursacomputing/crossbow/bra

Re: [PR] GH-39759: [Docs] Update pydata-sphinx-theme to 0.15.x [arrow]

2024-02-03 Thread via GitHub
kou commented on PR #39879: URL: https://github.com/apache/arrow/pull/39879#issuecomment-1925595236 @github-actions crossbow submit preview-docs

Re: [I] [GLib] GLib Installation Failure: Errors in gtkdoc Helper Script [arrow]

2024-02-03 Thread via GitHub
kou commented on issue #39935: URL: https://github.com/apache/arrow/issues/39935#issuecomment-1925594500 Hmm. If we use `$(brew --prefix)/etc/xml/catalog`, we don't need to download any files... Does `/opt/homebrew/etc/xml/catalog` exist? It seems that we should migrate our documen

Re: [I] [GLib] GLib Installation Failure: Errors in gtkdoc Helper Script [arrow]

2024-02-03 Thread via GitHub
llama90 commented on issue #39935: URL: https://github.com/apache/arrow/issues/39935#issuecomment-1925592259 The `XML_CATALOG_FILES` environment variable is set as follows. ```bash echo $XML_CATALOG_FILES /opt/homebrew/etc/xml/catalog ``` I have a suspicion that when at

Re: [PR] GH-39936: [C++] Refactor arrow::Datum by std::visit [arrow]

2024-02-03 Thread via GitHub
github-actions[bot] commented on PR #39937: URL: https://github.com/apache/arrow/pull/39937#issuecomment-1925589695 :warning: GitHub issue #39936 **has been automatically assigned in GitHub** to PR creator.

[PR] GH-39936: [C++] Refactor arrow::Datum by std::visit [arrow]

2024-02-03 Thread via GitHub
JackDrogon opened a new pull request, #39937: URL: https://github.com/apache/arrow/pull/39937 Refactor arrow::Datum with std::visit. This improves performance somewhat by avoiding the (Datum::kind() + std::get) pattern, and it also checks types at compile time, so we can guarantee that the type is safe and do not ne
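
The contrast described in this PR summary can be illustrated with a minimal sketch. It uses a hypothetical `Value` variant rather than Arrow's actual `Datum` class, and `LengthByIndex` / `LengthByVisit` are illustrative names that do not come from the PR. The tag-based version branches on a runtime index and then calls `std::get`, which re-checks the active alternative; the visitor version lets the compiler reject any alternative that is not handled.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <type_traits>
#include <variant>

// Hypothetical stand-in for a Datum-like value; this is NOT Arrow's Datum.
struct Empty {};
using Value = std::variant<Empty, std::int64_t, std::string>;

// Tag-based access: branch on the runtime index, then call std::get,
// which checks the active alternative a second time at runtime.
std::int64_t LengthByIndex(const Value& v) {
  switch (v.index()) {
    case 1:
      assert(std::holds_alternative<std::int64_t>(v));
      return 1;  // a single integer counts as length 1 in this sketch
    case 2:
      return static_cast<std::int64_t>(std::get<std::string>(v).size());
    default:
      return 0;  // Empty
  }
}

// Visitor-based access: std::visit dispatches once, and the static_assert
// turns a forgotten alternative into a compile error instead of a runtime one.
std::int64_t LengthByVisit(const Value& v) {
  return std::visit(
      [](const auto& held) -> std::int64_t {
        using T = std::decay_t<decltype(held)>;
        if constexpr (std::is_same_v<T, Empty>) {
          return 0;
        } else if constexpr (std::is_same_v<T, std::int64_t>) {
          return 1;
        } else {
          static_assert(std::is_same_v<T, std::string>, "unhandled alternative");
          return static_cast<std::int64_t>(held.size());
        }
      },
      v);
}
```

Both functions return the same result for every alternative; adding a fourth alternative to `Value` would silently fall into `default` in the first version but fail to compile in the second.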

Re: [I] `ParquetExec::statistics()` does not read statistics for many column types (like timstamps, strings, etc) [arrow-datafusion]

2024-02-03 Thread via GitHub
Weijun-H commented on issue #8295: URL: https://github.com/apache/arrow-datafusion/issues/8295#issuecomment-1925585482 Could I pick this ticket up?

Re: [I] [GLib] GLib Installation Failure: Errors in gtkdoc Helper Script [arrow]

2024-02-03 Thread via GitHub
kou commented on issue #39935: URL: https://github.com/apache/arrow/issues/39935#issuecomment-1925582619 Could you check whether you define `XML_CATALOG_FILES=$(brew --prefix)/etc/xml/catalog` or not? See also: https://github.com/apache/arrow/tree/main/c_glib#how-to-build-by-develope

Re: [I] [C++][Parquet] Static link of parquet failure [arrow]

2024-02-03 Thread via GitHub
kou commented on issue #39918: URL: https://github.com/apache/arrow/issues/39918#issuecomment-1925582287 Could you show all link error messages? > ```text > -larrow -larrow_bundled_dependencies -lparquet > ``` It should be at least `-lparquet -larrow -larrow_bundled_depende

Re: [PR] GH-39928: [C++][Gandiva] Accept LLVM 18 [arrow]

2024-02-03 Thread via GitHub
github-actions[bot] commented on PR #39934: URL: https://github.com/apache/arrow/pull/39934#issuecomment-1925581552 Revision: 8bb0ca0bcd64d86a7ab7ec6eb1d6bf8ed1b59d06 Submitted crossbow builds: [ursacomputing/crossbow @ actions-7c9965348b](https://github.com/ursacomputing/crossbow/bra

Re: [PR] GH-39928: [C++][Gandiva] Accept LLVM 18 [arrow]

2024-02-03 Thread via GitHub
kou commented on PR #39934: URL: https://github.com/apache/arrow/pull/39934#issuecomment-1925581240 @github-actions crossbow submit ubuntu-noble-amd64

Re: [PR] GH-39928: [C++][Gandiva] Accept LLVM 18 [arrow]

2024-02-03 Thread via GitHub
github-actions[bot] commented on PR #39934: URL: https://github.com/apache/arrow/pull/39934#issuecomment-1925581229 :warning: GitHub issue #39928 **has been automatically assigned in GitHub** to PR creator.

[PR] GH-39928: [C++][Gandiva] Accept LLVM 18 [arrow]

2024-02-03 Thread via GitHub
kou opened a new pull request, #39934: URL: https://github.com/apache/arrow/pull/39934 ### Rationale for this change LLVM 18.1 will be released soon. ### What changes are included in this PR? Accept LLVM 18.1. ### Are these changes tested? Yes. ### Ar

Re: [I] [C++] arrow::fs::FileSystemFromUri() not thread-safe with s3 URIs [arrow]

2024-02-03 Thread via GitHub
kou commented on issue #39897: URL: https://github.com/apache/arrow/issues/39897#issuecomment-1925576948 I think that it's application's responsibility by design: https://github.com/apache/arrow/blob/22f2cfd1e1ebe49016b6d97c49f494287a98d02f/cpp/src/arrow/filesystem/s3fs.h#L374-L384
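
If thread-safety is left to the application, one common pattern is to funnel the unsafe step through `std::call_once` before any worker touches the filesystem. The sketch below assumes the unsafe step is a one-time backend initialization, which the truncated comment does not confirm. `InitializeBackend`, `EnsureInitialized`, and `RunWorkers` are hypothetical names and not Arrow APIs; a real application would call the library's own initialization function in place of the counter.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

namespace sketch {

// Counts how often the (hypothetical) backend initialization actually ran.
std::atomic<int> init_count{0};
std::once_flag init_flag;

// Hypothetical placeholder for a non-thread-safe, one-time setup step.
void InitializeBackend() { init_count.fetch_add(1); }

// Safe to call from any thread: std::call_once runs InitializeBackend
// exactly once, and concurrent callers block until it has finished.
void EnsureInitialized() { std::call_once(init_flag, InitializeBackend); }

// Starts num_workers threads that all request initialization, then
// returns how many times the initialization body really executed.
int RunWorkers(int num_workers) {
  assert(num_workers > 0);
  std::vector<std::thread> workers;
  workers.reserve(static_cast<std::size_t>(num_workers));
  for (int i = 0; i < num_workers; ++i) {
    workers.emplace_back([] {
      EnsureInitialized();
      // A real worker would open its filesystem handle here.
    });
  }
  for (auto& worker : workers) {
    worker.join();
  }
  return init_count.load();
}

}  // namespace sketch
```

However many workers race on `EnsureInitialized`, the initialization body runs once, so `RunWorkers` reports a count of 1.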

Re: [PR] GH-39909: [Java][CI] Update reference to Float16 testing file reference on Testing submodule [arrow]

2024-02-03 Thread via GitHub
github-actions[bot] commented on PR #39911: URL: https://github.com/apache/arrow/pull/39911#issuecomment-1925576436 Revision: b4038e0810c053b2b135dab744395dcd2bbcaa4c Submitted crossbow builds: [ursacomputing/crossbow @ actions-4c3794edda](https://github.com/ursacomputing/crossbow/bra

Re: [PR] GH-39930: [C++] Use Requires instead of Libs for system RE2 in arrow.pc [arrow]

2024-02-03 Thread via GitHub
github-actions[bot] commented on PR #39932: URL: https://github.com/apache/arrow/pull/39932#issuecomment-1925576306 Revision: 943d9275d2097a1bb253730b03e92013a5751aca Submitted crossbow builds: [ursacomputing/crossbow @ actions-d1674cc5d7](https://github.com/ursacomputing/crossbow/bra

Re: [PR] GH-39909: [Java][CI] Update reference to Float16 testing file reference on Testing submodule [arrow]

2024-02-03 Thread via GitHub
kou commented on PR #39911: URL: https://github.com/apache/arrow/pull/39911#issuecomment-1925576085 @github-actions crossbow submit -g cpp

Re: [PR] GH-39930: [C++] Use Requires instead of Libs for system RE2 in arrow.pc [arrow]

2024-02-03 Thread via GitHub
github-actions[bot] commented on PR #39932: URL: https://github.com/apache/arrow/pull/39932#issuecomment-1925575900 :warning: GitHub issue #39930 **has been automatically assigned in GitHub** to PR creator.

Re: [PR] GH-39930: [C++] Use Requires instead of Libs for system RE2 in arrow.pc [arrow]

2024-02-03 Thread via GitHub
kou commented on PR #39932: URL: https://github.com/apache/arrow/pull/39932#issuecomment-1925575854 @github-actions crossbow submit -g cpp -g r -g linux

[PR] GH-39930: [C++] Use Requires instead of Libs for system RE2 in arrow.pc [arrow]

2024-02-03 Thread via GitHub
kou opened a new pull request, #39932: URL: https://github.com/apache/arrow/pull/39932 ### Rationale for this change We chose Libs{,.private} with libre2.a for system RE2 in GH-10626, because "Requires{,.private} re2" may add "-std=c++11". If "-std=c++11" was added, users can't build

Re: [PR] GH-39863: [C++] Thirdparty: Bump google benchmark to 1.8.3 [arrow]

2024-02-03 Thread via GitHub
conbench-apache-arrow[bot] commented on PR #39878: URL: https://github.com/apache/arrow/pull/39878#issuecomment-1925570819 Thanks for your patience. Conbench analyzed the 1 benchmarking run that has been run so far on PR commit ddeef70948b2a072db1254ed7846b199ebe747f7. There were 2 be

[PR] MINOR: [C++][Parquet] Remove undefined GetArrowType from schema_internal.h [arrow]

2024-02-03 Thread via GitHub
wgtmac opened a new pull request, #39931: URL: https://github.com/apache/arrow/pull/39931 ### Rationale for this change We have redundant declarations below and the 1st one should be removed: ```cpp Result> GetArrowType(Type::type physical_type,

Re: [PR] GH-39863: [C++] Thirdparty: Bump google benchmark to 1.8.3 [arrow]

2024-02-03 Thread via GitHub
ursabot commented on PR #39878: URL: https://github.com/apache/arrow/pull/39878#issuecomment-1925556695 Benchmark runs are scheduled for commit ddeef70948b2a072db1254ed7846b199ebe747f7. Watch https://buildkite.com/apache-arrow and https://conbench.ursa.dev for updates. A comment will be po

Re: [PR] GH-39863: [C++] Thirdparty: Bump google benchmark to 1.8.3 [arrow]

2024-02-03 Thread via GitHub
Cerdore commented on PR #39878: URL: https://github.com/apache/arrow/pull/39878#issuecomment-1925556681 @ursabot please benchmark lang=C++

Re: [PR] GH-39863: [C++] Thirdparty: Bump google benchmark to 1.8.3 [arrow]

2024-02-03 Thread via GitHub
Cerdore commented on PR #39878: URL: https://github.com/apache/arrow/pull/39878#issuecomment-1925556399 > @Cerdore Could you rebase on main? Already rebased

Re: [PR] Return an error instead of a panic when reading a corrupted Parquet file with mismatched column counts [arrow-rs]

2024-02-03 Thread via GitHub
Jefffrey commented on code in PR #5362: URL: https://github.com/apache/arrow-rs/pull/5362#discussion_r1477158222 ## parquet/src/file/metadata.rs: ## @@ -349,7 +349,13 @@ impl RowGroupMetaData { /// Method to convert from Thrift. pub fn from_thrift(schema_descr: Schem

[PR] Return an error instead of a panic when reading a corrupted Parquet file with mismatched column counts [arrow-rs]

2024-02-03 Thread via GitHub
mmaitre314 opened a new pull request, #5362: URL: https://github.com/apache/arrow-rs/pull/5362 # Which issue does this PR close? Closes #5315 # Rationale for this change Data pipelines reading Parquet files may encounter corrupted Parquet files as part of their regular a

Re: [PR] GH-39863: [C++] Thirdparty: Bump google benchmark to 1.8.3 [arrow]

2024-02-03 Thread via GitHub
kou commented on PR #39878: URL: https://github.com/apache/arrow/pull/39878#issuecomment-1925477461 @Cerdore Could you rebase on main?

Re: [PR] GH-39863: [C++] Thirdparty: Bump google benchmark to 1.8.3 [arrow]

2024-02-03 Thread via GitHub
github-actions[bot] commented on PR #39878: URL: https://github.com/apache/arrow/pull/39878#issuecomment-1925477463 :warning: GitHub issue #39863 **has been automatically assigned in GitHub** to PR creator.

Re: [PR] Fix the handling of partitioned ListingTables (only Parquet - currently) [arrow-ballista]

2024-02-03 Thread via GitHub
bcmcmill commented on PR #966: URL: https://github.com/apache/arrow-ballista/pull/966#issuecomment-1925469188 It would likely be better to update `parse_protobuf_file_scan_config` in `datafusion-proto` to be like the following, so that the fix actually extends to all file types. ```

Re: [PR] IP clearance form [arrow-datafusion-comet]

2024-02-03 Thread via GitHub
viirya commented on code in PR #2: URL: https://github.com/apache/arrow-datafusion-comet/pull/2#discussion_r1477132885 ## ip-clearance-arrow-datafusion-comet.xml: ## @@ -0,0 +1,259 @@ + + + +Apache Arrow DataFusion Comet Codebase Intellectual Property (IP) Clearanc

Re: [PR] Add a README to the cache crate [arrow-ballista]

2024-02-03 Thread via GitHub
andygrove merged PR #969: URL: https://github.com/apache/arrow-ballista/pull/969

[PR] Add a README to the cache crate [arrow-ballista]

2024-02-03 Thread via GitHub
andygrove opened a new pull request, #969: URL: https://github.com/apache/arrow-ballista/pull/969 # Which issue does this PR close? Closes #. # Rationale for this change # What changes are included in this PR? # Are there any user-facing changes?

Re: [PR] Update release verification script to try and publish `cache` crate not `core` crate [arrow-ballista]

2024-02-03 Thread via GitHub
andygrove merged PR #968: URL: https://github.com/apache/arrow-ballista/pull/968

[PR] fix release issue [arrow-ballista]

2024-02-03 Thread via GitHub
andygrove opened a new pull request, #968: URL: https://github.com/apache/arrow-ballista/pull/968 # Which issue does this PR close? Closes #. # Rationale for this change # What changes are included in this PR? # Are there any user-facing changes?

Re: [PR] Update deployment docs for 0.12.0 [arrow-ballista]

2024-02-03 Thread via GitHub
andygrove merged PR #967: URL: https://github.com/apache/arrow-ballista/pull/967

Re: [PR] Initial PR [arrow-datafusion-comet]

2024-02-03 Thread via GitHub
sunchao commented on PR #1: URL: https://github.com/apache/arrow-datafusion-comet/pull/1#issuecomment-1925445318 > could you add https://github.com/apache/arrow-datafusion/blob/main/LICENSE.txt to the root of the repo in the PR? Sure @andygrove, just added the `LICENSE.txt`

Re: [PR] Initial PR [arrow-datafusion-comet]

2024-02-03 Thread via GitHub
andygrove commented on PR #1: URL: https://github.com/apache/arrow-datafusion-comet/pull/1#issuecomment-1925445213 I manually checked the Maven dependencies, and the licenses are all good.

Re: [PR] Initial PR [arrow-datafusion-comet]

2024-02-03 Thread via GitHub
andygrove commented on PR #1: URL: https://github.com/apache/arrow-datafusion-comet/pull/1#issuecomment-1925444681 License check for the Rust dependencies: ``` $ cargo license (MIT OR Apache-2.0) AND Unicode-DFS-2016 (1): unicode-ident 0BSD OR Apache-2.0 OR MIT (1): adler

Re: [PR] IP clearance form [arrow-datafusion-comet]

2024-02-03 Thread via GitHub
andygrove commented on code in PR #2: URL: https://github.com/apache/arrow-datafusion-comet/pull/2#discussion_r1477120675 ## ip-clearance-arrow-datafusion-comet.xml: ## @@ -0,0 +1,259 @@ + + + +Apache Arrow DataFusion Comet Codebase Intellectual Property (IP) Clear

[PR] build(deps): bump tokio from 1.35.1 to 1.36.0 [arrow-datafusion-python]

2024-02-03 Thread via GitHub
dependabot[bot] opened a new pull request, #577: URL: https://github.com/apache/arrow-datafusion-python/pull/577 Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.35.1 to 1.36.0. Release notes Sourced from [tokio's releases](https://github.com/tokio-rs/tokio/releases). T

Re: [PR] feat: support `LargeList` in `flatten` [arrow-datafusion]

2024-02-03 Thread via GitHub
comphead commented on code in PR #9110: URL: https://github.com/apache/arrow-datafusion/pull/9110#discussion_r1477111388 ## datafusion/sqllogictest/test_files/array.slt: ## @@ -5345,19 +5356,41 @@ select array_concat(column1, [7]) from arrays_values_v2; [7] # flatten +# foll

Re: [PR] IP clearance form [arrow-datafusion-comet]

2024-02-03 Thread via GitHub
andygrove commented on code in PR #2: URL: https://github.com/apache/arrow-datafusion-comet/pull/2#discussion_r1477110837 ## ip-clearance-arrow-datafusion-comet.xml: ## @@ -0,0 +1,259 @@ + + + +Apache Arrow DataFusion Comet Codebase Intellectual Property (IP) Clear

Re: [PR] Initial PR [arrow-datafusion-comet]

2024-02-03 Thread via GitHub
andygrove commented on PR #1: URL: https://github.com/apache/arrow-datafusion-comet/pull/1#issuecomment-1925421425 The [vote to accept the donation](https://lists.apache.org/thread/sk70pkhwmt8vgn0thtr04qg4mpqsgfvx) has passed and the next step is to complete the IP clearance process.

[PR] IP clearance form [arrow-datafusion-comet]

2024-02-03 Thread via GitHub
andygrove opened a new pull request, #2: URL: https://github.com/apache/arrow-datafusion-comet/pull/2 IP Clearance Form that will be submitted once complete.

Re: [PR] docs: add docs and example showing how to get the expression data type [arrow-datafusion]

2024-02-03 Thread via GitHub
r3stl355 commented on PR #9118: URL: https://github.com/apache/arrow-datafusion/pull/9118#issuecomment-1925413721 A few issues with the docstrings. I can resolve some by using lib names instead of fully qualified ones (e.g. `datafusion_expr` instead of `datafusion::logical_expr`) but `arrow_schem

Re: [I] [R] Error: package or namespace load failed for ‘arrow’ in dyn.load(file, DLLpath = DLLpath, ...): [arrow]

2024-02-03 Thread via GitHub
juanfcocontreras commented on issue #39206: URL: https://github.com/apache/arrow/issues/39206#issuecomment-1925408070 Thank you so much for the suggestion! It's working great!! > Just as a side note, while Homebrew's R formula has the issues mentioned above, the Cask version works gre

Re: [PR] Update deployment docs for 0.12.0 [arrow-ballista]

2024-02-03 Thread via GitHub
andygrove commented on code in PR #967: URL: https://github.com/apache/arrow-ballista/pull/967#discussion_r1477101518 ## docker-compose.yml: ## @@ -31,7 +31,7 @@ services: - "80:80" - "50050:50050" environment: - - RUST_LOG=ballista=debug,info + - RU

[PR] Update deployment docs for 0.12.0 [arrow-ballista]

2024-02-03 Thread via GitHub
andygrove opened a new pull request, #967: URL: https://github.com/apache/arrow-ballista/pull/967 # Which issue does this PR close? N/A # Rationale for this change Improve docs based on experience using Ballista today. # What changes are included in th

Re: [I] Dataset fragment filtering performance feels somewhat slow [arrow]

2024-02-03 Thread via GitHub
eeroel commented on issue #39906: URL: https://github.com/apache/arrow/issues/39906#issuecomment-1925398859 I did some more digging, and as far as I understand most of the time spent is not from the comparisons per se, but the overhead from these CallFunction invocations, e.g. https://gith

Re: [I] [R] Error: package or namespace load failed for ‘arrow’ in dyn.load(file, DLLpath = DLLpath, ...): [arrow]

2024-02-03 Thread via GitHub
amoeba commented on issue #39206: URL: https://github.com/apache/arrow/issues/39206#issuecomment-1925394915 Just as a side note, while Homebrew's R formula has the issues mentioned above, the Cask version works great (it's just the CRAN `.pkg` installer), so saying homebrew's R isn't recomm

Re: [PR] fix(csharp/src/Drivers/BigQuery): fix support for large results [arrow-adbc]

2024-02-03 Thread via GitHub
davidhcoe commented on code in PR #1507: URL: https://github.com/apache/arrow-adbc/pull/1507#discussion_r1477091102 ## csharp/src/Drivers/BigQuery/BigQueryStatement.cs: ## @@ -185,6 +186,33 @@ static IArrowReader ReadChunk(BigQueryReadClient readClient, string streamName)

Re: [PR] [r] Return package version as a package_version object [arrow-nanoarrow]

2024-02-03 Thread via GitHub
eddelbuettel closed pull request #382: [r] Return package version as a package_version object URL: https://github.com/apache/arrow-nanoarrow/pull/382

Re: [PR] [r] Return package version as a package_version object [arrow-nanoarrow]

2024-02-03 Thread via GitHub
eddelbuettel commented on PR #382: URL: https://github.com/apache/arrow-nanoarrow/pull/382#issuecomment-1925361231 I see now that the `-SNAPSHOT` throws a spanner so I am closing this.

Re: [PR] [Performance] Optimize DFSchema search by field [arrow-datafusion]

2024-02-03 Thread via GitHub
alamb commented on PR #9104: URL: https://github.com/apache/arrow-datafusion/pull/9104#issuecomment-1925351142 > Thanks @alamb I'll try suggestions 👍 > EDIT: How do you compare results from `cargo bench --bench sql_planner` for different commits? You run the bench for both br

Re: [PR] [Performance] Optimize DFSchema search by field [arrow-datafusion]

2024-02-03 Thread via GitHub
alamb commented on code in PR #9104: URL: https://github.com/apache/arrow-datafusion/pull/9104#discussion_r1477077964 ## datafusion/common/src/dfschema.rs: ## @@ -112,6 +112,10 @@ pub struct DFSchema { metadata: HashMap, /// Stores functional dependencies in the schema

Re: [PR] GH-39182: [C++] Remove Legacy CastTo function [arrow]

2024-02-03 Thread via GitHub
AlenkaF commented on PR #39192: URL: https://github.com/apache/arrow/pull/39192#issuecomment-1925346864 > Could you help me identify any broken workflows related to Python? I searched the codebase for `CastTo`, but couldn't find it in the Python parts. I'm puzzled as to why the workflow err

[PR] [r] Return package version as a package_version object [arrow-nanoarrow]

2024-02-03 Thread via GitHub
eddelbuettel opened a new pull request, #382: URL: https://github.com/apache/arrow-nanoarrow/pull/382 The R package currently returns the (run-time or compile-time) package version as a string. There is a simple 'quality of life' improvement to be had by wrapping this into `as.package_vers

Re: [PR] GH-39759: [Docs] Update pydata-sphinx-theme to 0.15.x [arrow]

2024-02-03 Thread via GitHub
AlenkaF commented on code in PR #39879: URL: https://github.com/apache/arrow/pull/39879#discussion_r1477071595 ## docs/requirements.txt: ## @@ -5,7 +5,7 @@ breathe ipython numpydoc -pydata-sphinx-theme~=0.14 +pydata-sphinx-theme~=0.15.2 Review Comment: > Could you please

Re: [PR] feat: improve `make_date` performance [arrow-datafusion]

2024-02-03 Thread via GitHub
Omega359 commented on PR #9112: URL: https://github.com/apache/arrow-datafusion/pull/9112#issuecomment-1925319315 > Sure, I can do that but `array_size` is also used in the 3 calls to `to_primitive_array`. Alternatively, I could just move the variable into `else` as `let array_size = len.u

Re: [PR] GH-39416: [GLib][Docs] Fixed Broken Link in README Content [arrow]

2024-02-03 Thread via GitHub
conbench-apache-arrow[bot] commented on PR #39896: URL: https://github.com/apache/arrow/pull/39896#issuecomment-1925311360 After merging your PR, Conbench analyzed the 4 benchmarking runs that have been run so far on merge-commit 22f2cfd1e1ebe49016b6d97c49f494287a98d02f. There were no

Re: [I] [C++] Enhance Cast function [arrow]

2024-02-03 Thread via GitHub
llama90 commented on issue #39463: URL: https://github.com/apache/arrow/issues/39463#issuecomment-1925305796 Hello @ShaiviAgarwal2 Are you still working on this issue? If it has been stopped, I would like to take it over and work on it.

Re: [PR] GH-39182: [C++] Remove Legacy CastTo function [arrow]

2024-02-03 Thread via GitHub
llama90 commented on PR #39192: URL: https://github.com/apache/arrow/pull/39192#issuecomment-1925305107 Could you help me identify any broken workflows related to Python? I searched the codebase for `CastTo`, but couldn't find it in the Python parts. I'm puzzled as to why the workflow error

Re: [PR] GH-39182: [C++] Remove Legacy CastTo function [arrow]

2024-02-03 Thread via GitHub
llama90 commented on code in PR #39192: URL: https://github.com/apache/arrow/pull/39192#discussion_r1477007607 ## cpp/src/arrow/scalar.cc: ## @@ -884,9 +997,11 @@ std::string Scalar::ToString() const { return dict_scalar->value.dictionary->ToString() + "[" + dic

Re: [PR] GH-39883: [CI][R][Windows] Use ci/scripts/install_minio.sh with Git bash [arrow]

2024-02-03 Thread via GitHub
github-actions[bot] commented on PR #39929: URL: https://github.com/apache/arrow/pull/39929#issuecomment-1925301280 :warning: GitHub issue #39883 **has been automatically assigned in GitHub** to PR creator.

[PR] GH-39883: [CI][R][Windows] Use ci/scripts/install_minio.sh with Git bash [arrow]

2024-02-03 Thread via GitHub
kou opened a new pull request, #39929: URL: https://github.com/apache/arrow/pull/39929 ### Rationale for this change `curl` from Rtools can't be used in a non-Rtools MSYS2 environment, because it can't find `/usr/ssl/certs/ca-bundle.crt` there.

Re: [PR] GH-39416: [GLib][Docs] Fixed Broken Link in README Content [arrow]

2024-02-03 Thread via GitHub
kou merged PR #39896: URL: https://github.com/apache/arrow/pull/39896

Re: [PR] Add `ColumnarValue::values_to_arrays`, deprecate `columnar_values_to_array` [arrow-datafusion]

2024-02-03 Thread via GitHub
alamb commented on code in PR #9114: URL: https://github.com/apache/arrow-datafusion/pull/9114#discussion_r1477046476 ## datafusion/expr/src/columnar_value.rs: ## @@ -75,4 +75,166 @@ impl ColumnarValue { pub fn create_null_array(num_rows: usize) -> Self { ColumnarV

Re: [PR] Cleanup regex_expressions.rs to remove _regexp_match function [arrow-datafusion]

2024-02-03 Thread via GitHub
alamb merged PR #9107: URL: https://github.com/apache/arrow-datafusion/pull/9107

Re: [I] Cleanup regex_expressions.rs to remove _regexp_match function [arrow-datafusion]

2024-02-03 Thread via GitHub
alamb closed issue #9106: Cleanup regex_expressions.rs to remove _regexp_match function URL: https://github.com/apache/arrow-datafusion/issues/9106

Re: [PR] Cleanup regex_expressions.rs to remove _regexp_match function [arrow-datafusion]

2024-02-03 Thread via GitHub
alamb commented on PR #9107: URL: https://github.com/apache/arrow-datafusion/pull/9107#issuecomment-1925277108 Thank you @Omega359 and @viirya 🚀

[PR] docs: add docs and example showing how to get the expression data type [arrow-datafusion]

2024-02-03 Thread via GitHub
r3stl355 opened a new pull request, #9118: URL: https://github.com/apache/arrow-datafusion/pull/9118 ## Which issue does this PR close? Closes #7725 ## Rationale for this change Make it easier to understand what the `get_type` returns ## What changes are included i

Re: [PR] GH-39182: [C++] Remove Legacy CastTo function [arrow]

2024-02-03 Thread via GitHub
llama90 commented on PR #39192: URL: https://github.com/apache/arrow/pull/39192#issuecomment-1925247240 ### For recording glib test log test log ```bash BUNDLE_GEMFILE=../c_glib/Gemfile bundle exec ../c_glib/test/run-test.sh [72/81] Generating arrow-flight-gli

Re: [PR] GH-39843: [C++][Parquet] Parquet binary length overflow exception should contain the length of binary [arrow]

2024-02-03 Thread via GitHub
mapleFU commented on PR #39844: URL: https://github.com/apache/arrow/pull/39844#issuecomment-1925246190 Why does Conbench have an error? =_=

Re: [PR] GH-39860: [C++] Expression ExecuteScalarExpression execute empty args function with a wrong result [arrow]

2024-02-03 Thread via GitHub
mapleFU commented on code in PR #39908: URL: https://github.com/apache/arrow/pull/39908#discussion_r1477029042 ## cpp/src/arrow/compute/expression_test.cc: ## @@ -863,6 +863,23 @@ TEST(Expression, ExecuteCall) { ])")); } +TEST(Expression, ExecuteCallWithNoArguments) { + c

Re: [PR] feat: improve `make_date` performance [arrow-datafusion]

2024-02-03 Thread via GitHub
r3stl355 commented on PR #9112: URL: https://github.com/apache/arrow-datafusion/pull/9112#issuecomment-1925216039 > The > > `let array_size = if is_scalar { 1 } else { len.unwrap() };` > > variable could now be removed and the for loop could just use len.unwrap() I think.

Re: [PR] [task #8987] add to_date_function [arrow-datafusion]

2024-02-03 Thread via GitHub
Tangruilin commented on code in PR #9019: URL: https://github.com/apache/arrow-datafusion/pull/9019#discussion_r1477017335 ## datafusion/physical-expr/src/datetime_expressions.rs: ## @@ -1337,6 +1376,36 @@ fn validate_to_timestamp_data_types( None } +/// to_date SQL func