[GitHub] [arrow-ballista] yahoNanJing commented on pull request #778: Refine create_datafusion_context()

2023-05-16 Thread via GitHub
yahoNanJing commented on PR #778: URL: https://github.com/apache/arrow-ballista/pull/778#issuecomment-1549124956 retest please -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment

[GitHub] [arrow] ursabot commented on pull request #35520: GH-35519: [C++][Parquet] Fixing exception handling in parquet FileSerializer

2023-05-16 Thread via GitHub
ursabot commented on PR #35520: URL: https://github.com/apache/arrow/pull/35520#issuecomment-1549137274 Benchmark runs are scheduled for baseline = e2e3a9df28db03b09b5a83a60fa8293dfef6d0d8 and contender = 2d76d9a526f9827283bb7dfac60715b6ad4aec34. 2d76d9a526f9827283bb7dfac60715b6ad4aec34 is

[GitHub] [arrow] westonpace commented on pull request #34834: GH-33985: [C++] Add substrait serialization/deserialization for expressions

2023-05-16 Thread via GitHub
westonpace commented on PR #34834: URL: https://github.com/apache/arrow/pull/34834#issuecomment-1549144409 I've added python bindings. Now all that is needed is documentation / examples

[GitHub] [arrow] rtpsw opened a new pull request, #35608: GH-35607: [C++] Support simple Substrait aggregate extensions

2023-05-16 Thread via GitHub
rtpsw opened a new pull request, #35608: URL: https://github.com/apache/arrow/pull/35608 ### Rationale for this change See #35607. ### What changes are included in this PR? A simple `SubstraitAggregateToArrow` converter for `urn:arrow:substrait_simple_extension_function`

[GitHub] [arrow] github-actions[bot] commented on pull request #35608: GH-35607: [C++] Support simple Substrait aggregate extensions

2023-05-16 Thread via GitHub
github-actions[bot] commented on PR #35608: URL: https://github.com/apache/arrow/pull/35608#issuecomment-1549168838 * Closes: #35607

[GitHub] [arrow] github-actions[bot] commented on pull request #35608: GH-35607: [C++] Support simple Substrait aggregate extensions

2023-05-16 Thread via GitHub
github-actions[bot] commented on PR #35608: URL: https://github.com/apache/arrow/pull/35608#issuecomment-1549168886 :warning: GitHub issue #35607 **has been automatically assigned in GitHub** to PR creator.

[GitHub] [arrow] rtpsw commented on pull request #35608: GH-35607: [C++] Support simple Substrait aggregate extensions

2023-05-16 Thread via GitHub
rtpsw commented on PR #35608: URL: https://github.com/apache/arrow/pull/35608#issuecomment-1549171403 cc @icexelloss

[GitHub] [arrow] rtpsw commented on pull request #35513: GH-35506: [C++] Support First and Last aggregators in Substrait

2023-05-16 Thread via GitHub
rtpsw commented on PR #35513: URL: https://github.com/apache/arrow/pull/35513#issuecomment-1549173459 > Thanks, @westonpace. I'll work on a revision accordingly. This PR is now waiting on #35608, which implements @westonpace's [suggestion](https://github.com/apache/arrow/pull/35513#is

[GitHub] [arrow-datafusion] mustafasrepo commented on a diff in pull request #6352: Support CREATE TABLE via SQL for infinite streams

2023-05-16 Thread via GitHub
mustafasrepo commented on code in PR #6352: URL: https://github.com/apache/arrow-datafusion/pull/6352#discussion_r1194774776 ## datafusion/expr/src/logical_plan/ddl.rs: ## @@ -192,6 +192,8 @@ pub struct CreateExternalTable { pub order_exprs: Vec, /// File compression t

[GitHub] [arrow-datafusion] aprimadi commented on a diff in pull request #6352: Support CREATE TABLE via SQL for infinite streams

2023-05-16 Thread via GitHub
aprimadi commented on code in PR #6352: URL: https://github.com/apache/arrow-datafusion/pull/6352#discussion_r1194778613 ## datafusion/expr/src/logical_plan/ddl.rs: ## @@ -192,6 +192,8 @@ pub struct CreateExternalTable { pub order_exprs: Vec, /// File compression type

[GitHub] [arrow-datafusion] aprimadi opened a new pull request, #6360: Switch to non-recursive on heap virtual stack to parse expr

2023-05-16 Thread via GitHub
aprimadi opened a new pull request, #6360: URL: https://github.com/apache/arrow-datafusion/pull/6360 # Which issue does this PR close? TODO # Rationale for this change TODO # What changes are included in this PR? TODO # Are these changes tested?

[GitHub] [arrow] pitrou commented on pull request #35597: GH-35596: [C++][CI] Improve compilation caching with PCG

2023-05-16 Thread via GitHub
pitrou commented on PR #35597: URL: https://github.com/apache/arrow/pull/35597#issuecomment-1549200863 The `static_arbitrary_seed` feature seems rather useless as it creates a rather bad-quality seed.

[GitHub] [arrow-datafusion] mustafasrepo commented on a diff in pull request #6352: Support CREATE TABLE via SQL for infinite streams

2023-05-16 Thread via GitHub
mustafasrepo commented on code in PR #6352: URL: https://github.com/apache/arrow-datafusion/pull/6352#discussion_r1194783100 ## datafusion/sql/src/parser.rs: ## @@ -427,39 +435,72 @@ impl<'a> DFParser<'a> { } loop { -if self.parser.parse_keyword(K

[GitHub] [arrow-datafusion] mustafasrepo commented on pull request #6352: Support CREATE TABLE via SQL for infinite streams

2023-05-16 Thread via GitHub
mustafasrepo commented on PR #6352: URL: https://github.com/apache/arrow-datafusion/pull/6352#issuecomment-1549204696 This PR is LGTM! Thanks @aprimadi for this work.

[GitHub] [arrow] pitrou merged pull request #35597: GH-35596: [C++][CI] Improve compilation caching with PCG

2023-05-16 Thread via GitHub
pitrou merged PR #35597: URL: https://github.com/apache/arrow/pull/35597

[GitHub] [arrow] pitrou commented on issue #34715: [CI] Evaluate new GHA backend for sccache

2023-05-16 Thread via GitHub
pitrou commented on issue #34715: URL: https://github.com/apache/arrow/issues/34715#issuecomment-1549213241 Reference: * https://github.com/mozilla/sccache/blob/main/docs/GHA.md * https://github.com/marketplace/actions/sccache-action Also it might not be a good idea to invest tim

[GitHub] [arrow] mapleFU commented on issue #35606: [CI][C++] TestDecimalFromReal for Float type failed in "AMD64 Windows MinGW MINGW32 C++"

2023-05-16 Thread via GitHub
mapleFU commented on issue #35606: URL: https://github.com/apache/arrow/issues/35606#issuecomment-1549230366 (https://github.com/apache/arrow/pull/35605 is just for floating point and Parquet decryption, not for this issue.)

[GitHub] [arrow] pitrou commented on pull request #35605: GH-35571: [C++][CI][Parquet] Change `EQ` to `FLOAT_EQ` in Decryption tests

2023-05-16 Thread via GitHub
pitrou commented on PR #35605: URL: https://github.com/apache/arrow/pull/35605#issuecomment-1549232052 ``` values at index 28 not equal read is 30.81, expect 30.79 ``` This is not right. Reading Parquet values should be bit-precise, not approximate.
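
To illustrate the distinction drawn in this comment between bit-precise and approximate comparison, here is a minimal Python sketch. It is not the Arrow test code: `float32_bits`, `bit_equal`, and `approx_equal` are hypothetical helper names, the two values are the ones quoted in the failure message, and the `1e-3` tolerance is an assumption chosen only to show how a loose tolerance can hide a real mismatch.

```python
import math
import struct


def float32_bits(x: float) -> int:
    """Return the IEEE 754 binary32 bit pattern of x as an unsigned integer."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def bit_equal(a: float, b: float) -> bool:
    """True only if a and b have identical 32-bit float representations."""
    return float32_bits(a) == float32_bits(b)


def approx_equal(a: float, b: float, rel_tol: float = 1e-3) -> bool:
    """Tolerance-based comparison; rel_tol here is an illustrative assumption."""
    return math.isclose(a, b, rel_tol=rel_tol)


# Values taken from the failure message: read 30.81, expected 30.79.
read_value, expected_value = 30.81, 30.79

same_bits = bit_equal(read_value, expected_value)        # exact check
close_enough = approx_equal(read_value, expected_value)  # loose check
```

With these inputs the exact check reports a mismatch while the loose check accepts the pair, which is why a copy-only read path is usually tested with exact equality.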

[GitHub] [arrow] mapleFU commented on pull request #35605: GH-35571: [C++][CI][Parquet] Change `EQ` to `FLOAT_EQ` in Decryption tests

2023-05-16 Thread via GitHub
mapleFU commented on PR #35605: URL: https://github.com/apache/arrow/pull/35605#issuecomment-1549243172 I guess the error and https://github.com/apache/arrow/issues/35606 have the same underlying cause. Maybe there is some precision loss in MinGW.

[GitHub] [arrow] jarohen commented on pull request #35590: GH-35588: [Java] returning a constant hashCode for null values, resolves #35588

2023-05-16 Thread via GitHub
jarohen commented on PR #35590: URL: https://github.com/apache/arrow/pull/35590#issuecomment-1549265727 Cheers @lidavidm!

[GitHub] [arrow] jorisvandenbossche commented on pull request #35565: GH-35498: [C++] Relax EnsureAlignment check in Acero from requiring 64-byte aligned buffers to requiring value-aligned buffers

2023-05-16 Thread via GitHub
jorisvandenbossche commented on PR #35565: URL: https://github.com/apache/arrow/pull/35565#issuecomment-1549273891 > I'm running the benchmarks again but, as best I can tell, these regressions are noise, though it is quite difficult to say for sure. I am not sure why it doesn't show u

[GitHub] [arrow] pitrou commented on pull request #35605: GH-35571: [C++][CI][Parquet] Change `EQ` to `FLOAT_EQ` in Decryption tests

2023-05-16 Thread via GitHub
pitrou commented on PR #35605: URL: https://github.com/apache/arrow/pull/35605#issuecomment-1549276994 There shouldn't be any precision loss simply when copying values. Perhaps the tests are doing something wrong.

[GitHub] [arrow] AlenkaF opened a new pull request, #35610: GH-35609: [Docs] Enable the build of subsections of the documentation

2023-05-16 Thread via GitHub
AlenkaF opened a new pull request, #35610: URL: https://github.com/apache/arrow/pull/35610 ### Rationale for this change Ease the process of building the documentation for dev purposes. ### What changes are included in this PR? - `make` options to build subsections of the

[GitHub] [arrow] github-actions[bot] commented on pull request #35610: GH-35609: [Docs] Enable the build of subsections of the documentation

2023-05-16 Thread via GitHub
github-actions[bot] commented on PR #35610: URL: https://github.com/apache/arrow/pull/35610#issuecomment-1549278065 * Closes: #35609

[GitHub] [arrow] jorisvandenbossche commented on pull request #35565: GH-35498: [C++] Relax EnsureAlignment check in Acero from requiring 64-byte aligned buffers to requiring value-aligned buffers

2023-05-16 Thread via GitHub
jorisvandenbossche commented on PR #35565: URL: https://github.com/apache/arrow/pull/35565#issuecomment-1549278536 > I am not sure why it doesn't show up on the landing page linked from the bot comment Whoops, they of course _do_ show up, just have to sort by the z-score or change pe

[GitHub] [arrow] AlenkaF commented on issue #30627: [Docs] Splitting the sphinx-based Arrow docs into separate sphinx projects

2023-05-16 Thread via GitHub
AlenkaF commented on issue #30627: URL: https://github.com/apache/arrow/issues/30627#issuecomment-1549278794 Just to note, I have created a new issue for an alternative, easier first solution: https://github.com/apache/arrow/issues/35609 The PR: https://github.com/apache/arrow/pull/35610

[GitHub] [arrow-ballista] yahoNanJing closed pull request #778: Refine create_datafusion_context()

2023-05-16 Thread via GitHub
yahoNanJing closed pull request #778: Refine create_datafusion_context() URL: https://github.com/apache/arrow-ballista/pull/778

[GitHub] [arrow] github-actions[bot] commented on pull request #35612: GH-35594: [R] Issue with tzdb 0.4.0 and the shipped arrow tz.cpp

2023-05-16 Thread via GitHub
github-actions[bot] commented on PR #35612: URL: https://github.com/apache/arrow/pull/35612#issuecomment-1549301885 * Closes: #35594

[GitHub] [arrow] github-actions[bot] commented on pull request #35612: GH-35594: [R] Issue with tzdb 0.4.0 and the shipped arrow tz.cpp

2023-05-16 Thread via GitHub
github-actions[bot] commented on PR #35612: URL: https://github.com/apache/arrow/pull/35612#issuecomment-1549301961 :warning: GitHub issue #35594 **has been automatically assigned in GitHub** to PR creator.

[GitHub] [arrow] thisisnic opened a new pull request, #35612: GH-35594: [R] Issue with tzdb 0.4.0 and the shipped arrow tz.cpp

2023-05-16 Thread via GitHub
thisisnic opened a new pull request, #35612: URL: https://github.com/apache/arrow/pull/35612 This PR bumps the vendored version of the date library to commit `cc4685a21e4a4fdae707ad1233c61bbaff241f93`.

[GitHub] [arrow] mapleFU commented on pull request #35605: GH-35571: [C++][CI][Parquet] Change `EQ` to `FLOAT_EQ` in Decryption tests

2023-05-16 Thread via GitHub
mapleFU commented on PR #35605: URL: https://github.com/apache/arrow/pull/35605#issuecomment-1549302360 Got it. I'm at the office and don't have a Windows machine here; I'll try to reproduce it later once I'm home.

[GitHub] [arrow] js8544 opened a new pull request, #35613: GH-35611: [C++] Remove unnecessary safe operations for ListBuilder and BinaryBuilder

2023-05-16 Thread via GitHub
js8544 opened a new pull request, #35613: URL: https://github.com/apache/arrow/pull/35613 ### Rationale for this change There are several safety checks/operations that can be optimized to enhance performance of ListBuilder and BinaryBuild ### What changes are includ

[GitHub] [arrow] github-actions[bot] commented on pull request #35613: GH-35611: [C++] Remove unnecessary safe operations for ListBuilder and BinaryBuilder

2023-05-16 Thread via GitHub
github-actions[bot] commented on PR #35613: URL: https://github.com/apache/arrow/pull/35613#issuecomment-1549302946 * Closes: #35611

[GitHub] [arrow] github-actions[bot] commented on pull request #35613: GH-35611: [C++] Remove unnecessary safe operations for ListBuilder and BinaryBuilder

2023-05-16 Thread via GitHub
github-actions[bot] commented on PR #35613: URL: https://github.com/apache/arrow/pull/35613#issuecomment-1549303000 :warning: GitHub issue #35611 **has been automatically assigned in GitHub** to PR creator.

[GitHub] [arrow-datafusion] e1ijah1 commented on issue #6299: Port tests in `avro.rs` to sqllogictest

2023-05-16 Thread via GitHub
e1ijah1 commented on issue #6299: URL: https://github.com/apache/arrow-datafusion/issues/6299#issuecomment-1549340325 Hi, I'd like to work on this issue :)

[GitHub] [arrow-datafusion] lokax opened a new pull request, #6361: Fix shadowing in test_code

2023-05-16 Thread via GitHub
lokax opened a new pull request, #6361: URL: https://github.com/apache/arrow-datafusion/pull/6361 # Which issue does this PR close? # Rationale for this change # What changes are included in this PR? Due to variable shadowing, the test code always passes. Ch

[GitHub] [arrow] ursabot commented on pull request #35255: GH-35193: [Python][Packaging] Enable GCS on Windows wheels

2023-05-16 Thread via GitHub
ursabot commented on PR #35255: URL: https://github.com/apache/arrow/pull/35255#issuecomment-1549410868 Benchmark runs are scheduled for baseline = 2d76d9a526f9827283bb7dfac60715b6ad4aec34 and contender = 8be70c137289adba92871555ce74055719172f56. 8be70c137289adba92871555ce74055719172f56 is

[GitHub] [arrow] AlenkaF opened a new pull request, #35614: GH-32739: [CI][Docs] Document Docs PR Preview

2023-05-16 Thread via GitHub
AlenkaF opened a new pull request, #35614: URL: https://github.com/apache/arrow/pull/35614 Add Pull Request Docs Preview section to developers/documentation.rst.

[GitHub] [arrow] github-actions[bot] commented on pull request #35614: GH-32739: [CI][Docs] Document Docs PR Preview

2023-05-16 Thread via GitHub
github-actions[bot] commented on PR #35614: URL: https://github.com/apache/arrow/pull/35614#issuecomment-1549419901 * Closes: #32739

[GitHub] [arrow-datafusion] e1ijah1 opened a new pull request, #6362: [sqllogictest] port tests in avro.rs to sqllogictest

2023-05-16 Thread via GitHub
e1ijah1 opened a new pull request, #6362: URL: https://github.com/apache/arrow-datafusion/pull/6362 # Which issue does this PR close? Closes #6299 # Rationale for this change #6195 # What changes are included in this PR? Port the Rust u

[GitHub] [arrow-datafusion] tustvold commented on issue #6350: Advice on using external catalogue as Catalog/Schema provider

2023-05-16 Thread via GitHub
tustvold commented on issue #6350: URL: https://github.com/apache/arrow-datafusion/issues/6350#issuecomment-1549437555 > I more or less just copied the client and auth methods out of object_store I created https://github.com/apache/arrow-rs/issues/4223 to track adding support for ex

[GitHub] [arrow-rs] tustvold opened a new pull request, #4225: Standardise credentials API (#4223) (#4163)

2023-05-16 Thread via GitHub
tustvold opened a new pull request, #4225: URL: https://github.com/apache/arrow-rs/pull/4225 # Which issue does this PR close? Part of #4223 Part of #4163 # Rationale for this change In order to expose extension points that relate to authorization and

[GitHub] [arrow-datafusion] alamb opened a new issue, #6363: Rewrite large OR chains as `IN` lists

2023-05-16 Thread via GitHub
alamb opened a new issue, #6363: URL: https://github.com/apache/arrow-datafusion/issues/6363 ### Is your feature request related to a problem or challenge? Sometimes automatic tools create queries like this (where `` is a different value) ``` WHERE ((tenant = '') OR (tenant
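
A rough sketch of the kind of rewrite this issue asks for, written in Python rather than DataFusion's Rust. The `Eq`, `Or`, and `InList` classes and the `rewrite_or_chain` function are hypothetical stand-ins, not DataFusion types, and the tenant values `a`, `b`, `c` are placeholders because the originals are elided in the message.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass(frozen=True)
class Eq:
    """column = 'value' (hypothetical expression node)."""
    column: str
    value: str


@dataclass(frozen=True)
class Or:
    """left OR right (hypothetical expression node)."""
    left: "Expr"
    right: "Expr"


@dataclass(frozen=True)
class InList:
    """column IN (values...) (hypothetical expression node)."""
    column: str
    values: Tuple[str, ...]


Expr = Union[Eq, Or, InList]


def flatten_or(expr: Expr) -> List[Expr]:
    """Collect the leaves of a nested OR chain, left to right."""
    if isinstance(expr, Or):
        return flatten_or(expr.left) + flatten_or(expr.right)
    return [expr]


def rewrite_or_chain(expr: Expr) -> Expr:
    """Rewrite col = v1 OR col = v2 OR ... into col IN (v1, v2, ...).

    The expression is returned unchanged unless every leaf is an equality
    on the same column.
    """
    leaves = flatten_or(expr)
    if len(leaves) < 2 or not all(isinstance(leaf, Eq) for leaf in leaves):
        return expr
    if len({leaf.column for leaf in leaves}) != 1:
        return expr
    # dict.fromkeys removes duplicate values while keeping first-seen order.
    values = tuple(dict.fromkeys(leaf.value for leaf in leaves))
    return InList(leaves[0].column, values)


chain = Or(Or(Eq("tenant", "a"), Eq("tenant", "b")), Eq("tenant", "c"))
rewritten = rewrite_or_chain(chain)  # InList("tenant", ("a", "b", "c"))
```

The sketch only handles string equality on a single column; a real optimizer rule would also need to consider types, NULL semantics, and mixed predicates.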

[GitHub] [arrow-rs] tustvold commented on a diff in pull request #4225: Standardise credentials API (#4223) (#4163)

2023-05-16 Thread via GitHub
tustvold commented on code in PR #4225: URL: https://github.com/apache/arrow-rs/pull/4225#discussion_r1194995402 ## object_store/src/client/mod.rs: ## @@ -503,6 +506,90 @@ impl GetOptionsExt for RequestBuilder { } } +/// Provides credentials for use when signing requests

[GitHub] [arrow-datafusion] alamb commented on issue #6363: Rewrite large OR chains as `IN` lists

2023-05-16 Thread via GitHub
alamb commented on issue #6363: URL: https://github.com/apache/arrow-datafusion/issues/6363#issuecomment-1549455620 I think this is a good first issue because it is well specified and the code can follow the existing patterns.

[GitHub] [arrow-rs] tustvold commented on pull request #4225: Standardise credentials API (#4223) (#4163)

2023-05-16 Thread via GitHub
tustvold commented on PR #4225: URL: https://github.com/apache/arrow-rs/pull/4225#issuecomment-1549462193 @roeap perhaps you might be able to give this one a look, I believe you were interested in #4223

[GitHub] [arrow-datafusion] alamb commented on a diff in pull request #6362: [sqllogictest] port tests in avro.rs to sqllogictest

2023-05-16 Thread via GitHub
alamb commented on code in PR #6362: URL: https://github.com/apache/arrow-datafusion/pull/6362#discussion_r1195007277 ## datafusion/core/tests/sql/avro.rs: ## @@ -1,157 +0,0 @@ -// Licensed to the Apache Software Foundation (ASF) under one -// or more contributor license agreeme

[GitHub] [arrow-datafusion] mustafasrepo commented on a diff in pull request #6354: INSERT returns number of rows written, add `InsertExec` to handle common case.

2023-05-16 Thread via GitHub
mustafasrepo commented on code in PR #6354: URL: https://github.com/apache/arrow-datafusion/pull/6354#discussion_r1195008368 ## datafusion/core/tests/sqllogictests/test_files/insert.slt: ## @@ -0,0 +1,182 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or mor

[GitHub] [arrow-datafusion] mustafasrepo commented on a diff in pull request #6354: INSERT returns number of rows written, add `InsertExec` to handle common case.

2023-05-16 Thread via GitHub
mustafasrepo commented on code in PR #6354: URL: https://github.com/apache/arrow-datafusion/pull/6354#discussion_r1195020404 ## datafusion/core/src/physical_plan/insert.rs: ## @@ -0,0 +1,203 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributo

[GitHub] [arrow-rs] yah01 opened a new pull request, #4226: Support to read/write parquet for FixedSizeList type

2023-05-16 Thread via GitHub
yah01 opened a new pull request, #4226: URL: https://github.com/apache/arrow-rs/pull/4226 # Which issue does this PR close? Closes #4214 # Rationale for this change As mentioned in the issue # What changes are included in this PR? - Build levels for `FixedS

[GitHub] [arrow-datafusion] tustvold commented on a diff in pull request #6354: INSERT returns number of rows written, add `InsertExec` to handle common case.

2023-05-16 Thread via GitHub
tustvold commented on code in PR #6354: URL: https://github.com/apache/arrow-datafusion/pull/6354#discussion_r1195034855 ## datafusion/core/src/physical_plan/insert.rs: ## @@ -0,0 +1,203 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor li

[GitHub] [arrow-datafusion] crepererum commented on pull request #6226: feat: min/max agg for bool

2023-05-16 Thread via GitHub
crepererum commented on PR #6226: URL: https://github.com/apache/arrow-datafusion/pull/6226#issuecomment-1549505665 Rebased and added more tests.

[GitHub] [arrow-rs] tustvold commented on a diff in pull request #4226: Support to read/write parquet for FixedSizeList type

2023-05-16 Thread via GitHub
tustvold commented on code in PR #4226: URL: https://github.com/apache/arrow-rs/pull/4226#discussion_r1195040927 ## parquet/src/arrow/array_reader/list_array.rs: ## @@ -227,8 +228,13 @@ impl ArrayReader for ListArrayReader { let list_data = unsafe { data_builder.buil

[GitHub] [arrow-rs] tustvold merged pull request #4161: Object Store (AWS): Support region configured via named profile

2023-05-16 Thread via GitHub
tustvold merged PR #4161: URL: https://github.com/apache/arrow-rs/pull/4161

[GitHub] [arrow-rs] tustvold closed issue #4158: object_store: When using an AWS profile, obtain the default AWS region from the active profile

2023-05-16 Thread via GitHub
tustvold closed issue #4158: object_store: When using an AWS profile, obtain the default AWS region from the active profile URL: https://github.com/apache/arrow-rs/issues/4158

[GitHub] [arrow] lidavidm commented on issue #35615: Inconsistent behavior of VectorSchemaRoot Slice

2023-05-16 Thread via GitHub
lidavidm commented on issue #35615: URL: https://github.com/apache/arrow/issues/35615#issuecomment-1549515451 This was fixed by https://github.com/apache/arrow/pull/35476 for 13.0.0. Duplicate of https://github.com/apache/arrow/issues/35275

[GitHub] [arrow] pribor commented on issue #35615: [Java] Inconsistent behavior of VectorSchemaRoot Slice

2023-05-16 Thread via GitHub
pribor commented on issue #35615: URL: https://github.com/apache/arrow/issues/35615#issuecomment-1549531267 Awesome, thanks!

[GitHub] [arrow] lidavidm commented on pull request #35603: GH-35559: [Java] Implementing JDBC Flight Stream Result Set asynchronous VectorSchemaRoot Producer

2023-05-16 Thread via GitHub
lidavidm commented on PR #35603: URL: https://github.com/apache/arrow/pull/35603#issuecomment-1549534625 So gRPC should already be buffering data, meaning that this change is mostly about pipelining the IPC work - except IPC deserialization should be very cheap. That could explain why the b

[GitHub] [arrow-julia] DrChainsaw closed pull request #436: Add handling of len = -1 in uncompress

2023-05-16 Thread via GitHub
DrChainsaw closed pull request #436: Add handling of len = -1 in uncompress URL: https://github.com/apache/arrow-julia/pull/436

[GitHub] [arrow] lidavidm commented on pull request #35603: GH-35559: [Java] Implementing JDBC Flight Stream Result Set asynchronous VectorSchemaRoot Producer

2023-05-16 Thread via GitHub
lidavidm commented on PR #35603: URL: https://github.com/apache/arrow/pull/35603#issuecomment-1549539028 > So gRPC should already be buffering data The flip side is that due to a design decision in gRPC-Java, it's actually hard for a Java _server_ to properly saturate the connection,

[GitHub] [arrow-julia] DrChainsaw commented on pull request #436: Add handling of len = -1 in uncompress

2023-05-16 Thread via GitHub
DrChainsaw commented on PR #436: URL: https://github.com/apache/arrow-julia/pull/436#issuecomment-1549535374 The CI timeout on macOS seemed to happen during precompilation of dependencies. Trying to restart it by closing and reopening the PR.

[GitHub] [arrow-adbc] lidavidm commented on a diff in pull request #679: feat(c/driver/postgres): Implement GetObjectsDbSchemas for Postgres

2023-05-16 Thread via GitHub
lidavidm commented on code in PR #679: URL: https://github.com/apache/arrow-adbc/pull/679#discussion_r1195067047 ## c/driver/postgresql/connection.cc: ## @@ -182,7 +216,12 @@ AdbcStatusCode PostgresConnectionGetObjectsImpl( if (depth == ADBC_OBJECT_DEPTH_CATALOGS) {

[GitHub] [arrow-adbc] lidavidm commented on a diff in pull request #679: feat(c/driver/postgres): Implement GetObjectsDbSchemas for Postgres

2023-05-16 Thread via GitHub
lidavidm commented on code in PR #679: URL: https://github.com/apache/arrow-adbc/pull/679#discussion_r1195071157 ## c/driver/postgresql/connection.cc: ## @@ -182,7 +216,12 @@ AdbcStatusCode PostgresConnectionGetObjectsImpl( if (depth == ADBC_OBJECT_DEPTH_CATALOGS) {

[GitHub] [arrow] hqx871 commented on issue #35616: [c++] arrow::int32 throws exc_bad_access

2023-05-16 Thread via GitHub
hqx871 commented on issue #35616: URL: https://github.com/apache/arrow/issues/35616#issuecomment-1549546987 I know this is too old. Has anyone solved this?

[GitHub] [arrow] hqx871 commented on issue #35616: [c++] arrow::int32 throws exc_bad_access

2023-05-16 Thread via GitHub
hqx871 commented on issue #35616: URL: https://github.com/apache/arrow/issues/35616#issuecomment-1549550794 The code is very simple. ``` void printParquetFile(const std::string &path) { arrow::Status st; // Open Parquet file reader std::unique_ptr arrow_reader; auto

[GitHub] [arrow-adbc] lidavidm commented on a diff in pull request #683: refactor(c/driver/postgresql): Implement InputIterator for ResultHelper

2023-05-16 Thread via GitHub
lidavidm commented on code in PR #683: URL: https://github.com/apache/arrow-adbc/pull/683#discussion_r1195072804 ## c/driver/postgresql/connection.cc: ## @@ -170,13 +241,10 @@ AdbcStatusCode PostgresConnectionGetObjectsImpl( PqResultHelper result_helper = PqResultHelper{c

[GitHub] [arrow-datafusion] mustafasrepo commented on a diff in pull request #6354: INSERT returns number of rows written, add `InsertExec` to handle common case.

2023-05-16 Thread via GitHub
mustafasrepo commented on code in PR #6354: URL: https://github.com/apache/arrow-datafusion/pull/6354#discussion_r1195083475 ## datafusion/core/src/physical_plan/insert.rs: ## @@ -0,0 +1,203 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributo

[GitHub] [arrow-datafusion] tustvold commented on pull request #6183: Faster ListingTable partition listing (#6182)

2023-05-16 Thread via GitHub
tustvold commented on PR #6183: URL: https://github.com/apache/arrow-datafusion/pull/6183#issuecomment-1549565875 > Azure VM Aah, I didn't realise this was Azure... Azure Blob Storage is notoriously slow, still we should be able to at least match fsspec.

[GitHub] [arrow] AlenkaF commented on pull request #35582: GH-34787: [Python] Accept zero_copy_only=False for ChunkedArray.to_numpy

2023-05-16 Thread via GitHub
AlenkaF commented on PR #35582: URL: https://github.com/apache/arrow/pull/35582#issuecomment-1549566500 Thank you for the contribution @jjerphan! > I think writable can be added as a default-valued parameter to pa.ChunkedArray.to_numpy. Yes, I think that would be useful to add

[GitHub] [arrow] DavisVaughan commented on pull request #35612: GH-35594: [R] Issue with tzdb 0.4.0 and the shipped arrow tz.cpp

2023-05-16 Thread via GitHub
DavisVaughan commented on PR #35612: URL: https://github.com/apache/arrow/pull/35612#issuecomment-1549567386 It seems like almost all of the files in https://github.com/apache/arrow/tree/main/cpp/src/arrow/vendored/datetime are from `` so it might be worth updating all of them rather than j

[GitHub] [arrow] pitrou commented on pull request #35612: GH-35594: [R][C++] Bump vendored date library

2023-05-16 Thread via GitHub
pitrou commented on PR #35612: URL: https://github.com/apache/arrow/pull/35612#issuecomment-1549580741 Indeed, it seems there are a couple other changes to include: ```console $ git diff --stat 2e19c006e2218447ee31f864191859517603f59f cc4685a21e4a4fdae707ad1233c61bbaff241f93 includ

[GitHub] [arrow] jjerphan commented on pull request #35582: GH-34787: [Python] Accept zero_copy_only=False for ChunkedArray.to_numpy

2023-05-16 Thread via GitHub
jjerphan commented on PR #35582: URL: https://github.com/apache/arrow/pull/35582#issuecomment-1549585325 > Thank you for the contribution @jjerphan! > > > I think writable can be added as a default-valued parameter to pa.ChunkedArray.to_numpy. > > Yes, I think that would be use

[GitHub] [arrow-nanoarrow] paleolimbot merged pull request #194: feat(r): Add ArrowArrayStream implementation to support keeping a dependent object in scope

2023-05-16 Thread via GitHub
paleolimbot merged PR #194: URL: https://github.com/apache/arrow-nanoarrow/pull/194

[GitHub] [arrow] raulcd commented on pull request #35555: increase required cmake version in C++ build docs

2023-05-16 Thread via GitHub
raulcd commented on PR #35555: URL: https://github.com/apache/arrow/pull/35555#issuecomment-1549600769 Hi, thanks for the PR. From my understanding you are trying to use presets. On the documentation it says that you can build with presets using CMake `3.21.0` or higher: https://arrow.a

[GitHub] [arrow] raulcd commented on pull request #35549: GH-35438: [Docs] Make corrections to the source docs

2023-05-16 Thread via GitHub
raulcd commented on PR #35549: URL: https://github.com/apache/arrow/pull/35549#issuecomment-1549605434 @github-actions crossbow submit preview-docs

[GitHub] [arrow] github-actions[bot] commented on pull request #35549: GH-35438: [Docs] Make corrections to the source docs

2023-05-16 Thread via GitHub
github-actions[bot] commented on PR #35549: URL: https://github.com/apache/arrow/pull/35549#issuecomment-1549608932 Revision: a71fc25396996c23dee036fd42d14a0e98c2f2f6 Submitted crossbow builds: [ursacomputing/crossbow @ actions-0fa7b0b2e2](https://github.com/ursacomputing/crossbow/bra

[GitHub] [arrow] tdhock commented on issue #34689: [R] R Session Aborted, R encountered a fatal error after write_dataset command

2023-05-16 Thread via GitHub
tdhock commented on issue #34689: URL: https://github.com/apache/arrow/issues/34689#issuecomment-1549616697 yes good idea that is probably more robust than grepping lscpu

[GitHub] [arrow] felipecrv commented on pull request #35605: GH-35571: [C++][CI][Parquet] Change `EQ` to `FLOAT_EQ` in Decryption tests

2023-05-16 Thread via GitHub
felipecrv commented on PR #35605: URL: https://github.com/apache/arrow/pull/35605#issuecomment-1549625441 > There shouldn't be any precision loss simply when copying values. Perhaps the tests are doing something wrong. Could this be tracked in a separate issue and the CI fix merged so

[GitHub] [arrow] felipecrv commented on a diff in pull request #35565: GH-35498: [C++] Relax EnsureAlignment check in Acero from requiring 64-byte aligned buffers to requiring value-aligned buffers

2023-05-16 Thread via GitHub
felipecrv commented on code in PR #35565: URL: https://github.com/apache/arrow/pull/35565#discussion_r1192610272 ## cpp/src/arrow/util/align_util.cc: ## @@ -30,12 +32,120 @@ bool CheckAlignment(const Buffer& buffer, int64_t alignment) { return buffer.address() % alignment ==

[GitHub] [arrow] pitrou commented on pull request #35605: GH-35571: [C++][CI][Parquet] Change `EQ` to `FLOAT_EQ` in Decryption tests

2023-05-16 Thread via GitHub
pitrou commented on PR #35605: URL: https://github.com/apache/arrow/pull/35605#issuecomment-1549671379 Rather than complicating the test routine, I'd rather generate data that doesn't trigger truncation issues (wherever they might happen): ```diff diff --git a/cpp/src/parquet/encryptio

[GitHub] [arrow] thisisnic commented on pull request #35612: GH-35594: [R][C++] Issue with tzdb 0.4.0 and the shipped arrow tz.cpp

2023-05-16 Thread via GitHub
thisisnic commented on PR #35612: URL: https://github.com/apache/arrow/pull/35612#issuecomment-1549674713 I think I've got them all now

[GitHub] [arrow] pitrou merged pull request #35612: GH-35594: [R][C++] Bump vendored date library

2023-05-16 Thread via GitHub
pitrou merged PR #35612: URL: https://github.com/apache/arrow/pull/35612

[GitHub] [arrow-datafusion] kylebrooks-8451 commented on pull request #6183: Faster ListingTable partition listing (#6182)

2023-05-16 Thread via GitHub
kylebrooks-8451 commented on PR #6183: URL: https://github.com/apache/arrow-datafusion/pull/6183#issuecomment-1549730700 Still debugging this, I noticed this error: ``` Error: ObjectStore(Generic { store: "MicrosoftAzure", source: ListRequest { source: Error { retries: 0, message: "re

[GitHub] [arrow] pitrou commented on a diff in pull request #35614: GH-32739: [CI][Docs] Document Docs PR Preview

2023-05-16 Thread via GitHub
pitrou commented on code in PR #35614: URL: https://github.com/apache/arrow/pull/35614#discussion_r1195208773 ## docs/source/developers/documentation.rst: ## @@ -102,6 +102,32 @@ The final output is located under the ``${PWD}/docs`` directory. :ref:`docker-builds`. +Pul

[GitHub] [arrow-datafusion] crepererum commented on pull request #6226: feat: min/max agg for bool

2023-05-16 Thread via GitHub
crepererum commented on PR #6226: URL: https://github.com/apache/arrow-datafusion/pull/6226#issuecomment-1549737138 ```text thread 'datasource::file_format::avro::tests::read_null_binary_alltypes_plain_avro' panicked at 'called `Result::unwrap()` on an `Err` value: Canonicalize { path:

[GitHub] [arrow-nanoarrow] codecov-commenter commented on pull request #195: feat(r): Union array support

2023-05-16 Thread via GitHub
codecov-commenter commented on PR #195: URL: https://github.com/apache/arrow-nanoarrow/pull/195#issuecomment-1549741382 ## [Codecov](https://app.codecov.io/gh/apache/arrow-nanoarrow/pull/195?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_t

[GitHub] [arrow] ursabot commented on pull request #35597: GH-35596: [C++][CI] Improve compilation caching with PCG

2023-05-16 Thread via GitHub
ursabot commented on PR #35597: URL: https://github.com/apache/arrow/pull/35597#issuecomment-1549747918 Benchmark runs are scheduled for baseline = 8be70c137289adba92871555ce74055719172f56 and contender = f6e447944f2a2ab108d5971daf351b7443bc96fb. f6e447944f2a2ab108d5971daf351b7443bc96fb is

[GitHub] [arrow] pitrou commented on pull request #35592: GH-35539: [C++] remove use of _internal.h header file from public header file

2023-05-16 Thread via GitHub
pitrou commented on PR #35592: URL: https://github.com/apache/arrow/pull/35592#issuecomment-1549749734 Thanks @ildipo . I also removed use of `arrow/util/logging.h` which should be avoided in public headers.

[GitHub] [arrow] ursabot commented on pull request #35597: GH-35596: [C++][CI] Improve compilation caching with PCG

2023-05-16 Thread via GitHub
ursabot commented on PR #35597: URL: https://github.com/apache/arrow/pull/35597#issuecomment-1549753233 ['Python', 'R'] benchmarks have high level of regressions. [ursa-i9-9960x](https://conbench.ursa.dev/compare/runs/a72a9403956d4b4fb03657acac94dd3c...f53c84274b474803b2a347516e893e87/)

[GitHub] [arrow-datafusion] tustvold commented on pull request #6183: Faster ListingTable partition listing (#6182)

2023-05-16 Thread via GitHub
tustvold commented on PR #6183: URL: https://github.com/apache/arrow-datafusion/pull/6183#issuecomment-1549754370 > Still debugging this, I noticed this error: Aah, I worried that might happen, we should probably limit the maximum number of concurrent requests when listing the partit

[GitHub] [arrow] pitrou merged pull request #35578: MINOR: [Doc] Update Parquet documentation website links

2023-05-16 Thread via GitHub
pitrou merged PR #35578: URL: https://github.com/apache/arrow/pull/35578

[GitHub] [arrow] pitrou merged pull request #35449: GH-35448: [C++] Fix detection of %z in strptime format

2023-05-16 Thread via GitHub
pitrou merged PR #35449: URL: https://github.com/apache/arrow/pull/35449

[GitHub] [arrow] pitrou commented on pull request #35422: MINOR: [C++] Use [] instead of exception-throwing at(i) in concatenate.cc

2023-05-16 Thread via GitHub
pitrou commented on PR #35422: URL: https://github.com/apache/arrow/pull/35422#issuecomment-1549767142 Can you rebase on git main so as to get a slightly less failing CI? :-)

[GitHub] [arrow] pitrou commented on pull request #35414: GH-35413: [Python] Add concrete floating point array types to pyarrow public API

2023-05-16 Thread via GitHub
pitrou commented on PR #35414: URL: https://github.com/apache/arrow/pull/35414#issuecomment-1549770565 @spenczar Thanks for this. We should probably also change https://github.com/apache/arrow/blob/main/docs/source/python/api/arrays.rst

[GitHub] [arrow] pitrou commented on pull request #35312: GH-35409: [Doc] Clarify S3FileSystem Credentials chain for EC2

2023-05-16 Thread via GitHub
pitrou commented on PR #35312: URL: https://github.com/apache/arrow/pull/35312#issuecomment-1549775031 @jorisvandenbossche Would you like to take another look?

[GitHub] [arrow-datafusion] tustvold opened a new pull request, #6364: Cleanup ExternalSorter metrics (#5885)

2023-05-16 Thread via GitHub
tustvold opened a new pull request, #6364: URL: https://github.com/apache/arrow-datafusion/pull/6364 # Which issue does this PR close? Part of #5885 Part of #5108 # Rationale for this change In preparation for improving the memory accounting in Externa

[GitHub] [arrow-datafusion] tustvold commented on a diff in pull request #6364: Cleanup ExternalSorter metrics (#5885)

2023-05-16 Thread via GitHub
tustvold commented on code in PR #6364: URL: https://github.com/apache/arrow-datafusion/pull/6364#discussion_r1195249166 ## datafusion/core/src/physical_plan/metrics/baseline.rs: ## @@ -43,25 +43,16 @@ use crate::error::Result; /// // when operator is finished: /// baseline_me

[GitHub] [arrow-datafusion] tustvold commented on a diff in pull request #6364: Cleanup ExternalSorter metrics (#5885)

2023-05-16 Thread via GitHub
tustvold commented on code in PR #6364: URL: https://github.com/apache/arrow-datafusion/pull/6364#discussion_r1195251814 ## datafusion/core/src/physical_plan/metrics/composite.rs: ## @@ -1,205 +0,0 @@ -// Licensed to the Apache Software Foundation (ASF) under one -// or more con

[GitHub] [arrow-datafusion] tustvold commented on a diff in pull request #6364: Cleanup ExternalSorter metrics (#5885)

2023-05-16 Thread via GitHub
tustvold commented on code in PR #6364: URL: https://github.com/apache/arrow-datafusion/pull/6364#discussion_r1195254773 ## datafusion/core/src/physical_plan/metrics/tracker.rs: ## @@ -1,104 +0,0 @@ -// Licensed to the Apache Software Foundation (ASF) under one -// or more contr

[GitHub] [arrow] pitrou commented on a diff in pull request #35277: GH-35274: [Java][CI] Enable GCS on MacOS

2023-05-16 Thread via GitHub
pitrou commented on code in PR #35277: URL: https://github.com/apache/arrow/pull/35277#discussion_r1195253695 ## java/Brewfile: ## @@ -17,3 +17,4 @@ brew "openjdk@11" brew "sccache" +brew "curl" Review Comment: Perhaps keep this file in alphabetical order? ##
