[GitHub] [arrow-datafusion] matthewmturner opened a new pull request #1928: Add db benchmark script

2022-03-04 Thread GitBox
matthewmturner opened a new pull request #1928: URL: https://github.com/apache/arrow-datafusion/pull/1928 # Which issue does this PR close? Closes #1870 # Rationale for this change # What changes are included in this PR? # Are there any user-faci

[GitHub] [arrow] github-actions[bot] commented on pull request #12571: RFC: Add inlined data to flight.

2022-03-04 Thread GitBox
github-actions[bot] commented on pull request #12571: URL: https://github.com/apache/arrow/pull/12571#issuecomment-1059708399 Thanks for opening a pull request! If this is not a [minor PR](https://github.com/apache/arrow/blob/master/CONTRIBUTING.md#Minor-Fixes). Could you op

[GitHub] [arrow] ursabot edited a comment on pull request #12490: PARQUET-2131: Number values decoded DCHECKs should be exceptions

2022-03-04 Thread GitBox
ursabot edited a comment on pull request #12490: URL: https://github.com/apache/arrow/pull/12490#issuecomment-1059610557 Benchmark runs are scheduled for baseline = 6734d0faede2030e202aee5a0c7a1ace8eefd2d5 and contender = 4ef95eb89f9202dfcd9017633cf55671d56e337f. 4ef95eb89f9202dfcd9017633

[GitHub] [arrow] emkornfield opened a new pull request #12571: RFC: Add inlined data to flight.

2022-03-04 Thread GitBox
emkornfield opened a new pull request #12571: URL: https://github.com/apache/arrow/pull/12571 Trying to formalize discussion from [mailing list thread](https://lists.apache.org/thread/o1rq3zzxppnvlfwrs9kzf802go44068h0 -- This is an automated message from the Apache Git Service. To res

[GitHub] [arrow] sanjibansg commented on a change in pull request #12560: ARROW-15281: [C++] Implement ability to retrieve fragment filename

2022-03-04 Thread GitBox
sanjibansg commented on a change in pull request #12560: URL: https://github.com/apache/arrow/pull/12560#discussion_r820056983 ## File path: cpp/src/arrow/dataset/scanner.cc ## @@ -877,6 +882,15 @@ Result MakeScanNode(compute::ExecPlan* plan, batch->values.emplace_bac

[GitHub] [arrow] sanjibansg commented on a change in pull request #12560: ARROW-15281: [C++] Implement ability to retrieve fragment filename

2022-03-04 Thread GitBox
sanjibansg commented on a change in pull request #12560: URL: https://github.com/apache/arrow/pull/12560#discussion_r819725083 ## File path: cpp/src/arrow/dataset/scanner.cc ## @@ -877,6 +882,15 @@ Result MakeScanNode(compute::ExecPlan* plan, batch->values.emplace_bac

[GitHub] [arrow] ursabot edited a comment on pull request #12488: PARQUET-2130: Fix crash in debug with non-standard key names.

2022-03-04 Thread GitBox
ursabot edited a comment on pull request #12488: URL: https://github.com/apache/arrow/pull/12488#issuecomment-1059557562 Benchmark runs are scheduled for baseline = 762bb3d64f055db72ebb61ebe0d53a929ea8cd34 and contender = 348057aea798bf612eddcae42495234e5853fd76. 348057aea798bf612eddcae42

[GitHub] [arrow] ursabot edited a comment on pull request #12566: MINOR: [C++][CI] Pin CMake to fix builds failing due to aws-sdk-cpp

2022-03-04 Thread GitBox
ursabot edited a comment on pull request #12566: URL: https://github.com/apache/arrow/pull/12566#issuecomment-1059461824 Benchmark runs are scheduled for baseline = 1b796ec3f9caeb5e86e3348ba940bef8d95915c5 and contender = 9bd952c5b5e685658b9cc285e729a5baa8fc1795. 9bd952c5b5e685658b9cc285e

[GitHub] [arrow-rs] novemberkilo opened a new issue #1399: Parquet reader fails to read null list.

2022-03-04 Thread GitBox
novemberkilo opened a new issue #1399: URL: https://github.com/apache/arrow-rs/issues/1399 **Describe the bug** Parquet reader fails to read a parquet file that is generated from this json: ``` {"emptylist":[]} ``` **To Reproduce** I have written a test on a branch to r

[GitHub] [arrow-datafusion] matthewmturner commented on pull request #1922: Add write_csv to DataFrame

2022-03-04 Thread GitBox
matthewmturner commented on pull request #1922: URL: https://github.com/apache/arrow-datafusion/pull/1922#issuecomment-1059693774 @Jimexist im not opposed, but would the idea be to do this only for methods that are writing? or should this be generalized to all IO methods? Im just wonderin

[GitHub] [arrow-datafusion] Jimexist commented on pull request #1922: Add write_csv to DataFrame

2022-03-04 Thread GitBox
Jimexist commented on pull request #1922: URL: https://github.com/apache/arrow-datafusion/pull/1922#issuecomment-1059692734 Hi, thanks for the contribution. I wonder if we can use an extension trait for the CSV writing method so that users can choose to use the method if they import the
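
The extension-trait idea in the comment above can be sketched roughly as follows. This is only an illustration under stated assumptions: `DataFrame`, `CsvWriteExt`, and `write_csv_string` are hypothetical stand-ins and are not the DataFusion API. The point is that the method is callable only where the trait has been imported.

```rust
// Hypothetical stand-in for a DataFrame type; NOT the DataFusion API.
pub struct DataFrame {
    pub columns: Vec<String>,
    pub rows: Vec<Vec<String>>,
}

// Hypothetical extension trait: the CSV method is only available to
// callers that bring this trait into scope with `use`.
pub trait CsvWriteExt {
    fn write_csv_string(&self) -> String;
}

impl CsvWriteExt for DataFrame {
    // Naive CSV rendering: header line, then one line per row.
    // No quoting or escaping is attempted in this sketch.
    fn write_csv_string(&self) -> String {
        let mut out = self.columns.join(",");
        out.push('\n');
        for row in &self.rows {
            out.push_str(&row.join(","));
            out.push('\n');
        }
        out
    }
}

fn main() {
    let df = DataFrame {
        columns: vec!["a".to_string(), "b".to_string()],
        rows: vec![vec!["1".to_string(), "2".to_string()]],
    };
    print!("{}", df.write_csv_string());
}
```

In a multi-module crate, a caller would need `use path::to::CsvWriteExt;` before `df.write_csv_string()` compiles, which is what would make the method opt-in.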

[GitHub] [arrow] ursabot edited a comment on pull request #12521: ARROW-15795: [Java] Add a getter for the timeZone in timestamp with timezone vectors

2022-03-04 Thread GitBox
ursabot edited a comment on pull request #12521: URL: https://github.com/apache/arrow/pull/12521#issuecomment-1059579237 Benchmark runs are scheduled for baseline = 348057aea798bf612eddcae42495234e5853fd76 and contender = 6734d0faede2030e202aee5a0c7a1ace8eefd2d5. 6734d0faede2030e202aee5a0

[GitHub] [arrow-datafusion] yjshen commented on a change in pull request #1921: Add timeout to can_grow_directly when waiting for the Condvar.

2022-03-04 Thread GitBox
yjshen commented on a change in pull request #1921: URL: https://github.com/apache/arrow-datafusion/pull/1921#discussion_r820040751 ## File path: datafusion/src/execution/memory_manager.rs ## @@ -340,7 +341,13 @@ impl MemoryManager { } else if current < min_per_rqt

[GitHub] [arrow] ursabot edited a comment on pull request #12561: ARROW-15845: [Python][Packaging] Fix macOS wheel builds

2022-03-04 Thread GitBox
ursabot edited a comment on pull request #12561: URL: https://github.com/apache/arrow/pull/12561#issuecomment-1059552115 Benchmark runs are scheduled for baseline = 8fce5938c00953761c4cb79ae1dab793bcc6dfaf and contender = 762bb3d64f055db72ebb61ebe0d53a929ea8cd34. 762bb3d64f055db72ebb61ebe

[GitHub] [arrow-datafusion] matthewmturner commented on pull request #1922: Add write_csv to DataFrame

2022-03-04 Thread GitBox
matthewmturner commented on pull request #1922: URL: https://github.com/apache/arrow-datafusion/pull/1922#issuecomment-1059680356 @alamb thanks! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to

[GitHub] [arrow-datafusion] viirya commented on a change in pull request #1921: Add timeout to can_grow_directly when waiting for the Condvar.

2022-03-04 Thread GitBox
viirya commented on a change in pull request #1921: URL: https://github.com/apache/arrow-datafusion/pull/1921#discussion_r820035491 ## File path: datafusion/src/execution/memory_manager.rs ## @@ -340,7 +341,13 @@ impl MemoryManager { } else if current < min_per_rqt

[GitHub] [arrow-datafusion] viirya commented on a change in pull request #1921: Add timeout to can_grow_directly when waiting for the Condvar.

2022-03-04 Thread GitBox
viirya commented on a change in pull request #1921: URL: https://github.com/apache/arrow-datafusion/pull/1921#discussion_r820035032 ## File path: datafusion/src/execution/memory_manager.rs ## @@ -340,7 +341,13 @@ impl MemoryManager { } else if current < min_per_rqt

[GitHub] [arrow-datafusion] matthewmturner commented on issue #1927: Add content from design docs to datafusion site

2022-03-04 Thread GitBox
matthewmturner commented on issue #1927: URL: https://github.com/apache/arrow-datafusion/issues/1927#issuecomment-1059677044 @alamb @yjshen @houqp @rdettai @Dandandan @yahoNanJing @gaojun2048 If any of you know of other design documents (or know someone else who does) can you pleas

[GitHub] [arrow-datafusion] matthewmturner opened a new issue #1927: Add content from design docs to datafusion site

2022-03-04 Thread GitBox
matthewmturner opened a new issue #1927: URL: https://github.com/apache/arrow-datafusion/issues/1927 **Is your feature request related to a problem or challenge? Please describe what you are trying to do.** A clear and concise description of what the problem is. Ex. I'm always frustrate

[GitHub] [arrow-datafusion] yjshen commented on a change in pull request #1921: Add timeout to can_grow_directly when waiting for the Condvar.

2022-03-04 Thread GitBox
yjshen commented on a change in pull request #1921: URL: https://github.com/apache/arrow-datafusion/pull/1921#discussion_r820028000 ## File path: datafusion/src/execution/memory_manager.rs ## @@ -340,7 +341,13 @@ impl MemoryManager { } else if current < min_per_rqt

[GitHub] [arrow-datafusion] yjshen commented on a change in pull request #1921: Add timeout to can_grow_directly when waiting for the Condvar.

2022-03-04 Thread GitBox
yjshen commented on a change in pull request #1921: URL: https://github.com/apache/arrow-datafusion/pull/1921#discussion_r820027341 ## File path: datafusion/src/execution/memory_manager.rs ## @@ -340,7 +341,13 @@ impl MemoryManager { } else if current < min_per_rqt

[GitHub] [arrow] ursabot edited a comment on pull request #12488: PARQUET-2130: Fix crash in debug with non-standard key names.

2022-03-04 Thread GitBox
ursabot edited a comment on pull request #12488: URL: https://github.com/apache/arrow/pull/12488#issuecomment-1059557562 Benchmark runs are scheduled for baseline = 762bb3d64f055db72ebb61ebe0d53a929ea8cd34 and contender = 348057aea798bf612eddcae42495234e5853fd76. 348057aea798bf612eddcae42

[GitHub] [arrow-datafusion] matthewmturner commented on issue #1652: ARROW2: Performance benchmark

2022-03-04 Thread GitBox
matthewmturner commented on issue #1652: URL: https://github.com/apache/arrow-datafusion/issues/1652#issuecomment-1059660855 checking in on this - has anyone rerun tpch on master and arrow2 branches since the arrow-rs 9 release? -- This is an automated message from the Apache Git Servic

[GitHub] [arrow] ursabot edited a comment on pull request #12569: ARROW-15850: [C++] Engine substrait headers missing from install

2022-03-04 Thread GitBox
ursabot edited a comment on pull request #12569: URL: https://github.com/apache/arrow/pull/12569#issuecomment-1059461834 Benchmark runs are scheduled for baseline = 9bd952c5b5e685658b9cc285e729a5baa8fc1795 and contender = 8fce5938c00953761c4cb79ae1dab793bcc6dfaf. 8fce5938c00953761c4cb79ae

[GitHub] [arrow-rs] codecov-commenter edited a comment on pull request #1389: Introduce `ReadOptions` with builder API, filter row groups that satisfy all filters, and enable filter row groups by rang

2022-03-04 Thread GitBox
codecov-commenter edited a comment on pull request #1389: URL: https://github.com/apache/arrow-rs/pull/1389#issuecomment-1058957663 # [Codecov](https://codecov.io/gh/apache/arrow-rs/pull/1389?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm

[GitHub] [arrow] GavinRay97 opened a new issue #12570: Arrow nightly Maven releases don't seem to work

2022-03-04 Thread GitBox
GavinRay97 opened a new issue #12570: URL: https://github.com/apache/arrow/issues/12570 Following the instructions listed here: - https://github.com/apache/arrow/blob/650f111b524fb1c5bfbfa6f533d15929c90ddc40/docs/source/java/install.rst#installing-nightly-packages I get the follow

[GitHub] [arrow] ursabot edited a comment on pull request #12561: ARROW-15845: [Python][Packaging] Fix macOS wheel builds

2022-03-04 Thread GitBox
ursabot edited a comment on pull request #12561: URL: https://github.com/apache/arrow/pull/12561#issuecomment-1059552115 Benchmark runs are scheduled for baseline = 8fce5938c00953761c4cb79ae1dab793bcc6dfaf and contender = 762bb3d64f055db72ebb61ebe0d53a929ea8cd34. 762bb3d64f055db72ebb61ebe

[GitHub] [arrow] ursabot edited a comment on pull request #12533: ARROW-15817: [R] Use TableSourceNode instead of InMemoryDataset

2022-03-04 Thread GitBox
ursabot edited a comment on pull request #12533: URL: https://github.com/apache/arrow/pull/12533#issuecomment-1059281689 Benchmark runs are scheduled for baseline = f5a0cafe159eb1fe8000d8db6bc3b683ee83bbf8 and contender = 1b796ec3f9caeb5e86e3348ba940bef8d95915c5. 1b796ec3f9caeb5e86e3348ba

[GitHub] [arrow] ursabot edited a comment on pull request #12490: PARQUET-2131: Number values decoded DCHECKs should be exceptions

2022-03-04 Thread GitBox
ursabot edited a comment on pull request #12490: URL: https://github.com/apache/arrow/pull/12490#issuecomment-1059610557 Benchmark runs are scheduled for baseline = 6734d0faede2030e202aee5a0c7a1ace8eefd2d5 and contender = 4ef95eb89f9202dfcd9017633cf55671d56e337f. 4ef95eb89f9202dfcd9017633

[GitHub] [arrow] ursabot commented on pull request #12490: PARQUET-2131: Number values decoded DCHECKs should be exceptions

2022-03-04 Thread GitBox
ursabot commented on pull request #12490: URL: https://github.com/apache/arrow/pull/12490#issuecomment-1059610557 Benchmark runs are scheduled for baseline = 6734d0faede2030e202aee5a0c7a1ace8eefd2d5 and contender = 4ef95eb89f9202dfcd9017633cf55671d56e337f. 4ef95eb89f9202dfcd9017633cf55671

[GitHub] [arrow-rs] lidavidm commented on issue #1398: Integration Test is failing on master branch

2022-03-04 Thread GitBox
lidavidm commented on issue #1398: URL: https://github.com/apache/arrow-rs/issues/1398#issuecomment-1059610103 I believe the :scheme pseudo-header is supposed to be 'http' or 'https', not 'grpc': https://grpc.github.io/grpc/cpp/md_doc__p_r_o_t_o_c_o_l-_h_t_t_p2.html It sounds like To
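
The point about the `:scheme` pseudo-header can be illustrated with a small check. This is a minimal sketch, not code from tonic, hyper, or Arrow: `is_valid_grpc_scheme` is a hypothetical helper, and the exact lowercase comparison is an assumption made for simplicity.

```rust
/// Hypothetical helper: returns true if `scheme` is one of the values the
/// gRPC-over-HTTP/2 protocol document lists for the `:scheme` pseudo-header.
/// Assumption: compare against the exact lowercase strings.
fn is_valid_grpc_scheme(scheme: &str) -> bool {
    matches!(scheme, "http" | "https")
}

fn main() {
    // "grpc" is the value the comment above says is not allowed.
    for s in ["http", "https", "grpc"] {
        println!("{}: {}", s, is_valid_grpc_scheme(s));
    }
}
```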

[GitHub] [arrow] emkornfield closed pull request #12490: PARQUET-2131: Number values decoded DCHECKs should be exceptions

2022-03-04 Thread GitBox
emkornfield closed pull request #12490: URL: https://github.com/apache/arrow/pull/12490 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-uns

[GitHub] [arrow] emkornfield commented on pull request #12490: PARQUET-2131: Number values decoded DCHECKs should be exceptions

2022-03-04 Thread GitBox
emkornfield commented on pull request #12490: URL: https://github.com/apache/arrow/pull/12490#issuecomment-1059609543 +1, failures look unrelated. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to

[GitHub] [arrow] ursabot edited a comment on pull request #12569: ARROW-15850: [C++] Engine substrait headers missing from install

2022-03-04 Thread GitBox
ursabot edited a comment on pull request #12569: URL: https://github.com/apache/arrow/pull/12569#issuecomment-1059461834 Benchmark runs are scheduled for baseline = 9bd952c5b5e685658b9cc285e729a5baa8fc1795 and contender = 8fce5938c00953761c4cb79ae1dab793bcc6dfaf. 8fce5938c00953761c4cb79ae

[GitHub] [arrow-rs] alamb commented on pull request #1383: Add write method to Json Writer

2022-03-04 Thread GitBox
alamb commented on pull request #1383: URL: https://github.com/apache/arrow-rs/pull/1383#issuecomment-1059597919 That is interesting -- @chmp -- I tried to do something similar with IOx (to create RecordBatches from vec's of objects we already had and then expose them as SQL queryable tab

[GitHub] [arrow-rs] alamb commented on issue #1398: Integration Test is failing on master branch

2022-03-04 Thread GitBox
alamb commented on issue #1398: URL: https://github.com/apache/arrow-rs/issues/1398#issuecomment-1059594658 Even when I use arrow at https://github.com/apache/arrow/commit/e314d8d0d611c7f9ca7f2fbee174fcea3d0c66f2 which had a passing [Integration test](https://github.com/apache/arrow/runs/

[GitHub] [arrow-datafusion] yahoNanJing commented on a change in pull request #1912: Refactor the event channel

2022-03-04 Thread GitBox
yahoNanJing commented on a change in pull request #1912: URL: https://github.com/apache/arrow-datafusion/pull/1912#discussion_r819974810 ## File path: ballista/rust/scheduler/src/scheduler_server/mod.rs ## @@ -0,0 +1,142 @@ +// Licensed to the Apache Software Foundation (ASF) u

[GitHub] [arrow-datafusion] yahoNanJing commented on a change in pull request #1912: Refactor the event channel

2022-03-04 Thread GitBox
yahoNanJing commented on a change in pull request #1912: URL: https://github.com/apache/arrow-datafusion/pull/1912#discussion_r819974010 ## File path: ballista/rust/core/src/event_loop.rs ## @@ -0,0 +1,128 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or

[GitHub] [arrow-rs] alamb edited a comment on issue #1398: Integration Test is failing on master branch

2022-03-04 Thread GitBox
alamb edited a comment on issue #1398: URL: https://github.com/apache/arrow-rs/issues/1398#issuecomment-1059592336 I tried backing out https://github.com/apache/arrow/commit/2462492389a8f2ca286c481852c84ba1f0d0eff9 (the first build that failed in arrow) from my local environment and it did

[GitHub] [arrow-rs] alamb commented on issue #1398: Integration Test is failing on master branch

2022-03-04 Thread GitBox
alamb commented on issue #1398: URL: https://github.com/apache/arrow-rs/issues/1398#issuecomment-1059592336 I tried backing out https://github.com/apache/arrow/commit/2462492389a8f2ca286c481852c84ba1f0d0eff9 from my local environment and it did not help -- This is an automated message f

[GitHub] [arrow-rs] alamb edited a comment on issue #1398: Integration Test is failing on master branch

2022-03-04 Thread GitBox
alamb edited a comment on issue #1398: URL: https://github.com/apache/arrow-rs/issues/1398#issuecomment-1059589935 @matthewmturner -- I agree with your assessment. Interestingly, the integration tests appear to fail in https://github.com/apache/arrow/ as well with the same errors:

[GitHub] [arrow-datafusion] yahoNanJing commented on a change in pull request #1913: Refactor scheduler state mod

2022-03-04 Thread GitBox
yahoNanJing commented on a change in pull request #1913: URL: https://github.com/apache/arrow-datafusion/pull/1913#discussion_r819972373 ## File path: ballista/rust/scheduler/src/state/persistent_state.rs ## @@ -0,0 +1,312 @@ +// Licensed to the Apache Software Foundation (ASF)

[GitHub] [arrow-rs] alamb commented on issue #1398: Integration Test is failing on master branch

2022-03-04 Thread GitBox
alamb commented on issue #1398: URL: https://github.com/apache/arrow-rs/issues/1398#issuecomment-1059589935 @matthewmturner -- I agree with your assessment. Interestingly, the integration tests appear to fail in apache as well with the same errors: https://github.com/apache/arro

[GitHub] [arrow] jonkeane commented on a change in pull request #12319: ARROW-14199 [R] bindings for format (where possible)

2022-03-04 Thread GitBox
jonkeane commented on a change in pull request #12319: URL: https://github.com/apache/arrow/pull/12319#discussion_r819969875 ## File path: r/tests/testthat/test-dplyr-funcs-type.R ## @@ -843,3 +843,90 @@ test_that("as.Date() converts successfully from date, timestamp, integer,

[GitHub] [arrow] ursabot edited a comment on pull request #12521: ARROW-15795: [Java] Add a getter for the timeZone in timestamp with timezone vectors

2022-03-04 Thread GitBox
ursabot edited a comment on pull request #12521: URL: https://github.com/apache/arrow/pull/12521#issuecomment-1059579237 Benchmark runs are scheduled for baseline = 348057aea798bf612eddcae42495234e5853fd76 and contender = 6734d0faede2030e202aee5a0c7a1ace8eefd2d5. 6734d0faede2030e202aee5a0

[GitHub] [arrow-rs] matthewmturner commented on issue #1398: Integration Test is failing on master branch

2022-03-04 Thread GitBox
matthewmturner commented on issue #1398: URL: https://github.com/apache/arrow-rs/issues/1398#issuecomment-1059587436 apologize if ive misunderstood any part of this or if i dont understand full integration testing process - but given that both arrow and arrow2 started failing i am focusin

[GitHub] [arrow] dragosmg commented on a change in pull request #12482: ARROW-15701 [R] month() should allow integer inputs

2022-03-04 Thread GitBox
dragosmg commented on a change in pull request #12482: URL: https://github.com/apache/arrow/pull/12482#discussion_r819967589 ## File path: r/R/dplyr-funcs-datetime.R ## @@ -105,7 +105,22 @@ register_bindings_datetime <- function() { (call_binding("yday", x) - 1) %/% 7 + 1

[GitHub] [arrow] jonkeane commented on a change in pull request #12482: ARROW-15701 [R] month() should allow integer inputs

2022-03-04 Thread GitBox
jonkeane commented on a change in pull request #12482: URL: https://github.com/apache/arrow/pull/12482#discussion_r819959724 ## File path: r/R/dplyr-funcs-datetime.R ## @@ -105,7 +105,22 @@ register_bindings_datetime <- function() { (call_binding("yday", x) - 1) %/% 7 + 1

[GitHub] [arrow] ursabot commented on pull request #12521: ARROW-15795: [Java] Add a getter for the timeZone in timestamp with timezone vectors

2022-03-04 Thread GitBox
ursabot commented on pull request #12521: URL: https://github.com/apache/arrow/pull/12521#issuecomment-1059579237 Benchmark runs are scheduled for baseline = 348057aea798bf612eddcae42495234e5853fd76 and contender = 6734d0faede2030e202aee5a0c7a1ace8eefd2d5. 6734d0faede2030e202aee5a0c7a1ace

[GitHub] [arrow] BryanCutler closed pull request #12521: ARROW-15795: [Java] Add a getter for the timeZone in timestamp with timezone vectors

2022-03-04 Thread GitBox
BryanCutler closed pull request #12521: URL: https://github.com/apache/arrow/pull/12521 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-uns

[GitHub] [arrow] BryanCutler commented on pull request #12521: ARROW-15795: [Java] Add a getter for the timeZone in timestamp with timezone vectors

2022-03-04 Thread GitBox
BryanCutler commented on pull request #12521: URL: https://github.com/apache/arrow/pull/12521#issuecomment-1059577150 Test failures look unrelated, merging to master. Thanks @fabiencelier ! -- This is an automated message from the Apache Git Service. To respond to the message, please log

[GitHub] [arrow] ursabot edited a comment on pull request #12488: PARQUET-2130: Fix crash in debug with non-standard key names.

2022-03-04 Thread GitBox
ursabot edited a comment on pull request #12488: URL: https://github.com/apache/arrow/pull/12488#issuecomment-1059557562 Benchmark runs are scheduled for baseline = 762bb3d64f055db72ebb61ebe0d53a929ea8cd34 and contender = 348057aea798bf612eddcae42495234e5853fd76. 348057aea798bf612eddcae42

[GitHub] [arrow] ursabot edited a comment on pull request #12523: ARROW-15743: [R] `skip` not connected up to `skip_rows` on open_dataset despite error messages indicating otherwise

2022-03-04 Thread GitBox
ursabot edited a comment on pull request #12523: URL: https://github.com/apache/arrow/pull/12523#issuecomment-1059265226 Benchmark runs are scheduled for baseline = 28b77253d77bc53653934f70dd05eeb747f8707f and contender = f5a0cafe159eb1fe8000d8db6bc3b683ee83bbf8. f5a0cafe159eb1fe8000d8db6

[GitHub] [arrow] kszucs commented on pull request #12320: ARROW-15483: [Release] Revamp the verification scripts

2022-03-04 Thread GitBox
kszucs commented on pull request #12320: URL: https://github.com/apache/arrow/pull/12320#issuecomment-1059567844 > +1 > > Can we merge this? If you think that the changes are okay then yes :) -- This is an automated message from the Apache Git Service. To respond to

[GitHub] [arrow-rs] alamb commented on issue #1398: Integration Test is failing on master branch

2022-03-04 Thread GitBox
alamb commented on issue #1398: URL: https://github.com/apache/arrow-rs/issues/1398#issuecomment-1059563416 The output from rust is: ``` Error: Status { code: Unknown, message: "transport error", source: Some(tonic::transport::Error(Transport, hyper::Error(Http2, Error { kind: Re

[GitHub] [arrow] emkornfield commented on pull request #12490: PARQUET-2131: Number values decoded DCHECKs should be exceptions

2022-03-04 Thread GitBox
emkornfield commented on pull request #12490: URL: https://github.com/apache/arrow/pull/12490#issuecomment-1059563112 Will merge on green CI. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[GitHub] [arrow] ursabot edited a comment on pull request #12561: ARROW-15845: [Python][Packaging] Fix macOS wheel builds

2022-03-04 Thread GitBox
ursabot edited a comment on pull request #12561: URL: https://github.com/apache/arrow/pull/12561#issuecomment-1059552115 Benchmark runs are scheduled for baseline = 8fce5938c00953761c4cb79ae1dab793bcc6dfaf and contender = 762bb3d64f055db72ebb61ebe0d53a929ea8cd34. 762bb3d64f055db72ebb61ebe

[GitHub] [arrow] ursabot edited a comment on pull request #12559: ARROW-15844: [Release][Packaging] Use ASCII format for detached sign

2022-03-04 Thread GitBox
ursabot edited a comment on pull request #12559: URL: https://github.com/apache/arrow/pull/12559#issuecomment-1059115617 Benchmark runs are scheduled for baseline = b3ede5a65f87ba33ab5a5def096f9cc26407cfc9 and contender = 28b77253d77bc53653934f70dd05eeb747f8707f. 28b77253d77bc53653934f70d

[GitHub] [arrow] emkornfield commented on pull request #11982: ARROW-15313: [C++][Java][FlightRPC] Implement type info method to flight-sql

2022-03-04 Thread GitBox
emkornfield commented on pull request #11982: URL: https://github.com/apache/arrow/pull/11982#issuecomment-1059558663 > Hi @emkornfield, I've addressed all your comments. Is there any other concerns from your part? No, I think my concerns were addressed. -- This is an automated me

[GitHub] [arrow] ursabot commented on pull request #12488: PARQUET-2130: Fix crash in debug with non-standard key names.

2022-03-04 Thread GitBox
ursabot commented on pull request #12488: URL: https://github.com/apache/arrow/pull/12488#issuecomment-1059557562 Benchmark runs are scheduled for baseline = 762bb3d64f055db72ebb61ebe0d53a929ea8cd34 and contender = 348057aea798bf612eddcae42495234e5853fd76. 348057aea798bf612eddcae42495234e

[GitHub] [arrow] emkornfield closed pull request #12488: PARQUET-2130: Fix crash in debug with non-standard key names.

2022-03-04 Thread GitBox
emkornfield closed pull request #12488: URL: https://github.com/apache/arrow/pull/12488 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-uns

[GitHub] [arrow-rs] alamb edited a comment on issue #1398: Integration Test is failing on master branch

2022-03-04 Thread GitBox
alamb edited a comment on issue #1398: URL: https://github.com/apache/arrow-rs/issues/1398#issuecomment-1059551674 FYI @jorgecarleitao I have been able to reproduce this locally. Here are some notes I have in case that is helpful: ``` # check out arrow # insta

[GitHub] [arrow] ursabot commented on pull request #12561: ARROW-15845: [Python][Packaging] Fix macOS wheel builds

2022-03-04 Thread GitBox
ursabot commented on pull request #12561: URL: https://github.com/apache/arrow/pull/12561#issuecomment-1059552115 Benchmark runs are scheduled for baseline = 8fce5938c00953761c4cb79ae1dab793bcc6dfaf and contender = 762bb3d64f055db72ebb61ebe0d53a929ea8cd34. 762bb3d64f055db72ebb61ebe0d53a92

[GitHub] [arrow-rs] alamb commented on issue #1398: Integration Test is failing on master branch

2022-03-04 Thread GitBox
alamb commented on issue #1398: URL: https://github.com/apache/arrow-rs/issues/1398#issuecomment-1059551674 FYI @jorgecarleitao I have been able to reproduce this locally. Here are some notes I have in case that is helpful: ``` # check out arrow # install arch

[GitHub] [arrow-rs] chmp commented on pull request #1383: Add write method to Json Writer

2022-03-04 Thread GitBox
chmp commented on pull request #1383: URL: https://github.com/apache/arrow-rs/pull/1383#issuecomment-1059548659 Sure happy to sum up `serde_arrow`. First: at the moment it's only an experiment. I found it useful for some private data processing, and thought maybe it's also helpful to othe

[GitHub] [arrow] ursabot edited a comment on pull request #12566: MINOR: [C++][CI] Pin CMake to fix builds failing due to aws-sdk-cpp

2022-03-04 Thread GitBox
ursabot edited a comment on pull request #12566: URL: https://github.com/apache/arrow/pull/12566#issuecomment-1059461824 Benchmark runs are scheduled for baseline = 1b796ec3f9caeb5e86e3348ba940bef8d95915c5 and contender = 9bd952c5b5e685658b9cc285e729a5baa8fc1795. 9bd952c5b5e685658b9cc285e

[GitHub] [arrow] kou closed pull request #12561: ARROW-15845: [Python][Packaging] Fix macOS wheel builds

2022-03-04 Thread GitBox
kou closed pull request #12561: URL: https://github.com/apache/arrow/pull/12561 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...

[GitHub] [arrow-datafusion] alamb commented on pull request #1922: Add write_csv to DataFrame

2022-03-04 Thread GitBox
alamb commented on pull request #1922: URL: https://github.com/apache/arrow-datafusion/pull/1922#issuecomment-1059542713 > I took the logic for writing csvs and put it in new function plan_to_csv within src/physical_plan/file_format/csv and moved testing there. Then just used that functi

[GitHub] [arrow-julia] codecov-commenter commented on pull request #296: Solving #295

2022-03-04 Thread GitBox
codecov-commenter commented on pull request #296: URL: https://github.com/apache/arrow-julia/pull/296#issuecomment-1059541727 # [Codecov](https://codecov.io/gh/apache/arrow-julia/pull/296?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_ter

[GitHub] [arrow] ursabot edited a comment on pull request #12566: MINOR: [C++][CI] Pin CMake to fix builds failing due to aws-sdk-cpp

2022-03-04 Thread GitBox
ursabot edited a comment on pull request #12566: URL: https://github.com/apache/arrow/pull/12566#issuecomment-1059461824 Benchmark runs are scheduled for baseline = 1b796ec3f9caeb5e86e3348ba940bef8d95915c5 and contender = 9bd952c5b5e685658b9cc285e729a5baa8fc1795. 9bd952c5b5e685658b9cc285e

[GitHub] [arrow-datafusion] realno commented on issue #1870: Add a script for running full db-benchmark suite

2022-03-04 Thread GitBox
realno commented on issue #1870: URL: https://github.com/apache/arrow-datafusion/issues/1870#issuecomment-1059526470 @matthewmturner I like the idea using docker, it should be easier for dependency management. -- This is an automated message from the Apache Git Service. To respond to th

[GitHub] [arrow] kou commented on pull request #12320: ARROW-15483: [Release] Revamp the verification scripts

2022-03-04 Thread GitBox
kou commented on pull request #12320: URL: https://github.com/apache/arrow/pull/12320#issuecomment-1059523565 +1 Can we merge this? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the spe

[GitHub] [arrow-datafusion] thinkharderdev commented on a change in pull request #1913: Refactor scheduler state mod

2022-03-04 Thread GitBox
thinkharderdev commented on a change in pull request #1913: URL: https://github.com/apache/arrow-datafusion/pull/1913#discussion_r819907547 ## File path: ballista/rust/scheduler/src/state/persistent_state.rs ## @@ -0,0 +1,312 @@ +// Licensed to the Apache Software Foundation (A

[GitHub] [arrow-datafusion] matthewmturner commented on issue #1870: Add a script for running full db-benchmark suite

2022-03-04 Thread GitBox
matthewmturner commented on issue #1870: URL: https://github.com/apache/arrow-datafusion/issues/1870#issuecomment-1059518563 I just keep forgetting people. @realno I know you had expressed interest in benchmarking in the past as well, so I'm curious if you have a preference on the abo

[GitHub] [arrow-datafusion] matthewmturner edited a comment on issue #1870: Add a script for running db-benchmark

2022-03-04 Thread GitBox
matthewmturner edited a comment on issue #1870: URL: https://github.com/apache/arrow-datafusion/issues/1870#issuecomment-1058821000 Given that db-benchmark uses R scripts to download the source data, I'm wondering if I should assume that the user will take care of that or if I should wrap t

[GitHub] [arrow] jorgecarleitao commented on a change in pull request #12409: ARROW-15668: Simplified skip logic in integration tests

2022-03-04 Thread GitBox
jorgecarleitao commented on a change in pull request #12409: URL: https://github.com/apache/arrow/pull/12409#discussion_r819901884 ## File path: dev/archery/archery/integration/runner.py ## @@ -129,6 +136,11 @@ def _gold_tests(self, gold_dir): skip = set()

[GitHub] [arrow] jorgecarleitao commented on a change in pull request #12409: ARROW-15668: Simplified skip logic in integration tests

2022-03-04 Thread GitBox
jorgecarleitao commented on a change in pull request #12409: URL: https://github.com/apache/arrow/pull/12409#discussion_r819900978 ## File path: dev/.gitignore ## @@ -17,4 +17,5 @@ # Python virtual environments for dev tools .venv*/ +venv/ Review comment: both are c

[GitHub] [arrow] jorgecarleitao commented on a change in pull request #12409: ARROW-15668: Simplified skip logic in integration tests

2022-03-04 Thread GitBox
jorgecarleitao commented on a change in pull request #12409: URL: https://github.com/apache/arrow/pull/12409#discussion_r819900424 ## File path: dev/archery/archery/integration/languages/cpp.yaml ## @@ -0,0 +1,19 @@ +# Licensed to the Apache Software Foundation (ASF) under one

[GitHub] [arrow] ursabot edited a comment on pull request #12569: ARROW-15850: [C++] Engine substrait headers missing from install

2022-03-04 Thread GitBox
ursabot edited a comment on pull request #12569: URL: https://github.com/apache/arrow/pull/12569#issuecomment-1059461834 Benchmark runs are scheduled for baseline = 9bd952c5b5e685658b9cc285e729a5baa8fc1795 and contender = 8fce5938c00953761c4cb79ae1dab793bcc6dfaf. 8fce5938c00953761c4cb79ae

[GitHub] [arrow-rs] jorgecarleitao commented on issue #1398: Integration Test is failing on master branch

2022-03-04 Thread GitBox
jorgecarleitao commented on issue #1398: URL: https://github.com/apache/arrow-rs/issues/1398#issuecomment-1059503668 fwiw arrow2 is also failing on those two tests and we did not change anything in tonic, flight, IPC etc that could have caused this. I am investigating it as well.

[GitHub] [arrow] paleolimbot commented on pull request #12564: ARROW-15818: [R] Implement initial Substrait consumer in the R bindings

2022-03-04 Thread GitBox
paleolimbot commented on pull request #12564: URL: https://github.com/apache/arrow/pull/12564#issuecomment-1059487075 Whee...this works! (Although until ARROW-15849 is merged it's necessary to copy the arrow/engine/substrait/ headers manually...). It currently just prints the output...befo

[GitHub] [arrow] ursabot edited a comment on pull request #12566: MINOR: [C++][CI] Pin CMake to fix builds failing due to aws-sdk-cpp

2022-03-04 Thread GitBox
ursabot edited a comment on pull request #12566: URL: https://github.com/apache/arrow/pull/12566#issuecomment-1059461824 Benchmark runs are scheduled for baseline = 1b796ec3f9caeb5e86e3348ba940bef8d95915c5 and contender = 9bd952c5b5e685658b9cc285e729a5baa8fc1795. 9bd952c5b5e685658b9cc285e

[GitHub] [arrow] amol- commented on pull request #12452: ARROW-14292: [C++][Python] Join foundation for Tables

2022-03-04 Thread GitBox
amol- commented on pull request #12452: URL: https://github.com/apache/arrow/pull/12452#issuecomment-1059480794 There seem to be a bunch of problems with CI unrelated to the PR, there are failures related to ``` Error: getCacheEntry failed: connect ETIMEDOUT 13.107.42.16:443 ```

[GitHub] [arrow-datafusion] thinkharderdev commented on a change in pull request #1912: Refactor the event channel

2022-03-04 Thread GitBox
thinkharderdev commented on a change in pull request #1912: URL: https://github.com/apache/arrow-datafusion/pull/1912#discussion_r819833301 ## File path: ballista/rust/core/src/event_loop.rs ## @@ -0,0 +1,128 @@ +// Licensed to the Apache Software Foundation (ASF) under one +//

[GitHub] [arrow] nealrichardson commented on issue #12568: Support for automatic optimizations

2022-03-04 Thread GitBox
nealrichardson commented on issue #12568: URL: https://github.com/apache/arrow/issues/12568#issuecomment-1059473984 The arrow C++ engine supports predicate pushdown. The R package wraps that and supports calling it with dplyr methods, which push down. The pandas comparison in the duckdb bl

[GitHub] [arrow-rs] alamb edited a comment on pull request #1385: Move Parser trait and its implementations to utils module

2022-03-04 Thread GitBox
alamb edited a comment on pull request #1385: URL: https://github.com/apache/arrow-rs/pull/1385#issuecomment-1059472903 Thanks @sum12 -- the integration test is failing due to #1398 (not changes in this PR) -- I am trying to figure out what is happening

[GitHub] [arrow-rs] alamb commented on pull request #1385: Move Parser trait and its implementations to utils module

2022-03-04 Thread GitBox
alamb commented on pull request #1385: URL: https://github.com/apache/arrow-rs/pull/1385#issuecomment-1059472903 Thanks @sum12 -- the integration test is failing due to #1398 -- I am trying to figure out what is happening

[GitHub] [arrow-datafusion] alamb commented on pull request #1881: add udf/udaf plugin

2022-03-04 Thread GitBox
alamb commented on pull request #1881: URL: https://github.com/apache/arrow-datafusion/pull/1881#issuecomment-1059471984 Thank you @gaojun2048 for taking the feedback and making changes. I appreciate that building consensus took some non-trivial effort in this case.

[GitHub] [arrow-rs] yordan-pavlov commented on a change in pull request #1389: Introduce `ReadOptions` with builder API, filter row groups that satisfy all filters, and enable filter row groups by ran

2022-03-04 Thread GitBox
yordan-pavlov commented on a change in pull request #1389: URL: https://github.com/apache/arrow-rs/pull/1389#discussion_r819869426 ## File path: parquet/src/file/serialized_reader.rs ## @@ -138,25 +188,38 @@ impl SerializedFileReader { }) } -/// Filters row

[GitHub] [arrow-datafusion] alamb merged pull request #1863: Enhance MemorySchemaProvider to support `register_listing_table`

2022-03-04 Thread GitBox
alamb merged pull request #1863: URL: https://github.com/apache/arrow-datafusion/pull/1863

[GitHub] [arrow-datafusion] alamb closed issue #1836: Register multiple tables into `ExecutionContext` at once

2022-03-04 Thread GitBox
alamb closed issue #1836: URL: https://github.com/apache/arrow-datafusion/issues/1836

[GitHub] [arrow-datafusion] alamb merged pull request #1831: determine build side in hash join by `total_byte_size` instead of `num_rows`

2022-03-04 Thread GitBox
alamb merged pull request #1831: URL: https://github.com/apache/arrow-datafusion/pull/1831

[GitHub] [arrow-rs] alamb commented on pull request #1383: Add write method to Json Writer

2022-03-04 Thread GitBox
alamb commented on pull request #1383: URL: https://github.com/apache/arrow-rs/pull/1383#issuecomment-1059468873 Hi @v1gnesh I am not familiar with those crates. This is probably a question better directed at the authors of them (and if there is anything they wanted to contribute back to

[GitHub] [arrow-rs] yordan-pavlov commented on a change in pull request #1389: Introduce `ReadOptions` with builder API, filter row groups that satisfy all filters, and enable filter row groups by ran

2022-03-04 Thread GitBox
yordan-pavlov commented on a change in pull request #1389: URL: https://github.com/apache/arrow-rs/pull/1389#discussion_r819865424 ## File path: parquet/src/file/serialized_reader.rs ## @@ -127,6 +127,56 @@ pub struct SerializedFileReader { metadata: ParquetMetaData, }

[GitHub] [arrow-rs] yordan-pavlov commented on a change in pull request #1389: Introduce `ReadOptions` with builder API, filter row groups that satisfy all filters, and enable filter row groups by ran

2022-03-04 Thread GitBox
yordan-pavlov commented on a change in pull request #1389: URL: https://github.com/apache/arrow-rs/pull/1389#discussion_r819864343 ## File path: parquet/src/file/serialized_reader.rs ## @@ -127,6 +127,56 @@ pub struct SerializedFileReader { metadata: ParquetMetaData, }

[GitHub] [arrow] lidavidm commented on pull request #12534: ARROW-15178: [Java][Docs] Java Tutorial: Developer Docs for Java

2022-03-04 Thread GitBox
lidavidm commented on pull request #12534: URL: https://github.com/apache/arrow/pull/12534#issuecomment-1059462094 Also, if you rebase, CI should be fixed.

[GitHub] [arrow] ursabot commented on pull request #12569: ARROW-15850: [C++] Engine substrait headers missing from install

2022-03-04 Thread GitBox
ursabot commented on pull request #12569: URL: https://github.com/apache/arrow/pull/12569#issuecomment-1059461834 Benchmark runs are scheduled for baseline = 9bd952c5b5e685658b9cc285e729a5baa8fc1795 and contender = 8fce5938c00953761c4cb79ae1dab793bcc6dfaf. 8fce5938c00953761c4cb79ae1dab793

[GitHub] [arrow] lidavidm commented on a change in pull request #12534: ARROW-15178: [Java][Docs] Java Tutorial: Developer Docs for Java

2022-03-04 Thread GitBox
lidavidm commented on a change in pull request #12534: URL: https://github.com/apache/arrow/pull/12534#discussion_r819859939 ## File path: docs/source/developers/java/building.rst ## @@ -0,0 +1,231 @@ +.. Licensed to the Apache Software Foundation (ASF) under one +.. or more co
