[GitHub] [arrow] cyb70289 commented on a change in pull request #12438: ARROW-15688: [C++] add_checked doesn't error out on duration overflow

2022-02-15 Thread GitBox
cyb70289 commented on a change in pull request #12438: URL: https://github.com/apache/arrow/pull/12438#discussion_r807642118 ## File path: cpp/src/arrow/compute/kernels/scalar_temporal_test.cc ## @@ -28,13 +28,25 @@ #include "arrow/util/checked_cast.h" #include "arrow/util/fo

[GitHub] [arrow-datafusion] yahoNanJing commented on pull request #1810: Refactor scheduler state with different management policy for volatile and stable states

2022-02-15 Thread GitBox
yahoNanJing commented on pull request #1810: URL: https://github.com/apache/arrow-datafusion/pull/1810#issuecomment-1041197277 Hi @realno, could you help review this error handling commit? -- This is an automated message from the Apache Git Service. To respond to the message, please log

[GitHub] [arrow-datafusion] Ted-Jiang commented on issue #1840: Ballista standalone mode tests fail: `context::tests::test_task_stuck_when_referenced_task_failed`

2022-02-15 Thread GitBox
Ted-Jiang commented on issue #1840: URL: https://github.com/apache/arrow-datafusion/issues/1840#issuecomment-1041196736 I reproduced this locally. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to

[GitHub] [arrow-datafusion] yahoNanJing commented on a change in pull request #1810: Refactor scheduler state with different management policy for volatile and stable states

2022-02-15 Thread GitBox
yahoNanJing commented on a change in pull request #1810: URL: https://github.com/apache/arrow-datafusion/pull/1810#discussion_r807623672 ## File path: ballista/rust/scheduler/src/state/mod.rs ## @@ -100,234 +267,464 @@ impl SchedulerState Result> { -let mut result = vec

[GitHub] [arrow-datafusion] yahoNanJing commented on a change in pull request #1810: Refactor scheduler state with different management policy for volatile and stable states

2022-02-15 Thread GitBox
yahoNanJing commented on a change in pull request #1810: URL: https://github.com/apache/arrow-datafusion/pull/1810#discussion_r807622791 ## File path: ballista/rust/scheduler/src/lib.rs ## @@ -606,120 +589,65 @@ impl SchedulerGrpc } } -async fn send_heart_be

[GitHub] [arrow] ursabot edited a comment on pull request #12144: ARROW-13594: [CI] Enable nightly turbodbc builds again

2022-02-15 Thread GitBox
ursabot edited a comment on pull request #12144: URL: https://github.com/apache/arrow/pull/12144#issuecomment-1040468870 Benchmark runs are scheduled for baseline = 26d6e6217ff79451a3fe366bcc88293c7ae67417 and contender = e00cf8397a73efd9eed5926d2daaa06222028c0e. e00cf8397a73efd9eed5926d2

[GitHub] [arrow-datafusion] Dandandan closed issue #1651: Panic/dropped data when reading parquet files with incompatible schemas

2022-02-15 Thread GitBox
Dandandan closed issue #1651: URL: https://github.com/apache/arrow-datafusion/issues/1651 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-u

[GitHub] [arrow-datafusion] Dandandan merged pull request #1837: Return `Error` when parquet reader fails rather than no data with `println!`

2022-02-15 Thread GitBox
Dandandan merged pull request #1837: URL: https://github.com/apache/arrow-datafusion/pull/1837 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: git

[GitHub] [arrow-datafusion] Dandandan closed issue #1767: Parquet reader thread errors do not make query fail

2022-02-15 Thread GitBox
Dandandan closed issue #1767: URL: https://github.com/apache/arrow-datafusion/issues/1767 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-u

[GitHub] [arrow-datafusion] Ted-Jiang commented on issue #1840: Ballista standalone mode tests fail: `context::tests::test_task_stuck_when_referenced_task_failed`

2022-02-15 Thread GitBox
Ted-Jiang commented on issue #1840: URL: https://github.com/apache/arrow-datafusion/issues/1840#issuecomment-1041188083 @alamb plz assign this to me -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to g

[GitHub] [arrow-datafusion] Ted-Jiang commented on pull request #1783: Enable periodic cleanup of work_dir directories in ballista executor

2022-02-15 Thread GitBox
Ted-Jiang commented on pull request #1783: URL: https://github.com/apache/arrow-datafusion/pull/1783#issuecomment-1041186723 cc @liukun4515 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[GitHub] [arrow-datafusion] Ted-Jiang commented on a change in pull request #1783: Enable periodic cleanup of work_dir directories in ballista executor

2022-02-15 Thread GitBox
Ted-Jiang commented on a change in pull request #1783: URL: https://github.com/apache/arrow-datafusion/pull/1783#discussion_r807606484 ## File path: ballista/rust/executor/src/main.rs ## @@ -148,3 +167,108 @@ async fn main() -> Result<()> { Ok(()) } + +/// This function

[GitHub] [arrow] ursabot edited a comment on pull request #12427: PARQUET-2124: [C++] Remove Parquet Dictionary DCHECK

2022-02-15 Thread GitBox
ursabot edited a comment on pull request #12427: URL: https://github.com/apache/arrow/pull/12427#issuecomment-1040592605 Benchmark runs are scheduled for baseline = d414f030b5c9edab778b23b85bc0fa766499b81f and contender = 6a2ee11d30676f99e40dfd9af94915981180510b. 6a2ee11d30676f99e40dfd9af

[GitHub] [arrow-datafusion] Ted-Jiang commented on a change in pull request #1810: Refactor scheduler state with different management policy for volatile and stable states

2022-02-15 Thread GitBox
Ted-Jiang commented on a change in pull request #1810: URL: https://github.com/apache/arrow-datafusion/pull/1810#discussion_r807587232 ## File path: ballista/rust/scheduler/src/lib.rs ## @@ -606,120 +589,65 @@ impl SchedulerGrpc } } -async fn send_heart_beat

[GitHub] [arrow-datafusion] Ted-Jiang commented on a change in pull request #1810: Refactor scheduler state with different management policy for volatile and stable states

2022-02-15 Thread GitBox
Ted-Jiang commented on a change in pull request #1810: URL: https://github.com/apache/arrow-datafusion/pull/1810#discussion_r807581029 ## File path: ballista/rust/scheduler/src/lib.rs ## @@ -606,120 +589,65 @@ impl SchedulerGrpc } } -async fn send_heart_beat

[GitHub] [arrow] ursabot edited a comment on pull request #12403: ARROW-15353: [Doc][Guide] Intro into CI topic and link to the existing docs

2022-02-15 Thread GitBox
ursabot edited a comment on pull request #12403: URL: https://github.com/apache/arrow/pull/12403#issuecomment-1040680589 Benchmark runs are scheduled for baseline = 3873f632952370688bfd087e0624f96f3d5b5b56 and contender = 96785665eff453aa4e5fc87a8ee5d047b9526869. 96785665eff453aa4e5fc87a8

[GitHub] [arrow-datafusion] Ted-Jiang closed pull request #1827: implement bitmap_distinct function using bitmap

2022-02-15 Thread GitBox
Ted-Jiang closed pull request #1827: URL: https://github.com/apache/arrow-datafusion/pull/1827 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: git

[GitHub] [arrow-datafusion] Ted-Jiang opened a new pull request #1841: Implement bitmap_distinct function using croaring-rs bitmap

2022-02-15 Thread GitBox
Ted-Jiang opened a new pull request #1841: URL: https://github.com/apache/arrow-datafusion/pull/1841 # Which issue does this PR close? Closes #1823. # Rationale for this change test result: ``` 1million_1million.parquet ++ | COUNT

[GitHub] [arrow-datafusion] yahoNanJing commented on a change in pull request #1810: Refactor scheduler state with different management policy for volatile and stable states

2022-02-15 Thread GitBox
yahoNanJing commented on a change in pull request #1810: URL: https://github.com/apache/arrow-datafusion/pull/1810#discussion_r807550540 ## File path: ballista/rust/executor/src/standalone.rs ## @@ -36,40 +37,43 @@ pub async fn new_standalone_executor( scheduler: Scheduler

[GitHub] [arrow] jorgecarleitao commented on pull request #12411: ARROW-15698: [Integration] Privatized some code in tests

2022-02-15 Thread GitBox
jorgecarleitao commented on pull request #12411: URL: https://github.com/apache/arrow/pull/12411#issuecomment-1041114132 @kou , thanks. Done. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[GitHub] [arrow] github-actions[bot] commented on pull request #12411: ARROW-15698: [Integration] Privatized some code in tests

2022-02-15 Thread GitBox
github-actions[bot] commented on pull request #12411: URL: https://github.com/apache/arrow/pull/12411#issuecomment-1041114093 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

[GitHub] [arrow] ursabot edited a comment on pull request #12178: ARROW-9664: [Python] Array/ChunkedArray.to_pandas do not support types_mapper keyword

2022-02-15 Thread GitBox
ursabot edited a comment on pull request #12178: URL: https://github.com/apache/arrow/pull/12178#issuecomment-1040099351 Benchmark runs are scheduled for baseline = 7b5efe47ba5a31f9850e5cdbf47feea4e0f6c455 and contender = 26d6e6217ff79451a3fe366bcc88293c7ae67417. 26d6e6217ff79451a3fe366bc

[GitHub] [arrow] ursabot edited a comment on pull request #12345: ARROW-15594: [C++][FlightRPC] Add Deserialize(const Buffer&) to various Flight types

2022-02-15 Thread GitBox
ursabot edited a comment on pull request #12345: URL: https://github.com/apache/arrow/pull/12345#issuecomment-1040583416 Benchmark runs are scheduled for baseline = 1eb75b3312f179bf315c35fca44527f1c504ff3a and contender = d414f030b5c9edab778b23b85bc0fa766499b81f. d414f030b5c9edab778b23b85

[GitHub] [arrow-rs] HaoYang670 opened a new issue #1316: `len` is not a parameter of `MutableArrayData::extend`

2022-02-15 Thread GitBox
HaoYang670 opened a new issue #1316: URL: https://github.com/apache/arrow-rs/issues/1316 **Describe the bug** The document of `MutableArrayData::extend` contains a weird value `len` which is not a parameter of this function. ![image](https://user-images.githubusercontent.com/5919

[GitHub] [arrow] ursabot edited a comment on pull request #12401: ARROW-15352: [Doc][Guide] R package and make clean

2022-02-15 Thread GitBox
ursabot edited a comment on pull request #12401: URL: https://github.com/apache/arrow/pull/12401#issuecomment-1040680555 Benchmark runs are scheduled for baseline = cca3800bd95c4476f695cccf2bf9f39abacf4bf3 and contender = 3873f632952370688bfd087e0624f96f3d5b5b56. 3873f632952370688bfd087e0

[GitHub] [arrow] cyb70289 commented on a change in pull request #12399: ARROW-14993: [C++] Benchmark CSV writer

2022-02-15 Thread GitBox
cyb70289 commented on a change in pull request #12399: URL: https://github.com/apache/arrow/pull/12399#discussion_r807494515 ## File path: cpp/src/arrow/csv/writer.cc ## @@ -194,34 +204,49 @@ class UnquotedColumnPopulator : public ColumnPopulator { return Status::OK();

[GitHub] [arrow] taozuhong commented on pull request #12419: ARROW-15671: [GLib] Generate vapi for Vala community

2022-02-15 Thread GitBox
taozuhong commented on pull request #12419: URL: https://github.com/apache/arrow/pull/12419#issuecomment-1041041470 It's a huge project, I'm not sure that I have time to do it. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub

[GitHub] [arrow] ursabot edited a comment on pull request #12279: ARROW-15238: [C++] ARROW_ENGINE module with substrait consumer

2022-02-15 Thread GitBox
ursabot edited a comment on pull request #12279: URL: https://github.com/apache/arrow/pull/12279#issuecomment-1041030773 Benchmark runs are scheduled for baseline = f8689a12bb7e4ee9861456c1629637bbf8d5d11c and contender = a935c81b595d24179e115d64cda944efa93aa0e0. a935c81b595d24179e115d64c

[GitHub] [arrow] ursabot commented on pull request #12279: ARROW-15238: [C++] ARROW_ENGINE module with substrait consumer

2022-02-15 Thread GitBox
ursabot commented on pull request #12279: URL: https://github.com/apache/arrow/pull/12279#issuecomment-1041030773 Benchmark runs are scheduled for baseline = f8689a12bb7e4ee9861456c1629637bbf8d5d11c and contender = a935c81b595d24179e115d64cda944efa93aa0e0. a935c81b595d24179e115d64cda944ef

[GitHub] [arrow] westonpace closed pull request #12279: ARROW-15238: [C++] ARROW_ENGINE module with substrait consumer

2022-02-15 Thread GitBox
westonpace closed pull request #12279: URL: https://github.com/apache/arrow/pull/12279 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsu

[GitHub] [arrow] cyb70289 commented on a change in pull request #12399: ARROW-14993: [C++] Benchmark CSV writer

2022-02-15 Thread GitBox
cyb70289 commented on a change in pull request #12399: URL: https://github.com/apache/arrow/pull/12399#discussion_r807476472 ## File path: cpp/src/arrow/csv/writer.cc ## @@ -194,34 +204,49 @@ class UnquotedColumnPopulator : public ColumnPopulator { return Status::OK();

[GitHub] [arrow-datafusion] xudong963 commented on pull request #1831: determine build side in hash join by `total_byte_size` instead of `num_rows`

2022-02-15 Thread GitBox
xudong963 commented on pull request #1831: URL: https://github.com/apache/arrow-datafusion/pull/1831#issuecomment-1041021293 > The number of rows might be more often available as statistic than the total size in bytes. Both in external metadata but currently also inside our own statistics

[GitHub] [arrow] ursabot edited a comment on pull request #12324: ARROW-15013: [R] Expose concatenate at the R level

2022-02-15 Thread GitBox
ursabot edited a comment on pull request #12324: URL: https://github.com/apache/arrow/pull/12324#issuecomment-1040680517 Benchmark runs are scheduled for baseline = 6a2ee11d30676f99e40dfd9af94915981180510b and contender = cca3800bd95c4476f695cccf2bf9f39abacf4bf3. cca3800bd95c4476f695cccf2

[GitHub] [arrow] ursabot edited a comment on pull request #12423: PARQUET-2123: [C++] Fix invalid memory access in ScanFileContents

2022-02-15 Thread GitBox
ursabot edited a comment on pull request #12423: URL: https://github.com/apache/arrow/pull/12423#issuecomment-1040558035 Benchmark runs are scheduled for baseline = e00cf8397a73efd9eed5926d2daaa06222028c0e and contender = 1eb75b3312f179bf315c35fca44527f1c504ff3a. 1eb75b3312f179bf315c35fca

[GitHub] [arrow-datafusion] xudong963 commented on a change in pull request #1812: Update documentation example for change in API

2022-02-15 Thread GitBox
xudong963 commented on a change in pull request #1812: URL: https://github.com/apache/arrow-datafusion/pull/1812#discussion_r807469469 ## File path: docs/source/user-guide/example-usage.md ## @@ -28,18 +37,18 @@ use datafusion::prelude::*; async fn main() -> datafusion::error:

[GitHub] [arrow-datafusion] xudong963 commented on pull request #1437: Process stack overflow panic elegantly

2022-02-15 Thread GitBox
xudong963 commented on pull request #1437: URL: https://github.com/apache/arrow-datafusion/pull/1437#issuecomment-1041009259 > Marking as stale pr -- will close it in a week or two unless we plan to keep working on it I'll directly close the ticket because I plan to fix it by #1444

[GitHub] [arrow-datafusion] xudong963 closed pull request #1437: Process stack overflow panic elegantly

2022-02-15 Thread GitBox
xudong963 closed pull request #1437: URL: https://github.com/apache/arrow-datafusion/pull/1437 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: git

[GitHub] [arrow] github-actions[bot] commented on pull request #12439: ARROW-15697: [R] Add logo and meta tags to pkgdown site

2022-02-15 Thread GitBox
github-actions[bot] commented on pull request #12439: URL: https://github.com/apache/arrow/pull/12439#issuecomment-1040996176 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

[GitHub] [arrow] ursabot edited a comment on pull request #12400: MINOR: [Docs][Archery] Correct the links in the README.md

2022-02-15 Thread GitBox
ursabot edited a comment on pull request #12400: URL: https://github.com/apache/arrow/pull/12400#issuecomment-1039120561 Benchmark runs are scheduled for baseline = 269f5d2d42259971e291bd61dadc4cff4d969273 and contender = 699449f2f5fe36938191d771f321ec15d3fd3331. 699449f2f5fe36938191d771f

[GitHub] [arrow] ursabot edited a comment on pull request #12364: ARROW-15606: [CI] [R] Add brew build that exercises the R package

2022-02-15 Thread GitBox
ursabot edited a comment on pull request #12364: URL: https://github.com/apache/arrow/pull/12364#issuecomment-1039282958 Benchmark runs are scheduled for baseline = 699449f2f5fe36938191d771f321ec15d3fd3331 and contender = 5ad5ddcafee8fada9cebb341df638b750c98efb7. 5ad5ddcafee8fada9cebb341d

[GitHub] [arrow] github-actions[bot] commented on pull request #12439: Adds meta tags to pkgdown site

2022-02-15 Thread GitBox
github-actions[bot] commented on pull request #12439: URL: https://github.com/apache/arrow/pull/12439#issuecomment-1040991952 Thanks for opening a pull request! If this is not a [minor PR](https://github.com/apache/arrow/blob/master/CONTRIBUTING.md#Minor-Fixes). Could you op

[GitHub] [arrow] djnavarro opened a new pull request #12439: Adds meta tags to pkgdown site

2022-02-15 Thread GitBox
djnavarro opened a new pull request #12439: URL: https://github.com/apache/arrow/pull/12439 (just a draft... actual text to be filled later!) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[GitHub] [arrow] ursabot edited a comment on pull request #12410: MINOR: [Integration] Simplified code in tests

2022-02-15 Thread GitBox
ursabot edited a comment on pull request #12410: URL: https://github.com/apache/arrow/pull/12410#issuecomment-1040944528 Benchmark runs are scheduled for baseline = 96785665eff453aa4e5fc87a8ee5d047b9526869 and contender = f8689a12bb7e4ee9861456c1629637bbf8d5d11c. f8689a12bb7e4ee9861456c16

[GitHub] [arrow-datafusion] andygrove commented on a change in pull request #1783: Enable periodic cleanup of work_dir directories in ballista executor

2022-02-15 Thread GitBox
andygrove commented on a change in pull request #1783: URL: https://github.com/apache/arrow-datafusion/pull/1783#discussion_r807443661 ## File path: ballista/rust/executor/src/main.rs ## @@ -148,3 +167,108 @@ async fn main() -> Result<()> { Ok(()) } + +/// This function

[GitHub] [arrow] kou edited a comment on pull request #12419: ARROW-15671: [GLib] Generate vapi for Vala community

2022-02-15 Thread GitBox
kou edited a comment on pull request #12419: URL: https://github.com/apache/arrow/pull/12419#issuecomment-1040230412 Thanks. Could you also add the following sample programs written in Vala to `c_glib/example/vala/`? * A program that uses arrow-glib * A program that uses arrow-

[GitHub] [arrow] ursabot commented on pull request #12410: MINOR: [Integration] Simplified code in tests

2022-02-15 Thread GitBox
ursabot commented on pull request #12410: URL: https://github.com/apache/arrow/pull/12410#issuecomment-1040944528 Benchmark runs are scheduled for baseline = 96785665eff453aa4e5fc87a8ee5d047b9526869 and contender = f8689a12bb7e4ee9861456c1629637bbf8d5d11c. f8689a12bb7e4ee9861456c1629637bb

[GitHub] [arrow] kou commented on pull request #12411: MINOR: [Integration] Privatized some code in tests

2022-02-15 Thread GitBox
kou commented on pull request #12411: URL: https://github.com/apache/arrow/pull/12411#issuecomment-1040942450 There are too many changes to say "MINOR": https://github.com/apache/arrow/blob/master/CONTRIBUTING.md#minor-fixes Could you open a JIRA issue? -- This is an automated mes

[GitHub] [arrow] kou closed pull request #12410: MINOR: [Integration] Simplified code in tests

2022-02-15 Thread GitBox
kou closed pull request #12410: URL: https://github.com/apache/arrow/pull/12410 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...

[GitHub] [arrow] rok commented on a change in pull request #12141: ARROW-14100: [C++] subtract(duration, duration) -> duration kernel

2022-02-15 Thread GitBox
rok commented on a change in pull request #12141: URL: https://github.com/apache/arrow/pull/12141#discussion_r807432638 ## File path: cpp/src/arrow/compute/kernels/scalar_temporal_test.cc ## @@ -1242,6 +1242,21 @@ TEST_F(ScalarTemporalTest, TestTemporalSubtractTimeAndDuration)

[GitHub] [arrow] rok commented on pull request #12438: ARROW-15688: [C++] add_checked doesn't error out on duration overflow

2022-02-15 Thread GitBox
rok commented on pull request #12438: URL: https://github.com/apache/arrow/pull/12438#issuecomment-1040934972 @pitrou thanks for spotting that one! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go t

[GitHub] [arrow] github-actions[bot] commented on pull request #12438: ARROW-15688: [C++] add_checked doesn't error out on duration overflow

2022-02-15 Thread GitBox
github-actions[bot] commented on pull request #12438: URL: https://github.com/apache/arrow/pull/12438#issuecomment-1040934145 https://issues.apache.org/jira/browse/ARROW-15688 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub a

[GitHub] [arrow] rok opened a new pull request #12438: ARROW-15688: [C++] add_checked doesn't error out on duration overflow

2022-02-15 Thread GitBox
rok opened a new pull request #12438: URL: https://github.com/apache/arrow/pull/12438 This is to resolve [ARROW-15688](https://issues.apache.org/jira/browse/ARROW-15688). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and u

[GitHub] [arrow-datafusion] realno commented on pull request #1810: Refactor scheduler state with different management policy for volatile and stable states

2022-02-15 Thread GitBox
realno commented on pull request #1810: URL: https://github.com/apache/arrow-datafusion/pull/1810#issuecomment-1040931959 Left a small question, otherwise looks good. Thanks @yahoNanJing ! -- This is an automated message from the Apache Git Service. To respond to the message, please log

[GitHub] [arrow-datafusion] realno commented on a change in pull request #1810: Refactor scheduler state with different management policy for volatile and stable states

2022-02-15 Thread GitBox
realno commented on a change in pull request #1810: URL: https://github.com/apache/arrow-datafusion/pull/1810#discussion_r807330103 ## File path: ballista/rust/scheduler/src/state/mod.rs ## @@ -100,234 +267,464 @@ impl SchedulerState Result> { -let mut result = vec![];

[GitHub] [arrow] ursabot edited a comment on pull request #12427: PARQUET-2124: [C++] Remove Parquet Dictionary DCHECK

2022-02-15 Thread GitBox
ursabot edited a comment on pull request #12427: URL: https://github.com/apache/arrow/pull/12427#issuecomment-1040592605 Benchmark runs are scheduled for baseline = d414f030b5c9edab778b23b85bc0fa766499b81f and contender = 6a2ee11d30676f99e40dfd9af94915981180510b. 6a2ee11d30676f99e40dfd9af

[GitHub] [arrow] rok commented on a change in pull request #12141: ARROW-14100: [C++] subtract(duration, duration) -> duration kernel

2022-02-15 Thread GitBox
rok commented on a change in pull request #12141: URL: https://github.com/apache/arrow/pull/12141#discussion_r807419497 ## File path: cpp/src/arrow/compute/kernels/scalar_temporal_test.cc ## @@ -1242,6 +1242,21 @@ TEST_F(ScalarTemporalTest, TestTemporalSubtractTimeAndDuration)

[GitHub] [arrow-datafusion] alamb commented on a change in pull request #1810: Refactor scheduler state with different management policy for volatile and stable states

2022-02-15 Thread GitBox
alamb commented on a change in pull request #1810: URL: https://github.com/apache/arrow-datafusion/pull/1810#discussion_r807411737 ## File path: ballista/rust/executor/src/standalone.rs ## @@ -36,40 +37,43 @@ pub async fn new_standalone_executor( scheduler: SchedulerGrpcCl

[GitHub] [arrow] ursabot edited a comment on pull request #12144: ARROW-13594: [CI] Enable nightly turbodbc builds again

2022-02-15 Thread GitBox
ursabot edited a comment on pull request #12144: URL: https://github.com/apache/arrow/pull/12144#issuecomment-1040468870 Benchmark runs are scheduled for baseline = 26d6e6217ff79451a3fe366bcc88293c7ae67417 and contender = e00cf8397a73efd9eed5926d2daaa06222028c0e. e00cf8397a73efd9eed5926d2

[GitHub] [arrow-datafusion] alamb commented on a change in pull request #1839: Fix compiling ballista in standalone mode, add build to CI

2022-02-15 Thread GitBox
alamb commented on a change in pull request #1839: URL: https://github.com/apache/arrow-datafusion/pull/1839#discussion_r807410799 ## File path: ballista/rust/client/src/context.rs ## @@ -458,13 +469,16 @@ mod tests { #[tokio::test] #[cfg(feature = "standalone")] +

[GitHub] [arrow-datafusion] alamb opened a new issue #1840: Ballista standalone mode tests fail: `context::tests::test_task_stuck_when_referenced_task_failed`

2022-02-15 Thread GitBox
alamb opened a new issue #1840: URL: https://github.com/apache/arrow-datafusion/issues/1840 **Describe the bug** The following ballista test is failing (not sure when it started failing given the tests weren't run in CI until #1839 ) ``` context::tests::test_task_stuck_whe

[GitHub] [arrow-datafusion] alamb opened a new pull request #1839: Fix compiling ballista in standalone mode, add build to CI

2022-02-15 Thread GitBox
alamb opened a new pull request #1839: URL: https://github.com/apache/arrow-datafusion/pull/1839 # Which issue does this PR close? TBD # Rationale for this change As pointed out by @thinkharderdev on https://github.com/apache/arrow-datafusion/pull/1810 https://github.

[GitHub] [arrow-datafusion] matthewmturner commented on issue #1836: Register multiple tables into `ExecutionContext` at once

2022-02-15 Thread GitBox
matthewmturner commented on issue #1836: URL: https://github.com/apache/arrow-datafusion/issues/1836#issuecomment-1040879854 @alamb thank you for that info - it looks great. I'm wondering if it makes sense to add a method to `ObjectStore` to facilitate creating a `Schema` or `Catalo

[GitHub] [arrow] ursabot edited a comment on pull request #12351: ARROW-15598: [C++][Gandiva] Avoid using hardcoded raw pointer addresses in generated code

2022-02-15 Thread GitBox
ursabot edited a comment on pull request #12351: URL: https://github.com/apache/arrow/pull/12351#issuecomment-1039069459 Benchmark runs are scheduled for baseline = 5f590e9e64d880e2290dacc76ac85b4cd0d5f40a and contender = 269f5d2d42259971e291bd61dadc4cff4d969273. 269f5d2d42259971e291bd61d

[GitHub] [arrow] lidavidm commented on pull request #11963: ARROW-15066: [C++] Enable use of non-bundled OpenTelemetry

2022-02-15 Thread GitBox
lidavidm commented on pull request #11963: URL: https://github.com/apache/arrow/pull/11963#issuecomment-1040856375 Sorry, I've rebased + changed the title to hopefully be less confusing. -- This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [arrow] ursabot edited a comment on pull request #12345: ARROW-15594: [C++][FlightRPC] Add Deserialize(const Buffer&) to various Flight types

2022-02-15 Thread GitBox
ursabot edited a comment on pull request #12345: URL: https://github.com/apache/arrow/pull/12345#issuecomment-1040583416 Benchmark runs are scheduled for baseline = 1eb75b3312f179bf315c35fca44527f1c504ff3a and contender = d414f030b5c9edab778b23b85bc0fa766499b81f. d414f030b5c9edab778b23b85

[GitHub] [arrow] wjones127 commented on a change in pull request #12437: ARROW-14908: [C++][R] Multiple scanners plus join gives segfault when use_threads=FALSE

2022-02-15 Thread GitBox
wjones127 commented on a change in pull request #12437: URL: https://github.com/apache/arrow/pull/12437#discussion_r807358310 ## File path: cpp/src/arrow/compute/exec/hash_join.cc ## @@ -896,6 +896,17 @@ class HashJoinBasicImpl : public HashJoinImpl { std::vector has_match

[GitHub] [arrow] wjones127 commented on pull request #12437: ARROW-14908: [C++][R] Multiple scanners plus join gives segfault when use_threads=FALSE

2022-02-15 Thread GitBox
wjones127 commented on pull request #12437: URL: https://github.com/apache/arrow/pull/12437#issuecomment-1040843227 Failing test: https://github.com/apache/arrow/runs/5207592640?check_suite_focus=true#step:6:14179 -- This is an automated message from the Apache Git Service. To respond to

[GitHub] [arrow-rs] gsserge commented on pull request #1315: Enable more lints

2022-02-15 Thread GitBox
gsserge commented on pull request #1315: URL: https://github.com/apache/arrow-rs/pull/1315#issuecomment-1040842271 It's kinda weird that the "Docs are clean" job with `--all-features` triggers warnings (`non_camel_case_types` in this case) that are not caught by the proper clippy job. -

[GitHub] [arrow-rs] codecov-commenter commented on pull request #1315: Enable more lints

2022-02-15 Thread GitBox
codecov-commenter commented on pull request #1315: URL: https://github.com/apache/arrow-rs/pull/1315#issuecomment-1040830874 # [Codecov](https://codecov.io/gh/apache/arrow-rs/pull/1315?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=T

[GitHub] [arrow-datafusion] alamb commented on a change in pull request #1838: Add benchmarks section to DEVELOPERS.md

2022-02-15 Thread GitBox
alamb commented on a change in pull request #1838: URL: https://github.com/apache/arrow-datafusion/pull/1838#discussion_r807344947 ## File path: datafusion/benches/parquet_query_sql.rs ## @@ -229,7 +229,7 @@ fn criterion_benchmark(c: &mut Criterion) { }); } -

[GitHub] [arrow] wjones127 commented on pull request #12437: ARROW-14908: [C++][R] Multiple scanners plus join gives segfault when use_threads=FALSE

2022-02-15 Thread GitBox
wjones127 commented on pull request #12437: URL: https://github.com/apache/arrow/pull/12437#issuecomment-1040825099 When I run with a fix in place for hash join paths, I sometimes get `/Users/willjones/Documents/arrows/arrow/cpp/src/arrow/compute/exec/util.cc:329: Check failed: (thread_in

[GitHub] [arrow] ursabot edited a comment on pull request #12178: ARROW-9664: [Python] Array/ChunkedArray.to_pandas do not support types_mapper keyword

2022-02-15 Thread GitBox
ursabot edited a comment on pull request #12178: URL: https://github.com/apache/arrow/pull/12178#issuecomment-1040099351 Benchmark runs are scheduled for baseline = 7b5efe47ba5a31f9850e5cdbf47feea4e0f6c455 and contender = 26d6e6217ff79451a3fe366bcc88293c7ae67417. 26d6e6217ff79451a3fe366bc

[GitHub] [arrow-datafusion] tustvold commented on a change in pull request #1838: Add benchmarks section to DEVELOPERS.md

2022-02-15 Thread GitBox
tustvold commented on a change in pull request #1838: URL: https://github.com/apache/arrow-datafusion/pull/1838#discussion_r807337681 ## File path: datafusion/benches/parquet_query_sql.rs ## @@ -229,7 +229,7 @@ fn criterion_benchmark(c: &mut Criterion) { }); } -

[GitHub] [arrow-rs] gsserge opened a new pull request #1315: Enable more lints

2022-02-15 Thread GitBox
gsserge opened a new pull request #1315: URL: https://github.com/apache/arrow-rs/pull/1315 # Which issue does this PR close? This is part of https://github.com/apache/arrow-rs/issues/1255 # Rationale for this change It's beneficial to run clippy as strictly as possible.

[GitHub] [arrow-datafusion] tustvold opened a new pull request #1838: Add benchmarks section to DEVELOPERS.md

2022-02-15 Thread GitBox
tustvold opened a new pull request #1838: URL: https://github.com/apache/arrow-datafusion/pull/1838 # Rationale for this change This was requested on https://github.com/apache/arrow-datafusion/pull/1738 # What changes are included in this PR? Adds some documentation on

[GitHub] [arrow-datafusion] alamb commented on a change in pull request #1837: Return `Error` when parquet reader fails rather than no data with `println!`

2022-02-15 Thread GitBox
alamb commented on a change in pull request #1837: URL: https://github.com/apache/arrow-datafusion/pull/1837#discussion_r807328426 ## File path: datafusion/src/physical_plan/file_format/parquet.rs ## @@ -808,18 +822,8 @@ mod tests { let read = round_trip_t

[GitHub] [arrow-datafusion] alamb merged pull request #1738: Add parquet SQL benchmarks

2022-02-15 Thread GitBox
alamb merged pull request #1738: URL: https://github.com/apache/arrow-datafusion/pull/1738 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-

[GitHub] [arrow-datafusion] alamb commented on a change in pull request #1622: Handle merging of evolved schemas in ParquetExec

2022-02-15 Thread GitBox
alamb commented on a change in pull request #1622: URL: https://github.com/apache/arrow-datafusion/pull/1622#discussion_r807327654 ## File path: datafusion/src/physical_plan/file_format/parquet.rs ## @@ -457,22 +518,313 @@ fn read_partition( #[cfg(test)] mod tests { -us

[GitHub] [arrow] nealrichardson commented on pull request #12436: ARROW-15690: [Dev] Update GitHub Actions workflows that hardcode master as default

2022-02-15 Thread GitBox
nealrichardson commented on pull request #12436: URL: https://github.com/apache/arrow/pull/12436#issuecomment-1040807014 I'll back out the R windows job changes for now, they're unrelated (they're about the r-lib/actions tag, not our master branch). It looks like pacman isn't on the path i

[GitHub] [arrow-datafusion] alamb commented on issue #1823: implement bitmap_distinct function using bitmap

2022-02-15 Thread GitBox
alamb commented on issue #1823: URL: https://github.com/apache/arrow-datafusion/issues/1823#issuecomment-1040800234 > @alamb Sorry to bother you 😂, could you share some info on why you use croaring-rs? If you have a benchmark result, that would be fantastic 👍! IOx is using croaring for its "R

[GitHub] [arrow-datafusion] alamb commented on issue #1836: Register multiple tables into `ExecutionContext` at once

2022-02-15 Thread GitBox
alamb commented on issue #1836: URL: https://github.com/apache/arrow-datafusion/issues/1836#issuecomment-1040796346 DataFusion has a notion of `Catalog` and `Schema` (thanks @returnString 👋 ) now that seems similar to what you are describing. Here is an example of how it is used:

[GitHub] [arrow] jonkeane commented on a change in pull request #12431: ARROW-14826 [R] Implement bindings for `lubridate::dst()`

2022-02-15 Thread GitBox
jonkeane commented on a change in pull request #12431: URL: https://github.com/apache/arrow/pull/12431#discussion_r807316273 ## File path: r/tests/testthat/test-dplyr-funcs-datetime.R ## @@ -711,3 +711,17 @@ test_that("am/pm mirror lubridate", { ) }) +test_that("dst extra

[GitHub] [arrow-datafusion] alamb opened a new pull request #1837: Return `Error` when parquet reader fails rather than no data with `println!`

2022-02-15 Thread GitBox
alamb opened a new pull request #1837: URL: https://github.com/apache/arrow-datafusion/pull/1837 # Which issue does this PR close? Fixes: https://github.com/apache/arrow-datafusion/issues/1767 (cc @andygrove ) I believe this also fixes https://github.com/apache/arrow-datafusion/i

[GitHub] [arrow-datafusion] tustvold commented on a change in pull request #1738: Add parquet SQL benchmarks

2022-02-15 Thread GitBox
tustvold commented on a change in pull request #1738: URL: https://github.com/apache/arrow-datafusion/pull/1738#discussion_r807309526 ## File path: datafusion/benches/parquet_query_sql.rs ## @@ -0,0 +1,237 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or

[GitHub] [arrow] github-actions[bot] commented on pull request #12437: ARROW-14908: [C++][R] Multiple scanners plus join gives segfault when use_threads=FALSE

2022-02-15 Thread GitBox
github-actions[bot] commented on pull request #12437: URL: https://github.com/apache/arrow/pull/12437#issuecomment-1040781506 https://issues.apache.org/jira/browse/ARROW-14908 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub a

[GitHub] [arrow] github-actions[bot] commented on pull request #12366: ARROW-15468: [R] [CI] A crossbow job that tests against DuckDB's dev branch

2022-02-15 Thread GitBox
github-actions[bot] commented on pull request #12366: URL: https://github.com/apache/arrow/pull/12366#issuecomment-1040781509 Revision: 4a242c3d38100b5b171eca71b31084b543a82391 Submitted crossbow builds: [ursacomputing/crossbow @ actions-1640](https://github.com/ursacomputing/crossbo

[GitHub] [arrow] github-actions[bot] commented on pull request #12436: ARROW-15690: [Dev] Update GitHub Actions workflows that hardcode master as default

2022-02-15 Thread GitBox
github-actions[bot] commented on pull request #12436: URL: https://github.com/apache/arrow/pull/12436#issuecomment-1040781110 https://issues.apache.org/jira/browse/ARROW-15690 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub a

[GitHub] [arrow] nealrichardson commented on a change in pull request #12436: ARROW-15690: [Dev] Update GitHub Actions workflows that hardcode master as default

2022-02-15 Thread GitBox
nealrichardson commented on a change in pull request #12436: URL: https://github.com/apache/arrow/pull/12436#discussion_r807304850 ## File path: .github/workflows/dev_pr.yml ## @@ -21,7 +21,7 @@ on: # TODO: Enable this when eps1lon/actions-label-merge-conflict is available.

[GitHub] [arrow] nealrichardson opened a new pull request #12436: ARROW-15690: [Dev] Update GitHub Actions workflows that hardcode master as default

2022-02-15 Thread GitBox
nealrichardson opened a new pull request #12436: URL: https://github.com/apache/arrow/pull/12436 Included a few other master-related cleanups I saw -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go t

[GitHub] [arrow] jonkeane commented on a change in pull request #12357: ARROW-14817 [R] Implement bindings for `lubridate::tz()`

2022-02-15 Thread GitBox
jonkeane commented on a change in pull request #12357: URL: https://github.com/apache/arrow/pull/12357#discussion_r807304348 ## File path: r/R/dplyr-funcs-datetime.R ## @@ -147,5 +147,12 @@ register_bindings_datetime <- function() { register_binding("pm", function(x) {

[GitHub] [arrow] jonkeane commented on pull request #12366: ARROW-15468: [R] [CI] A crossbow job that tests against DuckDB's dev branch

2022-02-15 Thread GitBox
jonkeane commented on pull request #12366: URL: https://github.com/apache/arrow/pull/12366#issuecomment-1040779874 @github-actions crossbow submit test-r-dev-duckdb -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the

[GitHub] [arrow-datafusion] alamb commented on issue #1767: Parquet reader thread errors do not make query fail

2022-02-15 Thread GitBox
alamb commented on issue #1767: URL: https://github.com/apache/arrow-datafusion/issues/1767#issuecomment-1040778871 I'll take a stab at this. Also, as noted by @Dandandan / @andygrove in the Slack channel, #1617 may make this redundant anyway -- This is an automated messa

[GitHub] [arrow-rs] viirya commented on a change in pull request #1314: Clean up `DictionaryArray` construction in test

2022-02-15 Thread GitBox
viirya commented on a change in pull request #1314: URL: https://github.com/apache/arrow-rs/pull/1314#discussion_r807301627 ## File path: arrow/src/array/array_dictionary.rs ## @@ -472,29 +473,11 @@ mod tests { #[test] fn test_dictionary_iter() { // Construct

[GitHub] [arrow] dragosmg commented on a change in pull request #12429: ARROW-14815 [R] bindings for `lubridate::semester()`

2022-02-15 Thread GitBox
dragosmg commented on a change in pull request #12429: URL: https://github.com/apache/arrow/pull/12429#discussion_r807300733 ## File path: r/R/dplyr-funcs-datetime.R ## @@ -147,5 +147,14 @@ register_bindings_datetime <- function() { register_binding("pm", function(x) {

[GitHub] [arrow] dragosmg commented on a change in pull request #12357: ARROW-14817 [R] Implement bindings for `lubridate::tz()`

2022-02-15 Thread GitBox
dragosmg commented on a change in pull request #12357: URL: https://github.com/apache/arrow/pull/12357#discussion_r807299424 ## File path: r/R/dplyr-funcs-datetime.R ## @@ -147,5 +147,15 @@ register_bindings_datetime <- function() { register_binding("pm", function(x) {

[GitHub] [arrow] dragosmg commented on a change in pull request #12357: ARROW-14817 [R] Implement bindings for `lubridate::tz()`

2022-02-15 Thread GitBox
dragosmg commented on a change in pull request #12357: URL: https://github.com/apache/arrow/pull/12357#discussion_r807296212 ## File path: r/tests/testthat/test-dplyr-funcs-datetime.R ## @@ -711,3 +711,71 @@ test_that("am/pm mirror lubridate", { ) }) + +test_that("extract

[GitHub] [arrow] jonkeane commented on a change in pull request #12357: ARROW-14817 [R] Implement bindings for `lubridate::tz()`

2022-02-15 Thread GitBox
jonkeane commented on a change in pull request #12357: URL: https://github.com/apache/arrow/pull/12357#discussion_r807295552 ## File path: r/tests/testthat/test-dplyr-funcs-datetime.R ## @@ -711,3 +711,71 @@ test_that("am/pm mirror lubridate", { ) }) + +test_that("extract
