[GitHub] [arrow-datafusion] yahoNanJing commented on a change in pull request #1560: Introduce push-based task scheduling for Ballista

2022-01-19 Thread GitBox
yahoNanJing commented on a change in pull request #1560: URL: https://github.com/apache/arrow-datafusion/pull/1560#discussion_r787443197 ## File path: ballista/rust/scheduler/src/lib.rs ## @@ -112,17 +140,171 @@ impl SchedulerServer { tokio::spawn(async move { state_c

[GitHub] [arrow-rs] Ted-Jiang commented on a change in pull request #1205: Return error from JSON writer rather than panic

2022-01-19 Thread GitBox
Ted-Jiang commented on a change in pull request #1205: URL: https://github.com/apache/arrow-rs/pull/1205#discussion_r787446606 ## File path: arrow/src/json/writer.rs ## @@ -178,31 +178,29 @@ pub fn array_to_json_array(array: &ArrayRef) -> Vec { DataType::UInt64 => pri

[GitHub] [arrow] jorisvandenbossche commented on pull request #12091: ARROW-14798: [C++][Python][R] Add container window to PrettyPrintOptions

2022-01-19 Thread GitBox
jorisvandenbossche commented on pull request #12091: URL: https://github.com/apache/arrow/pull/12091#issuecomment-1016192978 I was testing this locally, I think we might actually eventually want both this and some character cap as in https://github.com/apache/arrow/pull/12148 (but maybe at

[GitHub] [arrow] westonpace commented on a change in pull request #11993: ARROW-15153: [Python] Expose ReferencedBufferSize to python

2022-01-19 Thread GitBox
westonpace commented on a change in pull request #11993: URL: https://github.com/apache/arrow/pull/11993#discussion_r787462209 ## File path: python/pyarrow/array.pxi ## @@ -988,13 +988,43 @@ cdef class Array(_PandasConvertible): def nbytes(self): """ Tota

[GitHub] [arrow-datafusion] tustvold commented on a change in pull request #1596: Consolidate sort and external_sort, consolidate N-way merge sort

2022-01-19 Thread GitBox
tustvold commented on a change in pull request #1596: URL: https://github.com/apache/arrow-datafusion/pull/1596#discussion_r787466759 ## File path: datafusion/src/physical_plan/sorts/sort_preserving_merge.rs ## @@ -304,6 +303,9 @@ pub(crate) struct SortPreservingMergeStream {

[GitHub] [arrow] github-actions[bot] commented on pull request #12110: ARROW-15212: [C++] Handle suffix argument in joins

2022-01-19 Thread GitBox
github-actions[bot] commented on pull request #12110: URL: https://github.com/apache/arrow/pull/12110#issuecomment-1016201504 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

[GitHub] [arrow] iajoiner commented on pull request #9702: ARROW-11297: [C++][Python] Add ORC writer options

2022-01-19 Thread GitBox
iajoiner commented on pull request #9702: URL: https://github.com/apache/arrow/pull/9702#issuecomment-1016225237 @pitrou Please review again. There are several issues we still need to discuss: 1. What exactly should be done to `FileVersion`? Shall I create an `options.cc` which currentl

[GitHub] [arrow] iajoiner commented on pull request #9702: ARROW-11297: [C++][Python] Add ORC writer options

2022-01-19 Thread GitBox
iajoiner commented on pull request #9702: URL: https://github.com/apache/arrow/pull/9702#issuecomment-1016225748 Also I got the API docs updated for both C++ and Python to include all the ORC stuff.

[GitHub] [arrow] iajoiner edited a comment on pull request #9702: ARROW-11297: [C++][Python] Add ORC writer options

2022-01-19 Thread GitBox
iajoiner edited a comment on pull request #9702: URL: https://github.com/apache/arrow/pull/9702#issuecomment-1016225748 Also I got the API docs updated for both C++ and Python to include all the ORC stuff. Once this is merged I will do the user guides probably together with the reader opti

[GitHub] [arrow-datafusion] yahoNanJing commented on pull request #1560: Introduce push-based task scheduling for Ballista

2022-01-19 Thread GitBox
yahoNanJing commented on pull request #1560: URL: https://github.com/apache/arrow-datafusion/pull/1560#issuecomment-1016229513 Thanks @houqp. Just rebase the code to avoid the conflicts and add a fix commit

[GitHub] [arrow] iajoiner commented on pull request #9702: ARROW-11297: [C++][Python] Add ORC writer options

2022-01-19 Thread GitBox
iajoiner commented on pull request #9702: URL: https://github.com/apache/arrow/pull/9702#issuecomment-1016235688 This PR will also resolve ARROW-13571.

[GitHub] [arrow] pitrou commented on a change in pull request #9702: ARROW-11297: [C++][Python] Add ORC writer options

2022-01-19 Thread GitBox
pitrou commented on a change in pull request #9702: URL: https://github.com/apache/arrow/pull/9702#discussion_r787524153 ## File path: cpp/src/arrow/adapters/orc/adapter.cc ## @@ -628,41 +733,86 @@ class ArrowOutputStream : public liborc::OutputStream { int64_t length_; };

[GitHub] [arrow] pitrou commented on pull request #9702: ARROW-11297: [C++][Python] Add ORC writer options

2022-01-19 Thread GitBox
pitrou commented on pull request #9702: URL: https://github.com/apache/arrow/pull/9702#issuecomment-1016238241 > @pitrou Please review again. There are several issues we still need to discuss: > > 1. What exactly should be done to `FileVersion`? Shall I create an `options.cc` wh

[GitHub] [arrow-datafusion] yjshen commented on pull request #1596: Consolidate sort and external_sort, consolidate N-way merge sort

2022-01-19 Thread GitBox
yjshen commented on pull request #1596: URL: https://github.com/apache/arrow-datafusion/pull/1596#issuecomment-1016238900 ## TPC-H sf=1 sort_extendedprice_discount.[combine_and_sort method] No differences for the external_sort compared to the previous sort. ```sh ./target/

[GitHub] [arrow-datafusion] yjshen edited a comment on pull request #1596: Consolidate sort and external_sort, consolidate N-way merge sort

2022-01-19 Thread GitBox
yjshen edited a comment on pull request #1596: URL: https://github.com/apache/arrow-datafusion/pull/1596#issuecomment-1016238900 ## TPC-H sf=1 sort_extendedprice_discount.[combine_and_sort method] Similar performance for the external_sort compared to the previous sort. ```sh

[GitHub] [arrow-datafusion] yjshen edited a comment on pull request #1596: Consolidate sort and external_sort, consolidate N-way merge sort

2022-01-19 Thread GitBox
yjshen edited a comment on pull request #1596: URL: https://github.com/apache/arrow-datafusion/pull/1596#issuecomment-1016238900 ## TPC-H sf=1 sort_extendedprice_discount.[combine_and_sort method] Similar performance for the new sort compared with the previous sort. ```sh

[GitHub] [arrow] kszucs commented on pull request #10136: ARROW-12515 [Dev][Wiki][Release]: Fix and update Windows RC verify script [WIP]

2022-01-19 Thread GitBox
kszucs commented on pull request #10136: URL: https://github.com/apache/arrow/pull/10136#issuecomment-1016244976 The verification scripts already use python 3.8 but with VS2017. We can bump or rather parametrize the script to support both compilers, but it is not essential for 7.0 release.

[GitHub] [arrow] kszucs edited a comment on pull request #10136: ARROW-12515 [Dev][Wiki][Release]: Fix and update Windows RC verify script [WIP]

2022-01-19 Thread GitBox
kszucs edited a comment on pull request #10136: URL: https://github.com/apache/arrow/pull/10136#issuecomment-1016244976 The verification scripts [already use python 3.8](https://github.com/apache/arrow/blob/master/dev/release/verify-release-candidate.bat#L40) but with VS2017. We can bump o

[GitHub] [arrow] kszucs closed pull request #12159: ARROW-15327: [R] Update news for 7.0.0

2022-01-19 Thread GitBox
kszucs closed pull request #12159: URL: https://github.com/apache/arrow/pull/12159

[GitHub] [arrow-datafusion] yjshen commented on a change in pull request #1596: Consolidate sort and external_sort

2022-01-19 Thread GitBox
yjshen commented on a change in pull request #1596: URL: https://github.com/apache/arrow-datafusion/pull/1596#discussion_r787540527 ## File path: datafusion/src/physical_plan/sorts/mod.rs ## @@ -281,6 +287,9 @@ impl Stream for StreamWrapper { Poll::Pending

[GitHub] [arrow-datafusion] yjshen commented on a change in pull request #1596: Consolidate sort and external_sort

2022-01-19 Thread GitBox
yjshen commented on a change in pull request #1596: URL: https://github.com/apache/arrow-datafusion/pull/1596#discussion_r787541398 ## File path: datafusion/src/physical_plan/sorts/sort.rs ## @@ -15,47 +15,450 @@ // specific language governing permissions and limitations // u

[GitHub] [arrow] kszucs edited a comment on pull request #11182: ARROW-14034: [Java] Unexpected Allocator states created after allocating buffer whose AllocationManager has different size from the req

2022-01-19 Thread GitBox
kszucs edited a comment on pull request #11182: URL: https://github.com/apache/arrow/pull/11182#issuecomment-1016251891 Assuming that you meant to write "I'll address" I'm postponing it to 8.0 release :)

[GitHub] [arrow] kszucs commented on pull request #11182: ARROW-14034: [Java] Unexpected Allocator states created after allocating buffer whose AllocationManager has different size from the requested

2022-01-19 Thread GitBox
kszucs commented on pull request #11182: URL: https://github.com/apache/arrow/pull/11182#issuecomment-1016251891 Assuming that you meant to write "I'll address" I'm postponing it to 8.0 :)

[GitHub] [arrow] kszucs closed pull request #10333: ARROW-12607: [Website] Doc section for Dataset Java bindings

2022-01-19 Thread GitBox
kszucs closed pull request #10333: URL: https://github.com/apache/arrow/pull/10333

[GitHub] [arrow] kszucs commented on a change in pull request #11679: ARROW-14671: [Python][Doc] Documentation on how to integrate PyArrow and R

2022-01-19 Thread GitBox
kszucs commented on a change in pull request #11679: URL: https://github.com/apache/arrow/pull/11679#discussion_r787550314 ## File path: docs/source/python/integration/python_r.rst ## @@ -0,0 +1,312 @@ +.. Licensed to the Apache Software Foundation (ASF) under one +.. or more c

[GitHub] [arrow] kszucs closed pull request #11679: ARROW-14671: [Python][Doc] Documentation on how to integrate PyArrow and R

2022-01-19 Thread GitBox
kszucs closed pull request #11679: URL: https://github.com/apache/arrow/pull/11679

[GitHub] [arrow] dragosmg commented on a change in pull request #12179: ARROW-14609 [R] left_join by argument error message mismatch

2022-01-19 Thread GitBox
dragosmg commented on a change in pull request #12179: URL: https://github.com/apache/arrow/pull/12179#discussion_r787564893 ## File path: r/R/dplyr-join.R ## @@ -117,10 +117,33 @@ handle_join_by <- function(by, x, y) { if (is.null(names(by))) { by <- set_names(by) }

[GitHub] [arrow] dragosmg commented on a change in pull request #12179: ARROW-14609 [R] left_join by argument error message mismatch

2022-01-19 Thread GitBox
dragosmg commented on a change in pull request #12179: URL: https://github.com/apache/arrow/pull/12179#discussion_r787565796 ## File path: r/tests/testthat/_snaps/dplyr-join.md ## @@ -0,0 +1,50 @@ +# Error handling + +Code + (expect_error(left_join(arrow_table(example_

[GitHub] [arrow] thisisnic commented on a change in pull request #12172: ARROW-14338: [Docs] Add version dropdown to the pkgdown (R) docs

2022-01-19 Thread GitBox
thisisnic commented on a change in pull request #12172: URL: https://github.com/apache/arrow/pull/12172#discussion_r787567679 ## File path: r/pkgdown/extra.js ## @@ -59,6 +77,45 @@ var empty_ul = $("#toc").find("ul").filter(":empty"); empty_ul.remove(); }); +

[GitHub] [arrow] dragosmg commented on a change in pull request #12179: ARROW-14609 [R] left_join by argument error message mismatch

2022-01-19 Thread GitBox
dragosmg commented on a change in pull request #12179: URL: https://github.com/apache/arrow/pull/12179#discussion_r787567978 ## File path: r/tests/testthat/test-dplyr-join.R ## @@ -90,9 +90,57 @@ test_that("Error handling", { left_tab %>% left_join(to_join, by = "no

[GitHub] [arrow-datafusion] tustvold commented on a change in pull request #1596: Consolidate sort and external_sort

2022-01-19 Thread GitBox
tustvold commented on a change in pull request #1596: URL: https://github.com/apache/arrow-datafusion/pull/1596#discussion_r787539714 ## File path: datafusion/src/physical_plan/sorts/sort.rs ## @@ -15,47 +15,432 @@ // specific language governing permissions and limitations //

[GitHub] [arrow] dragosmg commented on a change in pull request #12179: ARROW-14609 [R] left_join by argument error message mismatch

2022-01-19 Thread GitBox
dragosmg commented on a change in pull request #12179: URL: https://github.com/apache/arrow/pull/12179#discussion_r787576778 ## File path: r/tests/testthat/test-dplyr-join.R ## @@ -90,9 +90,57 @@ test_that("Error handling", { left_tab %>% left_join(to_join, by = "no

[GitHub] [arrow] pitrou commented on a change in pull request #12182: ARROW-15360: [Python] Check slice bounds in Buffer.slice()

2022-01-19 Thread GitBox
pitrou commented on a change in pull request #12182: URL: https://github.com/apache/arrow/pull/12182#discussion_r787579723 ## File path: python/pyarrow/io.pxi ## @@ -1056,10 +1056,10 @@ cdef class Buffer(_Weakrefable): raise IndexError('Offset must be non-negative'

[GitHub] [arrow] amol- commented on a change in pull request #11726: ARROW-14738: [Python][Doc] Make return types clickable

2022-01-19 Thread GitBox
amol- commented on a change in pull request #11726: URL: https://github.com/apache/arrow/pull/11726#discussion_r787583107 ## File path: ci/scripts/python_build.sh ## @@ -75,5 +75,6 @@ popd if [ "${BUILD_DOCS_PYTHON}" == "ON" ]; then ncpus=$(python -c "import os; print(os.

[GitHub] [arrow-rs] tustvold commented on a change in pull request #1195: Add comparison support for fully qualified BinaryArray

2022-01-19 Thread GitBox
tustvold commented on a change in pull request #1195: URL: https://github.com/apache/arrow-rs/pull/1195#discussion_r787576858 ## File path: arrow/src/compute/kernels/comparison.rs ## @@ -2794,6 +2890,177 @@ mod tests { ); } +macro_rules! test_binary { +

[GitHub] [arrow] amol- commented on a change in pull request #11726: ARROW-14738: [Python][Doc] Make return types clickable

2022-01-19 Thread GitBox
amol- commented on a change in pull request #11726: URL: https://github.com/apache/arrow/pull/11726#discussion_r787587318 ## File path: ci/scripts/python_build.sh ## @@ -75,5 +75,6 @@ popd if [ "${BUILD_DOCS_PYTHON}" == "ON" ]; then ncpus=$(python -c "import os; print(os.

[GitHub] [arrow-cookbook] amol- merged pull request #119: Add note in CONTRIBUTING.MD about typos

2022-01-19 Thread GitBox
amol- merged pull request #119: URL: https://github.com/apache/arrow-cookbook/pull/119

[GitHub] [arrow] jorisvandenbossche commented on pull request #11726: ARROW-14738: [Python][Doc] Make return types clickable

2022-01-19 Thread GitBox
jorisvandenbossche commented on pull request #11726: URL: https://github.com/apache/arrow/pull/11726#issuecomment-1016290746 @amol- I pushed a commit with some changes: - A bunch of general docstring fixes that will ensure it gets rendered better with this PR - Some more aliases (

[GitHub] [arrow] pitrou closed pull request #12180: ARROW-15084: [C++] public factory function for GcsFileSystem

2022-01-19 Thread GitBox
pitrou closed pull request #12180: URL: https://github.com/apache/arrow/pull/12180

[GitHub] [arrow-cookbook] amol- closed issue #86: [Python][Flight] Recipe to show how to stream data in `do_get`

2022-01-19 Thread GitBox
amol- closed issue #86: URL: https://github.com/apache/arrow-cookbook/issues/86

[GitHub] [arrow-cookbook] amol- merged pull request #109: [Python] Add Flight streaming example

2022-01-19 Thread GitBox
amol- merged pull request #109: URL: https://github.com/apache/arrow-cookbook/pull/109

[GitHub] [arrow] bkmgit commented on a change in pull request #11882: ARROW-9843: [C++][Python] Implement Between ternary kernel and Python bindings

2022-01-19 Thread GitBox
bkmgit commented on a change in pull request #11882: URL: https://github.com/apache/arrow/pull/11882#discussion_r787593339 ## File path: cpp/src/arrow/compute/kernels/scalar_compare_test.cc ## @@ -1865,5 +1867,683 @@ TEST(TestMaxElementWiseMinElementWise, CommonTemporal) {

[GitHub] [arrow] bkmgit commented on a change in pull request #11882: ARROW-9843: [C++][Python] Implement Between ternary kernel and Python bindings

2022-01-19 Thread GitBox
bkmgit commented on a change in pull request #11882: URL: https://github.com/apache/arrow/pull/11882#discussion_r787594779 ## File path: cpp/src/arrow/compute/kernels/scalar_compare_test.cc ## @@ -1865,5 +1867,683 @@ TEST(TestMaxElementWiseMinElementWise, CommonTemporal) {

[GitHub] [arrow] bkmgit commented on a change in pull request #11882: ARROW-9843: [C++][Python] Implement Between ternary kernel and Python bindings

2022-01-19 Thread GitBox
bkmgit commented on a change in pull request #11882: URL: https://github.com/apache/arrow/pull/11882#discussion_r787595498 ## File path: cpp/src/arrow/compute/kernels/scalar_compare_test.cc ## @@ -1865,5 +1867,683 @@ TEST(TestMaxElementWiseMinElementWise, CommonTemporal) {

[GitHub] [arrow] pitrou closed pull request #12077: ARROW-15109: [Python] Add show_info() to print build, component, and system info

2022-01-19 Thread GitBox
pitrou closed pull request #12077: URL: https://github.com/apache/arrow/pull/12077

[GitHub] [arrow-rs] e-dard commented on a change in pull request #1196: feat: add support for casting Duration/Interval to Int64Array

2022-01-19 Thread GitBox
e-dard commented on a change in pull request #1196: URL: https://github.com/apache/arrow-rs/pull/1196#discussion_r787601001 ## File path: arrow/src/compute/kernels/cast.rs ## @@ -,6 +1118,17 @@ pub fn cast_with_options( } } } +

[GitHub] [arrow] kszucs commented on a change in pull request #11726: ARROW-14738: [Python][Doc] Make return types clickable

2022-01-19 Thread GitBox
kszucs commented on a change in pull request #11726: URL: https://github.com/apache/arrow/pull/11726#discussion_r787601746 ## File path: ci/scripts/python_build.sh ## @@ -75,5 +75,6 @@ popd if [ "${BUILD_DOCS_PYTHON}" == "ON" ]; then ncpus=$(python -c "import os; print(os

[GitHub] [arrow] bkmgit commented on a change in pull request #11882: ARROW-9843: [C++][Python] Implement Between ternary kernel and Python bindings

2022-01-19 Thread GitBox
bkmgit commented on a change in pull request #11882: URL: https://github.com/apache/arrow/pull/11882#discussion_r787607671 ## File path: cpp/src/arrow/compute/kernels/scalar_compare_test.cc ## @@ -1865,5 +1867,683 @@ TEST(TestMaxElementWiseMinElementWise, CommonTemporal) {

[GitHub] [arrow] bkmgit commented on a change in pull request #11882: ARROW-9843: [C++][Python] Implement Between ternary kernel and Python bindings

2022-01-19 Thread GitBox
bkmgit commented on a change in pull request #11882: URL: https://github.com/apache/arrow/pull/11882#discussion_r787607909 ## File path: cpp/src/arrow/compute/kernels/scalar_compare_test.cc ## @@ -1865,5 +1867,683 @@ TEST(TestMaxElementWiseMinElementWise, CommonTemporal) {

[GitHub] [arrow] bkmgit commented on a change in pull request #11882: ARROW-9843: [C++][Python] Implement Between ternary kernel and Python bindings

2022-01-19 Thread GitBox
bkmgit commented on a change in pull request #11882: URL: https://github.com/apache/arrow/pull/11882#discussion_r787610184 ## File path: cpp/src/arrow/compute/kernels/scalar_compare_test.cc ## @@ -1865,5 +1867,683 @@ TEST(TestMaxElementWiseMinElementWise, CommonTemporal) {

[GitHub] [arrow-datafusion] yjshen commented on a change in pull request #1596: Consolidate sort and external_sort

2022-01-19 Thread GitBox
yjshen commented on a change in pull request #1596: URL: https://github.com/apache/arrow-datafusion/pull/1596#discussion_r787593754 ## File path: datafusion/src/physical_plan/sorts/sort.rs ## @@ -15,47 +15,432 @@ // specific language governing permissions and limitations // u

[GitHub] [arrow] amol- commented on pull request #11726: ARROW-14738: [Python][Doc] Make return types clickable

2022-01-19 Thread GitBox
amol- commented on pull request #11726: URL: https://github.com/apache/arrow/pull/11726#issuecomment-1016309866 > @amol- I pushed a commit with some changes: > > * A bunch of general docstring fixes that will ensure it gets rendered better with this PR > * Some more aliases (it wo

[GitHub] [arrow-datafusion] tustvold commented on a change in pull request #1596: Consolidate sort and external_sort

2022-01-19 Thread GitBox
tustvold commented on a change in pull request #1596: URL: https://github.com/apache/arrow-datafusion/pull/1596#discussion_r787618177 ## File path: datafusion/src/physical_plan/sorts/sort.rs ## @@ -15,47 +15,432 @@ // specific language governing permissions and limitations //

[GitHub] [arrow-rs] jhorstmann commented on a change in pull request #1205: Return error from JSON writer rather than panic

2022-01-19 Thread GitBox
jhorstmann commented on a change in pull request #1205: URL: https://github.com/apache/arrow-rs/pull/1205#discussion_r787618284 ## File path: arrow/src/json/writer.rs ## @@ -178,31 +178,29 @@ pub fn array_to_json_array(array: &ArrayRef) -> Vec { DataType::UInt64 => pr

[GitHub] [arrow-datafusion] yjshen commented on a change in pull request #1596: Consolidate sort and external_sort

2022-01-19 Thread GitBox
yjshen commented on a change in pull request #1596: URL: https://github.com/apache/arrow-datafusion/pull/1596#discussion_r787325876 ## File path: datafusion/src/physical_plan/sorts/sort.rs ## @@ -159,14 +561,25 @@ impl ExecutionPlan for SortExec { } } -

[GitHub] [arrow] jorisvandenbossche commented on pull request #12172: ARROW-14338: [Docs] Add version dropdown to the pkgdown (R) docs

2022-01-19 Thread GitBox
jorisvandenbossche commented on pull request #12172: URL: https://github.com/apache/arrow/pull/12172#issuecomment-1016312470 Cool! The screenshot looks good. I don't have time to test it right now, but if it is working for you locally, I trust that, and I would merge it and then we can sti

[GitHub] [arrow] kszucs commented on pull request #12172: ARROW-14338: [Docs] Add version dropdown to the pkgdown (R) docs

2022-01-19 Thread GitBox
kszucs commented on pull request #12172: URL: https://github.com/apache/arrow/pull/12172#issuecomment-1016312998 Including to 7.0.

[GitHub] [arrow-rs] codecov-commenter commented on pull request #1196: feat: add support for casting Duration/Interval to Int64Array

2022-01-19 Thread GitBox
codecov-commenter commented on pull request #1196: URL: https://github.com/apache/arrow-rs/pull/1196#issuecomment-1016313137 # [Codecov](https://codecov.io/gh/apache/arrow-rs/pull/1196?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=T

[GitHub] [arrow] jorisvandenbossche commented on pull request #12172: ARROW-14338: [Docs] Add version dropdown to the pkgdown (R) docs

2022-01-19 Thread GitBox
jorisvandenbossche commented on pull request #12172: URL: https://github.com/apache/arrow/pull/12172#issuecomment-1016317906 One question though: where does this versions.json asset get included? (where does it get copied on the actual website) Because we should ensure that for every ve

[GitHub] [arrow] jorisvandenbossche commented on a change in pull request #12172: ARROW-14338: [Docs] Add version dropdown to the pkgdown (R) docs

2022-01-19 Thread GitBox
jorisvandenbossche commented on a change in pull request #12172: URL: https://github.com/apache/arrow/pull/12172#discussion_r787628619 ## File path: r/pkgdown/extra.js ## @@ -59,6 +77,45 @@ var empty_ul = $("#toc").find("ul").filter(":empty"); empty_ul.remove();

[GitHub] [arrow] AlenkaF opened a new pull request #12186: ARROW-15343: [Doc][Guide] Introduction and the checklist - minor corrections

2022-01-19 Thread GitBox
AlenkaF opened a new pull request #12186: URL: https://github.com/apache/arrow/pull/12186 Add changes to `index.rst` of the New Contributors Guide.

[GitHub] [arrow] thisisnic commented on a change in pull request #12172: ARROW-14338: [Docs] Add version dropdown to the pkgdown (R) docs

2022-01-19 Thread GitBox
thisisnic commented on a change in pull request #12172: URL: https://github.com/apache/arrow/pull/12172#discussion_r787630386 ## File path: r/pkgdown/extra.js ## @@ -59,6 +77,45 @@ var empty_ul = $("#toc").find("ul").filter(":empty"); empty_ul.remove(); }); +

[GitHub] [arrow] AlenkaF commented on a change in pull request #12178: ARROW-9664: [Python] Array/ChunkedArray.to_pandas do not support types_mapper keyword

2022-01-19 Thread GitBox
AlenkaF commented on a change in pull request #12178: URL: https://github.com/apache/arrow/pull/12178#discussion_r787630984 ## File path: python/pyarrow/tests/test_table.py ## @@ -339,6 +339,42 @@ def test_chunked_array_to_pandas_preserve_name(): tm.assert_series_equal

[GitHub] [arrow-rs] tustvold opened a new issue #1206: Dictionary IDs Arrow IPC

2022-01-19 Thread GitBox
tustvold opened a new issue #1206: URL: https://github.com/apache/arrow-rs/issues/1206 **Which part is this question about** The `Field` data structure contains a `dict_id` member, that stores an i64. It appears the intention of this is that different dictionaries will have differen

[GitHub] [arrow-datafusion] tustvold commented on a change in pull request #1596: Consolidate sort and external_sort

2022-01-19 Thread GitBox
tustvold commented on a change in pull request #1596: URL: https://github.com/apache/arrow-datafusion/pull/1596#discussion_r787635919 ## File path: datafusion/src/physical_plan/sorts/sort.rs ## @@ -15,47 +15,432 @@ // specific language governing permissions and limitations //

[GitHub] [arrow-datafusion] yjshen commented on a change in pull request #1596: Consolidate sort and external_sort

2022-01-19 Thread GitBox
yjshen commented on a change in pull request #1596: URL: https://github.com/apache/arrow-datafusion/pull/1596#discussion_r787637475 ## File path: datafusion/tests/sql/joins.rs ## @@ -419,32 +419,32 @@ async fn cross_join_unbalanced() { // the order of the values is not d

[GitHub] [arrow] jorisvandenbossche commented on a change in pull request #12172: ARROW-14338: [Docs] Add version dropdown to the pkgdown (R) docs

2022-01-19 Thread GitBox
jorisvandenbossche commented on a change in pull request #12172: URL: https://github.com/apache/arrow/pull/12172#discussion_r787639255 ## File path: r/pkgdown/extra.js ## @@ -59,6 +77,45 @@ var empty_ul = $("#toc").find("ul").filter(":empty"); empty_ul.remove();

[GitHub] [arrow-datafusion] yjshen commented on pull request #1596: Consolidate sort and external_sort

2022-01-19 Thread GitBox
yjshen commented on pull request #1596: URL: https://github.com/apache/arrow-datafusion/pull/1596#issuecomment-1016349651 Thanks @tustvold for the detailed review. I've first made quick fixes for the async and documentation-related comments. For the deadlock / mutex / hangup parts, let me think it over

[GitHub] [arrow-rs] tustvold opened a new issue #1207: Async IPC Reader/Writer

2022-01-19 Thread GitBox
tustvold opened a new issue #1207: URL: https://github.com/apache/arrow-rs/issues/1207 **Is your feature request related to a problem or challenge? Please describe what you are trying to do.** The current IPCReader and IPCWriter require a sync Reader/Writer complicating their use in

[GitHub] [arrow-datafusion] tustvold commented on a change in pull request #1596: Consolidate sort and external_sort

2022-01-19 Thread GitBox
tustvold commented on a change in pull request #1596: URL: https://github.com/apache/arrow-datafusion/pull/1596#discussion_r787647225 ## File path: datafusion/src/physical_plan/sorts/sort.rs ## @@ -15,47 +15,432 @@ // specific language governing permissions and limitations //

[GitHub] [arrow] pitrou closed pull request #12182: ARROW-15360: [Python] Check slice bounds in Buffer.slice()

2022-01-19 Thread GitBox
pitrou closed pull request #12182: URL: https://github.com/apache/arrow/pull/12182 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr

[GitHub] [arrow-rs] alamb closed pull request #1203: temporarily update packed_simd reference

2022-01-19 Thread GitBox
alamb closed pull request #1203: URL: https://github.com/apache/arrow-rs/pull/1203

[GitHub] [arrow-rs] alamb commented on pull request #1203: temporarily update packed_simd reference

2022-01-19 Thread GitBox
alamb commented on pull request #1203: URL: https://github.com/apache/arrow-rs/pull/1203#issuecomment-1016360952 This was just a demonstration -- I am not going to merge this one in

[GitHub] [arrow-rs] alamb commented on issue #685: Implement casting between duration/intervals and numbers

2022-01-19 Thread GitBox
alamb commented on issue #685: URL: https://github.com/apache/arrow-rs/issues/685#issuecomment-1016364455 It seems as if we are going with a dual approach here: 1. Allow casting to the underlying primitive type - #1196 2. Additional temporal kernels to extract parts of it, e.g https

[GitHub] [arrow] thisisnic commented on a change in pull request #12172: ARROW-14338: [Docs] Add version dropdown to the pkgdown (R) docs

2022-01-19 Thread GitBox
thisisnic commented on a change in pull request #12172: URL: https://github.com/apache/arrow/pull/12172#discussion_r787652181 ## File path: r/pkgdown/extra.js ## @@ -59,6 +77,45 @@ var empty_ul = $("#toc").find("ul").filter(":empty"); empty_ul.remove(); }); +

[GitHub] [arrow-rs] alamb commented on pull request #1195: Add comparison support for fully qualified BinaryArray

2022-01-19 Thread GitBox
alamb commented on pull request #1195: URL: https://github.com/apache/arrow-rs/pull/1195#issuecomment-1016368037 Thank you @HaoYang670 -- this looks great. @tustvold has a good suggestion regarding testing, but we can either do that as part of this PR or as a follow on.

[GitHub] [arrow] jorisvandenbossche commented on a change in pull request #12172: ARROW-14338: [Docs] Add version dropdown to the pkgdown (R) docs

2022-01-19 Thread GitBox
jorisvandenbossche commented on a change in pull request #12172: URL: https://github.com/apache/arrow/pull/12172#discussion_r787655327 ## File path: r/pkgdown/extra.js ## @@ -59,6 +77,45 @@ var empty_ul = $("#toc").find("ul").filter(":empty"); empty_ul.remove();

[GitHub] [arrow] dragosmg commented on a change in pull request #12179: ARROW-14609 [R] left_join by argument error message mismatch

2022-01-19 Thread GitBox
dragosmg commented on a change in pull request #12179: URL: https://github.com/apache/arrow/pull/12179#discussion_r787576778 ## File path: r/tests/testthat/test-dplyr-join.R ## @@ -90,9 +90,57 @@ test_that("Error handling", { left_tab %>% left_join(to_join, by = "no

[GitHub] [arrow] dragosmg commented on a change in pull request #12179: ARROW-14609 [R] left_join by argument error message mismatch

2022-01-19 Thread GitBox
dragosmg commented on a change in pull request #12179: URL: https://github.com/apache/arrow/pull/12179#discussion_r787567978 ## File path: r/tests/testthat/test-dplyr-join.R ## @@ -90,9 +90,57 @@ test_that("Error handling", { left_tab %>% left_join(to_join, by = "no

[GitHub] [arrow] thisisnic commented on a change in pull request #12172: ARROW-14338: [Docs] Add version dropdown to the pkgdown (R) docs

2022-01-19 Thread GitBox
thisisnic commented on a change in pull request #12172: URL: https://github.com/apache/arrow/pull/12172#discussion_r787657985 ## File path: r/pkgdown/extra.js ## @@ -59,6 +77,45 @@ var empty_ul = $("#toc").find("ul").filter(":empty"); empty_ul.remove(); }); +

[GitHub] [arrow-rs] alamb commented on a change in pull request #1205: Return error from JSON writer rather than panic

2022-01-19 Thread GitBox
alamb commented on a change in pull request #1205: URL: https://github.com/apache/arrow-rs/pull/1205#discussion_r787657103 ## File path: arrow/src/json/writer.rs ## @@ -110,64 +110,64 @@ use serde_json::Value; use crate::array::*; use crate::datatypes::*; -use crate::error:

[GitHub] [arrow-rs] alamb commented on issue #1200: Implement DecimalArray support in `eq_dyn`, `neq_dyn`, `lt_dyn`, `lt_eq_dyn`, `gt_dyn`, `gt_eq_dyn`

2022-01-19 Thread GitBox
alamb commented on issue #1200: URL: https://github.com/apache/arrow-rs/issues/1200#issuecomment-1016378357 @liukun4515 I have not started this

[GitHub] [arrow-datafusion] alamb commented on a change in pull request #1606: Fix comparison of dictionary arrays

2022-01-19 Thread GitBox
alamb commented on a change in pull request #1606: URL: https://github.com/apache/arrow-datafusion/pull/1606#discussion_r787661991 ## File path: datafusion/src/physical_plan/coercion_rule/binary_rule.rs ## @@ -77,7 +77,7 @@ pub(crate) fn coerce_types( } fn comparison_eq_coe

[GitHub] [arrow-datafusion] alamb commented on pull request #1607: consolidate binary_expr coercion rule code

2022-01-19 Thread GitBox
alamb commented on pull request #1607: URL: https://github.com/apache/arrow-datafusion/pull/1607#issuecomment-1016381999 > I think it's my fault that I left some duplicated code. > My original thought was that I would consolidate binary logic and coercion in the follow-up pull requests.

[GitHub] [arrow-rs] codecov-commenter edited a comment on pull request #1195: Add comparison support for fully qualified BinaryArray

2022-01-19 Thread GitBox
codecov-commenter edited a comment on pull request #1195: URL: https://github.com/apache/arrow-rs/pull/1195#issuecomment-1015025874 # [Codecov](https://codecov.io/gh/apache/arrow-rs/pull/1195?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm

[GitHub] [arrow] rok commented on a change in pull request #12141: ARROW-14100: [C++] subtract(duration, duration) -> duration kernel

2022-01-19 Thread GitBox
rok commented on a change in pull request #12141: URL: https://github.com/apache/arrow/pull/12141#discussion_r78719 ## File path: cpp/src/arrow/compute/kernels/scalar_temporal_test.cc ## @@ -976,6 +976,23 @@ TEST_F(ScalarTemporalTest, TestTemporalDifference) { } } +TE

[GitHub] [arrow-datafusion] yjshen commented on a change in pull request #1596: Consolidate sort and external_sort

2022-01-19 Thread GitBox
yjshen commented on a change in pull request #1596: URL: https://github.com/apache/arrow-datafusion/pull/1596#discussion_r787668918 ## File path: datafusion/src/physical_plan/sorts/sort.rs ## @@ -15,47 +15,432 @@ // specific language governing permissions and limitations // u

[GitHub] [arrow] github-actions[bot] commented on pull request #12186: ARROW-15343: [Doc][Guide] Introduction and the checklist - minor corrections

2022-01-19 Thread GitBox
github-actions[bot] commented on pull request #12186: URL: https://github.com/apache/arrow/pull/12186#issuecomment-1016393319 https://issues.apache.org/jira/browse/ARROW-15343

[GitHub] [arrow] kszucs closed pull request #11726: ARROW-14738: [Python][Doc] Make return types clickable

2022-01-19 Thread GitBox
kszucs closed pull request #11726: URL: https://github.com/apache/arrow/pull/11726

[GitHub] [arrow-rs] codecov-commenter edited a comment on pull request #1192: Update parquet crate readme

2022-01-19 Thread GitBox
codecov-commenter edited a comment on pull request #1192: URL: https://github.com/apache/arrow-rs/pull/1192#issuecomment-1015385265 # [Codecov](https://codecov.io/gh/apache/arrow-rs/pull/1192?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm

[GitHub] [arrow-datafusion] alamb opened a new issue #1612: LogicalPlan Optimizer: Push ` = ` predicates into Joins, if possible

2022-01-19 Thread GitBox
alamb opened a new issue #1612: URL: https://github.com/apache/arrow-datafusion/issues/1612 **Is your feature request related to a problem or challenge? Please describe what you are trying to do.** We want to bring some extra performance for certain predicates to the dataframe API users

[GitHub] [arrow-datafusion] alamb commented on pull request #1566: fix: sql planner creates cross join instead of inner join from select predicates

2022-01-19 Thread GitBox
alamb commented on pull request #1566: URL: https://github.com/apache/arrow-datafusion/pull/1566#issuecomment-1016398895 > Yes, this is exactly what I have in mind. Filed https://github.com/apache/arrow-datafusion/issues/1612 to track the idea

[GitHub] [arrow] dragosmg commented on a change in pull request #12179: ARROW-14609 [R] left_join by argument error message mismatch

2022-01-19 Thread GitBox
dragosmg commented on a change in pull request #12179: URL: https://github.com/apache/arrow/pull/12179#discussion_r787675965 ## File path: r/tests/testthat/test-dplyr-join.R ## @@ -90,9 +90,57 @@ test_that("Error handling", { left_tab %>% left_join(to_join, by = "no
