[GitHub] [arrow] velvia commented on pull request #9773: ARROW-12028 ARROW-11940: [Rust][DataFusion] Add TimestampMillisecond support to GROUP BY/hash aggregates

2021-03-30 Thread GitBox
velvia commented on pull request #9773: URL: https://github.com/apache/arrow/pull/9773#issuecomment-810817972 It seems the Docker build failed but I can't click on or expand on the details. -- This is an automated message from the Apache Git Service. To respond to the message, please log

[GitHub] [arrow] NinaPeng commented on a change in pull request #9060: ARROW-11089: [C++][Gandiva] Support list datatype for gandiva UDF

2021-03-30 Thread GitBox
NinaPeng commented on a change in pull request #9060: URL: https://github.com/apache/arrow/pull/9060#discussion_r604632644 ## File path: cpp/src/gandiva/projector.cc ## @@ -301,7 +348,24 @@ Status Projector::AllocArrayData(const DataTypePtr& type, int64_t num_records, }

[GitHub] [arrow] NinaPeng commented on a change in pull request #9060: ARROW-11089: [C++][Gandiva] Support list datatype for gandiva UDF

2021-03-30 Thread GitBox
NinaPeng commented on a change in pull request #9060: URL: https://github.com/apache/arrow/pull/9060#discussion_r604630536 ## File path: cpp/src/gandiva/field_descriptor.h ## @@ -58,12 +63,15 @@ class FieldDescriptor { bool HasDataBufferPtrIdx() const { return data_buffer_

[GitHub] [arrow] projjal commented on pull request #9060: ARROW-11089: [C++][Gandiva] Support list datatype for gandiva UDF

2021-03-30 Thread GitBox
projjal commented on pull request #9060: URL: https://github.com/apache/arrow/pull/9060#issuecomment-810807116 > > @emkornfield Thanks for your attention. But I have no idea how to add list type for jni ExtGandivaType. Would you like to give some suggestions? > > You need to be able

[GitHub] [arrow] projjal commented on pull request #9060: ARROW-11089: [C++][Gandiva] Support list datatype for gandiva UDF

2021-03-30 Thread GitBox
projjal commented on pull request #9060: URL: https://github.com/apache/arrow/pull/9060#issuecomment-810803612 > @emkornfield Thanks for your attention. But I have no idea how to add list type for jni ExtGandivaType. Would you like to give some suggestions? You need to be able to con

[GitHub] [arrow] projjal commented on a change in pull request #9060: ARROW-11089: [C++][Gandiva] Support list datatype for gandiva UDF

2021-03-30 Thread GitBox
projjal commented on a change in pull request #9060: URL: https://github.com/apache/arrow/pull/9060#discussion_r603902501 ## File path: cpp/src/gandiva/array_ops.cc ## @@ -0,0 +1,76 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor licens

[GitHub] [arrow] github-actions[bot] commented on pull request #9855: ARROW-11336: [C++][Doc] Improve Developing on Windows docs [WIP]

2021-03-30 Thread GitBox
github-actions[bot] commented on pull request #9855: URL: https://github.com/apache/arrow/pull/9855#issuecomment-810788843 https://issues.apache.org/jira/browse/ARROW-11336

[GitHub] [arrow] github-actions[bot] commented on pull request #9854: ARROW-9731: [C++][Python][R][Dataset] WIP: Port "head" into C++

2021-03-30 Thread GitBox
github-actions[bot] commented on pull request #9854: URL: https://github.com/apache/arrow/pull/9854#issuecomment-810788508 https://issues.apache.org/jira/browse/ARROW-9731

[GitHub] [arrow] kou closed pull request #9831: ARROW-7830: [C++][Parquet] Use Arrow version number for parquet

2021-03-30 Thread GitBox
kou closed pull request #9831: URL: https://github.com/apache/arrow/pull/9831

[GitHub] [arrow] emkornfield commented on pull request #9763: ARROW-12034: [Developer Tools] Formalize Minor PRs

2021-03-30 Thread GitBox
emkornfield commented on pull request #9763: URL: https://github.com/apache/arrow/pull/9763#issuecomment-810761494 I'll wait until at least Friday and then merge unless anyone objects.

[GitHub] [arrow] nevi-me closed pull request #9825: ARROW-12121: [Rust] [Parquet] Arrow writer benchmarks

2021-03-30 Thread GitBox
nevi-me closed pull request #9825: URL: https://github.com/apache/arrow/pull/9825

[GitHub] [arrow] emkornfield commented on a change in pull request #9817: ARROW-12104: [Go][Parquet] Second chunk of Ported Go Parquet code

2021-03-30 Thread GitBox
emkornfield commented on a change in pull request #9817: URL: https://github.com/apache/arrow/pull/9817#discussion_r604589584 ## File path: go/parquet/compress/compress_test.go ## @@ -0,0 +1,128 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contr

[GitHub] [arrow] emkornfield commented on a change in pull request #9817: ARROW-12104: [Go][Parquet] Second chunk of Ported Go Parquet code

2021-03-30 Thread GitBox
emkornfield commented on a change in pull request #9817: URL: https://github.com/apache/arrow/pull/9817#discussion_r604589430 ## File path: go/parquet/compress/compress_test.go ## @@ -0,0 +1,128 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contr

[GitHub] [arrow] emkornfield commented on pull request #9822: ARROW-12110: [Java] Implement ZSTD compression

2021-03-30 Thread GitBox
emkornfield commented on pull request #9822: URL: https://github.com/apache/arrow/pull/9822#issuecomment-810754989 > @emkornfield My only concern is performance. If we are going to address it in ARROW-11901, this PR is ready for merge IMO. Yes, I think functionality should come first

[GitHub] [arrow] liyafan82 commented on pull request #9822: ARROW-12110: [Java] Implement ZSTD compression

2021-03-30 Thread GitBox
liyafan82 commented on pull request #9822: URL: https://github.com/apache/arrow/pull/9822#issuecomment-810754381 > @liyafan82 I opened https://issues.apache.org/jira/browse/ARROW-12163 to track making compression levels configurable, any other concerns? @emkornfield My only concern i

[GitHub] [arrow] emkornfield commented on pull request #9573: ARROW-11783: [Rust] Proposal for RFCs in Rust Arrow

2021-03-30 Thread GitBox
emkornfield commented on pull request #9573: URL: https://github.com/apache/arrow/pull/9573#issuecomment-810754377 Stumbled across this while searching for something else. I think this is a great idea. From a governance point of view I would suggest the following. 1. Hold a vote on a

[GitHub] [arrow] emkornfield closed pull request #9836: ARROW-12138: [Go][IPC] Update flatbuffers definitions

2021-03-30 Thread GitBox
emkornfield closed pull request #9836: URL: https://github.com/apache/arrow/pull/9836

[GitHub] [arrow] emkornfield commented on pull request #9817: ARROW-12104: [Go][Parquet] Second chunk of Ported Go Parquet code

2021-03-30 Thread GitBox
emkornfield commented on pull request #9817: URL: https://github.com/apache/arrow/pull/9817#issuecomment-810751389 Sorry, this week is particularly bad. I will try to review on the weekend/next week.

[GitHub] [arrow] emkornfield commented on pull request #9822: ARROW-12110: [Java] Implement ZSTD compression

2021-03-30 Thread GitBox
emkornfield commented on pull request #9822: URL: https://github.com/apache/arrow/pull/9822#issuecomment-810746428 @liyafan82 I opened https://issues.apache.org/jira/browse/ARROW-12163 to track making compression levels configurable, any other concerns?

[GitHub] [arrow] github-actions[bot] commented on pull request #9853: ARROW-12157: [C++][Gandiva] Implement like function for regex expressions

2021-03-30 Thread GitBox
github-actions[bot] commented on pull request #9853: URL: https://github.com/apache/arrow/pull/9853#issuecomment-810744943 https://issues.apache.org/jira/browse/ARROW-12157

[GitHub] [arrow] wjones127 commented on pull request #8844: ARROW-7205 : [C++][Gandiva] Implement regexp_like in Gandiva

2021-03-30 Thread GitBox
wjones127 commented on pull request #8844: URL: https://github.com/apache/arrow/pull/8844#issuecomment-810735907 Projjal, that's a good point. The SQL like function is basically a wrapper around `regexp_like`. I don't know why I didn't see the optimizations should be the same. I wil
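
The idea that SQL `LIKE` can be treated as a thin wrapper over a regex match can be sketched as a pattern translation. This is only an illustration under assumptions, not Gandiva's implementation: `like_to_regex` is a hypothetical helper, it handles no escape character, and the exact set of metacharacters to escape depends on the regex dialect in use.

```rust
/// Hypothetical helper: translate a SQL LIKE pattern into an anchored
/// regular-expression string. `%` matches any run of characters and `_`
/// matches exactly one character; everything else is taken literally.
fn like_to_regex(pattern: &str) -> String {
    let mut out = String::from("^");
    for c in pattern.chars() {
        match c {
            '%' => out.push_str(".*"),
            '_' => out.push('.'),
            // Escape characters that carry meaning in common regex dialects.
            '\\' | '.' | '+' | '*' | '?' | '(' | ')' | '[' | ']' | '{' | '}' | '|' | '^'
            | '$' => {
                out.push('\\');
                out.push(c);
            }
            _ => out.push(c),
        }
    }
    out.push('$');
    out
}
```

With a translation like this, a `LIKE` call reduces to a regex match on the translated pattern, which is why optimizations for one would plausibly carry over to the other. Note that in many dialects `.` does not match newlines by default, so a real implementation would need to set the appropriate flag.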

[GitHub] [arrow] westonpace commented on a change in pull request #9808: ARROW-12097: [C++] Modify BackgroundGenerator so it creates fewer threads

2021-03-30 Thread GitBox
westonpace commented on a change in pull request #9808: URL: https://github.com/apache/arrow/pull/9808#discussion_r604572405 ## File path: cpp/src/arrow/util/async_generator_test.cc ## @@ -79,20 +79,14 @@ class TrackingGenerator { std::shared_ptr state_; }; +// Iterator V

[GitHub] [arrow] westonpace commented on pull request #9842: ARROW-12040: [R] [CI] [C++] test-r-rstudio-r-base-3.6-opensuse15 timing out during tests

2021-03-30 Thread GitBox
westonpace commented on pull request #9842: URL: https://github.com/apache/arrow/pull/9842#issuecomment-810733114 @jonkeane That second failure I have tracked now under ARROW-12161 @pitrou Future::All, Future::AllCompleted, and Future::Any (not sure if we've made this yet) are all re

[GitHub] [arrow] github-actions[bot] commented on pull request #9852: ARROW-12154: [C++][Gandiva] Fix gandiva crash in certain OS/CPU combinations

2021-03-30 Thread GitBox
github-actions[bot] commented on pull request #9852: URL: https://github.com/apache/arrow/pull/9852#issuecomment-810731255 https://issues.apache.org/jira/browse/ARROW-12154

[GitHub] [arrow] westonpace opened a new pull request #9859: ARROW-11887: [C++] REVERT Add asynchronous read to streaming CSV reader

2021-03-30 Thread GitBox
westonpace opened a new pull request #9859: URL: https://github.com/apache/arrow/pull/9859 This reverts commit 9e3aa8560a4ffb0298493110d8a7e9e0c699d6b4. See discussion in https://issues.apache.org/jira/browse/ARROW-12161

[GitHub] [arrow] github-actions[bot] commented on pull request #9851: ARROW-12155: [R] Require Table columns to be same length

2021-03-30 Thread GitBox
github-actions[bot] commented on pull request #9851: URL: https://github.com/apache/arrow/pull/9851#issuecomment-810728866 https://issues.apache.org/jira/browse/ARROW-12155

[GitHub] [arrow] ericwburden opened a new pull request #9858: ARROW-12160: [Rust] Add `into_inner()` to StreamWriter

2021-03-30 Thread GitBox
ericwburden opened a new pull request #9858: URL: https://github.com/apache/arrow/pull/9858 Add an `into_inner()` method to `ipc::writer::StreamWriter`, allowing users to recover the underlying writer, consuming the StreamWriter. Essentially exposes `into_inner()` from the BufWriter contai
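
The pattern described in this pull request can be sketched as follows. This is a minimal illustration under assumptions, not the actual `ipc::writer::StreamWriter`: `FramedWriter`, `write_frame`, and `demo` are hypothetical names, and the only real API relied on is `std::io::BufWriter::into_inner`.

```rust
use std::io::{self, BufWriter, Write};

/// Hypothetical stand-in for a stream writer that buffers output internally.
/// It is NOT the arrow `StreamWriter`; it only illustrates the pattern.
struct FramedWriter<W: Write> {
    writer: BufWriter<W>,
}

impl<W: Write> FramedWriter<W> {
    fn new(inner: W) -> Self {
        Self { writer: BufWriter::new(inner) }
    }

    /// Append raw bytes to the buffered output.
    fn write_frame(&mut self, payload: &[u8]) -> io::Result<()> {
        self.writer.write_all(payload)
    }

    /// Consume the wrapper and hand back the underlying writer.
    fn into_inner(self) -> io::Result<W> {
        // `BufWriter::into_inner` flushes buffered bytes first; on failure it
        // returns an `IntoInnerError`, converted here to a plain `io::Error`.
        self.writer.into_inner().map_err(io::Error::from)
    }
}

/// Write two frames into an in-memory buffer, then recover the buffer.
fn demo() -> io::Result<Vec<u8>> {
    let mut w = FramedWriter::new(Vec::new());
    w.write_frame(b"abc")?;
    w.write_frame(b"def")?;
    w.into_inner()
}
```

Taking `self` by value means the wrapper cannot be used after the inner writer has been recovered, which matches the "consuming" behaviour the description mentions.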

[GitHub] [arrow] github-actions[bot] commented on pull request #9850: ARROW-12153: [Rust] [Parquet] Return file stats after writing file

2021-03-30 Thread GitBox
github-actions[bot] commented on pull request #9850: URL: https://github.com/apache/arrow/pull/9850#issuecomment-810724551 https://issues.apache.org/jira/browse/ARROW-12153

[GitHub] [arrow] cyb70289 commented on a change in pull request #9841: ARROW-11070: [C++] Implement power / exponentiation compute kernel

2021-03-30 Thread GitBox
cyb70289 commented on a change in pull request #9841: URL: https://github.com/apache/arrow/pull/9841#discussion_r604553475 ## File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc ## @@ -233,6 +235,282 @@ struct DivideChecked { } }; +struct Power { + template +

[GitHub] [arrow] projjal commented on a change in pull request #9700: ARROW-11960: [C++][Gandiva]Support escape in LIKE

2021-03-30 Thread GitBox
projjal commented on a change in pull request #9700: URL: https://github.com/apache/arrow/pull/9700#discussion_r604550069 ## File path: cpp/src/gandiva/function_registry_string.cc ## @@ -100,6 +100,10 @@ std::vector GetStringFunctionRegistry() { kResultNul

[GitHub] [arrow] Crystrix commented on a change in pull request #9700: ARROW-11960: [C++][Gandiva]Support escape in LIKE

2021-03-30 Thread GitBox
Crystrix commented on a change in pull request #9700: URL: https://github.com/apache/arrow/pull/9700#discussion_r604547954 ## File path: cpp/src/gandiva/function_registry_string.cc ## @@ -100,6 +100,10 @@ std::vector GetStringFunctionRegistry() { kResultNu

[GitHub] [arrow] cyb70289 commented on a change in pull request #9838: ARROW-12134: [C++] Add match_substring_regex kernel

2021-03-30 Thread GitBox
cyb70289 commented on a change in pull request #9838: URL: https://github.com/apache/arrow/pull/9838#discussion_r604543823 ## File path: cpp/src/arrow/compute/kernels/scalar_string.cc ## @@ -411,40 +411,104 @@ void TransformMatchSubstring(const uint8_t* pattern, int64_t patter

[GitHub] [arrow] github-actions[bot] commented on pull request #9849: ARROW-12068: [Python] Stop using distutils

2021-03-30 Thread GitBox
github-actions[bot] commented on pull request #9849: URL: https://github.com/apache/arrow/pull/9849#issuecomment-810699015 https://issues.apache.org/jira/browse/ARROW-12068

[GitHub] [arrow] cyb70289 commented on pull request #9843: ARROW-12145: [Developer][Archery] Flaky: test_static_runner_from_json

2021-03-30 Thread GitBox
cyb70289 commented on pull request #9843: URL: https://github.com/apache/arrow/pull/9843#issuecomment-810690610 Thanks @dianaclarke

[GitHub] [arrow] cyb70289 closed pull request #9843: ARROW-12145: [Developer][Archery] Flaky: test_static_runner_from_json

2021-03-30 Thread GitBox
cyb70289 closed pull request #9843: URL: https://github.com/apache/arrow/pull/9843

[GitHub] [arrow] alexey-milovidov closed pull request #9857: Remove submodules

2021-03-30 Thread GitBox
alexey-milovidov closed pull request #9857: URL: https://github.com/apache/arrow/pull/9857

[GitHub] [arrow] alexey-milovidov commented on pull request #9857: Remove submodules

2021-03-30 Thread GitBox
alexey-milovidov commented on pull request #9857: URL: https://github.com/apache/arrow/pull/9857#issuecomment-810687074 Sorry, opened by mistake.

[GitHub] [arrow] alexey-milovidov opened a new pull request #9857: Remove submodules

2021-03-30 Thread GitBox
alexey-milovidov opened a new pull request #9857: URL: https://github.com/apache/arrow/pull/9857

[GitHub] [arrow] kou opened a new pull request #9856: ARROW-11858: [GLib][Gandiva] Add Gandiva::Filter and related functions

2021-03-30 Thread GitBox
kou opened a new pull request #9856: URL: https://github.com/apache/arrow/pull/9856

[GitHub] [arrow] projjal commented on pull request #8844: ARROW-7205 : [C++][Gandiva] Implement regexp_like in Gandiva

2021-03-30 Thread GitBox
projjal commented on pull request #8844: URL: https://github.com/apache/arrow/pull/8844#issuecomment-810671294 Hi @wjones127. Creating a base holder class and two derived classes for just these two functions seems like overkill to me (I think that's why I closed the original PR temporar

[GitHub] [arrow] github-actions[bot] commented on pull request #9848: ARROW-12089: [Doc] Fix Sphinx warnings

2021-03-30 Thread GitBox
github-actions[bot] commented on pull request #9848: URL: https://github.com/apache/arrow/pull/9848#issuecomment-810667462 https://issues.apache.org/jira/browse/ARROW-12089

[GitHub] [arrow] projjal commented on pull request #9853: ARROW-12157: [C++][Gandiva] Implement like function for regex expressions

2021-03-30 Thread GitBox
projjal commented on pull request #9853: URL: https://github.com/apache/arrow/pull/9853#issuecomment-810666971 > > @frank400 Could this be a duplicate of #8844? Or is this function different? > > @wjones127, after I saw your PR I also think it is a duplicate, @projjal can you take a

[GitHub] [arrow] github-actions[bot] commented on pull request #9847: ARROW-12108: [Rust] [DataFusion] Implement SHOW TABLES

2021-03-30 Thread GitBox
github-actions[bot] commented on pull request #9847: URL: https://github.com/apache/arrow/pull/9847#issuecomment-810663606 https://issues.apache.org/jira/browse/ARROW-12108

[GitHub] [arrow] github-actions[bot] commented on pull request #9846: ARROW-12143: [CI] R builds should timeout and fail after some threshold and dump the output.

2021-03-30 Thread GitBox
github-actions[bot] commented on pull request #9846: URL: https://github.com/apache/arrow/pull/9846#issuecomment-810663512 Revision: 85a06c8fa2da77deb7ceb09a9db02304360579c0 Submitted crossbow builds: [ursacomputing/crossbow @ actions-250](https://github.com/ursacomputing/crossbow/br

[GitHub] [arrow] kou closed pull request #9045: ARROW-11180: [Developer] cmake-format pre-commit hook doesn't run

2021-03-30 Thread GitBox
kou closed pull request #9045: URL: https://github.com/apache/arrow/pull/9045

[GitHub] [arrow] kou commented on pull request #9045: ARROW-11180: [Developer] cmake-format pre-commit hook doesn't run

2021-03-30 Thread GitBox
kou commented on pull request #9045: URL: https://github.com/apache/arrow/pull/9045#issuecomment-810651397 Thanks for confirming this. I'll merge this.

[GitHub] [arrow] github-actions[bot] commented on pull request #9846: ARROW-12143: [CI] R builds should timeout and fail after some threshold and dump the output.

2021-03-30 Thread GitBox
github-actions[bot] commented on pull request #9846: URL: https://github.com/apache/arrow/pull/9846#issuecomment-810638539 https://issues.apache.org/jira/browse/ARROW-12143

[GitHub] [arrow] velvia commented on pull request #9773: ARROW-12028 ARROW-11940: [Rust][DataFusion] Add TimestampMillisecond support to GROUP BY/hash aggregates

2021-03-30 Thread GitBox
velvia commented on pull request #9773: URL: https://github.com/apache/arrow/pull/9773#issuecomment-810622821 @alamb pushed the cargo fmt fix. :)

[GitHub] [arrow] drin commented on pull request #9810: ARROW-11677: [C++][Docs] Add basic C++ datasets documentation

2021-03-30 Thread GitBox
drin commented on pull request #9810: URL: https://github.com/apache/arrow/pull/9810#issuecomment-810615985 > @drin thanks for the feedback. I can add a snippet about that as well. If there are other things that were confusing or difficult to figure out, I'd like to tackle them too.

[GitHub] [arrow] drin commented on a change in pull request #9810: ARROW-11677: [C++][Docs] Add basic C++ datasets documentation

2021-03-30 Thread GitBox
drin commented on a change in pull request #9810: URL: https://github.com/apache/arrow/pull/9810#discussion_r604466376 ## File path: cpp/examples/arrow/dataset-documentation-example.cc ## @@ -217,24 +229,29 @@ std::shared_ptr SelectAndProjectDataset( auto scan_builder = data

[GitHub] [arrow] frank400 commented on pull request #9853: ARROW-12157: [C++][Gandiva] Implement like function for regex expressions

2021-03-30 Thread GitBox
frank400 commented on pull request #9853: URL: https://github.com/apache/arrow/pull/9853#issuecomment-810612118 > @frank400 Could this be a duplicate of #8844? Or is this function different? @wjones127, after I saw your PR I also think it is a duplicate, @projjal can you take a look

[GitHub] [arrow] ianmcook opened a new pull request #9855: ARROW-11336: [C++][Doc] Improve Developing on Windows docs [WIP]

2021-03-30 Thread GitBox
ianmcook opened a new pull request #9855: URL: https://github.com/apache/arrow/pull/9855 This improves the **Developing on Windows** docs, adding instructions for Visual Studio 2019 users and documenting how to use vcpkg to install Arrow dependencies on Windows.

[GitHub] [arrow] lidavidm commented on pull request #9810: ARROW-11677: [C++][Docs] Add basic C++ datasets documentation

2021-03-30 Thread GitBox
lidavidm commented on pull request #9810: URL: https://github.com/apache/arrow/pull/9810#issuecomment-810608422 @drin thanks for the feedback. I can add a snippet about that as well. If there are other things that were confusing or difficult to figure out, I'd like to tackle them too.

[GitHub] [arrow] drin commented on pull request #9810: ARROW-11677: [C++][Docs] Add basic C++ datasets documentation

2021-03-30 Thread GitBox
drin commented on pull request #9810: URL: https://github.com/apache/arrow/pull/9810#issuecomment-810607005 @lidavidm I'm curious about your thoughts on the other things that `ScannerBuilder`s can be constructed from and whether this is something we would want to add to these commits, or if we

[GitHub] [arrow] lidavidm commented on pull request #9854: ARROW-9731: [C++][Python][R][Dataset] WIP: Port "head" into C++

2021-03-30 Thread GitBox
lidavidm commented on pull request #9854: URL: https://github.com/apache/arrow/pull/9854#issuecomment-810602431 Depends on ARROW-7001/#9607 so not quite ready yet. @westonpace, I made commits here to deprecate uses of Scan() in Python/R which you might want to just cherry-pick into your ow

[GitHub] [arrow] lidavidm opened a new pull request #9854: ARROW-9731: [C++][Python][R][Dataset] WIP: Port "head" into C++

2021-03-30 Thread GitBox
lidavidm opened a new pull request #9854: URL: https://github.com/apache/arrow/pull/9854

[GitHub] [arrow] kou closed pull request #9845: ARROW-12128: [CI][Crossbow] Remove test-ubuntu-16.04-cpp job

2021-03-30 Thread GitBox
kou closed pull request #9845: URL: https://github.com/apache/arrow/pull/9845

[GitHub] [arrow] kou commented on pull request #9845: ARROW-12128: [CI][Crossbow] Remove test-ubuntu-16.04-cpp job

2021-03-30 Thread GitBox
kou commented on pull request #9845: URL: https://github.com/apache/arrow/pull/9845#issuecomment-810568929 +1

[GitHub] [arrow] wesm commented on pull request #9763: ARROW-12034: [Developer Tools] Formalize Minor PRs

2021-03-30 Thread GitBox
wesm commented on pull request #9763: URL: https://github.com/apache/arrow/pull/9763#issuecomment-810562937 +1 from me on the content

[GitHub] [arrow] wjones127 commented on pull request #9853: ARROW-12157: [C++][Gandiva] Implement like function for regex expressions

2021-03-30 Thread GitBox
wjones127 commented on pull request #9853: URL: https://github.com/apache/arrow/pull/9853#issuecomment-810546496 @frank400 Could this be a duplicate of https://github.com/apache/arrow/pull/8844? Or is this function different?

[GitHub] [arrow] pitrou commented on a change in pull request #9843: ARROW-12145: [Developer][Archery] Flaky: test_static_runner_from_json

2021-03-30 Thread GitBox
pitrou commented on a change in pull request #9843: URL: https://github.com/apache/arrow/pull/9843#discussion_r604396615 ## File path: dev/archery/archery/tests/test_benchmarks.py ## @@ -94,10 +94,16 @@ def test_static_runner_from_json(): archery_result['suites'][0]['bench

[GitHub] [arrow] jpedroantunes commented on a change in pull request #9844: ARROW-12146: [C++][Gandiva] Implement CONVERT_FROM(expression, replacement char) function

2021-03-30 Thread GitBox
jpedroantunes commented on a change in pull request #9844: URL: https://github.com/apache/arrow/pull/9844#discussion_r604392118 ## File path: cpp/src/gandiva/precompiled/string_ops.cc ## @@ -1246,6 +1246,39 @@ const char* convert_fromUTF8_binary(gdv_int64 context, const char*

[GitHub] [arrow] jpedroantunes commented on a change in pull request #9844: ARROW-12146: [C++][Gandiva] Implement CONVERT_FROM(expression, replacement char) function

2021-03-30 Thread GitBox
jpedroantunes commented on a change in pull request #9844: URL: https://github.com/apache/arrow/pull/9844#discussion_r604391508 ## File path: cpp/src/gandiva/precompiled/string_ops.cc ## @@ -1246,6 +1246,39 @@ const char* convert_fromUTF8_binary(gdv_int64 context, const char*

[GitHub] [arrow] nevi-me commented on pull request #9825: ARROW-12121: [Rust] [Parquet] Arrow writer benchmarks

2021-03-30 Thread GitBox
nevi-me commented on pull request #9825: URL: https://github.com/apache/arrow/pull/9825#issuecomment-810522046 @alamb that's brutal AMD 3900x (the previous bench was on a MBA) ``` write_batch primitive/1024 values

[GitHub] [arrow] zeroshade commented on a change in pull request #9817: ARROW-12104: [Go][Parquet] Second chunk of Ported Go Parquet code

2021-03-30 Thread GitBox
zeroshade commented on a change in pull request #9817: URL: https://github.com/apache/arrow/pull/9817#discussion_r604351891 ## File path: go/parquet/compress/compress_test.go ## @@ -0,0 +1,128 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contrib

[GitHub] [arrow] frank400 opened a new pull request #9853: ARROW-12157: [C++][Gandiva] Implement like function for regex expressions

2021-03-30 Thread GitBox
frank400 opened a new pull request #9853: URL: https://github.com/apache/arrow/pull/9853

[GitHub] [arrow] zeroshade commented on a change in pull request #9817: ARROW-12104: [Go][Parquet] Second chunk of Ported Go Parquet code

2021-03-30 Thread GitBox
zeroshade commented on a change in pull request #9817: URL: https://github.com/apache/arrow/pull/9817#discussion_r604337163 ## File path: go/parquet/compress/compress_test.go ## @@ -0,0 +1,128 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contrib

[GitHub] [arrow] zeroshade commented on a change in pull request #9817: ARROW-12104: [Go][Parquet] Second chunk of Ported Go Parquet code

2021-03-30 Thread GitBox
zeroshade commented on a change in pull request #9817: URL: https://github.com/apache/arrow/pull/9817#discussion_r604336715 ## File path: go/parquet/compress/compress.go ## @@ -0,0 +1,150 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor

[GitHub] [arrow] dianaclarke commented on a change in pull request #9843: ARROW-12145: [Developer][Archery] Flaky: test_static_runner_from_json

2021-03-30 Thread GitBox
dianaclarke commented on a change in pull request #9843: URL: https://github.com/apache/arrow/pull/9843#discussion_r604325602 ## File path: dev/archery/archery/tests/test_benchmarks.py ## @@ -94,10 +94,16 @@ def test_static_runner_from_json(): archery_result['suites'][0]['

[GitHub] [arrow] zeroshade commented on a change in pull request #9817: ARROW-12104: [Go][Parquet] Second chunk of Ported Go Parquet code

2021-03-30 Thread GitBox
zeroshade commented on a change in pull request #9817: URL: https://github.com/apache/arrow/pull/9817#discussion_r604325152 ## File path: go/parquet/compress/compress.go ## @@ -0,0 +1,150 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor

[GitHub] [arrow] zeroshade commented on a change in pull request #9817: ARROW-12104: [Go][Parquet] Second chunk of Ported Go Parquet code

2021-03-30 Thread GitBox
zeroshade commented on a change in pull request #9817: URL: https://github.com/apache/arrow/pull/9817#discussion_r604324653 ## File path: go/parquet/compress/brotli.go ## @@ -0,0 +1,112 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor li

[GitHub] [arrow] pitrou commented on a change in pull request #9848: ARROW-12089: [Doc] Fix Sphinx warnings

2021-03-30 Thread GitBox
pitrou commented on a change in pull request #9848: URL: https://github.com/apache/arrow/pull/9848#discussion_r604316680 ## File path: docs/source/developers/benchmarks.rst ## @@ -33,22 +33,23 @@ The benchmark suites can be run with the ``benchmark run`` sub-command. .. cod

[GitHub] [arrow] jpedroantunes commented on a change in pull request #9844: ARROW-12146: [C++][Gandiva] Implement CONVERT_FROM(expression, replacement char) function

2021-03-30 Thread GitBox
jpedroantunes commented on a change in pull request #9844: URL: https://github.com/apache/arrow/pull/9844#discussion_r604220433 ## File path: cpp/src/gandiva/precompiled/string_ops.cc ## @@ -1246,6 +1246,33 @@ const char* convert_fromUTF8_binary(gdv_int64 context, const char*

[GitHub] [arrow] jorisvandenbossche commented on a change in pull request #9848: ARROW-12089: [Doc] Fix Sphinx warnings

2021-03-30 Thread GitBox
jorisvandenbossche commented on a change in pull request #9848: URL: https://github.com/apache/arrow/pull/9848#discussion_r604311060 ## File path: docs/source/developers/benchmarks.rst ## @@ -33,22 +33,23 @@ The benchmark suites can be run with the ``benchmark run`` sub-comman

[GitHub] [arrow] lidavidm closed pull request #9730: ARROW-9878: [Python] Document caveats of to_pandas(self_destruct=True)

2021-03-30 Thread GitBox
lidavidm closed pull request #9730: URL: https://github.com/apache/arrow/pull/9730

[GitHub] [arrow] lidavidm commented on pull request #9802: ARROW-10882: [Python] Allow writing dataset from iterator of batches

2021-03-30 Thread GitBox
lidavidm commented on pull request #9802: URL: https://github.com/apache/arrow/pull/9802#issuecomment-810456269 This should be good (minus the known flaky integration test). As suggested I've changed _filesystemdataset_write such that it only needs to handle datasets now.

[GitHub] [arrow] MarcoGorelli commented on pull request #9045: ARROW-11180: [Developer] cmake-format pre-commit hook doesn't run

2021-03-30 Thread GitBox
MarcoGorelli commented on pull request #9045: URL: https://github.com/apache/arrow/pull/9045#issuecomment-810455739 Thanks for updating - looks good, thanks!

[GitHub] [arrow] projjal commented on a change in pull request #9707: ARROW-11984: [C++][Gandiva] Implement SHA1 and SHA256 functions

2021-03-30 Thread GitBox
projjal commented on a change in pull request #9707: URL: https://github.com/apache/arrow/pull/9707#discussion_r604305110 ## File path: cpp/cmake_modules/FindOpenSSLAlt.cmake ## @@ -0,0 +1,45 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributo

[GitHub] [arrow] projjal commented on a change in pull request #9844: ARROW-12146: [C++][Gandiva] Implement CONVERT_FROM(expression, replacement char) function

2021-03-30 Thread GitBox
projjal commented on a change in pull request #9844: URL: https://github.com/apache/arrow/pull/9844#discussion_r604303465 ## File path: cpp/src/gandiva/precompiled/string_ops.cc ## @@ -1246,6 +1246,39 @@ const char* convert_fromUTF8_binary(gdv_int64 context, const char* bin_in

[GitHub] [arrow] projjal commented on a change in pull request #9844: ARROW-12146: [C++][Gandiva] Implement CONVERT_FROM(expression, replacement char) function

2021-03-30 Thread GitBox
projjal commented on a change in pull request #9844: URL: https://github.com/apache/arrow/pull/9844#discussion_r604301461 ## File path: cpp/src/gandiva/precompiled/string_ops.cc ## @@ -1246,6 +1246,39 @@ const char* convert_fromUTF8_binary(gdv_int64 context, const char* bin_in

[GitHub] [arrow] WilliamWhispell commented on a change in pull request #9817: ARROW-12104: [Go][Parquet] Second chunk of Ported Go Parquet code

2021-03-30 Thread GitBox
WilliamWhispell commented on a change in pull request #9817: URL: https://github.com/apache/arrow/pull/9817#discussion_r602609760 ## File path: go/parquet/compress/brotli.go ## @@ -0,0 +1,112 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contribu

[GitHub] [arrow] projjal opened a new pull request #9852: ARROW-12154: [C++][Gandiva] Fix gandiva crash in certain OS/CPU combinations

2021-03-30 Thread GitBox
projjal opened a new pull request #9852: URL: https://github.com/apache/arrow/pull/9852 Currently, when running Gandiva on an OS that doesn't support all the features of the host CPU, specifically vector instructions like AVX and AVX512, which need OS support (because the OS or the VM r

[GitHub] [arrow] ianmcook opened a new pull request #9851: ARROW-12155: [R] Require Table columns to be same length

2021-03-30 Thread GitBox
ianmcook opened a new pull request #9851: URL: https://github.com/apache/arrow/pull/9851 This throws an error if the user attempts to create a Table with columns of different lengths. We already had this for RecordBatches but not for Tables.

[GitHub] [arrow] jonkeane commented on pull request #9842: ARROW-12040: [R] [CI] [C++] test-r-rstudio-r-base-3.6-opensuse15 timing out during tests

2021-03-30 Thread GitBox
jonkeane commented on pull request #9842: URL: https://github.com/apache/arrow/pull/9842#issuecomment-810425509 It looks like the test that is hanging now is https://github.com/apache/arrow/blob/master/r/tests/testthat/test-dataset.R#L1345-L1367 which interestingly is skipped on windows re

[GitHub] [arrow] jonkeane commented on pull request #9689: [WIP] Restore simpler ARROW_R_WITH_ARROW wrapping

2021-03-30 Thread GitBox
jonkeane commented on pull request #9689: URL: https://github.com/apache/arrow/pull/9689#issuecomment-810420129 I've taken a look at this and I think this is an overall better (temporary?) solution than wrapping each function call. Absolutely agree that an earlier + clearer error will be g

[GitHub] [arrow] wjones127 commented on pull request #9289: ARROW-11341: [Python] [Gandiva] Add NULL/None checks to Gandiva builder functions

2021-03-30 Thread GitBox
wjones127 commented on pull request #9289: URL: https://github.com/apache/arrow/pull/9289#issuecomment-810417521 Good call on the tests. Looks like we are still getting a segfault. I will look into that later today.

[GitHub] [arrow] nevi-me commented on pull request #9850: ARROW-12153: [Rust] [Parquet] Return file stats after writing file

2021-03-30 Thread GitBox
nevi-me commented on pull request #9850: URL: https://github.com/apache/arrow/pull/9850#issuecomment-810414900 @alamb, I don't know if this will be useful for you at IOx, but it's useful for https://github.com/delta-io/delta-rs, which needs stats for book-keeping. CC @xianwill

[GitHub] [arrow] nevi-me opened a new pull request #9850: ARROW-12153: [Rust] [Parquet] Return file stats after writing file

2021-03-30 Thread GitBox
nevi-me opened a new pull request #9850: URL: https://github.com/apache/arrow/pull/9850 The stats are useful for some writers who need to perform book-keeping of files written.

[GitHub] [arrow] pitrou commented on pull request #9849: ARROW-12068: [Python] Stop using distutils

2021-03-30 Thread GitBox
pitrou commented on pull request #9849: URL: https://github.com/apache/arrow/pull/9849#issuecomment-810403793 I triggered additional crossbow builds here: https://github.com/ursacomputing/crossbow/branches/all?query=build-120 and https://github.com/ursacomputing/crossbow/branches/all?query

[GitHub] [arrow] pitrou commented on pull request #9849: ARROW-12068: [Python] Stop using distutils

2021-03-30 Thread GitBox
pitrou commented on pull request #9849: URL: https://github.com/apache/arrow/pull/9849#issuecomment-810399428 It looks like AppVeyor will require more debugging unfortunately :-(

[GitHub] [arrow] emkornfield commented on pull request #9763: ARROW-12034: [Developer Tools] Formalize Minor PRs

2021-03-30 Thread GitBox
emkornfield commented on pull request #9763: URL: https://github.com/apache/arrow/pull/9763#issuecomment-810394311 > I do not know how to test the automation / scripts other than "in production" but that might be ok for this kind of PR @kou gave the pointer of testing on my fork abov

[GitHub] [arrow] pitrou commented on pull request #9842: ARROW-12040: [R] [CI] [C++] test-r-rstudio-r-base-3.6-opensuse15 timing out during tests

2021-03-30 Thread GitBox
pitrou commented on pull request #9842: URL: https://github.com/apache/arrow/pull/9842#issuecomment-810390072 I was thinking that this would make producing a stream of futures more difficult, but perhaps that's not the case actually.

[GitHub] [arrow] jpedroantunes commented on a change in pull request #9844: ARROW-12146: [C++][Gandiva] Implement CONVERT_FROM(expression, replacement char) function

2021-03-30 Thread GitBox
jpedroantunes commented on a change in pull request #9844: URL: https://github.com/apache/arrow/pull/9844#discussion_r604223982 ## File path: cpp/src/gandiva/precompiled/string_ops.cc ## @@ -1246,6 +1246,33 @@ const char* convert_fromUTF8_binary(gdv_int64 context, const char*

[GitHub] [arrow] jpedroantunes commented on a change in pull request #9844: ARROW-12146: [C++][Gandiva] Implement CONVERT_FROM(expression, replacement char) function

2021-03-30 Thread GitBox
jpedroantunes commented on a change in pull request #9844: URL: https://github.com/apache/arrow/pull/9844#discussion_r604221525 ## File path: cpp/src/gandiva/precompiled/string_ops.cc ## @@ -1246,6 +1246,33 @@ const char* convert_fromUTF8_binary(gdv_int64 context, const char*

[GitHub] [arrow] zeroshade commented on pull request #9836: ARROW-12138: [Go][IPC] Update flatbuffers definitions

2021-03-30 Thread GitBox
zeroshade commented on pull request #9836: URL: https://github.com/apache/arrow/pull/9836#issuecomment-810373785 @sbinet @emkornfield @wesm once this is merged, i have a PR ready for adding compression handling to the go IPC implementation, I'm just waiting for this to get merged before fi

[GitHub] [arrow] jpedroantunes commented on a change in pull request #9844: ARROW-12146: [C++][Gandiva] Implement CONVERT_FROM(expression, replacement char) function

2021-03-30 Thread GitBox
jpedroantunes commented on a change in pull request #9844: URL: https://github.com/apache/arrow/pull/9844#discussion_r604220697 ## File path: cpp/src/gandiva/precompiled/string_ops.cc ## @@ -1246,6 +1246,33 @@ const char* convert_fromUTF8_binary(gdv_int64 context, const char*

[GitHub] [arrow] pitrou commented on pull request #9849: ARROW-12068: [Python] Stop using distutils

2021-03-30 Thread GitBox
pitrou commented on pull request #9849: URL: https://github.com/apache/arrow/pull/9849#issuecomment-810366088 Triggered a crossbow build here: https://github.com/ursacomputing/crossbow/branches/all?query=build-119

[GitHub] [arrow] Dandandan edited a comment on pull request #9826: ARROW-12123: [Rust][DataFusion] Use smallvec for indices for better join performance

2021-03-30 Thread GitBox
Dandandan edited a comment on pull request #9826: URL: https://github.com/apache/arrow/pull/9826#issuecomment-810362925 Merged with master A small change with some nice performance benefits for common joins, without (for now at least) creating a custom hashmap datastructure. `smal

[GitHub] [arrow] Dandandan commented on pull request #9826: ARROW-12123: [Rust][DataFusion] Use smallvec for indices for better join performance

2021-03-30 Thread GitBox
Dandandan commented on pull request #9826: URL: https://github.com/apache/arrow/pull/9826#issuecomment-810362925 Rebased against master A small change with some nice performance benefits for common joins, without (for now at least) creating a custom hashmap datastructure. `smallve
