[GitHub] [arrow-rs] HaoYang670 commented on issue #2071: Several Builder::append methods returning results even though they are infallible

2022-07-14 Thread GitBox
HaoYang670 commented on issue #2071: URL: https://github.com/apache/arrow-rs/issues/2071#issuecomment-1185230409 I had the same concern when updating the binary and list builders -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub

[GitHub] [arrow-rs] HaoYang670 commented on pull request #2041: Add support of converting `FixedSizeBinaryArray` to `DecimalArray`

2022-07-14 Thread GitBox
HaoYang670 commented on PR #2041: URL: https://github.com/apache/arrow-rs/pull/2041#issuecomment-1185227969 The failure randomly occurred, such as in #2064. https://github.com/apache/arrow-rs/runs/7335882941?check_suite_focus=true

[GitHub] [arrow-datafusion] nevi-me opened a new issue, #2916: csv_explain fails on RC verifier

2022-07-14 Thread GitBox
nevi-me opened a new issue, #2916: URL: https://github.com/apache/arrow-datafusion/issues/2916 **Describe the bug** When verifying the version 10 RC1, 3 csv tests fail because of an expected hardcoded string being different. `privateARROW_TEST_DATA` appears instead of `ARROW_TEST_DATA`.

[GitHub] [arrow] AlenkaF commented on pull request #13591: ARROW-17046 [Python] improve documentation of pyarrow.parquet.write_to_dataset function

2022-07-14 Thread GitBox
AlenkaF commented on PR #13591: URL: https://github.com/apache/arrow/pull/13591#issuecomment-1185211658 Thanks for the changes @mirkhosro. Only the linter errors need to be corrected and then I think the PR is ready: ``` INFO:archery:Running Python linter (flake8) /arrow/pytho

[GitHub] [arrow] kou commented on a diff in pull request #13334: ARROW-14314: [C++] Sorting dictionary array not implemented

2022-07-14 Thread GitBox
kou commented on code in PR #13334: URL: https://github.com/apache/arrow/pull/13334#discussion_r921821992 ## cpp/src/arrow/array/array_dict.h: ## @@ -111,6 +111,10 @@ class ARROW_EXPORT DictionaryArray : public Array { const DictionaryType* dict_type() const { return dict_t

[GitHub] [arrow] kou commented on pull request #13334: ARROW-14314: [C++] Sorting dictionary array not implemented

2022-07-14 Thread GitBox
kou commented on PR #13334: URL: https://github.com/apache/arrow/pull/13334#issuecomment-1185184543 Sure.

[GitHub] [arrow] ursabot commented on pull request #13514: ARROW-16977: [R] Update dataset row counting so no integer overflow on large datasets

2022-07-14 Thread GitBox
ursabot commented on PR #13514: URL: https://github.com/apache/arrow/pull/13514#issuecomment-1185171584 Benchmark runs are scheduled for baseline = 5d86e9fc4063ab56e6fb4fb0b44667d3c1b74836 and contender = 87d1889092bafdf12f6b69e743d9807c78e1358b. 87d1889092bafdf12f6b69e743d9807c78e1358b is

[GitHub] [arrow] AlenkaF commented on a diff in pull request #13582: ARROW-16094: [Python] Address docstrings in Filesystems (Utilities)

2022-07-14 Thread GitBox
AlenkaF commented on code in PR #13582: URL: https://github.com/apache/arrow/pull/13582#discussion_r921807196 ## python/pyarrow/fs.py: ## @@ -227,16 +227,28 @@ def copy_files(source, destination, Examples -Copy an S3 bucket's files to a local directory:

[GitHub] [arrow] vibhatha commented on a diff in pull request #13613: ARROW-15582: [C++] Add support for registering standard Substrait functions

2022-07-14 Thread GitBox
vibhatha commented on code in PR #13613: URL: https://github.com/apache/arrow/pull/13613#discussion_r921781258 ## cpp/src/arrow/engine/substrait/function_test.cc: ## @@ -0,0 +1,156 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license

[GitHub] [arrow] vibhatha commented on a diff in pull request #13613: ARROW-15582: [C++] Add support for registering standard Substrait functions

2022-07-14 Thread GitBox
vibhatha commented on code in PR #13613: URL: https://github.com/apache/arrow/pull/13613#discussion_r921780857 ## cpp/src/arrow/engine/substrait/extension_set.cc: ## @@ -129,18 +136,19 @@ Result ExtensionSet::Make( return Status::Invalid("Type ", type_ids[i].uri, "#", type_

[GitHub] [arrow] vibhatha commented on a diff in pull request #13613: ARROW-15582: [C++] Add support for registering standard Substrait functions

2022-07-14 Thread GitBox
vibhatha commented on code in PR #13613: URL: https://github.com/apache/arrow/pull/13613#discussion_r921780460 ## cpp/src/arrow/compute/exec/util.cc: ## @@ -383,5 +383,25 @@ size_t ThreadIndexer::Check(size_t thread_index) { return thread_index; } +Status TableSinkNodeCons

[GitHub] [arrow-datafusion] waynexia commented on a diff in pull request #2915: Fix invalid projection in `CommonSubexprEliminate`

2022-07-14 Thread GitBox
waynexia commented on code in PR #2915: URL: https://github.com/apache/arrow-datafusion/pull/2915#discussion_r921756118 ## datafusion/optimizer/src/common_subexpr_eliminate.rs: ## @@ -282,15 +282,13 @@ fn build_project_plan( } for field in input.schema().fields() { -

[GitHub] [arrow] vibhatha commented on pull request #13567: ARROW-16911: [C++] Add Equals method to Partitioning

2022-07-14 Thread GitBox
vibhatha commented on PR #13567: URL: https://github.com/apache/arrow/pull/13567#issuecomment-1185133720 cc @lidavidm @pitrou addressed the reviews

[GitHub] [arrow] github-actions[bot] commented on pull request #13616: ARROW-17071: [C++][Compute] Fixing off-by-one error in hash join node

2022-07-14 Thread GitBox
github-actions[bot] commented on PR #13616: URL: https://github.com/apache/arrow/pull/13616#issuecomment-1185131244 Revision: 147aadbf524752d3126110a24a1af71450d4679f Submitted crossbow builds: [ursacomputing/crossbow @ actions-206b6227c4](https://github.com/ursacomputing/crossbow/bra

[GitHub] [arrow] github-actions[bot] commented on pull request #13598: ARROW-17078: [C++] Cleaning up C++ Examples

2022-07-14 Thread GitBox
github-actions[bot] commented on PR #13598: URL: https://github.com/apache/arrow/pull/13598#issuecomment-1185116172 https://issues.apache.org/jira/browse/ARROW-17078

[GitHub] [arrow] jinchengchenghh commented on pull request #13614: MINOR: [C++] install java/dataset include file and fix debug build failed by compiler flag

2022-07-14 Thread GitBox
jinchengchenghh commented on PR #13614: URL: https://github.com/apache/arrow/pull/13614#issuecomment-1185115674 @pitrou @lidavidm Can you take a look? Thanks!

[GitHub] [arrow] michalursa commented on pull request #13616: ARROW-17071: [C++][Compute] Fixing off-by-one error in hash join node

2022-07-14 Thread GitBox
michalursa commented on PR #13616: URL: https://github.com/apache/arrow/pull/13616#issuecomment-1185114544 @github-actions crossbow submit -g cpp

[GitHub] [arrow-rs] Ted-Jiang opened a new issue, #2072: Support skip_values in ByteArrayColumnValueDecoder

2022-07-14 Thread GitBox
Ted-Jiang opened a new issue, #2072: URL: https://github.com/apache/arrow-rs/issues/2072 **Is your feature request related to a problem or challenge? Please describe what you are trying to do.** Support skip_values in ByteArrayColumnValueDecoder related https://github.com/apache/arrow-

[GitHub] [arrow] vibhatha commented on pull request #13598: ARROW-17078: [C++] Cleaning up C++ Examples

2022-07-14 Thread GitBox
vibhatha commented on PR #13598: URL: https://github.com/apache/arrow/pull/13598#issuecomment-1185108853 @lidavidm I renamed the PR to match with the filed JIRA. Should we do something to track it here?

[GitHub] [arrow-rs] viirya closed pull request #2061: Use `read_volatile` in `BitChunkIterator.next`

2022-07-14 Thread GitBox
viirya closed pull request #2061: Use `read_volatile` in `BitChunkIterator.next` URL: https://github.com/apache/arrow-rs/pull/2061

[GitHub] [arrow-rs] viirya commented on a diff in pull request #2061: Use `read_volatile` in `BitChunkIterator.next`

2022-07-14 Thread GitBox
viirya commented on code in PR #2061: URL: https://github.com/apache/arrow-rs/pull/2061#discussion_r921749773 ## arrow/src/util/bit_chunk_iterator.rs: ## @@ -333,7 +333,7 @@ impl Iterator for BitChunkIterator<'_> { // the constructor ensures that bit_offset is in 0.

[GitHub] [arrow-rs] Ted-Jiang closed pull request #1977: Enable serialized_reader read specific Page by passing row ranges.

2022-07-14 Thread GitBox
Ted-Jiang closed pull request #1977: Enable serialized_reader read specific Page by passing row ranges. URL: https://github.com/apache/arrow-rs/pull/1977

[GitHub] [arrow-rs] Ted-Jiang commented on pull request #2044: Support peek_next_page() and skip_next_page in serialized_reader.

2022-07-14 Thread GitBox
Ted-Jiang commented on PR #2044: URL: https://github.com/apache/arrow-rs/pull/2044#issuecomment-1185096565 > Nice 👍 Thanks ❤️

[GitHub] [arrow] github-actions[bot] commented on pull request #13616: ARROW-17071: [C++][Compute] Fixing off-by-one error in hash join node

2022-07-14 Thread GitBox
github-actions[bot] commented on PR #13616: URL: https://github.com/apache/arrow/pull/13616#issuecomment-1185095187 Revision: 53ad1c2e52636785dcd8579e5a33592e50b385c2 Submitted crossbow builds: [ursacomputing/crossbow @ actions-1d8d74c193](https://github.com/ursacomputing/crossbow/bra

[GitHub] [arrow] github-actions[bot] commented on pull request #13616: ARROW-17071: [C++][Compute] Fixing off-by-one error in hash join node

2022-07-14 Thread GitBox
github-actions[bot] commented on PR #13616: URL: https://github.com/apache/arrow/pull/13616#issuecomment-1185091560 :warning: Ticket **has not been started in JIRA**, please click 'Start Progress'.

[GitHub] [arrow] github-actions[bot] commented on pull request #13616: ARROW-17071: [C++][Compute] Fixing off-by-one error in hash join node

2022-07-14 Thread GitBox
github-actions[bot] commented on PR #13616: URL: https://github.com/apache/arrow/pull/13616#issuecomment-1185091549 https://issues.apache.org/jira/browse/ARROW-17071

[GitHub] [arrow] michalursa commented on pull request #13616: ARROW-17071: [C++][Compute] Fixing off-by-one error in hash join node

2022-07-14 Thread GitBox
michalursa commented on PR #13616: URL: https://github.com/apache/arrow/pull/13616#issuecomment-1185091516 @github-actions crossbow submit -g cpp

[GitHub] [arrow] michalursa opened a new pull request, #13616: ARROW-17071: [C++][Compute] Fixing off-by-one error in hash join node

2022-07-14 Thread GitBox
michalursa opened a new pull request, #13616: URL: https://github.com/apache/arrow/pull/13616 Fixing off-by-one error in hash join node

[GitHub] [arrow] github-actions[bot] commented on pull request #13615: ARROW-17075: [C++] Enforce no trailing slashes on filenames in HDFS

2022-07-14 Thread GitBox
github-actions[bot] commented on PR #13615: URL: https://github.com/apache/arrow/pull/13615#issuecomment-1185090995 ``` Unable to match any tasks for `*_hdfs_*` The Archery job run can be found at: https://github.com/apache/arrow/actions/runs/2674196411 ```

[GitHub] [arrow] wjones127 commented on pull request #13615: ARROW-17075: [C++] Enforce no trailing slashes on filenames in HDFS

2022-07-14 Thread GitBox
wjones127 commented on PR #13615: URL: https://github.com/apache/arrow/pull/13615#issuecomment-1185090575 @github-actions crossbow submit *_hdfs_*

[GitHub] [arrow] github-actions[bot] commented on pull request #13615: ARROW-17075: [C++] Enforce no trailing slashes on filenames in HDFS

2022-07-14 Thread GitBox
github-actions[bot] commented on PR #13615: URL: https://github.com/apache/arrow/pull/13615#issuecomment-1185090215 ``` Unable to match any tasks for `_hdfs_` The Archery job run can be found at: https://github.com/apache/arrow/actions/runs/2674191177 ```

[GitHub] [arrow-cookbook] lidavidm commented on a diff in pull request #229: [Java] Adding Arrow Java JDBC adapter examples

2022-07-14 Thread GitBox
lidavidm commented on code in PR #229: URL: https://github.com/apache/arrow-cookbook/pull/229#discussion_r921741133 ## java/source/io.rst: ## @@ -444,6 +444,176 @@ Reading Parquet File Please check :doc:`Dataset <./dataset>` +Reading JDBC ResultSets +***

[GitHub] [arrow] wjones127 commented on pull request #13615: ARROW-17075: [C++] Enforce no trailing slashes on filenames in HDFS

2022-07-14 Thread GitBox
wjones127 commented on PR #13615: URL: https://github.com/apache/arrow/pull/13615#issuecomment-1185089846 @github-actions crossbow submit _hdfs_

[GitHub] [arrow] github-actions[bot] commented on pull request #13615: ARROW-17075: [C++] Enforce no trailing slashes on filenames in HDFS

2022-07-14 Thread GitBox
github-actions[bot] commented on PR #13615: URL: https://github.com/apache/arrow/pull/13615#issuecomment-1185088544 ``` Unable to match any tasks for `conda-python-hdfs` The Archery job run can be found at: https://github.com/apache/arrow/actions/runs/2674168121 ```

[GitHub] [arrow] github-actions[bot] commented on pull request #13615: ARROW-17075: [C++] Enforce no trailing slashes on filenames in HDFS

2022-07-14 Thread GitBox
github-actions[bot] commented on PR #13615: URL: https://github.com/apache/arrow/pull/13615#issuecomment-1185088351 ``` Invalid group(s) {'conda-python-hdfs'}. Must be one of {'example', 'fuzz', 'verify-rc-wheels', 'verify-rc-source-linux', 'vcpkg', 'linux-amd64', 'homebrew', 'nightly',

[GitHub] [arrow] paleolimbot commented on a diff in pull request #13160: ARROW-14575: [R] Allow functions with `pkg::` prefixes

2022-07-14 Thread GitBox
paleolimbot commented on code in PR #13160: URL: https://github.com/apache/arrow/pull/13160#discussion_r921735078 ## r/tests/testthat/test-dplyr-funcs-datetime.R: ## @@ -201,6 +212,14 @@ test_that("strftime", { times ) + # with namespacing Review Comment: That pat

[GitHub] [arrow] wjones127 commented on pull request #13615: ARROW-17075: [C++] Enforce no trailing slashes on filenames in HDFS

2022-07-14 Thread GitBox
wjones127 commented on PR #13615: URL: https://github.com/apache/arrow/pull/13615#issuecomment-1185086914 @github-actions crossbow submit conda-python-hdfs

[GitHub] [arrow-cookbook] davisusanibar commented on a diff in pull request #229: [Java] Adding Arrow Java JDBC adapter examples

2022-07-14 Thread GitBox
davisusanibar commented on code in PR #229: URL: https://github.com/apache/arrow-cookbook/pull/229#discussion_r921738140 ## java/source/io.rst: ## @@ -444,6 +444,176 @@ Reading Parquet File Please check :doc:`Dataset <./dataset>` +Reading JDBC ResultSets +**

[GitHub] [arrow] wjones127 commented on pull request #13615: ARROW-17075: [C++] Enforce no trailing slashes on filenames in HDFS

2022-07-14 Thread GitBox
wjones127 commented on PR #13615: URL: https://github.com/apache/arrow/pull/13615#issuecomment-1185085150 @github-actions crossbow submit -g conda-python-hdfs

[GitHub] [arrow] github-actions[bot] commented on pull request #13615: ARROW-17075: [C++] Enforce no trailing slashes on filenames in HDFS

2022-07-14 Thread GitBox
github-actions[bot] commented on PR #13615: URL: https://github.com/apache/arrow/pull/13615#issuecomment-1185084371 ``` Invalid group(s) {'conda-python-hdfs'}. Must be one of {'nightly', 'verify-rc-wheels', 'linux-amd64', 'conda', 'linux-arm64', 'example-python', 'fuzz', 'nightly-tests',

[GitHub] [arrow] cyb70289 commented on a diff in pull request #13583: ARROW-16807: [C++][R] count distinct incorrectly merges state

2022-07-14 Thread GitBox
cyb70289 commented on code in PR #13583: URL: https://github.com/apache/arrow/pull/13583#discussion_r921727349 ## cpp/src/arrow/util/hashing.h: ## @@ -485,6 +485,20 @@ class ScalarMemoTable : public MemoTable { hash_t ComputeHash(const Scalar& value) const { return Scala

[GitHub] [arrow] wjones127 commented on pull request #13615: ARROW-17075: [C++] Enforce no trailing slashes on filenames in HDFS

2022-07-14 Thread GitBox
wjones127 commented on PR #13615: URL: https://github.com/apache/arrow/pull/13615#issuecomment-1185077194 @github-actions crossbow submit -g conda-python-hdfs

[GitHub] [arrow] github-actions[bot] commented on pull request #13615: ARROW-17075: [C++] Enforce no trailing slashes on filenames in HDFS

2022-07-14 Thread GitBox
github-actions[bot] commented on PR #13615: URL: https://github.com/apache/arrow/pull/13615#issuecomment-1185076873 https://issues.apache.org/jira/browse/ARROW-17075

[GitHub] [arrow] wjones127 opened a new pull request, #13615: ARROW-17075: [C++] Enforce no trailing slashes on filenames in HDFS

2022-07-14 Thread GitBox
wjones127 opened a new pull request, #13615: URL: https://github.com/apache/arrow/pull/13615 Follow up to #13577 / ARROW-17045.

[GitHub] [arrow-datafusion] yahoNanJing commented on pull request #2906: Introduce ObjectStoreSelfDetector for detector an object store based on the url

2022-07-14 Thread GitBox
yahoNanJing commented on PR #2906: URL: https://github.com/apache/arrow-datafusion/pull/2906#issuecomment-1185071994 > If I understand correctly the issue is following switching to object_store, the stores have to be registered per-host whereas previously they were registered per-scheme. T

[GitHub] [arrow-datafusion] yahoNanJing commented on a diff in pull request #2906: Introduce ObjectStoreSelfDetector for detector an object store based on the url

2022-07-14 Thread GitBox
yahoNanJing commented on code in PR #2906: URL: https://github.com/apache/arrow-datafusion/pull/2906#discussion_r921724787 ## datafusion/core/src/datasource/object_store.rs: ## @@ -81,10 +81,19 @@ impl std::fmt::Display for ObjectStoreUrl { } } +/// Object store self det

[GitHub] [arrow-datafusion] yahoNanJing commented on a diff in pull request #2906: Introduce ObjectStoreSelfDetector for detector an object store based on the url

2022-07-14 Thread GitBox
yahoNanJing commented on code in PR #2906: URL: https://github.com/apache/arrow-datafusion/pull/2906#discussion_r921724379 ## datafusion/core/src/datasource/object_store.rs: ## @@ -81,10 +81,19 @@ impl std::fmt::Display for ObjectStoreUrl { } } +/// Object store self det

[GitHub] [arrow-datafusion] yahoNanJing commented on a diff in pull request #2906: Introduce ObjectStoreSelfDetector for detector an object store based on the url

2022-07-14 Thread GitBox
yahoNanJing commented on code in PR #2906: URL: https://github.com/apache/arrow-datafusion/pull/2906#discussion_r921724137 ## datafusion/core/src/datasource/listing/helpers.rs: ## @@ -162,7 +162,7 @@ pub fn split_files( pub async fn pruned_partition_list<'a>( store: &'a dy

[GitHub] [arrow] ursabot commented on pull request #13581: ARROW-16734: [C++] Bump vendored version of protobuf

2022-07-14 Thread GitBox
ursabot commented on PR #13581: URL: https://github.com/apache/arrow/pull/13581#issuecomment-1185052207 Benchmark runs are scheduled for baseline = e766828c699c6c74eba3b8c5de99e541017b8b9e and contender = 5d86e9fc4063ab56e6fb4fb0b44667d3c1b74836. 5d86e9fc4063ab56e6fb4fb0b44667d3c1b74836 is

[GitHub] [arrow] marsupialtail commented on pull request #13385: ARROW-16521 [C++][Python] Configure curl timeout policy for S3

2022-07-14 Thread GitBox
marsupialtail commented on PR #13385: URL: https://github.com/apache/arrow/pull/13385#issuecomment-1185045318 change the units to seconds. @pitrou can you please rerun your tests on ubuntu?

[GitHub] [arrow-rs] HaoYang670 commented on pull request #2041: Add support of converting `FixedSizeBinaryArray` to `DecimalArray`

2022-07-14 Thread GitBox
HaoYang670 commented on PR #2041: URL: https://github.com/apache/arrow-rs/pull/2041#issuecomment-1185023572 @alamb could you please help to look at what causes the CI failure?

[GitHub] [arrow] nealrichardson merged pull request #13610: MINOR: [R] Conditionally skip some glimpse-related tests

2022-07-14 Thread GitBox
nealrichardson merged PR #13610: URL: https://github.com/apache/arrow/pull/13610

[GitHub] [arrow] westonpace merged pull request #13143: ARROW-16523: [C++] Part 1 of ExecPlan cleanup: Centralized Task Group

2022-07-14 Thread GitBox
westonpace merged PR #13143: URL: https://github.com/apache/arrow/pull/13143

[GitHub] [arrow] github-actions[bot] commented on pull request #13613: ARROW-15582: [C++] Add support for registering standard Substrait functions

2022-07-14 Thread GitBox
github-actions[bot] commented on PR #13613: URL: https://github.com/apache/arrow/pull/13613#issuecomment-1184972185 https://issues.apache.org/jira/browse/ARROW-15582

[GitHub] [arrow] westonpace opened a new pull request, #13613: ARROW-15582: [C++] Add support for registering standard Substrait functions

2022-07-14 Thread GitBox
westonpace opened a new pull request, #13613: URL: https://github.com/apache/arrow/pull/13613 This picks up where #13285 has left off. It only focuses on the Substrait->Arrow direction at the moment. In addition, basic support is added for named tables. This makes it easy to create unit

[GitHub] [arrow-datafusion] codecov-commenter commented on pull request #2915: Fix invalid projection in `CommonSubexprEliminate`

2022-07-14 Thread GitBox
codecov-commenter commented on PR #2915: URL: https://github.com/apache/arrow-datafusion/pull/2915#issuecomment-1184970985 # [Codecov](https://codecov.io/gh/apache/arrow-datafusion/pull/2915?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_

[GitHub] [arrow] mirkhosro commented on a diff in pull request #13591: ARROW-17046 [Python] improve documentation of pyarrow.parquet.write_to_dataset function

2022-07-14 Thread GitBox
mirkhosro commented on code in PR #13591: URL: https://github.com/apache/arrow/pull/13591#discussion_r921636046 ## python/pyarrow/parquet/__init__.py: ## @@ -3063,16 +3064,19 @@ def write_to_dataset(table, root_path, partition_cols=None, used determined by the number o

[GitHub] [arrow] nealrichardson commented on a diff in pull request #13610: MINOR: [R] Conditionally skip some glimpse-related tests

2022-07-14 Thread GitBox
nealrichardson commented on code in PR #13610: URL: https://github.com/apache/arrow/pull/13610#discussion_r921634346 ## r/tests/testthat/test-dplyr-glimpse.R: ## @@ -15,6 +15,11 @@ # specific language governing permissions and limitations # under the License. +# The glimpse

[GitHub] [arrow] nealrichardson commented on a diff in pull request #13610: MINOR: [R] Conditionally skip some glimpse-related tests

2022-07-14 Thread GitBox
nealrichardson commented on code in PR #13610: URL: https://github.com/apache/arrow/pull/13610#discussion_r921634077 ## r/tests/testthat/test-dplyr-glimpse.R: ## @@ -15,6 +15,9 @@ # specific language governing permissions and limitations # under the License. +# For some reas

[GitHub] [arrow] nealrichardson commented on a diff in pull request #13610: MINOR: [R] Conditionally skip some glimpse-related tests

2022-07-14 Thread GitBox
nealrichardson commented on code in PR #13610: URL: https://github.com/apache/arrow/pull/13610#discussion_r921632771 ## r/tests/testthat/test-dplyr-glimpse.R: ## @@ -15,6 +15,9 @@ # specific language governing permissions and limitations # under the License. +# For some reas

[GitHub] [arrow-datafusion] ursabot commented on pull request #2901: Combine all comparison coercion rules

2022-07-14 Thread GitBox
ursabot commented on PR #2901: URL: https://github.com/apache/arrow-datafusion/pull/2901#issuecomment-1184956652 Benchmark runs are scheduled for baseline = 8ad3df54f54acd801d81485d7bb9678bf4727e7c and contender = fb2221c43d3367e876430e687ad6f1783cb79075. fb2221c43d3367e876430e687ad6f1783

[GitHub] [arrow-rs] jhorstmann opened a new issue, #2071: Several Builder::append methods returning results even though they are infallible

2022-07-14 Thread GitBox
jhorstmann opened a new issue, #2071: URL: https://github.com/apache/arrow-rs/issues/2071 **Which part is this question about** I noticed that a lot of the `append` or `append_null` methods in builders are always returning `Ok(())`. There are only very few builders that can legitimat

[GitHub] [arrow-datafusion] andygrove closed issue #2890: Inconsistent type coercion rules with comparison expressions

2022-07-14 Thread GitBox
andygrove closed issue #2890: Inconsistent type coercion rules with comparison expressions URL: https://github.com/apache/arrow-datafusion/issues/2890

[GitHub] [arrow-datafusion] andygrove merged pull request #2901: Combine all comparison coercion rules

2022-07-14 Thread GitBox
andygrove merged PR #2901: URL: https://github.com/apache/arrow-datafusion/pull/2901

[GitHub] [arrow-datafusion] andygrove commented on a diff in pull request #2901: Combine all comparison coercion rules

2022-07-14 Thread GitBox
andygrove commented on code in PR #2901: URL: https://github.com/apache/arrow-datafusion/pull/2901#discussion_r921624757 ## datafusion/core/src/physical_plan/planner.rs: ## @@ -1746,8 +1746,6 @@ mod tests { async fn errors() -> Result<()> { let bool_expr = col("c1"

[GitHub] [arrow-datafusion] andygrove commented on pull request #2915: Fix invalid projection in `CommonSubexprEliminate`

2022-07-14 Thread GitBox
andygrove commented on PR #2915: URL: https://github.com/apache/arrow-datafusion/pull/2915#issuecomment-1184954115 @waynexia fyi

[GitHub] [arrow-datafusion] andygrove commented on pull request #2915: Fix invalid projection in `CommonSubexprEliminate`

2022-07-14 Thread GitBox
andygrove commented on PR #2915: URL: https://github.com/apache/arrow-datafusion/pull/2915#issuecomment-1184947182 @jdye64 Here is the fix for the root cause of the invalid projections in the optimizer

[GitHub] [arrow-datafusion] thomas-k-cameron commented on issue #2910: index out of range error from datafusion_row::write::write_field

2022-07-14 Thread GitBox
thomas-k-cameron commented on issue #2910: URL: https://github.com/apache/arrow-datafusion/issues/2910#issuecomment-1184947051 @comphead It's my own UDF. It still generates the same error even without it. I was able to reproduce the error without the data set I was talking abo

[GitHub] [arrow-datafusion] andygrove opened a new pull request, #2915: Fix invalid projection in `CommonSubexprEliminate`

2022-07-14 Thread GitBox
andygrove opened a new pull request, #2915: URL: https://github.com/apache/arrow-datafusion/pull/2915 # Which issue does this PR close? Closes https://github.com/apache/arrow-datafusion/issues/2907 # Rationale for this change `CommonSubexprEliminate` created

[GitHub] [arrow] ursabot commented on pull request #13554: ARROW-16002: [Go] fileBlock.NewMessage should use memory.Allocator

2022-07-14 Thread GitBox
ursabot commented on PR #13554: URL: https://github.com/apache/arrow/pull/13554#issuecomment-1184945810 Benchmark runs are scheduled for baseline = 96a3af437bfc498b75b832b161df378ad96cae1c and contender = e766828c699c6c74eba3b8c5de99e541017b8b9e. e766828c699c6c74eba3b8c5de99e541017b8b9e is

[GitHub] [arrow-datafusion] codecov-commenter commented on pull request #2891: Implement `ScalarValue::Dictionary` and preserve type through conversion back/forth to Array

2022-07-14 Thread GitBox
codecov-commenter commented on PR #2891: URL: https://github.com/apache/arrow-datafusion/pull/2891#issuecomment-1184928093 # [Codecov](https://codecov.io/gh/apache/arrow-datafusion/pull/2891?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_

[GitHub] [arrow] jvanstraten commented on pull request #13537: ARROW-16988: [C++] Introduce Substrait ToProto/FromProto conversion options

2022-07-14 Thread GitBox
jvanstraten commented on PR #13537: URL: https://github.com/apache/arrow/pull/13537#issuecomment-1184915536 Rebased, s/PEDANTIC/EXACT_ROUNDTRIP/g, and reworded the comments.

[GitHub] [arrow-datafusion] comphead commented on pull request #2893: Scalar list preserve element name

2022-07-14 Thread GitBox
comphead commented on PR #2893: URL: https://github.com/apache/arrow-datafusion/pull/2893#issuecomment-1184915227 @alamb please check this PR again

[GitHub] [arrow] dragosmg commented on a diff in pull request #13160: ARROW-14575: [R] Allow functions with `pkg::` prefixes

2022-07-14 Thread GitBox
dragosmg commented on code in PR #13160: URL: https://github.com/apache/arrow/pull/13160#discussion_r921529807 ## r/tests/testthat/test-dplyr-funcs-datetime.R: ## @@ -201,6 +212,14 @@ test_that("strftime", { times ) + # with namespacing Review Comment: I think tha

[GitHub] [arrow-rs] viirya commented on pull request #2040: Truncate IPC record batch

2022-07-14 Thread GitBox
viirya commented on PR #2040: URL: https://github.com/apache/arrow-rs/pull/2040#issuecomment-1184906589 I couldn't find where the C++ implementation truncates ListArray or StructArray, but I may have missed it when reading the code. I will re-check this.

[GitHub] [arrow-rs] viirya commented on pull request #2040: Truncate IPC record batch

2022-07-14 Thread GitBox
viirya commented on PR #2040: URL: https://github.com/apache/arrow-rs/pull/2040#issuecomment-1184905834 Thanks @alamb @tustvold @JasonLi-cn @HaoYang670

[GitHub] [arrow-rs] tustvold commented on pull request #2040: Truncate IPC record batch

2022-07-14 Thread GitBox
tustvold commented on PR #2040: URL: https://github.com/apache/arrow-rs/pull/2040#issuecomment-1184904344 I think there is more to be done in this vein, e.g. ListArray, StructArray, etc... but this is a good step forward. Thank you 🙂

[GitHub] [arrow-rs] tustvold merged pull request #2040: Truncate IPC record batch

2022-07-14 Thread GitBox
tustvold merged PR #2040: URL: https://github.com/apache/arrow-rs/pull/2040

[GitHub] [arrow-rs] tustvold closed issue #208: flight_data_from_arrow_batch sends too much data

2022-07-14 Thread GitBox
tustvold closed issue #208: flight_data_from_arrow_batch sends too much data URL: https://github.com/apache/arrow-rs/issues/208

[GitHub] [arrow-rs] tustvold closed issue #1528: RecordBatch: Serialization of sliced record using StreamWriter produces incorrect resut

2022-07-14 Thread GitBox
tustvold closed issue #1528: RecordBatch: Serialization of sliced record using StreamWriter produces incorrect resut URL: https://github.com/apache/arrow-rs/issues/1528

[GitHub] [arrow-datafusion] alamb commented on a diff in pull request #2891: Implement `ScalarValue::Dictionary` and preserve type through conversion back/forth to Array

2022-07-14 Thread GitBox
alamb commented on code in PR #2891: URL: https://github.com/apache/arrow-datafusion/pull/2891#discussion_r921571128 ## datafusion/core/tests/path_partition.rs: ## @@ -149,9 +149,10 @@ async fn parquet_distinct_partition_col() -> Result<()> { assert_eq!(min_limit, resulti

[GitHub] [arrow] boshek commented on a diff in pull request #13610: MINOR: [R] Conditionally skip some glimpse-related tests

2022-07-14 Thread GitBox
boshek commented on code in PR #13610: URL: https://github.com/apache/arrow/pull/13610#discussion_r921565813 ## r/tests/testthat/test-dplyr-glimpse.R: ## @@ -15,6 +15,9 @@ # specific language governing permissions and limitations # under the License. +# For some reason, the

[GitHub] [arrow] nealrichardson merged pull request #13514: ARROW-16977: [R] Update dataset row counting so no integer overflow on large datasets

2022-07-14 Thread GitBox
nealrichardson merged PR #13514: URL: https://github.com/apache/arrow/pull/13514

[GitHub] [arrow-rs] viirya commented on pull request #2040: Truncate IPC record batch

2022-07-14 Thread GitBox
viirya commented on PR #2040: URL: https://github.com/apache/arrow-rs/pull/2040#issuecomment-1184889814 Unrelated error on Windows CI: ``` error: error: Invalid value for '<+toolchain>': Toolchain overrides must begin with '+' ```

[GitHub] [arrow] dragosmg commented on a diff in pull request #13160: ARROW-14575: [R] Allow functions with `pkg::` prefixes

2022-07-14 Thread GitBox
dragosmg commented on code in PR #13160: URL: https://github.com/apache/arrow/pull/13160#discussion_r921554773 ## r/tests/testthat/test-dplyr-filter.R: ## @@ -239,6 +239,14 @@ test_that("filter() with between()", { filter(between(chr, 1, 2)) %>% collect() ) + +

[GitHub] [arrow-datafusion] alamb closed pull request #2875: Fix casts of `ScalarValue::Utf8` to `DictionaryArray`

2022-07-14 Thread GitBox
alamb closed pull request #2875: Fix casts of `ScalarValue::Utf8` to `DictionaryArray` URL: https://github.com/apache/arrow-datafusion/pull/2875

[GitHub] [arrow-datafusion] alamb commented on pull request #2875: Fix casts of `ScalarValue::Utf8` to `DictionaryArray`

2022-07-14 Thread GitBox
alamb commented on PR #2875: URL: https://github.com/apache/arrow-datafusion/pull/2875#issuecomment-1184876683 Closing in favor of https://github.com/apache/arrow-datafusion/pull/2891

[GitHub] [arrow] ursabot commented on pull request #13493: ARROW-14182: [C++][Compute] Hash Join performance improvement v2

2022-07-14 Thread GitBox
ursabot commented on PR #13493: URL: https://github.com/apache/arrow/pull/13493#issuecomment-1184871691 Benchmark runs are scheduled for baseline = 0024962ff761d1d5f3a63013e67886334f1e57ca and contender = 96a3af437bfc498b75b832b161df378ad96cae1c. 96a3af437bfc498b75b832b161df378ad96cae1c is

[GitHub] [arrow] dragosmg commented on a diff in pull request #13160: ARROW-14575: [R] Allow functions with `pkg::` prefixes

2022-07-14 Thread GitBox
dragosmg commented on code in PR #13160: URL: https://github.com/apache/arrow/pull/13160#discussion_r921547337 ## r/tests/testthat/test-dplyr-funcs-conditional.R: ## @@ -192,6 +212,25 @@ test_that("case_when()", { tbl ) + # with namespacing + compare_dplyr_binding( +

[GitHub] [arrow] dragosmg commented on a diff in pull request #13160: ARROW-14575: [R] Allow functions with `pkg::` prefixes

2022-07-14 Thread GitBox
dragosmg commented on code in PR #13160: URL: https://github.com/apache/arrow/pull/13160#discussion_r921545783 ## r/R/dplyr-funcs.R: ## @@ -116,3 +136,18 @@ create_binding_cache <- function() { nse_funcs <- new.env(parent = emptyenv()) agg_funcs <- new.env(parent = emptyenv())

[GitHub] [arrow] dragosmg commented on a diff in pull request #13160: ARROW-14575: [R] Allow functions with `pkg::` prefixes

2022-07-14 Thread GitBox
dragosmg commented on code in PR #13160: URL: https://github.com/apache/arrow/pull/13160#discussion_r921545526 ## r/R/dplyr-funcs.R: ## @@ -58,15 +58,34 @@ NULL #' @keywords internal #' register_binding <- function(fun_name, fun, registry = nse_funcs) { - name <- gsub("^.*?:

[GitHub] [arrow] dragosmg commented on a diff in pull request #13160: ARROW-14575: [R] Allow functions with `pkg::` prefixes

2022-07-14 Thread GitBox
dragosmg commented on code in PR #13160: URL: https://github.com/apache/arrow/pull/13160#discussion_r921545398 ## r/R/dplyr-funcs.R: ## @@ -58,15 +58,34 @@ NULL #' @keywords internal #' register_binding <- function(fun_name, fun, registry = nse_funcs) { - name <- gsub("^.*?:

[GitHub] [arrow] dragosmg commented on a diff in pull request #13160: ARROW-14575: [R] Allow functions with `pkg::` prefixes

2022-07-14 Thread GitBox
dragosmg commented on code in PR #13160: URL: https://github.com/apache/arrow/pull/13160#discussion_r921544627 ## r/R/dplyr-funcs.R: ## @@ -58,15 +58,34 @@ NULL #' @keywords internal #' register_binding <- function(fun_name, fun, registry = nse_funcs) { - name <- gsub("^.*?:

[GitHub] [arrow] dragosmg commented on a diff in pull request #13160: ARROW-14575: [R] Allow functions with `pkg::` prefixes

2022-07-14 Thread GitBox
dragosmg commented on code in PR #13160: URL: https://github.com/apache/arrow/pull/13160#discussion_r921542527 ## r/R/dplyr-funcs.R: ## @@ -116,3 +136,18 @@ create_binding_cache <- function() { nse_funcs <- new.env(parent = emptyenv()) agg_funcs <- new.env(parent = emptyenv())

[GitHub] [arrow] dragosmg commented on a diff in pull request #13160: ARROW-14575: [R] Allow functions with `pkg::` prefixes

2022-07-14 Thread GitBox
dragosmg commented on code in PR #13160: URL: https://github.com/apache/arrow/pull/13160#discussion_r921537478 ## r/R/dplyr-summarize.R: ## @@ -348,7 +362,7 @@ summarize_eval <- function(name, quosure, ctx, hash) { # the list output from the Arrow hash_tdigest kernel to flatt

[GitHub] [arrow] dragosmg commented on a diff in pull request #13160: ARROW-14575: [R] Allow functions with `pkg::` prefixes

2022-07-14 Thread GitBox
dragosmg commented on code in PR #13160: URL: https://github.com/apache/arrow/pull/13160#discussion_r921536397 ## r/R/dplyr-funcs.R: ## @@ -58,15 +58,34 @@ NULL #' @keywords internal #' register_binding <- function(fun_name, fun, registry = nse_funcs) { - name <- gsub("^.*?:
