[GitHub] [arrow] github-actions[bot] commented on pull request #11931: ARROW-15071: [C#] Fixed a bug in Column.cs ValidateArrayDataTypes method

2021-12-10 Thread GitBox
github-actions[bot] commented on pull request #11931: URL: https://github.com/apache/arrow/pull/11931#issuecomment-991506112 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

[GitHub] [arrow] github-actions[bot] commented on pull request #11931: Fixed a bug in Column.cs ValidateArrayDataTypes method, added unit testing

2021-12-10 Thread GitBox
github-actions[bot] commented on pull request #11931: URL: https://github.com/apache/arrow/pull/11931#issuecomment-991503918 Thanks for opening a pull request! If this is not a [minor PR](https://github.com/apache/arrow/blob/master/CONTRIBUTING.md#Minor-Fixes). Could you ope

[GitHub] [arrow] zixi-bwang opened a new pull request #11931: Fixed a bug in Column.cs ValidateArrayDataTypes method, added unit testing

2021-12-10 Thread GitBox
zixi-bwang opened a new pull request #11931: URL: https://github.com/apache/arrow/pull/11931 Fixed a bug in Column.cs ValidateArrayDataTypes method: From: if (Data.Array(i).Data.DataType != Field.DataType) To: if (Data.Array(i).Data.DataType.TypeId != Field.DataType.TypeId)
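
A minimal sketch of the kind of bug this change appears to address, written in Python rather than C# and not taken from the Arrow code base. It assumes the original comparison failed because two separately constructed type objects describing the same type are not equal when compared as objects. `DataType`, `TypeId`, and both helper functions are hypothetical names for illustration only.

```python
from enum import Enum, auto


class TypeId(Enum):
    """Hypothetical type tags, standing in for a type-id enumeration."""
    INT32 = auto()
    STRING = auto()


class DataType:
    """Hypothetical type descriptor with no custom equality defined,
    so `!=` falls back to object identity."""

    def __init__(self, type_id):
        self.type_id = type_id


def mismatches_by_object(array_types, field_type):
    # Compares the descriptor objects themselves: distinct instances
    # always differ, even when they describe the same type.
    return [i for i, t in enumerate(array_types) if t != field_type]


def mismatches_by_type_id(array_types, field_type):
    # Compares only the type tags, which is the shape of the fix above.
    return [i for i, t in enumerate(array_types) if t.type_id != field_type.type_id]


field_type = DataType(TypeId.INT32)
chunk_types = [DataType(TypeId.INT32), DataType(TypeId.INT32)]

print(mismatches_by_object(chunk_types, field_type))
print(mismatches_by_type_id(chunk_types, field_type))
```

One trade-off of comparing tags only: two parameterized types that share a tag but differ in their parameters would be treated as matching, so whether that is acceptable depends on what the validation is meant to guarantee.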

[GitHub] [arrow-datafusion] xudong963 commented on issue #1434: Query with 100 OR conditions overflows stack

2021-12-10 Thread GitBox
xudong963 commented on issue #1434: URL: https://github.com/apache/arrow-datafusion/issues/1434#issuecomment-991468801 Please assign me @houqp

[GitHub] [arrow-datafusion] xudong963 commented on issue #1434: Query with 100 OR conditions overflows stack

2021-12-10 Thread GitBox
xudong963 commented on issue #1434: URL: https://github.com/apache/arrow-datafusion/issues/1434#issuecomment-991468780 Thanks for your report. I think this is a common question not only in your case. such as ```sql ❯ SELECT 1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 + 9 + 10 + 11 + 12 + 13

[GitHub] [arrow] westonpace commented on a change in pull request #11923: ARROW-14970: [C++] Make ExecNodes can generate/consume tasks

2021-12-10 Thread GitBox
westonpace commented on a change in pull request #11923: URL: https://github.com/apache/arrow/pull/11923#discussion_r767079966 ## File path: cpp/src/arrow/compute/exec/exec_plan.h ## @@ -226,6 +227,8 @@ class ARROW_EXPORT ExecNode { std::string ToString() const; protecte

[GitHub] [arrow-rs] matthewmturner edited a comment on pull request #984: Add comparison kernels for DictionaryArray

2021-12-10 Thread GitBox
matthewmturner edited a comment on pull request #984: URL: https://github.com/apache/arrow-rs/pull/984#issuecomment-991260739 @alamb ive started updating to your proposed approach but now when testing i get an error that a type annotation is needed for type parameter T. Im still playing a

[GitHub] [arrow] kou commented on a change in pull request #11876: ARROW-14479: [C++] Hash Join Microbenchmarks

2021-12-10 Thread GitBox
kou commented on a change in pull request #11876: URL: https://github.com/apache/arrow/pull/11876#discussion_r767079743 ## File path: cpp/src/arrow/compute/exec/CMakeLists.txt ## @@ -32,6 +32,19 @@ add_arrow_compute_test(util_test PREFIX "arrow-compute") add_arrow_benchmark(

[GitHub] [arrow] westonpace commented on a change in pull request #11876: ARROW-14479: [C++] Hash Join Microbenchmarks

2021-12-10 Thread GitBox
westonpace commented on a change in pull request #11876: URL: https://github.com/apache/arrow/pull/11876#discussion_r767078622 ## File path: cpp/src/arrow/compute/exec/hash_join_benchmark.cc ## @@ -0,0 +1,425 @@ +// Licensed to the Apache Software Foundation (ASF) under one +//

[GitHub] [arrow] westonpace commented on a change in pull request #11876: ARROW-14479: [C++] Hash Join Microbenchmarks

2021-12-10 Thread GitBox
westonpace commented on a change in pull request #11876: URL: https://github.com/apache/arrow/pull/11876#discussion_r767078640 ## File path: cpp/src/arrow/compute/exec/hash_join_benchmark.cc ## @@ -0,0 +1,425 @@ +// Licensed to the Apache Software Foundation (ASF) under one +//

[GitHub] [arrow] westonpace commented on a change in pull request #11876: ARROW-14479: [C++] Hash Join Microbenchmarks

2021-12-10 Thread GitBox
westonpace commented on a change in pull request #11876: URL: https://github.com/apache/arrow/pull/11876#discussion_r766078283 ## File path: cpp/src/arrow/compute/exec/CMakeLists.txt ## @@ -32,6 +32,19 @@ add_arrow_compute_test(util_test PREFIX "arrow-compute") add_arrow_ben

[GitHub] [arrow-datafusion] mcassels opened a new issue #1434: Query with 100 OR conditions overflows stack

2021-12-10 Thread GitBox
mcassels opened a new issue #1434: URL: https://github.com/apache/arrow-datafusion/issues/1434 **Describe the bug** `SELECT * FROM table WHERE OR OR ...` succeeds with 50 conditions but overflows stack with 100 conditions. **To Reproduce** Adding this test to arrow-datafusion
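
The snippet above does not say where the overflow occurs. One common way long `OR` chains exhaust the stack is that they parse into a left-deep binary tree whose depth grows with the number of conditions, and a recursive walk then uses one stack frame per level. The sketch below is a Python illustration under that assumption, not DataFusion code; `Leaf`, `Or`, and the helper functions are hypothetical names.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Leaf:
    """A single condition, e.g. `c1 = 1`."""
    name: str


@dataclass
class Or:
    """Binary OR node."""
    left: "Expr"
    right: "Expr"


Expr = Union[Leaf, Or]


def build_or_chain(n: int) -> Expr:
    """Build `c0 OR c1 OR ... OR c(n-1)` as a left-deep tree of depth n - 1."""
    expr: Expr = Leaf("c0")
    for i in range(1, n):
        expr = Or(expr, Leaf(f"c{i}"))
    return expr


def count_leaves_recursive(expr: Expr) -> int:
    """Recursive walk: call depth grows with the depth of the tree."""
    if isinstance(expr, Leaf):
        return 1
    return count_leaves_recursive(expr.left) + count_leaves_recursive(expr.right)


def count_leaves_iterative(expr: Expr) -> int:
    """Same walk with an explicit stack, so call depth stays constant."""
    stack = [expr]
    count = 0
    while stack:
        node = stack.pop()
        if isinstance(node, Leaf):
            count += 1
        else:
            stack.append(node.left)
            stack.append(node.right)
    return count
```

With these definitions, `count_leaves_iterative(build_or_chain(5000))` completes, while the recursive version on the same input would be expected to exceed Python's default recursion limit. Rewriting the walk iteratively, or building a balanced tree instead of a left-deep one, are two general ways to bound stack use.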

[GitHub] [arrow-datafusion] liukun4515 commented on pull request #1431: support decimal data type in create table

2021-12-10 Thread GitBox
liukun4515 commented on pull request #1431: URL: https://github.com/apache/arrow-datafusion/pull/1431#issuecomment-991391020 PTAL @Dandandan again.

[GitHub] [arrow-datafusion] liukun4515 commented on a change in pull request #1431: support decimal data type in create table

2021-12-10 Thread GitBox
liukun4515 commented on a change in pull request #1431: URL: https://github.com/apache/arrow-datafusion/pull/1431#discussion_r767059652 ## File path: datafusion/src/sql/planner.rs ## @@ -372,7 +372,27 @@ impl<'a, S: ContextProvider> SqlToRel<'a, S> { SQLDataType::C

[GitHub] [arrow] westonpace commented on pull request #11930: ARROW-15070: [Python][C++][R][Doc] Add a general statement to dataset docs around the lack of ACID guarantees

2021-12-10 Thread GitBox
westonpace commented on pull request #11930: URL: https://github.com/apache/arrow/pull/11930#issuecomment-991370064 Found it. Jon pointed me to the datasets vignette.

[GitHub] [arrow] westonpace commented on pull request #11930: ARROW-15070: [Python][C++][R][Doc] Add a general statement to dataset docs around the lack of ACID guarantees

2021-12-10 Thread GitBox
westonpace commented on pull request #11930: URL: https://github.com/apache/arrow/pull/11930#issuecomment-991363676 @nealrichardson @jonkeane is there any spot in the R docs we want to mention this? Or will this suffice.

[GitHub] [arrow] github-actions[bot] commented on pull request #11930: ARROW-15070: [Python][C++][R][Doc] Add a general statement to dataset docs around the lack of ACID guarantees

2021-12-10 Thread GitBox
github-actions[bot] commented on pull request #11930: URL: https://github.com/apache/arrow/pull/11930#issuecomment-991363572

[GitHub] [arrow] rip-nsk commented on a change in pull request #11887: ARROW-15012: [C++] fixes for msvc environment

2021-12-10 Thread GitBox
rip-nsk commented on a change in pull request #11887: URL: https://github.com/apache/arrow/pull/11887#discussion_r767026133 ## File path: cpp/src/arrow/type.h ## @@ -714,9 +714,11 @@ class ARROW_EXPORT FixedSizeBinaryType : public FixedWidthType, public Parametri /// \brief B

[GitHub] [arrow] westonpace commented on issue #11781: Is adding Parquet partitions/part files using R's arrow::write_dataset() transactional?

2021-12-10 Thread GitBox
westonpace commented on issue #11781: URL: https://github.com/apache/arrow/issues/11781#issuecomment-991344006 ARROW-15070

[GitHub] [arrow] westonpace closed issue #11781: Is adding Parquet partitions/part files using R's arrow::write_dataset() transactional?

2021-12-10 Thread GitBox
westonpace closed issue #11781: URL: https://github.com/apache/arrow/issues/11781

[GitHub] [arrow] jonkeane closed pull request #11927: ARROW-15057: [R] [CI] Move where we install DuckDB from in CI

2021-12-10 Thread GitBox
jonkeane closed pull request #11927: URL: https://github.com/apache/arrow/pull/11927

[GitHub] [arrow] lidavidm closed pull request #11928: ARROW-14701: [Python][MINOR] document parquet.write_table row_group_size

2021-12-10 Thread GitBox
lidavidm closed pull request #11928: URL: https://github.com/apache/arrow/pull/11928

[GitHub] [arrow] wjones127 commented on pull request #11928: ARROW-14701: [Python][MINOR] document parquet.write_table row_group_size

2021-12-10 Thread GitBox
wjones127 commented on pull request #11928: URL: https://github.com/apache/arrow/pull/11928#issuecomment-991321789 > Just to make sure @wjones127, when I go to merge this, the commit is attributed to `wjones...@users.noreply.github.com` - is this intentional or would you prefer it be attri

[GitHub] [arrow-rs] codecov-commenter edited a comment on pull request #1021: Simplify parquet arrow `RecordReader`

2021-12-10 Thread GitBox
codecov-commenter edited a comment on pull request #1021: URL: https://github.com/apache/arrow-rs/pull/1021#issuecomment-991010923 # [Codecov](https://codecov.io/gh/apache/arrow-rs/pull/1021?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_

[GitHub] [arrow-datafusion] Dandandan commented on issue #1433: Query failing to return any results when filter is an equality check on strings

2021-12-10 Thread GitBox
Dandandan commented on issue #1433: URL: https://github.com/apache/arrow-datafusion/issues/1433#issuecomment-991287408 This seems to be something in the pruning logic: ``` ❯ explain analyze SELECT "adt" FROM t WHERE "direction" = 'Two Way'; +---+--

[GitHub] [arrow-rs] matthewmturner commented on pull request #984: Add comparison kernels for DictionaryArray

2021-12-10 Thread GitBox
matthewmturner commented on pull request #984: URL: https://github.com/apache/arrow-rs/pull/984#issuecomment-991260739 @alamb ive started updating to your proposed approach but now when testing i get an error that a type annotation is needed for type parameter T. Im still playing around w

[GitHub] [arrow] westonpace closed pull request #11857: ARROW-14907: [C++] Enable CSV Writer to control end-of-line character

2021-12-10 Thread GitBox
westonpace closed pull request #11857: URL: https://github.com/apache/arrow/pull/11857

[GitHub] [arrow] westonpace commented on pull request #11857: ARROW-14907: [C++] Enable CSV Writer to control end-of-line character

2021-12-10 Thread GitBox
westonpace commented on pull request #11857: URL: https://github.com/apache/arrow/pull/11857#issuecomment-991250151 The failed test seems unrelated. I will merge this.

[GitHub] [arrow] westonpace commented on a change in pull request #11616: ARROW-14577: [C++] Enable fine grained IO for async IPC reader

2021-12-10 Thread GitBox
westonpace commented on a change in pull request #11616: URL: https://github.com/apache/arrow/pull/11616#discussion_r766936170 ## File path: cpp/src/arrow/ipc/read_write_benchmark.cc ## @@ -49,9 +51,29 @@ std::shared_ptr MakeRecordBatch(int64_t total_size, int64_t num_fie r

[GitHub] [arrow-datafusion] alamb commented on a change in pull request #1427: Clarify communication on bi-weekly sync

2021-12-10 Thread GitBox
alamb commented on a change in pull request #1427: URL: https://github.com/apache/arrow-datafusion/pull/1427#discussion_r766935967 ## File path: docs/source/community/communication.md ## @@ -54,11 +54,11 @@ can also ask for one in our Discord server. ### Sync up Zoom calls

[GitHub] [arrow] westonpace commented on a change in pull request #11616: ARROW-14577: [C++] Enable fine grained IO for async IPC reader

2021-12-10 Thread GitBox
westonpace commented on a change in pull request #11616: URL: https://github.com/apache/arrow/pull/11616#discussion_r766935359 ## File path: cpp/src/arrow/ipc/read_write_test.cc ## @@ -2715,6 +2727,133 @@ TEST(TestRecordBatchFileReaderIo, ReadTwoContinousFieldsWithIoMerged) {

[GitHub] [arrow] westonpace commented on a change in pull request #11616: ARROW-14577: [C++] Enable fine grained IO for async IPC reader

2021-12-10 Thread GitBox
westonpace commented on a change in pull request #11616: URL: https://github.com/apache/arrow/pull/11616#discussion_r766935036 ## File path: cpp/src/arrow/ipc/read_write_test.cc ## @@ -2715,6 +2727,133 @@ TEST(TestRecordBatchFileReaderIo, ReadTwoContinousFieldsWithIoMerged) {

[GitHub] [arrow-datafusion] alamb merged pull request #1407: support decimal for min/max agg

2021-12-10 Thread GitBox
alamb merged pull request #1407: URL: https://github.com/apache/arrow-datafusion/pull/1407

[GitHub] [arrow-datafusion] alamb commented on pull request #1407: support decimal for min/max agg

2021-12-10 Thread GitBox
alamb commented on pull request #1407: URL: https://github.com/apache/arrow-datafusion/pull/1407#issuecomment-991243539 Thanks again @liukun4515 ❤️

[GitHub] [arrow-datafusion] alamb commented on pull request #1419: Ordering by index in select expression

2021-12-10 Thread GitBox
alamb commented on pull request #1419: URL: https://github.com/apache/arrow-datafusion/pull/1419#issuecomment-991243159 Thank you @hntd187 -- this feature (lack thereof) has annoyed me for a long time! > You couldn't do this before, but now you can. 🤣

[GitHub] [arrow-datafusion] alamb merged pull request #1419: Ordering by index in select expression

2021-12-10 Thread GitBox
alamb merged pull request #1419: URL: https://github.com/apache/arrow-datafusion/pull/1419

[GitHub] [arrow-datafusion] alamb closed issue #1414: `order by` supports index in select list

2021-12-10 Thread GitBox
alamb closed issue #1414: URL: https://github.com/apache/arrow-datafusion/issues/1414

[GitHub] [arrow] lidavidm commented on pull request #11928: ARROW-14701: [Python][MINOR] document parquet.write_table row_group_size

2021-12-10 Thread GitBox
lidavidm commented on pull request #11928: URL: https://github.com/apache/arrow/pull/11928#issuecomment-991230720 Just to make sure @wjones127, when I go to merge this, the commit is attributed to `wjones...@users.noreply.github.com` - is this intentional or would you prefer it be attribut

[GitHub] [arrow-rs] alamb commented on a change in pull request #1001: Address benchmarks that aren't compiling

2021-12-10 Thread GitBox
alamb commented on a change in pull request #1001: URL: https://github.com/apache/arrow-rs/pull/1001#discussion_r766921325 ## File path: .github/workflows/rust.yml ## @@ -241,6 +241,46 @@ jobs: export CARGO_TARGET_DIR="/github/home/target" cargo clippy --f

[GitHub] [arrow-rs] alamb commented on a change in pull request #1001: Address benchmarks that aren't compiling

2021-12-10 Thread GitBox
alamb commented on a change in pull request #1001: URL: https://github.com/apache/arrow-rs/pull/1001#discussion_r766921216 ## File path: .github/workflows/rust.yml ## @@ -241,6 +241,46 @@ jobs: export CARGO_TARGET_DIR="/github/home/target" cargo clippy --f

[GitHub] [arrow-rs] alamb commented on pull request #1001: Address benchmarks that aren't compiling

2021-12-10 Thread GitBox
alamb commented on pull request #1001: URL: https://github.com/apache/arrow-rs/pull/1001#issuecomment-991229753 @carols10cents I have worked around the issue with the internal compiler error with rust in https://github.com/apache/arrow-rs/pull/1023 so if you merge from master this PR shou

[GitHub] [arrow-rs] alamb closed issue #1026: Parquet benchmarks are broken - no `DataPageBuilderImpl` in `util`

2021-12-10 Thread GitBox
alamb closed issue #1026: URL: https://github.com/apache/arrow-rs/issues/1026

[GitHub] [arrow-rs] alamb commented on issue #1026: Parquet benchmarks are broken - no `DataPageBuilderImpl` in `util`

2021-12-10 Thread GitBox
alamb commented on issue #1026: URL: https://github.com/apache/arrow-rs/issues/1026#issuecomment-991229004 Dupe of https://github.com/apache/arrow-rs/issues/770

[GitHub] [arrow] lidavidm commented on a change in pull request #11616: ARROW-14577: [C++] Enable fine grained IO for async IPC reader

2021-12-10 Thread GitBox
lidavidm commented on a change in pull request #11616: URL: https://github.com/apache/arrow/pull/11616#discussion_r766761232 ## File path: cpp/src/arrow/ipc/read_write_test.cc ## @@ -2715,6 +2727,133 @@ TEST(TestRecordBatchFileReaderIo, ReadTwoContinousFieldsWithIoMerged) {

[GitHub] [arrow-rs] alamb commented on pull request #1021: Simplify record reader

2021-12-10 Thread GitBox
alamb commented on pull request #1021: URL: https://github.com/apache/arrow-rs/pull/1021#issuecomment-991226859 My performance tests showed no significant performance difference 1. `tustvold/simplify-record-reader` @ 290b24f7d1845b4cfe58eefeb6562e4ec2d54829 2. `apache/mast

[GitHub] [arrow-rs] alamb commented on pull request #1021: Simplify record reader

2021-12-10 Thread GitBox
alamb commented on pull request #1021: URL: https://github.com/apache/arrow-rs/pull/1021#issuecomment-991227188 It looks like this PR needs some clippy appeasement: https://github.com/apache/arrow-rs/runs/4485244206?check_suite_focus=true But otherwise looks good from my perspective

[GitHub] [arrow-datafusion] maxburke opened a new issue #1433: Query failing to return any results when filter is an equality check on strings

2021-12-10 Thread GitBox
maxburke opened a new issue #1433: URL: https://github.com/apache/arrow-datafusion/issues/1433 With the attached file, running the query: ``` CREATE EXTERNAL TABLE t STORED AS PARQUET LOCATION 'test.parquet'; SELECT "adt" FROM t WHERE "direction" = 'Two Way'; ``` retu

[GitHub] [arrow] westonpace commented on pull request #11929: ARROW-15036: [C++] Automatically configure S3 SDK configuration parameter "maxConnections"

2021-12-10 Thread GitBox
westonpace commented on pull request #11929: URL: https://github.com/apache/arrow/pull/11929#issuecomment-991199577 CC @pitrou I'm not sure about testing this but if you want me to add something I'd be happy to.

[GitHub] [arrow] github-actions[bot] commented on pull request #11929: ARROW-15036: [C++] Automatically configure S3 SDK configuration parameter "maxConnections"

2021-12-10 Thread GitBox
github-actions[bot] commented on pull request #11929: URL: https://github.com/apache/arrow/pull/11929#issuecomment-991199264

[GitHub] [arrow] westonpace opened a new pull request #11929: ARROW-15036: [C++] Automatically configure S3 SDK configuration parameter "maxConnections"

2021-12-10 Thread GitBox
westonpace opened a new pull request #11929: URL: https://github.com/apache/arrow/pull/11929 This should allow use cases where the user wants more than 25 threads in the I/O thread pool because they are doing highly concurrent S3 work. See ARROW-14965

[GitHub] [arrow] ursabot edited a comment on pull request #11865: ARROW-14976: [Dev][Archery] Fail early if no benchmark found

2021-12-10 Thread GitBox
ursabot edited a comment on pull request #11865: URL: https://github.com/apache/arrow/pull/11865#issuecomment-989433563 Benchmark runs are scheduled for baseline = 9d200f5a4466e0ae2731abbf2949fa5e118bb056 and contender = f0110cf26af3cc8f4bcb94da7fafd01974bbbfd2. f0110cf26af3cc8f4bcb94da7f

[GitHub] [arrow] lidavidm commented on a change in pull request #11849: ARROW-14905: [C++] Enable CSV Writer to handle quoting

2021-12-10 Thread GitBox
lidavidm commented on a change in pull request #11849: URL: https://github.com/apache/arrow/pull/11849#discussion_r766875020 ## File path: cpp/src/arrow/csv/writer.cc ## @@ -155,27 +161,64 @@ class UnquotedColumnPopulator : public ColumnPopulator { return Status::OK();

[GitHub] [arrow] github-actions[bot] commented on pull request #11927: ARROW-15057: [R] [CI] Move where we install DuckDB from in CI

2021-12-10 Thread GitBox
github-actions[bot] commented on pull request #11927: URL: https://github.com/apache/arrow/pull/11927#issuecomment-991176194 Revision: 1b6359a9fb4f39cfbc6fce9b349bc1fc66edcffa Submitted crossbow builds: [ursacomputing/crossbow @ actions-1275](https://github.com/ursacomputing/crossbow

[GitHub] [arrow] jonkeane commented on pull request #11927: ARROW-15057: [R] [CI] Move where we install DuckDB from in CI

2021-12-10 Thread GitBox
jonkeane commented on pull request #11927: URL: https://github.com/apache/arrow/pull/11927#issuecomment-991175437 @github-actions crossbow submit -g r

[GitHub] [arrow] sunchao commented on a change in pull request #11709: ARROW-14718: [Java] loadValidityBuffer should avoid allocating memory when input is not null and there are only null or non-null

2021-12-10 Thread GitBox
sunchao commented on a change in pull request #11709: URL: https://github.com/apache/arrow/pull/11709#discussion_r766867771 ## File path: java/vector/src/test/java/org/apache/arrow/vector/TestBitVectorHelper.java ## @@ -222,6 +223,59 @@ public void testConcatBitsInPlace() {

[GitHub] [arrow] sunchao commented on a change in pull request #11709: ARROW-14718: [Java] loadValidityBuffer should avoid allocating memory when input is not null and there are only null or non-null

2021-12-10 Thread GitBox
sunchao commented on a change in pull request #11709: URL: https://github.com/apache/arrow/pull/11709#discussion_r766867162 ## File path: java/vector/src/test/java/org/apache/arrow/vector/TestBitVectorHelper.java ## @@ -222,6 +223,59 @@ public void testConcatBitsInPlace() {

[GitHub] [arrow] ursabot edited a comment on pull request #11793: ARROW-13950: [C++] min_element_wise/max_element_wise missing support for some types

2021-12-10 Thread GitBox
ursabot edited a comment on pull request #11793: URL: https://github.com/apache/arrow/pull/11793#issuecomment-989205442 Benchmark runs are scheduled for baseline = 62db4b6a2545da29279ee5c138b5f531067d802a and contender = 7fb1a7203121d26c5f0e163ea007dbbe50fd6d3b. 7fb1a7203121d26c5f0e163ea0

[GitHub] [arrow] rok commented on pull request #8510: ARROW-1614: [C++] Add a Tensor logical value type with constant dimensions, implemented using ExtensionType

2021-12-10 Thread GitBox
rok commented on pull request #8510: URL: https://github.com/apache/arrow/pull/8510#issuecomment-991150389 @jorisvandenbossche @sjperkins @pitrou is there interest to get this in? If yes is `cpp/src/arrow/extension_type_test.cc` a good place to put it?

[GitHub] [arrow] github-actions[bot] commented on pull request #11928: ARROW-14701: [Python][MINOR] document parquet.write_table row_group_size

2021-12-10 Thread GitBox
github-actions[bot] commented on pull request #11928: URL: https://github.com/apache/arrow/pull/11928#issuecomment-991138215 https://issues.apache.org/jira/browse/ARROW-14701

[GitHub] [arrow] wjones127 opened a new pull request #11928: ARROW-14701: [Python][MINOR] document parquet.write_table row_group_size

2021-12-10 Thread GitBox
wjones127 opened a new pull request #11928: URL: https://github.com/apache/arrow/pull/11928 Nicer docstrings for `row_group_size` were implemented in ARROW-13668 (#11455), but missed this convenience function. Add this here.

[GitHub] [arrow] lidavidm commented on a change in pull request #11863: ARROW-14906: [C++] Enable CSV Writer to control the type of escape used for quoting

2021-12-10 Thread GitBox
lidavidm commented on a change in pull request #11863: URL: https://github.com/apache/arrow/pull/11863#discussion_r766826456 ## File path: cpp/src/arrow/csv/writer.cc ## @@ -462,25 +483,32 @@ class CSVWriterImpl : public ipc::RecordBatchWriter { return Status::OK(); }

[GitHub] [arrow] bkmgit commented on pull request #11562: ARROW-14446: [Docs][Release] Update documentation on verification of release candidates

2021-12-10 Thread GitBox
bkmgit commented on pull request #11562: URL: https://github.com/apache/arrow/pull/11562#issuecomment-991109536 @kou please review.

[GitHub] [arrow] ursabot edited a comment on pull request #11893: MINOR: [R] Small addition to disable extraneous duckdb building

2021-12-10 Thread GitBox
ursabot edited a comment on pull request #11893: URL: https://github.com/apache/arrow/pull/11893#issuecomment-989285751 Benchmark runs are scheduled for baseline = 7fb1a7203121d26c5f0e163ea007dbbe50fd6d3b and contender = 9d200f5a4466e0ae2731abbf2949fa5e118bb056. 9d200f5a4466e0ae2731abbf29

[GitHub] [arrow-rs] alamb commented on issue #1026: Parquet benchmarks are broken - no `DataPageBuilderImpl` in `util`

2021-12-10 Thread GitBox
alamb commented on issue #1026: URL: https://github.com/apache/arrow-rs/issues/1026#issuecomment-991102464 Found a workaround

[GitHub] [arrow] paleolimbot commented on pull request #11850: ARROW-14941 [R] Implement Duration R6 class and bindings for lubridate::duration()

2021-12-10 Thread GitBox
paleolimbot commented on pull request #11850: URL: https://github.com/apache/arrow/pull/11850#issuecomment-991097964 Ok! This now works: ``` r library(arrow, warn.conflicts = FALSE) (arr <- Array$create(as.difftime(123, units = "secs"))) #> Array #> #> [ #> 00:02:

[GitHub] [arrow-rs] alamb opened a new issue #1026: Parquet benchmarks are broken - no `DataPageBuilderImpl` in `util`

2021-12-10 Thread GitBox
alamb opened a new issue #1026: URL: https://github.com/apache/arrow-rs/issues/1026 **Describe the bug** Parquet benchmarks are broken **To Reproduce** Run ```shell cargo bench -p parquet ``` For example: ```shell alamb@instance-1:/data/arrow-rs$ cargo benc

[GitHub] [arrow] paleolimbot commented on a change in pull request #11919: ARROW-14804: [R] import_from_c() / export_to_c() methods should accept external pointers

2021-12-10 Thread GitBox
paleolimbot commented on a change in pull request #11919: URL: https://github.com/apache/arrow/pull/11919#discussion_r766791379 ## File path: r/src/py-to-r.cpp ## @@ -21,6 +21,38 @@ #include +// [[arrow::export]] +double external_pointer_addr_double(SEXP external_pointer)

[GitHub] [arrow-rs] alamb commented on a change in pull request #1021: Simplify record reader

2021-12-10 Thread GitBox
alamb commented on a change in pull request #1021: URL: https://github.com/apache/arrow-rs/pull/1021#discussion_r766786054 ## File path: parquet/src/arrow/record_reader.rs ## @@ -75,9 +73,7 @@ impl RecordReader { column_desc: column_schema, num_records

[GitHub] [arrow] jonkeane commented on pull request #11921: ARROW-12743 [R] Add DESCRIPTION fields for dev dependencies

2021-12-10 Thread GitBox
jonkeane commented on pull request #11921: URL: https://github.com/apache/arrow/pull/11921#issuecomment-991080111 > 40+ minutes later my R session was still busy installing (mostly binary) packages Indeed, something sounds not quite right about that or not the behavior we would ulti

[GitHub] [arrow-rs] alamb commented on pull request #1021: Simplify record reader

2021-12-10 Thread GitBox
alamb commented on pull request #1021: URL: https://github.com/apache/arrow-rs/pull/1021#issuecomment-991075986 I think we should run the parquet performance benchmark for this change -- I will do so

[GitHub] [arrow-datafusion] alamb commented on issue #1432: Field names containing period such as `f.c1` cannot be named in SQL query

2021-12-10 Thread GitBox
alamb commented on issue #1432: URL: https://github.com/apache/arrow-datafusion/issues/1432#issuecomment-991073111 And indeed postgres supports such columns: ``` alamb=# create table foo ("f1.c1" int); CREATE TABLE alamb=# insert into foo values (1); INSERT 0 1 alamb=#

[GitHub] [arrow] romainfrancois commented on a change in pull request #11850: ARROW-14941 [R] Implement Duration R6 class and bindings for lubridate::duration()

2021-12-10 Thread GitBox
romainfrancois commented on a change in pull request #11850: URL: https://github.com/apache/arrow/pull/11850#discussion_r766771764 ## File path: r/src/r_to_arrow.cpp ## @@ -580,33 +583,37 @@ int64_t get_TimeUnit_multiplier(TimeUnit::type unit) { } } +Result get_difftime_u

[GitHub] [arrow] schelhorn commented on issue #11781: Is adding Parquet partitions/part files using R's arrow::write_dataset() transactional?

2021-12-10 Thread GitBox
schelhorn commented on issue #11781: URL: https://github.com/apache/arrow/issues/11781#issuecomment-991062577 I see, thank you @nealrichardson. So I guess I would fare safer by indeed having a second dataset to write in the same directory as the main dataset, and then sync all new part fil

[GitHub] [arrow-rs] alamb commented on pull request #1025: Prepare for the 6.4.0: release notes and version update

2021-12-10 Thread GitBox
alamb commented on pull request #1025: URL: https://github.com/apache/arrow-rs/pull/1025#issuecomment-991061631 Merging this PR to kick off the final tests on active_release

[GitHub] [arrow-rs] alamb merged pull request #1025: Prepare for the 6.4.0: release notes and version update

2021-12-10 Thread GitBox
alamb merged pull request #1025: URL: https://github.com/apache/arrow-rs/pull/1025

[GitHub] [arrow-rs] alamb opened a new pull request #1025: Prepare for the 6.4.0: release notes and version update

2021-12-10 Thread GitBox
alamb opened a new pull request #1025: URL: https://github.com/apache/arrow-rs/pull/1025 Note targets `active_release`

[GitHub] [arrow-rs] alamb merged pull request #1024: Cherry pick Force new cargo and target caching to fix CI to active_release

2021-12-10 Thread GitBox
alamb merged pull request #1024: URL: https://github.com/apache/arrow-rs/pull/1024

[GitHub] [arrow-rs] alamb opened a new pull request #1024: Cherry pick Force new cargo and target caching to fix CI to active_release

2021-12-10 Thread GitBox
alamb opened a new pull request #1024: URL: https://github.com/apache/arrow-rs/pull/1024 Automatic cherry-pick of e0abda2 * Originally appeared in https://github.com/apache/arrow-rs/pull/1023: Force new cargo and target caching to fix CI

[GitHub] [arrow-datafusion] Dandandan commented on issue #1432: Field names containing period such as `f.c1` cannot be named in SQL query

2021-12-10 Thread GitBox
Dandandan commented on issue #1432: URL: https://github.com/apache/arrow-datafusion/issues/1432#issuecomment-991055788 Usually you can use quotes to achieve this. In SparkSQL: ``` create table t1(`f.c1` int); select "f.c1" from t1; ```

[GitHub] [arrow-datafusion] liukun4515 commented on pull request #1407: support decimal for min/max agg

2021-12-10 Thread GitBox
liukun4515 commented on pull request #1407: URL: https://github.com/apache/arrow-datafusion/pull/1407#issuecomment-991054740 @alamb it can be merged. Thanks.

[GitHub] [arrow-rs] alamb commented on pull request #1021: Simplify record reader

2021-12-10 Thread GitBox
alamb commented on pull request #1021: URL: https://github.com/apache/arrow-rs/pull/1021#issuecomment-991053958 I fixed the nightly failures in https://github.com/apache/arrow-rs/pull/1023 -- will merge to this PR to get that to pass too

[GitHub] [arrow-datafusion] liukun4515 edited a comment on issue #1432: Field names containing period such as `f.c1` cannot be named in SQL query

2021-12-10 Thread GitBox
liukun4515 edited a comment on issue #1432: URL: https://github.com/apache/arrow-datafusion/issues/1432#issuecomment-991051157 In my opinion, this `f1.c1` style is illegal for SQL syntax. For example in Mysql and PG ``` mysql> create table t5("f.c1" int); ERROR 1064 (42000): Yo

[GitHub] [arrow-rs] alamb commented on pull request #779: Add MONTH_DAY_NANO interval type, impl `ArrowNativeType` for `i128`

2021-12-10 Thread GitBox
alamb commented on pull request #779: URL: https://github.com/apache/arrow-rs/pull/779#issuecomment-991052871 The rust compiler thing is fixed in https://github.com/apache/arrow-rs/pull/1023 -- I'll try and merge to this PR

[GitHub] [arrow-rs] alamb commented on pull request #1023: Force new cargo and target caching to fix CI

2021-12-10 Thread GitBox
alamb commented on pull request #1023: URL: https://github.com/apache/arrow-rs/pull/1023#issuecomment-991052202 Merging this PR in to get CI checks running cleanly again

[GitHub] [arrow-rs] alamb closed issue #1022: CI tests using rust nightly are failing

2021-12-10 Thread GitBox
alamb closed issue #1022: URL: https://github.com/apache/arrow-rs/issues/1022

[GitHub] [arrow-rs] alamb merged pull request #1023: Force new cargo and target caching to fix CI

2021-12-10 Thread GitBox
alamb merged pull request #1023: URL: https://github.com/apache/arrow-rs/pull/1023

[GitHub] [arrow-datafusion] liukun4515 commented on issue #1432: Field names containing period such as `f.c1` cannot be named in SQL query

2021-12-10 Thread GitBox
liukun4515 commented on issue #1432: URL: https://github.com/apache/arrow-datafusion/issues/1432#issuecomment-991051157 In my opinion, this `f1.c1` style is illegal for SQL syntax. For example in Mysql and PG ``` mysql> create table t4(f.c1 int); ERROR 1064 (42000): You have an

[GitHub] [arrow] ursabot edited a comment on pull request #11883: ARROW-14306: [C++][Compute] Add binary reverse kernel

2021-12-10 Thread GitBox
ursabot edited a comment on pull request #11883: URL: https://github.com/apache/arrow/pull/11883#issuecomment-991028365 Benchmark runs are scheduled for baseline = 00d507792ffdadda24f8bb0d4b419d9e0d7d1241 and contender = b1f009ca80ef6f12fbdf56a1b53ed8d0e0571a5a. b1f009ca80ef6f12fbdf56a1b5

[GitHub] [arrow] liyafan82 commented on a change in pull request #11709: ARROW-14718: [Java] loadValidityBuffer should avoid allocating memory when input is not null and there are only null or non-nul

2021-12-10 Thread GitBox
liyafan82 commented on a change in pull request #11709: URL: https://github.com/apache/arrow/pull/11709#discussion_r766741762 ## File path: java/vector/src/test/java/org/apache/arrow/vector/TestBitVectorHelper.java ## @@ -222,6 +223,59 @@ public void testConcatBitsInPlace() {

[GitHub] [arrow] liyafan82 commented on a change in pull request #11709: ARROW-14718: [Java] loadValidityBuffer should avoid allocating memory when input is not null and there are only null or non-nul

2021-12-10 Thread GitBox
liyafan82 commented on a change in pull request #11709: URL: https://github.com/apache/arrow/pull/11709#discussion_r766740773 ## File path: java/vector/src/test/java/org/apache/arrow/vector/TestBitVectorHelper.java ## @@ -222,6 +223,59 @@ public void testConcatBitsInPlace() {

[GitHub] [arrow] romainfrancois commented on a change in pull request #11919: ARROW-14804: [R] import_from_c() / export_to_c() methods should accept external pointers

2021-12-10 Thread GitBox
romainfrancois commented on a change in pull request #11919: URL: https://github.com/apache/arrow/pull/11919#discussion_r766734547 ## File path: r/src/py-to-r.cpp ## @@ -21,6 +21,38 @@ #include +// [[arrow::export]] +double external_pointer_addr_double(SEXP external_point

[GitHub] [arrow] romainfrancois commented on a change in pull request #11850: ARROW-14941 [R] Implement Duration R6 class and bindings for lubridate::duration()

2021-12-10 Thread GitBox
romainfrancois commented on a change in pull request #11850: URL: https://github.com/apache/arrow/pull/11850#discussion_r766732853 ## File path: r/src/r_to_arrow.cpp ## @@ -580,33 +583,37 @@ int64_t get_TimeUnit_multiplier(TimeUnit::type unit) { } } +Result get_difftime_u

[GitHub] [arrow-rs] hntd187 commented on issue #1014: Add `Schema::project` and `RecordBatch` project function to project / select a subset of columns

2021-12-10 Thread GitBox
hntd187 commented on issue #1014: URL: https://github.com/apache/arrow-rs/issues/1014#issuecomment-991029374 I was going to tackle this, still learning the code base but I think I'll manage

[GitHub] [arrow] ursabot commented on pull request #11883: ARROW-14306: [C++][Compute] Add binary reverse kernel

2021-12-10 Thread GitBox
ursabot commented on pull request #11883: URL: https://github.com/apache/arrow/pull/11883#issuecomment-991028365 Benchmark runs are scheduled for baseline = 00d507792ffdadda24f8bb0d4b419d9e0d7d1241 and contender = b1f009ca80ef6f12fbdf56a1b53ed8d0e0571a5a. b1f009ca80ef6f12fbdf56a1b53ed8d0e

[GitHub] [arrow-datafusion] Dandandan commented on a change in pull request #1431: support decimal data type in create table

2021-12-10 Thread GitBox
Dandandan commented on a change in pull request #1431: URL: https://github.com/apache/arrow-datafusion/pull/1431#discussion_r766728475 ## File path: datafusion/tests/sql.rs ## @@ -3761,6 +3761,28 @@ async fn register_aggregate_csv(ctx: &mut ExecutionContext) -> Result<()> {

[GitHub] [arrow-datafusion] Dandandan commented on a change in pull request #1431: support decimal data type in create table

2021-12-10 Thread GitBox
Dandandan commented on a change in pull request #1431: URL: https://github.com/apache/arrow-datafusion/pull/1431#discussion_r766728006 ## File path: datafusion/src/sql/planner.rs ## @@ -2006,6 +2026,27 @@ pub fn convert_data_type(sql: &SQLDataType) -> Result { SQLData

[GitHub] [arrow] lidavidm closed pull request #11883: ARROW-14306: [C++][Compute] Add binary reverse kernel

2021-12-10 Thread GitBox
lidavidm closed pull request #11883: URL: https://github.com/apache/arrow/pull/11883

[GitHub] [arrow-datafusion] Dandandan commented on a change in pull request #1431: support decimal data type in create table

2021-12-10 Thread GitBox
Dandandan commented on a change in pull request #1431: URL: https://github.com/apache/arrow-datafusion/pull/1431#discussion_r766727687 ## File path: datafusion/src/sql/planner.rs ## @@ -372,7 +372,27 @@ impl<'a, S: ContextProvider> SqlToRel<'a, S> { SQLDataType::Ch
