[GitHub] [arrow] xhochy commented on pull request #11916: ARROW-14506: [C++] Conda support for google-cloud-cpp

2022-01-24 Thread GitBox
xhochy commented on pull request #11916: URL: https://github.com/apache/arrow/pull/11916#issuecomment-1019816855 @github-actions crossbow submit conda-linux-gcc-py37-arm64 conda-linux-gcc-py37-ppc64le -- This is an automated message from the Apache Git Service. To respond to the message,

[GitHub] [arrow] github-actions[bot] commented on pull request #11916: ARROW-14506: [C++] Conda support for google-cloud-cpp

2022-01-24 Thread GitBox
github-actions[bot] commented on pull request #11916: URL: https://github.com/apache/arrow/pull/11916#issuecomment-1019817790 Revision: d766bf7a69021081bc483921959816d7f9f09590 Submitted crossbow builds: [ursacomputing/crossbow @ actions-1467](https://github.com/ursacomputing/crossbo

[GitHub] [arrow-datafusion] xudong963 commented on a change in pull request #1660: fix: substr - correct behaivour with negative start pos

2022-01-24 Thread GitBox
xudong963 commented on a change in pull request #1660: URL: https://github.com/apache/arrow-datafusion/pull/1660#discussion_r790493092 ## File path: datafusion/src/physical_plan/functions.rs ## @@ -3680,6 +3704,60 @@ mod tests { StringArray ); #[c
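The preview above is truncated, so the exact semantics this pull request settles on are not visible here. As a rough illustration only, the sketch below assumes PostgreSQL-style `substr` behaviour, where positions are 1-based and any positions before 1 still consume part of the requested length. The function name `sql_substr` is hypothetical and is not taken from DataFusion.

```python
def sql_substr(s, start, count=None):
    """Hypothetical 1-based substring, assuming PostgreSQL-style semantics.

    Positions before 1 consume `count` but contribute no characters.
    """
    if count is not None and count < 0:
        raise ValueError("negative substring length not allowed")
    begin = max(start, 1)        # first real character position (1-based)
    if count is None:
        return s[begin - 1:]     # no length given: run to the end of the string
    end = start + count          # exclusive end position (1-based)
    if end <= begin:
        return ""                # the requested window lies entirely before position 1
    return s[begin - 1:end - 1]


# A negative start shortens the visible window rather than counting from the end.
print(repr(sql_substr("alphabet", 3, 2)))
print(repr(sql_substr("alphabet", -3, 5)))
```

Under this assumption a negative start never indexes from the end of the string, which is the main way it differs from Python slicing.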

[GitHub] [arrow-datafusion] xudong963 commented on a change in pull request #1660: fix: substr - correct behaivour with negative start pos

2022-01-24 Thread GitBox
xudong963 commented on a change in pull request #1660: URL: https://github.com/apache/arrow-datafusion/pull/1660#discussion_r790493746 ## File path: datafusion/src/physical_plan/functions.rs ## @@ -3680,6 +3704,60 @@ mod tests { StringArray ); #[c

[GitHub] [arrow-datafusion] yjshen commented on issue #1569: Track memory usage in Non Limited Operators

2022-01-24 Thread GitBox
yjshen commented on issue #1569: URL: https://github.com/apache/arrow-datafusion/issues/1569#issuecomment-1019829977 `add` will be helpful for `ConsumerType::Requesting` since we are growing memory usage each time we acquire. I haven't thought of a use case for `sub` yet. Maybe leave that
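The accounting idea in this comment, growing a tracked total each time memory is acquired, might look roughly like the sketch below. It is written in Python purely for illustration and is not DataFusion's API: `TrackedConsumer` and its fields are hypothetical names, and only the notion of an `add` step per acquisition comes from the comment.

```python
class TrackedConsumer:
    """Hypothetical consumer that records how much memory it has acquired."""

    def __init__(self, name):
        self.name = name
        self.used = 0  # bytes currently attributed to this consumer

    def add(self, nbytes):
        """Grow the tracked usage after a successful acquisition."""
        if nbytes < 0:
            raise ValueError("nbytes must be non-negative")
        self.used += nbytes
        return self.used


# Each acquisition is followed by an `add`, so `used` only ever grows.
consumer = TrackedConsumer("sort")
for batch_size in (1024, 2048, 512):
    consumer.add(batch_size)
print(consumer.used)
```

A matching `sub` is deliberately left out, mirroring the comment's point that no use case for it has come up yet.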

[GitHub] [arrow] Crystrix commented on a change in pull request #12032: ARROW-15126: [C++] Support Null type as group keys

2022-01-24 Thread GitBox
Crystrix commented on a change in pull request #12032: URL: https://github.com/apache/arrow/pull/12032#discussion_r790499334 ## File path: cpp/src/arrow/compute/exec/key_encode.h ## @@ -53,13 +53,19 @@ class KeyEncoder { /// for the purpose of row encoding. struct KeyColu

[GitHub] [arrow] Crystrix commented on pull request #12032: ARROW-15126: [C++] Support Null type as group keys

2022-01-24 Thread GitBox
Crystrix commented on pull request #12032: URL: https://github.com/apache/arrow/pull/12032#issuecomment-1019831616 @pitrou @edponce, could you please help review? The main changes are updates based on the review comments and support for the null type key in `GrouperFastImpl`.

[GitHub] [arrow-rs] HaoYang670 commented on issue #1108: Add native comparison kernel support for BinaryArray

2022-01-24 Thread GitBox
HaoYang670 commented on issue #1108: URL: https://github.com/apache/arrow-rs/issues/1108#issuecomment-1019849720 BTW, could you please assign this issue to me? I have no right to self-assigned.

[GitHub] [arrow-rs] HaoYang670 edited a comment on issue #1108: Add native comparison kernel support for BinaryArray

2022-01-24 Thread GitBox
HaoYang670 edited a comment on issue #1108: URL: https://github.com/apache/arrow-rs/issues/1108#issuecomment-1019849720 BTW, could you please assign this issue to me? I have no right to self-assign.

[GitHub] [arrow-datafusion] gaojun2048 commented on a change in pull request #1659: [Ballista] Add Decimal128, Date64, TimestampSecond, TimestampMillisecond, Interv…

2022-01-24 Thread GitBox
gaojun2048 commented on a change in pull request #1659: URL: https://github.com/apache/arrow-datafusion/pull/1659#discussion_r790537324 ## File path: ballista/rust/core/proto/ballista.proto ## @@ -1179,9 +1179,21 @@ message ScalarValue{ ScalarType null_list_value = 18;

[GitHub] [arrow-datafusion] hengaini2055 opened a new issue #1661: thread 'tokio-runtime-worker' panicked at 'not implemented: Take not supported for data type Decimal(18, 4)

2022-01-24 Thread GitBox
hengaini2055 opened a new issue #1661: URL: https://github.com/apache/arrow-datafusion/issues/1661 **Describe the bug** I run SQL through datafusion-cli. The query `select sum(amount) as amount from my_table;` works as expected. When I use sql "select dept, sum(amount) as amount from my_

[GitHub] [arrow-datafusion] hengaini2055 commented on issue #1661: thread 'tokio-runtime-worker' panicked at 'not implemented: Take not supported for data type Decimal(18, 4)

2022-01-24 Thread GitBox
hengaini2055 commented on issue #1661: URL: https://github.com/apache/arrow-datafusion/issues/1661#issuecomment-1019877681 My parquet file has a decimal(18,4) column named amount. The exact error is ``` thread 'tokio-runtime-worker' panicked at 'not implemented: Take not supported

[GitHub] [arrow-datafusion] hengaini2055 commented on issue #1661: thread 'tokio-runtime-worker' panicked at 'not implemented: Take not supported for data type Decimal(18, 4)

2022-01-24 Thread GitBox
hengaini2055 commented on issue #1661: URL: https://github.com/apache/arrow-datafusion/issues/1661#issuecomment-1019881219 The parquet file is exported using odbc2parquet tool from SQLServer Database. ``` odbc2parquet query --dsn wd_bi_test --password "88" --user "88" e:\temp\bi_

[GitHub] [arrow] pitrou commented on a change in pull request #12231: ARROW-14783: [C++][Python] Fix the BytesIO support issue

2022-01-24 Thread GitBox
pitrou commented on a change in pull request #12231: URL: https://github.com/apache/arrow/pull/12231#discussion_r790547496 ## File path: python/pyarrow/_orc.pyx ## @@ -388,8 +388,11 @@ cdef class ORCWriter(_Weakrefable): object sink unique_ptr[ORCFileWriter] w

[GitHub] [arrow-datafusion] Ted-Jiang opened a new issue #1662: Need clean up intermediate data in Ballista

2022-01-24 Thread GitBox
Ted-Jiang opened a new issue #1662: URL: https://github.com/apache/arrow-datafusion/issues/1662 **Is your feature request related to a problem or challenge? Please describe what you are trying to do.** We need to check whether the states saved in sled have been consumed by the UI or not. if

[GitHub] [arrow] xhochy commented on pull request #11916: ARROW-14506: [C++] Conda support for google-cloud-cpp

2022-01-24 Thread GitBox
xhochy commented on pull request #11916: URL: https://github.com/apache/arrow/pull/11916#issuecomment-1019909657 @github-actions crossbow submit -g conda

[GitHub] [arrow-datafusion] yjshen opened a new pull request #1663: [Cleanup] Move AggregatedMetricsSet to metrics for further reuse

2022-01-24 Thread GitBox
yjshen opened a new pull request #1663: URL: https://github.com/apache/arrow-datafusion/pull/1663 # Which issue does this PR close? Closes #. # Rationale for this change # What changes are included in this PR? # Are there any user-facing changes?

[GitHub] [arrow] github-actions[bot] commented on pull request #11916: ARROW-14506: [C++] Conda support for google-cloud-cpp

2022-01-24 Thread GitBox
github-actions[bot] commented on pull request #11916: URL: https://github.com/apache/arrow/pull/11916#issuecomment-1019911139 Revision: d766bf7a69021081bc483921959816d7f9f09590 Submitted crossbow builds: [ursacomputing/crossbow @ actions-1468](https://github.com/ursacomputing/crossbo

[GitHub] [arrow-datafusion] tustvold commented on issue #1652: ARROW2: Performance benchmark

2022-01-24 Thread GitBox
tustvold commented on issue #1652: URL: https://github.com/apache/arrow-datafusion/issues/1652#issuecomment-1019913842 Big :+1: to this, getting some concrete numbers would be really nice. FWIW some ideas for whoever picks this up that I at least would be very interested in:

[GitHub] [arrow-datafusion] tustvold edited a comment on issue #1652: ARROW2: Performance benchmark

2022-01-24 Thread GitBox
tustvold edited a comment on issue #1652: URL: https://github.com/apache/arrow-datafusion/issues/1652#issuecomment-1019913842 Big :+1: to this, getting some concrete numbers would be really nice. FWIW some ideas for whoever picks this up that I at least would be very interested in:

[GitHub] [arrow] kszucs commented on pull request #12227: ARROW-15417: [Python][Packaging] Use vcpkg manifest to install wheel dependencies; downgrade AWS SDK by building the bundled version

2022-01-24 Thread GitBox
kszucs commented on pull request #12227: URL: https://github.com/apache/arrow/pull/12227#issuecomment-1019925057 @github-actions crossbow submit java-jars

[GitHub] [arrow] github-actions[bot] commented on pull request #12227: ARROW-15417: [Python][Packaging] Use vcpkg manifest to install wheel dependencies; downgrade AWS SDK by building the bundled vers

2022-01-24 Thread GitBox
github-actions[bot] commented on pull request #12227: URL: https://github.com/apache/arrow/pull/12227#issuecomment-1019926113 Revision: 92b41d5d3f434cc6395338a679330ea46ef51158 Submitted crossbow builds: [ursacomputing/crossbow @ actions-1469](https://github.com/ursacomputing/crossbo

[GitHub] [arrow-datafusion] Ted-Jiang commented on issue #1405: Do not send empty batches for Hash partitioning

2022-01-24 Thread GitBox
Ted-Jiang commented on issue #1405: URL: https://github.com/apache/arrow-datafusion/issues/1405#issuecomment-1019926539 @Dandandan Sorry, I forgot to add the closing tag in the PR; please close this issue.

[GitHub] [arrow] pitrou commented on a change in pull request #12032: ARROW-15126: [C++] Support Null type as group keys

2022-01-24 Thread GitBox
pitrou commented on a change in pull request #12032: URL: https://github.com/apache/arrow/pull/12032#discussion_r790574210 ## File path: cpp/src/arrow/compute/exec/key_hash.cc ## @@ -286,34 +286,40 @@ void Hashing::HashMultiColumn(const std::vector& col bool is_first = true

[GitHub] [arrow] rok commented on pull request #11889: ARROW-14708: [C++] Adding missing abseil dependencies to enable static flight build

2022-01-24 Thread GitBox
rok commented on pull request #11889: URL: https://github.com/apache/arrow/pull/11889#issuecomment-1019934244 @kszucs apologies for taking the resources! I accidentally triggered on the wrong commit and reran quickly on the right one. I'll be more selective in the future. Thanks for the fe

[GitHub] [arrow-rs] tustvold commented on pull request #1228: Unaligned bit chunk

2022-01-24 Thread GitBox
tustvold commented on pull request #1228: URL: https://github.com/apache/arrow-rs/pull/1228#issuecomment-1019939399 In order to test how much of the performance uplift was the changes to SlicesIterator and how much UnalignedChunk I created a branch with the changes to SlicesIterator but us

[GitHub] [arrow-cookbook] thisisnic commented on issue #129: Read encrypted parquet file from R

2022-01-24 Thread GitBox
thisisnic commented on issue #129: URL: https://github.com/apache/arrow-cookbook/issues/129#issuecomment-1019949566 Hi @jaimesalvador , thanks for opening this issue! Is this something you'd like to see a recipe for, or are you interested in contributing one yourself?

[GitHub] [arrow-datafusion] Dandandan closed issue #1405: Do not send empty batches for Hash partitioning

2022-01-24 Thread GitBox
Dandandan closed issue #1405: URL: https://github.com/apache/arrow-datafusion/issues/1405

[GitHub] [arrow] pitrou commented on pull request #10789: ARROW-5926: [Java] Test fuzzer inputs

2022-01-24 Thread GitBox
pitrou commented on pull request #10789: URL: https://github.com/apache/arrow/pull/10789#issuecomment-1019965461 Does this need to be merged before the OSS-Fuzz PR?

[GitHub] [arrow] kszucs commented on a change in pull request #12227: ARROW-15417: [Python][Packaging] Use vcpkg manifest to install wheel dependencies; downgrade AWS SDK by building the bundled versi

2022-01-24 Thread GitBox
kszucs commented on a change in pull request #12227: URL: https://github.com/apache/arrow/pull/12227#discussion_r790278205 ## File path: dev/tasks/python-wheels/github.linux.amd64.yml ## @@ -47,7 +47,7 @@ jobs: {{ macros.github_upload_releases("arrow/python/repaired_whe

[GitHub] [arrow] kszucs closed pull request #12227: ARROW-15417: [Python][Packaging] Use vcpkg manifest to install wheel dependencies; downgrade AWS SDK by building the bundled version

2022-01-24 Thread GitBox
kszucs closed pull request #12227: URL: https://github.com/apache/arrow/pull/12227

[GitHub] [arrow] ursabot commented on pull request #12227: ARROW-15417: [Python][Packaging] Use vcpkg manifest to install wheel dependencies; downgrade AWS SDK by building the bundled version

2022-01-24 Thread GitBox
ursabot commented on pull request #12227: URL: https://github.com/apache/arrow/pull/12227#issuecomment-1019973568 Benchmark runs are scheduled for baseline = 8e34b64f60120bdee5991148f765cd4452f0e0d7 and contender = a8c0b1f3e359bc5a519976758c0873dd77f152f1. a8c0b1f3e359bc5a519976758c0873dd

[GitHub] [arrow] kszucs commented on pull request #12235: [Release] Verify 7.0.0 RC6 [WIP]

2022-01-24 Thread GitBox
kszucs commented on pull request #12235: URL: https://github.com/apache/arrow/pull/12235#issuecomment-1019978885 @github-actions crossbow submit --group verify-rc-source --param release=7.0.0 --param rc=6

[GitHub] [arrow] pitrou commented on pull request #12230: ARROW-15415: [C++] Fixes for Windows Debug build

2022-01-24 Thread GitBox
pitrou commented on pull request #12230: URL: https://github.com/apache/arrow/pull/12230#issuecomment-1019979433 There is a fork of clcache at https://github.com/Nuitka/clcache/ . Note we use clcache on AppVeyor, so it still works, presumably. See e.g. https://ci.appveyor.com/project/Apach

[GitHub] [arrow-rs] yjshen opened a new pull request #1236: [Minor]`into_inner` for IPC `FileWriter`

2022-01-24 Thread GitBox
yjshen opened a new pull request #1236: URL: https://github.com/apache/arrow-rs/pull/1236 # Which issue does this PR close? Closes #. # Rationale for this change We could get the file handle out and do more appending to the underlying file. # What

[GitHub] [arrow] pitrou commented on pull request #12230: ARROW-15415: [C++] Fixes for Windows Debug build

2022-01-24 Thread GitBox
pitrou commented on pull request #12230: URL: https://github.com/apache/arrow/pull/12230#issuecomment-1019980182 cc @kou

[GitHub] [arrow] NightMachinary commented on issue #4802: pyarrow / pandas support for tensors (multi-dimensional arrays)

2022-01-24 Thread GitBox
NightMachinary commented on issue #4802: URL: https://github.com/apache/arrow/issues/4802#issuecomment-1019985875 I have a `(10**6, 10**4)` numpy array. I want to store this in an efficient format and be able to load batches from this file to use in an out-of-core algorithm. Is Arrow suita

[GitHub] [arrow] pitrou commented on pull request #12195: ARROW-15384: [Python] Wheel task for python 3.7 M1

2022-01-24 Thread GitBox
pitrou commented on pull request #12195: URL: https://github.com/apache/arrow/pull/12195#issuecomment-1019989503 > Thanks @rohitppathak for your PR! First we need to install python 3.7 on the self hosted runner, but last time I checked there were not available any installers for download a

[GitHub] [arrow] ursabot edited a comment on pull request #12227: ARROW-15417: [Python][Packaging] Use vcpkg manifest to install wheel dependencies; downgrade AWS SDK by building the bundled version

2022-01-24 Thread GitBox
ursabot edited a comment on pull request #12227: URL: https://github.com/apache/arrow/pull/12227#issuecomment-1019973568 Benchmark runs are scheduled for baseline = 8e34b64f60120bdee5991148f765cd4452f0e0d7 and contender = a8c0b1f3e359bc5a519976758c0873dd77f152f1. a8c0b1f3e359bc5a519976758

[GitHub] [arrow] github-actions[bot] commented on pull request #12235: [Release] Verify 7.0.0 RC6 [WIP]

2022-01-24 Thread GitBox
github-actions[bot] commented on pull request #12235: URL: https://github.com/apache/arrow/pull/12235#issuecomment-1019997093 Revision: cc809bd98a04f562a38107858cab669db0768cc1 Submitted crossbow builds: [ursacomputing/crossbow @ actions-1470](https://github.com/ursacomputing/crossbo

[GitHub] [arrow] rok commented on pull request #11889: ARROW-14708: [C++] Adding missing abseil dependencies to enable static flight build

2022-01-24 Thread GitBox
rok commented on pull request #11889: URL: https://github.com/apache/arrow/pull/11889#issuecomment-1019998064 I checked the failing builds that did not fail in the nightly: Seems unrelated: * test-conda-python-3.9-spark-master * test-ubuntu-20.04-cpp-thread-sanitizer * ubuntu

[GitHub] [arrow] rok edited a comment on pull request #11889: ARROW-14708: [C++] Adding missing abseil dependencies to enable static flight build

2022-01-24 Thread GitBox
rok edited a comment on pull request #11889: URL: https://github.com/apache/arrow/pull/11889#issuecomment-1019998064 I checked the failing builds that did not fail in the nightly: Seems unrelated: * test-conda-python-3.9-spark-master * test-ubuntu-20.04-cpp-thread-sanitizer *

[GitHub] [arrow-datafusion] yjshen opened a new pull request #1664: Make `MemoryManager` and `MemoryStream` public

2022-01-24 Thread GitBox
yjshen opened a new pull request #1664: URL: https://github.com/apache/arrow-datafusion/pull/1664 # Which issue does this PR close? Closes #150 . # Rationale for this change Make `MemoryManager` and `MemoryStream` public, make them accessible by ballista or

[GitHub] [arrow-cookbook] thisisnic opened a new issue #130: Add recipe on converting a large CSV into a smaller one

2022-01-24 Thread GitBox
thisisnic opened a new issue #130: URL: https://github.com/apache/arrow-cookbook/issues/130 Add a recipe which shows how to do `open_dataset()`...`write_dataset()`, and some extra text which explains why this is better than doing head/tail/awk from one giant csv to many smaller ones (i.e.

[GitHub] [arrow] thisisnic commented on pull request #12152: ARROW-15123: [R] CSV dataset file header read in as data

2022-01-24 Thread GitBox
thisisnic commented on pull request #12152: URL: https://github.com/apache/arrow/pull/12152#issuecomment-1020007427 The cookbook has this ticket open for adding something on doing `open_dataset()`...`write_dataset()`: https://github.com/apache/arrow-cookbook/issues/130. Any more cha

[GitHub] [arrow-rs] alamb commented on a change in pull request #1219: Do not concatenate identical dictionaries

2022-01-24 Thread GitBox
alamb commented on a change in pull request #1219: URL: https://github.com/apache/arrow-rs/pull/1219#discussion_r790662987 ## File path: arrow/src/array/data.rs ## @@ -1155,6 +1155,41 @@ impl ArrayData { Ok(()) }) } + +/// Returns true if this `Ar

[GitHub] [arrow-rs] alamb closed issue #171: Implement returning dictionary arrays from parquet reader

2022-01-24 Thread GitBox
alamb closed issue #171: URL: https://github.com/apache/arrow-rs/issues/171

[GitHub] [arrow-rs] alamb merged pull request #1180: Preserve dictionary encoding when decoding parquet into Arrow arrays, 60x perf improvement (#171)

2022-01-24 Thread GitBox
alamb merged pull request #1180: URL: https://github.com/apache/arrow-rs/pull/1180

[GitHub] [arrow-datafusion] Ted-Jiang opened a new pull request #1665: Skip some path in list_file_with_suffix.

2022-01-24 Thread GitBox
Ted-Jiang opened a new pull request #1665: URL: https://github.com/apache/arrow-datafusion/pull/1665 # Which issue does this PR close? Closes #1648. # Rationale for this change We should filter out some paths in the list_file method, like https://github.com/apache/spa

[GitHub] [arrow-datafusion] alamb commented on issue #1652: ARROW2: Performance benchmark

2022-01-24 Thread GitBox
alamb commented on issue #1652: URL: https://github.com/apache/arrow-datafusion/issues/1652#issuecomment-1020025480 A number of the performance improvements will be in arrow-rs 8.0.0, though some such as https://github.com/apache/arrow-rs/pull/1180 will not be released until arrow-rs 9.0.

[GitHub] [arrow-datafusion] alamb merged pull request #1645: Remove non idiomatic `DataFusionError::into_arrow_external_error` in favor of From conversion

2022-01-24 Thread GitBox
alamb merged pull request #1645: URL: https://github.com/apache/arrow-datafusion/pull/1645

[GitHub] [arrow-datafusion] alamb closed issue #1644: Remove non idomatic DataFusionError::into_arrow_external_error in favor of `From` conversion

2022-01-24 Thread GitBox
alamb closed issue #1644: URL: https://github.com/apache/arrow-datafusion/issues/1644

[GitHub] [arrow-datafusion] Ted-Jiang commented on pull request #1665: Skip some path in list_file_with_suffix.

2022-01-24 Thread GitBox
Ted-Jiang commented on pull request #1665: URL: https://github.com/apache/arrow-datafusion/pull/1665#issuecomment-1020032386 @houqp @andygrove Could you give me some advice?

[GitHub] [arrow] ursabot edited a comment on pull request #12232: ARROW-15416: [Python] Add option to skip gdb tests

2022-01-24 Thread GitBox
ursabot edited a comment on pull request #12232: URL: https://github.com/apache/arrow/pull/12232#issuecomment-1019488065 Benchmark runs are scheduled for baseline = 664a15896c8eb07916beb054b45db3e35b9810da and contender = 7f0867e616c23fe3e0ebaed8e8ef11be1b9f1dd9. 7f0867e616c23fe3e0ebaed8e

[GitHub] [arrow-rs] alamb commented on issue #1108: Add native comparison kernel support for BinaryArray

2022-01-24 Thread GitBox
alamb commented on issue #1108: URL: https://github.com/apache/arrow-rs/issues/1108#issuecomment-1020041844 > Is there something like BinaryDictionaryBuilder in the code, which is needed when testing the eq_dyn_binary_scalar_with_dict? I do not know of any such thing, sadly

[GitHub] [arrow] pitrou closed pull request #12197: ARROW-15385: [Integration] Split duration from interval in integration tests

2022-01-24 Thread GitBox
pitrou closed pull request #12197: URL: https://github.com/apache/arrow/pull/12197

[GitHub] [arrow] ursabot commented on pull request #12197: ARROW-15385: [Integration] Split duration from interval in integration tests

2022-01-24 Thread GitBox
ursabot commented on pull request #12197: URL: https://github.com/apache/arrow/pull/12197#issuecomment-1020074927 Benchmark runs are scheduled for baseline = a8c0b1f3e359bc5a519976758c0873dd77f152f1 and contender = ae9915d3d3a30732cdc6c7904ba570c2b55f302e. ae9915d3d3a30732cdc6c7904ba570c2

[GitHub] [arrow] pitrou closed pull request #12192: ARROW-15373: [C++] Return unique_ptr from MemoryManager::AllocateBuffer

2022-01-24 Thread GitBox
pitrou closed pull request #12192: URL: https://github.com/apache/arrow/pull/12192

[GitHub] [arrow] lidavidm closed issue #12225: Javascript Filght Support

2022-01-24 Thread GitBox
lidavidm closed issue #12225: URL: https://github.com/apache/arrow/issues/12225

[GitHub] [arrow] Crystrix commented on a change in pull request #12032: ARROW-15126: [C++] Support Null type as group keys

2022-01-24 Thread GitBox
Crystrix commented on a change in pull request #12032: URL: https://github.com/apache/arrow/pull/12032#discussion_r790722329 ## File path: cpp/src/arrow/compute/exec/key_compare.cc ## @@ -334,6 +334,14 @@ void KeyCompare::CompareColumnsToRows(uint32_t num_rows_to_compare, b

[GitHub] [arrow] liyafan82 closed pull request #10789: ARROW-5926: [Java] Test fuzzer inputs

2022-01-24 Thread GitBox
liyafan82 closed pull request #10789: URL: https://github.com/apache/arrow/pull/10789

[GitHub] [arrow] ursabot commented on pull request #12192: ARROW-15373: [C++] Return unique_ptr from MemoryManager::AllocateBuffer

2022-01-24 Thread GitBox
ursabot commented on pull request #12192: URL: https://github.com/apache/arrow/pull/12192#issuecomment-1020085049 Benchmark runs are scheduled for baseline = ae9915d3d3a30732cdc6c7904ba570c2b55f302e and contender = 512c5f1d454d21440f1ce1f8fa77cbb3b6cf8b42. 512c5f1d454d21440f1ce1f8fa77cbb3

[GitHub] [arrow] liyafan82 commented on pull request #10789: ARROW-5926: [Java] Test fuzzer inputs

2022-01-24 Thread GitBox
liyafan82 commented on pull request #10789: URL: https://github.com/apache/arrow/pull/10789#issuecomment-1020084856 > Does this need to be merged before the OSS-Fuzz PR? Maybe not. The oss-fuzz PR is independent of this. However, some bugs in this MR need to be fixed. I will clos

[GitHub] [arrow-datafusion] ovr commented on a change in pull request #1660: fix: substr - correct behaivour with negative start pos

2022-01-24 Thread GitBox
ovr commented on a change in pull request #1660: URL: https://github.com/apache/arrow-datafusion/pull/1660#discussion_r790733421 ## File path: datafusion/src/physical_plan/functions.rs ## @@ -3680,6 +3704,60 @@ mod tests { StringArray ); #[cfg(fea

[GitHub] [arrow] ursabot edited a comment on pull request #12197: ARROW-15385: [Integration] Split duration from interval in integration tests

2022-01-24 Thread GitBox
ursabot edited a comment on pull request #12197: URL: https://github.com/apache/arrow/pull/12197#issuecomment-1020074927 Benchmark runs are scheduled for baseline = a8c0b1f3e359bc5a519976758c0873dd77f152f1 and contender = ae9915d3d3a30732cdc6c7904ba570c2b55f302e. ae9915d3d3a30732cdc6c7904

[GitHub] [arrow] liyafan82 closed pull request #12229: ARROW-15414: [Java] Fix RangeEqualsVisitor for BitVector

2022-01-24 Thread GitBox
liyafan82 closed pull request #12229: URL: https://github.com/apache/arrow/pull/12229

[GitHub] [arrow] Crystrix commented on a change in pull request #12032: ARROW-15126: [C++] Support Null type as group keys

2022-01-24 Thread GitBox
Crystrix commented on a change in pull request #12032: URL: https://github.com/apache/arrow/pull/12032#discussion_r790738458 ## File path: cpp/src/arrow/compute/kernels/hash_aggregate.cc ## @@ -322,14 +333,20 @@ struct GrouperFastImpl : Grouper { group_ids, AllocateBuf

[GitHub] [arrow] Crystrix commented on a change in pull request #12032: ARROW-15126: [C++] Support Null type as group keys

2022-01-24 Thread GitBox
Crystrix commented on a change in pull request #12032: URL: https://github.com/apache/arrow/pull/12032#discussion_r790738567 ## File path: cpp/src/arrow/compute/kernels/hash_aggregate.cc ## @@ -322,14 +333,20 @@ struct GrouperFastImpl : Grouper { group_ids, AllocateBuf

[GitHub] [arrow] jorisvandenbossche commented on issue #4802: pyarrow / pandas support for tensors (multi-dimensional arrays)

2022-01-24 Thread GitBox
jorisvandenbossche commented on issue #4802: URL: https://github.com/apache/arrow/issues/4802#issuecomment-1020099085 In general, Arrow will be able to handle such a table, but it is certainly not the most optimal format for this kind of data (especially since we don't yet have good suppor

[GitHub] [arrow] ursabot commented on pull request #12229: ARROW-15414: [Java] Fix RangeEqualsVisitor for BitVector

2022-01-24 Thread GitBox
ursabot commented on pull request #12229: URL: https://github.com/apache/arrow/pull/12229#issuecomment-1020104357 Benchmark runs are scheduled for baseline = 512c5f1d454d21440f1ce1f8fa77cbb3b6cf8b42 and contender = daa5c18e9697a6455a7a75fec19594543c17b21e. daa5c18e9697a6455a7a75fec1959454

[GitHub] [arrow] xhochy commented on pull request #11916: ARROW-14506: [C++] Conda support for google-cloud-cpp

2022-01-24 Thread GitBox
xhochy commented on pull request #11916: URL: https://github.com/apache/arrow/pull/11916#issuecomment-1020105783 @github-actions crossbow submit -g conda

[GitHub] [arrow] github-actions[bot] commented on pull request #11916: ARROW-14506: [C++] Conda support for google-cloud-cpp

2022-01-24 Thread GitBox
github-actions[bot] commented on pull request #11916: URL: https://github.com/apache/arrow/pull/11916#issuecomment-1020106893 Revision: a34c8934c5313135a5f689125584b414ec1b4582 Submitted crossbow builds: [ursacomputing/crossbow @ actions-1471](https://github.com/ursacomputing/crossbo

[GitHub] [arrow] ursabot edited a comment on pull request #12192: ARROW-15373: [C++] Return unique_ptr from MemoryManager::AllocateBuffer

2022-01-24 Thread GitBox
ursabot edited a comment on pull request #12192: URL: https://github.com/apache/arrow/pull/12192#issuecomment-1020085049 Benchmark runs are scheduled for baseline = ae9915d3d3a30732cdc6c7904ba570c2b55f302e and contender = 512c5f1d454d21440f1ce1f8fa77cbb3b6cf8b42. 512c5f1d454d21440f1ce1f8f

[GitHub] [arrow] pitrou opened a new pull request #12236: ARROW-15423: [C++][Dev] Make GDB plugin auto-load friendly

2022-01-24 Thread GitBox
pitrou opened a new pull request #12236: URL: https://github.com/apache/arrow/pull/12236 When auto-loaded, a GDB plugin should attach itself to the specific objfile being debugged. This allows for potentially multiple inferiors being debugged at once, using different versions of Arr

[GitHub] [arrow] github-actions[bot] commented on pull request #12236: ARROW-15423: [C++][Dev] Make GDB plugin auto-load friendly

2022-01-24 Thread GitBox
github-actions[bot] commented on pull request #12236: URL: https://github.com/apache/arrow/pull/12236#issuecomment-1020114159 https://issues.apache.org/jira/browse/ARROW-15423

[GitHub] [arrow-cookbook] jorisvandenbossche commented on issue #129: Read encrypted parquet file from R

2022-01-24 Thread GitBox
jorisvandenbossche commented on issue #129: URL: https://github.com/apache/arrow-cookbook/issues/129#issuecomment-1020114056 Also note that this might not yet be supported? At least for Python, there is still work under way to expose this functionality more easily in Python, see https://gi

[GitHub] [arrow-datafusion] alamb commented on issue #1661: thread 'tokio-runtime-worker' panicked at 'not implemented: Take not supported for data type Decimal(18, 4)

2022-01-24 Thread GitBox
alamb commented on issue #1661: URL: https://github.com/apache/arrow-datafusion/issues/1661#issuecomment-1020115707 This will likely be fixed as soon as we can upgrade to arrow 8.0.0 (due to be released today): https://github.com/apache/arrow-rs/pull/1172 Sorry for the delay

[GitHub] [arrow] lidavidm commented on pull request #12160: ARROW-13467: [C++] Support delta dictionaries in the IPC file format

2022-01-24 Thread GitBox
lidavidm commented on pull request #12160: URL: https://github.com/apache/arrow/pull/12160#issuecomment-1020116300 It doesn't seem so, filed ARROW-15425.

[GitHub] [arrow] lidavidm commented on pull request #12160: ARROW-13467: [C++] Support delta dictionaries in the IPC file format

2022-01-24 Thread GitBox
lidavidm commented on pull request #12160: URL: https://github.com/apache/arrow/pull/12160#issuecomment-1020116938 (Also, sorry, do you mind updating the PR description for the eventual commit message?)

[GitHub] [arrow] jorisvandenbossche commented on a change in pull request #12231: ARROW-14783: [C++][Python] Fix the BytesIO support issue

2022-01-24 Thread GitBox
jorisvandenbossche commented on a change in pull request #12231: URL: https://github.com/apache/arrow/pull/12231#discussion_r790761904 ## File path: python/pyarrow/_orc.pyx ## @@ -388,8 +388,11 @@ cdef class ORCWriter(_Weakrefable): object sink unique_ptr[ORCF

[GitHub] [arrow] ursabot edited a comment on pull request #12229: ARROW-15414: [Java] Fix RangeEqualsVisitor for BitVector

2022-01-24 Thread GitBox
ursabot edited a comment on pull request #12229: URL: https://github.com/apache/arrow/pull/12229#issuecomment-1020104357 Benchmark runs are scheduled for baseline = 512c5f1d454d21440f1ce1f8fa77cbb3b6cf8b42 and contender = daa5c18e9697a6455a7a75fec19594543c17b21e. daa5c18e9697a6455a7a75fec

[GitHub] [arrow] jorisvandenbossche commented on a change in pull request #12231: ARROW-14783: [C++][Python] Fix the BytesIO support issue

2022-01-24 Thread GitBox
jorisvandenbossche commented on a change in pull request #12231: URL: https://github.com/apache/arrow/pull/12231#discussion_r790763779 ## File path: python/pyarrow/_orc.pyx ## @@ -403,6 +406,7 @@ cdef class ORCWriter(_Weakrefable): cdef: shared_ptr[WriteOp

[GitHub] [arrow-rs] alamb commented on pull request #1223: `DecimalArray` API ergonomics: add iter(), create from iter(), change precision / scale

2022-01-24 Thread GitBox
alamb commented on pull request #1223: URL: https://github.com/apache/arrow-rs/pull/1223#issuecomment-1020127246 > Do you want to refactor some codes like ... in this pull request? I was planning to do that in a follow on PR to keep this PR small and focused on the API changes.

[GitHub] [arrow] lidavidm opened a new pull request #12237: ARROW-15424: [C++][GLib] Fix CUDA bindings

2022-01-24 Thread GitBox
lidavidm opened a new pull request #12237: URL: https://github.com/apache/arrow/pull/12237 ARROW-15373 changed MemoryManager::AllocateBuffer to return unique_ptr; the GLib bindings needed updating for this, however. Also, tweaked the C++ since some compilers don't implicitly convert

[GitHub] [arrow] github-actions[bot] commented on pull request #12237: ARROW-15424: [C++][GLib] Fix CUDA bindings

2022-01-24 Thread GitBox
github-actions[bot] commented on pull request #12237: URL: https://github.com/apache/arrow/pull/12237#issuecomment-1020132465 https://issues.apache.org/jira/browse/ARROW-15424

[GitHub] [arrow] Crystrix commented on a change in pull request #12032: ARROW-15126: [C++] Support Null type as group keys

2022-01-24 Thread GitBox
Crystrix commented on a change in pull request #12032: URL: https://github.com/apache/arrow/pull/12032#discussion_r790772250 ## File path: cpp/src/arrow/compute/exec/key_hash.cc ## @@ -286,34 +286,40 @@ void Hashing::HashMultiColumn(const std::vector& col bool is_first = tr

[GitHub] [arrow] Crystrix commented on a change in pull request #12032: ARROW-15126: [C++] Support Null type as group keys

2022-01-24 Thread GitBox
Crystrix commented on a change in pull request #12032: URL: https://github.com/apache/arrow/pull/12032#discussion_r790772843 ## File path: cpp/src/arrow/compute/kernels/hash_aggregate.cc ## @@ -237,6 +245,9 @@ struct GrouperFastImpl : Grouper { } else if (is_binary_like(

[GitHub] [arrow] wesm commented on pull request #11730: ARROW-14745: [R] Enable true duckdb streaming

2022-01-24 Thread GitBox
wesm commented on pull request #11730: URL: https://github.com/apache/arrow/pull/11730#issuecomment-1020135023 I was looking at the Jira for this and it wasn't obvious to me what is meant by "true duckdb streaming" -- could you add a PR description to explain what this issue is about?

[GitHub] [arrow-datafusion] alamb commented on issue #1636: Provide RuntimeEnv to ExecutionContext

2022-01-24 Thread GitBox
alamb commented on issue #1636: URL: https://github.com/apache/arrow-datafusion/issues/1636#issuecomment-1020148505 I plan to work on this issue, with respect to https://github.com/influxdata/influxdb_iox/issues/3507

[GitHub] [arrow] jonkeane commented on pull request #11730: ARROW-14745: [R] Enable true duckdb streaming

2022-01-24 Thread GitBox
jonkeane commented on pull request #11730: URL: https://github.com/apache/arrow/pull/11730#issuecomment-1020151400 Yes, absolutely. I'm working on cleaning this up + will add a description and flag it as ready. The long story short is: there was an issue with DuckDB that was causing the co

[GitHub] [arrow] jorisvandenbossche commented on pull request #12178: ARROW-9664: [Python] Array/ChunkedArray.to_pandas do not support types_mapper keyword

2022-01-24 Thread GitBox
jorisvandenbossche commented on pull request #12178: URL: https://github.com/apache/arrow/pull/12178#issuecomment-1020153796 Yes, that's what I meant. So you need to test it with a function that will return None for a different type (such as the dict.get does), and then also do the `to_pan

[GitHub] [arrow-datafusion] gaojun2048 commented on pull request #1659: [Ballista] Add Decimal128, Date64, TimestampSecond, TimestampMillisecond, Interv…

2022-01-24 Thread GitBox
gaojun2048 commented on pull request #1659: URL: https://github.com/apache/arrow-datafusion/pull/1659#issuecomment-1020154898 Why did the Windows check fail?

[GitHub] [arrow-datafusion] gaojun2048 edited a comment on pull request #1659: [Ballista] Add Decimal128, Date64, TimestampSecond, TimestampMillisecond, Interv…

2022-01-24 Thread GitBox
gaojun2048 edited a comment on pull request #1659: URL: https://github.com/apache/arrow-datafusion/pull/1659#issuecomment-1020154898 Why did the Windows check fail? Is there any idea how to fix it?

[GitHub] [arrow] lidavidm commented on a change in pull request #11982: ARROW-15313: [C++][Java][FlightRPC] Implement type info method to flight-sql

2022-01-24 Thread GitBox
lidavidm commented on a change in pull request #11982: URL: https://github.com/apache/arrow/pull/11982#discussion_r790802336 ## File path: format/FlightSql.proto ## @@ -867,6 +867,153 @@ enum SqlSupportsConvert { SQL_CONVERT_VARCHAR = 19; } +enum SqlDatetimeSubcode { + S

[GitHub] [arrow-rs] codecov-commenter edited a comment on pull request #1234: Remove arrow array reader (#1197)

2022-01-24 Thread GitBox
codecov-commenter edited a comment on pull request #1234: URL: https://github.com/apache/arrow-rs/pull/1234#issuecomment-1019537831 # [Codecov](https://codecov.io/gh/apache/arrow-rs/pull/1234?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm

[GitHub] [arrow-cookbook] davisusanibar opened a new pull request #131: [Java]: WIP - Java cookbook for create arrow object

2022-01-24 Thread GitBox
davisusanibar opened a new pull request #131: URL: https://github.com/apache/arrow-cookbook/pull/131 Java cookbook for create arrow object
