[GitHub] [arrow] Dandandan commented on a change in pull request #8965: ARROW-10968: [Rust][DataFusion] Don't build hash table for right side of join

2020-12-19 Thread GitBox
Dandandan commented on a change in pull request #8965: URL: https://github.com/apache/arrow/pull/8965#discussion_r546209575 ## File path: rust/datafusion/src/physical_plan/hash_join.rs ## @@ -423,71 +419,87 @@ fn build_batch( // (1, 0) (1, 2) fn build_join_indexes( l

[GitHub] [arrow] jorgecarleitao commented on pull request #8960: ARROW-10540: [Rust] Extended filter kernel to all types and improved performance

2020-12-19 Thread GitBox
jorgecarleitao commented on pull request #8960: URL: https://github.com/apache/arrow/pull/8960#issuecomment-748441499 @yordan-pavlov Thanks for the feedback. All great points. > The performance degradation in the filter u8 is interesting - do you have a hypothesis for what's

[GitHub] [arrow] Dandandan commented on pull request #8960: ARROW-10540: [Rust] Extended filter kernel to all types and improved performance

2020-12-19 Thread GitBox
Dandandan commented on pull request #8960: URL: https://github.com/apache/arrow/pull/8960#issuecomment-748441517 I looked a bit at this PR, looks good, didn't find anything weird (but also don't know the details of this part of the code). Speed ups look great! Any idea where the filter u

[GitHub] [arrow] jorgecarleitao edited a comment on pull request #8960: ARROW-10540: [Rust] Extended filter kernel to all types and improved performance

2020-12-19 Thread GitBox
jorgecarleitao edited a comment on pull request #8960: URL: https://github.com/apache/arrow/pull/8960#issuecomment-748441499 @yordan-pavlov Thanks for the feedback. All great points. > The performance degradation in the filter u8 is interesting - do you have a hypothesis for

[GitHub] [arrow] jorgecarleitao commented on pull request #8960: ARROW-10540: [Rust] Extended filter kernel to all types and improved performance

2020-12-19 Thread GitBox
jorgecarleitao commented on pull request #8960: URL: https://github.com/apache/arrow/pull/8960#issuecomment-748443139 > Any idea where the filter u8 regression comes from? I tried to explain it in the comment above (there was a race condition here). Does it make sense?

[GitHub] [arrow] jorgecarleitao closed pull request #8929: ARROW-10914: [Rust] Refactor simd arithmetic kernels to use chunked iteration

2020-12-19 Thread GitBox
jorgecarleitao closed pull request #8929: URL: https://github.com/apache/arrow/pull/8929 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to g

[GitHub] [arrow] jorgecarleitao commented on pull request #8836: ARROW-10808: [Rust][DataFusion] Support nested expressions in aggregations.

2020-12-19 Thread GitBox
jorgecarleitao commented on pull request #8836: URL: https://github.com/apache/arrow/pull/8836#issuecomment-74826 This is breaking a test on macOS. I am not sure it is related. Could you rebase it so that I can safely merge it?

[GitHub] [arrow] jorgecarleitao commented on pull request #8942: ARROW-10946: [Rust] Simplified bit chunk iterator

2020-12-19 Thread GitBox
jorgecarleitao commented on pull request #8942: URL: https://github.com/apache/arrow/pull/8942#issuecomment-748445822 > Any (expected) change in performance? Nope, `slice.as_ptr()` is constant, and it is the only call that was changed in the iteration.

[GitHub] [arrow] mqy opened a new pull request #8967: Arrow 10967 optional env

2020-12-19 Thread GitBox
mqy opened a new pull request #8967: URL: https://github.com/apache/arrow/pull/8967 Two env vars, ARROW_TEST_DATA and PARQUET_TEST_DATA, must currently be set to run the tests and benchmarks. The main usage looks like this: ``` let testdata = std::env::var("PARQUET_TEST_DATA").expec
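The idea of making such a variable optional can be sketched as follows. This is a minimal illustration, not the code from the pull request: the helper names `resolve_dir` and `test_data_dir` and the fallback path used in the usage note are hypothetical.

```rust
use std::env;
use std::path::PathBuf;

/// Picks the directory from an optional override, falling back to a default.
/// An empty override is treated the same as an unset one.
/// (Hypothetical helper, not taken from the Arrow code base.)
fn resolve_dir(override_value: Option<String>, default_dir: &str) -> PathBuf {
    override_value
        .filter(|value| !value.is_empty())
        .map(PathBuf::from)
        .unwrap_or_else(|| PathBuf::from(default_dir))
}

/// Reads `var_name` from the environment instead of panicking via `expect`;
/// `default_dir` is used when the variable is unset or not valid Unicode.
fn test_data_dir(var_name: &str, default_dir: &str) -> PathBuf {
    resolve_dir(env::var(var_name).ok(), default_dir)
}
```

A test could then call, for example, `test_data_dir("PARQUET_TEST_DATA", "../testing/data")`, where the second argument is an assumed default location rather than the one chosen in the PR.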

[GitHub] [arrow] github-actions[bot] commented on pull request #8967: Arrow-10967: [Rust] Make env vars ARROW_TEST_DATA and PARQUET_TEST_DATA optional

2020-12-19 Thread GitBox
github-actions[bot] commented on pull request #8967: URL: https://github.com/apache/arrow/pull/8967#issuecomment-748457504 Thanks for opening a pull request! Could you open an issue for this pull request on JIRA? https://issues.apache.org/jira/browse/ARROW Then could

[GitHub] [arrow] github-actions[bot] commented on pull request #8967: ARROW-10967: [Rust] Make env vars optional

2020-12-19 Thread GitBox
github-actions[bot] commented on pull request #8967: URL: https://github.com/apache/arrow/pull/8967#issuecomment-748459379 https://issues.apache.org/jira/browse/ARROW-10967

[GitHub] [arrow] codecov-io commented on pull request #8967: ARROW-10967: [Rust] Make env vars optional

2020-12-19 Thread GitBox
codecov-io commented on pull request #8967: URL: https://github.com/apache/arrow/pull/8967#issuecomment-748460013 # [Codecov](https://codecov.io/gh/apache/arrow/pull/8967?src=pr&el=h1) Report > Merging [#8967](https://codecov.io/gh/apache/arrow/pull/8967?src=pr&el=desc) (ca83840) into

[GitHub] [arrow] rdettai commented on a change in pull request #8917: ARROW-9828: [Rust] [DataFusion] Support filter pushdown optimisation for TableProvider implementations

2020-12-19 Thread GitBox
rdettai commented on a change in pull request #8917: URL: https://github.com/apache/arrow/pull/8917#discussion_r546227882 ## File path: rust/datafusion/src/datasource/datasource.rs ## @@ -34,6 +35,23 @@ pub struct Statistics { pub total_byte_size: Option, } +/// Indicat

[GitHub] [arrow] rdettai commented on pull request #8917: ARROW-9828: [Rust] [DataFusion] Support filter pushdown optimisation for TableProvider implementations

2020-12-19 Thread GitBox
rdettai commented on pull request #8917: URL: https://github.com/apache/arrow/pull/8917#issuecomment-748462140 Nobody replied to this [comment](https://github.com/apache/arrow/pull/8917#discussion_r544337093). The question is whether it is worth doing the filter pruning at the logical plan

[GitHub] [arrow] codecov-io edited a comment on pull request #8796: [Rust] [Experiment] Vec vs current allocations

2020-12-19 Thread GitBox
codecov-io edited a comment on pull request #8796: URL: https://github.com/apache/arrow/pull/8796#issuecomment-743967914 # [Codecov](https://codecov.io/gh/apache/arrow/pull/8796?src=pr&el=h1) Report > Merging [#8796](https://codecov.io/gh/apache/arrow/pull/8796?src=pr&el=desc) (8a1c52c)

[GitHub] [arrow] Dandandan commented on a change in pull request #8965: ARROW-10968: [Rust][DataFusion] Don't build hash table for right side of join

2020-12-19 Thread GitBox
Dandandan commented on a change in pull request #8965: URL: https://github.com/apache/arrow/pull/8965#discussion_r546230466 ## File path: rust/datafusion/src/physical_plan/hash_join.rs ## @@ -423,71 +419,87 @@ fn build_batch( // (1, 0) (1, 2) fn build_join_indexes( l

[GitHub] [arrow] jorgecarleitao edited a comment on pull request #8796: [Rust] [Experiment] Vec vs current allocations

2020-12-19 Thread GitBox
jorgecarleitao edited a comment on pull request #8796: URL: https://github.com/apache/arrow/pull/8796#issuecomment-748470192 I have now rebased this against master. After @jhorstmann's fix for the out-of-bounds issue in #8954, it now runs correctly. Here are the results: # no SIMD

[GitHub] [arrow] jorgecarleitao commented on pull request #8796: [Rust] [Experiment] Vec vs current allocations

2020-12-19 Thread GitBox
jorgecarleitao commented on pull request #8796: URL: https://github.com/apache/arrow/pull/8796#issuecomment-748470192 I have now rebased this against master. After @jhorstmann's fix for the out-of-bounds issue in #8954, it now runs correctly. Here are the results: # no SIMD ```

[GitHub] [arrow] codecov-io edited a comment on pull request #8836: ARROW-10808: [Rust][DataFusion] Support nested expressions in aggregations.

2020-12-19 Thread GitBox
codecov-io edited a comment on pull request #8836: URL: https://github.com/apache/arrow/pull/8836#issuecomment-748150596 # [Codecov](https://codecov.io/gh/apache/arrow/pull/8836?src=pr&el=h1) Report > Merging [#8836](https://codecov.io/gh/apache/arrow/pull/8836?src=pr&el=desc) (1ea2abe)

[GitHub] [arrow] drusso commented on pull request #8836: ARROW-10808: [Rust][DataFusion] Support nested expressions in aggregations.

2020-12-19 Thread GitBox
drusso commented on pull request #8836: URL: https://github.com/apache/arrow/pull/8836#issuecomment-748474928 @jorgecarleitao: Updated, tests are now passing.

[GitHub] [arrow] kflansburg opened a new pull request #8968: ARROW-10979: [Rust] Basic Kafka Reader

2020-12-19 Thread GitBox
kflansburg opened a new pull request #8968: URL: https://github.com/apache/arrow/pull/8968 Introduce a basic Kafka reader based on `rdkafka`. Exposes an `Iterator` interface which yields `Result`. Columns in the batch are: - **key** (Binary, nullable): The key of a mes

[GitHub] [arrow] codecov-io commented on pull request #8968: ARROW-10979: [Rust] Basic Kafka Reader

2020-12-19 Thread GitBox
codecov-io commented on pull request #8968: URL: https://github.com/apache/arrow/pull/8968#issuecomment-748488430 # [Codecov](https://codecov.io/gh/apache/arrow/pull/8968?src=pr&el=h1) Report > Merging [#8968](https://codecov.io/gh/apache/arrow/pull/8968?src=pr&el=desc) (e087307) into

[GitHub] [arrow] mrkn commented on pull request #8919: ARROW-10604: [GLib][Ruby] Add support for 256-bit decimal

2020-12-19 Thread GitBox
mrkn commented on pull request #8919: URL: https://github.com/apache/arrow/pull/8919#issuecomment-748489044 @kou I released bigdecimal 2.0.3 and 3.0.0. Both versions support `BigDecimal#precision`.

[GitHub] [arrow] codecov-io edited a comment on pull request #8968: ARROW-10979: [Rust] Basic Kafka Reader

2020-12-19 Thread GitBox
codecov-io edited a comment on pull request #8968: URL: https://github.com/apache/arrow/pull/8968#issuecomment-748488430 # [Codecov](https://codecov.io/gh/apache/arrow/pull/8968?src=pr&el=h1) Report > Merging [#8968](https://codecov.io/gh/apache/arrow/pull/8968?src=pr&el=desc) (1659ac4)

[GitHub] [arrow] github-actions[bot] commented on pull request #8968: ARROW-10979: [Rust] Basic Kafka Reader

2020-12-19 Thread GitBox
github-actions[bot] commented on pull request #8968: URL: https://github.com/apache/arrow/pull/8968#issuecomment-748489267 https://issues.apache.org/jira/browse/ARROW-10979

[GitHub] [arrow] codecov-io edited a comment on pull request #8968: ARROW-10979: [Rust] Basic Kafka Reader

2020-12-19 Thread GitBox
codecov-io edited a comment on pull request #8968: URL: https://github.com/apache/arrow/pull/8968#issuecomment-748488430 # [Codecov](https://codecov.io/gh/apache/arrow/pull/8968?src=pr&el=h1) Report > Merging [#8968](https://codecov.io/gh/apache/arrow/pull/8968?src=pr&el=desc) (43c781f)

[GitHub] [arrow] andygrove opened a new pull request #8969: ARROW-10985: [Rust] Update unsafe guidelines for adding JIRA references

2020-12-19 Thread GitBox
andygrove opened a new pull request #8969: URL: https://github.com/apache/arrow/pull/8969 See changes for details

[GitHub] [arrow] github-actions[bot] commented on pull request #8969: ARROW-10985: [Rust] Update unsafe guidelines for adding JIRA references

2020-12-19 Thread GitBox
github-actions[bot] commented on pull request #8969: URL: https://github.com/apache/arrow/pull/8969#issuecomment-748491532 https://issues.apache.org/jira/browse/ARROW-10985

[GitHub] [arrow] Dandandan edited a comment on pull request #8961: ARROW-10885: [Rust][DataFusion] Optimize hash join build vs probe order based on number of rows

2020-12-19 Thread GitBox
Dandandan edited a comment on pull request #8961: URL: https://github.com/apache/arrow/pull/8961#issuecomment-748437288 I checked merging the other PR https://github.com/apache/arrow/pull/8965 which improves the join implementation. Besides being much faster regardless of this PR, r

[GitHub] [arrow] Dandandan commented on pull request #8961: ARROW-10885: [Rust][DataFusion] Optimize hash join build vs probe order based on number of rows

2020-12-19 Thread GitBox
Dandandan commented on pull request #8961: URL: https://github.com/apache/arrow/pull/8961#issuecomment-748492267 I wrote some details of the PRs for a planned blog post. https://docs.google.com/document/d/1Urxm34rl8DZ5D0vyhlrrBoZK6IHW7WFRN3hsaTfPujg/edit?usp=drivesdk

[GitHub] [arrow] andygrove commented on a change in pull request #8967: ARROW-10967: [Rust] Make env vars optional

2020-12-19 Thread GitBox
andygrove commented on a change in pull request #8967: URL: https://github.com/apache/arrow/pull/8967#discussion_r546254073 ## File path: rust/arrow/src/ipc/reader.rs ## @@ -939,12 +939,13 @@ mod tests { use flate2::read::GzDecoder; use crate::util::integration_util

[GitHub] [arrow] codecov-io commented on pull request #8969: ARROW-10985: [Rust] Update unsafe guidelines for adding JIRA references

2020-12-19 Thread GitBox
codecov-io commented on pull request #8969: URL: https://github.com/apache/arrow/pull/8969#issuecomment-748492682 # [Codecov](https://codecov.io/gh/apache/arrow/pull/8969?src=pr&el=h1) Report > Merging [#8969](https://codecov.io/gh/apache/arrow/pull/8969?src=pr&el=desc) (8ac55cc) into

[GitHub] [arrow] andygrove commented on pull request #8967: ARROW-10967: [Rust] Make env vars optional

2020-12-19 Thread GitBox
andygrove commented on pull request #8967: URL: https://github.com/apache/arrow/pull/8967#issuecomment-748492695 Thanks @mqy I like this approach.

[GitHub] [arrow] andygrove commented on a change in pull request #8968: ARROW-10979: [Rust] Basic Kafka Reader

2020-12-19 Thread GitBox
andygrove commented on a change in pull request #8968: URL: https://github.com/apache/arrow/pull/8968#discussion_r546254262 ## File path: rust/kafka/Cargo.toml ## @@ -0,0 +1,38 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agree

[GitHub] [arrow] andygrove commented on pull request #8968: ARROW-10979: [Rust] Basic Kafka Reader

2020-12-19 Thread GitBox
andygrove commented on pull request #8968: URL: https://github.com/apache/arrow/pull/8968#issuecomment-748493371 Thanks @kflansburg this is interesting. Would you mind explaining your use case for this? It has been a while since I used Kafka personally and I am wondering what the benefit is t

[GitHub] [arrow] andygrove commented on pull request #8966: ARROW-10969: [Rust][DataFusion] Implement basic String ANSI SQL Functions

2020-12-19 Thread GitBox
andygrove commented on pull request #8966: URL: https://github.com/apache/arrow/pull/8966#issuecomment-748493660 This looks great. Thanks @seddonm1

[GitHub] [arrow] andygrove edited a comment on pull request #8968: ARROW-10979: [Rust] Basic Kafka Reader

2020-12-19 Thread GitBox
andygrove edited a comment on pull request #8968: URL: https://github.com/apache/arrow/pull/8968#issuecomment-748493371 Thanks @kflansburg this is interesting. Would you mind explaining your use case for this? It has been a while since I used Kafka personally and I am wondering what the be

[GitHub] [arrow] waynexia commented on pull request #8856: ARROW-10940: [Rust] Extend sort kernel to ListArray

2020-12-19 Thread GitBox
waynexia commented on pull request #8856: URL: https://github.com/apache/arrow/pull/8856#issuecomment-748494258 > I suggest that we keep the list_sort in the sort kernel for now. Hi @nevi-me, thanks for your reply. From my understanding, `list_sort` is not implemented in this pr. To

[GitHub] [arrow] Dandandan edited a comment on pull request #8961: ARROW-10885: [Rust][DataFusion] Optimize hash join build vs probe order based on number of rows

2020-12-19 Thread GitBox
Dandandan edited a comment on pull request #8961: URL: https://github.com/apache/arrow/pull/8961#issuecomment-748437288 I checked merging the other PR https://github.com/apache/arrow/pull/8965 which improves the join implementation. Besides being much faster regardless of this PR, re

[GitHub] [arrow] kflansburg commented on pull request #8968: ARROW-10979: [Rust] Basic Kafka Reader

2020-12-19 Thread GitBox
kflansburg commented on pull request #8968: URL: https://github.com/apache/arrow/pull/8968#issuecomment-748495686 Hey @andygrove, this is definitely not a great use-case for Arrow since the format is not columnar, but I'm hoping to implement micro-batch style processing (possibly in DataFu

[GitHub] [arrow] nevi-me commented on pull request #8926: ARROW-8425: [Rust] [Parquet] Correct temporal IO

2020-12-19 Thread GitBox
nevi-me commented on pull request #8926: URL: https://github.com/apache/arrow/pull/8926#issuecomment-748497278 > I did not see any changes in the tests, which I would expect for a change in semantics. Are we not testing this yet, or how can I verify that it is now correct? There

[GitHub] [arrow] nevi-me commented on pull request #8856: ARROW-10940: [Rust] Extend sort kernel to ListArray

2020-12-19 Thread GitBox
nevi-me commented on pull request #8856: URL: https://github.com/apache/arrow/pull/8856#issuecomment-748498108 Hey @waynexia, I understand. It's effectively `array_sort` at https://github.com/nevi-me/rust-dataframe/blob/d4ae4ae6a2541b6625e702a350bb08efe42d6975/src/functions/array.rs#L328 ,

[GitHub] [arrow] codecov-io edited a comment on pull request #8968: ARROW-10979: [Rust] Basic Kafka Reader

2020-12-19 Thread GitBox
codecov-io edited a comment on pull request #8968: URL: https://github.com/apache/arrow/pull/8968#issuecomment-748488430 # [Codecov](https://codecov.io/gh/apache/arrow/pull/8968?src=pr&el=h1) Report > Merging [#8968](https://codecov.io/gh/apache/arrow/pull/8968?src=pr&el=desc) (e0401e7)

[GitHub] [arrow] nevi-me commented on pull request #8968: ARROW-10979: [Rust] Basic Kafka Reader

2020-12-19 Thread GitBox
nevi-me commented on pull request #8968: URL: https://github.com/apache/arrow/pull/8968#issuecomment-748500486 Hi @kflansburg this is some great work. I've just gone through the code briefly. > I really like your idea of using Kafka as a transport layer for Arrow Flight messages.

[GitHub] [arrow] kflansburg commented on pull request #8968: ARROW-10979: [Rust] Basic Kafka Reader

2020-12-19 Thread GitBox
kflansburg commented on pull request #8968: URL: https://github.com/apache/arrow/pull/8968#issuecomment-748502283 > Our JSON reader already has the building blocks needed to trivially do this, and after #8938, you should be able to read all nested JSON types. Great, thanks for the ti

[GitHub] [arrow] kflansburg commented on pull request #8968: ARROW-10979: [Rust] Basic Kafka Reader

2020-12-19 Thread GitBox
kflansburg commented on pull request #8968: URL: https://github.com/apache/arrow/pull/8968#issuecomment-748502960 > I'd be interested in seeing how we could go about with implementing this. Giving this some thought, I think we can have configuration fields that indicate the keys and/

[GitHub] [arrow] kflansburg edited a comment on pull request #8968: ARROW-10979: [Rust] Basic Kafka Reader

2020-12-19 Thread GitBox
kflansburg edited a comment on pull request #8968: URL: https://github.com/apache/arrow/pull/8968#issuecomment-748502960 > I'd be interested in seeing how we could go about with implementing this. Giving this some thought, I think we can have configuration fields that indicate the ke

[GitHub] [arrow] nevi-me commented on pull request #8968: ARROW-10979: [Rust] Basic Kafka Reader

2020-12-19 Thread GitBox
nevi-me commented on pull request #8968: URL: https://github.com/apache/arrow/pull/8968#issuecomment-748503983 > The stretch goal here could be support for integration with a schema registry, but I haven't worked much with that. An interesting challenge would be how to then structure

[GitHub] [arrow] nevi-me edited a comment on pull request #8968: ARROW-10979: [Rust] Basic Kafka Reader

2020-12-19 Thread GitBox
nevi-me edited a comment on pull request #8968: URL: https://github.com/apache/arrow/pull/8968#issuecomment-748503983 > The stretch goal here could be support for integration with a schema registry, but I haven't worked much with that. > The only concern I have is with inconsistent s

[GitHub] [arrow] kflansburg commented on pull request #8968: ARROW-10979: [Rust] Basic Kafka Reader

2020-12-19 Thread GitBox
kflansburg commented on pull request #8968: URL: https://github.com/apache/arrow/pull/8968#issuecomment-748504940 I definitely want to support subscribing to multiple topics; it's often the case that multiple topics share the same schema. My concern is that the full Schema may not be possib

[GitHub] [arrow] kflansburg edited a comment on pull request #8968: ARROW-10979: [Rust] Basic Kafka Reader

2020-12-19 Thread GitBox
kflansburg edited a comment on pull request #8968: URL: https://github.com/apache/arrow/pull/8968#issuecomment-748504940 I definitely want to support subscribing to multiple topics; it's often the case that multiple topics share the same schema. My concern is that the full Schema may not be

[GitHub] [arrow] codecov-io edited a comment on pull request #8926: ARROW-8425: [Rust] [Parquet] Correct temporal IO

2020-12-19 Thread GitBox
codecov-io edited a comment on pull request #8926: URL: https://github.com/apache/arrow/pull/8926#issuecomment-745273384 # [Codecov](https://codecov.io/gh/apache/arrow/pull/8926?src=pr&el=h1) Report > Merging [#8926](https://codecov.io/gh/apache/arrow/pull/8926?src=pr&el=desc) (c760e88)

[GitHub] [arrow] codecov-io edited a comment on pull request #8968: ARROW-10979: [Rust] Basic Kafka Reader

2020-12-19 Thread GitBox
codecov-io edited a comment on pull request #8968: URL: https://github.com/apache/arrow/pull/8968#issuecomment-748488430 # [Codecov](https://codecov.io/gh/apache/arrow/pull/8968?src=pr&el=h1) Report > Merging [#8968](https://codecov.io/gh/apache/arrow/pull/8968?src=pr&el=desc) (4807976)

[GitHub] [arrow] kflansburg commented on pull request #8968: ARROW-10979: [Rust] Basic Kafka Reader

2020-12-19 Thread GitBox
kflansburg commented on pull request #8968: URL: https://github.com/apache/arrow/pull/8968#issuecomment-748507370 Switching to `cmake` appears to have resolved the Windows build issue, but `cmake` is not installed for AMD64 Debian 10. I'm not sure what the error on the `Dev / Source

[GitHub] [arrow] Dandandan opened a new pull request #8970: ARROW-10986: [Rust][DataFusion] Add average stats to TPC-H benchmarks

2020-12-19 Thread GitBox
Dandandan opened a new pull request #8970: URL: https://github.com/apache/arrow/pull/8970 The tool now outputs average statistics across all iterations. Timing data is also collected with more precision. Output: ``` Query 12 iteration 0 took 1076.5 ms Query 12 iteration 1 took 1
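Averaging per-iteration timings with sub-millisecond precision could look roughly like the sketch below. It is an illustration under assumptions, not the benchmark's actual code: the function name `average_ms` is hypothetical, and the timings are assumed to be collected as `std::time::Duration` values.

```rust
use std::time::Duration;

/// Returns the mean iteration time in milliseconds, or `None` when no
/// iterations were recorded. (Hypothetical helper for illustration.)
fn average_ms(iterations: &[Duration]) -> Option<f64> {
    if iterations.is_empty() {
        return None;
    }
    // Convert through f64 seconds so fractions of a millisecond are kept.
    let total_ms: f64 = iterations
        .iter()
        .map(|elapsed| elapsed.as_secs_f64() * 1000.0)
        .sum();
    Some(total_ms / iterations.len() as f64)
}
```

Formatting the result with `{:.1}` would give the same one-decimal style as the per-iteration lines quoted above.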

[GitHub] [arrow] codecov-io commented on pull request #8970: ARROW-10986: [Rust][DataFusion] Add average stats to TPC-H benchmarks

2020-12-19 Thread GitBox
codecov-io commented on pull request #8970: URL: https://github.com/apache/arrow/pull/8970#issuecomment-748509541 # [Codecov](https://codecov.io/gh/apache/arrow/pull/8970?src=pr&el=h1) Report > Merging [#8970](https://codecov.io/gh/apache/arrow/pull/8970?src=pr&el=desc) (32cc9c3) into

[GitHub] [arrow] github-actions[bot] commented on pull request #8970: ARROW-10986: [Rust][DataFusion] Add average stats to TPC-H benchmarks

2020-12-19 Thread GitBox
github-actions[bot] commented on pull request #8970: URL: https://github.com/apache/arrow/pull/8970#issuecomment-748509604 https://issues.apache.org/jira/browse/ARROW-10986

[GitHub] [arrow] codecov-io edited a comment on pull request #8970: ARROW-10986: [Rust][DataFusion] Add average stats to TPC-H benchmarks

2020-12-19 Thread GitBox
codecov-io edited a comment on pull request #8970: URL: https://github.com/apache/arrow/pull/8970#issuecomment-748509541 # [Codecov](https://codecov.io/gh/apache/arrow/pull/8970?src=pr&el=h1) Report > Merging [#8970](https://codecov.io/gh/apache/arrow/pull/8970?src=pr&el=desc) (007e48c)

[GitHub] [arrow] Dandandan commented on pull request #8960: ARROW-10540: [Rust] Extended filter kernel to all types and improved performance

2020-12-19 Thread GitBox
Dandandan commented on pull request #8960: URL: https://github.com/apache/arrow/pull/8960#issuecomment-748516153 Makes sense @jorgecarleitao thanks

[GitHub] [arrow] alamb commented on pull request #8836: ARROW-10808: [Rust][DataFusion] Support nested expressions in aggregations.

2020-12-19 Thread GitBox
alamb commented on pull request #8836: URL: https://github.com/apache/arrow/pull/8836#issuecomment-748523022 Everything looks good from here -- I also skimmed the PR again but I didn't go through it in detail. Thanks again @drusso

[GitHub] [arrow] alamb closed pull request #8836: ARROW-10808: [Rust][DataFusion] Support nested expressions in aggregations.

2020-12-19 Thread GitBox
alamb closed pull request #8836: URL: https://github.com/apache/arrow/pull/8836

[GitHub] [arrow] alamb commented on pull request #8953: ARROW-10952: [Rust] Add pre-commit hook

2020-12-19 Thread GitBox
alamb commented on pull request #8953: URL: https://github.com/apache/arrow/pull/8953#issuecomment-748525420 I think this is looking good @mqy -- thanks for the doc updates. Given it is an opt in and well documented, I am merging this in and we can iterate in subsequent PRs.

[GitHub] [arrow] alamb closed pull request #8953: ARROW-10952: [Rust] Add pre-commit hook

2020-12-19 Thread GitBox
alamb closed pull request #8953: URL: https://github.com/apache/arrow/pull/8953

[GitHub] [arrow] yordan-pavlov commented on pull request #8960: ARROW-10540: [Rust] Extended filter kernel to all types and improved performance

2020-12-19 Thread GitBox
yordan-pavlov commented on pull request #8960: URL: https://github.com/apache/arrow/pull/8960#issuecomment-748525543 @jorgecarleitao thanks for the detailed explanation - it's great to see you have thought about optimizing the filtering of both single and multiple columns as much as possib

[GitHub] [arrow] alamb commented on pull request #8936: ARROW-10938: [Rust] upgrade dependency "flatbuffers" to 0.8

2020-12-19 Thread GitBox
alamb commented on pull request #8936: URL: https://github.com/apache/arrow/pull/8936#issuecomment-748525845 Thanks @mqy -- I'll try and look at this carefully tomorrow.

[GitHub] [arrow] codecov-io edited a comment on pull request #8946: ARROW-10944: [Rust] Implement min/max aggregate kernels for BooleanArray

2020-12-19 Thread GitBox
codecov-io edited a comment on pull request #8946: URL: https://github.com/apache/arrow/pull/8946#issuecomment-747076330 # [Codecov](https://codecov.io/gh/apache/arrow/pull/8946?src=pr&el=h1) Report > Merging [#8946](https://codecov.io/gh/apache/arrow/pull/8946?src=pr&el=desc) (d24c6f1)

[GitHub] [arrow] kou commented on pull request #8919: ARROW-10604: [GLib][Ruby] Add support for 256-bit decimal

2020-12-19 Thread GitBox
kou commented on pull request #8919: URL: https://github.com/apache/arrow/pull/8919#issuecomment-748526618 Thanks. I've changed to use bigdecimal 2.0.3 or later.

[GitHub] [arrow] alamb commented on pull request #8948: ARROW-10943: WIP: HACK! Test bool miri bugs

2020-12-19 Thread GitBox
alamb commented on pull request #8948: URL: https://github.com/apache/arrow/pull/8948#issuecomment-748528180 I found that by running the parquet tests in a loop I can reproduce the failure on my machine locally [more details in JIRA](https://issues.apache.org/jira/browse/ARROW-10943?focuse

[GitHub] [arrow] alamb commented on a change in pull request #8917: ARROW-9828: [Rust] [DataFusion] Support filter pushdown optimisation for TableProvider implementations

2020-12-19 Thread GitBox
alamb commented on a change in pull request #8917: URL: https://github.com/apache/arrow/pull/8917#discussion_r546286050 ## File path: rust/datafusion/src/datasource/datasource.rs ## @@ -48,9 +66,19 @@ pub trait TableProvider { &self, projection: &Option>,

[GitHub] [arrow] alamb closed pull request #8917: ARROW-9828: [Rust] [DataFusion] Support filter pushdown optimisation for TableProvider implementations

2020-12-19 Thread GitBox
alamb closed pull request #8917: URL: https://github.com/apache/arrow/pull/8917

[GitHub] [arrow] alamb commented on pull request #8917: ARROW-9828: [Rust] [DataFusion] Support filter pushdown optimisation for TableProvider implementations

2020-12-19 Thread GitBox
alamb commented on pull request #8917: URL: https://github.com/apache/arrow/pull/8917#issuecomment-748529089 Thanks again @returnString -- @rdettai I believe I understand your concern -- and we can always enhance this API / trait to do pruning at the physical level as well.

[GitHub] [arrow] kflansburg opened a new pull request #8971: ARROW-10987: [Rust] Interpret BinaryArray as JSON

2020-12-19 Thread GitBox
kflansburg opened a new pull request #8971: URL: https://github.com/apache/arrow/pull/8971 Create a lightweight wrapper, `JSONArray`, to interpret `BinaryArray` values as serialized JSON. Leverage recent work on inferring JSON schemas to support conversion to `StructArray`. Example:

[GitHub] [arrow] github-actions[bot] commented on pull request #8971: ARROW-10987: [Rust] Interpret BinaryArray as JSON

2020-12-19 Thread GitBox
github-actions[bot] commented on pull request #8971: URL: https://github.com/apache/arrow/pull/8971#issuecomment-748530337 https://issues.apache.org/jira/browse/ARROW-10987

[GitHub] [arrow] codecov-io commented on pull request #8971: ARROW-10987: [Rust] Interpret BinaryArray as JSON

2020-12-19 Thread GitBox
codecov-io commented on pull request #8971: URL: https://github.com/apache/arrow/pull/8971#issuecomment-748530908 # [Codecov](https://codecov.io/gh/apache/arrow/pull/8971?src=pr&el=h1) Report > Merging [#8971](https://codecov.io/gh/apache/arrow/pull/8971?src=pr&el=desc) (b6d4cfb) into

[GitHub] [arrow] kou opened a new pull request #8972: ARROW-10988: [C++] Require CMake 3.5 or later

2020-12-19 Thread GitBox
kou opened a new pull request #8972: URL: https://github.com/apache/arrow/pull/8972 Because Ubuntu 16.04 ships CMake 3.5.1.

[GitHub] [arrow] github-actions[bot] commented on pull request #8972: ARROW-10988: [C++] Require CMake 3.5 or later

2020-12-19 Thread GitBox
github-actions[bot] commented on pull request #8972: URL: https://github.com/apache/arrow/pull/8972#issuecomment-748536366 https://issues.apache.org/jira/browse/ARROW-10988

[GitHub] [arrow] kou commented on pull request #8972: ARROW-10988: [C++] Require CMake 3.5 or later

2020-12-19 Thread GitBox
kou commented on pull request #8972: URL: https://github.com/apache/arrow/pull/8972#issuecomment-748538646 @github-actions crossbow submit -g nightly

[GitHub] [arrow] github-actions[bot] commented on pull request #8972: ARROW-10988: [C++] Require CMake 3.5 or later

2020-12-19 Thread GitBox
github-actions[bot] commented on pull request #8972: URL: https://github.com/apache/arrow/pull/8972#issuecomment-748538790 Revision: 9b061a9d05a7040852f4999145515859e48af22b Submitted crossbow builds: [ursa-labs/crossbow @ actions-818](https://github.com/ursa-labs/crossbow/branches/a

[GitHub] [arrow] codecov-io edited a comment on pull request #8971: ARROW-10987: [Rust] Interpret BinaryArray as JSON

2020-12-19 Thread GitBox
codecov-io edited a comment on pull request #8971: URL: https://github.com/apache/arrow/pull/8971#issuecomment-748530908 # [Codecov](https://codecov.io/gh/apache/arrow/pull/8971?src=pr&el=h1) Report > Merging [#8971](https://codecov.io/gh/apache/arrow/pull/8971?src=pr&el=desc) (45bc767)

[GitHub] [arrow] andygrove commented on pull request #8969: ARROW-10985: [Rust] Update unsafe guidelines for adding JIRA references

2020-12-19 Thread GitBox
andygrove commented on pull request #8969: URL: https://github.com/apache/arrow/pull/8969#issuecomment-748541706 I filed a bunch of JIRAs under https://issues.apache.org/jira/browse/ARROW-10888 to go write these docs for existing usage. ![subtasks-unsafe](https://user-images.github

[GitHub] [arrow] seddonm1 commented on a change in pull request #8966: ARROW-10969: [Rust][DataFusion] Implement basic String ANSI SQL Functions

2020-12-19 Thread GitBox
seddonm1 commented on a change in pull request #8966: URL: https://github.com/apache/arrow/pull/8966#discussion_r546302442 ## File path: rust/datafusion/src/physical_plan/string_expressions.rs ## @@ -66,3 +71,73 @@ pub fn concatenate(args: &[ArrayRef]) -> Result { }

[GitHub] [arrow] seddonm1 commented on a change in pull request #8966: ARROW-10969: [Rust][DataFusion] Implement basic String ANSI SQL Functions

2020-12-19 Thread GitBox
seddonm1 commented on a change in pull request #8966: URL: https://github.com/apache/arrow/pull/8966#discussion_r546302507 ## File path: rust/datafusion/src/physical_plan/functions.rs ## @@ -280,6 +309,18 @@ fn signature(fun: &BuiltinScalarFunction) -> Signature {

[GitHub] [arrow] codecov-io commented on pull request #8966: ARROW-10969: [Rust][DataFusion] Implement basic String ANSI SQL Functions

2020-12-19 Thread GitBox
codecov-io commented on pull request #8966: URL: https://github.com/apache/arrow/pull/8966#issuecomment-748546622 # [Codecov](https://codecov.io/gh/apache/arrow/pull/8966?src=pr&el=h1) Report > Merging [#8966](https://codecov.io/gh/apache/arrow/pull/8966?src=pr&el=desc) (8267501) into

[GitHub] [arrow] seddonm1 commented on a change in pull request #8966: ARROW-10969: [Rust][DataFusion] Implement basic String ANSI SQL Functions

2020-12-19 Thread GitBox
seddonm1 commented on a change in pull request #8966: URL: https://github.com/apache/arrow/pull/8966#discussion_r546304152 ## File path: rust/datafusion/tests/sql.rs ## @@ -1826,3 +1826,21 @@ async fn csv_between_expr_negated() -> Result<()> { assert_eq!(expected, actual);

[GitHub] [arrow] seddonm1 commented on a change in pull request #8966: ARROW-10969: [Rust][DataFusion] Implement basic String ANSI SQL Functions

2020-12-19 Thread GitBox
seddonm1 commented on a change in pull request #8966: URL: https://github.com/apache/arrow/pull/8966#discussion_r546304202 ## File path: rust/datafusion/src/physical_plan/string_expressions.rs ## @@ -66,3 +71,73 @@ pub fn concatenate(args: &[ArrayRef]) -> Result { }

[GitHub] [arrow] seddonm1 commented on pull request #8966: ARROW-10969: [Rust][DataFusion] Implement basic String ANSI SQL Functions

2020-12-19 Thread GitBox
seddonm1 commented on pull request #8966: URL: https://github.com/apache/arrow/pull/8966#issuecomment-748547812 @jorgecarleitao Thanks for your comments (they really help me learn) and have done a major refactor. Please pay close attention to the comments here: https://github.com/a

[GitHub] [arrow] codecov-io edited a comment on pull request #8971: ARROW-10987: [Rust] Interpret BinaryArray as JSON

2020-12-19 Thread GitBox
codecov-io edited a comment on pull request #8971: URL: https://github.com/apache/arrow/pull/8971#issuecomment-748530908 # [Codecov](https://codecov.io/gh/apache/arrow/pull/8971?src=pr&el=h1) Report > Merging [#8971](https://codecov.io/gh/apache/arrow/pull/8971?src=pr&el=desc) (93aa609)

[GitHub] [arrow] tyrelr opened a new pull request #8973: Iterate primitive buffers by slice

2020-12-19 Thread GitBox
tyrelr opened a new pull request #8973: URL: https://github.com/apache/arrow/pull/8973 Iterating slices instead of indexes seems to improve performance of non-simd arithmetic operations. This adds a new method raw_values_slice to PrimitiveArray (so named to pun off of the raw_valu

[GitHub] [arrow] github-actions[bot] commented on pull request #8973: [Rust] Iterate primitive buffers by slice

2020-12-19 Thread GitBox
github-actions[bot] commented on pull request #8973: URL: https://github.com/apache/arrow/pull/8973#issuecomment-748558702 Thanks for opening a pull request! Could you open an issue for this pull request on JIRA? https://issues.apache.org/jira/browse/ARROW Then could

[GitHub] [arrow] codecov-io commented on pull request #8973: ARROW-10989: [Rust] Iterate primitive buffers by slice

2020-12-19 Thread GitBox
codecov-io commented on pull request #8973: URL: https://github.com/apache/arrow/pull/8973#issuecomment-748559191 # [Codecov](https://codecov.io/gh/apache/arrow/pull/8973?src=pr&el=h1) Report > Merging [#8973](https://codecov.io/gh/apache/arrow/pull/8973?src=pr&el=desc) (4e06899) into

[GitHub] [arrow] github-actions[bot] commented on pull request #8973: ARROW-10989: [Rust] Iterate primitive buffers by slice

2020-12-19 Thread GitBox
github-actions[bot] commented on pull request #8973: URL: https://github.com/apache/arrow/pull/8973#issuecomment-748559518 https://issues.apache.org/jira/browse/ARROW-10989

[GitHub] [arrow] waynexia commented on pull request #8856: ARROW-10940: [Rust] Extend sort kernel to ListArray

2020-12-19 Thread GitBox
waynexia commented on pull request #8856: URL: https://github.com/apache/arrow/pull/8856#issuecomment-748562551 Yes, I think ARROW-10355 requests something like that. But I didn't understand it clearly at the beginning and made this "wrong" PR :stuck_out_tongue:

[GitHub] [arrow] mqy commented on a change in pull request #8967: ARROW-10967: [Rust] Make env vars optional

2020-12-19 Thread GitBox
mqy commented on a change in pull request #8967: URL: https://github.com/apache/arrow/pull/8967#discussion_r546324360 ## File path: rust/arrow/src/ipc/reader.rs ## @@ -939,12 +939,13 @@ mod tests { use flate2::read::GzDecoder; use crate::util::integration_util::*; -

[GitHub] [arrow] kiszk commented on pull request #8210: ARROW-10031: [CI][Java] Support Java benchmark in Ursabot

2020-12-19 Thread GitBox
kiszk commented on pull request #8210: URL: https://github.com/apache/arrow/pull/8210#issuecomment-748568948 ping @liyafan82 @fsaintjacques @kszucs

[GitHub] [arrow] kiszk commented on a change in pull request #8955: ARROW-9948: [C++] in Decimal128::FromString raise when scale is out of bounds

2020-12-19 Thread GitBox
kiszk commented on a change in pull request #8955: URL: https://github.com/apache/arrow/pull/8955#discussion_r546328882 ## File path: cpp/src/arrow/util/basic_decimal.cc ## @@ -949,33 +949,31 @@ DecimalStatus BasicDecimal128::Rescale(int32_t original_scale, int32_t new_scale
