[GitHub] [arrow-rs] HaoYang670 opened a new pull request, #2277: Fix bugs in the `from_list` function.

2022-08-01 Thread GitBox
HaoYang670 opened a new pull request, #2277: URL: https://github.com/apache/arrow-rs/pull/2277 # Which issue does this PR close? Closes #1726. # Rationale for this change # What changes are included in this PR? We have the `from_list` (or `from_fixed_size_list`

[GitHub] [arrow] drin commented on pull request #13487: ARROW-8991: [C++] Add new scalar compute function

2022-08-01 Thread GitBox
drin commented on PR #13487: URL: https://github.com/apache/arrow/pull/13487#issuecomment-1202055532 sorry, renaming to match the PR back to ARROW-8991. Since I am already trying to accommodate 64-bit, it didn't make sense to keep the smaller scope that the sub-task was meant to capture. I'

[GitHub] [arrow] github-actions[bot] commented on pull request #13775: ARROW-17280: [C++] Move vendored flatbuffers to private namespace

2022-08-01 Thread GitBox
github-actions[bot] commented on PR #13775: URL: https://github.com/apache/arrow/pull/13775#issuecomment-1202044495 :warning: Ticket **has not been started in JIRA**, please click 'Start Progress'. -- This is an automated message from the Apache Git Service. To respond to the message, ple

[GitHub] [arrow] github-actions[bot] commented on pull request #13775: ARROW-17280: [C++] Move vendored flatbuffers to private namespace

2022-08-01 Thread GitBox
github-actions[bot] commented on PR #13775: URL: https://github.com/apache/arrow/pull/13775#issuecomment-1202044479 https://issues.apache.org/jira/browse/ARROW-17280 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the

[GitHub] [arrow] github-actions[bot] commented on pull request #13775: [C++] Move vendored flatbuffers to private namespace

2022-08-01 Thread GitBox
github-actions[bot] commented on PR #13775: URL: https://github.com/apache/arrow/pull/13775#issuecomment-1202041382 Thanks for opening a pull request! If this is not a [minor PR](https://github.com/apache/arrow/blob/master/CONTRIBUTING.md#Minor-Fixes). Could you open an issue

[GitHub] [arrow] kosak opened a new pull request, #13775: [C++] Move vendored flatbuffers to private namespace

2022-08-01 Thread GitBox
kosak opened a new pull request, #13775: URL: https://github.com/apache/arrow/pull/13775 When a user's C++ program links to both Arrow and an installation of the Flatbuffers library, the program can crash or send corrupt Arrow messages. The reason for this is version incompatibilit

[GitHub] [arrow-rs] tustvold opened a new issue, #2276: Add unpack8, unpack16 and unpack64

2022-08-01 Thread GitBox
tustvold opened a new issue, #2276: URL: https://github.com/apache/arrow-rs/issues/2276 **Is your feature request related to a problem or challenge? Please describe what you are trying to do.** Currently we have vectorised unpacking of 32-bit integers. We then extend or truncate

[GitHub] [arrow-rs] ursabot commented on pull request #2268: Update prost and tonic related crates

2022-08-01 Thread GitBox
ursabot commented on PR #2268: URL: https://github.com/apache/arrow-rs/pull/2268#issuecomment-1202024899 Benchmark runs are scheduled for baseline = b4fa47d9c8323e45985563e2bd1478aa1a23639e and contender = cd45ecbfd4dc87ee62b6b71a174bedbd282d0e4f. cd45ecbfd4dc87ee62b6b71a174bedbd282d0e4f i

[GitHub] [arrow-rs] tustvold merged pull request #2268: Update prost and tonic related crates

2022-08-01 Thread GitBox
tustvold merged PR #2268: URL: https://github.com/apache/arrow-rs/pull/2268 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@arrow.apa

[GitHub] [arrow-datafusion] codecov-commenter commented on pull request #3010: feat: Add support for TIME literal values

2022-08-01 Thread GitBox
codecov-commenter commented on PR #3010: URL: https://github.com/apache/arrow-datafusion/pull/3010#issuecomment-1202017034 # [Codecov](https://codecov.io/gh/apache/arrow-datafusion/pull/3010?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_

[GitHub] [arrow-datafusion] stuartcarnie commented on pull request #3007: Upgrade arrow

2022-08-01 Thread GitBox
stuartcarnie commented on PR #3007: URL: https://github.com/apache/arrow-datafusion/pull/3007#issuecomment-1202001744 👋🏻 hi @avantgardnerio I was able to successfully upgrade `arrow-rs` via this draft PR for which all DataFusion tests pass: https://github.com/apache/arrow-datafusion/pull/3

[GitHub] [arrow-datafusion] stuartcarnie opened a new pull request, #3010: feat: Add support for TIME literal values

2022-08-01 Thread GitBox
stuartcarnie opened a new pull request, #3010: URL: https://github.com/apache/arrow-datafusion/pull/3010 This is a **draft PR**, as it depends on an upgraded arrow-rs crate which contains breaking changes. This PR includes commits to resolve the changes and all tests pass, however, more tes

[GitHub] [arrow-rs] viirya commented on a diff in pull request #2275: fix: use signed comparator to compare decimal128 and decimal256

2022-08-01 Thread GitBox
viirya commented on code in PR #2275: URL: https://github.com/apache/arrow-rs/pull/2275#discussion_r935087546 ## arrow/src/util/decimal.rs: ## @@ -245,6 +245,49 @@ macro_rules! def_decimal { }; } +// compare two signed integer which are encoded with little endian. +// le
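
The review excerpt above is cut off, so the PR's actual helper is not shown here. As a rough illustration of the idea in the quoted comment (comparing signed integers stored in little-endian byte order), here is a minimal sketch. The function name `compare_signed_le` is hypothetical and is not taken from arrow-rs. It assumes two's-complement encoding and operands of equal width.

```rust
use std::cmp::Ordering;

/// Hypothetical helper (not the arrow-rs implementation): compare two
/// equal-length byte slices as signed two's-complement integers stored
/// in little-endian order.
fn compare_signed_le(a: &[u8], b: &[u8]) -> Ordering {
    assert_eq!(a.len(), b.len(), "operands must have the same width");
    // Little-endian puts the most significant byte last, so walk backwards.
    let mut pairs = a.iter().rev().zip(b.iter().rev());
    // The top byte carries the sign bit, so it is compared as i8.
    if let Some((&x, &y)) = pairs.next() {
        match (x as i8).cmp(&(y as i8)) {
            Ordering::Equal => {}
            other => return other,
        }
    }
    // Once the top bytes match, the remaining bytes order as unsigned digits.
    for (&x, &y) in pairs {
        match x.cmp(&y) {
            Ordering::Equal => {}
            other => return other,
        }
    }
    Ordering::Equal
}
```

A plain unsigned byte-wise comparison would rank every negative value above every positive one, because the sign bit makes the top byte large. Treating only the top byte as signed avoids that.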

[GitHub] [arrow] mopcup commented on a diff in pull request #13690: ARROW-17088: [R] Use `.arrow` as extension of IPC files of datasets

2022-08-01 Thread GitBox
mopcup commented on code in PR #13690: URL: https://github.com/apache/arrow/pull/13690#discussion_r935087296 ## r/tests/testthat/test-dataset-write.R: ## @@ -470,7 +502,7 @@ test_that("Dataset writing: unsupported features/input validation", { write_dataset(ds, tempfile(),

[GitHub] [arrow] mopcup commented on a diff in pull request #13690: ARROW-17088: [R] Use `.arrow` as extension of IPC files of datasets

2022-08-01 Thread GitBox
mopcup commented on code in PR #13690: URL: https://github.com/apache/arrow/pull/13690#discussion_r935086860 ## r/tests/testthat/test-dataset-write.R: ## @@ -139,6 +139,36 @@ test_that("Writing a dataset: Parquet->Parquet (default)", { ) }) +test_that("Writing a dataset:

[GitHub] [arrow-rs] codecov-commenter commented on pull request #2275: fix: use signed comparator to compare decimal128 and decimal256

2022-08-01 Thread GitBox
codecov-commenter commented on PR #2275: URL: https://github.com/apache/arrow-rs/pull/2275#issuecomment-1201972606 # [Codecov](https://codecov.io/gh/apache/arrow-rs/pull/2275?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+S

[GitHub] [arrow] ursabot commented on pull request #13766: ARROW-17270: [Docs] Move nightly package instructions to dev docs

2022-08-01 Thread GitBox
ursabot commented on PR #13766: URL: https://github.com/apache/arrow/pull/13766#issuecomment-1201970398 Benchmark runs are scheduled for baseline = e80981ca2a7b8c43f61f41b55842423b8295d633 and contender = 48e27804296233a9ab90b6096e291046ff2db6f3. 48e27804296233a9ab90b6096e291046ff2db6f3 is

[GitHub] [arrow-rs] liukun4515 opened a new pull request, #2275: fix: use signed comparator to compare decimal128 and decimal256

2022-08-01 Thread GitBox
liukun4515 opened a new pull request, #2275: URL: https://github.com/apache/arrow-rs/pull/2275 # Which issue does this PR close? Closes #2256 # Rationale for this change # What changes are included in this PR? # Are there any user-facing ch

[GitHub] [arrow-datafusion] yourenawo commented on issue #2993: Can I select some files for query based on the filtering rules in the directory?

2022-08-01 Thread GitBox
yourenawo commented on issue #2993: URL: https://github.com/apache/arrow-datafusion/issues/2993#issuecomment-1201947104 There are 3000 parquet files in the directory, and I only need to query 100 of them. -- This is an automated message from the Apache Git Service. To respond to the mess

[GitHub] [arrow] github-actions[bot] commented on pull request #13774: ARROW-17277: [Go][CSV] Custom csv.Writer formatter for boolean values

2022-08-01 Thread GitBox
github-actions[bot] commented on PR #13774: URL: https://github.com/apache/arrow/pull/13774#issuecomment-1201940782 :warning: Ticket **has not been started in JIRA**, please click 'Start Progress'. -- This is an automated message from the Apache Git Service. To respond to the message, ple

[GitHub] [arrow] github-actions[bot] commented on pull request #13774: ARROW-17277: [Go][CSV] Custom csv.Writer formatter for boolean values

2022-08-01 Thread GitBox
github-actions[bot] commented on PR #13774: URL: https://github.com/apache/arrow/pull/13774#issuecomment-1201940758 https://issues.apache.org/jira/browse/ARROW-17277 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the

[GitHub] [arrow] github-actions[bot] commented on pull request #13774: [Go][CSV] Custom csv.Writer formatter for boolean values

2022-08-01 Thread GitBox
github-actions[bot] commented on PR #13774: URL: https://github.com/apache/arrow/pull/13774#issuecomment-1201940108 Thanks for opening a pull request! If this is not a [minor PR](https://github.com/apache/arrow/blob/master/CONTRIBUTING.md#Minor-Fixes). Could you open an issue

[GitHub] [arrow] ggodik opened a new pull request, #13774: [Go][CSV] Custom csv.Writer formatter for boolean values

2022-08-01 Thread GitBox
ggodik opened a new pull request, #13774: URL: https://github.com/apache/arrow/pull/13774 - Use `WithBoolWriter` to overwrite the default use of `strconv.FormatBool` - uses `strconv.FormatBool` by default - `WithBoolWriter(nil)` will not overwrite default usage ``` w :=

[GitHub] [arrow-rs] HaoYang670 opened a new issue, #2274: Use more `const` functions.

2022-08-01 Thread GitBox
HaoYang670 opened a new issue, #2274: URL: https://github.com/apache/arrow-rs/issues/2274 **Is your feature request related to a problem or challenge? Please describe what you are trying to do.** As the `const fn` related features have been stabilized in rust 1.61: https://github.com/rus

[GitHub] [arrow] github-actions[bot] commented on pull request #13773: ARROW-17252: [R] Intermittent valgrind failure

2022-08-01 Thread GitBox
github-actions[bot] commented on PR #13773: URL: https://github.com/apache/arrow/pull/13773#issuecomment-1201916011 Revision: 3647aa3d6474604a7ed615621f9503f634e92efc Submitted crossbow builds: [ursacomputing/crossbow @ actions-de212b2c40](https://github.com/ursacomputing/crossbow/bra

[GitHub] [arrow] paleolimbot commented on pull request #13773: ARROW-17252: [R] Intermittent valgrind failure

2022-08-01 Thread GitBox
paleolimbot commented on PR #13773: URL: https://github.com/apache/arrow/pull/13773#issuecomment-1201915500 @github-actions crossbow submit test-r-linux-valgrind -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL

[GitHub] [arrow-datafusion] ursabot commented on pull request #3003: Fix SQL planner bug when resolving columns with same name as a relation

2022-08-01 Thread GitBox
ursabot commented on PR #3003: URL: https://github.com/apache/arrow-datafusion/pull/3003#issuecomment-1201913829 Benchmark runs are scheduled for baseline = 55a12869f32644ed87d69d9b7617372db5103fb8 and contender = e23925c96c97139d101719fa7a456088eeed6ae9. e23925c96c97139d101719fa7a456088e

[GitHub] [arrow-datafusion] yjshen merged pull request #3003: Fix SQL planner bug when resolving columns with same name as a relation

2022-08-01 Thread GitBox
yjshen merged PR #3003: URL: https://github.com/apache/arrow-datafusion/pull/3003 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@arr

[GitHub] [arrow-datafusion] yjshen closed issue #3002: The expression to get an indexed field is only valid for `List` types (`common_sub_expression_eliminate`)

2022-08-01 Thread GitBox
yjshen closed issue #3002: The expression to get an indexed field is only valid for `List` types (`common_sub_expression_eliminate`) URL: https://github.com/apache/arrow-datafusion/issues/3002 -- This is an automated message from the Apache Git Service. To respond to the message, please log o

[GitHub] [arrow] github-actions[bot] commented on pull request #13773: ARROW-17252: [R] Intermittent valgrind failure

2022-08-01 Thread GitBox
github-actions[bot] commented on PR #13773: URL: https://github.com/apache/arrow/pull/13773#issuecomment-1201905290 :warning: Ticket **has not been started in JIRA**, please click 'Start Progress'. -- This is an automated message from the Apache Git Service. To respond to the message, ple

[GitHub] [arrow] github-actions[bot] commented on pull request #13773: ARROW-17252: [R] Intermittent valgrind failure

2022-08-01 Thread GitBox
github-actions[bot] commented on PR #13773: URL: https://github.com/apache/arrow/pull/13773#issuecomment-1201905271 https://issues.apache.org/jira/browse/ARROW-17252 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the

[GitHub] [arrow] paleolimbot opened a new pull request, #13773: ARROW-17252: [R] Intermittent valgrind failure

2022-08-01 Thread GitBox
paleolimbot opened a new pull request, #13773: URL: https://github.com/apache/arrow/pull/13773 An attempt to isolate the change that fixed the valgrind errors from #13746. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and us

[GitHub] [arrow] paleolimbot closed pull request #13746: ARROW-17252: [R] Intermittent valgrind failure

2022-08-01 Thread GitBox
paleolimbot closed pull request #13746: ARROW-17252: [R] Intermittent valgrind failure URL: https://github.com/apache/arrow/pull/13746 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific co

[GitHub] [arrow] ursabot commented on pull request #13765: ARROW-17272: [Dev] Pass --add-opens in integration tests

2022-08-01 Thread GitBox
ursabot commented on PR #13765: URL: https://github.com/apache/arrow/pull/13765#issuecomment-1201904309 Benchmark runs are scheduled for baseline = c5173186bbd61641ba05e2848ad860fb696f6095 and contender = e80981ca2a7b8c43f61f41b55842423b8295d633. e80981ca2a7b8c43f61f41b55842423b8295d633 is

[GitHub] [arrow] github-actions[bot] commented on pull request #13772: ARROW-17273: [Go][CSV] Add Timestamp, Date32, Date64 format support to csv.Writer

2022-08-01 Thread GitBox
github-actions[bot] commented on PR #13772: URL: https://github.com/apache/arrow/pull/13772#issuecomment-1201898152 :warning: Ticket **has not been started in JIRA**, please click 'Start Progress'. -- This is an automated message from the Apache Git Service. To respond to the message, ple

[GitHub] [arrow] github-actions[bot] commented on pull request #13772: ARROW-17273: [Go][CSV] Add Timestamp, Date32, Date64 format support to csv.Writer

2022-08-01 Thread GitBox
github-actions[bot] commented on PR #13772: URL: https://github.com/apache/arrow/pull/13772#issuecomment-1201898136 https://issues.apache.org/jira/browse/ARROW-17273 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the

[GitHub] [arrow] ggodik opened a new pull request, #13772: ARROW-17273: [Go][CSV] Add Timestamp, Date32, Date64 format support to csv.Writer

2022-08-01 Thread GitBox
ggodik opened a new pull request, #13772: URL: https://github.com/apache/arrow/pull/13772 Newly supported types - Date32 - Date64 - Timestamp csv.Reader currently supports Timestamps. Not adding Date32/64 support to CSV as the default behavior will stay the same and parse

[GitHub] [arrow] kou merged pull request #13766: ARROW-17270: [Docs] Move nightly package instructions to dev docs

2022-08-01 Thread GitBox
kou merged PR #13766: URL: https://github.com/apache/arrow/pull/13766 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@arrow.apache.or

[GitHub] [arrow-datafusion] viirya commented on pull request #3007: Upgrade arrow

2022-08-01 Thread GitBox
viirya commented on PR #3007: URL: https://github.com/apache/arrow-datafusion/pull/3007#issuecomment-1201847738 > expected `i128`, found struct `arrow::util::decimal::Decimal128` Have you tried calling `as_i128()`? -- This is an automated message from the Apache Git Service. To res

[GitHub] [arrow-rs] ursabot commented on pull request #2272: Use initial capacity for interner hashmap

2022-08-01 Thread GitBox
ursabot commented on PR #2272: URL: https://github.com/apache/arrow-rs/pull/2272#issuecomment-1201839965 Benchmark runs are scheduled for baseline = d4f038a44218f6014ffd580429a0a6dde360f88f and contender = b4fa47d9c8323e45985563e2bd1478aa1a23639e. b4fa47d9c8323e45985563e2bd1478aa1a23639e i

[GitHub] [arrow-rs] tustvold closed issue #2273: Use initial capacity for interner hashmap

2022-08-01 Thread GitBox
tustvold closed issue #2273: Use initial capacity for interner hashmap URL: https://github.com/apache/arrow-rs/issues/2273 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To un

[GitHub] [arrow-rs] tustvold merged pull request #2272: Use initial capacity for interner hashmap

2022-08-01 Thread GitBox
tustvold merged PR #2272: URL: https://github.com/apache/arrow-rs/pull/2272 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@arrow.apa

[GitHub] [arrow-rs] tustvold commented on issue #2270: Changes to ParquetRecordBatchStream to support row filtering in DataFusion

2022-08-01 Thread GitBox
tustvold commented on issue #2270: URL: https://github.com/apache/arrow-rs/issues/2270#issuecomment-1201834760 > Interesting, when is this the case? Would there be situations where the decoder couldn't optimize based on the number of values to skip? It needs to be benchmarked but with

[GitHub] [arrow-rs] thinkharderdev commented on issue #2270: Changes to ParquetRecordBatchStream to support row filtering in DataFusion

2022-08-01 Thread GitBox
thinkharderdev commented on issue #2270: URL: https://github.com/apache/arrow-rs/issues/2270#issuecomment-1201828410 > This system will only be able to pushdown to eliminate decode overheads, i.e. it will be unable to eliminate IO to fetch data (which is fine we have the page index for that

[GitHub] [arrow] ursabot commented on pull request #13767: ARROW-17274: [GO] Remove panic from parquet.file.RowGroupReader.Column(index int)

2022-08-01 Thread GitBox
ursabot commented on PR #13767: URL: https://github.com/apache/arrow/pull/13767#issuecomment-1201818720 Benchmark runs are scheduled for baseline = 877ed5b0817df5d2592b92964c647714c04f417f and contender = c5173186bbd61641ba05e2848ad860fb696f6095. c5173186bbd61641ba05e2848ad860fb696f6095 is

[GitHub] [arrow-rs] stuartcarnie commented on a diff in pull request #2251: feat: Implement string cast operations for Time32 and Time64

2022-08-01 Thread GitBox
stuartcarnie commented on code in PR #2251: URL: https://github.com/apache/arrow-rs/pull/2251#discussion_r934943090 ## arrow/src/compute/kernels/cast.rs: ## @@ -1584,6 +1625,303 @@ fn cast_string_to_date64( Ok(Arc::new(array) as ArrayRef) } +fn seconds_since_midnight(tim

[GitHub] [arrow-rs] tustvold commented on issue #2270: Changes to ParquetRecordBatchStream to support row filtering in DataFusion

2022-08-01 Thread GitBox
tustvold commented on issue #2270: URL: https://github.com/apache/arrow-rs/issues/2270#issuecomment-1201811291 I need to think more on this, but some immediate thoughts that may or may not make sense: * This system will only be able to pushdown to eliminate decode overheads, i.e. it

[GitHub] [arrow-rs] codecov-commenter commented on pull request #2272: Use initial capacity for interner hashmap

2022-08-01 Thread GitBox
codecov-commenter commented on PR #2272: URL: https://github.com/apache/arrow-rs/pull/2272#issuecomment-1201804186 # [Codecov](https://codecov.io/gh/apache/arrow-rs/pull/2272?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+S

[GitHub] [arrow-rs] Dandandan opened a new issue, #2273: Use initial capacity for interner hashmap

2022-08-01 Thread GitBox
Dandandan opened a new issue, #2273: URL: https://github.com/apache/arrow-rs/issues/2273 **Is your feature request related to a problem or challenge? Please describe what you are trying to do.** Writing dictionary encoded arrays using parquet spends a lot of time rehashing arrays.

[GitHub] [arrow-rs] codecov-commenter commented on pull request #2251: feat: Implement string cast operations for Time32 and Time64

2022-08-01 Thread GitBox
codecov-commenter commented on PR #2251: URL: https://github.com/apache/arrow-rs/pull/2251#issuecomment-1201799597 # [Codecov](https://codecov.io/gh/apache/arrow-rs/pull/2251?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+S

[GitHub] [arrow-datafusion] avantgardnerio commented on pull request #3007: Upgrade arrow

2022-08-01 Thread GitBox
avantgardnerio commented on PR #3007: URL: https://github.com/apache/arrow-datafusion/pull/3007#issuecomment-1201798686 > From i128 to Decimal128 But that ripples up to ScalarValues and throughout the codebase quickly. -- This is an automated message from the Apache Git Service. To

[GitHub] [arrow-datafusion] avantgardnerio commented on pull request #3007: Upgrade arrow

2022-08-01 Thread GitBox
avantgardnerio commented on PR #3007: URL: https://github.com/apache/arrow-datafusion/pull/3007#issuecomment-1201796064 I think we need to just change all these: https://github.com/apache/arrow-datafusion/blob/55a12869f32644ed87d69d9b7617372db5103fb8/datafusion/physical-expr/src/expressions

[GitHub] [arrow-datafusion] codecov-commenter commented on pull request #3009: WIP: Implement exact median

2022-08-01 Thread GitBox
codecov-commenter commented on PR #3009: URL: https://github.com/apache/arrow-datafusion/pull/3009#issuecomment-1201795012 # [Codecov](https://codecov.io/gh/apache/arrow-datafusion/pull/3009?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_

[GitHub] [arrow-rs] thinkharderdev commented on issue #2270: Changes to ParquetRecordBatchStream to support row filtering in DataFusion

2022-08-01 Thread GitBox
thinkharderdev commented on issue #2270: URL: https://github.com/apache/arrow-rs/issues/2270#issuecomment-1201794786 So basically the implementation in DataFusion is: 1. Take the filter predicate and all projections it needs to evaluate and wrap a `ParquetRecordBatchStream` in a strea

[GitHub] [arrow-rs] tustvold commented on pull request #2268: Update prost and tonic related crates

2022-08-01 Thread GitBox
tustvold commented on PR #2268: URL: https://github.com/apache/arrow-rs/pull/2268#issuecomment-1201793634 Thank you for dealing with this ❤️ -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the sp

[GitHub] [arrow-datafusion] avantgardnerio commented on pull request #3007: Upgrade arrow

2022-08-01 Thread GitBox
avantgardnerio commented on PR #3007: URL: https://github.com/apache/arrow-datafusion/pull/3007#issuecomment-1201789160 CC @HaoYang670 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specif

[GitHub] [arrow-datafusion] avantgardnerio commented on pull request #3007: Upgrade arrow

2022-08-01 Thread GitBox
avantgardnerio commented on PR #3007: URL: https://github.com/apache/arrow-datafusion/pull/3007#issuecomment-1201788502 If I revert the change here: https://github.com/apache/arrow-rs/pull/2140/files to: ``` def_decimal_array!( Decimal128Array, "Decimal128Array",

[GitHub] [arrow-rs] Dandandan opened a new pull request, #2272: Use initial capacity for interner hashmap

2022-08-01 Thread GitBox
Dandandan opened a new pull request, #2272: URL: https://github.com/apache/arrow-rs/pull/2272 # Which issue does this PR close? Closes #. # Rationale for this change This saves rehashing / reshuffling the hashmap when there are many distinct values.
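
The PR body above is cut off and does not show the change itself. As a generic sketch of the stated rationale, the following uses the standard library's `HashMap::with_capacity` to reserve space up front so the table does not need to grow and rehash while distinct values are inserted. The `Interner` type and its methods are hypothetical stand-ins, not the parquet crate's interner.

```rust
use std::collections::HashMap;

/// Hypothetical interner (illustration only): maps each distinct string
/// to a dense u32 key in first-seen order.
struct Interner {
    keys: HashMap<String, u32>,
}

impl Interner {
    /// Reserve room for the expected number of distinct values so that
    /// inserts up to that count do not trigger a resize and rehash.
    fn with_capacity(expected_distinct: usize) -> Self {
        Interner {
            keys: HashMap::with_capacity(expected_distinct),
        }
    }

    /// Return the existing key for `value`, or assign the next free one.
    fn intern(&mut self, value: &str) -> u32 {
        let next = self.keys.len() as u32;
        *self.keys.entry(value.to_owned()).or_insert(next)
    }
}
```

The capacity is a guess about the workload. Too small a guess only brings back the occasional resize, and too large a guess costs memory, so a real value would have to be chosen by benchmarking.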

[GitHub] [arrow-rs] tustvold commented on issue #2270: Changes to ParquetRecordBatchStream to support row filtering in DataFusion

2022-08-01 Thread GitBox
tustvold commented on issue #2270: URL: https://github.com/apache/arrow-rs/issues/2270#issuecomment-1201782388 Could you perhaps expand upon what you mean by streaming filtering? My understanding of the two major use-cases for this functionality were: * Selecting specific rows identif

[GitHub] [arrow-rs] stuartcarnie commented on a diff in pull request #2251: feat: Implement string cast operations for Time32 and Time64

2022-08-01 Thread GitBox
stuartcarnie commented on code in PR #2251: URL: https://github.com/apache/arrow-rs/pull/2251#discussion_r934957161 ## arrow/src/compute/kernels/cast.rs: ## @@ -1584,6 +1625,303 @@ fn cast_string_to_date64( Ok(Arc::new(array) as ArrayRef) } +fn seconds_since_midnight(tim

[GitHub] [arrow-rs] thinkharderdev commented on pull request #2271: Row filtering

2022-08-01 Thread GitBox
thinkharderdev commented on PR #2271: URL: https://github.com/apache/arrow-rs/pull/2271#issuecomment-1201770817 @tustvold @alamb @Ted-Jiang -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the spe

[GitHub] [arrow-rs] thinkharderdev commented on a diff in pull request #2271: Row filtering

2022-08-01 Thread GitBox
thinkharderdev commented on code in PR #2271: URL: https://github.com/apache/arrow-rs/pull/2271#discussion_r934953607 ## parquet/src/arrow/arrow_reader.rs: ## @@ -283,42 +284,112 @@ pub struct ParquetRecordBatchReader { selection: Option>, } +impl ParquetRecordBatchReade

[GitHub] [arrow-rs] thinkharderdev opened a new pull request, #2271: Row filtering

2022-08-01 Thread GitBox
thinkharderdev opened a new pull request, #2271: URL: https://github.com/apache/arrow-rs/pull/2271 # Which issue does this PR close? Closes #2270 # Rationale for this change # What changes are included in this PR? # Are there any user-faci

[GitHub] [arrow-rs] thinkharderdev opened a new issue, #2270: Changes to ParquetRecordBatchStream to support row filtering in DataFusion

2022-08-01 Thread GitBox
thinkharderdev opened a new issue, #2270: URL: https://github.com/apache/arrow-rs/issues/2270 **Is your feature request related to a problem or challenge? Please describe what you are trying to do.** The current (non-public) API for skipping records has some shortcomings when try

[GitHub] [arrow-datafusion] waitingkuo commented on pull request #3000: to_timestamp align with postgresql

2022-08-01 Thread GitBox
waitingkuo commented on PR #3000: URL: https://github.com/apache/arrow-datafusion/pull/3000#issuecomment-1201760815 #2998 merged, all tests passed -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to

[GitHub] [arrow-rs] stuartcarnie commented on a diff in pull request #2251: feat: Implement string cast operations for Time32 and Time64

2022-08-01 Thread GitBox
stuartcarnie commented on code in PR #2251: URL: https://github.com/apache/arrow-rs/pull/2251#discussion_r934943239 ## arrow/src/compute/kernels/cast.rs: ## @@ -1584,6 +1625,303 @@ fn cast_string_to_date64( Ok(Arc::new(array) as ArrayRef) } +fn seconds_since_midnight(tim

[GitHub] [arrow-rs] stuartcarnie commented on a diff in pull request #2251: feat: Implement string cast operations for Time32 and Time64

2022-08-01 Thread GitBox
stuartcarnie commented on code in PR #2251: URL: https://github.com/apache/arrow-rs/pull/2251#discussion_r934942068 ## arrow/src/compute/kernels/cast.rs: ## @@ -1584,6 +1625,303 @@ fn cast_string_to_date64( Ok(Arc::new(array) as ArrayRef) } +fn seconds_since_midnight(tim

[GitHub] [arrow-rs] stuartcarnie commented on a diff in pull request #2251: feat: Implement string cast operations for Time32 and Time64

2022-08-01 Thread GitBox
stuartcarnie commented on code in PR #2251: URL: https://github.com/apache/arrow-rs/pull/2251#discussion_r934940959 ## arrow/src/compute/kernels/cast.rs: ## @@ -1584,6 +1625,303 @@ fn cast_string_to_date64( Ok(Arc::new(array) as ArrayRef) } +fn seconds_since_midnight(tim

[GitHub] [arrow] github-actions[bot] commented on pull request #13771: ARROW-17278: [C++][Benchmarking] Add AsofJoin Ordering Assertion and Benchmark Fixes

2022-08-01 Thread GitBox
github-actions[bot] commented on PR #13771: URL: https://github.com/apache/arrow/pull/13771#issuecomment-1201747009 :warning: Ticket **has not been started in JIRA**, please click 'Start Progress'. -- This is an automated message from the Apache Git Service. To respond to the message, ple

[GitHub] [arrow] github-actions[bot] commented on pull request #13771: ARROW-17278: [C++][Benchmarking] Add AsofJoin Ordering Assertion and Benchmark Fixes

2022-08-01 Thread GitBox
github-actions[bot] commented on PR #13771: URL: https://github.com/apache/arrow/pull/13771#issuecomment-1201746993 https://issues.apache.org/jira/browse/ARROW-17278 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the

[GitHub] [arrow] iChauster opened a new pull request, #13771: ARROW-17278: [C++][Benchmarking] Add AsofJoin Ordering Assertion and Benchmark Fixes

2022-08-01 Thread GitBox
iChauster opened a new pull request, #13771: URL: https://github.com/apache/arrow/pull/13771 - Add an invalid status if batches are ingested in an incorrect order for any source in AsOfJoin. - Fix benchmarks to use isolated memory pools. - Reuse created memory pools over different benc

[GitHub] [arrow-datafusion] andygrove commented on issue #2925: Add support for exact `median` aggregate function

2022-08-01 Thread GitBox
andygrove commented on issue #2925: URL: https://github.com/apache/arrow-datafusion/issues/2925#issuecomment-1201745574 Thanks @jonmmease. I just submitted a PR with the functionality. I need to clean this up, but I think it is ready for testing. -- This is an automated message from the

[GitHub] [arrow-datafusion] andygrove opened a new pull request, #3009: WIP: Implement exact median

2022-08-01 Thread GitBox
andygrove opened a new pull request, #3009: URL: https://github.com/apache/arrow-datafusion/pull/3009 # Which issue does this PR close? Closes https://github.com/apache/arrow-datafusion/issues/2925 # Rationale for this change Needed for h2o benchmarks.

[GitHub] [arrow] kou merged pull request #13765: ARROW-17272: [Dev] Pass --add-opens in integration tests

2022-08-01 Thread GitBox
kou merged PR #13765: URL: https://github.com/apache/arrow/pull/13765 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@arrow.apache.or

[GitHub] [arrow-rs] avantgardnerio commented on a diff in pull request #2251: feat: Implement string cast operations for Time32 and Time64

2022-08-01 Thread GitBox
avantgardnerio commented on code in PR #2251: URL: https://github.com/apache/arrow-rs/pull/2251#discussion_r934934282 ## arrow/src/compute/kernels/cast.rs: ## @@ -1584,6 +1625,303 @@ fn cast_string_to_date64( Ok(Arc::new(array) as ArrayRef) } +fn seconds_since_midnight(t

[GitHub] [arrow-rs] avantgardnerio commented on a diff in pull request #2251: feat: Implement string cast operations for Time32 and Time64

2022-08-01 Thread GitBox
avantgardnerio commented on code in PR #2251: URL: https://github.com/apache/arrow-rs/pull/2251#discussion_r934927554 ## arrow/src/compute/kernels/cast.rs: ## @@ -1584,6 +1625,303 @@ fn cast_string_to_date64( Ok(Arc::new(array) as ArrayRef) } +fn seconds_since_midnight(t

[GitHub] [arrow] kou commented on a diff in pull request #13769: ARROW-12590: [C++][R] Update copies of Homebrew files to reflect recent updates

2022-08-01 Thread GitBox
kou commented on code in PR #13769: URL: https://github.com/apache/arrow/pull/13769#discussion_r934920983 ## r/tools/autobrew: ## @@ -36,6 +43,13 @@ curl -fsSL https://github.com/$UPSTREAM_ORG/brew/tarball/master | tar xz --strip export HOMEBREW_CACHE="$AUTOBREW" LOCAL_FORMUL

[GitHub] [arrow-rs] avantgardnerio commented on a diff in pull request #2251: feat: Implement string cast operations for Time32 and Time64

2022-08-01 Thread GitBox
avantgardnerio commented on code in PR #2251: URL: https://github.com/apache/arrow-rs/pull/2251#discussion_r934921087 ## arrow/src/compute/kernels/cast.rs: ## @@ -1584,6 +1625,303 @@ fn cast_string_to_date64( Ok(Arc::new(array) as ArrayRef) } +fn seconds_since_midnight(t

[GitHub] [arrow-rs] avantgardnerio commented on a diff in pull request #2251: feat: Implement string cast operations for Time32 and Time64

2022-08-01 Thread GitBox
avantgardnerio commented on code in PR #2251: URL: https://github.com/apache/arrow-rs/pull/2251#discussion_r934918636 ## arrow/src/compute/kernels/cast.rs: ## @@ -1584,6 +1625,303 @@ fn cast_string_to_date64( Ok(Arc::new(array) as ArrayRef) } +fn seconds_since_midnight(t

[GitHub] [arrow-rs] codecov-commenter commented on pull request #2260: Improve `object_store crate` documentation

2022-08-01 Thread GitBox
codecov-commenter commented on PR #2260: URL: https://github.com/apache/arrow-rs/pull/2260#issuecomment-1201713556 # [Codecov](https://codecov.io/gh/apache/arrow-rs/pull/2260?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+S

[GitHub] [arrow] kou commented on a diff in pull request #12914: ARROW-2034: [C++] Filesystem implementation for Azure Blob Storage

2022-08-01 Thread GitBox
kou commented on code in PR #12914: URL: https://github.com/apache/arrow/pull/12914#discussion_r931676422 ## cpp/src/arrow/filesystem/azurefs_test.cc: ## @@ -0,0 +1,531 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements.

[GitHub] [arrow-datafusion] avantgardnerio commented on pull request #3007: Upgrade arrow

2022-08-01 Thread GitBox
avantgardnerio commented on PR #3007: URL: https://github.com/apache/arrow-datafusion/pull/3007#issuecomment-1201708740 If I understand correctly, the other XXXArray types have iterators that return the primitive representations (i.e. `u128`), but Decimal now returns a struct with `From` a

[GitHub] [arrow-rs] alamb commented on a diff in pull request #2251: feat: Implement string cast operations for Time32 and Time64

2022-08-01 Thread GitBox
alamb commented on code in PR #2251: URL: https://github.com/apache/arrow-rs/pull/2251#discussion_r934903273 ## arrow/src/compute/kernels/cast.rs: ## @@ -1584,6 +1625,303 @@ fn cast_string_to_date64( Ok(Arc::new(array) as ArrayRef) } +fn seconds_since_midnight(time: &chr

[GitHub] [arrow] bkietz commented on a diff in pull request #13613: ARROW-15582: [C++] Add support for registering standard Substrait functions

2022-08-01 Thread GitBox
bkietz commented on code in PR #13613: URL: https://github.com/apache/arrow/pull/13613#discussion_r934907633 ## cpp/src/arrow/engine/substrait/ext_test.cc: ## @@ -198,11 +207,11 @@ TEST(ExtensionIdRegistryTest, RegisterTempFunctions) { for (util::string_view name : kTempF

[GitHub] [arrow] bkietz commented on a diff in pull request #13613: ARROW-15582: [C++] Add support for registering standard Substrait functions

2022-08-01 Thread GitBox
bkietz commented on code in PR #13613: URL: https://github.com/apache/arrow/pull/13613#discussion_r934907411 ## cpp/src/arrow/engine/substrait/ext_test.cc: ## @@ -158,10 +152,10 @@ TEST_P(ExtensionIdRegistryTest, ReregisterFunctions) { auto provider = std::get<0>(GetParam());

[GitHub] [arrow] zeroshade merged pull request #13767: ARROW-17274: [GO] Remove panic from parquet.file.RowGroupReader.Column(index int)

2022-08-01 Thread GitBox
zeroshade merged PR #13767: URL: https://github.com/apache/arrow/pull/13767 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@arrow.apa

[GitHub] [arrow-datafusion] ursabot commented on pull request #3004: add TimeUnit::Second as signature for ToTimestampSeconds

2022-08-01 Thread GitBox
ursabot commented on PR #3004: URL: https://github.com/apache/arrow-datafusion/pull/3004#issuecomment-1201699217 Benchmark runs are scheduled for baseline = c179102492b9b5463c4751c4637a01547b6f2b7b and contender = 55a12869f32644ed87d69d9b7617372db5103fb8. 55a12869f32644ed87d69d9b7617372db

[GitHub] [arrow-datafusion] alamb merged pull request #3004: add TimeUnit::Second as signature for ToTimestampSeconds

2022-08-01 Thread GitBox
alamb merged PR #3004: URL: https://github.com/apache/arrow-datafusion/pull/3004 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@arro

[GitHub] [arrow-datafusion] alamb closed issue #2998: Double to_timestamp_seconds produces abnormal result

2022-08-01 Thread GitBox
alamb closed issue #2998: Double to_timestamp_seconds produces abnormal result URL: https://github.com/apache/arrow-datafusion/issues/2998 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

[GitHub] [arrow-rs] alamb commented on a diff in pull request #2260: Improve `object_store crate` documentation

2022-08-01 Thread GitBox
alamb commented on code in PR #2260: URL: https://github.com/apache/arrow-rs/pull/2260#discussion_r934897698 ## object_store/src/lib.rs: ## @@ -28,15 +28,128 @@ //! # object_store //! -//! This crate provides APIs for interacting with object storage services. +//! This crate

[GitHub] [arrow-rs] tustvold commented on a diff in pull request #2260: Improve `object_store crate` documentation

2022-08-01 Thread GitBox
tustvold commented on code in PR #2260: URL: https://github.com/apache/arrow-rs/pull/2260#discussion_r934742024 ## object_store/src/lib.rs: ## @@ -28,15 +28,128 @@ //! # object_store //! -//! This crate provides APIs for interacting with object storage services. +//! This cr

[GitHub] [arrow-rs] alamb commented on a diff in pull request #2260: Improve `object_store crate` documentation

2022-08-01 Thread GitBox
alamb commented on code in PR #2260: URL: https://github.com/apache/arrow-rs/pull/2260#discussion_r934897213 ## object_store/README.md: ## @@ -19,8 +19,21 @@ # Rust Object Store -A crate providing a generic interface to object stores, such as S3, Azure Blob Storage and Goo

[GitHub] [arrow-datafusion] avantgardnerio commented on pull request #3007: Upgrade arrow

2022-08-01 Thread GitBox
avantgardnerio commented on PR #3007: URL: https://github.com/apache/arrow-datafusion/pull/3007#issuecomment-1201689127 I fixed a lot of the obvious errors, but I wasn't sure what to do about ``` error[E0308]: mismatched types --> datafusion/physical-expr/src/expressions/binary.r

[GitHub] [arrow-rs] carols10cents commented on pull request #2268: Update prost and tonic related crates

2022-08-01 Thread GitBox
carols10cents commented on PR #2268: URL: https://github.com/apache/arrow-rs/pull/2268#issuecomment-1201682854 🤷🏻‍♀️ or we can sidestep that and install protoc. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the U

[GitHub] [arrow-datafusion] andygrove commented on pull request #3003: Fix SQL planner bug when resolving columns with same name as a relation

2022-08-01 Thread GitBox
andygrove commented on PR #3003: URL: https://github.com/apache/arrow-datafusion/pull/3003#issuecomment-1201657238 > i added a simple test case here [andygrove#64](https://github.com/andygrove/arrow-datafusion/pull/64) Thank you @waitingkuo! -- This is an automated message from t

[GitHub] [arrow] github-actions[bot] commented on pull request #13746: ARROW-17252: [R] Intermittent valgrind failure

2022-08-01 Thread GitBox
github-actions[bot] commented on PR #13746: URL: https://github.com/apache/arrow/pull/13746#issuecomment-1201655388 Revision: 690a9a028a56b1bde969256154b9f7043b52666e Submitted crossbow builds: [ursacomputing/crossbow @ actions-4c128215b9](https://github.com/ursacomputing/crossbow/bra

[GitHub] [arrow] paleolimbot commented on pull request #13746: ARROW-17252: [R] Intermittent valgrind failure

2022-08-01 Thread GitBox
paleolimbot commented on PR #13746: URL: https://github.com/apache/arrow/pull/13746#issuecomment-1201654248 @github-actions crossbow submit test-r-linux-valgrind -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL

[GitHub] [arrow] github-actions[bot] commented on pull request #13770: ARROW-17275: [Go][Integration] Handle LargeString/LargeBinary in IPC and integration tests

2022-08-01 Thread GitBox
github-actions[bot] commented on PR #13770: URL: https://github.com/apache/arrow/pull/13770#issuecomment-1201646314 https://issues.apache.org/jira/browse/ARROW-17275 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the
