[GitHub] [arrow-rs] Ted-Jiang opened a new pull request, #2158: Add integration test for scan rows with selection

2022-07-24 Thread GitBox
Ted-Jiang opened a new pull request, #2158: URL: https://github.com/apache/arrow-rs/pull/2158 # Which issue does this PR close? Closes #2106 . # Rationale for this change # What changes are included in this PR? # Are there any user-facing changes?

[GitHub] [arrow] rok commented on pull request #13506: ARROW-16653: [R] All formats are supported with the lubridate `parse_date_time` binding

2022-07-24 Thread GitBox
rok commented on PR #13506: URL: https://github.com/apache/arrow/pull/13506#issuecomment-1193656226 @paleolimbot @dragosmg I've pushed this a bit further and I think most formats are now covered. Could you please check if this makes sense to you? -- This is an automated message from the A

[GitHub] [arrow] Oooorchid opened a new issue, #13698: [Java][C++] Java VectorSchemaRoot to C++ RecordBatchreader

2022-07-24 Thread GitBox
Oooorchid opened a new issue, #13698: URL: https://github.com/apache/arrow/issues/13698 I have a Java library that is writing an Arrow Table to a VectorSchemaRoot object in memory, and I want to read the data with C++. But it keeps getting an error, what should I do? From the docs I on

[GitHub] [arrow] rok commented on a diff in pull request #13506: ARROW-16653: [R] All formats are supported with the lubridate `parse_date_time` binding

2022-07-24 Thread GitBox
rok commented on code in PR #13506: URL: https://github.com/apache/arrow/pull/13506#discussion_r928528243 ## r/tests/testthat/test-dplyr-funcs-datetime.R: ## @@ -2638,14 +2694,18 @@ test_that("build_formats() and build_format_from_order()", { ) ) - # ab not supported

[GitHub] [arrow] rok commented on a diff in pull request #13506: ARROW-16653: [R] All formats are supported with the lubridate `parse_date_time` binding

2022-07-24 Thread GitBox
rok commented on code in PR #13506: URL: https://github.com/apache/arrow/pull/13506#discussion_r928527638 ## r/R/dplyr-datetime-helpers.R: ## @@ -158,74 +158,40 @@ binding_as_date_numeric <- function(x, origin = "1970-01-01") { #' #' @noRd build_formats <- function(orders) {

[GitHub] [arrow] kou commented on pull request #13157: ARROW-16584: [Java] Java JNI with S3 support

2022-07-24 Thread GitBox
kou commented on PR #13157: URL: https://github.com/apache/arrow/pull/13157#issuecomment-1193646276 > is it the same problem on MacOS related to ORC? I don't remember it... Could you share the URL of it?

[GitHub] [arrow] github-actions[bot] commented on pull request #13157: ARROW-16584: [Java] Java JNI with S3 support

2022-07-24 Thread GitBox
github-actions[bot] commented on PR #13157: URL: https://github.com/apache/arrow/pull/13157#issuecomment-1193640209 Revision: 53a80060f659bca9c4bf53e6bcaccbb48d04dd23 Submitted crossbow builds: [ursacomputing/crossbow @ actions-1552706800](https://github.com/ursacomputing/crossbow/bra

[GitHub] [arrow-rs] viirya commented on a diff in pull request #2157: Use ArrayAccessor in Comparison Kernels

2022-07-24 Thread GitBox
viirya commented on code in PR #2157: URL: https://github.com/apache/arrow-rs/pull/2157#discussion_r928506171 ## arrow/src/compute/kernels/comparison.rs: ## @@ -1931,177 +1804,107 @@ where Ok(BooleanArray::from(data)) } -macro_rules! typed_cmp { Review Comment: We ca

[GitHub] [arrow] REASY commented on pull request #13157: ARROW-16584: [Java] Java JNI with S3 support

2022-07-24 Thread GitBox
REASY commented on PR #13157: URL: https://github.com/apache/arrow/pull/13157#issuecomment-1193628206 @kou is it the same problem on MacOS related to ORC? The problem is that I do not have a MacOS device, but I will give it a try with AWS...

[GitHub] [arrow] kou commented on pull request #13697: ARROW-17194: [CI][Conan] Enable glog

2022-07-24 Thread GitBox
kou commented on PR #13697: URL: https://github.com/apache/arrow/pull/13697#issuecomment-1193615357 @github-actions crossbow submit -g conan

[GitHub] [arrow] kou commented on pull request #13157: ARROW-16584: [Java] Java JNI with S3 support

2022-07-24 Thread GitBox
kou commented on PR #13157: URL: https://github.com/apache/arrow/pull/13157#issuecomment-1193603218 @github-actions crossbow submit java-jars

[GitHub] [arrow] kou commented on pull request #13157: ARROW-16584: [Java] Java JNI with S3 support

2022-07-24 Thread GitBox
kou commented on PR #13157: URL: https://github.com/apache/arrow/pull/13157#issuecomment-1193603138 We still have a problem. We need to fix the problem before we merge this. Is there a person who wants to work on this?

[GitHub] [arrow] kou commented on pull request #13618: ARROW-17080: [Java] Add a top-level CMakeLists.txt for JNI

2022-07-24 Thread GitBox
kou commented on PR #13618: URL: https://github.com/apache/arrow/pull/13618#issuecomment-1193599456 OK. Updated. Note that the documentation will be updated again by ARROW-17081.

[GitHub] [arrow-datafusion] liurenjie1024 commented on issue #2633: Introducing a new optimizer framework for datafusion.

2022-07-24 Thread GitBox
liurenjie1024 commented on issue #2633: URL: https://github.com/apache/arrow-datafusion/issues/2633#issuecomment-1193584924 I would like to donate this optimizer to datafusion-contrib so that we can develop it with the community.

[GitHub] [arrow] ursabot commented on pull request #13696: ARROW-17191: [C++][FlightRPC] Handle inlined slices after concatenation

2022-07-24 Thread GitBox
ursabot commented on PR #13696: URL: https://github.com/apache/arrow/pull/13696#issuecomment-1193575988 Benchmark runs are scheduled for baseline = 75ca3b21bc39ed91274aca7da57ee9328ae3c087 and contender = ef6049a2ee5673d0944a0b4f70ff9c70e0419a22. ef6049a2ee5673d0944a0b4f70ff9c70e0419a22 is

[GitHub] [arrow-rs] codecov-commenter commented on pull request #2157: Use ArrayAccessor in Comparison Kernels

2022-07-24 Thread GitBox
codecov-commenter commented on PR #2157: URL: https://github.com/apache/arrow-rs/pull/2157#issuecomment-1193562609 # [Codecov](https://codecov.io/gh/apache/arrow-rs/pull/2157?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+S

[GitHub] [arrow-rs] viirya commented on a diff in pull request #2157: Use ArrayAccessor in Comparison Kernels

2022-07-24 Thread GitBox
viirya commented on code in PR #2157: URL: https://github.com/apache/arrow-rs/pull/2157#discussion_r928452024 ## arrow/src/compute/kernels/comparison.rs: ## @@ -1964,6 +1838,195 @@ macro_rules! typed_cmp { }}; } +fn eq_typed_compares(left: &dyn Array, right: &dyn Array)

[GitHub] [arrow-rs] viirya commented on a diff in pull request #2157: Use ArrayAccessor in Comparison Kernels

2022-07-24 Thread GitBox
viirya commented on code in PR #2157: URL: https://github.com/apache/arrow-rs/pull/2157#discussion_r928451736 ## arrow/src/compute/kernels/comparison.rs: ## @@ -40,168 +39,72 @@ use regex::{escape, Regex}; use std::any::type_name; use std::collections::HashMap; -/// Helper f

[GitHub] [arrow-rs] ursabot commented on pull request #2140: Add Decimal128Iter and Decimal256Iter and do maximum precision/scale check

2022-07-24 Thread GitBox
ursabot commented on PR #2140: URL: https://github.com/apache/arrow-rs/pull/2140#issuecomment-1193539918 Benchmark runs are scheduled for baseline = 1621c713d724b0cd4aabccfa3243714789283df5 and contender = 73153fec814a8871fe9e6ea6a7bc66198118cd25. 73153fec814a8871fe9e6ea6a7bc66198118cd25 i

[GitHub] [arrow-rs] viirya opened a new pull request, #2157: Use ArrayAccessor in Comparison Kernels

2022-07-24 Thread GitBox
viirya opened a new pull request, #2157: URL: https://github.com/apache/arrow-rs/pull/2157 # Which issue does this PR close? Closes #2135. # Rationale for this change # What changes are included in this PR? # Are there any user-facing chan

[GitHub] [arrow-rs] viirya closed issue #2138: Use ArrayAccessor in Decimal128Iter and Decimal256Iter

2022-07-24 Thread GitBox
viirya closed issue #2138: Use ArrayAccessor in Decimal128Iter and Decimal256Iter URL: https://github.com/apache/arrow-rs/issues/2138

[GitHub] [arrow-rs] viirya closed issue #2139: Check precision and scale against maximum value when constructing Decimal128 and Decimal256

2022-07-24 Thread GitBox
viirya closed issue #2139: Check precision and scale against maximum value when constructing Decimal128 and Decimal256 URL: https://github.com/apache/arrow-rs/issues/2139

[GitHub] [arrow-rs] viirya merged pull request #2140: Add Decimal128Iter and Decimal256Iter and do maximum precision/scale check

2022-07-24 Thread GitBox
viirya merged PR #2140: URL: https://github.com/apache/arrow-rs/pull/2140

[GitHub] [arrow] github-actions[bot] commented on pull request #13157: ARROW-16584: [Java] Java JNI with S3 support

2022-07-24 Thread GitBox
github-actions[bot] commented on PR #13157: URL: https://github.com/apache/arrow/pull/13157#issuecomment-1193518156 Revision: a71371e3ed50a4a9d933af2e260a22d4113cb1b3 Submitted crossbow builds: [ursacomputing/crossbow @ actions-571edda43c](https://github.com/ursacomputing/crossbow/bra

[GitHub] [arrow] kou commented on pull request #13157: ARROW-16584: [Java] Java JNI with S3 support

2022-07-24 Thread GitBox
kou commented on PR #13157: URL: https://github.com/apache/arrow/pull/13157#issuecomment-1193517545 @github-actions crossbow submit java-jars

[GitHub] [arrow-datafusion] liurenjie1024 commented on issue #2350: Enable discussion tab?

2022-07-24 Thread GitBox
liurenjie1024 commented on issue #2350: URL: https://github.com/apache/arrow-datafusion/issues/2350#issuecomment-1193517063 It has been enabled, so we can close this now.

[GitHub] [arrow-datafusion] liurenjie1024 closed issue #2350: Enable discussion tab?

2022-07-24 Thread GitBox
liurenjie1024 closed issue #2350: Enable discussion tab? URL: https://github.com/apache/arrow-datafusion/issues/2350

[GitHub] [arrow-rs] HaoYang670 commented on issue #2156: The `GenericStringBuilder` should use `GenericBinaryBuilder`

2022-07-24 Thread GitBox
HaoYang670 commented on issue #2156: URL: https://github.com/apache/arrow-rs/issues/2156#issuecomment-1193512179 I will work on it.

[GitHub] [arrow-rs] HaoYang670 opened a new issue, #2156: The `GenericStringBuilder` should use `GenericBinaryBuilder`

2022-07-24 Thread GitBox
HaoYang670 opened a new issue, #2156: URL: https://github.com/apache/arrow-rs/issues/2156 **Is your feature request related to a problem or challenge? Please describe what you are trying to do.** Currently, the string builder uses the list builder: ```rust pub struct GenericStringBu

[GitHub] [arrow] github-actions[bot] commented on pull request #13643: ARROW-17116: [C++][Gandiva] Adding RepeatStr Function

2022-07-24 Thread GitBox
github-actions[bot] commented on PR #13643: URL: https://github.com/apache/arrow/pull/13643#issuecomment-1193502870 https://issues.apache.org/jira/browse/ARROW-17116

[GitHub] [arrow] github-actions[bot] commented on pull request #13643: ARROW-17116: [C++][Gandiva] Adding RepeatStr Function

2022-07-24 Thread GitBox
github-actions[bot] commented on PR #13643: URL: https://github.com/apache/arrow/pull/13643#issuecomment-1193502902 :warning: Ticket **has not been started in JIRA**, please click 'Start Progress'.

[GitHub] [arrow] github-actions[bot] commented on pull request #13643: ARROW-17116: [C++][Gandiva] Adding RepeatStr Function

2022-07-24 Thread GitBox
github-actions[bot] commented on PR #13643: URL: https://github.com/apache/arrow/pull/13643#issuecomment-1193502886 :warning: Ticket **has no components in JIRA**, make sure you assign one.

[GitHub] [arrow] kou commented on pull request #13643: ARROW 17116: [C++] [GANDIVA] Adding RepeatStr Function

2022-07-24 Thread GitBox
kou commented on PR #13643: URL: https://github.com/apache/arrow/pull/13643#issuecomment-1193502473 The MinGW failures were resolved by #13696. Could you rebase on master?

[GitHub] [arrow] kou merged pull request #13696: ARROW-17191: [C++][FlightRPC] Handle inlined slices after concatenation

2022-07-24 Thread GitBox
kou merged PR #13696: URL: https://github.com/apache/arrow/pull/13696

[GitHub] [arrow-datafusion] liukun4515 commented on a diff in pull request #2960: test: add test for parquet decimal and pruning for decimal column

2022-07-24 Thread GitBox
liukun4515 commented on code in PR #2960: URL: https://github.com/apache/arrow-datafusion/pull/2960#discussion_r928343239 ## datafusion/core/src/datasource/file_format/parquet.rs: ## @@ -1023,6 +1023,49 @@ mod tests { Ok(()) } +#[tokio::test] +async fn re

[GitHub] [arrow-rs] codecov-commenter commented on pull request #2154: Avoid decoding unneeded values in ByteArrayDecoderDictionary

2022-07-24 Thread GitBox
codecov-commenter commented on PR #2154: URL: https://github.com/apache/arrow-rs/pull/2154#issuecomment-1193443566 # [Codecov](https://codecov.io/gh/apache/arrow-rs/pull/2154?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+S

[GitHub] [arrow-ballista] avantgardnerio commented on pull request #93: Add FlightSQL support

2022-07-24 Thread GitBox
avantgardnerio commented on PR #93: URL: https://github.com/apache/arrow-ballista/pull/93#issuecomment-1193418199 The official JDBC driver needs prepared statement support in order to work: ![Screenshot from 2022-07-24 17-33-56](https://user-images.githubusercontent.com/3855243/180670

[GitHub] [arrow-ballista] avantgardnerio commented on pull request #93: Add FlightSQL support

2022-07-24 Thread GitBox
avantgardnerio commented on PR #93: URL: https://github.com/apache/arrow-ballista/pull/93#issuecomment-1193412003 Working via custom JDBC driver in DataGrip: ![Screenshot from 2022-07-24 17-05-50](https://user-images.githubusercontent.com/3855243/180669481-752d84a6-d9c4-401a-aff6-2d6d

[GitHub] [arrow] wjones127 commented on a diff in pull request #13665: ARROW-17100: [C++][Parquet] Fix backwards compatibility for ParquetV2 data pages written prior to 3.0.0 per ARROW-10353

2022-07-24 Thread GitBox
wjones127 commented on code in PR #13665: URL: https://github.com/apache/arrow/pull/13665#discussion_r928317730 ## cpp/src/parquet/metadata.cc: ## @@ -58,6 +58,13 @@ const ApplicationVersion& ApplicationVersion::PARQUET_MR_FIXED_STATS_VERSION() { return version; } +const

[GitHub] [arrow] nealrichardson commented on pull request #13625: ARROW-16612: [R] Support inferring compression from filename for all readers/writers

2022-07-24 Thread GitBox
nealrichardson commented on PR #13625: URL: https://github.com/apache/arrow/pull/13625#issuecomment-1193401806 > Something about the `filesystem` change in the readable/writable file creation seems to have caused failures on Windows although I don't know why this would be the case only on W

[GitHub] [arrow-rs] codecov-commenter commented on pull request #2155: Implement `peek_next_page` and `skip_next_page` for `InMemoryColumnCh…

2022-07-24 Thread GitBox
codecov-commenter commented on PR #2155: URL: https://github.com/apache/arrow-rs/pull/2155#issuecomment-1193401557 # [Codecov](https://codecov.io/gh/apache/arrow-rs/pull/2155?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+S

[GitHub] [arrow-rs] thinkharderdev opened a new pull request, #2155: Implement `peek_next_page` and `skip_next_page` for `InMemoryColumnCh…

2022-07-24 Thread GitBox
thinkharderdev opened a new pull request, #2155: URL: https://github.com/apache/arrow-rs/pull/2155 …unkReader` # Which issue does this PR close? Closes #2129 . # Rationale for this change # What changes are included in this PR? # Ar

[GitHub] [arrow] ursabot commented on pull request #13694: ARROW-16997: [Doc][Dev] Update arrow/dev README

2022-07-24 Thread GitBox
ursabot commented on PR #13694: URL: https://github.com/apache/arrow/pull/13694#issuecomment-1193393480 Benchmark runs are scheduled for baseline = 70904dffef25a8c883a1a829c66a1d30a7d9c249 and contender = 75ca3b21bc39ed91274aca7da57ee9328ae3c087. 75ca3b21bc39ed91274aca7da57ee9328ae3c087 is

[GitHub] [arrow-ballista] avantgardnerio commented on pull request #93: Add FlightSQL support

2022-07-24 Thread GitBox
avantgardnerio commented on PR #93: URL: https://github.com/apache/arrow-ballista/pull/93#issuecomment-1193383857 I was able to retrieve query results from some custom Kotlin JDBC client code I'll post shortly: ``` [Test worker] INFO FlightSqlStatement - Execute SQL: SELECT 'keep a

[GitHub] [arrow-rs] thinkharderdev opened a new pull request, #2154: Avoid decoding unneeded values in ByteArrayDecoderDictionary

2022-07-24 Thread GitBox
thinkharderdev opened a new pull request, #2154: URL: https://github.com/apache/arrow-rs/pull/2154 # Which issue does this PR close? Closes #2088 # Rationale for this change # What changes are included in this PR? # Are there any user-fac

[GitHub] [arrow-ballista] avantgardnerio opened a new pull request, #93: Add FlightSQL support

2022-07-24 Thread GitBox
avantgardnerio opened a new pull request, #93: URL: https://github.com/apache/arrow-ballista/pull/93 # Which issue does this PR close? Closes #92. # Rationale for this change To use off the shelf tools with Ballista. # What changes are included in this PR?

[GitHub] [arrow-ballista] avantgardnerio opened a new issue, #92: Ballista should support Arrow FlightSQL

2022-07-24 Thread GitBox
avantgardnerio opened a new issue, #92: URL: https://github.com/apache/arrow-ballista/issues/92 **Is your feature request related to a problem or challenge? Please describe what you are trying to do.** In order to integrate with the wide ecosystem of database tools (tableau, etc) Bal

[GitHub] [arrow-rs] codecov-commenter commented on pull request #2122: Implement skip_rep_levels

2022-07-24 Thread GitBox
codecov-commenter commented on PR #2122: URL: https://github.com/apache/arrow-rs/pull/2122#issuecomment-1193378608 # [Codecov](https://codecov.io/gh/apache/arrow-rs/pull/2122?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+S

[GitHub] [arrow-rs] ursabot commented on pull request #2151: Break out docs CI job to its own github action

2022-07-24 Thread GitBox
ursabot commented on PR #2151: URL: https://github.com/apache/arrow-rs/pull/2151#issuecomment-1193375170 Benchmark runs are scheduled for baseline = fce66260015642e691f246d29666d8f9c3197f8a and contender = 1621c713d724b0cd4aabccfa3243714789283df5. 1621c713d724b0cd4aabccfa3243714789283df5 i

[GitHub] [arrow-rs] tustvold commented on a diff in pull request #2140: Add Decimal128Iter and Decimal256Iter and do maximum precision/scale check

2022-07-24 Thread GitBox
tustvold commented on code in PR #2140: URL: https://github.com/apache/arrow-rs/pull/2140#discussion_r928298669 ## arrow/src/array/array_decimal.rs: ## @@ -368,19 +374,13 @@ impl From for Decimal256Array { } } -impl<'a> IntoIterator for &'a Decimal128Array { -type It

[GitHub] [arrow-rs] viirya merged pull request #2151: Break out docs CI job to its own github action

2022-07-24 Thread GitBox
viirya merged PR #2151: URL: https://github.com/apache/arrow-rs/pull/2151

[GitHub] [arrow-datafusion] tustvold commented on a diff in pull request #2960: test: add test for parquet decimal and pruning for decimal column

2022-07-24 Thread GitBox
tustvold commented on code in PR #2960: URL: https://github.com/apache/arrow-datafusion/pull/2960#discussion_r928298397 ## datafusion/core/src/datasource/file_format/parquet.rs: ## @@ -1023,6 +1023,49 @@ mod tests { Ok(()) } +#[tokio::test] +async fn read

[GitHub] [arrow] lidavidm merged pull request #13694: ARROW-16997: [Doc][Dev] Update arrow/dev README

2022-07-24 Thread GitBox
lidavidm merged PR #13694: URL: https://github.com/apache/arrow/pull/13694

[GitHub] [arrow] github-actions[bot] commented on pull request #13696: ARROW-17191: [C++][FlightRPC] Handle inlined slices after concatenation

2022-07-24 Thread GitBox
github-actions[bot] commented on PR #13696: URL: https://github.com/apache/arrow/pull/13696#issuecomment-1193371990 https://issues.apache.org/jira/browse/ARROW-17191

[GitHub] [arrow] lidavidm opened a new pull request, #13696: ARROW-17191: [C++][FlightRPC] Handle inlined slices after concatenation

2022-07-24 Thread GitBox
lidavidm opened a new pull request, #13696: URL: https://github.com/apache/arrow/pull/13696 See the JIRA for details, but essentially: data was getting corrupted since allocating a new `shared_ptr` control block was overwriting the data in a `GrpcBuffer`. This happened because the data was

[GitHub] [arrow-rs] heyrutvik commented on pull request #2038: Implement FromIterator for Builders

2022-07-24 Thread GitBox
heyrutvik commented on PR #2038: URL: https://github.com/apache/arrow-rs/pull/2038#issuecomment-1193369899 > Perhaps we could split this up and get the change for the primitive readers in or something? Hey @tustvold, apologies for the late reply. I didn't get much time to work on thi

[GitHub] [arrow] krcrouse commented on pull request #13126: ARROW-12526: Pre-generating pyarrow.compute and creating a docstring additions system for pyarrow functions

2022-07-24 Thread GitBox
krcrouse commented on PR #13126: URL: https://github.com/apache/arrow/pull/13126#issuecomment-1193349631 Hi, @jorisvandenbossche and @pitrou, I've pushed new updates to this branch based on the comments, including the fully generated source files in `pyarrow/_compute_generated.py`

[GitHub] [arrow] adzcai commented on pull request #13157: ARROW-16584: [Java] Java JNI with S3 support

2022-07-24 Thread GitBox
adzcai commented on PR #13157: URL: https://github.com/apache/arrow/pull/13157#issuecomment-1193348766 Hi, just wanted to ask for a current status update?

[GitHub] [arrow-datafusion] codecov-commenter commented on pull request #2958: Simplify expressions with `NOT` clause

2022-07-24 Thread GitBox
codecov-commenter commented on PR #2958: URL: https://github.com/apache/arrow-datafusion/pull/2958#issuecomment-1193347032 # [Codecov](https://codecov.io/gh/apache/arrow-datafusion/pull/2958?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_

[GitHub] [arrow-rs] ursabot commented on pull request #2111: Support skip_def_levels (only max_def_levels=1) for ColumnLevelDecoder

2022-07-24 Thread GitBox
ursabot commented on PR #2111: URL: https://github.com/apache/arrow-rs/pull/2111#issuecomment-1193345271 Benchmark runs are scheduled for baseline = 7746e7dd4eb2be66bc6aa84b9dcebd2a76036d00 and contender = fce66260015642e691f246d29666d8f9c3197f8a. fce66260015642e691f246d29666d8f9c3197f8a i

[GitHub] [arrow-rs] tustvold commented on issue #2153: Use vectorized unpacking in ColumnLevelDecoderImpl for ColumnLevelDecoderImpl

2022-07-24 Thread GitBox
tustvold commented on issue #2153: URL: https://github.com/apache/arrow-rs/issues/2153#issuecomment-1193344551 FWIW it would be even more optimal to push the skip down to the RLEDecoder, but that might be more complex to implement
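
The idea of pushing a skip down into a run-length decoder can be sketched roughly as follows. This is only an illustration under stated assumptions: `Run`, `RunDecoder`, `skip` and `next_value` are hypothetical names invented here, the runs are assumed to be already parsed, and none of this is the arrow-rs `RleDecoder` API. The point it shows is that skipping `n` values only needs to adjust run counters, without materializing any value.

```rust
/// One run of `count` repetitions of `value` (hypothetical type, for illustration only).
#[derive(Debug, Clone, Copy, PartialEq)]
struct Run {
    value: u16,
    count: usize,
}

/// Hypothetical decoder over pre-parsed runs; NOT the arrow-rs `RleDecoder`.
struct RunDecoder {
    runs: Vec<Run>,
    run_idx: usize,  // index of the current run
    consumed: usize, // values already consumed from the current run
}

impl RunDecoder {
    fn new(runs: Vec<Run>) -> Self {
        Self { runs, run_idx: 0, consumed: 0 }
    }

    /// Skip up to `n` values by adjusting counters only; returns how many were skipped.
    fn skip(&mut self, n: usize) -> usize {
        let mut skipped = 0;
        while skipped < n && self.run_idx < self.runs.len() {
            let remaining = self.runs[self.run_idx].count - self.consumed;
            let take = remaining.min(n - skipped);
            self.consumed += take;
            skipped += take;
            // Move to the next run once the current one is exhausted.
            if self.consumed == self.runs[self.run_idx].count {
                self.run_idx += 1;
                self.consumed = 0;
            }
        }
        skipped
    }

    /// Decode a single value, or `None` when all runs are exhausted.
    fn next_value(&mut self) -> Option<u16> {
        while self.run_idx < self.runs.len() {
            let run = self.runs[self.run_idx];
            if self.consumed < run.count {
                self.consumed += 1;
                if self.consumed == run.count {
                    self.run_idx += 1;
                    self.consumed = 0;
                }
                return Some(run.value);
            }
            // Zero-length run: nothing to read, advance.
            self.run_idx += 1;
            self.consumed = 0;
        }
        None
    }
}
```

With runs of lengths 3, 4 and 2, a call to `skip(5)` in this sketch crosses the first run boundary in two loop iterations, independent of the run lengths themselves.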

[GitHub] [arrow-rs] tustvold closed issue #2107: Support skip_def_levels for ColumnLevelDecoder

2022-07-24 Thread GitBox
tustvold closed issue #2107: Support skip_def_levels for ColumnLevelDecoder URL: https://github.com/apache/arrow-rs/issues/2107

[GitHub] [arrow-rs] tustvold merged pull request #2111: Support skip_def_levels (only max_def_levels=1) for ColumnLevelDecoder

2022-07-24 Thread GitBox
tustvold merged PR #2111: URL: https://github.com/apache/arrow-rs/pull/2111

[GitHub] [arrow-rs] Ted-Jiang commented on a diff in pull request #2111: Support skip_def_levels (only max_def_levels=1) for ColumnLevelDecoder

2022-07-24 Thread GitBox
Ted-Jiang commented on code in PR #2111: URL: https://github.com/apache/arrow-rs/pull/2111#discussion_r928274742 ## parquet/src/arrow/record_reader/definition_levels.rs: ## @@ -248,7 +254,9 @@ struct PackedDecoder { impl PackedDecoder { fn next_rle_block(&mut self) -> Re

[GitHub] [arrow-datafusion] AssHero commented on pull request #2958: Simplify expressions with `NOT` clause

2022-07-24 Thread GitBox
AssHero commented on PR #2958: URL: https://github.com/apache/arrow-datafusion/pull/2958#issuecomment-1193339353 Refine the code according to the comments.

[GitHub] [arrow-rs] Ted-Jiang commented on a diff in pull request #2111: Support skip_def_levels (only max_def_levels=1) for ColumnLevelDecoder

2022-07-24 Thread GitBox
Ted-Jiang commented on code in PR #2111: URL: https://github.com/apache/arrow-rs/pull/2111#discussion_r928273409 ## parquet/src/column/reader/decoder.rs: ## @@ -318,10 +318,41 @@ impl ColumnLevelDecoder for ColumnLevelDecoderImpl { impl DefinitionLevelDecoder for ColumnLevelDec

[GitHub] [arrow-rs] Ted-Jiang opened a new issue, #2153: Use vectorized unpacking in ColumnLevelDecoderImpl for ColumnLevelDecoderImpl

2022-07-24 Thread GitBox
Ted-Jiang opened a new issue, #2153: URL: https://github.com/apache/arrow-rs/issues/2153 It might be faster to decode to a temporary buffer, to allow vectorized unpacking, but definitely something that can be done as a follow up _Originally posted by @tustvold in https://github.com/a
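
As a rough illustration of the suggestion quoted above, the sketch below unpacks 1-bit levels into a small temporary buffer one batch at a time and then processes the whole batch. It assumes a maximum definition level of 1 and least-significant-bit-first packing; `unpack_byte` and `count_defined` are hypothetical helpers written for this example and are not part of the arrow-rs or parquet crates.

```rust
/// Unpack the 8 one-bit levels of `byte` into `out`, least-significant bit first.
/// (Hypothetical helper; the bit order is an assumption of this sketch.)
fn unpack_byte(byte: u8, out: &mut [u8; 8]) {
    for (i, slot) in out.iter_mut().enumerate() {
        *slot = (byte >> i) & 1;
    }
}

/// Count how many of the first `num_levels` levels are 1, decoding into a
/// temporary buffer a batch (here: one byte, 8 levels) at a time.
fn count_defined(packed: &[u8], num_levels: usize) -> usize {
    let mut buf = [0u8; 8];
    let mut remaining = num_levels;
    let mut defined = 0;
    for &byte in packed {
        if remaining == 0 {
            break;
        }
        unpack_byte(byte, &mut buf);
        // Only the first `take` entries of the batch are valid levels.
        let take = remaining.min(8);
        defined += buf[..take].iter().map(|&b| b as usize).sum::<usize>();
        remaining -= take;
    }
    defined
}
```

Working on a fixed-size buffer keeps the inner loops free of per-value branching, which is the property that tends to let a compiler vectorize them; whether that pays off in the real decoder would need benchmarking.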

[GitHub] [arrow-rs] tustvold commented on a diff in pull request #2111: Support skip_def_levels (only max_def_levels=1) for ColumnLevelDecoder

2022-07-24 Thread GitBox
tustvold commented on code in PR #2111: URL: https://github.com/apache/arrow-rs/pull/2111#discussion_r928273217 ## parquet/src/arrow/record_reader/definition_levels.rs: ## @@ -248,7 +254,9 @@ struct PackedDecoder { impl PackedDecoder { fn next_rle_block(&mut self) -> Res

[GitHub] [arrow-rs] Ted-Jiang commented on a diff in pull request #2111: Support skip_def_levels (only max_def_levels=1) for ColumnLevelDecoder

2022-07-24 Thread GitBox
Ted-Jiang commented on code in PR #2111: URL: https://github.com/apache/arrow-rs/pull/2111#discussion_r928273120 ## parquet/src/arrow/record_reader/definition_levels.rs: ## @@ -248,7 +254,9 @@ struct PackedDecoder { impl PackedDecoder { fn next_rle_block(&mut self) -> Re

[GitHub] [arrow-datafusion-python] kylebrooks-8451 opened a new issue, #12: Implement DataFrame execute_stream and execute_stream_partitioned

2022-07-24 Thread GitBox
kylebrooks-8451 opened a new issue, #12: URL: https://github.com/apache/arrow-datafusion-python/issues/12 arrow-rs now has [zero-copy to PyArrow for RecordBatchReaders](https://github.com/apache/arrow-rs/blob/b2cf02c7a8a5027d037fc359323bc0ed45b943de/arrow/src/pyarrow.rs#L204). We coul

[GitHub] [arrow] github-actions[bot] commented on pull request #13311: ARROW-16340: [Python] Move all Python related code into PyArrow

2022-07-24 Thread GitBox
github-actions[bot] commented on PR #13311: URL: https://github.com/apache/arrow/pull/13311#issuecomment-119029 Revision: a9a3e912adf70b18c8cd7c572221f103414dcea8 Submitted crossbow builds: [ursacomputing/crossbow @ actions-167817167a](https://github.com/ursacomputing/crossbow/bra

[GitHub] [arrow] AlenkaF commented on pull request #13311: ARROW-16340: [Python] Move all Python related code into PyArrow

2022-07-24 Thread GitBox
AlenkaF commented on PR #13311: URL: https://github.com/apache/arrow/pull/13311#issuecomment-1193332836 @github-actions crossbow submit *python*

[GitHub] [arrow] PursuitOfDataScience opened a new issue, #13695: read_csv_arrow() does not support ".csv.gz"

2022-07-24 Thread GitBox
PursuitOfDataScience opened a new issue, #13695: URL: https://github.com/apache/arrow/issues/13695 Hi, I tried to read some ".csv.gz" files by using `read_csv_arrow()` and it gave me such an error message "Error in url(file, open = "rb") : URL scheme unsupported by this method." How

[GitHub] [arrow] github-actions[bot] commented on pull request #13694: ARROW-16997: [Doc][Dev] Update arrow/dev README

2022-07-24 Thread GitBox
github-actions[bot] commented on PR #13694: URL: https://github.com/apache/arrow/pull/13694#issuecomment-1193332300 https://issues.apache.org/jira/browse/ARROW-16997

[GitHub] [arrow] github-actions[bot] commented on pull request #13694: ARROW-16997: [Doc][Dev] Update arrow/dev README

2022-07-24 Thread GitBox
github-actions[bot] commented on PR #13694: URL: https://github.com/apache/arrow/pull/13694#issuecomment-1193332305 :warning: Ticket **has not been started in JIRA**, please click 'Start Progress'.

[GitHub] [arrow-datafusion] AssHero commented on a diff in pull request #2958: Simplify expressions with `NOT` clause

2022-07-24 Thread GitBox
AssHero commented on code in PR #2958: URL: https://github.com/apache/arrow-datafusion/pull/2958#discussion_r928268943 ## datafusion/optimizer/src/simplify_expressions.rs: ## @@ -185,6 +181,97 @@ fn as_bool_lit(expr: Expr) -> Option { } } +/// negate a Not clause +/// in

[GitHub] [arrow-datafusion-python] kylebrooks-8451 opened a new issue, #11: Support fsspec based filesystems as ObjectStores

2022-07-24 Thread GitBox
kylebrooks-8451 opened a new issue, #11: URL: https://github.com/apache/arrow-datafusion-python/issues/11 Allow for using a Python fsspec filesystem as an ObjectStore which would allow DataFusion Python to support any filesystem supported by fsspec. I have this working for an internal proj

[GitHub] [arrow] github-actions[bot] commented on pull request #13311: ARROW-16340: [Python] Move all Python related code into PyArrow

2022-07-24 Thread GitBox
github-actions[bot] commented on PR #13311: URL: https://github.com/apache/arrow/pull/13311#issuecomment-1193324689 Revision: a9a3e912adf70b18c8cd7c572221f103414dcea8 Submitted crossbow builds: [ursacomputing/crossbow @ actions-b43d962138](https://github.com/ursacomputing/crossbow/bra

[GitHub] [arrow] AlenkaF commented on pull request #13311: ARROW-16340: [Python] Move all Python related code into PyArrow

2022-07-24 Thread GitBox
AlenkaF commented on PR #13311: URL: https://github.com/apache/arrow/pull/13311#issuecomment-1193324349 @github-actions crossbow submit *wheel*

[GitHub] [arrow] github-actions[bot] commented on pull request #13693: ARROW-17092: [Docs] Add note about "Feather" to the IPC file format document

2022-07-24 Thread GitBox
github-actions[bot] commented on PR #13693: URL: https://github.com/apache/arrow/pull/13693#issuecomment-1193318787 https://issues.apache.org/jira/browse/ARROW-17092

[GitHub] [arrow] github-actions[bot] commented on pull request #13693: ARROW-17092: [Docs] Add note about "Feather" to the IPC file format document

2022-07-24 Thread GitBox
github-actions[bot] commented on PR #13693: URL: https://github.com/apache/arrow/pull/13693#issuecomment-1193318790 :warning: Ticket **has not been started in JIRA**, please click 'Start Progress'.

[GitHub] [arrow-rs] tustvold commented on a diff in pull request #2111: Support skip_def_levels (only max_def_levels=1) for ColumnLevelDecoder

2022-07-24 Thread GitBox
tustvold commented on code in PR #2111: URL: https://github.com/apache/arrow-rs/pull/2111#discussion_r928243863 ## parquet/src/arrow/record_reader/definition_levels.rs: ## @@ -248,7 +254,9 @@ struct PackedDecoder { impl PackedDecoder { fn next_rle_block(&mut self) -> Res

[GitHub] [arrow-rs] tustvold commented on a diff in pull request #2148: Port `object_store` integration tests, use github actions

2022-07-24 Thread GitBox
tustvold commented on code in PR #2148: URL: https://github.com/apache/arrow-rs/pull/2148#discussion_r928243422 ## object_store/src/gcp.rs: ## @@ -575,6 +583,18 @@ mod test { service_account: String, } +impl GoogleCloudConfig { +fn build(self) -> Resu

[GitHub] [arrow-rs] Ted-Jiang commented on a diff in pull request #2111: Support skip_def_levels (only max_def_levels=1) for ColumnLevelDecoder

2022-07-24 Thread GitBox
Ted-Jiang commented on code in PR #2111: URL: https://github.com/apache/arrow-rs/pull/2111#discussion_r928241471 ## parquet/src/column/reader/decoder.rs: ## @@ -318,10 +318,41 @@ impl ColumnLevelDecoder for ColumnLevelDecoderImpl { impl DefinitionLevelDecoder for ColumnLevelDec

[GitHub] [arrow-datafusion] codecov-commenter commented on pull request #2960: test: add test for parquet decimal and pruning for decimal column

2022-07-24 Thread GitBox
codecov-commenter commented on PR #2960: URL: https://github.com/apache/arrow-datafusion/pull/2960#issuecomment-1193290734 # [Codecov](https://codecov.io/gh/apache/arrow-datafusion/pull/2960?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_

[GitHub] [arrow-rs] Ted-Jiang commented on a diff in pull request #2111: Support skip_def_levels (only max_def_levels=1) for ColumnLevelDecoder

2022-07-24 Thread GitBox
Ted-Jiang commented on code in PR #2111: URL: https://github.com/apache/arrow-rs/pull/2111#discussion_r928235258 ## parquet/src/arrow/record_reader/definition_levels.rs: ## @@ -226,10 +226,27 @@ impl ColumnLevelDecoder for DefinitionLevelBufferDecoder { impl DefinitionLevelDeco

[GitHub] [arrow-datafusion] liukun4515 commented on a diff in pull request #2960: test: add test for parquet decimal and pruning for decimal column

2022-07-24 Thread GitBox
liukun4515 commented on code in PR #2960: URL: https://github.com/apache/arrow-datafusion/pull/2960#discussion_r928228561 ## datafusion/core/src/datasource/file_format/parquet.rs: ## @@ -1023,6 +1023,49 @@ mod tests { Ok(()) } +#[tokio::test] +async fn re

[GitHub] [arrow-datafusion] liukun4515 opened a new pull request, #2960: test: add test for parquet decimal and pruning for decimal column

2022-07-24 Thread GitBox
liukun4515 opened a new pull request, #2960: URL: https://github.com/apache/arrow-datafusion/pull/2960 # Which issue does this PR close? Closes #. # Rationale for this change # What changes are included in this PR? # Are there any user-facing chang

[GitHub] [arrow] ursabot commented on pull request #13650: ARROW-16703: [R] Refactor map_batches() so it can stream results

2022-07-24 Thread GitBox
ursabot commented on PR #13650: URL: https://github.com/apache/arrow/pull/13650#issuecomment-1193265546 Benchmark runs are scheduled for baseline = ee2e9448c8565820ba38a2df9e44ab6055e5df1d and contender = 70904dffef25a8c883a1a829c66a1d30a7d9c249. 70904dffef25a8c883a1a829c66a1d30a7d9c249 is