[GitHub] [arrow] js8544 commented on issue #13981: [C++][Compute]Performance of arrow::compute compared to raw operations on `arrow::Array`

2022-08-31 Thread GitBox
js8544 commented on issue #13981: URL: https://github.com/apache/arrow/issues/13981#issuecomment-1233828791 @drin Thanks for your reply. I checked my compilation setup and discovered that I somehow compiled arrow with no optimization at all, i.e. O0. I changed the following options and rer

[GitHub] [arrow] js8544 commented on issue #13981: [C++][Compute]Performance of arrow::compute compared to raw operations on `arrow::Array`

2022-08-31 Thread GitBox
js8544 commented on issue #13981: URL: https://github.com/apache/arrow/issues/13981#issuecomment-1233826245 @drin Thanks for your reply. I checked my compilation setup and discovered that I somehow compiled arrow without any optimization. -- This is an automated message from the Apache G

[GitHub] [arrow] kou commented on pull request #13991: ARROW-17545: [C++][CI] Mandate C++17 instead of C++11

2022-08-31 Thread GitBox
kou commented on PR #13991: URL: https://github.com/apache/arrow/pull/13991#issuecomment-1233804471 Not reproduced... I think that we can defer this problem as a follow-up task because the package build works and the general Apache Arrow C++ package test passes. (The failure occurred afte

[GitHub] [arrow-rs] HaoYang670 opened a new pull request, #2621: Add `Decimal128` to `DataType::is_numeric` and clean the numeric casting code

2022-08-31 Thread GitBox
HaoYang670 opened a new pull request, #2621: URL: https://github.com/apache/arrow-rs/pull/2621 # Which issue does this PR close? Closes #2611. # Rationale for this change # What changes are included in this PR? # Are there any user-facing c

[GitHub] [arrow] marsupialtail commented on pull request #13799: ARROW-17299: [C++][Python] Expose the Scanner kDefaultBatchReadahead and kDefaultFragmentReadahead parameters

2022-08-31 Thread GitBox
marsupialtail commented on PR #13799: URL: https://github.com/apache/arrow/pull/13799#issuecomment-1233741477 done

[GitHub] [arrow-datafusion] liukun4515 commented on a diff in pull request #3185: optimizer: add framework for the rule of pre-add cast to the literal in comparison binary

2022-08-31 Thread GitBox
liukun4515 commented on code in PR #3185: URL: https://github.com/apache/arrow-datafusion/pull/3185#discussion_r960204356 ## datafusion/optimizer/src/pre_cast_lit_in_comparison.rs: ## @@ -0,0 +1,311 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more co

[GitHub] [arrow-datafusion] liukun4515 commented on pull request #3309: add SQL support for tinyint and all unsigned INTs

2022-08-31 Thread GitBox
liukun4515 commented on PR #3309: URL: https://github.com/apache/arrow-datafusion/pull/3309#issuecomment-1233717164 > I've added the unsigned ints to this PR as well, matching what's in the issue I have the opposite opinion for supporting the unsigned value in SQL Level now.

[GitHub] [arrow-datafusion] liukun4515 commented on pull request #3309: add SQL support for tinyint and all unsigned INTs

2022-08-31 Thread GitBox
liukun4515 commented on PR #3309: URL: https://github.com/apache/arrow-datafusion/pull/3309#issuecomment-1233709759 I think signed values such as small int are OK for me, but unsigned values make me concerned about compatibility and consistent behavior with other database systems like

[GitHub] [arrow] kou commented on pull request #13991: ARROW-17545: [C++][CI] Mandate C++17 instead of C++11

2022-08-31 Thread GitBox
kou commented on PR #13991: URL: https://github.com/apache/arrow/pull/13991#issuecomment-1233709178 > Hmm, both `centos-9-stream` builds have failed in the test step and it's not really clear why. @kou Would you have any clue? Example: https://github.com/ursacomputing/crossbow/runs/81158451

[GitHub] [arrow-datafusion] liukun4515 commented on pull request #3309: add SQL support for tinyint and all unsigned INTs

2022-08-31 Thread GitBox
liukun4515 commented on PR #3309: URL: https://github.com/apache/arrow-datafusion/pull/3309#issuecomment-1233706833 Please hold this PR; I want to review it and have concerns about the data types supported at the SQL level. @alamb

[GitHub] [arrow-datafusion] liukun4515 commented on a diff in pull request #3301: Finish integrating `Expr::Is[Not]True` and similar expressions

2022-08-31 Thread GitBox
liukun4515 commented on code in PR #3301: URL: https://github.com/apache/arrow-datafusion/pull/3301#discussion_r960195458 ## datafusion/physical-expr/src/planner.rs: ## @@ -83,6 +83,84 @@ pub fn create_physical_expr( } } } +Expr::Is

[GitHub] [arrow-datafusion] liukun4515 commented on issue #3304: Add type coercion/validation for `Is[Not]True` and similar statements

2022-08-31 Thread GitBox
liukun4515 commented on issue #3304: URL: https://github.com/apache/arrow-datafusion/issues/3304#issuecomment-1233704567 The left expr must be of BOOLEAN data type, and can't be cast to the boolean type

[GitHub] [arrow-datafusion] liukun4515 commented on issue #3304: Add type coercion/validation for `Is[Not]True` and similar statements

2022-08-31 Thread GitBox
liukun4515 commented on issue #3304: URL: https://github.com/apache/arrow-datafusion/issues/3304#issuecomment-1233704074 ``` boolean IS TRUE → boolean: Test whether boolean expression yields true. true IS TRUE → t; NULL::boolean IS TRUE → f (rather than NULL) -- boolean IS NOT T
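
The `IS TRUE` rule quoted in this comment can be illustrated with a small sketch. This is not DataFusion code: it is a hypothetical Python model in which `None` stands in for SQL NULL, and the function name `is_true` is invented for illustration. It also assumes the stricter reading from the related comment on this issue, namely that a non-boolean operand is rejected rather than cast.

```python
from typing import Optional


def is_true(value: Optional[bool]) -> bool:
    """Model of SQL `value IS TRUE` (hypothetical helper, not a DataFusion API).

    None plays the role of SQL NULL. The result is always a plain
    boolean: NULL IS TRUE yields False rather than NULL.
    """
    # Assumption: non-boolean operands are rejected instead of being cast.
    if value is not None and not isinstance(value, bool):
        raise TypeError("IS TRUE expects a BOOLEAN operand")
    return value is True


if __name__ == "__main__":
    for operand in (True, False, None):
        print(f"{operand!r} IS TRUE -> {is_true(operand)}")
```

The point of the sketch is that the result type is never nullable, which is what distinguishes `x IS TRUE` from a plain comparison such as `x = true` when `x` is NULL.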

[GitHub] [arrow-datafusion] liukun4515 commented on issue #3304: Add type coercion/validation for `Is[Not]True` and similar statements

2022-08-31 Thread GitBox
liukun4515 commented on issue #3304: URL: https://github.com/apache/arrow-datafusion/issues/3304#issuecomment-1233703756 @andygrove we should follow this doc from PG https://www.postgresql.org/docs/current/functions-comparison.html

[GitHub] [arrow] ursabot commented on pull request #14001: ARROW-17079: [C++] Raise proper error message instead of error code for S3 errors

2022-08-31 Thread GitBox
ursabot commented on PR #14001: URL: https://github.com/apache/arrow/pull/14001#issuecomment-1233694871 ['Python', 'R'] benchmarks have high level of regressions. [ursa-i9-9960x](https://conbench.ursa.dev/compare/runs/bd2cfe5e04574f989ceb06b5213b9456...a253e80a4b1a4638b0b7056596fcbb32/)

[GitHub] [arrow] ursabot commented on pull request #14001: ARROW-17079: [C++] Raise proper error message instead of error code for S3 errors

2022-08-31 Thread GitBox
ursabot commented on PR #14001: URL: https://github.com/apache/arrow/pull/14001#issuecomment-1233694765 Benchmark runs are scheduled for baseline = 13a7b605ede88ca15b053f119909c48d0919c6f8 and contender = 46f38dca3decfae329da4e32992aa5e286802af6. 46f38dca3decfae329da4e32992aa5e286802af6 is

[GitHub] [arrow-datafusion] iajoiner commented on issue #3312: Remove panics from `datafusion-expr` crate

2022-08-31 Thread GitBox
iajoiner commented on issue #3312: URL: https://github.com/apache/arrow-datafusion/issues/3312#issuecomment-1233694408 I will take this one.

[GitHub] [arrow] cyb70289 commented on pull request #14016: ARROW-17587: [Go] Cast From Extension Types

2022-08-31 Thread GitBox
cyb70289 commented on PR #14016: URL: https://github.com/apache/arrow/pull/14016#issuecomment-1233676384 Does this PR depend on the previous one?

[GitHub] [arrow-ballista] andygrove opened a new pull request, #183: Prepare for 0.8.0 release

2022-08-31 Thread GitBox
andygrove opened a new pull request, #183: URL: https://github.com/apache/arrow-ballista/pull/183 # Which issue does this PR close? N/A # Rationale for this change I am planning on releasing Ballista 0.8.0 shortly after DataFusion 12.0.0 is released.

[GitHub] [arrow-datafusion] andygrove commented on a diff in pull request #3301: Finish integrating `Expr::Is[Not]True` and similar expressions

2022-08-31 Thread GitBox
andygrove commented on code in PR #3301: URL: https://github.com/apache/arrow-datafusion/pull/3301#discussion_r960173802 ## datafusion/physical-expr/src/planner.rs: ## @@ -83,6 +83,84 @@ pub fn create_physical_expr( } } } +Expr::IsT

[GitHub] [arrow] Oooorchid closed issue #13703: [Java][C++] Java VectorSchemaRoot to C++ RecordBatchreader

2022-08-31 Thread GitBox
Oooorchid closed issue #13703: [Java][C++] Java VectorSchemaRoot to C++ RecordBatchreader URL: https://github.com/apache/arrow/issues/13703

[GitHub] [arrow] kou commented on pull request #13911: ARROW-17081: [Java][Datasets] Move JNI build configuration from cpp/ to java/

2022-08-31 Thread GitBox
kou commented on PR #13911: URL: https://github.com/apache/arrow/pull/13911#issuecomment-1233636032 @lwhite1 @davisusanibar This is ready to review/merge. (But this is difficult to review without CMake skill...) I think that "Java / AMD64 Windows Server 2022 Java JDK 11" failure is

[GitHub] [arrow] cyb70289 commented on a diff in pull request #14015: ARROW-17586: [Go] String To Numeric cast functions

2022-08-31 Thread GitBox
cyb70289 commented on code in PR #14015: URL: https://github.com/apache/arrow/pull/14015#discussion_r960142254 ## go/arrow/compute/cast_test.go: ## @@ -1211,6 +1214,94 @@ func (c *CastSuite) TestDecimalToFloating() { } } +func (c *CastSuite) TestStringToInt() { +

[GitHub] [arrow-datafusion] andygrove opened a new issue, #3319: Remove panics from `datafusion-core` crate

2022-08-31 Thread GitBox
andygrove opened a new issue, #3319: URL: https://github.com/apache/arrow-datafusion/issues/3319 **Is your feature request related to a problem or challenge? Please describe what you are trying to do.** ``` "/home/andy/git/apache/arrow-datafusion/datafusion/core/src/avro_to_arrow/arro

[GitHub] [arrow-datafusion] andygrove opened a new issue, #3318: Remove panics from `datafusion-proto` crate

2022-08-31 Thread GitBox
andygrove opened a new issue, #3318: URL: https://github.com/apache/arrow-datafusion/issues/3318 **Is your feature request related to a problem or challenge? Please describe what you are trying to do.** ``` "/home/andy/git/apache/arrow-datafusion/datafusion/proto/build.rs":33 let

[GitHub] [arrow-datafusion] andygrove opened a new issue, #3317: Remove panics from `datafusion-row` crate

2022-08-31 Thread GitBox
andygrove opened a new issue, #3317: URL: https://github.com/apache/arrow-datafusion/issues/3317 **Is your feature request related to a problem or challenge? Please describe what you are trying to do.** ``` "/home/andy/git/apache/arrow-datafusion/datafusion/row/src/accessor.rs":234

[GitHub] [arrow-datafusion] andygrove opened a new issue, #3316: Remove panics from `datafusion-physical-expr` crate

2022-08-31 Thread GitBox
andygrove opened a new issue, #3316: URL: https://github.com/apache/arrow-datafusion/issues/3316 **Is your feature request related to a problem or challenge? Please describe what you are trying to do.** ``` "/home/andy/git/apache/arrow-datafusion/datafusion/physical-expr/src/aggregate

[GitHub] [arrow-datafusion] andygrove opened a new issue, #3315: Remove panics from `datafusion-sql` crate

2022-08-31 Thread GitBox
andygrove opened a new issue, #3315: URL: https://github.com/apache/arrow-datafusion/issues/3315 **Is your feature request related to a problem or challenge? Please describe what you are trying to do.** ``` "/home/andy/git/apache/arrow-datafusion/datafusion/sql/examples/sql.rs":46

[GitHub] [arrow-datafusion] andygrove opened a new issue, #3314: Remove panics from `datafusion-optimizer` crate

2022-08-31 Thread GitBox
andygrove opened a new issue, #3314: URL: https://github.com/apache/arrow-datafusion/issues/3314 **Is your feature request related to a problem or challenge? Please describe what you are trying to do.** ``` "/home/andy/git/apache/arrow-datafusion/datafusion/optimizer/src/common_subexp

[GitHub] [arrow-datafusion] andygrove opened a new issue, #3313: Remove panics from `datafusion-common` crate

2022-08-31 Thread GitBox
andygrove opened a new issue, #3313: URL: https://github.com/apache/arrow-datafusion/issues/3313 **Is your feature request related to a problem or challenge? Please describe what you are trying to do.** ``` "/home/andy/git/apache/arrow-datafusion/datafusion/common/src/from_slice.rs":6

[GitHub] [arrow-datafusion] andygrove opened a new issue, #3312: Remove panics from `datafusion-expr` crate

2022-08-31 Thread GitBox
andygrove opened a new issue, #3312: URL: https://github.com/apache/arrow-datafusion/issues/3312 **Is your feature request related to a problem or challenge? Please describe what you are trying to do.** ``` "/home/andy/git/apache/arrow-datafusion/datafusion/expr/src/binary_rule.rs

[GitHub] [arrow] cyb70289 commented on a diff in pull request #13640: ARROW-14635: [Python][C++] add O_DIRECT support to writes

2022-08-31 Thread GitBox
cyb70289 commented on code in PR #13640: URL: https://github.com/apache/arrow/pull/13640#discussion_r960137984 ## cpp/src/arrow/io/file.cc: ## @@ -378,6 +378,109 @@ Status FileOutputStream::Write(const void* data, int64_t length) { int FileOutputStream::file_descriptor() con

[GitHub] [arrow] pcmoritz merged pull request #14001: ARROW-17079: [C++] Raise proper error message instead of error code for S3 errors

2022-08-31 Thread GitBox
pcmoritz merged PR #14001: URL: https://github.com/apache/arrow/pull/14001

[GitHub] [arrow-ballista] avantgardnerio commented on pull request #181: Update task status to the its job curator scheduler

2022-08-31 Thread GitBox
avantgardnerio commented on PR #181: URL: https://github.com/apache/arrow-ballista/pull/181#issuecomment-1233586534 @alamb might be able to help with reviews as well.

[GitHub] [arrow-rs] liukun4515 commented on pull request #2586: update flight doc and change the result of SchemaResult

2022-08-31 Thread GitBox
liukun4515 commented on PR #2586: URL: https://github.com/apache/arrow-rs/pull/2586#issuecomment-1233570047 I think we have many thoughts about `protoc` and the generated file; can we discuss them in https://github.com/apache/arrow-rs/issues/2616 In this PR, how about we just focus

[GitHub] [arrow-datafusion] alamb commented on a diff in pull request #3309: add SQL support for tinyint and all unsigned INTs

2022-08-31 Thread GitBox
alamb commented on code in PR #3309: URL: https://github.com/apache/arrow-datafusion/pull/3309#discussion_r960101701 ## docs/source/user-guide/sql/data_types.md: ## @@ -34,15 +35,20 @@ are mapped to [Arrow data types](https://docs.rs/arrow/latest/arrow/datatypes/en ## Numeri

[GitHub] [arrow] wjones127 commented on a diff in pull request #14018: ARROW-14161: [C++][Docs] Improve Parquet C++ docs

2022-08-31 Thread GitBox
wjones127 commented on code in PR #14018: URL: https://github.com/apache/arrow/pull/14018#discussion_r960079278 ## cpp/src/parquet/properties.h: ## @@ -607,7 +622,7 @@ static constexpr bool kArrowDefaultUseThreads = false; // Default number of rows to read when using ::arrow::R

[GitHub] [arrow] github-actions[bot] commented on pull request #14018: ARROW-14161: [C++][Docs] Improve Parquet C++ docs

2022-08-31 Thread GitBox
github-actions[bot] commented on PR #14018: URL: https://github.com/apache/arrow/pull/14018#issuecomment-1233493508 https://issues.apache.org/jira/browse/ARROW-14161

[GitHub] [arrow] wjones127 opened a new pull request, #14018: ARROW-14161: [C++][Docs] Improve Parquet C++ docs

2022-08-31 Thread GitBox
wjones127 opened a new pull request, #14018: URL: https://github.com/apache/arrow/pull/14018 Also improving a few APIs along the way.

[GitHub] [arrow] ianmcook commented on pull request #14017: ARROW-17589: [Docs] Clarify that C data interface is not strictly for within a single process

2022-08-31 Thread GitBox
ianmcook commented on PR #14017: URL: https://github.com/apache/arrow/pull/14017#issuecomment-1233487671 I am not sure whether this distinction between "running in the same process" and "able to access the same memory address space" is of real practical significance. Is it actually possible

[GitHub] [arrow] zeroshade commented on a diff in pull request #14006: ARROW-17551: [Go] Implement Temporal Cast Functions

2022-08-31 Thread GitBox
zeroshade commented on code in PR #14006: URL: https://github.com/apache/arrow/pull/14006#discussion_r960075894 ## go/arrow/compute/internal/kernels/constant_factor_amd64.go: ## @@ -0,0 +1,17 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contribut

[GitHub] [arrow-datafusion] matthewmturner commented on pull request #3311: Custom table provider factories

2022-08-31 Thread GitBox
matthewmturner commented on PR #3311: URL: https://github.com/apache/arrow-datafusion/pull/3311#issuecomment-1233482397 @avantgardnerio This is very cool. Unfortunately I haven't been able to work on datafusion lately but I could definitely see integrating this into datafusion tui again wh

[GitHub] [arrow] lidavidm commented on a diff in pull request #14006: ARROW-17551: [Go] Implement Temporal Cast Functions

2022-08-31 Thread GitBox
lidavidm commented on code in PR #14006: URL: https://github.com/apache/arrow/pull/14006#discussion_r960073199 ## go/arrow/compute/internal/kernels/constant_factor_amd64.go: ## @@ -0,0 +1,17 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributo

[GitHub] [arrow] zeroshade commented on a diff in pull request #14006: ARROW-17551: [Go] Implement Temporal Cast Functions

2022-08-31 Thread GitBox
zeroshade commented on code in PR #14006: URL: https://github.com/apache/arrow/pull/14006#discussion_r960072457 ## go/arrow/compute/internal/kernels/constant_factor_amd64.go: ## @@ -0,0 +1,17 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contribut

[GitHub] [arrow] ursabot commented on pull request #14013: ARROW-17573: [Go][Parquet] ByteArray statistics can cause memory leak

2022-08-31 Thread GitBox
ursabot commented on PR #14013: URL: https://github.com/apache/arrow/pull/14013#issuecomment-1233479944 ['Python', 'R'] benchmarks have high level of regressions. [ursa-i9-9960x](https://conbench.ursa.dev/compare/runs/7eb54c5676ea4a63ac33f34c37f0e58a...bd2cfe5e04574f989ceb06b5213b9456/)

[GitHub] [arrow] ursabot commented on pull request #14013: ARROW-17573: [Go][Parquet] ByteArray statistics can cause memory leak

2022-08-31 Thread GitBox
ursabot commented on PR #14013: URL: https://github.com/apache/arrow/pull/14013#issuecomment-1233479701 Benchmark runs are scheduled for baseline = 0c76c664d7f6f6ca0c094b1eb51f57a17a6860fd and contender = 13a7b605ede88ca15b053f119909c48d0919c6f8. 13a7b605ede88ca15b053f119909c48d0919c6f8 is

[GitHub] [arrow] github-actions[bot] commented on pull request #14017: ARROW-17589: [Docs] Clarify that C data interface is not strictly for within a single process

2022-08-31 Thread GitBox
github-actions[bot] commented on PR #14017: URL: https://github.com/apache/arrow/pull/14017#issuecomment-1233473452 https://issues.apache.org/jira/browse/ARROW-17589

[GitHub] [arrow] pcmoritz commented on pull request #14001: ARROW-17079: [C++] Raise proper error message instead of error code for S3 errors

2022-08-31 Thread GitBox
pcmoritz commented on PR #14001: URL: https://github.com/apache/arrow/pull/14001#issuecomment-1233456609 Thanks, I updated the PR and left a comment about it (one of the builds is using an older version of the SDK that doesn't have this error in the enum yet). This error was added ~2 years

[GitHub] [arrow-datafusion] avantgardnerio commented on pull request #3311: Custom table provider factories

2022-08-31 Thread GitBox
avantgardnerio commented on PR #3311: URL: https://github.com/apache/arrow-datafusion/pull/3311#issuecomment-1233456171 I guess I should mention @matthewmturner since I linked that ticket.

[GitHub] [arrow] icexelloss commented on a diff in pull request #13880: ARROW-17412: [C++] AsofJoin multiple keys and types

2022-08-31 Thread GitBox
icexelloss commented on code in PR #13880: URL: https://github.com/apache/arrow/pull/13880#discussion_r960054357 ## cpp/src/arrow/compute/exec/asof_join_node.cc: ## @@ -294,10 +452,22 @@ class InputState { // Index of the time col col_index_t time_col_index_; // Index o

[GitHub] [arrow] icexelloss commented on a diff in pull request #13880: ARROW-17412: [C++] AsofJoin multiple keys and types

2022-08-31 Thread GitBox
icexelloss commented on code in PR #13880: URL: https://github.com/apache/arrow/pull/13880#discussion_r960053008 ## cpp/src/arrow/compute/exec/asof_join_node.cc: ## @@ -294,10 +452,22 @@ class InputState { // Index of the time col col_index_t time_col_index_; // Index o

[GitHub] [arrow-rs] viirya commented on a diff in pull request #2608: Cast timestamp array to string array with timezone

2022-08-31 Thread GitBox
viirya commented on code in PR #2608: URL: https://github.com/apache/arrow-rs/pull/2608#discussion_r960034963 ## arrow/src/compute/kernels/mod.rs: ## @@ -35,5 +35,6 @@ pub mod sort; pub mod substring; pub mod take; pub mod temporal; +// pub(crate) use temporal::extract_compon

[GitHub] [arrow] lidavidm commented on a diff in pull request #14006: ARROW-17551: [Go] Implement Temporal Cast Functions

2022-08-31 Thread GitBox
lidavidm commented on code in PR #14006: URL: https://github.com/apache/arrow/pull/14006#discussion_r960021948 ## go/arrow/compute/internal/kernels/constant_factor_amd64.go: ## @@ -0,0 +1,17 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributo

[GitHub] [arrow-rs] viirya commented on a diff in pull request #2608: Cast timestamp array to string array with timezone

2022-08-31 Thread GitBox
viirya commented on code in PR #2608: URL: https://github.com/apache/arrow-rs/pull/2608#discussion_r960034640 ## arrow/src/compute/kernels/temporal.rs: ## @@ -28,33 +28,33 @@ use chrono::format::{parse, Parsed}; use chrono::FixedOffset; macro_rules! extract_component_from_ar

[GitHub] [arrow-datafusion] avantgardnerio commented on pull request #3311: Custom table provider factories

2022-08-31 Thread GitBox
avantgardnerio commented on PR #3311: URL: https://github.com/apache/arrow-datafusion/pull/3311#issuecomment-1233420175 Oh, also related to https://github.com/apache/arrow-datafusion/issues/2025

[GitHub] [arrow-datafusion] andygrove commented on pull request #3311: Custom table provider factories

2022-08-31 Thread GitBox
andygrove commented on PR #3311: URL: https://github.com/apache/arrow-datafusion/pull/3311#issuecomment-1233418173 I think this looks like a great start. Thanks @avantgardnerio

[GitHub] [arrow] kou commented on pull request #13677: ARROW-17089: [Python] Use `.arrow` as extension for IPC file dataset

2022-08-31 Thread GitBox
kou commented on PR #13677: URL: https://github.com/apache/arrow/pull/13677#issuecomment-1233418053 Could you wait for a few days to collect comments from people as much as possible? Then, could you send an email that summarizes this discussion (many people support the option 1.) and ask

[GitHub] [arrow-datafusion] avantgardnerio commented on a diff in pull request #3311: Custom table provider factories

2022-08-31 Thread GitBox
avantgardnerio commented on code in PR #3311: URL: https://github.com/apache/arrow-datafusion/pull/3311#discussion_r960017681 ## datafusion/sql/src/parser.rs: ## @@ -41,10 +41,7 @@ fn parse_file_type(s: &str) -> Result { "NDJSON" => Ok(FileType::NdJson), "CSV"

[GitHub] [arrow-adbc] lidavidm merged pull request #97: [Go][Format] initial attempt at wrapping adbc for database/sql

2022-08-31 Thread GitBox
lidavidm merged PR #97: URL: https://github.com/apache/arrow-adbc/pull/97

[GitHub] [arrow-datafusion] avantgardnerio commented on a diff in pull request #3311: Custom table provider factories

2022-08-31 Thread GitBox
avantgardnerio commented on code in PR #3311: URL: https://github.com/apache/arrow-datafusion/pull/3311#discussion_r960013451 ## datafusion/sql/src/parser.rs: ## @@ -41,10 +41,7 @@ fn parse_file_type(s: &str) -> Result { "NDJSON" => Ok(FileType::NdJson), "CSV"

[GitHub] [arrow-datafusion] avantgardnerio commented on pull request #3311: Custom table provider factories

2022-08-31 Thread GitBox
avantgardnerio commented on PR #3311: URL: https://github.com/apache/arrow-datafusion/pull/3311#issuecomment-1233404922 @andygrove @alamb @houqp This was a quick and dirty implementation, but I thought I would throw it out there to collect your thoughts on it. If something like this is dee

[GitHub] [arrow-datafusion] avantgardnerio opened a new pull request, #3311: Custom table provider factories

2022-08-31 Thread GitBox
avantgardnerio opened a new pull request, #3311: URL: https://github.com/apache/arrow-datafusion/pull/3311 # Which issue does this PR close? Closes #3310. # Rationale for this change I would like to allow users to register custom table types at runtime. # What cha

[GitHub] [arrow] pitrou commented on pull request #13991: ARROW-17545: [C++][CI] Mandate C++17 instead of C++11

2022-08-31 Thread GitBox
pitrou commented on PR #13991: URL: https://github.com/apache/arrow/pull/13991#issuecomment-1233402581 @xhochy What do you think about @h-vetinari's suggestion [just above](https://github.com/apache/arrow/pull/13991#issuecomment-1233328876)?

[GitHub] [arrow-datafusion] avantgardnerio opened a new issue, #3310: Support registration of custom TableProviders through SQL

2022-08-31 Thread GitBox
avantgardnerio opened a new issue, #3310: URL: https://github.com/apache/arrow-datafusion/issues/3310 **Is your feature request related to a problem or challenge? Please describe what you are trying to do.** As a DataFusion / Ballista user, one awesome feature is that I can create cu

[GitHub] [arrow] nealrichardson commented on a diff in pull request #13991: ARROW-17545: [C++][CI] Mandate C++17 instead of C++11

2022-08-31 Thread GitBox
nealrichardson commented on code in PR #13991: URL: https://github.com/apache/arrow/pull/13991#discussion_r959996740 ## r/configure: ## @@ -208,22 +208,19 @@ if grep raspbian /etc/os-release >/dev/null 2>&1; then PKG_LIBS="-latomic $PKG_LIBS" fi -# If libarrow uses the old

[GitHub] [arrow-nanoarrow] nealrichardson commented on a diff in pull request #39: [R] Scaffold nanoarrow_schema, nanoarrow_array, and nanoarrow_array_stream S3 classes

2022-08-31 Thread GitBox
nealrichardson commented on code in PR #39: URL: https://github.com/apache/arrow-nanoarrow/pull/39#discussion_r959988111 ## r/R/pkg-arrow.R: ## @@ -0,0 +1,168 @@ +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOT

[GitHub] [arrow] h-vetinari commented on pull request #13991: ARROW-17545: [C++][CI] Mandate C++17 instead of C++11

2022-08-31 Thread GitBox
h-vetinari commented on PR #13991: URL: https://github.com/apache/arrow/pull/13991#issuecomment-1233375603 > Hmm, how could I do that? Create an issue on the feedstock for someone from the maintainers (I'm not one) to create a branch for 1.8.x; add it to `conda-forge.yml` under `bot/

[GitHub] [arrow-nanoarrow] nealrichardson commented on a diff in pull request #39: [R] Scaffold nanoarrow_schema, nanoarrow_array, and nanoarrow_array_stream S3 classes

2022-08-31 Thread GitBox
nealrichardson commented on code in PR #39: URL: https://github.com/apache/arrow-nanoarrow/pull/39#discussion_r959982497 ## r/NAMESPACE: ## @@ -1,4 +1,33 @@ # Generated by roxygen2: do not edit by hand +S3method(as_nanoarrow_array,Array) +S3method(as_nanoarrow_array,ChunkedAr

[GitHub] [arrow-rs] sunchao commented on a diff in pull request #2608: Cast timestamp array to string array with timezone

2022-08-31 Thread GitBox
sunchao commented on code in PR #2608: URL: https://github.com/apache/arrow-rs/pull/2608#discussion_r959969219 ## arrow/src/compute/kernels/temporal.rs: ## @@ -28,33 +28,33 @@ use chrono::format::{parse, Parsed}; use chrono::FixedOffset; macro_rules! extract_component_from_a

[GitHub] [arrow-datafusion] kmitchener commented on pull request #3309: add tinyint SQL support

2022-08-31 Thread GitBox
kmitchener commented on PR #3309: URL: https://github.com/apache/arrow-datafusion/pull/3309#issuecomment-1233355829 I've added the unsigned ints to this PR as well, matching what's in the issue

[GitHub] [arrow] ursabot commented on pull request #14012: MINOR: [Go] Update direct go dependencies

2022-08-31 Thread GitBox
ursabot commented on PR #14012: URL: https://github.com/apache/arrow/pull/14012#issuecomment-1233350076 ['Python', 'R'] benchmarks have high level of regressions. [ursa-i9-9960x](https://conbench.ursa.dev/compare/runs/71bff535c16a4b76afa4e285d55e6728...7eb54c5676ea4a63ac33f34c37f0e58a/)

[GitHub] [arrow] ursabot commented on pull request #14012: MINOR: [Go] Update direct go dependencies

2022-08-31 Thread GitBox
ursabot commented on PR #14012: URL: https://github.com/apache/arrow/pull/14012#issuecomment-1233349775 Benchmark runs are scheduled for baseline = cf27001da088d882a7d460cddd84a0202f3d8eba and contender = 0c76c664d7f6f6ca0c094b1eb51f57a17a6860fd. 0c76c664d7f6f6ca0c094b1eb51f57a17a6860fd is

[GitHub] [arrow-rs] viirya commented on pull request #2608: Cast timestamp array to string array with timezone

2022-08-31 Thread GitBox
viirya commented on PR #2608: URL: https://github.com/apache/arrow-rs/pull/2608#issuecomment-126578 cc @sunchao

[GitHub] [arrow] marsupialtail commented on a diff in pull request #13931: ARROW-17481: [C++] [Python] Major performance improvements to CSV reading from S3

2022-08-31 Thread GitBox
marsupialtail commented on code in PR #13931: URL: https://github.com/apache/arrow/pull/13931#discussion_r959949490 ## cpp/src/arrow/io/interfaces.h: ## @@ -343,5 +344,8 @@ ARROW_EXPORT Result<Iterator<std::shared_ptr<Buffer>>> MakeInputStreamIterator( std::shared_ptr<InputStream> stream, int64_t block_size); +ARROW

[GitHub] [arrow] pitrou commented on pull request #13991: ARROW-17545: [C++][CI] Mandate C++17 instead of C++11

2022-08-31 Thread GitBox
pitrou commented on PR #13991: URL: https://github.com/apache/arrow/pull/13991#issuecomment-124277 > try to rebuild 1.8 with the current pinning in conda-forge Hmm, how could I do that?

[GitHub] [arrow] pitrou commented on pull request #13991: ARROW-17545: [C++][CI] Mandate C++17 instead of C++11

2022-08-31 Thread GitBox
pitrou commented on PR #13991: URL: https://github.com/apache/arrow/pull/13991#issuecomment-123633 Unfortunately we had to pin aws-sdk-cpp because of https://github.com/aws/aws-sdk-cpp/issues/1809

[GitHub] [arrow-rs] ursabot commented on pull request #2618: Support comparison between DictionaryArray and BooleanArray

2022-08-31 Thread GitBox
ursabot commented on PR #2618: URL: https://github.com/apache/arrow-rs/pull/2618#issuecomment-123140 Benchmark runs are scheduled for baseline = 47995d0b062bbccfaa98cab13db6fb444855a4ab and contender = f9a309445550e596e32a0dbf29879946f44c1dc6. f9a309445550e596e32a0dbf29879946f44c1dc6 i

[GitHub] [arrow] marsupialtail commented on pull request #13640: ARROW-14635: [Python][C++] add O_DIRECT support to writes

2022-08-31 Thread GitBox
marsupialtail commented on PR #13640: URL: https://github.com/apache/arrow/pull/13640#issuecomment-120909 Have addressed most of @pitrou's comments except the hard-coded 4096.

[GitHub] [arrow-rs] viirya commented on pull request #2618: Support comparison between DictionaryArray and BooleanArray

2022-08-31 Thread GitBox
viirya commented on PR #2618: URL: https://github.com/apache/arrow-rs/pull/2618#issuecomment-1233328905 Thanks.

[GitHub] [arrow] h-vetinari commented on pull request #13991: ARROW-17545: [C++][CI] Mandate C++17 instead of C++11

2022-08-31 Thread GitBox
h-vetinari commented on PR #13991: URL: https://github.com/apache/arrow/pull/13991#issuecomment-1233328876 > The current state of things is here: Seems there's not much pinned except `aws-sdk-cpp`, however that one's really [old](https://github.com/aws/aws-sdk-cpp/tags?after=1.9.1). C

[GitHub] [arrow-rs] viirya merged pull request #2618: Support comparison between DictionaryArray and BooleanArray

2022-08-31 Thread GitBox
viirya merged PR #2618: URL: https://github.com/apache/arrow-rs/pull/2618

[GitHub] [arrow-rs] viirya closed issue #2617: Support comparison between DictionaryArray and BooleanArray

2022-08-31 Thread GitBox
viirya closed issue #2617: Support comparison between DictionaryArray and BooleanArray URL: https://github.com/apache/arrow-rs/issues/2617

[GitHub] [arrow-datafusion] ozgrakkurt commented on pull request #3124: impl binary ops between binary arrays and scalars

2022-08-31 Thread GitBox
ozgrakkurt commented on PR #3124: URL: https://github.com/apache/arrow-datafusion/pull/3124#issuecomment-1233325000 @alamb sorry, can you review again

[GitHub] [arrow-rs] ursabot commented on pull request #2619: MINOR: move gcp.rs to gcp/mod.rs

2022-08-31 Thread GitBox
ursabot commented on PR #2619: URL: https://github.com/apache/arrow-rs/pull/2619#issuecomment-1233315882 Benchmark runs are scheduled for baseline = 9f4d56d53925c61ce99195390b264aa2cca6dac2 and contender = 47995d0b062bbccfaa98cab13db6fb444855a4ab. 47995d0b062bbccfaa98cab13db6fb444855a4ab i

[GitHub] [arrow] pitrou commented on pull request #14010: ARROW-17571: [Benchmarks] Default build for PyArrow seems to be debug

2022-08-31 Thread GitBox
pitrou commented on PR #14010: URL: https://github.com/apache/arrow/pull/14010#issuecomment-1233315115 > If I understand correctly you are confirming this PR makes the correct change in the Benchmark build? Yes :-)

[GitHub] [arrow-datafusion] ozgrakkurt commented on pull request #3124: impl binary ops between binary arrays and scalars

2022-08-31 Thread GitBox
ozgrakkurt commented on PR #3124: URL: https://github.com/apache/arrow-datafusion/pull/3124#issuecomment-1233315061 @alamb committing fix for digest function signature and formatting errors

[GitHub] [arrow] anjakefala commented on pull request #13947: ARROW-17464: [C++] Store/Read float16 values in Parquet as FixedSizeBinary

2022-08-31 Thread GitBox
anjakefala commented on PR #13947: URL: https://github.com/apache/arrow/pull/13947#issuecomment-1233313868 (Force push was needed to update my fork with updates from `apache/arrow` repository. No new changes.)

[GitHub] [arrow-datafusion] alamb commented on a diff in pull request #3124: impl binary ops between binary arrays and scalars

2022-08-31 Thread GitBox
alamb commented on code in PR #3124: URL: https://github.com/apache/arrow-datafusion/pull/3124#discussion_r959928739 ## datafusion/expr/src/function.rs: ## @@ -73,7 +73,23 @@ macro_rules! make_utf8_to_return_type { make_utf8_to_return_type!(utf8_to_str_type, DataType::LargeUt

[GitHub] [arrow] pitrou commented on a diff in pull request #13991: ARROW-17545: [C++][CI] Mandate C++17 instead of C++11

2022-08-31 Thread GitBox
pitrou commented on code in PR #13991: URL: https://github.com/apache/arrow/pull/13991#discussion_r959929666 ## docs/source/developers/cpp/building.rst: ## @@ -39,8 +39,8 @@ out-of-source. If you are not familiar with this terminology: Building requires: -* A C++11-enabled

[GitHub] [arrow] pitrou commented on pull request #13991: ARROW-17545: [C++][CI] Mandate C++17 instead of C++11

2022-08-31 Thread GitBox
pitrou commented on PR #13991: URL: https://github.com/apache/arrow/pull/13991#issuecomment-1233311734 > what versions of grpc, google-cloud-cpp (& protobuf) are you using? I'm presuming some parts are pinned / constrained, or is everything free? The current state of things is here:

[GitHub] [arrow-rs] avantgardnerio commented on issue #2616: Do not check in the generated file from `proto`

2022-08-31 Thread GitBox
avantgardnerio commented on issue #2616: URL: https://github.com/apache/arrow-rs/issues/2616#issuecomment-1233310536 > as another set of instructions :point_up: this, 100% scripts as docs never get out of date and the testing is built in :) It's too bad docker doesn't support s

[GitHub] [arrow-rs] tustvold merged pull request #2619: MINOR: move gcp.rs to gcp/mod.rs

2022-08-31 Thread GitBox
tustvold merged PR #2619: URL: https://github.com/apache/arrow-rs/pull/2619

[GitHub] [arrow-rs] alamb commented on issue #2178: object_store: Support assuming roles directly when using AWS S3

2022-08-31 Thread GitBox
alamb commented on issue #2178: URL: https://github.com/apache/arrow-rs/issues/2178#issuecomment-1233305958 > I think my preferred path forward would be for SDK-based implementations to be maintained out of tree, in the same vein as HDFS This is a neat idea 👍

[GitHub] [arrow-datafusion] alamb commented on a diff in pull request #3309: add tinyint SQL support

2022-08-31 Thread GitBox
alamb commented on code in PR #3309: URL: https://github.com/apache/arrow-datafusion/pull/3309#discussion_r959915916 ## datafusion/core/tests/sql/mod.rs: ## @@ -628,7 +628,7 @@ async fn register_aggregate_csv_by_sql(ctx: &SessionContext) { " CREATE EXTERNAL TA

[GitHub] [arrow-rs] tustvold commented on issue #2178: object_store: Support assuming roles directly when using AWS S3

2022-08-31 Thread GitBox
tustvold commented on issue #2178: URL: https://github.com/apache/arrow-rs/issues/2178#issuecomment-1233299616 Part of the motivation for moving away from the SDKs was we had serious difficulties providing consistent behaviour across stores, with missing functionality and inconsistent error

[GitHub] [arrow-rs] alamb commented on issue #2616: Do not check in the generated file from `proto`

2022-08-31 Thread GitBox
alamb commented on issue #2616: URL: https://github.com/apache/arrow-rs/issues/2616#issuecomment-1233294962 Our CI jobs probably also need to install protoc which we can point to as another set of instructions of what to use

[GitHub] [arrow-adbc] zeroshade commented on a diff in pull request #97: initial attempt at wrapping adbc for database/sql

2022-08-31 Thread GitBox
zeroshade commented on code in PR #97: URL: https://github.com/apache/arrow-adbc/pull/97#discussion_r959911742 ## adbc.h: ## @@ -654,6 +654,71 @@ AdbcStatusCode AdbcConnectionReadPartition(struct AdbcConnection* connection, /// enabled. #define ADBC_CONNECTION_OPTION_AUTOCO

[GitHub] [arrow-rs] alamb commented on issue #2387: Decimal Precision Validation

2022-08-31 Thread GitBox
alamb commented on issue #2387: URL: https://github.com/apache/arrow-rs/issues/2387#issuecomment-1233293594 I think it is a safe option to follow the C/C++ interface. 👍 I don't really have a strong opinion here other than like @tustvold I would like a consistent rule that we follow.
