[GitHub] [arrow-datafusion] unconsolable commented on a diff in pull request #4792: remove Operator::{Like,NotLike,ILike,NotILike}

2023-01-03 Thread GitBox
unconsolable commented on code in PR #4792: URL: https://github.com/apache/arrow-datafusion/pull/4792#discussion_r1061213238 ## datafusion/core/src/physical_plan/planner.rs: ## @@ -2016,14 +2016,6 @@ mod tests { col("c1").eq(bool_expr.clone()), // u32 A

[GitHub] [arrow-datafusion] unconsolable commented on a diff in pull request #4792: remove Operator::{Like,NotLike,ILike,NotILike}

2023-01-03 Thread GitBox
unconsolable commented on code in PR #4792: URL: https://github.com/apache/arrow-datafusion/pull/4792#discussion_r1061212718 ## datafusion/core/src/physical_plan/planner.rs: ## @@ -2016,14 +2016,6 @@ mod tests { col("c1").eq(bool_expr.clone()), // u32 A

[GitHub] [arrow] ursabot commented on pull request #15170: MINOR: [C++][Parquet] Make DeltaBitPack roundtrip tests faster

2023-01-03 Thread GitBox
ursabot commented on PR #15170: URL: https://github.com/apache/arrow/pull/15170#issuecomment-1370584445 ['Python', 'R'] benchmarks have high level of regressions. [ursa-i9-9960x](https://conbench.ursa.dev/compare/runs/347d2385608d40f19b03004cd5f8264b...ce063557f7a14c2db3c9645e9936572d/)

[GitHub] [arrow] ursabot commented on pull request #15170: MINOR: [C++][Parquet] Make DeltaBitPack roundtrip tests faster

2023-01-03 Thread GitBox
ursabot commented on PR #15170: URL: https://github.com/apache/arrow/pull/15170#issuecomment-1370584303 Benchmark runs are scheduled for baseline = 63b91cc1f7131356537ab9cbb84ed108d6f9102e and contender = 793e5f6251255cfe812f4f187f2924224fefad8b. 793e5f6251255cfe812f4f187f2924224fefad8b is

[GitHub] [arrow-datafusion] crepererum commented on a diff in pull request #4792: remove Operator::{Like,NotLike,ILike,NotILike}

2023-01-03 Thread GitBox
crepererum commented on code in PR #4792: URL: https://github.com/apache/arrow-datafusion/pull/4792#discussion_r1061206032 ## datafusion/core/src/physical_plan/planner.rs: ## @@ -2016,14 +2016,6 @@ mod tests { col("c1").eq(bool_expr.clone()), // u32 AND

[GitHub] [arrow-rs] tustvold commented on pull request #3431: Support decimal int32/64 for writer

2023-01-03 Thread GitBox
tustvold commented on PR #3431: URL: https://github.com/apache/arrow-rs/pull/3431#issuecomment-1370578823 > I think the data in the arrow ecosystem is exchanged by IPC format Sometimes, but an important property is that data written by one implementation to CSV, Parquet, or whatever c

[GitHub] [arrow-datafusion] mustafasrepo commented on a diff in pull request #4777: Pipeline-friendly Bounded Memory Window Executor

2023-01-03 Thread GitBox
mustafasrepo commented on code in PR #4777: URL: https://github.com/apache/arrow-datafusion/pull/4777#discussion_r1061201955 ## datafusion/core/tests/sql/window.rs: ## @@ -2351,3 +2353,251 @@ async fn test_window_agg_sort_orderby_reversed_partitionby_reversed_plan() -> Re

[GitHub] [arrow] mapleFU commented on issue #15107: [C++][Parquet] Support RLE Encoder for Boolean type

2023-01-03 Thread GitBox
mapleFU commented on issue #15107: URL: https://github.com/apache/arrow/issues/15107#issuecomment-1370573808 @pitrou mind taking a look? All changes are OK with me. Maybe I can use `PlainEncoder` to cache all values and write to `RleEncoder` once first, and later another patch to change to `Rle

[GitHub] [arrow-datafusion] liukun4515 opened a new pull request, #4810: Add test cases: row group filter with missing statistics for decimal data type

2023-01-03 Thread GitBox
liukun4515 opened a new pull request, #4810: URL: https://github.com/apache/arrow-datafusion/pull/4810 # Which issue does this PR close? From the comments @alamb https://github.com/apache/arrow-datafusion/pull/4742#discussion_r1059755347 If the min or max is None which

[GitHub] [arrow-datafusion] mustafasrepo commented on a diff in pull request #4777: Pipeline-friendly Bounded Memory Window Executor

2023-01-03 Thread GitBox
mustafasrepo commented on code in PR #4777: URL: https://github.com/apache/arrow-datafusion/pull/4777#discussion_r1061197203 ## datafusion/physical-expr/src/window/window_expr.rs: ## @@ -132,3 +151,129 @@ pub fn reverse_order_bys(order_bys: &[PhysicalSortExpr]) -> Vec), +Ag

[GitHub] [arrow-datafusion] mustafasrepo commented on a diff in pull request #4777: Pipeline-friendly Bounded Memory Window Executor

2023-01-03 Thread GitBox
mustafasrepo commented on code in PR #4777: URL: https://github.com/apache/arrow-datafusion/pull/4777#discussion_r1061179316 ## datafusion/core/src/physical_plan/common.rs: ## @@ -95,6 +96,47 @@ pub async fn collect(stream: SendableRecordBatchStream) -> Result

[GitHub] [arrow-datafusion] mustafasrepo commented on a diff in pull request #4777: Pipeline-friendly Bounded Memory Window Executor

2023-01-03 Thread GitBox
mustafasrepo commented on code in PR #4777: URL: https://github.com/apache/arrow-datafusion/pull/4777#discussion_r1061176930 ## datafusion/core/src/physical_optimizer/pipeline_checker.rs: ## @@ -301,7 +301,7 @@ mod sql_tests { let case = QueryCase { sql: "S

[GitHub] [arrow-datafusion] ygf11 commented on pull request #4797: rewrite the function `ensure_any_column_reference_is_unambiguous`

2023-01-03 Thread GitBox
ygf11 commented on PR #4797: URL: https://github.com/apache/arrow-datafusion/pull/4797#issuecomment-1370539781 Makes sense to me. Thank you @HaoYang670. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to

[GitHub] [arrow-datafusion] cfraz89 commented on pull request #4788: fix: ListingSchemaProvider directory paths (related: #4204)

2023-01-03 Thread GitBox
cfraz89 commented on PR #4788: URL: https://github.com/apache/arrow-datafusion/pull/4788#issuecomment-1370507129 No problem, thanks for the great library!

[GitHub] [arrow] vibhatha commented on a diff in pull request #15097: GH-15096: [C++] Substrait ProjectRel Emit Optimization

2023-01-03 Thread GitBox
vibhatha commented on code in PR #15097: URL: https://github.com/apache/arrow/pull/15097#discussion_r1061147280 ## cpp/src/arrow/engine/substrait/serde_test.cc: ## @@ -216,6 +216,20 @@ void CheckRoundTripResult(const std::shared_ptr expected_table, compute::AssertTablesEqual

[GitHub] [arrow] westonpace commented on a diff in pull request #15097: GH-15096: [C++] Substrait ProjectRel Emit Optimization

2023-01-03 Thread GitBox
westonpace commented on code in PR #15097: URL: https://github.com/apache/arrow/pull/15097#discussion_r1061142323 ## cpp/src/arrow/engine/substrait/serde_test.cc: ## @@ -216,6 +216,20 @@ void CheckRoundTripResult(const std::shared_ptr expected_table, compute::AssertTablesEqu

[GitHub] [arrow] zhixingheyi-tian commented on pull request #14353: ARROW-17735: [C++][Parquet] Optimize parquet reading for String/Binary type

2023-01-03 Thread GitBox
zhixingheyi-tian commented on PR #14353: URL: https://github.com/apache/arrow/pull/14353#issuecomment-1370500077 @cyb70289 @wjones127 I had a high fever for the last two weeks, sorry! I will now continue to support this patch.

[GitHub] [arrow] ursabot commented on pull request #14518: ARROW-18101: [R] RecordBatchReaderHead from ExecPlan with UDF cannot be read

2023-01-03 Thread GitBox
ursabot commented on PR #14518: URL: https://github.com/apache/arrow/pull/14518#issuecomment-1370497342 ['Python', 'R'] benchmarks have high level of regressions. [ursa-i9-9960x](https://conbench.ursa.dev/compare/runs/70ee2038cde649c98614cde200be7dd6...347d2385608d40f19b03004cd5f8264b/)

[GitHub] [arrow] ursabot commented on pull request #14518: ARROW-18101: [R] RecordBatchReaderHead from ExecPlan with UDF cannot be read

2023-01-03 Thread GitBox
ursabot commented on PR #14518: URL: https://github.com/apache/arrow/pull/14518#issuecomment-1370497203 Benchmark runs are scheduled for baseline = 139a13e320b9a47161ff506c90c5facaec8b773c and contender = 63b91cc1f7131356537ab9cbb84ed108d6f9102e. 63b91cc1f7131356537ab9cbb84ed108d6f9102e is

[GitHub] [arrow-rs] Jefffrey commented on pull request #3450: Meaningful error message for map builder with null keys

2023-01-03 Thread GitBox
Jefffrey commented on PR #3450: URL: https://github.com/apache/arrow-rs/pull/3450#issuecomment-1370479310 cc @tustvold

[GitHub] [arrow-rs] Jefffrey commented on a diff in pull request #3450: Meaningful error message for map builder with null keys

2023-01-03 Thread GitBox
Jefffrey commented on code in PR #3450: URL: https://github.com/apache/arrow-rs/pull/3450#discussion_r1061128427 ## arrow-array/src/builder/map_builder.rs: ## @@ -143,56 +142,49 @@ impl MapBuilder { /// Builds the [`MapArray`] pub fn finish(&mut self) -> MapArray {

[GitHub] [arrow] vibhatha commented on a diff in pull request #15097: GH-15096: [C++] Substrait ProjectRel Emit Optimization

2023-01-03 Thread GitBox
vibhatha commented on code in PR #15097: URL: https://github.com/apache/arrow/pull/15097#discussion_r1061128100 ## cpp/src/arrow/engine/substrait/serde_test.cc: ## @@ -216,6 +216,20 @@ void CheckRoundTripResult(const std::shared_ptr expected_table, compute::AssertTablesEqual

[GitHub] [arrow-rs] Jefffrey commented on a diff in pull request #3450: Meaningful error message for map builder with null keys

2023-01-03 Thread GitBox
Jefffrey commented on code in PR #3450: URL: https://github.com/apache/arrow-rs/pull/3450#discussion_r1061127939 ## arrow-array/src/array/map_array.rs: ## @@ -110,11 +110,12 @@ impl From for ArrayData { impl MapArray { fn try_new_from_array_data(data: ArrayData) -> Resul

[GitHub] [arrow-rs] Jefffrey opened a new pull request, #3450: Meaningful error message for map builder with null keys

2023-01-03 Thread GitBox
Jefffrey opened a new pull request, #3450: URL: https://github.com/apache/arrow-rs/pull/3450 # Which issue does this PR close? Follow-on from discussion in PR: https://github.com/apache/arrow-rs/pull/3205#discussion_r1033776747 # Rationale for this change Have c

[GitHub] [arrow-julia] quinnj commented on pull request #369: Create CompatHelper.yml

2023-01-03 Thread GitBox
quinnj commented on PR #369: URL: https://github.com/apache/arrow-julia/pull/369#issuecomment-1370470309 @CarloLucibello, can you remind me what the CompatHelper does for the repo? I can't quite remember what it's used for.

[GitHub] [arrow-julia] quinnj commented on a diff in pull request #369: Create CompatHelper.yml

2023-01-03 Thread GitBox
quinnj commented on code in PR #369: URL: https://github.com/apache/arrow-julia/pull/369#discussion_r1061123184 ## .github/workflows/CompatHelper.yml: ## @@ -0,0 +1,45 @@ +name: CompatHelper Review Comment: One of the CI checks is failing because we have to include the apach

[GitHub] [arrow-ballista] dependabot[bot] commented on pull request #563: Update arrow requirement from 28.0.0 to 29.0.0

2023-01-03 Thread GitBox
dependabot[bot] commented on PR #563: URL: https://github.com/apache/arrow-ballista/pull/563#issuecomment-1370469980 Superseded by #583.

[GitHub] [arrow-ballista] dependabot[bot] closed pull request #563: Update arrow requirement from 28.0.0 to 29.0.0

2023-01-03 Thread GitBox
dependabot[bot] closed pull request #563: Update arrow requirement from 28.0.0 to 29.0.0 URL: https://github.com/apache/arrow-ballista/pull/563

[GitHub] [arrow-ballista] dependabot[bot] opened a new pull request, #583: Update arrow requirement from 28.0.0 to 30.0.0

2023-01-03 Thread GitBox
dependabot[bot] opened a new pull request, #583: URL: https://github.com/apache/arrow-ballista/pull/583 Updates the requirements on [arrow](https://github.com/apache/arrow-rs) to permit the latest version. Changelog Sourced from https://github.com/apache/arrow-rs/blob/master/CHANGE

[GitHub] [arrow-ballista] dependabot[bot] commented on pull request #583: Update arrow requirement from 28.0.0 to 30.0.0

2023-01-03 Thread GitBox
dependabot[bot] commented on PR #583: URL: https://github.com/apache/arrow-ballista/pull/583#issuecomment-1370469964 The following labels could not be found: `auto-dependencies`.

[GitHub] [arrow-julia] quinnj commented on issue #370: github releases not in sync

2023-01-03 Thread GitBox
quinnj commented on issue #370: URL: https://github.com/apache/arrow-julia/issues/370#issuecomment-1370469296 There was the issue at one point where TagBot was mad because there was already a v2.0.0 tag (from Arrowtypes), so it couldn't make a v2.0.0 tag for Arrow.jl.

[GitHub] [arrow-rs] winding-lines commented on a diff in pull request #3436: object_store: builder configuration api

2023-01-03 Thread GitBox
winding-lines commented on code in PR #3436: URL: https://github.com/apache/arrow-rs/pull/3436#discussion_r1061122387 ## object_store/src/aws/mod.rs: ## @@ -472,6 +614,49 @@ impl AmazonS3Builder { self } +/// Set an option on the builder via a key - value pai

[GitHub] [arrow-rs] JayjeetAtGithub opened a new pull request, #3449: Use tz json

2023-01-03 Thread GitBox
JayjeetAtGithub opened a new pull request, #3449: URL: https://github.com/apache/arrow-rs/pull/3449 # Which issue does this PR close? Closes #3416 # What changes are included in this PR? * Modify `array_value_to_string` to print a trailing `Z` in case UTC timestamps *
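To illustrate what "a trailing `Z` in case UTC timestamps" means in the snippet above, here is a minimal Python sketch of RFC 3339 style formatting. It is not the arrow-rs implementation; `format_utc_timestamp` is a hypothetical helper, and the second-level precision is an assumption made for the example.

```python
from datetime import datetime, timezone


def format_utc_timestamp(seconds_since_epoch: int) -> str:
    """Render a Unix timestamp as an RFC 3339 string in UTC.

    Hypothetical helper for illustration only: the trailing "Z"
    marks the value as UTC rather than leaving the zone implicit.
    """
    dt = datetime.fromtimestamp(seconds_since_epoch, tz=timezone.utc)
    return dt.strftime("%Y-%m-%dT%H:%M:%SZ")


epoch_text = format_utc_timestamp(0)
later_text = format_utc_timestamp(1672704000)
```

Without the suffix, a reader of the string cannot tell whether the value is UTC or local time, which is presumably why the PR adds it.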

[GitHub] [arrow-rs] JayjeetAtGithub commented on pull request #3417: Use `array_value_to_string` when writing CSV, support RFC3339 style timestamps in arrow-json

2023-01-03 Thread GitBox
JayjeetAtGithub commented on PR #3417: URL: https://github.com/apache/arrow-rs/pull/3417#issuecomment-1370440978 Okay, looks like the `arrow-csv` changes need a little bit more scrutiny. So, for now, I am raising a PR just for `arrow-json`.

[GitHub] [arrow-rs] liukun4515 commented on pull request #3431: Support decimal int32/64 for writer

2023-01-03 Thread GitBox
liukun4515 commented on PR #3431: URL: https://github.com/apache/arrow-rs/pull/3431#issuecomment-1370433954 I also found the schema mapping in the Java version, in the `parquet-mr` project: https://github.com/apache/parquet-mr/blob/433de8df33fcf31927f7b51456be9f53e64d48b9/parquet-arrow/src/main/ja

[GitHub] [arrow] github-actions[bot] commented on pull request #15182: GH-15074: [Parquet][C++] change 16-bit page_ordinal to 32-bit

2023-01-03 Thread GitBox
github-actions[bot] commented on PR #15182: URL: https://github.com/apache/arrow/pull/15182#issuecomment-1370430853 :warning: GitHub issue #15074 **has no components**, please add labels for components.

[GitHub] [arrow] github-actions[bot] commented on pull request #15182: GH-15074: [Parquet][C++] change 16-bit page_ordinal to 32-bit

2023-01-03 Thread GitBox
github-actions[bot] commented on PR #15182: URL: https://github.com/apache/arrow/pull/15182#issuecomment-1370430848 :warning: GitHub issue #15074 **has been automatically assigned in GitHub** to PR creator.

[GitHub] [arrow] github-actions[bot] commented on pull request #15182: GH-15074: [Parquet][C++] change 16-bit page_ordinal to 32-bit

2023-01-03 Thread GitBox
github-actions[bot] commented on PR #15182: URL: https://github.com/apache/arrow/pull/15182#issuecomment-1370430835 * Closes: #15074

[GitHub] [arrow] mapleFU opened a new pull request, #15182: GH-15074: [Parquet][C++] change 16-bit page_ordinal to 32-bit

2023-01-03 Thread GitBox
mapleFU opened a new pull request, #15182: URL: https://github.com/apache/arrow/pull/15182 As we mentioned in https://github.com/apache/arrow/issues/15074, `int16_t page_ordinal` may cause overflow. So, we need to change it to 32-bit. * [x] Implement the logic * [ ] Testing
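The overflow concern in the snippet above can be shown with a little arithmetic. This is a minimal Python sketch, not code from the PR. It uses `ctypes` to simulate two's-complement wraparound of a 16-bit counter; the helper names are hypothetical, and the actual behaviour in C++ depends on how the increment and conversion are written.

```python
import ctypes

INT16_MAX = 2**15 - 1  # 32767, the largest value a signed 16-bit integer can hold


def next_ordinal_16(ordinal: int) -> int:
    """Increment an ordinal stored in 16 bits (hypothetical helper).

    ctypes performs no overflow checking, so the result wraps around.
    """
    return ctypes.c_int16(ordinal + 1).value


def next_ordinal_32(ordinal: int) -> int:
    """Same increment with a 32-bit counter (hypothetical helper)."""
    return ctypes.c_int32(ordinal + 1).value


wrapped = next_ordinal_16(INT16_MAX)  # one past the last representable 16-bit value
widened = next_ordinal_32(INT16_MAX)  # fits comfortably in 32 bits
```

A column chunk with more than 32767 pages would therefore produce a negative ordinal with the 16-bit counter, while the 32-bit counter keeps counting up to 2147483647.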

[GitHub] [arrow-rs] liukun4515 commented on pull request #3431: Support decimal int32/64 for writer

2023-01-03 Thread GitBox
liukun4515 commented on PR #3431: URL: https://github.com/apache/arrow-rs/pull/3431#issuecomment-1370428422 > What is the ecosystem support for this like? Do all arrow implementations understand how to convert to a decimal128 from i32 or i64? Just wondering if we need to put this behind an

[GitHub] [arrow] mapleFU commented on pull request #14351: ARROW-17904: [C++] Parquet Implement crc in reading and writing Page for DATA_PAGE (v1)

2023-01-03 Thread GitBox
mapleFU commented on PR #14351: URL: https://github.com/apache/arrow/pull/14351#issuecomment-1370423619 @pitrou ping

[GitHub] [arrow] assignUser opened a new pull request, #15181: MINOR: Use http in preview-docs job summary

2023-01-03 Thread GitBox
assignUser opened a new pull request, #15181: URL: https://github.com/apache/arrow/pull/15181 S3 does not support https so the link does not work atm.

[GitHub] [arrow] ursabot commented on pull request #15149: ARROW-18318: [Python] Expose Scalar.validate()

2023-01-03 Thread GitBox
ursabot commented on PR #15149: URL: https://github.com/apache/arrow/pull/15149#issuecomment-1370421043 ['Python', 'R'] benchmarks have high level of regressions. [ursa-i9-9960x](https://conbench.ursa.dev/compare/runs/c544098879694b2d88cdede50a11ecd8...70ee2038cde649c98614cde200be7dd6/)

[GitHub] [arrow] ursabot commented on pull request #15149: ARROW-18318: [Python] Expose Scalar.validate()

2023-01-03 Thread GitBox
ursabot commented on PR #15149: URL: https://github.com/apache/arrow/pull/15149#issuecomment-1370420930 Benchmark runs are scheduled for baseline = f4ed8185ebc1804092de46e58078414910587958 and contender = 139a13e320b9a47161ff506c90c5facaec8b773c. 139a13e320b9a47161ff506c90c5facaec8b773c is

[GitHub] [arrow] felipecrv commented on pull request #15083: ARROW-18403: [C++] Add support for nullary and n-ary aggregate functions

2023-01-03 Thread GitBox
felipecrv commented on PR #15083: URL: https://github.com/apache/arrow/pull/15083#issuecomment-1370410329 @westonpace please review. I don't intend to merge the batch of commits that add the `"covariance"` function in this PR. They are present to show that n-ary aggregate functions work, bu

[GitHub] [arrow] mapleFU commented on pull request #15124: GH-15052: [C++][Parquet] Fixing DELTA_BINARY_PACKED when reading only 1

2023-01-03 Thread GitBox
mapleFU commented on PR #15124: URL: https://github.com/apache/arrow/pull/15124#issuecomment-1370401746 @pitrou comments fixed, and I merged your patch, which could make the test faster. Mind taking a look?

[GitHub] [arrow-rs] wjones127 opened a new issue, #3448: Add column_by_name() for RecordBatch

2023-01-03 Thread GitBox
wjones127 opened a new issue, #3448: URL: https://github.com/apache/arrow-rs/issues/3448 **Is your feature request related to a problem or challenge? Please describe what you are trying to do.** This is quite verbose: ```rust let col: Option = record_batch.column(record_bat

[GitHub] [arrow] vibhatha commented on a diff in pull request #15097: GH-15096: [C++] Substrait ProjectRel Emit Optimization

2023-01-03 Thread GitBox
vibhatha commented on code in PR #15097: URL: https://github.com/apache/arrow/pull/15097#discussion_r1061076859 ## cpp/src/arrow/engine/substrait/serde_test.cc: ## @@ -2551,6 +2551,122 @@ TEST(SubstraitRoundTrip, ProjectRelOnFunctionWithEmit) { /*include_

[GitHub] [arrow-rs] winding-lines commented on a diff in pull request #3436: object_store: builder configuration api

2023-01-03 Thread GitBox
winding-lines commented on code in PR #3436: URL: https://github.com/apache/arrow-rs/pull/3436#discussion_r1061075024 ## object_store/src/aws/mod.rs: ## @@ -472,6 +614,49 @@ impl AmazonS3Builder { self } +/// Set an option on the builder via a key - value pai

[GitHub] [arrow] westonpace commented on a diff in pull request #15097: GH-15096: [C++] Substrait ProjectRel Emit Optimization

2023-01-03 Thread GitBox
westonpace commented on code in PR #15097: URL: https://github.com/apache/arrow/pull/15097#discussion_r1061075203 ## cpp/src/arrow/engine/substrait/serde_test.cc: ## @@ -2551,6 +2551,122 @@ TEST(SubstraitRoundTrip, ProjectRelOnFunctionWithEmit) { /*includ

[GitHub] [arrow-rs] askoa opened a new pull request, #3447: Parquet writer v2: clear buffer after page flush

2023-01-03 Thread GitBox
askoa opened a new pull request, #3447: URL: https://github.com/apache/arrow-rs/pull/3447 # Which issue does this PR close? Closes #3408 # Rationale for this change # What changes are included in this PR? The issue is not completely fixed. I added a test `fallbac

[GitHub] [arrow-adbc] lidavidm commented on issue #308: NotSupportedError for postgres CHAR / VARCHAR columns

2023-01-03 Thread GitBox
lidavidm commented on issue #308: URL: https://github.com/apache/arrow-adbc/issues/308#issuecomment-1370387845 It will be googletest/googlebenchmark (the unit tests for all projects here are already googletest; only the SQLite driver itself is pure C, and even that I sort of wonder if it's

[GitHub] [arrow-adbc] WillAyd commented on issue #308: NotSupportedError for postgres CHAR / VARCHAR columns

2023-01-03 Thread GitBox
WillAyd commented on issue #308: URL: https://github.com/apache/arrow-adbc/issues/308#issuecomment-1370376855 Are you planning on using googletest for the benchmark / test suite or is the idea to stick entirely to C?

[GitHub] [arrow] westonpace commented on a diff in pull request #15180: ARROW-11402: [C++] prefer casting literal over casting field ref

2023-01-03 Thread GitBox
westonpace commented on code in PR #15180: URL: https://github.com/apache/arrow/pull/15180#discussion_r1061054085 ## cpp/src/arrow/compute/exec/expression.cc: ## @@ -368,6 +368,153 @@ bool Expression::IsSatisfiable() const { namespace { +TypeHolder SmallestTypeFor(const arr

[GitHub] [arrow] github-actions[bot] commented on pull request #15180: ARROW-11402: [C++] prefer casting literal over casting field ref

2023-01-03 Thread GitBox
github-actions[bot] commented on PR #15180: URL: https://github.com/apache/arrow/pull/15180#issuecomment-1370360434 :warning: Ticket **has not been started in JIRA**, please click 'Start Progress'.

[GitHub] [arrow] github-actions[bot] commented on pull request #15180: ARROW-11402: [C++] prefer casting literal over casting field ref

2023-01-03 Thread GitBox
github-actions[bot] commented on PR #15180: URL: https://github.com/apache/arrow/pull/15180#issuecomment-1370360423 https://issues.apache.org/jira/browse/ARROW-11402

[GitHub] [arrow] westonpace opened a new pull request, #15180: ARROW-11402: [C++] prefer casting literal over casting field ref

2023-01-03 Thread GitBox
westonpace opened a new pull request, #15180: URL: https://github.com/apache/arrow/pull/15180 I ran into this problem while trying to work out partition pruning in the new scan node. I feel like this is a somewhat naive approach but it seems to work. I think it would fail if a `Disp

[GitHub] [arrow-rs] ursabot commented on pull request #3445: Minor: run arrow-integration-test{,ing} clippy after arrow clippy

2023-01-03 Thread GitBox
ursabot commented on PR #3445: URL: https://github.com/apache/arrow-rs/pull/3445#issuecomment-1370344267 Benchmark runs are scheduled for baseline = dc91a244e34a7ffe88f5d0676cef59843926c41d and contender = b82b35f11ed6c3eabee85e7fd72d003db260f9c0. b82b35f11ed6c3eabee85e7fd72d003db260f9c0 i

[GitHub] [arrow-rs] alamb merged pull request #3445: Minor: run arrow-integration-test{,ing} clippy after arrow clippy

2023-01-03 Thread GitBox
alamb merged PR #3445: URL: https://github.com/apache/arrow-rs/pull/3445

[GitHub] [arrow] github-actions[bot] commented on pull request #15179: GH-15042: [C++][Parquet] Update stats on subsequent batches of dictionaries

2023-01-03 Thread GitBox
github-actions[bot] commented on PR #15179: URL: https://github.com/apache/arrow/pull/15179#issuecomment-1370341971 :warning: GitHub issue #15042 **has no components**, please add labels for components.

[GitHub] [arrow] github-actions[bot] commented on pull request #15179: GH-15042: [C++][Parquet] Update stats on subsequent batches of dictionaries

2023-01-03 Thread GitBox
github-actions[bot] commented on PR #15179: URL: https://github.com/apache/arrow/pull/15179#issuecomment-1370341957 * Closes: #15042

[GitHub] [arrow] eitsupi commented on a diff in pull request #14093: ARROW-17416: [R] Implement lubridate::with_tz and lubridate::force_tz

2023-01-03 Thread GitBox
eitsupi commented on code in PR #14093: URL: https://github.com/apache/arrow/pull/14093#discussion_r1061041156 ## r/R/dplyr-funcs-datetime.R: ## @@ -429,6 +430,48 @@ register_bindings_datetime_conversion <- function() { }) } +register_bindings_datetime_timezone <- function

[GitHub] [arrow-datafusion] alamb commented on pull request #4607: Make SchemaProvider::table async

2023-01-03 Thread GitBox
alamb commented on PR #4607: URL: https://github.com/apache/arrow-datafusion/pull/4607#issuecomment-1370338108 Will review this tomorrow. Note the CI is failing

[GitHub] [arrow] ursabot commented on pull request #15152: GH-15150: [C++][FlightRPC] Wait for side effects in DoAction

2023-01-03 Thread GitBox
ursabot commented on PR #15152: URL: https://github.com/apache/arrow/pull/15152#issuecomment-1370334984 ['Python', 'R'] benchmarks have high level of regressions. [test-mac-arm](https://conbench.ursa.dev/compare/runs/6023f2f07b7846de93de9a33b8b74688...cd5019da5a5341ad8d1508fdb906211e/)

[GitHub] [arrow] ursabot commented on pull request #15152: GH-15150: [C++][FlightRPC] Wait for side effects in DoAction

2023-01-03 Thread GitBox
ursabot commented on PR #15152: URL: https://github.com/apache/arrow/pull/15152#issuecomment-1370334813 Benchmark runs are scheduled for baseline = 240ebb75b57bb05551c9103ec3dee11c23fd0aca and contender = f4ed8185ebc1804092de46e58078414910587958. f4ed8185ebc1804092de46e58078414910587958 is

[GitHub] [arrow-rs] tustvold commented on a diff in pull request #3437: Parquet record API: timestamp as signed integer

2023-01-03 Thread GitBox
tustvold commented on code in PR #3437: URL: https://github.com/apache/arrow-rs/pull/3437#discussion_r1061033699 ## parquet/src/record/api.rs: ## @@ -793,12 +793,12 @@ impl fmt::Display for Field { /// Input `value` is a number of days since the epoch in UTC. /// Date is displ

[GitHub] [arrow-rs] tustvold commented on issue #3434: Support All Types in make_builder

2023-01-03 Thread GitBox
tustvold commented on issue #3434: URL: https://github.com/apache/arrow-rs/issues/3434#issuecomment-1370319724 https://docs.rs/arrow-array/latest/arrow_array/macro.downcast_primitive.html gives the types which could then be used to type the builders. The array downcast macros are built on

[GitHub] [arrow-rs] changhiskhan commented on issue #3434: Support All Types in make_builder

2023-01-03 Thread GitBox
changhiskhan commented on issue #3434: URL: https://github.com/apache/arrow-rs/issues/3434#issuecomment-1370314897 Does the downcast_primitive macro help here, since this is for builders and not arrays? Thanks for taking this up! Curious how to dynamically parameterize ListBuilder here effective

[GitHub] [arrow-adbc] lidavidm commented on pull request #312: fix(python): add driver -> driver manager dependency

2023-01-03 Thread GitBox
lidavidm commented on PR #312: URL: https://github.com/apache/arrow-adbc/pull/312#issuecomment-1370313148 Ahh, okay - I'll fix that as well. Thanks!

[GitHub] [arrow-datafusion] tustvold commented on a diff in pull request #4607: Make SchemaProvider::table async

2023-01-03 Thread GitBox
tustvold commented on code in PR #4607: URL: https://github.com/apache/arrow-datafusion/pull/4607#discussion_r1061021957 ## datafusion/core/src/execution/context.rs: ## @@ -1642,10 +1642,37 @@ impl SessionState { match &statement { DFStatement::Statement(

[GitHub] [arrow-adbc] WillAyd commented on pull request #312: fix(python): add driver -> driver manager dependency

2023-01-03 Thread GitBox
WillAyd commented on PR #312: URL: https://github.com/apache/arrow-adbc/pull/312#issuecomment-1370309731 In any case I still think worth doing what you have here - so lgtm

[GitHub] [arrow-adbc] WillAyd commented on pull request #312: fix(python): add driver -> driver manager dependency

2023-01-03 Thread GitBox
WillAyd commented on PR #312: URL: https://github.com/apache/arrow-adbc/pull/312#issuecomment-1370309603 Ah OK I see what you are doing here with the scripts. In the issue I was mainly referring to the README instructions for how to install the postgres driver https://github.com/apac

[GitHub] [arrow-adbc] lidavidm commented on issue #308: NotSupportedError for postgres CHAR / VARCHAR columns

2023-01-03 Thread GitBox
lidavidm commented on issue #308: URL: https://github.com/apache/arrow-adbc/issues/308#issuecomment-1370292807 The overall task list I have is (in no particular order, there are some dependencies between tasks) - Set up a basic benchmark suite so we can start comparing ourselves -

[GitHub] [arrow-adbc] lidavidm commented on pull request #312: fix(python): add driver -> driver manager dependency

2023-01-03 Thread GitBox
lidavidm commented on PR #312: URL: https://github.com/apache/arrow-adbc/pull/312#issuecomment-1370291970 Hmm, which part of the docs are you looking at specifically then? What this will do is when you `pip install adbc-driver-postgresql` it will also download and install `adbc-driver

[GitHub] [arrow-adbc] WillAyd commented on issue #308: NotSupportedError for postgres CHAR / VARCHAR columns

2023-01-03 Thread GitBox
WillAyd commented on issue #308: URL: https://github.com/apache/arrow-adbc/issues/308#issuecomment-1370291598 Sounds good. Pretty interested to watch this package develop - if you have pointers on things to look at to make that a reality sooner would gladly take a look

[GitHub] [arrow-adbc] WillAyd commented on pull request #312: fix(python): add driver -> driver manager dependency

2023-01-03 Thread GitBox
WillAyd commented on PR #312: URL: https://github.com/apache/arrow-adbc/pull/312#issuecomment-1370290853 Might be missing the overall point of using `no-deps` here but the current issue only appears at runtime, which I don't think this would change? Or is the goal here just to be more expli

[GitHub] [arrow] WillAyd commented on pull request #15041: GH-14937: [C++] String Sort / Rank Benchmarks

2023-01-03 Thread GitBox
WillAyd commented on PR #15041: URL: https://github.com/apache/arrow/pull/15041#issuecomment-1370288509 Latest commit isn't idiomatic but pushing for what I think are the "correct" numbers first. Here's an interesting comparison of the benchmark for the different types: ```sh will

[GitHub] [arrow-datafusion] wolffcm opened a new issue, #4809: Support Gap Filling on Time Series Data

2023-01-03 Thread GitBox
wolffcm opened a new issue, #4809: URL: https://github.com/apache/arrow-datafusion/issues/4809 **Is your feature request related to a problem or challenge? Please describe what you are trying to do.** A common use case when working with time series data is to compute an aggregate val

[GitHub] [arrow-datafusion] ursabot commented on pull request #4807: Enable PhysicalOptimizerRule lazily (#4806)

2023-01-03 Thread GitBox
ursabot commented on PR #4807: URL: https://github.com/apache/arrow-datafusion/pull/4807#issuecomment-1370269634 Benchmark runs are scheduled for baseline = 34a8b864d08d6199547c019414b0ac77b75d8e61 and contender = 68dc644c31892238bfc014feec764e916261800b. 68dc644c31892238bfc014feec764e916

[GitHub] [arrow-datafusion] ursabot commented on pull request #4805: Move default catalog and schema onto ConfigOptions (#3887)

2023-01-03 Thread GitBox
ursabot commented on PR #4805: URL: https://github.com/apache/arrow-datafusion/pull/4805#issuecomment-1370269651 Benchmark runs are scheduled for baseline = 68dc644c31892238bfc014feec764e916261800b and contender = 5b70e3543f5c10832e43ed25e7d4166cf0c1df78. 5b70e3543f5c10832e43ed25e7d4166cf

[GitHub] [arrow-datafusion] tustvold commented on a diff in pull request #4607: Make SchemaProvider::table async

2023-01-03 Thread GitBox
tustvold commented on code in PR #4607: URL: https://github.com/apache/arrow-datafusion/pull/4607#discussion_r1060990226 ## datafusion/core/src/execution/context.rs: ## @@ -1643,6 +1621,72 @@ impl SessionState { self } +/// Creates a [`LogicalPlan`] from the

[GitHub] [arrow-datafusion] tustvold commented on a diff in pull request #4607: Make SchemaProvider::table async

2023-01-03 Thread GitBox
tustvold commented on code in PR #4607: URL: https://github.com/apache/arrow-datafusion/pull/4607#discussion_r1060989938 ## datafusion/core/src/execution/context.rs: ## @@ -1643,6 +1621,72 @@ impl SessionState { self } +/// Creates a [`LogicalPlan`] from the

[GitHub] [arrow-datafusion] tustvold merged pull request #4805: Move default catalog and schema onto ConfigOptions (#3887)

2023-01-03 Thread GitBox
tustvold merged PR #4805: URL: https://github.com/apache/arrow-datafusion/pull/4805

[GitHub] [arrow-datafusion] tustvold closed issue #4806: Physical Optimizer Config Mutation Doesn't Take Effect

2023-01-03 Thread GitBox
tustvold closed issue #4806: Physical Optimizer Config Mutation Doesn't Take Effect URL: https://github.com/apache/arrow-datafusion/issues/4806

[GitHub] [arrow-datafusion] tustvold merged pull request #4807: Enable PhysicalOptimizerRule lazily (#4806)

2023-01-03 Thread GitBox
tustvold merged PR #4807: URL: https://github.com/apache/arrow-datafusion/pull/4807

[GitHub] [arrow-datafusion] andygrove commented on issue #4808: Remove DFParser

2023-01-03 Thread GitBox
andygrove commented on issue #4808: URL: https://github.com/apache/arrow-datafusion/issues/4808#issuecomment-1370264668 Makes sense to me. Also, just as an FYI, when DFParser was introduced, we did not have a good mechanism for selectively overriding parsing in sqlparser-rs but we d

[GitHub] [arrow-datafusion] tustvold commented on issue #3777: An asynchronous version of `CatalogList`/`CatalogProvider`/`SchemaProvider`

2023-01-03 Thread GitBox
tustvold commented on issue #3777: URL: https://github.com/apache/arrow-datafusion/issues/3777#issuecomment-1370262246 https://github.com/apache/arrow-datafusion/issues/3777 is now ready for review, PTAL. Hopefully we can get this in to make the release at the end of this week :smile:

[GitHub] [arrow-nanoarrow] paleolimbot merged pull request #86: Revisit some poorly chosen names

2023-01-03 Thread GitBox
paleolimbot merged PR #86: URL: https://github.com/apache/arrow-nanoarrow/pull/86

[GitHub] [arrow-datafusion] tustvold commented on pull request #4607: POC: Async catalog

2023-01-03 Thread GitBox
tustvold commented on PR #4607: URL: https://github.com/apache/arrow-datafusion/pull/4607#issuecomment-1370260052 I think this is now ready for review. It would be amazing if this could make this week's release, as this has been a frequently requested feature

[GitHub] [arrow] wjones127 commented on a diff in pull request #15101: GH-15100: [C++][Parquet] Add benchmark for reading strings from Parquet

2023-01-03 Thread GitBox
wjones127 commented on code in PR #15101: URL: https://github.com/apache/arrow/pull/15101#discussion_r1060973607 ## cpp/src/parquet/arrow/reader_writer_benchmark.cc: ## @@ -197,6 +197,52 @@ BENCHMARK_TEMPLATE2(BM_WriteColumn, true, DoubleType); BENCHMARK_TEMPLATE2(BM_WriteColum

[GitHub] [arrow] wjones127 commented on a diff in pull request #15101: GH-15100: [C++][Parquet] Add benchmark for reading strings from Parquet

2023-01-03 Thread GitBox
wjones127 commented on code in PR #15101: URL: https://github.com/apache/arrow/pull/15101#discussion_r1060972611 ## cpp/src/parquet/arrow/reader_writer_benchmark.cc: ## @@ -197,6 +197,52 @@ BENCHMARK_TEMPLATE2(BM_WriteColumn, true, DoubleType); BENCHMARK_TEMPLATE2(BM_WriteColum

[GitHub] [arrow-datafusion] andygrove commented on a diff in pull request #4803: Move ConfigOptions to core

2023-01-03 Thread GitBox
andygrove commented on code in PR #4803: URL: https://github.com/apache/arrow-datafusion/pull/4803#discussion_r1060971835 ## datafusion/optimizer/src/optimizer.rs: ## @@ -92,9 +92,12 @@ pub struct OptimizerContext { impl OptimizerContext { /// Create optimizer config

[GitHub] [arrow] wjones127 commented on a diff in pull request #15101: GH-15100: [C++][Parquet] Add benchmark for reading strings from Parquet

2023-01-03 Thread GitBox
wjones127 commented on code in PR #15101: URL: https://github.com/apache/arrow/pull/15101#discussion_r1060970751 ## cpp/src/parquet/arrow/reader_writer_benchmark.cc: ## @@ -197,6 +197,52 @@ BENCHMARK_TEMPLATE2(BM_WriteColumn, true, DoubleType); BENCHMARK_TEMPLATE2(BM_WriteColum

[GitHub] [arrow-rs] declark1 opened a new issue, #3446: object_store: temporary aws credentials not refreshed?

2023-01-03 Thread GitBox
declark1 opened a new issue, #3446: URL: https://github.com/apache/arrow-rs/issues/3446 I'm using the AmazonS3Builder to create an AmazonS3 object store in my service. The service runs in Kubernetes and temporary service account credentials are injected into the pods via IAM Roles for Servi

[GitHub] [arrow-adbc] lidavidm opened a new pull request, #312: fix(python): add driver -> driver manager dependency

2023-01-03 Thread GitBox
lidavidm opened a new pull request, #312: URL: https://github.com/apache/arrow-adbc/pull/312 Fixes #307.

[GitHub] [arrow] zeroshade merged pull request #15177: GH-15174: [Go][FlightRPC] Expose Flight Server Desc and RegisterFlightService

2023-01-03 Thread GitBox
zeroshade merged PR #15177: URL: https://github.com/apache/arrow/pull/15177

[GitHub] [arrow] mariusvniekerk commented on pull request #15177: GH-15174: [Go][FlightRPC] Expose Flight Server Desc and RegisterFlightService

2023-01-03 Thread GitBox
mariusvniekerk commented on PR #15177: URL: https://github.com/apache/arrow/pull/15177#issuecomment-137080 Looks like it would work nicely for using flight with an already existing grpc server.

[GitHub] [arrow-datafusion] alamb commented on a diff in pull request #4792: remove Operator::{Like,NotLike,ILike,NotILike}

2023-01-03 Thread GitBox
alamb commented on code in PR #4792: URL: https://github.com/apache/arrow-datafusion/pull/4792#discussion_r1060940947 ## datafusion/expr/src/type_coercion/binary.rs: ## @@ -858,6 +850,13 @@ mod tests { }}; } +macro_rules! test_coercion_like_rule { Review Com
