[GitHub] [arrow-datafusion] xudong963 opened a new pull request #1707: refine test in repartition.rs & coalesce_batches.rs

2022-01-30 Thread GitBox
xudong963 opened a new pull request #1707: URL: https://github.com/apache/arrow-datafusion/pull/1707 While reading the partition-related code, I found some places that could be refined -- This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [arrow] ursabot edited a comment on pull request #12170: ARROW-14461 [R] write_dataset() allows users to pass invalid additional arguments

2022-01-30 Thread GitBox
ursabot edited a comment on pull request #12170: URL: https://github.com/apache/arrow/pull/12170#issuecomment-1024331333 Benchmark runs are scheduled for baseline = 39367db2dab321dbbf4d12d2229020614b049dde and contender = 07ec0a12d430dc9151678b6f00d5c6fc0598f034. 07ec0a12d430dc9151678b6f0

[GitHub] [arrow] github-actions[bot] commented on pull request #12297: ARROW-15503: [GLib][Release] Avoid deprecation warning

2022-01-30 Thread GitBox
github-actions[bot] commented on pull request #12297: URL: https://github.com/apache/arrow/pull/12297#issuecomment-1025098133

[GitHub] [arrow-datafusion] yjshen opened a new issue #1708: Introduce a `Vec` based row-wise representation for DataFusion

2022-01-30 Thread GitBox
yjshen opened a new issue #1708: URL: https://github.com/apache/arrow-datafusion/issues/1708 **Is your feature request related to a problem or challenge? Please describe what you are trying to do.** **_Many pipeline-breaking operators are inherently row-based:_** For sort tha

[GitHub] [arrow-rs] alamb commented on issue #1240: Get `Unknown configuration option rust-version` when running the rust format command

2022-01-30 Thread GitBox
alamb commented on issue #1240: URL: https://github.com/apache/arrow-rs/issues/1240#issuecomment-1025114661 Hi @HaoYang670 -- I do see the same problem and I agree with your assessment of the problem. I had assumed it was related to not using nightly, but it happens for me with nig

[GitHub] [arrow-datafusion] alamb commented on issue #1705: Simplify creating new `ListingTable`

2022-01-30 Thread GitBox
alamb commented on issue #1705: URL: https://github.com/apache/arrow-datafusion/issues/1705#issuecomment-1025115375 > @alamb @houqp @seddonm1 what do you think about this proposal? I think the usecase of defaulting schema and format makes a lot of sense If you are going to

[GitHub] [arrow-datafusion] alamb closed issue #1698: Implement TableProvider for DataFrameImpl to allow registration of logical plans

2022-01-30 Thread GitBox
alamb closed issue #1698: URL: https://github.com/apache/arrow-datafusion/issues/1698

[GitHub] [arrow-datafusion] alamb merged pull request #1699: Implement TableProvider for DataFrameImpl

2022-01-30 Thread GitBox
alamb merged pull request #1699: URL: https://github.com/apache/arrow-datafusion/pull/1699

[GitHub] [arrow-datafusion] alamb commented on a change in pull request #1699: Implement TableProvider for DataFrameImpl

2022-01-30 Thread GitBox
alamb commented on a change in pull request #1699: URL: https://github.com/apache/arrow-datafusion/pull/1699#discussion_r795161808 ## File path: datafusion/src/execution/dataframe_impl.rs ## @@ -62,6 +68,60 @@ impl DataFrameImpl { } } +#[async_trait] +impl TableProvider

[GitHub] [arrow-datafusion] alamb edited a comment on issue #1693: Expression Simplification for`Expr::Case` expressions

2022-01-30 Thread GitBox
alamb edited a comment on issue #1693: URL: https://github.com/apache/arrow-datafusion/issues/1693#issuecomment-1025116217 > Do you mean that you can get the no NULLS or all NULLS from the statistics of the chunk or row group of parquet? Yes. The usecase is that in one of our

[GitHub] [arrow-datafusion] alamb commented on issue #1693: Expression Simplification for`Expr::Case` expressions

2022-01-30 Thread GitBox
alamb commented on issue #1693: URL: https://github.com/apache/arrow-datafusion/issues/1693#issuecomment-1025116217 > Do you mean that you can get the no NULLS or all NULLS from the statistics of the chunk or row group of parquet? Yes. The usecase is that in one of our suppor

[GitHub] [arrow] ursabot edited a comment on pull request #12208: ARROW-14419 [R] Add filter + join test

2022-01-30 Thread GitBox
ursabot edited a comment on pull request #12208: URL: https://github.com/apache/arrow/pull/12208#issuecomment-1024375630 Benchmark runs are scheduled for baseline = 07ec0a12d430dc9151678b6f00d5c6fc0598f034 and contender = c5b757fe607b1e5824053da279a727e35e877e0a. c5b757fe607b1e5824053da27

[GitHub] [arrow-rs] alamb commented on a change in pull request #1223: `DecimalArray` API ergonomics: add iter(), create from iter(), change precision / scale

2022-01-30 Thread GitBox
alamb commented on a change in pull request #1223: URL: https://github.com/apache/arrow-rs/pull/1223#discussion_r795166839 ## File path: arrow/src/datatypes/datatype.rs ## @@ -189,6 +194,12 @@ impl fmt::Display for DataType { } } +/// The maximum precision for [DataType

[GitHub] [arrow-rs] codecov-commenter edited a comment on pull request #1223: `DecimalArray` API ergonomics: add iter(), create from iter(), change precision / scale

2022-01-30 Thread GitBox
codecov-commenter edited a comment on pull request #1223: URL: https://github.com/apache/arrow-rs/pull/1223#issuecomment-1019476786 # [Codecov](https://codecov.io/gh/apache/arrow-rs/pull/1223?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm

[GitHub] [arrow-rs] alamb opened a new pull request #1249: Use new DecimalArray creation API in arrow crate

2022-01-30 Thread GitBox
alamb opened a new pull request #1249: URL: https://github.com/apache/arrow-rs/pull/1249 Builds on https://github.com/apache/arrow-rs/pull/1223 so draft until that is done Rationale: https://github.com/apache/arrow-rs/pull/1223 introduces a more performant and idiomatic API f

[GitHub] [arrow-rs] alamb commented on pull request #1223: `DecimalArray` API ergonomics: add iter(), create from iter(), change precision / scale

2022-01-30 Thread GitBox
alamb commented on pull request #1223: URL: https://github.com/apache/arrow-rs/pull/1223#issuecomment-1025135391 This PR is now ready for (re) review @liukun4515 cc @sweb as you contributed the initial `DecimalArray` implementation (thanks again!) You can see examples of how

[GitHub] [arrow-rs] alamb commented on a change in pull request #1249: Use new DecimalArray creation API in arrow crate

2022-01-30 Thread GitBox
alamb commented on a change in pull request #1249: URL: https://github.com/apache/arrow-rs/pull/1249#discussion_r795177165 ## File path: arrow/src/compute/kernels/take.rs ## @@ -496,27 +496,30 @@ where IndexType: ArrowNumericType, IndexType::Native: ToPrimitive, { -

[GitHub] [arrow-rs] alamb commented on a change in pull request #1223: `DecimalArray` API ergonomics: add iter(), create from iter(), change precision / scale

2022-01-30 Thread GitBox
alamb commented on a change in pull request #1223: URL: https://github.com/apache/arrow-rs/pull/1223#discussion_r795177240 ## File path: arrow/src/array/builder.rs ## @@ -1153,87 +1153,6 @@ pub struct FixedSizeBinaryBuilder { builder: FixedSizeListBuilder, } -pub const

[GitHub] [arrow-rs] alamb commented on a change in pull request #1223: `DecimalArray` API ergonomics: add iter(), create from iter(), change precision / scale

2022-01-30 Thread GitBox
alamb commented on a change in pull request #1223: URL: https://github.com/apache/arrow-rs/pull/1223#discussion_r795177390 ## File path: arrow/src/csv/reader.rs ## @@ -900,15 +899,8 @@ fn parse_decimal_with_parameter(s: &str, precision: usize, scale: usize) -> Resu if

[GitHub] [arrow-rs] codecov-commenter commented on pull request #1249: Use new DecimalArray creation API in arrow crate

2022-01-30 Thread GitBox
codecov-commenter commented on pull request #1249: URL: https://github.com/apache/arrow-rs/pull/1249#issuecomment-1025137843 # [Codecov](https://codecov.io/gh/apache/arrow-rs/pull/1249?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=T

[GitHub] [arrow-rs] codecov-commenter edited a comment on pull request #1247: Use new DecimalArray creation API in parquet crate

2022-01-30 Thread GitBox
codecov-commenter edited a comment on pull request #1247: URL: https://github.com/apache/arrow-rs/pull/1247#issuecomment-1024943559 # [Codecov](https://codecov.io/gh/apache/arrow-rs/pull/1247?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm

[GitHub] [arrow-datafusion] alamb merged pull request #1707: refine test in repartition.rs & coalesce_batches.rs

2022-01-30 Thread GitBox
alamb merged pull request #1707: URL: https://github.com/apache/arrow-datafusion/pull/1707

[GitHub] [arrow-datafusion] alamb merged pull request #1706: Fuzz test for spillable sort

2022-01-30 Thread GitBox
alamb merged pull request #1706: URL: https://github.com/apache/arrow-datafusion/pull/1706

[GitHub] [arrow-datafusion] alamb closed issue #1573: SQL tests for when sorting exceeded available memory and had to spill to disk

2022-01-30 Thread GitBox
alamb closed issue #1573: URL: https://github.com/apache/arrow-datafusion/issues/1573

[GitHub] [arrow-datafusion] alamb merged pull request #1695: Lazy TempDir creation in DiskManager

2022-01-30 Thread GitBox
alamb merged pull request #1695: URL: https://github.com/apache/arrow-datafusion/pull/1695

[GitHub] [arrow-datafusion] alamb commented on a change in pull request #1685: Incorporate dyn scalar kernels

2022-01-30 Thread GitBox
alamb commented on a change in pull request #1685: URL: https://github.com/apache/arrow-datafusion/pull/1685#discussion_r795181843 ## File path: datafusion/src/physical_plan/expressions/binary.rs ## @@ -878,26 +946,64 @@ impl PhysicalExpr for BinaryExpr { } } +/// The b

[GitHub] [arrow-datafusion] alamb closed issue #1610: Switch datafusion to using `eq_dyn_scalar`, etc kernels

2022-01-30 Thread GitBox
alamb closed issue #1610: URL: https://github.com/apache/arrow-datafusion/issues/1610

[GitHub] [arrow-datafusion] alamb merged pull request #1685: Incorporate dyn scalar kernels

2022-01-30 Thread GitBox
alamb merged pull request #1685: URL: https://github.com/apache/arrow-datafusion/pull/1685

[GitHub] [arrow-datafusion] thinkharderdev opened a new pull request #1709: Create SchemaAdapter trait to map table schema to file schemas

2022-01-30 Thread GitBox
thinkharderdev opened a new pull request #1709: URL: https://github.com/apache/arrow-datafusion/pull/1709 # Which issue does this PR close? Closes #1669 # Rationale for this change Previously, we added the ability to merge parquet files on read when they c

[GitHub] [arrow] ursabot edited a comment on pull request #11956: ARROW-10456: [R] Implement MapType and MapArray

2022-01-30 Thread GitBox
ursabot edited a comment on pull request #11956: URL: https://github.com/apache/arrow/pull/11956#issuecomment-1024422478 Benchmark runs are scheduled for baseline = c5b757fe607b1e5824053da279a727e35e877e0a and contender = f92219d05e0255157f628baa445824a96ff94ada. f92219d05e0255157f628baa4

[GitHub] [arrow] okadakk commented on pull request #12269: ARROW-15462: [GLib] Add GArrow{Month,DayTime,MonthDayNano}Interval{Scalar,Array,ArrayBuilder}

2022-01-30 Thread GitBox
okadakk commented on pull request #12269: URL: https://github.com/apache/arrow/pull/12269#issuecomment-1025147669 I'm sorry I submitted everything in one pull request. I'll try to split things up next time. Thank you for your quick review!

[GitHub] [arrow-datafusion] Dandandan commented on pull request #1707: refine test in repartition.rs & coalesce_batches.rs

2022-01-30 Thread GitBox
Dandandan commented on pull request #1707: URL: https://github.com/apache/arrow-datafusion/pull/1707#issuecomment-1025171457 Nice cleanup!

[GitHub] [arrow-rs] Dandandan commented on pull request #1248: POC: Specialized filter kernels

2022-01-30 Thread GitBox
Dandandan commented on pull request #1248: URL: https://github.com/apache/arrow-rs/pull/1248#issuecomment-1025171943 This is really cool and promising! In my experience the filter kernels can be quite expensive in benchmarks. Great stuff 😃

[GitHub] [arrow-datafusion] matthewmturner commented on issue #1705: Simplify creating new `ListingTable`

2022-01-30 Thread GitBox
matthewmturner commented on issue #1705: URL: https://github.com/apache/arrow-datafusion/issues/1705#issuecomment-1025175512 @alamb great idea. Will do that.

[GitHub] [arrow] ursabot edited a comment on pull request #12284: ARROW-15495: [C++][FlightRPC] Require Protobuf/gRPC SOURCEs to match

2022-01-30 Thread GitBox
ursabot edited a comment on pull request #12284: URL: https://github.com/apache/arrow/pull/12284#issuecomment-1024732280 Benchmark runs are scheduled for baseline = f92219d05e0255157f628baa445824a96ff94ada and contender = ed3113b8bd286b8cf29b1d349fa9f3444706347c. ed3113b8bd286b8cf29b1d349

[GitHub] [arrow] Crystrix opened a new pull request #12298: Support null type in product aggregation

2022-01-30 Thread GitBox
Crystrix opened a new pull request #12298: URL: https://github.com/apache/arrow/pull/12298 The product of an empty array or min_count == 0 returns an int64 scalar of 1; otherwise it returns a null int64 scalar.

[GitHub] [arrow] github-actions[bot] commented on pull request #12298: Support null type in product aggregation

2022-01-30 Thread GitBox
github-actions[bot] commented on pull request #12298: URL: https://github.com/apache/arrow/pull/12298#issuecomment-1025186966 Thanks for opening a pull request! If this is not a [minor PR](https://github.com/apache/arrow/blob/master/CONTRIBUTING.md#Minor-Fixes). Could you op

[GitHub] [arrow] github-actions[bot] commented on pull request #12298: ARROW-15505: [C++][Compute] Support null type in product aggregation

2022-01-30 Thread GitBox
github-actions[bot] commented on pull request #12298: URL: https://github.com/apache/arrow/pull/12298#issuecomment-1025187165

[GitHub] [arrow] Crystrix opened a new pull request #12299: ARROW-15506: [C++][Compute] Support Null type in hash_sum/hash_product/hash_mean

2022-01-30 Thread GitBox
Crystrix opened a new pull request #12299: URL: https://github.com/apache/arrow/pull/12299 - If min_count == 0 and skip_nulls == true, `hash_sum` returns an int64 scalar of 0; otherwise it returns an int64 scalar of null - If min_count == 0 and skip_nulls == true `hash_product` returns an int6

[GitHub] [arrow] github-actions[bot] commented on pull request #12299: ARROW-15506: [C++][Compute] Support Null type in hash_sum/hash_product/hash_mean

2022-01-30 Thread GitBox
github-actions[bot] commented on pull request #12299: URL: https://github.com/apache/arrow/pull/12299#issuecomment-1025190119 https://issues.apache.org/jira/browse/ARROW-15506

[GitHub] [arrow-datafusion] cpcloud commented on a change in pull request #1699: Implement TableProvider for DataFrameImpl

2022-01-30 Thread GitBox
cpcloud commented on a change in pull request #1699: URL: https://github.com/apache/arrow-datafusion/pull/1699#discussion_r795218939 ## File path: datafusion/src/execution/dataframe_impl.rs ## @@ -62,6 +68,60 @@ impl DataFrameImpl { } } +#[async_trait] +impl TableProvid

[GitHub] [arrow-datafusion] cpcloud commented on a change in pull request #1699: Implement TableProvider for DataFrameImpl

2022-01-30 Thread GitBox
cpcloud commented on a change in pull request #1699: URL: https://github.com/apache/arrow-datafusion/pull/1699#discussion_r795221151 ## File path: datafusion/src/execution/dataframe_impl.rs ## @@ -62,6 +68,60 @@ impl DataFrameImpl { } } +#[async_trait] +impl TableProvid

[GitHub] [arrow-datafusion] houqp commented on issue #1705: Simplify creating new `ListingTable`

2022-01-30 Thread GitBox
houqp commented on issue #1705: URL: https://github.com/apache/arrow-datafusion/issues/1705#issuecomment-1025199602 Both your idea and @alamb's suggestion of a builder pattern sound like a good plan to me :+1: Thank you for bringing this up.

[GitHub] [arrow-datafusion] OscarTHZhang opened a new issue #1710: Column names for SQL queries on CSV files should not be case sensitive

2022-01-30 Thread GitBox
OscarTHZhang opened a new issue #1710: URL: https://github.com/apache/arrow-datafusion/issues/1710 **Describe the bug** If the column names in a CSV file are uppercase, then typing lowercase column names in SQL queries will result in an Error: `Invalid identifier for schema`. **T

[GitHub] [arrow] ursabot edited a comment on pull request #12292: ARROW-15499: [Python] Fix import error in pyarrow._orc

2022-01-30 Thread GitBox
ursabot edited a comment on pull request #12292: URL: https://github.com/apache/arrow/pull/12292#issuecomment-1024761980 Benchmark runs are scheduled for baseline = ed3113b8bd286b8cf29b1d349fa9f3444706347c and contender = fcab4814f658e3adf181f122d016c2b04a2667c6. fcab4814f658e3adf181f122d

[GitHub] [arrow-datafusion] wjones127 opened a new pull request #1711: Add tests and CI for optional pyarrow module

2022-01-30 Thread GitBox
wjones127 opened a new pull request #1711: URL: https://github.com/apache/arrow-datafusion/pull/1711 # Which issue does this PR close? Closes #1635. # Rationale for this change The build was reported as broken, so we should test it. In addition, datafusion-contri

[GitHub] [arrow-datafusion] cpcloud commented on a change in pull request #1699: Implement TableProvider for DataFrameImpl

2022-01-30 Thread GitBox
cpcloud commented on a change in pull request #1699: URL: https://github.com/apache/arrow-datafusion/pull/1699#discussion_r795250108 ## File path: datafusion/src/execution/dataframe_impl.rs ## @@ -62,6 +68,60 @@ impl DataFrameImpl { } } +#[async_trait] +impl TableProvid

[GitHub] [arrow] kou closed pull request #12297: ARROW-15503: [GLib][Release] Avoid deprecation warning

2022-01-30 Thread GitBox
kou closed pull request #12297: URL: https://github.com/apache/arrow/pull/12297

[GitHub] [arrow-datafusion] cpcloud opened a new pull request #1712: generalize table provider df impl

2022-01-30 Thread GitBox
cpcloud opened a new pull request #1712: URL: https://github.com/apache/arrow-datafusion/pull/1712 This is a draft PR following up a [discussion](https://github.com/apache/arrow-datafusion/pull/1699#discussion_r794994473) on #1699. Issues: 1. The execution context state

[GitHub] [arrow] ursabot commented on pull request #12297: ARROW-15503: [GLib][Release] Avoid deprecation warning

2022-01-30 Thread GitBox
ursabot commented on pull request #12297: URL: https://github.com/apache/arrow/pull/12297#issuecomment-1025235804 Benchmark runs are scheduled for baseline = cc4e2a54309813e636ba50bcd22a7b71d3d9 and contender = 690e22f8256d2d4fe548cdbdaf2d70362780fdff. 690e22f8256d2d4fe548cdbdaf2d7036

[GitHub] [arrow] ursabot edited a comment on pull request #12288: ARROW-15497: [C++][Homebrew] Use Clang Tools 12

2022-01-30 Thread GitBox
ursabot edited a comment on pull request #12288: URL: https://github.com/apache/arrow/pull/12288#issuecomment-1024991556 Benchmark runs are scheduled for baseline = fcab4814f658e3adf181f122d016c2b04a2667c6 and contender = ff37b7adf21b319c0d08b2eb09ecbd8db0794cbe. ff37b7adf21b319c0d08b2eb0

[GitHub] [arrow] ursabot edited a comment on pull request #12297: ARROW-15503: [GLib][Release] Avoid deprecation warning

2022-01-30 Thread GitBox
ursabot edited a comment on pull request #12297: URL: https://github.com/apache/arrow/pull/12297#issuecomment-1025235804 Benchmark runs are scheduled for baseline = cc4e2a54309813e636ba50bcd22a7b71d3d9 and contender = 690e22f8256d2d4fe548cdbdaf2d70362780fdff. 690e22f8256d2d4fe548cdbda

[GitHub] [arrow-rs] codecov-commenter edited a comment on pull request #1248: POC: Specialized filter kernels

2022-01-30 Thread GitBox
codecov-commenter edited a comment on pull request #1248: URL: https://github.com/apache/arrow-rs/pull/1248#issuecomment-1025014491 # [Codecov](https://codecov.io/gh/apache/arrow-rs/pull/1248?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm

[GitHub] [arrow-rs] tustvold commented on pull request #1248: POC: Specialized filter kernels

2022-01-30 Thread GitBox
tustvold commented on pull request #1248: URL: https://github.com/apache/arrow-rs/pull/1248#issuecomment-1025251778 I found some time this afternoon, so bashed out porting the filter context abstraction (caching the selection vector) and fixing up the null buffer construction. Here'

[GitHub] [arrow-rs] codecov-commenter edited a comment on pull request #1248: POC: Specialized filter kernels

2022-01-30 Thread GitBox
codecov-commenter edited a comment on pull request #1248: URL: https://github.com/apache/arrow-rs/pull/1248#issuecomment-1025014491 # [Codecov](https://codecov.io/gh/apache/arrow-rs/pull/1248?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm

[GitHub] [arrow] ursabot edited a comment on pull request #12297: ARROW-15503: [GLib][Release] Avoid deprecation warning

2022-01-30 Thread GitBox
ursabot edited a comment on pull request #12297: URL: https://github.com/apache/arrow/pull/12297#issuecomment-1025235804 Benchmark runs are scheduled for baseline = cc4e2a54309813e636ba50bcd22a7b71d3d9 and contender = 690e22f8256d2d4fe548cdbdaf2d70362780fdff. 690e22f8256d2d4fe548cdbda

[GitHub] [arrow-datafusion] wjones127 commented on pull request #1711: Add tests and CI for optional pyarrow module

2022-01-30 Thread GitBox
wjones127 commented on pull request #1711: URL: https://github.com/apache/arrow-datafusion/pull/1711#issuecomment-1025259032 cc @alamb @kszucs

[GitHub] [arrow-datafusion] HaoYang670 commented on issue #115: Split the logical operators out into separate source files

2022-01-30 Thread GitBox
HaoYang670 commented on issue #115: URL: https://github.com/apache/arrow-datafusion/issues/115#issuecomment-1025260638 Could you please give more detail?

[GitHub] [arrow-datafusion] HaoYang670 commented on issue #373: change from `pub(super)` or `pub(crate)` when reusing `fn`s in aggregate.rs in window function implementations

2022-01-30 Thread GitBox
HaoYang670 commented on issue #373: URL: https://github.com/apache/arrow-datafusion/issues/373#issuecomment-1025261367 I'd like to give this a try if no one else is working on it.

[GitHub] [arrow-datafusion] houqp commented on pull request #1712: generalize table provider df impl

2022-01-30 Thread GitBox
houqp commented on pull request #1712: URL: https://github.com/apache/arrow-datafusion/pull/1712#issuecomment-1025262177 @cpcloud good point on lack of access to the trait_upcasting feature :( I found an interesting workaround for this online based on https://articles.bchlr.de/traits-dyna

[GitHub] [arrow-rs] HaoYang670 opened a new pull request #1250: Add docs examples for dynamically compare functions

2022-01-30 Thread GitBox
HaoYang670 opened a new pull request #1250: URL: https://github.com/apache/arrow-rs/pull/1250 Signed-off-by: remzi <[email protected]> # Which issue does this PR close? Closes #1202. # Rationale for this change Make it easier for users to under

[GitHub] [arrow-rs] codecov-commenter commented on pull request #1250: Add docs examples for dynamically compare functions

2022-01-30 Thread GitBox
codecov-commenter commented on pull request #1250: URL: https://github.com/apache/arrow-rs/pull/1250#issuecomment-1025266523 # [Codecov](https://codecov.io/gh/apache/arrow-rs/pull/1250?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=T

[GitHub] [arrow-datafusion] houqp commented on pull request #1712: generalize table provider df impl

2022-01-30 Thread GitBox
houqp commented on pull request #1712: URL: https://github.com/apache/arrow-datafusion/pull/1712#issuecomment-1025266914 As for the first issue you mentioned, I think we need to avoid referencing both `DataFrameImpl` and `ExecutionContext` within the default TableProvider implementation b

[GitHub] [arrow-datafusion] houqp edited a comment on pull request #1712: generalize table provider df impl

2022-01-30 Thread GitBox
houqp edited a comment on pull request #1712: URL: https://github.com/apache/arrow-datafusion/pull/1712#issuecomment-1025266914 As for the first issue you mentioned, I think we need to avoid referencing both `DataFrameImpl` and `ExecutionContext` within the default TableProvider implement

[GitHub] [arrow] ursabot edited a comment on pull request #12269: ARROW-15462: [GLib] Add GArrow{Month,DayTime,MonthDayNano}Interval{Scalar,Array,ArrayBuilder}

2022-01-30 Thread GitBox
ursabot edited a comment on pull request #12269: URL: https://github.com/apache/arrow/pull/12269#issuecomment-1025001545 Benchmark runs are scheduled for baseline = ff37b7adf21b319c0d08b2eb09ecbd8db0794cbe and contender = cc4e2a54309813e636ba50bcd22a7b71d3d9. cc4e2a54309813e636ba5

[GitHub] [arrow-datafusion] hntd187 commented on issue #1544: Streaming support for DataFusion

2022-01-30 Thread GitBox
hntd187 commented on issue #1544: URL: https://github.com/apache/arrow-datafusion/issues/1544#issuecomment-1025309474 So just updating on this I'm starting to look into state management for streaming now. Tell me, do we have any concept right now if accessing or manipulating partitions or

[GitHub] [arrow] ursabot edited a comment on pull request #12297: ARROW-15503: [GLib][Release] Avoid deprecation warning

2022-01-30 Thread GitBox
ursabot edited a comment on pull request #12297: URL: https://github.com/apache/arrow/pull/12297#issuecomment-1025235804 Benchmark runs are scheduled for baseline = cc4e2a54309813e636ba50bcd22a7b71d3d9 and contender = 690e22f8256d2d4fe548cdbdaf2d70362780fdff. 690e22f8256d2d4fe548cdbda

[GitHub] [arrow-datafusion] HaoYang670 opened a new pull request #1713: Add upper bound for public function `signature`

2022-01-30 Thread GitBox
HaoYang670 opened a new pull request #1713: URL: https://github.com/apache/arrow-datafusion/pull/1713 Signed-off-by: remzi <[email protected]> # Which issue does this PR close? Closes #373. # Rationale for this change # What changes are included

[GitHub] [arrow-datafusion] houqp commented on issue #1544: Streaming support for DataFusion

2022-01-30 Thread GitBox
houqp commented on issue #1544: URL: https://github.com/apache/arrow-datafusion/issues/1544#issuecomment-1025411442 @hntd187 most of the file partition pruning and partition column population logic lives in the ListingTable module as far as I know.

[GitHub] [arrow-datafusion] xudong963 opened a new pull request #1714: make clearer

2022-01-30 Thread GitBox
xudong963 opened a new pull request #1714: URL: https://github.com/apache/arrow-datafusion/pull/1714 `select_to_plan` is the core of generating the logical plan; making it clearer will be nice for newcomers. BTW, this is my last PR in 2021 (Chinese year) 😄 Happy Spring Festival 🎆!

[GitHub] [arrow-datafusion] matthewmturner opened a new pull request #1715: Create ListingTableConfig which includes file format and schema inference

2022-01-30 Thread GitBox
matthewmturner opened a new pull request #1715: URL: https://github.com/apache/arrow-datafusion/pull/1715 # Which issue does this PR close? Closes #1705 # Rationale for this change # What changes are included in this PR? # Are there any user-faci

[GitHub] [arrow-datafusion] houqp commented on pull request #1711: Add tests and CI for optional pyarrow module

2022-01-30 Thread GitBox
houqp commented on pull request #1711: URL: https://github.com/apache/arrow-datafusion/pull/1711#issuecomment-1025446533 @wjones127 looks like the newly added CI job is failing.

[GitHub] [arrow-datafusion] houqp commented on pull request #1665: Fix can not load parquet table form spark in datafusion-cli.

2022-01-30 Thread GitBox
houqp commented on pull request #1665: URL: https://github.com/apache/arrow-datafusion/pull/1665#issuecomment-1025454650 Nice work @Ted-Jiang :+1: @Jimexist @alamb you want to take a final look?