xudong963 opened a new pull request #1707:
URL: https://github.com/apache/arrow-datafusion/pull/1707
While reading the partition-related code, I found some places that
could be refined
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on
ursabot edited a comment on pull request #12170:
URL: https://github.com/apache/arrow/pull/12170#issuecomment-1024331333
Benchmark runs are scheduled for baseline =
39367db2dab321dbbf4d12d2229020614b049dde and contender =
07ec0a12d430dc9151678b6f00d5c6fc0598f034.
07ec0a12d430dc9151678b6f0
github-actions[bot] commented on pull request #12297:
URL: https://github.com/apache/arrow/pull/12297#issuecomment-1025098133
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
yjshen opened a new issue #1708:
URL: https://github.com/apache/arrow-datafusion/issues/1708
**Is your feature request related to a problem or challenge? Please describe
what you are trying to do.**
**_Many pipeline-breaking operators are inherently row-based:_**
For sort tha
alamb commented on issue #1240:
URL: https://github.com/apache/arrow-rs/issues/1240#issuecomment-1025114661
Hi @HaoYang670 -- I do see the same problem and I agree with your
assessment of the problem.
I had assumed it was related to not using nightly, but it happens for me
with nig
alamb commented on issue #1705:
URL: https://github.com/apache/arrow-datafusion/issues/1705#issuecomment-1025115375
> @alamb @houqp @seddonm1 what do you think about this proposal?
I think the usecase of defaulting schema and format makes a lot of sense
If you are going to
alamb closed issue #1698:
URL: https://github.com/apache/arrow-datafusion/issues/1698
alamb merged pull request #1699:
URL: https://github.com/apache/arrow-datafusion/pull/1699
alamb commented on a change in pull request #1699:
URL: https://github.com/apache/arrow-datafusion/pull/1699#discussion_r795161808
##
File path: datafusion/src/execution/dataframe_impl.rs
##
@@ -62,6 +68,60 @@ impl DataFrameImpl {
}
}
+#[async_trait]
+impl TableProvider
alamb edited a comment on issue #1693:
URL: https://github.com/apache/arrow-datafusion/issues/1693#issuecomment-1025116217
> Do you mean that you can get the no NULLS or all NULLS from the statistics
of the chunk or row group of parquet?
Yes.
The usecase is that in one of our
alamb commented on issue #1693:
URL: https://github.com/apache/arrow-datafusion/issues/1693#issuecomment-1025116217
> Do you mean that you can get the no NULLS or all NULLS from the statistics
of the chunk or row group of parquet?
Yes.
The usecase is that in one of our suppor
ursabot edited a comment on pull request #12208:
URL: https://github.com/apache/arrow/pull/12208#issuecomment-1024375630
Benchmark runs are scheduled for baseline =
07ec0a12d430dc9151678b6f00d5c6fc0598f034 and contender =
c5b757fe607b1e5824053da279a727e35e877e0a.
c5b757fe607b1e5824053da27
alamb commented on a change in pull request #1223:
URL: https://github.com/apache/arrow-rs/pull/1223#discussion_r795166839
##
File path: arrow/src/datatypes/datatype.rs
##
@@ -189,6 +194,12 @@ impl fmt::Display for DataType {
}
}
+/// The maximum precision for [DataType
codecov-commenter edited a comment on pull request #1223:
URL: https://github.com/apache/arrow-rs/pull/1223#issuecomment-1019476786
#
[Codecov](https://codecov.io/gh/apache/arrow-rs/pull/1223?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm
alamb opened a new pull request #1249:
URL: https://github.com/apache/arrow-rs/pull/1249
Builds on https://github.com/apache/arrow-rs/pull/1223 so draft until that
is done
Rationale:
https://github.com/apache/arrow-rs/pull/1223 introduces a more performant
and idiomatic API f
alamb commented on pull request #1223:
URL: https://github.com/apache/arrow-rs/pull/1223#issuecomment-1025135391
This PR is now ready for (re) review @liukun4515
cc @sweb as you contributed the initial `DecimalArray` implementation
(thanks again!)
You can see examples of how
alamb commented on a change in pull request #1249:
URL: https://github.com/apache/arrow-rs/pull/1249#discussion_r795177165
##
File path: arrow/src/compute/kernels/take.rs
##
@@ -496,27 +496,30 @@ where
IndexType: ArrowNumericType,
IndexType::Native: ToPrimitive,
{
-
alamb commented on a change in pull request #1223:
URL: https://github.com/apache/arrow-rs/pull/1223#discussion_r795177240
##
File path: arrow/src/array/builder.rs
##
@@ -1153,87 +1153,6 @@ pub struct FixedSizeBinaryBuilder {
builder: FixedSizeListBuilder,
}
-pub const
alamb commented on a change in pull request #1223:
URL: https://github.com/apache/arrow-rs/pull/1223#discussion_r795177390
##
File path: arrow/src/csv/reader.rs
##
@@ -900,15 +899,8 @@ fn parse_decimal_with_parameter(s: &str, precision: usize,
scale: usize) -> Resu
if
codecov-commenter commented on pull request #1249:
URL: https://github.com/apache/arrow-rs/pull/1249#issuecomment-1025137843
#
[Codecov](https://codecov.io/gh/apache/arrow-rs/pull/1249?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=T
codecov-commenter edited a comment on pull request #1247:
URL: https://github.com/apache/arrow-rs/pull/1247#issuecomment-1024943559
#
[Codecov](https://codecov.io/gh/apache/arrow-rs/pull/1247?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm
alamb merged pull request #1707:
URL: https://github.com/apache/arrow-datafusion/pull/1707
alamb merged pull request #1706:
URL: https://github.com/apache/arrow-datafusion/pull/1706
alamb closed issue #1573:
URL: https://github.com/apache/arrow-datafusion/issues/1573
alamb merged pull request #1695:
URL: https://github.com/apache/arrow-datafusion/pull/1695
alamb commented on a change in pull request #1685:
URL: https://github.com/apache/arrow-datafusion/pull/1685#discussion_r795181843
##
File path: datafusion/src/physical_plan/expressions/binary.rs
##
@@ -878,26 +946,64 @@ impl PhysicalExpr for BinaryExpr {
}
}
+/// The b
alamb closed issue #1610:
URL: https://github.com/apache/arrow-datafusion/issues/1610
alamb merged pull request #1685:
URL: https://github.com/apache/arrow-datafusion/pull/1685
thinkharderdev opened a new pull request #1709:
URL: https://github.com/apache/arrow-datafusion/pull/1709
# Which issue does this PR close?
Closes #1669
# Rationale for this change
Previously, we added the ability to merge parquet files on read when they
c
ursabot edited a comment on pull request #11956:
URL: https://github.com/apache/arrow/pull/11956#issuecomment-1024422478
Benchmark runs are scheduled for baseline =
c5b757fe607b1e5824053da279a727e35e877e0a and contender =
f92219d05e0255157f628baa445824a96ff94ada.
f92219d05e0255157f628baa4
okadakk commented on pull request #12269:
URL: https://github.com/apache/arrow/pull/12269#issuecomment-1025147669
I'm sorry I made the pull request all at once. I'll try to split them up next time.
Thank you for your quick review!
Dandandan commented on pull request #1707:
URL: https://github.com/apache/arrow-datafusion/pull/1707#issuecomment-1025171457
Nice cleanup!
Dandandan commented on pull request #1248:
URL: https://github.com/apache/arrow-rs/pull/1248#issuecomment-1025171943
This is really cool and promising! In my experience, the filter kernels can
be quite expensive in benchmarks. Great stuff 😃
matthewmturner commented on issue #1705:
URL: https://github.com/apache/arrow-datafusion/issues/1705#issuecomment-1025175512
@alamb great idea. Will do that.
ursabot edited a comment on pull request #12284:
URL: https://github.com/apache/arrow/pull/12284#issuecomment-1024732280
Benchmark runs are scheduled for baseline =
f92219d05e0255157f628baa445824a96ff94ada and contender =
ed3113b8bd286b8cf29b1d349fa9f3444706347c.
ed3113b8bd286b8cf29b1d349
Crystrix opened a new pull request #12298:
URL: https://github.com/apache/arrow/pull/12298
The product of an empty array or min_count == 0 returns an int64 scalar of
1; otherwise it returns a null int64 scalar.
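A minimal sketch of one reading of this rule, in plain Python rather than the Arrow compute API. The function name `product_with_min_count` and the `skip_nulls` handling are assumptions made for illustration; only the `min_count` edge case comes from the description above.

```python
from math import prod
from typing import Optional, Sequence


def product_with_min_count(
    values: Sequence[Optional[int]],
    min_count: int = 1,
    skip_nulls: bool = True,
) -> Optional[int]:
    """Hypothetical helper illustrating min_count semantics (not an Arrow API)."""
    # Keep only the non-null values.
    valid = [v for v in values if v is not None]
    # Assumption: with skip_nulls disabled, any null makes the result null.
    if not skip_nulls and len(valid) != len(values):
        return None
    # Too few valid values: the result is null.
    if len(valid) < min_count:
        return None
    # The product of zero values is the multiplicative identity, 1.
    return prod(valid)
```

With `min_count=0`, an empty input yields the identity 1; with the assumed default of `min_count=1`, the same input yields a null (`None` here).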
github-actions[bot] commented on pull request #12298:
URL: https://github.com/apache/arrow/pull/12298#issuecomment-1025186966
Thanks for opening a pull request!
If this is not a [minor
PR](https://github.com/apache/arrow/blob/master/CONTRIBUTING.md#Minor-Fixes).
Could you op
github-actions[bot] commented on pull request #12298:
URL: https://github.com/apache/arrow/pull/12298#issuecomment-1025187165
Crystrix opened a new pull request #12299:
URL: https://github.com/apache/arrow/pull/12299
- If min_count == 0 and skip_nulls == true `hash_sum` returns an int64
scalar of 0; otherwise it returns a null int64 scalar
- If min_count == 0 and skip_nulls == true `hash_product` returns an int6
github-actions[bot] commented on pull request #12299:
URL: https://github.com/apache/arrow/pull/12299#issuecomment-1025190119
https://issues.apache.org/jira/browse/ARROW-15506
cpcloud commented on a change in pull request #1699:
URL: https://github.com/apache/arrow-datafusion/pull/1699#discussion_r795218939
##
File path: datafusion/src/execution/dataframe_impl.rs
##
@@ -62,6 +68,60 @@ impl DataFrameImpl {
}
}
+#[async_trait]
+impl TableProvid
cpcloud commented on a change in pull request #1699:
URL: https://github.com/apache/arrow-datafusion/pull/1699#discussion_r795221151
##
File path: datafusion/src/execution/dataframe_impl.rs
##
@@ -62,6 +68,60 @@ impl DataFrameImpl {
}
}
+#[async_trait]
+impl TableProvid
houqp commented on issue #1705:
URL: https://github.com/apache/arrow-datafusion/issues/1705#issuecomment-1025199602
Both your idea and @alamb's suggestion of a builder pattern sound like a
good plan to me :+1: Thank you for bringing this up.
OscarTHZhang opened a new issue #1710:
URL: https://github.com/apache/arrow-datafusion/issues/1710
**Describe the bug**
If the column names in a CSV file are uppercase, then typing lowercase
column names in SQL queries will result in an Error: `Invalid identifier for
schema`.
**T
ursabot edited a comment on pull request #12292:
URL: https://github.com/apache/arrow/pull/12292#issuecomment-1024761980
Benchmark runs are scheduled for baseline =
ed3113b8bd286b8cf29b1d349fa9f3444706347c and contender =
fcab4814f658e3adf181f122d016c2b04a2667c6.
fcab4814f658e3adf181f122d
wjones127 opened a new pull request #1711:
URL: https://github.com/apache/arrow-datafusion/pull/1711
# Which issue does this PR close?
Closes #1635.
# Rationale for this change
The build was reported as broken, so we should test it.
In addition, datafusion-contri
cpcloud commented on a change in pull request #1699:
URL: https://github.com/apache/arrow-datafusion/pull/1699#discussion_r795250108
##
File path: datafusion/src/execution/dataframe_impl.rs
##
@@ -62,6 +68,60 @@ impl DataFrameImpl {
}
}
+#[async_trait]
+impl TableProvid
kou closed pull request #12297:
URL: https://github.com/apache/arrow/pull/12297
cpcloud opened a new pull request #1712:
URL: https://github.com/apache/arrow-datafusion/pull/1712
This is a draft PR following up a
[discussion](https://github.com/apache/arrow-datafusion/pull/1699#discussion_r794994473)
on #1699.
Issues:
1. The execution context state
ursabot commented on pull request #12297:
URL: https://github.com/apache/arrow/pull/12297#issuecomment-1025235804
Benchmark runs are scheduled for baseline =
cc4e2a54309813e636ba50bcd22a7b71d3d9 and contender =
690e22f8256d2d4fe548cdbdaf2d70362780fdff.
690e22f8256d2d4fe548cdbdaf2d7036
ursabot edited a comment on pull request #12288:
URL: https://github.com/apache/arrow/pull/12288#issuecomment-1024991556
Benchmark runs are scheduled for baseline =
fcab4814f658e3adf181f122d016c2b04a2667c6 and contender =
ff37b7adf21b319c0d08b2eb09ecbd8db0794cbe.
ff37b7adf21b319c0d08b2eb0
ursabot edited a comment on pull request #12297:
URL: https://github.com/apache/arrow/pull/12297#issuecomment-1025235804
Benchmark runs are scheduled for baseline =
cc4e2a54309813e636ba50bcd22a7b71d3d9 and contender =
690e22f8256d2d4fe548cdbdaf2d70362780fdff.
690e22f8256d2d4fe548cdbda
codecov-commenter edited a comment on pull request #1248:
URL: https://github.com/apache/arrow-rs/pull/1248#issuecomment-1025014491
#
[Codecov](https://codecov.io/gh/apache/arrow-rs/pull/1248?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm
tustvold commented on pull request #1248:
URL: https://github.com/apache/arrow-rs/pull/1248#issuecomment-1025251778
I found some time this afternoon, so bashed out porting the filter context
abstraction (caching the selection vector) and fixing up the null buffer
construction.
Here'
codecov-commenter edited a comment on pull request #1248:
URL: https://github.com/apache/arrow-rs/pull/1248#issuecomment-1025014491
#
[Codecov](https://codecov.io/gh/apache/arrow-rs/pull/1248?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm
ursabot edited a comment on pull request #12297:
URL: https://github.com/apache/arrow/pull/12297#issuecomment-1025235804
Benchmark runs are scheduled for baseline =
cc4e2a54309813e636ba50bcd22a7b71d3d9 and contender =
690e22f8256d2d4fe548cdbdaf2d70362780fdff.
690e22f8256d2d4fe548cdbda
wjones127 commented on pull request #1711:
URL: https://github.com/apache/arrow-datafusion/pull/1711#issuecomment-1025259032
cc @alamb @kszucs
HaoYang670 commented on issue #115:
URL: https://github.com/apache/arrow-datafusion/issues/115#issuecomment-1025260638
Could you please give more detail?
HaoYang670 commented on issue #373:
URL: https://github.com/apache/arrow-datafusion/issues/373#issuecomment-1025261367
I'd like to give it a try if no one is already working on it.
houqp commented on pull request #1712:
URL: https://github.com/apache/arrow-datafusion/pull/1712#issuecomment-1025262177
@cpcloud good point on lack of access to the trait_upcasting feature :( I
found an interesting workaround for this online based on
https://articles.bchlr.de/traits-dyna
HaoYang670 opened a new pull request #1250:
URL: https://github.com/apache/arrow-rs/pull/1250
Signed-off-by: remzi <[email protected]>
# Which issue does this PR close?
Closes #1202.
# Rationale for this change
Make it easier for users to under
codecov-commenter commented on pull request #1250:
URL: https://github.com/apache/arrow-rs/pull/1250#issuecomment-1025266523
#
[Codecov](https://codecov.io/gh/apache/arrow-rs/pull/1250?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=T
houqp commented on pull request #1712:
URL: https://github.com/apache/arrow-datafusion/pull/1712#issuecomment-1025266914
As for the first issue you mentioned, I think we need to avoid referencing
both `DataFrameImpl` and `ExecutionContext` within the default TableProvider
implementation b
houqp edited a comment on pull request #1712:
URL: https://github.com/apache/arrow-datafusion/pull/1712#issuecomment-1025266914
As for the first issue you mentioned, I think we need to avoid referencing
both `DataFrameImpl` and `ExecutionContext` within the default TableProvider
implement
ursabot edited a comment on pull request #12269:
URL: https://github.com/apache/arrow/pull/12269#issuecomment-1025001545
Benchmark runs are scheduled for baseline =
ff37b7adf21b319c0d08b2eb09ecbd8db0794cbe and contender =
cc4e2a54309813e636ba50bcd22a7b71d3d9.
cc4e2a54309813e636ba5
hntd187 commented on issue #1544:
URL: https://github.com/apache/arrow-datafusion/issues/1544#issuecomment-1025309474
Just an update on this: I'm starting to look into state management for
streaming now. Tell me, do we have any concept right now of accessing or
manipulating partitions or
ursabot edited a comment on pull request #12297:
URL: https://github.com/apache/arrow/pull/12297#issuecomment-1025235804
Benchmark runs are scheduled for baseline =
cc4e2a54309813e636ba50bcd22a7b71d3d9 and contender =
690e22f8256d2d4fe548cdbdaf2d70362780fdff.
690e22f8256d2d4fe548cdbda
HaoYang670 opened a new pull request #1713:
URL: https://github.com/apache/arrow-datafusion/pull/1713
Signed-off-by: remzi <[email protected]>
# Which issue does this PR close?
Closes #373.
# Rationale for this change
# What changes are included
houqp commented on issue #1544:
URL: https://github.com/apache/arrow-datafusion/issues/1544#issuecomment-1025411442
@hntd187 most of the file partition pruning and partition column population
logic lives in the ListingTable module as far as I know.
xudong963 opened a new pull request #1714:
URL: https://github.com/apache/arrow-datafusion/pull/1714
`select_to_plan` is the core of generating the logical plan; making it
clearer will be nice for newcomers.
BTW, this is my last PR in 2021 (Chinese year) 😄
Happy Spring Festival 🎆!
matthewmturner opened a new pull request #1715:
URL: https://github.com/apache/arrow-datafusion/pull/1715
# Which issue does this PR close?
Closes #1705
# Rationale for this change
# What changes are included in this PR?
# Are there any user-faci
houqp commented on pull request #1711:
URL: https://github.com/apache/arrow-datafusion/pull/1711#issuecomment-1025446533
@wjones127 looks like the newly added CI job is failing.
houqp commented on pull request #1665:
URL: https://github.com/apache/arrow-datafusion/pull/1665#issuecomment-1025454650
Nice work @Ted-Jiang :+1: @Jimexist @alamb you want to take a final look?