Chen-Yuan-Lai commented on PR #15946:
URL: https://github.com/apache/datafusion/pull/15946#issuecomment-2871094058
### Summary of this change
1. **Enhancement of `DataFusionError` Enum Variants**
* Updated the `ExecutionJoin` and `External` variants to maintain
consistent backt
pranavJibhakate opened a new pull request, #87:
URL: https://github.com/apache/datafusion-ray/pull/87
Fix for the issue #72
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
wForget opened a new pull request, #1732:
URL: https://github.com/apache/datafusion-comet/pull/1732
## Which issue does this PR close?
Closes #.
## Rationale for this change
## What changes are included in this PR?
## How are these changes t
gabotechs commented on PR #15857:
URL: https://github.com/apache/datafusion/pull/15857#issuecomment-2870958657
Here's the new PR:
https://github.com/apache/datafusion/pull/16025
gabotechs opened a new pull request, #16025:
URL: https://github.com/apache/datafusion/pull/16025
## Which issue does this PR close?
- Closes #13987.
## Rationale for this change
Reuse the work done on https://github.com/apache/datafusion/pull/15667 for
a
wForget opened a new pull request, #1731:
URL: https://github.com/apache/datafusion-comet/pull/1731
## Which issue does this PR close?
Closes #.
## Rationale for this change
The memory acquired by `CometMemoryPool.grow` may be less than the actual
request, so `M
gabotechs closed pull request #15857: feat(datafusion-functions-aggregate): add
support for lists and other nested types in min and max
URL: https://github.com/apache/datafusion/pull/15857
gabotechs commented on PR #15857:
URL: https://github.com/apache/datafusion/pull/15857#issuecomment-2870925010
After reviewing https://github.com/apache/datafusion/pull/15667, I think it
makes more sense to reuse the work done there rather than what this PR
proposes. Thanks @alamb for point
joroKr21 commented on issue #11591:
URL: https://github.com/apache/datafusion/issues/11591#issuecomment-2870882363
This method would be unusable because the callback-based APIs of `NullState`
borrow it as mutable
joroKr21 closed issue #11591: Add NullState::is_null public method
URL: https://github.com/apache/datafusion/issues/11591
wForget commented on code in PR #1727:
URL: https://github.com/apache/datafusion-comet/pull/1727#discussion_r2083780711
##
spark/src/main/java/org/apache/spark/CometTaskMemoryManager.java:
##
@@ -30,36 +34,41 @@
* memory manager. This assumes Spark's off-heap memory mode is en
brayanjuls commented on code in PR #16019:
URL: https://github.com/apache/datafusion/pull/16019#discussion_r2083769931
##
datafusion/sql/src/statement.rs:
##
@@ -710,6 +710,17 @@ impl SqlToRel<'_, S> {
*statement,
&mut planner_context,
kosiew commented on code in PR #16024:
URL: https://github.com/apache/datafusion/pull/16024#discussion_r2083735160
##
datafusion/ffi/src/tests/async_provider.rs:
##
@@ -270,7 +270,7 @@ impl Stream for AsyncTestRecordBatchStream {
None => std::task::Poll::Ready(N
Lordworms opened a new pull request, #16024:
URL: https://github.com/apache/datafusion/pull/16024
## Which issue does this PR close?
- Closes #16021
## Rationale for this change
## What changes are included in this PR?
## Are these changes
Lordworms commented on issue #14958:
URL: https://github.com/apache/datafusion/issues/14958#issuecomment-2870484350
waiting for https://github.com/apache/datafusion/pull/16015 to be merged
qstommyshu commented on code in PR #15743:
URL: https://github.com/apache/datafusion/pull/15743#discussion_r2083691379
##
datafusion/sql/tests/sql_integration.rs:
##
@@ -4607,82 +4607,58 @@ fn test_prepare_statement_to_plan_params_as_constants()
{
}
#[test]
-fn test_infer_t
qstommyshu commented on code in PR #16019:
URL: https://github.com/apache/datafusion/pull/16019#discussion_r2083691801
##
datafusion/sql/src/statement.rs:
##
@@ -710,6 +710,17 @@ impl SqlToRel<'_, S> {
*statement,
&mut planner_context,
Lordworms commented on issue #16021:
URL: https://github.com/apache/datafusion/issues/16021#issuecomment-2870436642
take
adriangb commented on PR #16014:
URL: https://github.com/apache/datafusion/pull/16014#issuecomment-2870271047
@alamb please review again; I implemented it and added a test 😄
lovasoa opened a new pull request, #1848:
URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1848
- **optimize string escaping by writing chunks instead of individual chars**
- **add comments**
atahanyorganci opened a new pull request, #16023:
URL: https://github.com/apache/datafusion/pull/16023
(no comment)
juju4 commented on issue #15872:
URL: https://github.com/apache/datafusion/issues/15872#issuecomment-2870214294
it does or need subquery to evaluate
```
> select * from 'test-datafusion.csv' where regexp_match(b, '-O') is not null;
+---+--+
| a | b|
+--
brayanjuls commented on code in PR #15743:
URL: https://github.com/apache/datafusion/pull/15743#discussion_r2083604044
##
datafusion/sql/tests/sql_integration.rs:
##
@@ -4607,82 +4607,58 @@ fn test_prepare_statement_to_plan_params_as_constants()
{
}
#[test]
-fn test_infer_t
alamb commented on code in PR #16014:
URL: https://github.com/apache/datafusion/pull/16014#discussion_r2083599017
##
datafusion/datasource/src/file_stream.rs:
##
@@ -367,7 +368,7 @@ impl Default for OnError {
pub trait FileOpener: Unpin + Send + Sync {
/// Asynchronously o
alamb commented on PR #14837:
URL: https://github.com/apache/datafusion/pull/14837#issuecomment-2870055758
> > > > How would that work going from sync -> async? For example: `1 = 2 OR
1 = call_llm_model_async()`. I imagine this would build something like
`BinaryExpr(BinaryExpr(1, Eq, 2), Or
Rachelint commented on issue #15633:
URL: https://github.com/apache/datafusion/issues/15633#issuecomment-2870037513
I found that Discord is blocked on my current Mac (a work Mac belonging to my
company). I will switch to my personal Mac and start communicating there later
today.
piki commented on issue #1846:
URL: https://github.com/apache/datafusion-sqlparser-rs/issues/1846#issuecomment-2869951944
It looks like #1120 is where this changed. `BigQueryDialect` got the
ability to parse `DELETE` statements without the `FROM` keyword.
`GenericDialect` got treated the
adriangb commented on code in PR #16014:
URL: https://github.com/apache/datafusion/pull/16014#discussion_r2083553168
##
datafusion/datasource/src/file_stream.rs:
##
@@ -367,7 +368,7 @@ impl Default for OnError {
pub trait FileOpener: Unpin + Send + Sync {
/// Asynchronousl
berkaysynnada commented on PR #14837:
URL: https://github.com/apache/datafusion/pull/14837#issuecomment-2869915327
> > > How would that work going from sync -> async? For example: `1 = 2 OR 1
= call_llm_model_async()`. I imagine this would build something like
`BinaryExpr(BinaryExpr(1, Eq,
berkaysynnada commented on PR #16007:
URL: https://github.com/apache/datafusion/pull/16007#issuecomment-2869904980
https://github.com/apache/datafusion/pull/16005#issuecomment-2869904849
berkaysynnada commented on PR #16005:
URL: https://github.com/apache/datafusion/pull/16005#issuecomment-2869904849
Hi @comphead, you're right, but I think we can make an exception here for
the greater good. The reasoning is well summarized here:
https://github.com/apache/datafusion/issues/1
berkaysynnada commented on PR #16006:
URL: https://github.com/apache/datafusion/pull/16006#issuecomment-2869904936
https://github.com/apache/datafusion/pull/16005#issuecomment-2869904849
berkaysynnada commented on issue #16001:
URL: https://github.com/apache/datafusion/issues/16001#issuecomment-2869898307
I think there shouldn't be a SortExec there at all
Rachelint commented on issue #15633:
URL: https://github.com/apache/datafusion/issues/15633#issuecomment-2869898423
> Thanks for the details [@Rachelint](https://github.com/Rachelint). I see
you made significant progress here, and what you provide and the roadmap
sound logical. Of course
irenjj commented on PR #15993:
URL: https://github.com/apache/datafusion/pull/15993#issuecomment-2869891230
> @irenjj I've set the default value of this config to "false", and
auto-complete the slt tests. This is one of the changes:
>
> https://private-user-images.githubusercontent.co
berkaysynnada commented on issue #15633:
URL: https://github.com/apache/datafusion/issues/15633#issuecomment-2869894299
Thanks for the details @Rachelint. I see you made significant progress
here, and what you provide and the roadmap sound logical. Of course I'm happy
to let you take it t
irenjj closed pull request #15993: Add configuration for eliminating sort in
subquery
URL: https://github.com/apache/datafusion/pull/15993
berkaysynnada commented on PR #15993:
URL: https://github.com/apache/datafusion/pull/15993#issuecomment-2869869805
@irenjj I've set the default value of this config to "false", and
auto-complete the slt tests. This is one of the changes:
https://github.com/user-attachments/assets/a277
berkaysynnada commented on code in PR #16014:
URL: https://github.com/apache/datafusion/pull/16014#discussion_r2083522199
##
datafusion/datasource/src/file_stream.rs:
##
@@ -367,7 +368,7 @@ impl Default for OnError {
pub trait FileOpener: Unpin + Send + Sync {
/// Asynchro
berkaysynnada commented on code in PR #15770:
URL: https://github.com/apache/datafusion/pull/15770#discussion_r2083518706
##
datafusion/sqllogictest/test_files/prepare.slt:
##
@@ -264,16 +264,19 @@ WHERE run_id = 'foo'
ORDER BY random()
LIMIT $1
-query I
+query error
EXECUT
berkaysynnada commented on code in PR #15770:
URL: https://github.com/apache/datafusion/pull/15770#discussion_r2083517075
##
datafusion/physical-optimizer/src/optimizer.rs:
##
@@ -126,6 +119,13 @@ impl PhysicalOptimizer {
// into an `order by max(x) limit y`. In thi
adriangb commented on PR #15770:
URL: https://github.com/apache/datafusion/pull/15770#issuecomment-2869838330
> The rule issue is not very trivial because we cannot just track and
eliminate some hardcoded patterns, since we also need to be aware of upper
parts of the plan, and new patterns
adriangb commented on PR #15770:
URL: https://github.com/apache/datafusion/pull/15770#issuecomment-2869838533
@berkaysynnada you might need to resolve the conflicts for CI to run
berkaysynnada commented on code in PR #15770:
URL: https://github.com/apache/datafusion/pull/15770#discussion_r2083516712
##
datafusion/sqllogictest/test_files/parquet_filter_pushdown.slt:
##
@@ -229,7 +232,9 @@ EXPLAIN select * from t_pushdown where val != 'c';
logical_plan
0
berkaysynnada commented on code in PR #15770:
URL: https://github.com/apache/datafusion/pull/15770#discussion_r2083516280
##
datafusion/sqllogictest/test_files/prepare.slt:
##
@@ -264,16 +264,19 @@ WHERE run_id = 'foo'
ORDER BY random()
LIMIT $1
-query I
+query error
EXECUT
adriangb commented on code in PR #15770:
URL: https://github.com/apache/datafusion/pull/15770#discussion_r2083516211
##
datafusion/physical-plan/src/topk/mod.rs:
##
@@ -570,6 +680,47 @@ impl TopKHeap {
+ self.store.size()
+ self.owned_bytes
}
+
+
adriangb commented on code in PR #15770:
URL: https://github.com/apache/datafusion/pull/15770#discussion_r2083515926
##
datafusion/core/tests/fuzz_cases/topk_filter_pushdown.rs:
##
@@ -0,0 +1,354 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contr
berkaysynnada commented on code in PR #15770:
URL: https://github.com/apache/datafusion/pull/15770#discussion_r2083513975
##
datafusion/sqllogictest/test_files/parquet_filter_pushdown.slt:
##
@@ -86,7 +86,8 @@ logical_plan
physical_plan
01)SortPreservingMergeExec: [a@0 ASC NUL
berkaysynnada commented on PR #15770:
URL: https://github.com/apache/datafusion/pull/15770#issuecomment-2869830929
The PR looks really nice BTW, and every line is clearly understandable. Of
course there are some possible optimizations (as you've also noticed some of
them like the todo durin
berkaysynnada commented on code in PR #15770:
URL: https://github.com/apache/datafusion/pull/15770#discussion_r2083506643
##
datafusion/physical-plan/src/topk/mod.rs:
##
@@ -570,6 +680,47 @@ impl TopKHeap {
+ self.store.size()
+ self.owned_bytes
}
alamb commented on PR #16012:
URL: https://github.com/apache/datafusion/pull/16012#issuecomment-2869817974
🤖: Benchmark completed
Details
```
Comparing HEAD and alamb_arrow_55.1_update
Benchmark clickbench_extended.json
--
alamb commented on PR #16012:
URL: https://github.com/apache/datafusion/pull/16012#issuecomment-2869787134
🤖 `./gh_compare_branch.sh` [Benchmark
Script](https://github.com/alamb/datafusion-benchmarking/blob/main/gh_compare_branch.sh)
Running
Linux aal-dev 6.11.0-1013-gcp #13~24.04.1-Ubun
alamb commented on PR #16014:
URL: https://github.com/apache/datafusion/pull/16014#issuecomment-2869790733
> Not sure how to construct the empty stream.
You can use something like
https://docs.rs/futures/latest/futures/stream/fn.iter.html perhaps -- like
`futures::stream::iter(v
alamb commented on PR #16014:
URL: https://github.com/apache/datafusion/pull/16014#issuecomment-2869788980
> It might be nice to implement pruning for Vec where each
statistic represents an arbitrary container (e.g. partition or file).
Yes this would be super nice -- the more we can d
alamb commented on PR #16012:
URL: https://github.com/apache/datafusion/pull/16012#issuecomment-2869783033
🤖: Benchmark completed
Details
```
Comparing HEAD and alamb_arrow_55.1_update
Benchmark clickbench_extended.json
--
adriangb commented on PR #14837:
URL: https://github.com/apache/datafusion/pull/14837#issuecomment-2869780441
> > How would that work going from sync -> async? For example: `1 = 2 OR 1 =
call_llm_model_async()`. I imagine this would build something like
`BinaryExpr(BinaryExpr(1, Eq, 2), Or,
alamb commented on PR #16012:
URL: https://github.com/apache/datafusion/pull/16012#issuecomment-2869734535
🤖 `./gh_compare_branch.sh` [Benchmark
Script](https://github.com/alamb/datafusion-benchmarking/blob/main/gh_compare_branch.sh)
Running
Linux aal-dev 6.11.0-1013-gcp #13~24.04.1-Ubun
berkaysynnada commented on code in PR #14837:
URL: https://github.com/apache/datafusion/pull/14837#discussion_r2083486922
##
datafusion/core/src/physical_planner.rs:
##
@@ -775,12 +776,44 @@ impl DefaultPhysicalPlanner {
let runtime_expr =
berkaysynnada commented on PR #14837:
URL: https://github.com/apache/datafusion/pull/14837#issuecomment-2869684912
> How would that work going from sync -> async? For example: `1 = 2 OR 1 =
call_llm_model_async()`. I imagine this would build something like
`BinaryExpr(BinaryExpr(1, Eq, 2),
alamb closed issue #15943: Weekly Plan: Andrew Lamb 2025-05-05
URL: https://github.com/apache/datafusion/issues/15943
alamb opened a new issue, #16022:
URL: https://github.com/apache/datafusion/issues/16022
This is my plan this week for reviews, etc. I am putting it here to make it
visible and keep myself organized
- [ ] Arrow Filter Performance: Complete ClickBench benchmark:
https://github.com/apa
alamb commented on issue #15943:
URL: https://github.com/apache/datafusion/issues/15943#issuecomment-2869680894
Next week:
- https://github.com/apache/datafusion/issues/16022
alamb commented on code in PR #14837:
URL: https://github.com/apache/datafusion/pull/14837#discussion_r2083479793
##
datafusion/core/src/physical_planner.rs:
##
@@ -775,12 +776,44 @@ impl DefaultPhysicalPlanner {
let runtime_expr =
self.cr
alamb commented on PR #15610:
URL: https://github.com/apache/datafusion/pull/15610#issuecomment-2869674992
> I've tried to use this branch to sort data larger than memory. For a 24GB
Parquet file, it produces the error `Error: ArrowError(IoError("No space left on
device (os error 28)", Os { code:
LogicFan commented on PR #15610:
URL: https://github.com/apache/datafusion/pull/15610#issuecomment-2869669114
I've tried to use this branch to sort data larger than memory. For a 24GB
Parquet file, it produces the error `Error: ArrowError(IoError("No space left on
device (os error 28)", Os { code:
Max-Meldrum commented on PR #15612:
URL: https://github.com/apache/datafusion/pull/15612#issuecomment-2869668640
I believe this PR is now ready to be merged @alamb
UBarney commented on PR #15954:
URL: https://github.com/apache/datafusion/pull/15954#issuecomment-2869638528
@xudong963 Thanks for reviewing. All comments have been addressed, PTAL
UBarney commented on code in PR #15876:
URL: https://github.com/apache/datafusion/pull/15876#discussion_r2083447146
##
datafusion/expr/src/logical_plan/builder.rs:
##
@@ -797,26 +807,146 @@ impl LogicalPlanBuilder {
}
// remove pushed down sort columns
-
70 matches