Re: [PR] Blog post on query cancellation [datafusion-site]

2025-07-05 Thread via GitHub


kevinjqliu commented on PR #75:
URL: https://github.com/apache/datafusion-site/pull/75#issuecomment-3039741826

   Just read the post. Coming back here to thank everyone for the contribution! 
I always learn so much from these. 
   
   I think it would be great to have a comment section for the blog posts. I 
opened https://github.com/apache/datafusion-site/issues/80 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] Blog post on query cancellation [datafusion-site]

2025-06-30 Thread via GitHub


comphead commented on PR #75:
URL: https://github.com/apache/datafusion-site/pull/75#issuecomment-3020819996

   > The blog is live! 
https://datafusion.apache.org/blog/2025/06/30/cancellation/
   
   Published on the DataFusion LinkedIn page





Re: [PR] Blog post on query cancellation [datafusion-site]

2025-06-30 Thread via GitHub


alamb commented on PR #75:
URL: https://github.com/apache/datafusion-site/pull/75#issuecomment-3018782433

   The blog is live! 
   https://datafusion.apache.org/blog/2025/06/30/cancellation/





Re: [PR] Blog post on query cancellation [datafusion-site]

2025-06-30 Thread via GitHub


alamb commented on code in PR #75:
URL: https://github.com/apache/datafusion-site/pull/75#discussion_r2174839499


##
content/blog/2025-06-30-cancellation.md:
##
@@ -0,0 +1,490 @@
+---
+layout: post
+title: Using Rust async for Execution and Cancelling Long-Running Queries

Review Comment:
   ```suggestion
   title: Using Rust async for Query Execution and Cancelling Long-Running 
Queries
   ```






Re: [PR] Blog post on query cancellation [datafusion-site]

2025-06-30 Thread via GitHub


alamb merged PR #75:
URL: https://github.com/apache/datafusion-site/pull/75





Re: [PR] Blog post on query cancellation [datafusion-site]

2025-06-25 Thread via GitHub


pepijnve commented on PR #75:
URL: https://github.com/apache/datafusion-site/pull/75#issuecomment-3003597592

   > https://without.boats/blog/asynchronous-clean-up/
   
   Wait till you read their work on Pin/Unpin.





Re: [PR] Blog post on query cancellation [datafusion-site]

2025-06-25 Thread via GitHub


pepijnve commented on PR #75:
URL: https://github.com/apache/datafusion-site/pull/75#issuecomment-3005104426

   @alamb I've done some work to make the diagrams a bit prettier using 
Excalidraw. Massaged the text a bit further. Tried to improve the flow from one 
section to the next.





Re: [PR] Blog post on query cancellation [datafusion-site]

2025-06-25 Thread via GitHub


pepijnve commented on PR #75:
URL: https://github.com/apache/datafusion-site/pull/75#issuecomment-3003623249

   > In my opinion this article does a pretty good job explaining the issues 
with cancellation, but it doesn't talk about `async` destructors which I agree 
are probably best left out of scope for this article
   
   If we do link to it I would link directly to 
https://without.boats/blog/asynchronous-clean-up/#cancellation rather than the 
top of the article. The idea of a 'spectrum of cooperativeness' when it comes 
to cancellation is a very elegant way of describing the problem. DataFusion 
tasks were a mix of implicitly cooperative (the Receiver users) and 
non-cooperative. The work that was done tries to move all tasks towards the 
cooperative end of the spectrum.
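
To make the "cooperative end of the spectrum" concrete, here is a minimal standard-library sketch. It is not DataFusion's or Tokio's implementation: `BudgetedSum`, `NoopWake`, and `poll_at_most` are hypothetical names invented for this illustration, and the per-poll `budget` only loosely imitates the idea of a task budget. A future that returns `Pending` regularly gives its caller a chance to stop polling and drop it, which is what cancellation amounts to.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical CPU-bound future: sums the integers in `0..end`,
/// doing at most `budget` additions per call to `poll`.
struct BudgetedSum {
    next: u64,
    end: u64,
    acc: u64,
    budget: u64,
}

impl BudgetedSum {
    fn new(end: u64, budget: u64) -> Self {
        BudgetedSum { next: 0, end, acc: 0, budget }
    }
}

impl Future for BudgetedSum {
    type Output = u64;

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<u64> {
        let this = &mut *self; // allowed because every field is `Unpin`
        let mut used = 0;
        while this.next < this.end {
            if used == this.budget {
                // Budget exhausted: ask to be polled again, then yield to the caller.
                cx.waker().wake_by_ref();
                return Poll::Pending;
            }
            this.acc += this.next;
            this.next += 1;
            used += 1;
        }
        Poll::Ready(this.acc)
    }
}

/// Waker that does nothing; enough for a caller that polls in a loop.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Polls `fut` at most `max_polls` times. If it is still pending after that,
/// the future is dropped, which is how a Rust future gets cancelled.
fn poll_at_most<F: Future>(fut: F, max_polls: usize) -> Option<F::Output> {
    let mut fut = Box::pin(fut);
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    for _ in 0..max_polls {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return Some(out);
        }
    }
    None // `fut` goes out of scope here and is dropped
}
```

With `BudgetedSum::new(1000, 100)` the caller regains control after every 100 additions, so `poll_at_most(..., 3)` gives up and returns `None`. With a budget of `u64::MAX` the first `poll` only returns once the whole sum is done, so the caller never gets a chance to cancel.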





Re: [PR] Blog post on query cancellation [datafusion-site]

2025-06-25 Thread via GitHub


pepijnve commented on PR #75:
URL: https://github.com/apache/datafusion-site/pull/75#issuecomment-3003611703

   > I was wondering whether it makes sense to explain why async is 
challenging to cancel at a low level, but that would probably be noisy. Just in 
case, this article sheds light on cancellation challenges, including in the 
Rust ecosystem: https://without.boats/blog/asynchronous-clean-up/
   
   Thanks for the pointer. I think this would be useful as a hyperlink in the 
intro where we're trying to say "cancelling async ain't that simple". Another 
useful blog post to link to might be 
https://cybernetist.com/2024/04/19/rust-tokio-task-cancellation-patterns/


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] Blog post on query cancellation [datafusion-site]

2025-06-24 Thread via GitHub


alamb commented on PR #75:
URL: https://github.com/apache/datafusion-site/pull/75#issuecomment-3001676393

   In my opinion this article does a pretty good job explaining the issues with 
cancellation, but it doesn't talk about `async` destructors which I agree are 
probably best left out of scope for this article





Re: [PR] Blog post on query cancellation [datafusion-site]

2025-06-24 Thread via GitHub


alamb commented on PR #75:
URL: https://github.com/apache/datafusion-site/pull/75#issuecomment-3001672876

   > I was wondering whether it makes sense to explain why async is 
challenging to cancel at a low level, but that would probably be noisy. Just in 
case, this article sheds light on cancellation challenges, including in the 
Rust ecosystem: https://without.boats/blog/asynchronous-clean-up/
   
   Man that post somewhat blew my mind (I only understood about 50% of it).





Re: [PR] Blog post on query cancellation [datafusion-site]

2025-06-24 Thread via GitHub


comphead commented on PR #75:
URL: https://github.com/apache/datafusion-site/pull/75#issuecomment-3001443602

   I was wondering whether it makes sense to explain why async is challenging 
to cancel at a low level, but that would probably be noisy. Just in case, this 
article sheds light on cancellation challenges, including in the Rust ecosystem: 
https://without.boats/blog/asynchronous-clean-up/





Re: [PR] Blog post on query cancellation [datafusion-site]

2025-06-24 Thread via GitHub


alamb commented on PR #75:
URL: https://github.com/apache/datafusion-site/pull/75#issuecomment-3000831078

   I pushed the images to this post, updated the publish date to June 30 (next 
Monday), and started doing some wordsmithing
   
   ![Screenshot 2025-06-24 at 10 52 01 
AM](https://github.com/user-attachments/assets/bd4d42ac-d3ef-493c-a606-3d17a524cc98)
   
   I'll finish up my polishing today and then maybe try and highlight this 
draft in case others want a chance to review it. It is really neat. Thanks 
again @pepijnve 





Re: [PR] Blog post on query cancellation [datafusion-site]

2025-06-23 Thread via GitHub


alamb commented on PR #75:
URL: https://github.com/apache/datafusion-site/pull/75#issuecomment-2997787662

   > I was not planning on changing it substantially anymore. I was thinking of 
maybe rereading the text with a fresh pair of eyes and editing a sentence here 
or there, but that's it. Need to switch gears from open source contribution 
mode to building our own product for a bit.
   
   Makes sense -- thank you -- I'll do the same (with fresh eyes) and polish it 
up and get it ready to merge. Thank you





Re: [PR] Blog post on query cancellation [datafusion-site]

2025-06-23 Thread via GitHub


pepijnve commented on PR #75:
URL: https://github.com/apache/datafusion-site/pull/75#issuecomment-2996933820

   I was not planning on changing it substantially anymore. I was thinking of 
maybe rereading the text with a fresh pair of eyes and editing a sentence here 
or there, but that's it. Need to switch gears from open source contribution 
mode to building our own product for a bit.





Re: [PR] Blog post on query cancellation [datafusion-site]

2025-06-23 Thread via GitHub


alamb commented on PR #75:
URL: https://github.com/apache/datafusion-site/pull/75#issuecomment-2996826099

   Hi @pepijnve  -- I wonder if you plan to work on this post in the near term? 
If not I will try and find time to add some diagrams / etc to help it get ready 
to publish (as I think it is important content)
   
   Thanks again for the work so far





Re: [PR] Blog post on query cancellation [datafusion-site]

2025-06-20 Thread via GitHub


alamb commented on PR #75:
URL: https://github.com/apache/datafusion-site/pull/75#issuecomment-2991169813

   @pepijnve I pushed "Acknowledgement" and "About DataFusion" sections
   
   ![Screenshot 2025-06-20 at 7 41 54 
AM](https://github.com/user-attachments/assets/f9a880b2-cc56-4bf8-ae12-aca084b0e4c2)
   





Re: [PR] Blog post on query cancellation [datafusion-site]

2025-06-20 Thread via GitHub


alamb commented on PR #75:
URL: https://github.com/apache/datafusion-site/pull/75#issuecomment-2991145431

   > One other thing I'm curious about. This write-up discusses the change in 
terms of enabling long-running tasks to be cancelled, but would making 
CPU-intensive exec blocks more cooperative also help alleviate blocking IO on 
the main runtime if users don't set up a separate runtime ala 
[apache/datafusion#16331](https://github.com/apache/datafusion/pull/16331)? 
That could be a really nice benefit besides cancelability of this, if so. 
@alamb?
   
   @djanderson it may help, but I think even if the CPU-intensive tasks yielded 
more frequently we would still need a separate runtime to avoid increasing I/O 
latency





Re: [PR] Blog post on query cancellation [datafusion-site]

2025-06-17 Thread via GitHub


alamb commented on code in PR #75:
URL: https://github.com/apache/datafusion-site/pull/75#discussion_r2153183826


##
content/blog/2025-06-15-cancellation.md:
##
@@ -0,0 +1,353 @@
+---

Review Comment:
   Yes of course -- we have done this on other blogs as well 
   
   Something like "Thank you to Datadobi for supporting this work" would be 
very much following the same pattern






Re: [PR] Blog post on query cancellation [datafusion-site]

2025-06-17 Thread via GitHub


pepijnve commented on code in PR #75:
URL: https://github.com/apache/datafusion-site/pull/75#discussion_r2152981299


##
content/blog/2025-06-15-cancellation.md:
##
@@ -0,0 +1,353 @@
+---

Review Comment:
   Since Datadobi is paying for my time while I work on this stuff, would it be 
possible/allowed to add a small note mentioning that?






Re: [PR] Blog post on query cancellation [datafusion-site]

2025-06-17 Thread via GitHub


pepijnve commented on code in PR #75:
URL: https://github.com/apache/datafusion-site/pull/75#discussion_r2152978616


##
content/blog/2025-06-15-cancellation.md:
##
@@ -0,0 +1,353 @@
+---
+layout: post
+title: Query Cancellation
+date: 2025-06-27
+author: Pepijn Van Eeckhoudt
+categories: [features]
+---
+
+
+
+## The Challenge of Cancelling Long-Running Queries
+
+Have you ever tried to cancel a query that just wouldn't stop?
+In this post, we'll take an in-depth look at why that can happen in DataFusion 
and what the community did to resolve the problem.
+
+### Understanding Rust's Async Model
+
+To really understand the cancellation problem you need to be somewhat familiar 
with Rust's asynchronous programming model.
+This is a bit different than what you might be used to from other ecosystems.
+Let's go over the basics again as a refresher.
+If you're familiar with the ins and outs of `Future` and `async` you can skip 
this section.
+
+#### Futures Are Inert
+
+Rust's asynchronous programming model is built around the `Future` trait.
+In contrast to, for instance, JavaScript's `Promise` or Java's `Future`, a Rust 
`Future` does not necessarily represent an actively running asynchronous job.
+Instead, a `Future` represents a lazy calculation that only makes progress 
when explicitly polled.
+If nothing tells a `Future` to try and make progress explicitly, it is [an 
inert 
object](https://doc.rust-lang.org/std/future/trait.Future.html#runtime-characteristics).
+
+You ask a `Future` to advance its calculation as much as possible by calling 
the `poll` method.
+The `Future` responds with either:
+- `Poll::Pending` if it needs to wait for something (like I/O) before it can 
continue
+- `Poll::Ready` when it has completed and produced a value
+
+When a `Future` returns `Pending`, it saves its internal state so it can pick 
up where it left off the next time you poll it.
+This state management is what makes Rust's `Future`s memory-efficient and 
composable.
+It also needs to set up the necessary signaling so that the caller gets 
notified when it should try to call `poll` again.
+This avoids having to call `poll` in a busy-waiting loop.
+
+Rust's `async` keyword provides syntactic sugar over this model.
+When you write an `async` function or block, the compiler transforms it into 
an implementation of the `Future` trait for you.
+Since all the state management is compiler generated and hidden from sight, 
async code tends to be more readable while maintaining the same underlying 
mechanics.
+
+The `await` keyword complements this by letting you pause execution until a 
`Future` completes.
+When you `.await` a `Future`, you're essentially telling the compiler to poll 
that `Future` until it's ready before program execution continues with the 
statement after the await.
+
+#### From Futures to Streams
+
+The `futures` crate extends the `Future` model with the `Stream` trait.
+Streams represent a sequence of values produced asynchronously rather than 
just a single value.
+The `Stream` trait has one method named `poll_next` that returns:
+- `Poll::Pending` when the next value isn't ready yet, just like a `Future` 
would
+- `Poll::Ready(Some(value))` when a new value is available
+- `Poll::Ready(None)` when the stream is exhausted
+
+### How DataFusion Executes Queries
+
+In DataFusion, queries are executed as follows:
+
+1. First the query is compiled into a tree of `ExecutionPlan` nodes
+2. `ExecutionPlan::execute` is called on the root of the tree. This method 
returns a `SendableRecordBatchStream` (a pinned `Box<dyn Stream<RecordBatch>>`)
+3. `Stream::poll_next` is called in a loop to get the results
+
+The stream we get in step 2 is actually the root of a tree of streams that 
mostly mirrors the execution plan tree.
+Each stream tree node processes the record batches it gets from its children.
+The leaves of the tree produce record batches themselves.
+
+Query execution progresses each time you call `poll_next` on the root stream.
+This call typically cascades down the tree, with each node calling `poll_next` 
on its children to get the data it needs to process.
+
+Here's where the first signs of problems start to show up: some operations 
(like aggregations, sorts, or certain join phases) need to process a lot of 
data before producing any output.
+When `poll_next` encounters one of these operations, it might need to perform 
a substantial amount of work before it can return a record batch.
+
+### Tokio and Cooperative Scheduling
+
+We need to make a small detour now via Tokio's scheduler before we can get to 
the query cancellation problem.
+DataFusion makes use of the [Tokio asynchronous runtime](https://tokio.rs), 
which uses a [cooperative scheduling 
model](https://docs.rs/tokio/latest/tokio/task/index.html#what-are-tasks).
+This is fundamentally different from preemptive scheduling that you might be 
used to:
+
+- In preemptive scheduling, the system can interrupt a task at any time to run 
something else
+

Re: [PR] Blog post on query cancellation [datafusion-site]

2025-06-17 Thread via GitHub


pepijnve commented on code in PR #75:
URL: https://github.com/apache/datafusion-site/pull/75#discussion_r2152976180


##
content/blog/2025-06-15-cancellation.md:
##
@@ -0,0 +1,353 @@
+The `await` keyword complements this by letting you pause execution until a 
`Future` completes.
+When you `.await` a `Future`, you're essentially telling the compiler to poll 
that `Future` until it's ready before program execution continues with the 
statement after the await.

Review Comment:
   `.await` is sort of equivalent to `ready!(future.poll)` along with a state 
transition on ready, right?
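
A rough sketch of that equivalence, using only the standard library, could look like the following. The names `YieldOnce`, `AddOne`, `add_one_async`, `NoopWake`, and `block_on` are made up for illustration, and the busy-polling `block_on` is only a stand-in for a real executor. The state machine the compiler generates for an `async` block is more involved than this, so read it as an approximation of the idea rather than what rustc emits.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{ready, Context, Poll, Wake, Waker};

/// Leaf future: pending on the first poll, ready with `value` on the second.
struct YieldOnce {
    polled: bool,
    value: u32,
}

impl Future for YieldOnce {
    type Output = u32;

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<u32> {
        if self.polled {
            Poll::Ready(self.value)
        } else {
            self.polled = true;
            // A real future would hand the waker to an I/O source or timer.
            cx.waker().wake_by_ref();
            Poll::Pending
        }
    }
}

/// The `async` version that `AddOne` below imitates by hand.
async fn add_one_async(value: u32) -> u32 {
    YieldOnce { polled: false, value }.await + 1
}

/// Hand-written state machine for `inner.await + 1`.
enum AddOne {
    Waiting(YieldOnce),
    Done,
}

impl Future for AddOne {
    type Output = u32;

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<u32> {
        match &mut *self {
            AddOne::Waiting(inner) => {
                // `ready!` returns `Poll::Pending` from this function if `inner` is not done.
                let v = ready!(Pin::new(inner).poll(cx));
                // State transition once the awaited future is ready.
                *self = AddOne::Done;
                Poll::Ready(v + 1)
            }
            AddOne::Done => panic!("polled after completion"),
        }
    }
}

/// Waker that does nothing; sufficient because `block_on` polls in a loop.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Toy executor: polls until ready and reports how many polls that took.
fn block_on<F: Future>(fut: F) -> (F::Output, usize) {
    let mut fut = Box::pin(fut);
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut polls = 0;
    loop {
        polls += 1;
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return (out, polls);
        }
    }
}
```

Both `block_on(add_one_async(41))` and `block_on(AddOne::Waiting(YieldOnce { polled: false, value: 41 }))` take two polls and produce 42, which is the sense in which `.await` behaves like `ready!` plus a state transition.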






Re: [PR] Blog post on query cancellation [datafusion-site]

2025-06-17 Thread via GitHub


djanderson commented on code in PR #75:
URL: https://github.com/apache/datafusion-site/pull/75#discussion_r2152791907


##
content/blog/2025-06-15-cancellation.md:
##
@@ -0,0 +1,328 @@
+# Query Cancellation
+
+## The Challenge of Cancelling Long-Running Queries
+
+Have you ever tried to cancel a query that just wouldn't stop?
+In this post, we'll take an in-depth look at why that can happen in DataFusion 
and what the community did to resolve the problem.
+
+### Understanding Rust's Async Model
+
+To really understand the cancellation problem you need to be somewhat familiar 
with Rust's asynchronous programming model.
+This is a bit different than what you might be used to from other ecosystems.
+Let's go over the basics again as a refresher.
+If you're familiar with the ins and outs of `Future` and `async` you can skip 
this section.
+
+#### Futures Are Inert
+
+Rust's asynchronous programming model is built around the `Future` trait.
+In contrast to, for instance, JavaScript's `Promise` or Java's `Future`, a Rust 
`Future` does not necessarily represent an actively running asynchronous job.
+Instead, a `Future` represents a lazy calculation that only makes progress 
when explicitly polled.
+If nothing tells a `Future` to try and make progress explicitly, it is an 
inert object.
+
+You ask a `Future` to advance its calculation as much as possible by calling 
the `poll` method.
+The future responds with either:
+- `Poll::Pending` if it needs to wait for something (like I/O) before it can 
continue
+- `Poll::Ready` when it has completed and produced a value
+
+When a future returns `Pending`, it saves its internal state so it can pick up 
where it left off the next time you poll it.
+This state management is what makes Rust's futures memory-efficient and 
composable.
+It also needs to set up the necessary signaling so that the caller gets 
notified when it should try to call `poll` again.
+This avoids having to call `poll` in a busy-waiting loop.
+
+Rust's `async` keyword provides syntactic sugar over this model.
+When you write an `async` function or block, the compiler transforms it into 
an implementation of the `Future` trait for you.
+Since all the state management is compiler generated and hidden from sight, 
async code tends to be more readable while maintaining the same underlying 
mechanics.
+
+The `await` keyword complements this by letting you pause execution until a 
future completes.
+When you write `.await` after a future, you're essentially telling the 
compiler to poll that future until it's ready before program execution 
continues with the statement after the await.
+
+#### From Futures to Streams
+
+The `futures` crate extends the `Future` model with the `Stream` trait.
+Streams represent a sequence of values produced asynchronously rather than 
just a single value.
+The `Stream` trait has one method named `poll_next` that returns:
+- `Poll::Pending` when the next value isn't ready yet, just like a `Future` 
would
+- `Poll::Ready(Some(value))` when a new value is available
+- `Poll::Ready(None)` when the stream is exhausted
+
+### How DataFusion Executes Queries
+
+In DataFusion, queries are executed as follows:
+
+1. First the query is compiled into a tree of `ExecutionPlan` nodes
+2. `ExecutionPlan::execute` is called on the root of the tree. This method 
returns a `SendableRecordBatchStream` (a pinned `Box<dyn Stream<RecordBatch>>`)
+3. `Stream::poll_next` is called in a loop to get the results
+
+The stream we get in step 2 is actually the root of a tree of streams that 
mostly mirrors the execution plan tree.
+Each stream tree node processes the record batches it gets from its children.
+The leaves of the tree produce record batches themselves.
+
+Query execution progresses each time you call `poll_next` on the root stream.
+This call typically cascades down the tree, with each node calling `poll_next` 
on its children to get the data it needs to process.
+
+Here's where the first signs of problems start to show up: some operations 
(like aggregations, sorts, or certain join phases) need to process a lot of 
data before producing any output.
+When `poll_next` encounters one of these operations, it might need to perform 
a substantial amount of work before it can return a record batch.
+
+### Tokio and Cooperative Scheduling
+
+We need to make a small detour now via Tokio's scheduler before we can get to 
the query cancellation problem.
+DataFusion makes use of the Tokio asynchronous runtime, which uses a 
cooperative scheduling model.
+This is fundamentally different from preemptive scheduling that you might be 
used to:
+
+- In preemptive scheduling, the system can interrupt a task at any time to run 
something else
+- In cooperative scheduling, tasks must voluntarily yield control back to the 
scheduler
+
+This distinction is crucial for understanding our cancellation problem.
+When a Tokio task is running, it can't be forcibly interrupted - it must 
cooperate by periodically yielding cont

Re: [PR] Blog post on query cancellation [datafusion-site]

2025-06-17 Thread via GitHub


alamb commented on code in PR #75:
URL: https://github.com/apache/datafusion-site/pull/75#discussion_r2152732436


##
content/blog/2025-06-15-cancellation.md:
##
@@ -0,0 +1,353 @@
+---

Review Comment:
   I took the liberty of pushing this "front matter" and ASF header to this 
post 
   
   I tried to [render the blog 
locally](https://github.com/apache/datafusion-site?tab=readme-ov-file#setup-for-docker)
 to preview but found it wasn't in the blog list without this
   
   ![Screenshot 2025-06-17 at 12 49 31 
PM](https://github.com/user-attachments/assets/a1bfbc5b-11fc-4ce4-83e4-b94a6acab0f0)
   






Re: [PR] Blog post on query cancellation [datafusion-site]

2025-06-17 Thread via GitHub


alamb commented on PR #75:
URL: https://github.com/apache/datafusion-site/pull/75#issuecomment-2981079329

   > A colleague of mine suggested adding some more diagrams. I know UML is out 
   > of fashion, but here's a sketch of what's actually happening when you imagine 
   > Tokio's budget is initialized to 1. Would something like this be a useful 
   > addition, or trying to cram too much information in a small space?
   
   I think it is hugely helpful (we just shouldn't call it UML to avoid getting 
painted with the "old folks brush" 😆 )





Re: [PR] Blog post on query cancellation [datafusion-site]

2025-06-17 Thread via GitHub


alamb commented on PR #75:
URL: https://github.com/apache/datafusion-site/pull/75#issuecomment-2981082556

   FYI @ozankabak and @zhuqi-lucas





Re: [PR] Blog post on query cancellation [datafusion-site]

2025-06-17 Thread via GitHub


pepijnve commented on PR #75:
URL: https://github.com/apache/datafusion-site/pull/75#issuecomment-2980346829

   A colleague of mine suggested adding some more diagrams. I know UML is out 
of fashion, but here's a sketch of what's actually happening when you imagine 
Tokio's budget is initialized to 1: 
   
   ![Sequence diagram of one Tokio 
tick](https://www.plantuml.com/plantuml/png/jPCnRuCm48LtViM9lKCoTgZI9LkbAibUkV3aeXWRdSzjy-_hX5o52KE6TW34ku_dnxEyYM9OKkygqqXWgW_Xs0NQ9IzTZvfCf8jIMGvfqF4Gd8ia9Xuf-0PrDSeFJtJ8sZP9Oj3Z1QicIfu_MykmHh0NXkb7vitZMtG59RhWoOKmr3hOTXo5EW6FmnQ3Wo3IUsejfAvctkVbNi1sOQchhDI-CNvyMqsfYPofiOVhVE3G0At-zznZ1zEUok_BC8eKGMxhKGo-rHRsQ89l9pLOIAJNJ7JU4Zx1fwyFCFwaZlLo7Ulxwb0FJLSwpE8ezCzxinHcTLUOvVrHC3_ErstfnPduvOjJeVbBjqs-fTxz)
   
   Is something like this useful, or trying to cram too much information in a 
small space?





Re: [PR] Blog post on query cancellation [datafusion-site]

2025-06-17 Thread via GitHub


pepijnve commented on PR #75:
URL: https://github.com/apache/datafusion-site/pull/75#issuecomment-2980078702

   > but would making CPU-intensive exec blocks more cooperative also help 
alleviate blocking IO on the main runtime if users don't set up a separate 
runtime
   
   It should make some more room for the IO indeed. Cancellation is just one 
facet of the more general problem of fairness in a cooperatively scheduled 
runtime. Yielding every now and then helps to ensure every concurrent task gets 
a fair share of time to run.





Re: [PR] Blog post on query cancellation [datafusion-site]

2025-06-17 Thread via GitHub


pepijnve commented on code in PR #75:
URL: https://github.com/apache/datafusion-site/pull/75#discussion_r2152065111


##
content/blog/2025-06-15-cancellation.md:
##
@@ -0,0 +1,328 @@
+# Query Cancellation
+
+## The Challenge of Cancelling Long-Running Queries
+
+Have you ever tried to cancel a query that just wouldn't stop?
+In this post, we'll take an in-depth look at why that can happen in DataFusion 
and what the community did to resolve the problem.
+
+### Understanding Rust's Async Model
+
+To really understand the cancellation problem you need to be somewhat familiar 
with Rust's asynchronous programming model.
+This is a bit different from what you might be used to in other ecosystems.
+Let's go over the basics again as a refresher.
+If you're familiar with the ins and outs of `Future` and `async` you can skip 
this section.
+
+#### Futures Are Inert
+
+Rust's asynchronous programming model is built around the `Future` trait.
+In contrast to, for instance, JavaScript's `Promise` or Java's `Future`, a 
Rust `Future` does not necessarily represent an actively running asynchronous job.
+Instead, a `Future` represents a lazy calculation that only makes progress 
when explicitly polled.
+If nothing tells a `Future` to try and make progress explicitly, it is an 
inert object.
+
+You ask a `Future` to advance its calculation as much as possible by calling 
the `poll` method.
+The future responds with either:
+- `Poll::Pending` if it needs to wait for something (like I/O) before it can 
continue
+- `Poll::Ready` when it has completed and produced a value
+
+When a future returns `Pending`, it saves its internal state so it can pick up 
where it left off the next time you poll it.
+This state management is what makes Rust's futures memory-efficient and 
composable.
+Before returning `Pending`, the future also needs to set up the necessary 
signaling (via the `Waker` in the `Context` passed to `poll`) so that the caller 
gets notified when it should try to call `poll` again.
+This avoids having to call `poll` in a busy-waiting loop.
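
To make the `poll` contract above concrete, here is a minimal sketch using only the standard library. `Countdown`, `CountingWaker`, and `drive` are hypothetical names invented for this illustration; they are not part of DataFusion or Tokio. The future returns `Pending` a fixed number of times, signalling the waker each time, before it completes.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical future that returns `Pending` `remaining` times, then completes.
struct Countdown {
    remaining: u32,
}

impl Future for Countdown {
    type Output = &'static str;

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        if self.remaining == 0 {
            return Poll::Ready("done");
        }
        // Save progress in our own state so the next poll resumes from here.
        self.remaining -= 1;
        // Set up the signaling: tell the caller we want to be polled again.
        cx.waker().wake_by_ref();
        Poll::Pending
    }
}

/// Hypothetical waker that only counts how often it was signalled.
struct CountingWaker {
    wakes: AtomicUsize,
}

impl Wake for CountingWaker {
    fn wake(self: Arc<Self>) {
        self.wakes.fetch_add(1, Ordering::SeqCst);
    }
}

/// Polls a `Countdown` to completion; returns (polls, wake signals, output).
/// A real executor would sleep until woken; this sketch simply polls again.
fn drive(remaining: u32) -> (usize, usize, &'static str) {
    let counter = Arc::new(CountingWaker { wakes: AtomicUsize::new(0) });
    let waker = Waker::from(counter.clone());
    let mut cx = Context::from_waker(&waker);
    // Creating the future does no work at all; it is inert until polled.
    let mut fut = Box::pin(Countdown { remaining });
    let mut polls = 0;
    loop {
        polls += 1;
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return (polls, counter.wakes.load(Ordering::SeqCst), out);
        }
    }
}
```

Calling `drive(3)` polls four times: three polls return `Pending` (each one signalling the waker) and the fourth returns `Ready`.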
+
+Rust's `async` keyword provides syntactic sugar over this model.
+When you write an `async` function or block, the compiler transforms it into 
an implementation of the `Future` trait for you.
+Since all the state management is compiler generated and hidden from sight, 
async code tends to be more readable while maintaining the same underlying 
mechanics.
+
+The `await` keyword complements this by letting you pause execution until a 
future completes.
+When you write `.await` after a future, you're essentially telling the 
compiler to poll that future until it's ready before program execution 
continues with the statement after the await.
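
As a rough illustration of `async` and `.await`, the sketch below shows that calling an `async fn` only builds a future, and that its body runs once something polls it. `add_one`, `pipeline`, `block_on`, `NoopWaker`, and `demo` are hypothetical helpers written for this example; `block_on` is a deliberately naive stand-in for a real executor such as Tokio.

```rust
use std::cell::Cell;
use std::future::Future;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical async function; it bumps `calls` each time its body runs.
async fn add_one(calls: &Cell<u32>, x: u32) -> u32 {
    calls.set(calls.get() + 1);
    x + 1
}

/// Each `.await` polls the inner future until it is ready, then continues.
async fn pipeline(calls: &Cell<u32>, x: u32) -> u32 {
    let y = add_one(calls, x).await;
    add_one(calls, y).await
}

/// Waker that does nothing; enough for futures that never really wait.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Naive executor (an assumption for this sketch): poll until `Ready`.
fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = Box::pin(fut);
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(value) = fut.as_mut().poll(&mut cx) {
            return value;
        }
    }
}

/// Returns (calls before polling, result, calls after polling).
fn demo() -> (u32, u32, u32) {
    let calls = Cell::new(0);
    // Building the future runs none of the async code yet.
    let fut = pipeline(&calls, 40);
    let before = calls.get();
    let result = block_on(fut);
    (before, result, calls.get())
}
```

In `demo`, the call counter is still zero after `pipeline(...)` has been called, which is the inert behaviour described in the previous section.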
+
+#### From Futures to Streams
+
+The `futures` crate extends the `Future` model with the `Stream` trait.
+Streams represent a sequence of values produced asynchronously rather than 
just a single value.
+The `Stream` trait has one required method named `poll_next` that returns:
+- `Poll::Pending` when the next value isn't ready yet, just like a `Future` 
would
+- `Poll::Ready(Some(value))` when a new value is available
+- `Poll::Ready(None)` when the stream is exhausted
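
The three outcomes listed above can be sketched without pulling in the `futures` crate. `MiniStream` below is a local, hypothetical trait that copies the shape of `Stream::poll_next`; it is not the real trait, and `UpTo`, `NoopWaker`, and `collect` are likewise made up for this example.

```rust
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Local stand-in with the same shape as the `poll_next` method of `Stream`.
trait MiniStream {
    type Item;
    fn poll_next(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Option<Self::Item>>;
}

/// Hypothetical stream yielding 1..=limit, pretending to wait before each value.
struct UpTo {
    next: u32,
    limit: u32,
    value_ready: bool,
}

impl MiniStream for UpTo {
    type Item = u32;

    fn poll_next(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Option<u32>> {
        if self.next > self.limit {
            return Poll::Ready(None); // the stream is exhausted
        }
        if !self.value_ready {
            self.value_ready = true;
            cx.waker().wake_by_ref(); // ask to be polled again
            return Poll::Pending; // the next value isn't ready yet
        }
        self.value_ready = false;
        let value = self.next;
        self.next += 1;
        Poll::Ready(Some(value)) // a new value is available
    }
}

struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Drains the stream; returns the items and how often `Pending` was seen.
fn collect(limit: u32) -> (Vec<u32>, usize) {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut stream = UpTo { next: 1, limit, value_ready: false };
    let (mut items, mut pendings) = (Vec::new(), 0);
    loop {
        match Pin::new(&mut stream).poll_next(&mut cx) {
            Poll::Ready(Some(v)) => items.push(v),
            Poll::Ready(None) => return (items, pendings),
            Poll::Pending => pendings += 1,
        }
    }
}
```

`collect(3)` sees one `Pending` before each of the three values and then a final `Ready(None)`.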
+
+### How DataFusion Executes Queries
+
+In DataFusion, queries are executed as follows:
+
+1. First the query is compiled into a tree of `ExecutionPlan` nodes
+2. `ExecutionPlan::execute` is called on the root of the tree. This method 
returns a `SendableRecordBatchStream` (a pinned `Box<dyn RecordBatchStream + Send>`)
+3. `Stream::poll_next` is called in a loop to get the results
+
+The stream we get in step 2 is actually the root of a tree of streams that 
mostly mirrors the execution plan tree.
+Each stream tree node processes the record batches it gets from its children.
+The leaves of the tree produce record batches themselves.
+
+Query execution progresses each time you call `poll_next` on the root stream.
+This call typically cascades down the tree, with each node calling `poll_next` 
on its children to get the data it needs to process.
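
The cascade described above can be sketched with a two-node stream tree. Everything here is a simplified assumption and not DataFusion's actual API: `Batch` stands in for a record batch, `BatchStream` is a local trait shaped like `poll_next`, and `Source`, `KeepEven`, and `run_query` are invented for the example.

```rust
use std::collections::VecDeque;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Stand-in for a record batch (an assumption for this sketch).
type Batch = Vec<i64>;

/// Local trait shaped like `poll_next`; not DataFusion's real trait.
trait BatchStream {
    fn poll_next(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Option<Batch>>;
}

/// Leaf node: produces batches itself.
struct Source {
    batches: VecDeque<Batch>,
}

impl BatchStream for Source {
    fn poll_next(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<Option<Batch>> {
        Poll::Ready(self.get_mut().batches.pop_front())
    }
}

/// Inner node: processes whatever batch its child hands it.
struct KeepEven {
    child: Pin<Box<dyn BatchStream>>,
}

impl BatchStream for KeepEven {
    fn poll_next(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Option<Batch>> {
        let this = self.get_mut();
        // Polling this node cascades into a poll of its child.
        match this.child.as_mut().poll_next(cx) {
            Poll::Ready(Some(batch)) => {
                Poll::Ready(Some(batch.into_iter().filter(|v| v % 2 == 0).collect()))
            }
            other => other, // forward `Pending` and end-of-stream unchanged
        }
    }
}

struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Polls the root in a loop, like step 3 of the execution outline.
fn run_query() -> Vec<Batch> {
    let source = Source { batches: VecDeque::from(vec![vec![1, 2, 3], vec![4, 5, 6]]) };
    let mut root = KeepEven { child: Box::pin(source) };
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut out = Vec::new();
    // This sketch never returns `Pending`, so the loop ends at end-of-stream.
    while let Poll::Ready(Some(batch)) = Pin::new(&mut root).poll_next(&mut cx) {
        out.push(batch);
    }
    out
}
```

Here every poll of the root results in exactly one poll of the leaf, so control returns to the caller after each batch.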
+
+Here's where the first signs of problems start to show up: some operations 
(like aggregations, sorts, or certain join phases) need to process a lot of 
data before producing any output.
+When `poll_next` encounters one of these operations, it might need to perform 
a substantial amount of work before it can return a record batch.
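
To see how much work can hide behind a single `poll_next` call, consider this sketch of an aggregating node. As before, `Batch`, `BatchStream`, `Source`, `SumAll`, and `first_poll` are hypothetical stand-ins and not DataFusion code. The aggregate has to drain its child completely before it can emit its one output batch.

```rust
use std::cell::Cell;
use std::pin::Pin;
use std::rc::Rc;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

type Batch = Vec<i64>;

/// Local trait shaped like `poll_next`; not DataFusion's real trait.
trait BatchStream {
    fn poll_next(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Option<Batch>>;
}

/// Leaf that is always ready and counts how often it gets polled.
struct Source {
    batches_left: usize,
    polls: Rc<Cell<usize>>,
}

impl BatchStream for Source {
    fn poll_next(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<Option<Batch>> {
        let this = self.get_mut();
        this.polls.set(this.polls.get() + 1);
        if this.batches_left == 0 {
            return Poll::Ready(None);
        }
        this.batches_left -= 1;
        Poll::Ready(Some(vec![1, 2, 3]))
    }
}

/// Aggregation: needs all input before it can produce any output.
struct SumAll {
    child: Pin<Box<dyn BatchStream>>,
    total: i64,
    done: bool,
}

impl BatchStream for SumAll {
    fn poll_next(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Option<Batch>> {
        let this = self.get_mut();
        if this.done {
            return Poll::Ready(None);
        }
        // As long as the child is ready, this loop never returns to the caller.
        loop {
            match this.child.as_mut().poll_next(cx) {
                Poll::Ready(Some(batch)) => this.total += batch.iter().sum::<i64>(),
                Poll::Ready(None) => {
                    this.done = true;
                    return Poll::Ready(Some(vec![this.total]));
                }
                Poll::Pending => return Poll::Pending,
            }
        }
    }
}

struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Polls the root once; returns its output and the number of child polls.
fn first_poll(n_batches: usize) -> (Option<Batch>, usize) {
    let polls = Rc::new(Cell::new(0));
    let source = Source { batches_left: n_batches, polls: polls.clone() };
    let mut root = SumAll { child: Box::pin(source), total: 0, done: false };
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let output = match Pin::new(&mut root).poll_next(&mut cx) {
        Poll::Ready(batch) => batch,
        Poll::Pending => None,
    };
    (output, polls.get())
}
```

A single root poll in `first_poll(1000)` triggers 1001 child polls before anything is returned. With a large enough input and a child that never returns `Pending`, that one call can run for a long time.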
+
+### Tokio and Cooperative Scheduling
+
+We need to make a small detour now via Tokio's scheduler before we can get to 
the query cancellation problem.
+DataFusion makes use of the Tokio asynchronous runtime, which uses a 
cooperative scheduling model.
+This is fundamentally different from preemptive scheduling that you might be 
used to:
+
+- In preemptive scheduling, the system can interrupt a task at any time to run 
something else
+- In cooperative scheduling, tasks must voluntarily yield control back to the 
scheduler
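
The effect of voluntary yielding can be sketched with a toy round-robin loop. This is not Tokio's scheduler, only an illustration of the cooperative model; `Worker`, `NoopWaker`, and `run` are hypothetical names. Each worker logs three steps, either yielding between steps or doing all of them in one poll.

```rust
use std::cell::RefCell;
use std::collections::VecDeque;
use std::future::Future;
use std::pin::Pin;
use std::rc::Rc;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical task that performs `steps` units of work.
struct Worker {
    name: &'static str,
    steps: u32,
    done: u32,
    yield_between: bool,
    log: Rc<RefCell<Vec<String>>>,
}

impl Future for Worker {
    type Output = ();

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        let this = self.get_mut();
        while this.done < this.steps {
            this.done += 1;
            this.log.borrow_mut().push(format!("{}{}", this.name, this.done));
            if this.yield_between && this.done < this.steps {
                // Cooperate: hand control back and ask to be polled again.
                cx.waker().wake_by_ref();
                return Poll::Pending;
            }
        }
        Poll::Ready(())
    }
}

struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Toy cooperative scheduler: it can only switch tasks when `poll` returns.
fn run(yield_between: bool) -> Vec<String> {
    let log = Rc::new(RefCell::new(Vec::new()));
    let mut queue: VecDeque<Pin<Box<dyn Future<Output = ()>>>> = VecDeque::new();
    for name in ["a", "b"] {
        queue.push_back(Box::pin(Worker {
            name,
            steps: 3,
            done: 0,
            yield_between,
            log: log.clone(),
        }));
    }
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    while let Some(mut task) = queue.pop_front() {
        if task.as_mut().poll(&mut cx).is_pending() {
            queue.push_back(task); // not finished: back of the queue
        }
    }
    let entries = log.borrow().clone();
    entries
}
```

With `run(true)` the two workers interleave step by step; with `run(false)` worker `a` finishes all of its steps before `b` gets to run at all, because the loop has no way to interrupt it.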
+
+This distinction is crucial for understanding our cancellation problem.
+When a Tokio task is running, it can't be forcibly interrupted - it must 
cooperate by periodically yielding contro

Re: [PR] Blog post on query cancellation [datafusion-site]

2025-06-16 Thread via GitHub


djanderson commented on PR #75:
URL: https://github.com/apache/datafusion-site/pull/75#issuecomment-2978267914

   One other thing I'm curious about. This write-up discusses the change in 
terms of enabling long-running tasks to be cancelled, but would making 
CPU-intensive exec blocks more cooperative also help alleviate blocking IO on 
the main runtime if users don't set up a separate runtime à la 
https://github.com/apache/datafusion/pull/16331? If so, that could be a really 
nice benefit besides cancelability. @alamb?





Re: [PR] Blog post on query cancellation [datafusion-site]

2025-06-16 Thread via GitHub


djanderson commented on code in PR #75:
URL: https://github.com/apache/datafusion-site/pull/75#discussion_r2150859260


##
content/blog/2025-06-15-cancellation.md:
##
@@ -0,0 +1,328 @@
+# Query Cancellation
+
+## The Challenge of Cancelling Long-Running Queries
+
+Have you ever tried to cancel a query that just wouldn't stop?
+In this post, we'll take an in-depth look at why that can happen in DataFusion 
and what the community did to resolve the problem.
+
+### Understanding Rust's Async Model
+
+To really understand the cancellation problem you need to be somewhat familiar 
with Rust's asynchronous programming model.
+This is a bit different from what you might be used to in other ecosystems.
+Let's go over the basics again as a refresher.
+If you're familiar with the ins and outs of `Future` and `async` you can skip 
this section.
+
+#### Futures Are Inert
+
+Rust's asynchronous programming model is built around the `Future` trait.
+In contrast to, for instance, JavaScript's `Promise` or Java's `Future`, a 
Rust `Future` does not necessarily represent an actively running asynchronous job.
+Instead, a `Future` represents a lazy calculation that only makes progress 
when explicitly polled.
+If nothing tells a `Future` to try and make progress explicitly, it is an 
inert object.
+
+You ask a `Future` to advance its calculation as much as possible by calling 
the `poll` method.
+The future responds with either:
+- `Poll::Pending` if it needs to wait for something (like I/O) before it can 
continue
+- `Poll::Ready` when it has completed and produced a value
+
+When a future returns `Pending`, it saves its internal state so it can pick up 
where it left off the next time you poll it.
+This state management is what makes Rust's futures memory-efficient and 
composable.
+Before returning `Pending`, the future also needs to set up the necessary 
signaling (via the `Waker` in the `Context` passed to `poll`) so that the caller 
gets notified when it should try to call `poll` again.
+This avoids having to call `poll` in a busy-waiting loop.
+
+Rust's `async` keyword provides syntactic sugar over this model.
+When you write an `async` function or block, the compiler transforms it into 
an implementation of the `Future` trait for you.
+Since all the state management is compiler generated and hidden from sight, 
async code tends to be more readable while maintaining the same underlying 
mechanics.
+
+The `await` keyword complements this by letting you pause execution until a 
future completes.
+When you write `.await` after a future, you're essentially telling the 
compiler to poll that future until it's ready before program execution 
continues with the statement after the await.
+
+#### From Futures to Streams
+
+The `futures` crate extends the `Future` model with the `Stream` trait.
+Streams represent a sequence of values produced asynchronously rather than 
just a single value.
+The `Stream` trait has one required method named `poll_next` that returns:
+- `Poll::Pending` when the next value isn't ready yet, just like a `Future` 
would
+- `Poll::Ready(Some(value))` when a new value is available
+- `Poll::Ready(None)` when the stream is exhausted
+
+### How DataFusion Executes Queries
+
+In DataFusion, queries are executed as follows:
+
+1. First the query is compiled into a tree of `ExecutionPlan` nodes
+2. `ExecutionPlan::execute` is called on the root of the tree. This method 
returns a `SendableRecordBatchStream` (a pinned `Box<dyn RecordBatchStream + Send>`)
+3. `Stream::poll_next` is called in a loop to get the results
+
+The stream we get in step 2 is actually the root of a tree of streams that 
mostly mirrors the execution plan tree.
+Each stream tree node processes the record batches it gets from its children.
+The leaves of the tree produce record batches themselves.
+
+Query execution progresses each time you call `poll_next` on the root stream.
+This call typically cascades down the tree, with each node calling `poll_next` 
on its children to get the data it needs to process.
+
+Here's where the first signs of problems start to show up: some operations 
(like aggregations, sorts, or certain join phases) need to process a lot of 
data before producing any output.
+When `poll_next` encounters one of these operations, it might need to perform 
a substantial amount of work before it can return a record batch.
+
+### Tokio and Cooperative Scheduling
+
+We need to make a small detour now via Tokio's scheduler before we can get to 
the query cancellation problem.
+DataFusion makes use of the Tokio asynchronous runtime, which uses a 
cooperative scheduling model.
+This is fundamentally different from preemptive scheduling that you might be 
used to:
+
+- In preemptive scheduling, the system can interrupt a task at any time to run 
something else
+- In cooperative scheduling, tasks must voluntarily yield control back to the 
scheduler
+
+This distinction is crucial for understanding our cancellation problem.
+When a Tokio task is running, it can't be forcibly interrupted - it must 
cooperate by periodically yielding cont

Re: [PR] Blog post on query cancellation [datafusion-site]

2025-06-15 Thread via GitHub


pepijnve commented on PR #75:
URL: https://github.com/apache/datafusion-site/pull/75#issuecomment-2974061097

   I've marked this as draft for now. I think I have the narrative arc I was 
going for in place, but the text probably still needs some editing work.

