Hi,

On 2/8/23 11:42, Tomas Vondra wrote:
> On 2/1/23 14:41, David Geier wrote:
>
> Yeah, this is a pretty annoying regression. We already can hit poor
> behavior when matching rows are not distributed uniformly in the tables
> (which is what LIMIT costing assumes), and this makes it more likely to
> hit similar issues. A bit like when doing many HTTP requests makes it
> more likely to hit at least one 99% outlier.

Are you talking about the use of ordering vs. filtering indexes in queries where both an ORDER BY and a filter are present (e.g. an ordering index is used, but all rows passing the filter are at the end of the table)? If not, could you elaborate a bit more and maybe give an example?
> No opinion on these options, but worth a try. Alternatively, we could
> try the usual doubling approach - start with a low threshold (and set
> the latch frequently), and then gradually increase it up to the 1/4.
>
> That should work both for queries expecting only few rows and those
> producing a lot of data.

I was thinking about this variant as well. One more alternative would be to set the leader's latch once a worker has produced 1/Nth of the LIMIT, where N is the number of workers. Both variants still have corner cases where the latch is set too late, but either would surely be much better than what we have today. A rough sketch of the doubling variant follows.
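
To make the idea concrete, here is a minimal standalone sketch of the doubling approach. None of these names exist in shm_mq.c - they're made up for illustration - and the initial threshold is an arbitrary guess that would need tuning:

    #include <stdbool.h>
    #include <stddef.h>

    #define INITIAL_FLUSH_THRESHOLD 64  /* bytes; arbitrary, would need tuning */

    typedef struct SketchQueue
    {
        size_t ring_size;        /* total size of the shared ring buffer */
        size_t send_pending;     /* bytes written but not yet flushed */
        size_t flush_threshold;  /* adaptive; starts at INITIAL_FLUSH_THRESHOLD */
    } SketchQueue;

    /*
     * Decide whether to set the leader's latch now.  The threshold starts
     * small, so queries returning only a few rows wake the leader almost
     * immediately, and doubles after every flush until it reaches the
     * fixed limit of 1/4 of the ring size that we have today.
     */
    static bool
    sketch_should_flush(SketchQueue *q, bool force_flush)
    {
        size_t cap = q->ring_size / 4;

        if (!force_flush && q->send_pending <= q->flush_threshold)
            return false;

        /* Double the threshold, capped at 1/4 of the ring size. */
        q->flush_threshold *= 2;
        if (q->flush_threshold > cap)
            q->flush_threshold = cap;
        return true;
    }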

I also did some profiling, and - at least on my development laptop with 8 physical cores - the original example motivating the batching change is actually slower with batching than with it disabled. I disabled batching by commenting out the following check in shm_mq.c, which defers setting the leader's latch until more than a quarter of the ring buffer is pending:

    if (force_flush || mqh->mqh_send_pending > (mq->mq_ring_size >> 2))

SET parallel_tuple_cost TO 0;
CREATE TABLE b (a int);
INSERT INTO b SELECT generate_series(1, 200000000);
ANALYZE b;
EXPLAIN (ANALYZE, TIMING OFF) SELECT * FROM b;

 Gather  (cost=1000.00..1200284.61 rows=200375424 width=4) (actual rows=200000000 loops=1)
   Workers Planned: 7
   Workers Launched: 7
   ->  Parallel Seq Scan on b  (cost=0.00..1199284.61 rows=28625061 width=4) (actual rows=25000000 loops=8)

Always latch: 19055 ms
Batching:     19575 ms

If I find some time, I'll play around a bit more and maybe propose a patch.

...

>> We would need something similar to CHECK_FOR_INTERRUPTS() which returns
>> a NULL slot if a parallel worker is supposed to stop execution (we could
>> e.g. check if the queue got detached). Or could we amend
>> CHECK_FOR_INTERRUPTS() to just stop the worker gracefully if the queue
>> got detached?

> That sounds reasonable, but I'm not very familiar with the leader-worker
> communication, so no opinion on how it should be done.

I think an extra macro that needs to be called from dozens of places to check whether parallel execution is supposed to end is the least preferable approach. I'll read up more on how CHECK_FOR_INTERRUPTS() works and on whether we could instead actively signal the workers that they should stop. For reference, a sketch of the check-based variant follows.
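
To make the discussion concrete, here is a minimal sketch of what such a check could look like. None of these identifiers exist in PostgreSQL - they're made up for illustration - and where exactly the flag would get set (e.g. when shm_mq reports the queue as detached) is left open:

    #include <stdbool.h>

    /*
     * Hypothetical per-worker state; in reality this would live in the
     * parallel executor context, and the flag would be set when shm_mq
     * reports the tuple queue as detached.
     */
    typedef struct SketchWorkerState
    {
        bool queue_detached;  /* leader went away, nobody reads our tuples */
    } SketchWorkerState;

    /*
     * Similar in spirit to CHECK_FOR_INTERRUPTS(): returns true if the
     * worker should stop execution because its output can never be
     * consumed.  The caller would then return a NULL slot, as if the
     * scan had reached its end.
     */
    static inline bool
    sketch_should_stop(const SketchWorkerState *state)
    {
        return state->queue_detached;
    }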

--
David Geier
(ServiceNow)


