Re: Damage control for planner's get_actual_variable_endpoint() runaway

2022-11-24 Thread Simon Riggs
On Tue, 22 Nov 2022 at 18:44, Tom Lane wrote: > > I wrote: > > Still wondering if there's really no CHECK_FOR_INTERRUPT anywhere > > else in this loop. > > I did some experimentation using the test case Jakub presented > to start with, and verified that that loop does respond promptly > to

Re: Damage control for planner's get_actual_variable_endpoint() runaway

2022-11-22 Thread Robert Haas
On Tue, Nov 22, 2022 at 1:44 PM Tom Lane wrote: > I wrote: > > Still wondering if there's really no CHECK_FOR_INTERRUPT anywhere > > else in this loop. > > I did some experimentation using the test case Jakub presented > to start with, and verified that that loop does respond promptly > to

Re: Damage control for planner's get_actual_variable_endpoint() runaway

2022-11-22 Thread Tom Lane
I wrote: > Still wondering if there's really no CHECK_FOR_INTERRUPT anywhere > else in this loop. I did some experimentation using the test case Jakub presented to start with, and verified that that loop does respond promptly to control-C even in HEAD. So there are CFI(s) in the loop as I

Re: Damage control for planner's get_actual_variable_endpoint() runaway

2022-11-22 Thread Tom Lane
Robert Haas writes: > On Tue, Nov 22, 2022 at 11:35 AM Tom Lane wrote: >> Is it appropriate to count distinct pages, rather than just the >> number of times we have to visit a heap tuple? That seems to >> complicate the logic a good deal, and I'm not sure it's buying >> much, especially since

Re: Damage control for planner's get_actual_variable_endpoint() runaway

2022-11-22 Thread Robert Haas
On Tue, Nov 22, 2022 at 11:35 AM Tom Lane wrote: > Simon Riggs writes: > > New patch version reporting for duty, sir. Please take it from here! > > Why the CHECK_FOR_INTERRUPTS? I'd supposed that there's going to be > one somewhere down inside the index or heap access --- do you have > reason

Re: Damage control for planner's get_actual_variable_endpoint() runaway

2022-11-22 Thread Tom Lane
Simon Riggs writes: > New patch version reporting for duty, sir. Please take it from here! Why the CHECK_FOR_INTERRUPTS? I'd supposed that there's going to be one somewhere down inside the index or heap access --- do you have reason to think there isn't? Is it appropriate to count distinct

Re: Damage control for planner's get_actual_variable_endpoint() runaway

2022-11-22 Thread Simon Riggs
On Mon, 21 Nov 2022 at 20:55, Robert Haas wrote: > > On Mon, Nov 21, 2022 at 1:17 PM Andres Freund wrote: > > On November 21, 2022 9:37:34 AM PST, Robert Haas > > wrote: > > >On Mon, Nov 21, 2022 at 12:30 PM Andres Freund wrote: > > >> This can't quite be right - isn't this only applying the

Re: Damage control for planner's get_actual_variable_endpoint() runaway

2022-11-22 Thread Simon Riggs
On Mon, 21 Nov 2022 at 22:15, Tom Lane wrote: > > Andres Freund writes: > > On 2022-11-21 16:17:56 -0500, Robert Haas wrote: > >> But ... what if they're not? Could the index contain a large number of > >> pages containing just 1 tuple each, or no tuples at all? If so, maybe > >> we can read ten

Re: Damage control for planner's get_actual_variable_endpoint() runaway

2022-11-22 Thread Simon Riggs
On Mon, 21 Nov 2022 at 23:34, Robert Haas wrote: > > On Mon, Nov 21, 2022 at 6:17 PM Tom Lane wrote: > > evidence that it's a live problem. API warts are really hard to > > get rid of once instituted. > > Yeah, agreed. Agreed, happy not to; that version was just a WIP to see how it might work.

Re: Damage control for planner's get_actual_variable_endpoint() runaway

2022-11-22 Thread Jakub Wartak
Hi all, apologies the patch was rushed too quickly - my bad. I'm attaching a fixed one as v0004 (as it is the 4th patch floating around here). -Jakub Wartak On Mon, Nov 21, 2022 at 9:55 PM Robert Haas wrote: > > On Mon, Nov 21, 2022 at 1:17 PM Andres Freund wrote: > > On November 21, 2022

Re: Damage control for planner's get_actual_variable_endpoint() runaway

2022-11-21 Thread Robert Haas
On Mon, Nov 21, 2022 at 6:17 PM Tom Lane wrote: > evidence that it's a live problem. API warts are really hard to > get rid of once instituted. Yeah, agreed. -- Robert Haas EDB: http://www.enterprisedb.com

Re: Damage control for planner's get_actual_variable_endpoint() runaway

2022-11-21 Thread Tom Lane
Robert Haas writes: > On Mon, Nov 21, 2022 at 5:15 PM Tom Lane wrote: >> I think we should content ourselves with improving the demonstrated >> case, which is where we're forced to do a lot of heap fetches due >> to lots of not-all-visible tuples. > All right. I've been bitten by this problem

Re: Damage control for planner's get_actual_variable_endpoint() runaway

2022-11-21 Thread Robert Haas
On Mon, Nov 21, 2022 at 5:15 PM Tom Lane wrote: > I think we should content ourselves with improving the demonstrated > case, which is where we're forced to do a lot of heap fetches due > to lots of not-all-visible tuples. Whether we can spend a lot of > time scanning the index without ever

Re: Damage control for planner's get_actual_variable_endpoint() runaway

2022-11-21 Thread Tom Lane
Andres Freund writes: > On 2022-11-21 16:17:56 -0500, Robert Haas wrote: >> But ... what if they're not? Could the index contain a large number of >> pages containing just 1 tuple each, or no tuples at all? If so, maybe >> we can read ten bazillion index pages trying to find each heap tuple >>

Re: Damage control for planner's get_actual_variable_endpoint() runaway

2022-11-21 Thread Andres Freund
Hi, On 2022-11-21 16:17:56 -0500, Robert Haas wrote: > But ... what if they're not? Could the index contain a large number of > pages containing just 1 tuple each, or no tuples at all? If so, maybe > we can read ten bazillion index pages trying to find each heap tuple > and still end up in

Re: Re: Damage control for planner's get_actual_variable_endpoint() runaway

2022-11-21 Thread Robert Haas
On Mon, Nov 21, 2022 at 2:15 PM Tom Lane wrote: > Andres Freund writes: > > On November 21, 2022 10:44:17 AM PST, Simon Riggs > > wrote: > >> Robert, something like this perhaps? limit on both the index and the heap. > > > I don't think we should add additional code / struct members into very

Re: Damage control for planner's get_actual_variable_endpoint() runaway

2022-11-21 Thread Robert Haas
On Mon, Nov 21, 2022 at 1:17 PM Andres Freund wrote: > On November 21, 2022 9:37:34 AM PST, Robert Haas > wrote: > >On Mon, Nov 21, 2022 at 12:30 PM Andres Freund wrote: > >> This can't quite be right - isn't this only applying the limit if we found > >> a > >> visible tuple? > > > >It

Re: Re: Damage control for planner's get_actual_variable_endpoint() runaway

2022-11-21 Thread Tom Lane
Andres Freund writes: > On November 21, 2022 10:44:17 AM PST, Simon Riggs > wrote: >> Robert, something like this perhaps? limit on both the index and the heap. > I don't think we should add additional code / struct members into very common > good paths for these limits. Yeah, I don't like

Re: Damage control for planner's get_actual_variable_endpoint() runaway

2022-11-21 Thread Andres Freund
Hi, On November 21, 2022 10:44:17 AM PST, Simon Riggs wrote: >Robert, something like this perhaps? limit on both the index and the heap. I don't think we should add additional code / struct members into very common good paths for these limits. I don't really understand the point of

Re: Damage control for planner's get_actual_variable_endpoint() runaway

2022-11-21 Thread Simon Riggs
On Mon, 21 Nov 2022 at 18:17, Andres Freund wrote: > > Hi, > > On November 21, 2022 9:37:34 AM PST, Robert Haas > wrote: > >On Mon, Nov 21, 2022 at 12:30 PM Andres Freund wrote: > >> This can't quite be right - isn't this only applying the limit if we found > >> a > >> visible tuple? > > >

Re: Damage control for planner's get_actual_variable_endpoint() runaway

2022-11-21 Thread Andres Freund
Hi, On November 21, 2022 9:37:34 AM PST, Robert Haas wrote: >On Mon, Nov 21, 2022 at 12:30 PM Andres Freund wrote: >> This can't quite be right - isn't this only applying the limit if we found a >> visible tuple? > >It doesn't look that way to me, but perhaps I'm just too dense to see >the

Re: Damage control for planner's get_actual_variable_endpoint() runaway

2022-11-21 Thread Tom Lane
Robert Haas writes: > On Mon, Nov 21, 2022 at 12:38 PM Tom Lane wrote: >> What it's restricting is the number of heap page fetches, which >> might be good enough. We don't have a lot of visibility here >> into how many index pages were scanned before returning the next >> not-dead index entry,

Re: Damage control for planner's get_actual_variable_endpoint() runaway

2022-11-21 Thread Robert Haas
On Mon, Nov 21, 2022 at 12:38 PM Tom Lane wrote: > Andres Freund writes: > > This can't quite be right - isn't this only applying the limit if we found a > > visible tuple? > > What it's restricting is the number of heap page fetches, which > might be good enough. We don't have a lot of

Re: Damage control for planner's get_actual_variable_endpoint() runaway

2022-11-21 Thread Tom Lane
Andres Freund writes: > This can't quite be right - isn't this only applying the limit if we found a > visible tuple? What it's restricting is the number of heap page fetches, which might be good enough. We don't have a lot of visibility here into how many index pages were scanned before

Re: Damage control for planner's get_actual_variable_endpoint() runaway

2022-11-21 Thread Robert Haas
On Mon, Nov 21, 2022 at 12:30 PM Andres Freund wrote: > This can't quite be right - isn't this only applying the limit if we found a > visible tuple? It doesn't look that way to me, but perhaps I'm just too dense to see the problem? -- Robert Haas EDB: http://www.enterprisedb.com

Re: Damage control for planner's get_actual_variable_endpoint() runaway

2022-11-21 Thread Justin Pryzby
This patch version runs "continue" unconditionally (rather than conditionally, like the previous version). if (!index_fetch_heap(index_scan, tableslot)) continue; /* no visible tuple, try next index entry */

Re: Damage control for planner's get_actual_variable_endpoint() runaway

2022-11-21 Thread Andres Freund
Hi, On 2022-11-21 17:06:16 +0100, Jakub Wartak wrote: > @@ -6213,14 +6216,26 @@ get_actual_variable_endpoint(Relation heapRel, > /* Fetch first/next tuple in specified direction */ > while ((tid = index_getnext_tid(index_scan, indexscandir)) != NULL) > { > +

Re: Damage control for planner's get_actual_variable_endpoint() runaway

2022-11-21 Thread Jakub Wartak
Hi, Draft version of the patch attached (it is based on Simon's) I would be happier if we could make that #define into GUC (just in case), although I do understand the effort to reduce the number of various knobs (as their high count causes their own complexity). -Jakub Wartak. On Mon, Nov 21,

Re: Damage control for planner's get_actual_variable_endpoint() runaway

2022-11-21 Thread Robert Haas
On Mon, Nov 21, 2022 at 10:32 AM Tom Lane wrote: > Robert Haas writes: > > Is there any reason to tie this into page costs? I'd be more inclined > > to just make it a hard limit on the number of pages. I think that > > would be more predictable and less prone to surprising (bad) behavior. > >

Re: Damage control for planner's get_actual_variable_endpoint() runaway

2022-11-21 Thread Tom Lane
Robert Haas writes: > Is there any reason to tie this into page costs? I'd be more inclined > to just make it a hard limit on the number of pages. I think that > would be more predictable and less prone to surprising (bad) behavior. Agreed, a simple limit of N pages fetched seems appropriate. >

Re: Damage control for planner's get_actual_variable_endpoint() runaway

2022-11-21 Thread Simon Riggs
On Mon, 21 Nov 2022 at 15:23, Robert Haas wrote: > > On Mon, Nov 21, 2022 at 10:14 AM Simon Riggs > wrote: > > > > What we need is a solution that avoids reading an unbounded number of > > > > tuples under any circumstances. I previously suggested using > > > > SnapshotAny here, but Tom didn't

Re: Damage control for planner's get_actual_variable_endpoint() runaway

2022-11-21 Thread Robert Haas
On Mon, Nov 21, 2022 at 10:14 AM Simon Riggs wrote: > > > What we need is a solution that avoids reading an unbounded number of > > > tuples under any circumstances. I previously suggested using > > > SnapshotAny here, but Tom didn't like that. I'm not sure if there are > > > safety issues there

Re: Damage control for planner's get_actual_variable_endpoint() runaway

2022-11-21 Thread Simon Riggs
On Mon, 21 Nov 2022 at 15:01, Tom Lane wrote: > > > What we need is a solution that avoids reading an unbounded number of > > tuples under any circumstances. I previously suggested using > > SnapshotAny here, but Tom didn't like that. I'm not sure if there are > > safety issues there or if Tom

Re: Damage control for planner's get_actual_variable_endpoint() runaway

2022-11-21 Thread Tom Lane
Robert Haas writes: > I don't think this is safe at all. Wait events can only bracket > individual operations, like the reads of the individual index blocks, > not report on which phase of a larger operation is in progress. If we > try to make them do the latter, we will have a hot mess on our

Re: Damage control for planner's get_actual_variable_endpoint() runaway

2022-11-21 Thread Robert Haas
On Mon, Nov 21, 2022 at 7:22 AM Alvaro Herrera wrote: > On 2022-Nov-21, Jakub Wartak wrote: > > b) I was wondering about creating a new wait class "Planner" with the > > event "ReadingMinMaxIndex" (or similar). The obvious drawback is the > > naming/categorization as wait events are ... well.. as

Re: Damage control for planner's get_actual_variable_endpoint() runaway

2022-11-21 Thread Alvaro Herrera
On 2022-Nov-21, Jakub Wartak wrote: > b) I was wondering about creating a new wait class "Planner" with the > event "ReadingMinMaxIndex" (or similar). The obvious drawback is the > naming/categorization as wait events are ... well.. as the name "WAIT" > implies, while those btree lookups could

Damage control for planner's get_actual_variable_endpoint() runaway

2022-11-21 Thread Jakub Wartak
Hi hackers, the case of planner's src/backend/utils/adt/selfuncs.c:get_actual_variable_endpoint() spending literally seconds seems to be well known fact across hackers (in the extreme wild case it can be over 1+ hour on VLDBs). For those unfamiliar it is planner estimation that tries to read real