ng time. We
might never be 100% sure that we've fixed all of them if the initial
design is not generally robust. Most patches are not like that.
--
Peter Geoghegan
On Tue, Mar 29, 2022 at 3:04 PM Tom Lane wrote:
> I think David's questions are sufficiently cogent and difficult
> that we should not add jit_warn_above_fraction at this time.
+1
--
Peter Geoghegan
mpletely different to any model that expects plan costs to be
meaningful in an absolute sense. I'm not completely sure how much that
difference matters, but I suspect that the answer is: "it depends, but
often it matters a great deal".
--
Peter Geoghegan
aggressiveness thing) should be
something we escalate to, as with the failsafe.
Dynamic behavior works a lot better. And it makes scheduling of
autovacuum workers a lot more straightforward -- the discontinuities
seem to make that much harder, which is one more reason to avoid them
altogether.
--
Peter Geoghegan
st need to be sure we all are using these terms
> the same way.
Yeah, there are *endless* opportunities for confusion here.
--
Peter Geoghegan
h.
I assume that it doesn't really appear in very simple cases (also
common cases). But delaying the scan setup work until execution time
does seem ugly. That's probably a good enough reason to refactor.
--
Peter Geoghegan
to ranges
sounds odd at first, but it seems like the right approach now, on
balance -- that's the only *good* way to maintain index order.
(Maintaining index order is needed to avoid needing or relying on
deduplication in the executor proper, which is even inappropriate in
an implementation of SELECT-DISTINCT-that-matches-an-index IMO.)
--
Peter Geoghegan
le, but I wouldn't do it unless there was a
pretty noticeable payoff.
--
Peter Geoghegan
On Mon, Mar 28, 2022 at 1:23 PM Peter Geoghegan wrote:
> I doubt that the patch's use of pg_memory_barrier() in places like
> _bt_killitems() is correct.
I also doubt that posting list splits are handled correctly.
If there is an LP_DEAD bit set on a posting list on the primary, and
w
, none of which are in
access method code that reads from shared_buffers. So this is not a
minor oversight.
--
Peter Geoghegan
On Thu, Mar 24, 2022 at 2:40 PM Peter Geoghegan wrote:
> > > This is absolutely mandatory in the aggressive case, because otherwise
> > > relfrozenxid advancement might be seen as unsafe. My observation is:
> > > Why should we accept the same race in the non-aggres
On Sun, Mar 27, 2022 at 2:02 PM Robert Haas wrote:
> On Sun, Mar 27, 2022 at 4:26 PM Peter Geoghegan wrote:
> > We're not dealing
> > with adversarial page images here.
>
> I think it's bad that we have to make that assumption, considering
> that there's nothing wha
ng. Just saying that this is quite possible.
--
Peter Geoghegan
/postgr.es/m/20220120052404.sonrhq3f3qgplpzj%40alap3.anarazel.de
Oh, yeah. If some other backend is holding back OldestXmin, and you
can't find a way of dealing with that, then you'll need a temp table.
(Mind you, that trick only works on recent versions too.)
--
Peter Geoghegan
u somehow
also make sure that FreezeLimit is OldestXmin (e.g. by setting
vacuum_freeze_min_age to 0).
VACUUM FREEZE (without DISABLE_PAGE_SKIPPING) seems like it would do
everything you want, without using a temp table. At least on the
master branch.
--
Peter Geoghegan
an 8KiB page is
in fact an nbtree page, in a maximally paranoid way. Might be an
example worth following here.
--
Peter Geoghegan
ests
fail in exactly the same way you've had problems with on the
buildfarm. On the first try, even.
--
Peter Geoghegan
e case.
>
> I do see that there are some difficulties there. I'm not sure what to
> do about that. I think a sufficiently clear commit message could
> possibly be enough, rather than trying to split the patch. But I also
> think splitting the patch should be considered, if that can reasonably
> be done.
I'll see if I can come up with something. It's hard to be sure about
that kind of thing when you're this close to the code.
--
Peter Geoghegan
f advantage to knowing our final scanned_pages almost
immediately. Things like prefetching, capping the size of the
dead_items array more intelligently (use final scanned_pages instead
of rel_pages in dead_items_max_items()), improvements to progress
reporting...not to mention more intelligent choices about whether we
should try to advance relfrozenxid a bit earlier during non-aggressive
VACUUMs.
> Hope that's helpful.
Very helpful -- thanks!
--
Peter Geoghegan
nt unsafe.
It would be great if you could take a look v11-0002-*, Robert. Does it
make sense to you?
Thanks
--
Peter Geoghegan
llion XIDs behind
OldestXmin, the vacuum_freeze_min_age default). I don't see much
difference.
Anyway, this isn't important. I'll just drop the third patch.
--
Peter Geoghegan
pruning. We don't need a
cleanup in either lazy_scan_noprune (a share lock is all we need), nor
do we even need one in lazy_vacuum_heap_page (a regular exclusive lock
is all we need).
--
Peter Geoghegan
On Sun, Mar 13, 2022 at 9:05 PM Peter Geoghegan wrote:
> Attached is v10. While this does still include the freezing patch,
> it's not in scope for Postgres 15. As I've said, I still think that it
> makes sense to maintain the patch series with the freezing stuff,
> since it's
hich is local information (not
system wide information) that is much easier to understand and target
in the implementation.
--
Peter Geoghegan
ple method of making the
same information more visible, that you could implement in only a few
minutes. Perhaps that was optimistic.
--
Peter Geoghegan
once and to try
> to test every branch every 24 hours. Let's see how that goes.
Extravagance!
--
Peter Geoghegan
t of avoiding
an FPI. This second idea is also much more general than simply
avoiding FPIs in general.
--
Peter Geoghegan
On Wed, Mar 9, 2022 at 4:46 PM Andres Freund wrote:
> On 2022-03-03 19:31:32 -0800, Peter Geoghegan wrote:
> > Attached is a new revision of my fix. This is more or less a
> > combination of my v4 fix from November 12 [1] and Andres'
> > already-committed fix (commit 18b8
On Thu, Mar 17, 2022 at 6:15 AM Peter Eisentraut
wrote:
> committed, thanks
Glad that this finally happened. Thanks to everybody involved!
--
Peter Geoghegan
rel's relfrozenxid age crossing the
crucial xidStopLimit crossover point.
This patch makes this problem scenario virtually impossible. Right now
I'm only prepared to say it's very unlikely. I don't see a reason to
take any chances, though.
--
Peter Geoghegan
v1-0001-Perform-final-failsafe-check-
her than palloc()) makes
sense as a defensive measure. It depends on the specific code, of
course.
--
Peter Geoghegan
time to further review this patch and/or commit it?
I'll definitely review it some more before too long.
--
Peter Geoghegan
On Fri, Feb 25, 2022 at 5:52 PM Peter Geoghegan wrote:
> There is an important practical way in which it makes sense to treat
> 0001 as separate to 0002. It is true that 0001 is independently quite
> useful. In practical terms, I'd be quite happy to just get 0001 into
> Postgres 15,
essions *like that* are unlikely. Avoiding
doing what is clearly the wrong thing just seems to work out that way,
in general.
--
Peter Geoghegan
enefit from visiting all-frozen pages, just
> because there are only 30 of them in a row.
Right. I imagine that SKIP_PAGES_THRESHOLD actually does help with
this, but if we actually tried we'd find a much better way.
> I wish somebody would tackle merging heap_page_prune() with
> vacuuming. Primarily so we only do a single WAL record. But also because the
> separation has caused a *lot* of complexity. I've already more projects than
> I should, otherwise I'd start on it...
That has value, but it doesn't feel as urgent.
--
Peter Geoghegan
nities for
FreezeMultiXactId() to avoid allocating new XMIDs (just to freeze
XIDs) by having the full context. And maybe by making the dialog
between lazy_scan_prune and heap_prepare_freeze_tuple a bit more
nuanced.
--
Peter Geoghegan
On Fri, Feb 25, 2022 at 2:00 PM Peter Geoghegan wrote:
> > Hm. I guess I'll have to look at the code for it. It doesn't immediately
> > "feel" quite right.
>
> I kinda think it might be. Please let me know if you see a problem
> with what I've said.
Oh, wait. I
mber that the range that we're
skipping was all-frozen up-front?
That way non-aggressive VACUUMs are not unnecessarily at a
disadvantage, when it comes to being able to advance relfrozenxid.
What if we end up not incrementing vacrel->frozenskipped_pages when we
easily could have, just because this is a non-aggressive VACUUM? I
think that it's worth avoiding stuff like that whenever possible.
Maybe this particular example isn't the most important one. For
example it probably isn't as bad as the one was fixed by the
lazy_scan_noprune work. But why even take a chance? Seems easier to
remove the special case -- which is what this really is.
> FWIW, I'd really like to get rid of SKIP_PAGES_THRESHOLD. It often ends up
> causing a lot of time doing IO that we never need, completely trashing all CPU
> caches, while not actually causing decent readaead IO from what I've seen.
I am also suspicious of SKIP_PAGES_THRESHOLD. But if we want to get
rid of it, we'll need to be sensitive to how that affects relfrozenxid
advancement in non-aggressive VACUUMs IMV.
Thanks again for the review!
--
Peter Geoghegan
ilar. We want options and maximum flexibility, everywhere.
> but if you are going to rescan the heap
> again next time before doing any index vacuuming then why we want to
> store them anyway.
It all depends, of course. The decision needs to be made using a cost
model. I suspect it will be necessary to try it out, and see.
--
Peter Geoghegan
On Sun, Feb 20, 2022 at 12:27 PM Peter Geoghegan wrote:
> You've given me a lot of high quality feedback on all of this, which
> I'll work through soon. It's hard to get the balance right here, but
> it's made much easier by this kind of feedback.
Attached is v9. Lots of changes. H
| grep` as well as more reliable given specific variations
> in output style
> depending on how the blocks are specified.
Sounds useful to me.
--
Peter Geoghegan
or a table where the failsafe kicked in).
--
Peter Geoghegan
the last few times I used
it, which wasn't for long enough for it to really matter. This must
have been why. I might have to rescind my recommendation of lld.
--
Peter Geoghegan
rting MultiXact
wraparound? I'm hoping that the patch that adds smarter tracking of
final relfrozenxid/relminmxid values during VACUUM makes this less of
a problem automatically.
--
Peter Geoghegan
uot;TPC-C", which has the relevant autovacuum log
output for the orders table, covering a 24 hour period
--
Peter Geoghegan
e page aborts.
Just as long as we have a consistent idea about what's going on at the
level of the whole page (or maybe the level of each HOT chain, but the
whole page level seems simpler to me).
--
Peter Geoghegan
ve an aborted DEAD tuple
behind, that we could have pruned away, but didn't. The important
thing is to be consistent at the level of the page.
--
Peter Geoghegan
st arbitrary. It seems to have more to
do with freezing tuples than killing tuples.
--
Peter Geoghegan
On Sat, Feb 19, 2022 at 6:16 PM Peter Geoghegan wrote:
> > Given that heap_surgery's raison d'etre is correcting corruption etc, I
> > think
> > it makes sense for it to do as minimal work as possible. Iterating through a
> > HOT chain would be a problem if you e.g
a different return value, preventing the
> "If the tuple is DEAD and doesn't chain to anything else"
> path from being taken.
That makes sense as an explanation. Goes to show just how fragile the
"DEAD and doesn't chain to anything else" logic at the top of
heap_prune_chain really is.
--
Peter Geoghegan
ne() (on account of the pg_surgery
corruption), there is no possible way that that can happen later on.
And so we cannot find the same heap-only tuple and mark it LP_UNUSED
(which is how we always deal with HEAPTUPLE_DEAD heap-only tuples)
during pruning.
--
Peter Geoghegan
at to do in heap_surgery later, once the
main issue is under control. My point was mostly just that orphaned
heap-only tuples are definitely not okay, in general. They are the
least worst option when corruption has already happened, maybe -- but
maybe not.
--
Peter Geoghegan
corrupt-hot-chain.page.gz
Description: GNU Zip compressed data
On Sat, Feb 19, 2022 at 4:22 PM Peter Geoghegan wrote:
> This very much looks like a bug in pg_surgery itself now -- attached
> is a draft fix.
Wait, that's not it either. I jumped the gun -- this isn't sufficient
(though the patch I posted might not be a bad idea anyway).
Looks like pg_s
On Sat, Feb 19, 2022 at 3:08 PM Peter Geoghegan wrote:
> It's quite possible that this is nothing more than a bug in my
> adversarial gizmo patch -- since I don't think that
> ConditionalLockBufferForCleanup() can ever fail with a temp buffer
> (though even that's not completely cle
On Fri, Feb 18, 2022 at 5:00 PM Peter Geoghegan wrote:
> Another testing strategy occurs to me: we could stress-test the
> implementation by simulating an environment where the no-cleanup-lock
> path is hit an unusually large number of times, possibly a fixed
> percentage of the time
ldn't
expect this ConditionalLockBufferForCleanup() testing gizmo to be too
disruptive.
--
Peter Geoghegan
dominate anyway (gambling that we can avoid more FPIs later on is not a
bad gamble, as gambles go). This seems to make the overhead
acceptable, on balance. Granted, you might be able to poke holes in
that argument, and reasonable people might disagree on what acceptable
should mean. There are many value judgements here, which makes it
complicated. (On the other hand we might be able to do better if there
was a particularly bad case for the 0002 work, if one came to light.)
--
Peter Geoghegan
fied as the cost of doing business, that's a hard argument to
make.
In short, 0001 is mechanically tricky, but easy to understand at a
high level. Whereas 0002 is mechanically simple, but tricky to
understand at a high level (and therefore far trickier than 0001
overall).
--
Peter Geoghegan
On Fri, Feb 11, 2022 at 8:30 PM Peter Geoghegan wrote:
> Attached is v8. No real changes -- just a rebased version.
Concerns about my general approach to this project (and even the
Postgres 14 VACUUM work) were expressed by Robert and Andres over on
the "Nonrandom scanned_pages
urely as
> refactoring work"), and then evolved into something not just a refactoring.
Of course.
> If helpful I can give a go at showing how I think it could be split up. Or
> perhaps more productively, do that on a not-yet-committed larger patch.
Any help is appreciated.
--
Peter Geoghegan
On Thu, Feb 17, 2022 at 10:33 AM Andres Freund wrote:
> On 2022-02-16 19:43:09 -0800, Peter Geoghegan wrote:
> > It makes the code in vacuumlazy.c much cleaner. In fact, that's how commit
> > 44fa8488 started off -- purely as refactoring work.
>
> The problem is that it didn'
muddied by the later work in the patch series, which really does make
pretty big, relatively risky changes to how we do certain things in
vacuumlazy.c. Maybe I shouldn't have referenced that work in the
commit message.
--
Peter Geoghegan
res here, as fellow
experts on VACUUM. That's hard for everybody, myself included. I
greatly appreciate your working with me on this, and on the earlier
VACUUM projects. I'm doing my best.
--
Peter Geoghegan
ave been
latent. If it wasn't there all along, then it might have been. Sort of
like the issue reference in 2011 commit b4b6923e.
--
Peter Geoghegan
thing that suggests "option of last resort". Oh well.
--
Peter Geoghegan
be
> used multiplicitively).
Hearing no objections, I pushed a commit to increase the default to 2.0.
Thanks
--
Peter Geoghegan
On Sun, Feb 13, 2022 at 1:43 PM Peter Geoghegan wrote:
> This fixes the observed problem directly. It also makes the code
> robust against other similar problems that might arise in the future.
> The risk that comes from trusting that scanned_pages is a truly random
> s
ue to random
happenstance. Once that happens, the situation is bound to come to a
head. The user is bound to finally notice that the system has gone
over xidStopLimit, because there is no longer any way for the problem
to go away on its own.
--
Peter Geoghegan
hat in place? :-)
--
Peter Geoghegan
On Tue, Feb 15, 2022 at 10:48 AM Matthias van de Meent
wrote:
> Peter Geoghegan asked for good arguments for the two changes
> implemented. Below are my arguments detailed, with adversarial loads
> that show the problematic behaviour of the line pointer array that is
> fixed with the
On Wed, Feb 16, 2022 at 10:27 AM Peter Geoghegan wrote:
> True, but the most recent version where that's actually possible is
> PostgreSQL 8.0, which was released in early 2005.
It just occurred to me that the main historic reason for the single
user mode advice was the lack of virtua
e ought to be very clear on the fact that Postgres
isn't going to just let your database become corrupt in some more or
less predictable way. The xidStopLimit thing is pretty bad, but it's
still much better than that.
--
Peter Geoghegan
ff in place, maybe not ever).
--
Peter Geoghegan
UUMs fail when attempting to truncate the table (it's
no worse than ANALYZE failing, for example).
Good news!
--
Peter Geoghegan
On Wed, Feb 16, 2022 at 12:43 AM John Naylor
wrote:
> I'll put some effort in finding any way that it might not be robust.
> After that, changing the message and docs is trivial.
Great, thanks John.
--
Peter Geoghegan
On Tue, Feb 15, 2022 at 9:28 AM Peter Geoghegan wrote:
> On Mon, Feb 14, 2022 at 10:04 PM John Naylor
> wrote:
> > Well, the point of inventing this new vacuum mode was because I
> > thought that upon reaching xidStopLimit, we couldn't issue commands,
> > peri
database over and over again. And honestly I think that's fine.
I think that it's fine too. It's unlikely that anybody is going to go
to any trouble to get better compression, certainly, but we should
give them every opportunity to use better alternatives. Ideally
without anybody having to even think about it.
--
Peter Geoghegan
le going
forward. TOAST compression is applied selectively already, based on
various obscure implementation details, some of which are quite
arbitrary.
I know less about zstd, but I imagine that a similar dynamic exists
there. Overall, everything that you've said makes sense to me.
--
Peter Geoghegan
UM
would have to do when run -- not limited to freezing. You could
probably do this anyway, but it's nice that that'll be true.
--
Peter Geoghegan
increase in the default for
Postgres 15, to 2.0.
Any objections to that plan?
--
Peter Geoghegan
On Mon, Feb 14, 2022 at 8:04 PM John Naylor
wrote:
> The failsafe mode does disable truncation as of v14:
>
> commit 60f1f09ff44308667ef6c72fbafd68235e55ae27
> Author: Peter Geoghegan
> Date: Tue Apr 13 12:58:31 2021 -0700
>
> Don't truncate heap when VACUUM's failsafe
the opaque space).
I agree that it's hard to imagine that opting out of using the
standard page header format could ever make much sense. Principally
because the restrictions imposed on an index AM that uses the standard
page header format are very minimal, while the benefits are
substantial.
--
Peter Geoghegan
also makes the code
robust against other similar problems that might arise in the future.
The risk that comes from trusting that scanned_pages is a truly random
sample (given these conditions) is generally very large, while the
risk that comes from disbelieving it (given these same conditions) is
vani
On Sat, Jan 29, 2022 at 8:42 PM Peter Geoghegan wrote:
> Attached is v7, a revision that overhauls the algorithm that decides
> what to freeze. I'm now calling it block-driven freezing in the commit
> message. Also included is a new patch, that makes VACUUM record zero
> free spac
to be far easier by making the
behavior dynamic, and continually reassessing. Once a relatively large
difference among two indexes first emerges, we can be relatively
confident about what to do. But smaller differences are likely just
noise.
--
Peter Geoghegan
t's definitely a reasonable concern. But once you find yourself in
this situation, *every* index will need to be vacuumed anyway, pretty
much as soon as possible. There will be many LP_DEAD items in the
heap, which will be enough to force index vacuuming of all indexes.
--
Peter Geoghegan
incing. There aren't that
many other things that are like pg_waldump, but haven't already been
exposed via an SQL interface. Offhand, I can't think of any.
--
Peter Geoghegan
nderstand. Cost control makes it okay to guess about
benefits for the index/queries and be wrong.
[1]
https://www.postgresql.org/message-id/cagnebogatzs1mwmvx8fzzhmxzudecb10anvwwhctxtibpg3...@mail.gmail.com
--
Peter Geoghegan
maybe some indexes are vacuumed
sooner when the number of LP_DEAD items is increasing. Not really
sure, but that seems more promising than anything else.
--
Peter Geoghegan
growth for
whatever index on the same table has grown the least since last time,
accounting for obvious special cases like partial indexes). Perhaps
we'd give some consideration to bulk deletes, too. Overall, it should
be pretty simple, and should sometimes force us to do one of these
"dynamic mini vacuums" of the index just because we're not quite sure
what to do. There is nothing wrong with admitting the uncertainty.
--
Peter Geoghegan
On Tue, Feb 8, 2022 at 9:33 AM Robert Haas wrote:
> On Tue, Feb 8, 2022 at 12:12 PM Peter Geoghegan wrote:
> > I believe that the main benefit of the dead TID conveyor belt (outside
> > of global index use cases) will be to enable us to do more (much more)
> > index va
ount of *useful* index vacuuming, in less time. There is
often some way in which failing to vacuum one index for a long time
does lasting damage to the index structure.
--
Peter Geoghegan
sense, and that extra concurrent
with pruning heap vacuuming doesn't seem useful at all.
--
Peter Geoghegan
On Mon, Feb 7, 2022 at 12:21 PM Robert Haas wrote:
>
> On Mon, Feb 7, 2022 at 11:43 AM Peter Geoghegan wrote:
> > > That's because, if VACUUM is only ever getting triggered by XID
> > > age advancement and not by bloat, there's no opportunity for your
> > > patc
ve (very reasonable) concerns about the extremes, even if you accept
all that. I just want to get the general idea across here, as a
starting point for further discussion.
--
Peter Geoghegan
On Fri, Feb 4, 2022 at 10:44 PM Peter Geoghegan wrote:
> Right - it's practically inevitable that you'll need an
> anti-wraparound VACUUM to advance relfrozenxid right now. Technically
> it's possible to advance relfrozenxid in any VACUUM, but in practice
> it just never happens on a
On Fri, Feb 4, 2022 at 10:21 PM Greg Stark wrote:
> On Wed, 15 Dec 2021 at 15:30, Peter Geoghegan wrote:
> > My emphasis here has been on making non-aggressive VACUUMs *always*
> > advance relfrozenxid, outside of certain obvious edge cases. And so
> > with all th
On Fri, Feb 4, 2022 at 4:18 PM Robert Haas wrote:
> On Fri, Feb 4, 2022 at 3:31 PM Peter Geoghegan wrote:
> > Application B will already block pruning by VACUUM operations against
> > application A's table, and so effectively blocks recording of the
> > resultant free spa
that artificially
split B-Tree pages early [1], detecting concurrency control related
contention. Other systems need concurrency control in indexes, which
we avoid by having versions live in indexes.
[1] http://cidrdb.org/cidr2021/papers/cidr2021_paper21.pdf
--
Peter Geoghegan
he thing that drives us to perform heap vacuuming
will probably be heap vacuuming itself, and not the fact that each and
every index has become "sufficiently bloated".
> If this isn't entirely making sense, it may well be because I'm a
> little fuzzy on all of it myself.
I'm in no position to judge. :-)
--
Peter Geoghegan
1001 - 1100 of 3113 matches
Mail list logo