Re: Commitfest Closed

2022-04-08 Thread Peter Geoghegan
On Fri, Apr 8, 2022 at 5:58 AM Alvaro Herrera wrote: > Thanks for herding through the CF! +1 -- Peter Geoghegan

Re: Lowering the ever-growing heap->pd_lower

2022-04-07 Thread Peter Geoghegan
eady (*), but if it > grows > further... No arguments here. There are probably quite a few places that won't need to be fixed, because it just doesn't matter, but lazy_scan_prune() will. -- Peter Geoghegan

Re: Lowering the ever-growing heap->pd_lower

2022-04-07 Thread Peter Geoghegan
On Mon, Apr 4, 2022 at 7:24 PM Peter Geoghegan wrote: > I am sympathetic to the idea that giving the system a more accurate > picture of how much free space is available on each heap page is an > intrinsic good. This might help us in a few different areas. For > example, the FSM

Re: Preventing indirection for IndexPageGetOpaque for known-size page special areas

2022-04-07 Thread Peter Geoghegan
> even individual tables, which would all be very cool), but I don't think > this approach would make that possible..? That would be the main advantage, yes. But I also tend to doubt that we should make it completely impossible to know anything at all about the page without fully decrypting it. It was just a suggestion. I will leave it at that. -- Peter Geoghegan

Re: Preventing indirection for IndexPageGetOpaque for known-size page special areas

2022-04-07 Thread Peter Geoghegan
On Thu, Apr 7, 2022 at 12:37 PM Robert Haas wrote: > On Thu, Apr 7, 2022 at 3:27 PM Peter Geoghegan wrote: > > I just meant that it wouldn't be reasonable to impose a fixed cost on > > every user, even those not using the feature. Which you said yourself. > > Unfortunat

Re: Preventing indirection for IndexPageGetOpaque for known-size page special areas

2022-04-07 Thread Peter Geoghegan
ogically come after" the special space under this scheme. You wouldn't have a simple constant offset into the page, but you'd have something not too far removed from such a constant. It could work as a constant with minimal context (just the AM type). Just like with Matthias' patch. -- Peter Geoghegan

Re: Preventing indirection for IndexPageGetOpaque for known-size page special areas

2022-04-07 Thread Peter Geoghegan
ly modest effort. You'd need to have AM-specific knowledge (it would stack right on top of Matthias's technique), but that doesn't seem all that hard. There are plenty of remaining status bits in BTOpaque, and probably all other index AM special areas. -- Peter Geoghegan

Re: should vacuum's first heap pass be read-only?

2022-04-07 Thread Peter Geoghegan
oblems, which manifest themselves as line pointer bloat first, with any problems in indexes coming up only much later, if at all (admittedly I could probably contrive such a case if I wanted to). Absence of evidence isn't evidence of absence, though. Just giving you my opinion. Again, though, I must ask: why does it matter either way? Even if such a scenario were reasonably common, it wouldn't necessarily make life harder for you here. -- Peter Geoghegan

Re: New compiler warning from btree dedup code

2022-04-06 Thread Peter Geoghegan
That approach seems fine. Thanks. -- Peter Geoghegan

Re: REINDEX blocks virtually any queries but some prepared queries.

2022-04-06 Thread Peter Geoghegan
s thought that the docs for REINDEX, while technically accurate, are very misleading in practice. -- Peter Geoghegan

Re: should vacuum's first heap pass be read-only?

2022-04-05 Thread Peter Geoghegan
On Tue, Apr 5, 2022 at 2:53 PM Robert Haas wrote: > On Tue, Apr 5, 2022 at 4:30 PM Peter Geoghegan wrote: > > On Tue, Apr 5, 2022 at 1:10 PM Robert Haas wrote: > > > I had assumed that this would not be the case, because if the page is > > > being accessed by the

Re: should vacuum's first heap pass be read-only?

2022-04-05 Thread Peter Geoghegan
t's far from guaranteed to help. Also, many tables have way more than one index. Of course it isn't nearly as simple as comparing the bytes of bloat in each case. More generally, I don't claim that it's easy to characterize which factor is more important, even in the abstract, even under ideal conditions -- it's very hard. But I'm sure that there are routinely very large differences among indexes and the heap structure. -- Peter Geoghegan

Re: should vacuum's first heap pass be read-only?

2022-04-05 Thread Peter Geoghegan
mething *also* enabled by the conveyor belt design. So overall, in either scenario, VACUUM concentrates on problems that are particular to a given table and workload, without being hindered by implementation-level restrictions. -- Peter Geoghegan

Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations

2022-04-05 Thread Peter Geoghegan
On Mon, Apr 4, 2022 at 8:25 PM Peter Geoghegan wrote: > Right. The reason I used WARNINGs was because it matches vaguely > related WARNINGs in vac_update_relstats()'s sibling function, > vacuum_set_xid_limits(). Okay, pushed the relfrozenxid warning patch. Thanks -- Peter Geoghegan

Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations

2022-04-04 Thread Peter Geoghegan
elated messages on the grounds that they're usually highly obscure issues that are (by definition) never supposed to happen. The only thing that a user can be expected to do with the information from the message is to report it to -bugs, or find some other similar report. -- Peter Geoghegan

Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations

2022-04-04 Thread Peter Geoghegan
On Fri, Apr 1, 2022 at 10:54 AM Peter Geoghegan wrote: > I also refined the WARNING patch in v15. It now actually issues > WARNINGs (rather than PANICs, which were just a temporary debugging > measure in v14). Going to commit this remaining patch tomorrow, barring objections.

Re: Lowering the ever-growing heap->pd_lower

2022-04-04 Thread Peter Geoghegan
which predate HOT. Obviously an increase in MaxHeapTuplesPerPage is likely to make the problem that the patch proposes to solve worse. I lean towards committing the patch now as work in that direction, in fact. It helps that this patch now seems relatively low risk. -- Peter Geoghegan

Re: Run pg_amcheck in 002_pg_upgrade.pl and 027_stream_regress.pl?

2022-04-04 Thread Peter Geoghegan
nd bugs in other code. I'd really like it if amcheck had HOT chain verification. That's the other area where catching bugs passively with assertions and whatnot is clearly not good enough. -- Peter Geoghegan

Re: Run pg_amcheck in 002_pg_upgrade.pl and 027_stream_regress.pl?

2022-04-03 Thread Peter Geoghegan
vor of using verify_heapam() to its full potential. So I'm +1 on your proposal. -- Peter Geoghegan

Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations

2022-04-03 Thread Peter Geoghegan
YZE stats in autovacuum.c? Possibly with help from vacuumlazy.c, and the visibility map? I see a lot of potential for exploiting the visibility map more, both within vacuumlazy.c itself, and for autovacuum.c scheduling [1]. I'd probably start with the scheduling stuff, and only then work out how to show users more actionable information. [1] https://postgr.es/m/cah2-wzkt9ey9nnm7q9nsaw5jdbjvsaq3yvb4ut4m93uajvd...@mail.gmail.com -- Peter Geoghegan

Re: CLUSTER sort on abbreviated expressions is broken

2022-04-03 Thread Peter Geoghegan
affected tuplesorts. (Just for CLUSTER tuplesorts on an expression index.) -- Peter Geoghegan

Re: should vacuum's first heap pass be read-only?

2022-04-01 Thread Peter Geoghegan
[1] https://www.postgresql.org/message-id/CAH2-WzmG%3D_vYv0p4bhV8L73_u%2BBkd0JMWe2zHH333oEujhig1g%40mail.gmail.com -- Peter Geoghegan

Re: should vacuum's first heap pass be read-only?

2022-04-01 Thread Peter Geoghegan
nt question instead? To put it another way, it would be great if the scheduling code for autovacuum could make inferences about what general strategy works best for a given table over time. In order to be able to do that sensibly, the algorithm needs more context, so that it can course correct wi

Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations

2022-04-01 Thread Peter Geoghegan
On Thu, Mar 31, 2022 at 11:19 AM Peter Geoghegan wrote: > The assert is "Assert(diff > 0)", and not "Assert(diff >= 0)". Attached is v15. I plan to commit the first two patches (the most substantial two patches by far) in the next couple of days, barring objectio

Re: should vacuum's first heap pass be read-only?

2022-03-31 Thread Peter Geoghegan
*never* vacuum certain indexes on tables prone to non-HOT updates, without that ever causing index bloat. But heap line pointer bloat is eventually going to become a real problem with non-HOT updates, no matter what. -- Peter Geoghegan

Re: should vacuum's first heap pass be read-only?

2022-03-31 Thread Peter Geoghegan
uming in those indexes that don't really need it, in order to be able to do much more in those that do. At some point we must "complete a whole cycle of heap vacuuming" by processing all the heap pages using lazy_vacuum_heap_page() that need it. Separately, the conveyor belt seems to have promise as a way of breaking up work for multiplexing, or parallel processing. -- Peter Geoghegan

Re: should vacuum's first heap pass be read-only?

2022-03-31 Thread Peter Geoghegan
hat we've previously added to the > conveyor belt plus maybe also new ones, we may as well just forget the > whole idea of having a conveyor belt at all. I definitely agree that that's bad, and would be the inevitable result of being lazy about deduplicating consistently. > The only way the conveyor belt system has any > value is if we think that there is some set of circumstances where the > heap scan is separated in time from the index vacuum, such that we > might sometimes do an index vacuum without having done a heap scan > just before. I agree. -- Peter Geoghegan

Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations

2022-03-31 Thread Peter Geoghegan
hat causing any new problems. > Can you repro the issue with my recipe? FWIW, adding log_min_messages=debug5 > and fsync=off made the crash trigger more quickly. I'll try to do that today. I'm not feeling the most energetic right now, to be honest. -- Peter Geoghegan

Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations

2022-03-31 Thread Peter Geoghegan
abort is pretty rare in the real world, I bet. The speculative insertion precheck is very likely to work almost always with real workloads. -- Peter Geoghegan

Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations

2022-03-31 Thread Peter Geoghegan
e beginning? And so the worst problem is probably just that we don't use aggressive VACUUM when we really should in rare cases? -- Peter Geoghegan

Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations

2022-03-31 Thread Peter Geoghegan
around amounts to allowing XIDs "from the future" to exist, which is dangerous. But why here? Won't pruning by VACUUM eventually correct the issue anyway? -- Peter Geoghegan

Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations

2022-03-30 Thread Peter Geoghegan
On Wed, Mar 30, 2022 at 9:29 PM Peter Geoghegan wrote: > > Perhaps we should just fetch the horizons from the "local" catalog for > > shared > > rels? > > Not sure what you mean. Wait, you mean use vacrel->relfrozenxid directly? Seems kind of ugly... -- Peter Geoghegan

Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations

2022-03-30 Thread Peter Geoghegan
the horizons from the "local" catalog for shared > rels? Not sure what you mean. -- Peter Geoghegan

Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations

2022-03-30 Thread Peter Geoghegan
cution, based on context? Does it look too high? Something else? -- Peter Geoghegan

Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations

2022-03-30 Thread Peter Geoghegan
On Wed, Mar 30, 2022 at 8:28 PM Andres Freund wrote: > I triggered twice now, but it took a while longer the second time. Great. I wonder if you can get an RR recording... -- Peter Geoghegan

Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations

2022-03-30 Thread Peter Geoghegan
On Wed, Mar 30, 2022 at 7:37 PM Peter Geoghegan wrote: > Yeah, a WARNING would be good here. I can write a new version of my > patch series with a separation patch for that this evening. Actually, > better make it a PANIC for now... Attached is v14, which includes a new patch that PA

Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations

2022-03-30 Thread Peter Geoghegan
() instead of an Assert()? Something has gone > pear shaped if we get here... It's a bit annoying though, because it'd have to > be a PANIC to be visible on the bf / CI :(. Yeah, a WARNING would be good here. I can write a new version of my patch series with a separation patch for that this evening. Actually, better make it a PANIC for now... -- Peter Geoghegan

Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations

2022-03-30 Thread Peter Geoghegan
On Wed, Mar 30, 2022 at 12:01 AM Peter Geoghegan wrote: > Perhaps something is amiss inside vac_update_relstats(), where the > boolean flag that indicates that pg_class.relfrozenxid was advanced is > set: > > if (frozenxid_updated) > *frozenxid_updated

Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations

2022-03-30 Thread Peter Geoghegan
sily when the regression tests are run. Perhaps I need to do something like that with the other assertion as well (or more likely just get rid of it). Will figure it out tomorrow. -- Peter Geoghegan

Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations

2022-03-29 Thread Peter Geoghegan
On Tue, Mar 29, 2022 at 11:58 AM Peter Geoghegan wrote: > > I think I understand what the first paragraph of the header comment > > for heap_tuple_needs_freeze() is trying to say, but the second one is > > quite confusing. I think this is again because it veers into talking

Re: [PATCH] Full support for index LP_DEAD hint bits on standby

2022-03-29 Thread Peter Geoghegan
s), we can expect those bugs to be hidden for a long time. We might never be 100% sure that we've fixed all of them if the initial design is not generally robust. Most patches are not like that. -- Peter Geoghegan

Re: Add parameter jit_warn_above_fraction

2022-03-29 Thread Peter Geoghegan
On Tue, Mar 29, 2022 at 3:04 PM Tom Lane wrote: > I think David's questions are sufficiently cogent and difficult > that we should not add jit_warn_above_fraction at this time. +1 -- Peter Geoghegan

Re: Add parameter jit_warn_above_fraction

2022-03-29 Thread Peter Geoghegan
pecific query*. That's completely different to any model that expects plan costs to be meaningful in an absolute sense. I'm not completely sure how much that difference matters, but I suspect that the answer is: "it depends, but often it matters a great deal". -- Peter Geoghegan

Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations

2022-03-29 Thread Peter Geoghegan
raparound autovacuums), but can escalate from there. The non-cancellable autovacuum behavior (technically an anti-wraparound thing, but really an aggressiveness thing) should be something we escalate to, as with the failsafe. Dynamic behavior works a lot better. And it makes scheduling of autovacuum workers a lot more straightforward -- the discontinuities seem to make that much harder, which is one more reason to avoid them altogether. -- Peter Geoghegan

Re: MDAM techniques and Index Skip Scan patch

2022-03-28 Thread Peter Geoghegan
at least need to be sure we all are using these terms > the same way. Yeah, there are *endless* opportunities for confusion here. -- Peter Geoghegan

Re: MDAM techniques and Index Skip Scan patch

2022-03-28 Thread Peter Geoghegan
though. I assume that it doesn't really appear in very simple cases (also common cases). But delaying the scan setup work until execution time does seem ugly. That's probably a good enough reason to refactor. -- Peter Geoghegan

Re: MDAM techniques and Index Skip Scan patch

2022-03-28 Thread Peter Geoghegan
ems like something we might actually want to aim for, rather than avoid. Teaching nbtree to transform quals into ranges sounds odd at first, but it seems like the right approach now, on balance -- that's the only *good* way to maintain index order. (Maintaining index order is needed to avoid needing or relying on deduplication in the executor proper, which is even inappropriate in an implementation of SELECT-DISTINCT-that-matches-an-index IMO.) -- Peter Geoghegan

Re: MDAM techniques and Index Skip Scan patch

2022-03-28 Thread Peter Geoghegan
's doable, but I wouldn't do it unless there was a pretty noticeable payoff. -- Peter Geoghegan

Re: [PATCH] Full support for index LP_DEAD hint bits on standby

2022-03-28 Thread Peter Geoghegan
On Mon, Mar 28, 2022 at 1:23 PM Peter Geoghegan wrote: > I doubt that the patch's use of pg_memory_barrier() in places like > _bt_killitems() is correct. I also doubt that posting list splits are handled correctly. If there is an LP_DEAD bit set on a posting list on the primary, and

Re: [PATCH] Full support for index LP_DEAD hint bits on standby

2022-03-28 Thread Peter Geoghegan
de paths, none of which are in access method code that reads from shared_buffers. So this is not a minor oversight. -- Peter Geoghegan

Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations

2022-03-27 Thread Peter Geoghegan
On Thu, Mar 24, 2022 at 2:40 PM Peter Geoghegan wrote: > > > This is absolutely mandatory in the aggressive case, because otherwise > > > relfrozenxid advancement might be seen as unsafe. My observation is: > > > Why should we accept the same race in the non-aggres

Re: Assert in pageinspect with NULL pages

2022-03-27 Thread Peter Geoghegan
On Sun, Mar 27, 2022 at 2:02 PM Robert Haas wrote: > On Sun, Mar 27, 2022 at 4:26 PM Peter Geoghegan wrote: > > We're not dealing > > with adversarial page images here. > > I think it's bad that we have to make that assumption, considering > that there

Re: Assert in pageinspect with NULL pages

2022-03-27 Thread Peter Geoghegan
fset has some wildly unreasonable value. I'm not volunteering. Just saying that this is quite possible. -- Peter Geoghegan

Re: A test for replay of regression tests

2022-03-24 Thread Peter Geoghegan
/postgr.es/m/20220120052404.sonrhq3f3qgplpzj%40alap3.anarazel.de Oh, yeah. If some other backend is holding back OldestXmin, and you can't find a way of dealing with that, then you'll need a temp table. (Mind you, that trick only works on recent versions too.) -- Peter Geoghegan

Re: A test for replay of regression tests

2022-03-24 Thread Peter Geoghegan
n't going to help either, unless you somehow also make sure that FreezeLimit is OldestXmin (e.g. by setting vacuum_freeze_min_age to 0). VACUUM FREEZE (without DISABLE_PAGE_SKIPPING) seems like it would do everything you want, without using a temp table. At least on the master branch. -- Peter Geoghegan

Re: Assert in pageinspect with NULL pages

2022-03-24 Thread Peter Geoghegan
that an 8KiB page is in fact an nbtree page, in a maximally paranoid way. Might be an example worth following here. -- Peter Geoghegan

Re: A test for replay of regression tests

2022-03-24 Thread Peter Geoghegan
the tests fail in exactly the same way you've had problems with on the buildfarm. On the first try, even. -- Peter Geoghegan

Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations

2022-03-24 Thread Peter Geoghegan
cause we already avoided the race in > > the aggressive case. > > I do see that there are some difficulties there. I'm not sure what to > do about that. I think a sufficiently clear commit message could > possibly be enough, rather than trying to split the patch. But I also > think splitting the patch should be considered, if that can reasonably > be done. I'll see if I can come up with something. It's hard to be sure about that kind of thing when you're this close to the code. -- Peter Geoghegan

Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations

2022-03-24 Thread Peter Geoghegan
he visibility map again later (unless we need to set a bit). It really doesn't matter if somebody else unsets a page's VM bit concurrently, at all. I see a lot of advantage to knowing our final scanned_pages almost immediately. Things like prefetching, capping the size of the dead_items array more intelligently (use final scanned_pages instead of rel_pages in dead_items_max_items()), improvements to progress reporting...not to mention more intelligent choices about whether we should try to advance relfrozenxid a bit earlier during non-aggressive VACUUMs. > Hope that's helpful. Very helpful -- thanks! -- Peter Geoghegan
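
The dead_items sizing idea mentioned above can be sketched roughly as follows. This is only a toy model under stated assumptions, not the vacuumlazy.c code: toy_dead_items_cap(), TOY_MAX_TUPLES_PER_PAGE and TOY_ITEM_SIZE are hypothetical names, and the constants are assumed values for 8KiB pages. It only shows how passing a final scanned_pages count (instead of rel_pages) would tighten the cap.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values for this sketch only (8KiB pages, 6-byte item pointers). */
#define TOY_MAX_TUPLES_PER_PAGE 291
#define TOY_ITEM_SIZE           6

/*
 * Hypothetical helper: cap the number of dead item slots by both the memory
 * budget and the number of pages that can actually contribute dead items.
 */
static uint64_t
toy_dead_items_cap(uint64_t mem_budget_bytes, uint64_t pages)
{
    uint64_t max_items = mem_budget_bytes / TOY_ITEM_SIZE;

    /* Compare via division so the multiplication below cannot overflow. */
    if (max_items / TOY_MAX_TUPLES_PER_PAGE > pages)
        max_items = pages * TOY_MAX_TUPLES_PER_PAGE;

    /* Always leave room for at least one page's worth of items. */
    if (max_items < TOY_MAX_TUPLES_PER_PAGE)
        max_items = TOY_MAX_TUPLES_PER_PAGE;

    return max_items;
}
```

With a 64MB budget, a table of 100000 pages is limited by memory alone, whereas knowing that only 1000 pages will be scanned shrinks the cap to 1000 pages' worth of items.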

Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations

2022-03-23 Thread Peter Geoghegan
making relfrozenxid advancement unsafe. It would be great if you could take a look at v11-0002-*, Robert. Does it make sense to you? Thanks -- Peter Geoghegan

Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations

2022-03-23 Thread Peter Geoghegan
ably 50 million XIDs behind OldestXmin, the vacuum_freeze_min_age default). I don't see much difference. Anyway, this isn't important. I'll just drop the third patch. -- Peter Geoghegan

Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations

2022-03-23 Thread Peter Geoghegan
eft behind by opportunistic pruning. We don't need a cleanup in either lazy_scan_noprune (a share lock is all we need), nor do we even need one in lazy_vacuum_heap_page (a regular exclusive lock is all we need). -- Peter Geoghegan

Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations

2022-03-23 Thread Peter Geoghegan
On Sun, Mar 13, 2022 at 9:05 PM Peter Geoghegan wrote: > Attached is v10. While this does still include the freezing patch, > it's not in scope for Postgres 15. As I've said, I still think that it > makes sense to maintain the patch series with the freezing stuff, > s

Re: Patch proposal - parameter to limit amount of FPW because of hint bits per second

2022-03-21 Thread Peter Geoghegan
ing solutions that focus on limiting the downside of not setting LP_DEAD bits, which is local information (not system wide information) that is much easier to understand and target in the implementation. -- Peter Geoghegan

Re: Probable CF bot degradation

2022-03-20 Thread Peter Geoghegan
a very simple method of making the same information more visible, that you could implement in only a few minutes. Perhaps that was optimistic. -- Peter Geoghegan

Re: Probable CF bot degradation

2022-03-20 Thread Peter Geoghegan
ed to test 4 branches at once and to try > to test every branch every 24 hours. Let's see how that goes. Extravagance! -- Peter Geoghegan

Re: Patch proposal - parameter to limit amount of FPW because of hint bits per second

2022-03-20 Thread Peter Geoghegan
y index tuples as a result of avoiding an FPI. This second idea is also much more general than simply avoiding FPIs in general. -- Peter Geoghegan

Hardening heap pruning code (was: BUG #17255: Server crashes in index_delete_sort_cmp() due to race condition with vacuum)

2022-03-19 Thread Peter Geoghegan
On Wed, Mar 9, 2022 at 4:46 PM Andres Freund wrote: > On 2022-03-03 19:31:32 -0800, Peter Geoghegan wrote: > > Attached is a new revision of my fix. This is more or less a > > combination of my v4 fix from November 12 [1] and Andres' > > already-committed fix (commit 18

Re: ICU for global collation

2022-03-17 Thread Peter Geoghegan
On Thu, Mar 17, 2022 at 6:15 AM Peter Eisentraut wrote: > committed, thanks Glad that this finally happened. Thanks to everybody involved! -- Peter Geoghegan

Re: do only critical work during single-user vacuum?

2022-03-15 Thread Peter Geoghegan
ystem to reach xidStopLimit due to the target rel's relfrozenxid age crossing the crucial xidStopLimit crossover point. This patch makes this problem scenario virtually impossible. Right now I'm only prepared to say it's very unlikely. I don't see a reason to take any chances, tho

Re: Fix uninitialized variable access (src/backend/utils/mmgr/freepage.c)

2022-03-14 Thread Peter Geoghegan
g palloc0() rather than palloc()) makes sense as a defensive measure. It depends on the specific code, of course. -- Peter Geoghegan

Re: Lowering the ever-growing heap->pd_lower

2022-03-14 Thread Peter Geoghegan
time to further review this patch and/or commit it? I'll definitely review it some more before too long. -- Peter Geoghegan

Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations

2022-03-13 Thread Peter Geoghegan
On Fri, Feb 25, 2022 at 5:52 PM Peter Geoghegan wrote: > There is an important practical way in which it makes sense to treat > 0001 as separate to 0002. It is true that 0001 is independently quite > useful. In practical terms, I'd be quite happy to just get 0001 into > Postgres

Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations

2022-03-01 Thread Peter Geoghegan
ssions are quite possible, and a real concern -- but regressions *like that* are unlikely. Avoiding doing what is clearly the wrong thing just seems to work out that way, in general. -- Peter Geoghegan

Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations

2022-02-25 Thread Peter Geoghegan
> It might make sense to separate the purposes of SKIP_PAGES_THRESHOLD. The > relfrozenxid advancement doesn't benefit from visiting all-frozen pages, just > because there are only 30 of them in a row. Right. I imagine that SKIP_PAGES_THRESHOLD actually does help with this, but if we actually tried we'd find a much better way. > I wish somebody would tackle merging heap_page_prune() with > vacuuming. Primarily so we only do a single WAL record. But also because the > separation has caused a *lot* of complexity. I've already more projects than > I should, otherwise I'd start on it... That has value, but it doesn't feel as urgent. -- Peter Geoghegan

Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations

2022-02-25 Thread Peter Geoghegan
is this: in general, there are probably quite a few opportunities for FreezeMultiXactId() to avoid allocating new XMIDs (just to freeze XIDs) by having the full context. And maybe by making the dialog between lazy_scan_prune and heap_prepare_freeze_tuple a bit more nuanced. -- Peter Geoghegan

Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations

2022-02-25 Thread Peter Geoghegan
On Fri, Feb 25, 2022 at 2:00 PM Peter Geoghegan wrote: > > Hm. I guess I'll have to look at the code for it. It doesn't immediately > > "feel" quite right. > > I kinda think it might be. Please let me know if you see a problem > with what I've sai

Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations

2022-02-25 Thread Peter Geoghegan
_FROZEN(vacrel->rel, blkno, &vmbuffer)) vacrel->frozenskipped_pages++; continue; } The fact that this is conditioned in part on "vacrel->aggressive" concerns me here. Why should we have a special case for this, where we condition something on aggressive-ness that isn't actually strictly related to that? Why not just remember that the range that we're skipping was all-frozen up-front? That way non-aggressive VACUUMs are not unnecessarily at a disadvantage, when it comes to being able to advance relfrozenxid. What if we end up not incrementing vacrel->frozenskipped_pages when we easily could have, just because this is a non-aggressive VACUUM? I think that it's worth avoiding stuff like that whenever possible. Maybe this particular example isn't the most important one. For example, it probably isn't as bad as the one that was fixed by the lazy_scan_noprune work. But why even take a chance? Seems easier to remove the special case -- which is what this really is. > FWIW, I'd really like to get rid of SKIP_PAGES_THRESHOLD. It often ends up > causing a lot of time doing IO that we never need, completely trashing all CPU > caches, while not actually causing decent readahead IO from what I've seen. I am also suspicious of SKIP_PAGES_THRESHOLD. But if we want to get rid of it, we'll need to be sensitive to how that affects relfrozenxid advancement in non-aggressive VACUUMs IMV. Thanks again for the review! -- Peter Geoghegan
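
To make the "remember that the skipped range was all-frozen" suggestion concrete, here is a minimal sketch. It is a toy model and not the patch under discussion: ToyVacrel, the TOY_* flag bits, and toy_skip_range() are hypothetical names, and the visibility map is modeled as a plain byte array. The only point is that the counter is driven by the all-frozen bit itself, with no reference to the aggressive flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-page visibility map bits for this toy model. */
#define TOY_ALL_VISIBLE 0x01
#define TOY_ALL_FROZEN  0x02

typedef struct ToyVacrel
{
    bool     aggressive;          /* deliberately unused by toy_skip_range() */
    unsigned frozenskipped_pages; /* skipped pages known to be all-frozen */
} ToyVacrel;

/*
 * Skip the pages in [start, end), counting every all-frozen page in the
 * range, whether or not the VACUUM is aggressive.
 */
static void
toy_skip_range(ToyVacrel *vacrel, const unsigned char *vm,
               size_t start, size_t end)
{
    for (size_t blkno = start; blkno < end; blkno++)
    {
        if (vm[blkno] & TOY_ALL_FROZEN)
            vacrel->frozenskipped_pages++;
    }
}
```

In this model a non-aggressive VACUUM ends up with the same frozenskipped_pages as an aggressive one over the same range, which is the property being asked for.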

Re: should vacuum's first heap pass be read-only?

2022-02-25 Thread Peter Geoghegan
s themselves are similar. We want options and maximum flexibility, everywhere. > but if you are going to rescan the heap > again next time before doing any index vacuuming then why we want to > store them anyway. It all depends, of course. The decision needs to be made using a cost model. I suspect it will be necessary to try it out, and see. -- Peter Geoghegan

Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations

2022-02-24 Thread Peter Geoghegan
On Sun, Feb 20, 2022 at 12:27 PM Peter Geoghegan wrote: > You've given me a lot of high quality feedback on all of this, which > I'll work through soon. It's hard to get the balance right here, but > it's made much easier by this kind of feedback. Attached is v9

Re: [PATCH] add relation and block-level filtering to pg_waldump

2022-02-24 Thread Peter Geoghegan
rep` as well as more reliable given specific variations > in output style > depending on how the blocks are specified. Sounds useful to me. -- Peter Geoghegan

Re: Add index scan progress to pg_stat_progress_vacuum

2022-02-23 Thread Peter Geoghegan
ort for a table where the failsafe kicked in). -- Peter Geoghegan

Re: small development tip: Consider using the gold linker

2022-02-22 Thread Peter Geoghegan
slow the last few times I used it, which wasn't for long enough for it to really matter. This must have been why. I might have to rescind my recommendation of lld. -- Peter Geoghegan

Re: do only critical work during single-user vacuum?

2022-02-20 Thread Peter Geoghegan
sed on averting MultiXact wraparound? I'm hoping that the patch that adds smarter tracking of final relfrozenxid/relminmxid values during VACUUM makes this less of a problem automatically. -- Peter Geoghegan

Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations

2022-02-20 Thread Peter Geoghegan
/postgr.es/m/CAH2-Wz=ilnf+0csab37efxcgmrjo1dyjw5hmzm7tp1axg1n...@mail.gmail.com -- scroll down to "TPC-C", which has the relevant autovacuum log output for the orders table, covering a 24 hour period -- Peter Geoghegan

Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations

2022-02-19 Thread Peter Geoghegan
with or just after) some xact that inserted or updated on the page aborts. Just as long as we have a consistent idea about what's going on at the level of the whole page (or maybe the level of each HOT chain, but the whole page level seems simpler to me). -- Peter Geoghegan

Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations

2022-02-19 Thread Peter Geoghegan
st, HTSV ain't cheap. I guess it doesn't actually matter if we leave an aborted DEAD tuple behind, that we could have pruned away, but didn't. The important thing is to be consistent at the level of the page. -- Peter Geoghegan

Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations

2022-02-19 Thread Peter Geoghegan
of root items is just arbitrary. It seems to have more to do with freezing tuples than killing tuples. -- Peter Geoghegan

Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations

2022-02-19 Thread Peter Geoghegan
On Sat, Feb 19, 2022 at 6:16 PM Peter Geoghegan wrote: > > Given that heap_surgery's raison d'etre is correcting corruption etc, I > > think > > it makes sense for it to do as minimal work as possible. Iterating through a > > HOT chain would be a problem if yo

Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations

2022-02-19 Thread Peter Geoghegan
ted() have a different return value, preventing the > "If the tuple is DEAD and doesn't chain to anything else" > path from being taken. That makes sense as an explanation. Goes to show just how fragile the "DEAD and doesn't chain to anything else" logic at the top of heap_prune_chain really is. -- Peter Geoghegan

Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations

2022-02-19 Thread Peter Geoghegan
EAD before we even reached heap_page_prune() (on account of the pg_surgery corruption), there is no possible way that that can happen later on. And so we cannot find the same heap-only tuple and mark it LP_UNUSED (which is how we always deal with HEAPTUPLE_DEAD heap-only tuples) during pruning. -- Peter Geoghegan

Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations

2022-02-19 Thread Peter Geoghegan
hich is kind of misleading). Anyway, we can decide on what to do in heap_surgery later, once the main issue is under control. My point was mostly just that orphaned heap-only tuples are definitely not okay, in general. They are the least worst option when corruption has already happened, maybe -- but maybe not. -- Peter Geoghegan [Attachment: corrupt-hot-chain.page.gz (GNU Zip compressed data)]

Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations

2022-02-19 Thread Peter Geoghegan
On Sat, Feb 19, 2022 at 4:22 PM Peter Geoghegan wrote: > This very much looks like a bug in pg_surgery itself now -- attached > is a draft fix. Wait, that's not it either. I jumped the gun -- this isn't sufficient (though the patch I posted might not be a bad idea anyway). Looks

Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations

2022-02-19 Thread Peter Geoghegan
On Sat, Feb 19, 2022 at 3:08 PM Peter Geoghegan wrote: > It's quite possible that this is nothing more than a bug in my > adversarial gizmo patch -- since I don't think that > ConditionalLockBufferForCleanup() can ever fail with a temp buffer > (though even that's not

Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations

2022-02-19 Thread Peter Geoghegan
On Fri, Feb 18, 2022 at 5:00 PM Peter Geoghegan wrote: > Another testing strategy occurs to me: we could stress-test the > implementation by simulating an environment where the no-cleanup-lock > path is hit an unusually large number of times, possibly a fixed > percentage of the time

Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations

2022-02-18 Thread Peter Geoghegan
ossible), I wouldn't expect this ConditionalLockBufferForCleanup() testing gizmo to be too disruptive. -- Peter Geoghegan

Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations

2022-02-18 Thread Peter Geoghegan
r pgbench_branches), because they're generally big tables, where the overhead of FPIs tends to dominate anyway (gambling that we can avoid more FPIs later on is not a bad gamble, as gambles go). This seems to make the overhead acceptable, on balance. Granted, you might be able to poke holes in that argument, and reasonable people might disagree on what acceptable should mean. There are many value judgements here, which makes it complicated. (On the other hand we might be able to do better if there was a particularly bad case for the 0002 work, if one came to light.) -- Peter Geoghegan

Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations

2022-02-18 Thread Peter Geoghegan
hout benefit in certain cases. Although I think that this can be justified as the cost of doing business, that's a hard argument to make. In short, 0001 is mechanically tricky, but easy to understand at a high level. Whereas 0002 is mechanically simple, but tricky to understand at a high level (and therefore far trickier than 0001 overall). -- Peter Geoghegan

Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations

2022-02-18 Thread Peter Geoghegan
On Fri, Feb 11, 2022 at 8:30 PM Peter Geoghegan wrote: > Attached is v8. No real changes -- just a rebased version. Concerns about my general approach to this project (and even the Postgres 14 VACUUM work) were expressed by Robert and Andres over on the "Nonrandom scanned_pages

Re: Nonrandom scanned_pages distorts pg_class.reltuples set by VACUUM

2022-02-17 Thread Peter Geoghegan
as > refactoring work"), and then evolved into something not just a refactoring. Of course. > If helpful I can give a go at showing how I think it could be split up. Or > perhaps more productively, do that on a not-yet-committed larger patch. Any help is appreciated. -- Peter Geoghegan
