nbtree's ScalarArrayOp array mark/restore code appears to be buggy

2023-09-22 Thread Peter Geoghegan
current scan position and the current array keys a great deal looser. [1] https://commitfest.postgresql.org/44/4455/ -- Peter Geoghegan nbtree_array_mark_restore_bug.sql Description: Binary data

Re: nbtree's ScalarArrayOp array mark/restore code appears to be buggy

2023-09-23 Thread Peter Geoghegan
On Fri, Sep 22, 2023 at 8:17 PM Peter Geoghegan wrote: > My suspicion is that bugfix commit 70bc5833 missed some subtlety > around what we need to do to make sure that the array keys stay "in > sync" with the scan. I'll have time to debug the problem some more > tomorro

Re: nbtree's ScalarArrayOp array mark/restore code appears to be buggy

2023-09-23 Thread Peter Geoghegan
On Sat, Sep 23, 2023 at 11:47 AM Peter Geoghegan wrote: > The fix for this should be fairly straightforward. We must teach > _bt_restore_array_keys() to distinguish "past the end of the array" > from "after the start of the array", so that doesn't sp

Re: nbtree's ScalarArrayOp array mark/restore code appears to be buggy

2023-09-25 Thread Peter Geoghegan
On Sat, Sep 23, 2023 at 4:22 PM Peter Geoghegan wrote: > Attached draft patch shows how this could work. > > _bt_restore_array_keys() has comments that seem to suppose that > calling _bt_preprocess_keys is fairly expensive, and something that's > well worth avoiding. But...is

Re: Eager page freeze criteria clarification

2023-09-25 Thread Peter Geoghegan
reeze debt on this thread. If 90% of the pages in the entire database are frozen, it'll generally be okay if we make the wrong call by freezing lazily when we shouldn't. This is doubly true within small to medium sized tables, where the cost of catching up on freezing cannot ever be too bad (concentrations of unfrozen pages in one big table are what really hurt users). -- Peter Geoghegan

Re: Eager page freeze criteria clarification

2023-09-26 Thread Peter Geoghegan
so think that the absolute amount of debt (measured in physical units such as unfrozen pages) should be kept under control. But that isn't something that can ever be expected to work on the basis of a simple threshold -- if only because autovacuum scheduling just doesn't work that way, and can't really be adapted to work that way. -- Peter Geoghegan

Re: Eager page freeze criteria clarification

2023-09-27 Thread Peter Geoghegan
On Wed, Sep 27, 2023 at 10:01 AM Andres Freund wrote: > On 2023-09-26 09:07:13 -0700, Peter Geoghegan wrote: > I don't think doing this on a system wide basis with a metric like #unfrozen > pages is a good idea. It's quite common to have short lived data in some > tables

Re: Eager page freeze criteria clarification

2023-09-27 Thread Peter Geoghegan
> median 64bit xid would be interesting because it'd not get "invalidated" if > relfrozenxid is increased. I'm glad that you're mostly of the view that we should be freezing a lot more aggressively overall, but I think that you're still too focussed on avoiding small problems. I understand why novel new problems are generally more of a concern than established old problems, but there needs to be a sense of proportion. Performance stability is incredibly important, and isn't zero cost. -- Peter Geoghegan

Re: Eager page freeze criteria clarification

2023-09-27 Thread Peter Geoghegan
where I'd expect Melanie's current approach of optimizing a whole cross-section of representative workloads to really help with. I have a separate concern about it that I'll raise shortly, in my response to Andres. -- Peter Geoghegan

Re: Eager page freeze criteria clarification

2023-09-27 Thread Peter Geoghegan
d, when your algorithm decides to not freeze (for pages that are still set all-visible in the VM), you really can't afford to be wrong. (I think that you get this already, but it's a point worth emphasizing.) -- Peter Geoghegan

Re: Eager page freeze criteria clarification

2023-09-27 Thread Peter Geoghegan
ven with this behavior of scanning all-visible pages in non-aggressive VACUUMs? Big append-only tables simply won't get the opportunity to catch up in the next non-aggressive VACUUM if there simply isn't one. -- Peter Geoghegan

Re: Eager page freeze criteria clarification

2023-09-27 Thread Peter Geoghegan
On Wed, Sep 27, 2023 at 1:45 PM Andres Freund wrote: > On 2023-09-27 13:14:41 -0700, Peter Geoghegan wrote: > > As a general rule, I think that we're better off gambling against > > future FPIs, and then pulling back if we go too far. The fact that we > > went one VA

Re: Eager page freeze criteria clarification

2023-09-27 Thread Peter Geoghegan
On Wed, Sep 27, 2023 at 2:26 PM Peter Geoghegan wrote: > On Wed, Sep 27, 2023 at 1:45 PM Andres Freund wrote: > > I think we need to make vacuums on large tables much more aggressive than > > they > > are now, independent of opportunistic freezing heuristics. It's

Re: Eager page freeze criteria clarification

2023-09-27 Thread Peter Geoghegan
The choice to freeze or not freeze pretty much always relies on guesswork about what'll happen to the page in the future, no? Obviously we wouldn't even apply the FPI trigger criteria if we could somehow easily determine that it won't work out (to some degree that's what conditioning it on being able to set the all-frozen VM bit actually does). -- Peter Geoghegan

Re: Eager page freeze criteria clarification

2023-09-27 Thread Peter Geoghegan
require it" That makes sense when you consider where we are right now, but it'll sound odd in a world where freezing via min_freeze_age is the exception rather than the rule. If anything, it would make more sense if the traditional min_freeze_age trigger criteria was the type of freezing that needed its own adjective. -- Peter Geoghegan

Re: Eager page freeze criteria clarification

2023-09-27 Thread Peter Geoghegan
ssues. Of course it's very subtle. > I think at the very least there'd need to be something causing pages to reopen > once the aggregate unused space in the table reaches some threshold. Of course that's true. ISTM that you might well need some kind of hysteresis to avoid pages ping-ponging. If it isn't sticky, it might never settle, or take way too long to settle. -- Peter Geoghegan
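
The hysteresis idea mentioned above can be illustrated with a minimal sketch. Everything here is hypothetical: `PageGate`, `page_gate_update()`, and the two thresholds are illustrative names and are not taken from PostgreSQL or from any patch in this thread. The sketch only shows why a sticky state with separate "reopen" and "close" thresholds avoids ping-ponging.

```c
#include <stdbool.h>

/*
 * Hypothetical state for one "closed" region of a table. The names and
 * thresholds are assumptions made for illustration only.
 */
typedef struct
{
    bool   open;          /* may new tuples be placed here? */
    double reopen_above;  /* reopen once unused space exceeds this fraction */
    double close_below;   /* close again once unused space drops below this */
} PageGate;

/*
 * Update the gate using the current unused-space fraction (0.0 to 1.0).
 * Because reopen_above > close_below, values between the two thresholds
 * leave the state unchanged, so the gate is sticky rather than flapping.
 */
bool
page_gate_update(PageGate *g, double unused_fraction)
{
    if (g->open)
    {
        if (unused_fraction < g->close_below)
            g->open = false;
    }
    else
    {
        if (unused_fraction > g->reopen_above)
            g->open = true;
    }
    return g->open;
}
```

With a single threshold, a value hovering near it would flip the state on every call. The gap between the two thresholds is what lets the state settle.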

Re: Eager page freeze criteria clarification

2023-09-27 Thread Peter Geoghegan
On Wed, Sep 27, 2023 at 5:42 PM Melanie Plageman wrote: > On Wed, Sep 27, 2023 at 5:27 PM Peter Geoghegan wrote: > > What about my idea of holding back when some tuples are already frozen > > from before? Admittedly that's still a fairly raw idea, but something > >

Re: Index range search optimization

2023-09-27 Thread Peter Geoghegan
On Wed, Sep 27, 2023 at 9:41 AM Alexander Korotkov wrote: > Fixed typo inficating => indicating as pointed by Pavel. > Peter, what do you think about the current shape of the patch? I'll try to get to this tomorrow. I'm rather busy with moving home at the moment, unfortu

Re: Eager page freeze criteria clarification

2023-09-27 Thread Peter Geoghegan
t to avoid freezing the same pages again in a tight loop? On a positive note, I like that what you've laid out freezes eagerly when an FPI won't result -- this much we can all agree on. I guess that that part is becoming uncontroversial. -- Peter Geoghegan

Re: Optimizing nbtree ScalarArrayOp execution, allowing multi-column ordered scans, skip scan

2023-09-28 Thread Peter Geoghegan
On Sun, Sep 17, 2023 at 4:47 PM Peter Geoghegan wrote: > Attached is v2, which makes all array key advancement take place using > the "next index tuple" approach (using binary searches to find array > keys using index tuple values). Attached is v3, which fixes bitrot cause

Re: Index range search optimization

2023-09-28 Thread Peter Geoghegan
age() without having reached the point in _bt_first() where you initialize so->firstPage to "true". It would probably make sense if the flag was initialized to "false" in the same way as most other scan state is already, somewhere in nbtree.c. Probably in btrescan(). -- Peter Geoghegan

Re: Eager page freeze criteria clarification

2023-09-29 Thread Peter Geoghegan
On Fri, Sep 29, 2023 at 7:55 AM Robert Haas wrote: > On Thu, Sep 28, 2023 at 12:03 AM Peter Geoghegan wrote: > > But isn't the main problem *not* freezing when we could and > > should have? (Of course the cost of freezing is very relevant, but > > it's still seco

Re: [DOCS] HOT - correct claim about indexes not referencing old line pointers

2023-09-29 Thread Peter Geoghegan
e really important point is that the TID (which maps to the root item of the HOT chain) has a decent chance of being stable over time, no matter how many versions the HOT chain churns through. And that that can break (or at least weaken) our dependence on VACUUM with some workloads. -- Peter Geoghegan

Re: [DOCS] HOT - correct claim about indexes not referencing old line pointers

2023-09-29 Thread Peter Geoghegan
On Fri, Sep 29, 2023 at 11:04 AM Peter Geoghegan wrote: > > But when a HOT update happens the entry in an (logically unchanged) > > index still points to the original heap tid, and that line item is > > updated with a pointer to the new line pointer in the same page. >

Re: [DOCS] HOT - correct claim about indexes not referencing old line pointers

2023-09-29 Thread Peter Geoghegan
rforming "valid transitions" for each tuple/line pointer). That is, we don't really care about the difference between calling ItemIdSetRedirect() for an LP_NORMAL item versus an existing LP_REDIRECT item at the code level (we just do it and let PageRepairFragmentation() clean things up). -- Peter Geoghegan

Re: Eager page freeze criteria clarification

2023-09-29 Thread Peter Geoghegan
I'm skeptical of varying the LSN distance, but I'm not skeptical of the idea of caring about FPIs in general. I wonder how much truly useful work VACUUM performed for pgbench_accounts during Melanie's performance evaluation -- leaving freezing aside. For the "too much freezing for pgbench_accounts" case, where master performed better than the patch, would it have been possible to do even better than that by simply turning off autovacuum? Or at least increasing the scale factor that triggers autovacuuming? (The answer will depend to some extent on heap fill factor.) -- Peter Geoghegan

Re: Eager page freeze criteria clarification

2023-09-29 Thread Peter Geoghegan
il, the lesson offered by pgbench_accounts table seems to be "never VACUUM at all, except perhaps to advance relfrozenxid" (which shouldn't actually require any freezing of even one page). If you haven't tuned heap fill factor, then you might want to VACUUM a bit, at first. But, overall, vacuuming is bad. That is the logical though absurd conclusion. It completely flies in the face of practical experience. -- Peter Geoghegan

Re: [DOCS] HOT - correct claim about indexes not referencing old line pointers

2023-09-29 Thread Peter Geoghegan
On Fri, Sep 29, 2023 at 6:27 PM James Coleman wrote: > On Fri, Sep 29, 2023 at 4:06 PM Peter Geoghegan wrote: > > I think that it's talking about what happens during opportunistic > > pruning, in particular what happens to HOT chains. (Though pruning > > does almost

Re: pgstatindex vs. !indisready

2023-10-01 Thread Peter Geoghegan
ught that this behavior was ideal, but ISTM that it has fewer problems than any alternative approach you can think of. The same argument works just as well with any function that accepts a regclass argument IMV. -- Peter Geoghegan

Re: [PATCH] Clarify the behavior of the system when approaching XID wraparound

2023-10-01 Thread Peter Geoghegan
on invalidated anything. I meant to follow-up on investigating the extent to which anything could hold up OldestMXact without also holding up OldestXmin/removable cutoff, but that doesn't seem essential. This patch does indeed seem "ready for committer". John? -- Peter Geoghegan

Re: pgstatindex vs. !indisready

2023-10-01 Thread Peter Geoghegan
very doesn't seem sensible. After all, the answer that RecoveryInProgress() gives can change in a way that's observable within individual transactions. Again, I wouldn't claim that this is very elegant. Just that it seems to have the fewest problems. -- Peter Geoghegan

Re: Eager page freeze criteria clarification

2023-10-02 Thread Peter Geoghegan
measure than the utility of pruning, but why should we assume that pruning isn't already just as much of a problem? (Maybe that's not a problem that particularly interests you right now; I'm bringing it up because it seems possible that putting it in scope could somehow clarify what to do about freezing.) -- Peter Geoghegan

Re: [PATCH] Clarify the behavior of the system when approaching XID wraparound

2023-10-04 Thread Peter Geoghegan
ver as committer here, I'll let the issue of backpatching go. I only ask that you note why you've not backpatched in the commit message. -- Peter Geoghegan

Re: Eager page freeze criteria clarification

2023-10-04 Thread Peter Geoghegan
On Mon, Oct 2, 2023 at 4:25 PM Robert Haas wrote: > On Mon, Oct 2, 2023 at 11:37 AM Peter Geoghegan wrote: > > If no vacuuming against pgbench_accounts is strictly better than some > > vacuuming (unless it's just to advance relfrozenxid, which can't be > > a

Re: post-recovery amcheck expectations

2023-10-09 Thread Peter Geoghegan
ed, since it requires that we worry about both phases of page deletion -- not just the first. That in itself necessitates that we deal with various edge cases. (The really prominent edge-case is the interrupted page deletion case, which requires significant handling, but evidently missed a subtlety with leftmost pages). -- Peter Geoghegan

Re: interval_ops shall stop using btequalimage (deduplication)

2023-10-10 Thread Peter Geoghegan
t representation/output -- the details beyond that shouldn't matter. I was happy with how easy it was to make this assertion fail (with a known broken numeric_ops opclass) while testing/developing deduplication. I'm a little surprised that it took this long to notice the interval_ops issue. Do we really need to change the catalog contents when backpatching? -- Peter Geoghegan

Re: interval_ops shall stop using btequalimage (deduplication)

2023-10-10 Thread Peter Geoghegan
ghted is exactly the kind of issue that I anticipated might happen at some point. This seems straightforward. -- Peter Geoghegan

Re: interval_ops shall stop using btequalimage (deduplication)

2023-10-10 Thread Peter Geoghegan
On Tue, Oct 10, 2023 at 8:51 PM Peter Geoghegan wrote: > I don't see any reason to delay committing your fix. The issue that > you've highlighted is exactly the kind of issue that I anticipated > might happen at some point. This seems straightforward. BTW, we don'

Re: interval_ops shall stop using btequalimage (deduplication)

2023-10-11 Thread Peter Geoghegan
ted indexes using SQL. So this is one case where telling users to REINDEX really does seem like the best thing (as opposed to something we say because we're too lazy to come up with nuanced, practical guidance). -- Peter Geoghegan

Re: [PATCH] Clarify the behavior of the system when approaching XID wraparound

2023-10-12 Thread Peter Geoghegan
tation changes that are directly dependent on > those message changes. And I might also be inclined to back-patch the > former patch as far as it makes sense to do so, while leaving the > latter one master-only. No objections from me. -- Peter Geoghegan

Re: [PATCH] Clarify the behavior of the system when approaching XID wraparound

2023-10-12 Thread Peter Geoghegan
On Thu, Oct 12, 2023 at 1:10 PM Robert Haas wrote: > On Thu, Oct 12, 2023 at 12:01 PM Peter Geoghegan wrote: > > No objections from me. > > Here is a doc-only patch that I think could be back-patched as far as > emergency mode exists. It combines all of the wording changes to t

Re: interval_ops shall stop using btequalimage (deduplication)

2023-10-12 Thread Peter Geoghegan
context, without really adding a special case, and without any real > > question of users being misled. > > Works for me. Added. Looks good. Thanks! -- Peter Geoghegan

Re: Improve search for missing parent downlinks in amcheck

2020-03-10 Thread Peter Geoghegan
ementation detail of amcheck, but that doesn't apply here. > Thank you. I'd like to have another feedback from you assuming there > are logic changes. This looks committable. I only noticed one thing: The comments above bt_target_page_check() need to be updated to reflect the new

Re: Improve search for missing parent downlinks in amcheck

2020-03-11 Thread Peter Geoghegan
On Wed, Mar 11, 2020 at 2:02 AM Alexander Korotkov wrote: > Thank you! Pushed with this comment revised! Thanks! -- Peter Geoghegan

Re: add types to index storage params on doc

2020-03-15 Thread Peter Geoghegan
On Sun, Mar 15, 2020 at 7:10 PM Atsushi Torikoshi wrote: > I think it'll be better to add types to storage parameters > on CREATE INDEX for the consistency. Seems reasonable to me. -- Peter Geoghegan

nbtree: Refactor "fastpath" and _bt_search() code

2020-03-16 Thread Peter Geoghegan
ence point for other code that sets up the fastpath optimization that will now actually be used in _bt_search_insert(). -- Peter Geoghegan v1-0001-Refactor-_bt_doinsert-fastpath-optimization.patch Description: Binary data

Re: [PATCH] Btree BackwardScan race condition on Standby during VACUUM

2020-03-17 Thread Peter Geoghegan
of B-Tree indexes over many hours while BenchmarkSQL/TPC-C [1] ran, for example. [1] https://github.com/petergeoghegan/benchmarksql -- Peter Geoghegan

Re: [PATCH] Btree BackwardScan race condition on Standby during VACUUM

2020-03-17 Thread Peter Geoghegan
_xlog_unlink_page() was the problem. -- Peter Geoghegan

Re: nbtree: assertion failure in _bt_killitems() for posting tuple

2020-03-19 Thread Peter Geoghegan
{bi_hi = 0, bi_lo = 2}, ip_posid = 200}, > indexOffset = 121, tupleOffset = 32639} > > > Unless I miss something, this assertion must be removed. Is this index an unlogged index, under the hood? -- Peter Geoghegan

Re: nbtree: assertion failure in _bt_killitems() for posting tuple

2020-03-19 Thread Peter Geoghegan
he page is changed at all -- the LSN is checked by the logic added by commit 2ed5b87f. That's why I asked about unlogged indexes (we don't do the LSN thing there). But I still think that we need to take a firm position on it. -- Peter Geoghegan

Re: Why does [auto-]vacuum delay not report a wait event?

2020-03-21 Thread Peter Geoghegan
> On a green field I'd really like to pass a 'vacuum state' struct to > vacuum_delay_point(). In a green field situation, there'd be no ginInsertCleanup() at all. It is a Lovecraftian horror show. The entire thing should be scrapped now, in fact. -- Peter Geoghegan

Re: Why does [auto-]vacuum delay not report a wait event?

2020-03-21 Thread Peter Geoghegan
VACUUM or not is surely relevant, or at least relevant to the issue that Mahendra just reported. shiftList() relies on this directly already. -- Peter Geoghegan

Re: [PATCH] Btree BackwardScan race condition on Standby during VACUUM

2020-03-27 Thread Peter Geoghegan
could hardly be more conservative (though see the code and comments at the end of btree_xlog_split(), which mention locking and backwards scans directly). -- Peter Geoghegan

Re: [PATCH] Btree BackwardScan race condition on Standby during VACUUM

2020-03-27 Thread Peter Geoghegan
an affected (LP_DEAD bits set) leaf page. Again, I suspect that the problem is more likely to occur on Postgres 12 in practice because page deletion is more likely to occur on that version. IOW, due to my B-Tree work for Postgres 12: commit dd299df8, and related commits. That's probably all that there is to it. -- Peter Geoghegan

Minor bug in suffix truncation of non-key attributes from INCLUDE indexes

2020-03-28 Thread Peter Geoghegan
have enough duplicates to make nbtsplitloc.c ever use its "single value" strategy). Attached patch fixes the issue. Barring objections, I'll push this to v12 + master branches early next week. The bug is low severity, but then the fix is very low risk. -- Peter Geoghegan v1-0001-Consi

Re: Improving connection scalability: GetSnapshotData()

2020-03-28 Thread Peter Geoghegan
leneck is a *huge* problem for us. (As problems for Postgres users go, I would probably rank it second behind issues with VACUUM.) -- Peter Geoghegan

Re: Potential (low likelihood) wraparound hazard in heap_abort_speculative()

2020-03-29 Thread Peter Geoghegan
rent to RecentXmin. I am in favor of fixing the issue, and backpatching all the way. I just want to put the issue in perspective, and have my own understanding of things verified. -- Peter Geoghegan

Re: snapshot too old issues, first around wraparound and then more.

2020-04-01 Thread Peter Geoghegan
He was jet lagged from travelling to India at the time. He went to huge lengths to make sure that the bug was correctly squashed. > Actually removing the code is unnecessary, protects > nobody, and has risk. Every possible approach has risk. We are deciding among several unpleasant and risky alternatives here, no? -- Peter Geoghegan

Re: snapshot too old issues, first around wraparound and then more.

2020-04-01 Thread Peter Geoghegan
t would certainly be welcome from my perspective. I had a few other things that I was going to work on this week, but those seem less urgent. I'll take a look into it, and report back what I find. -- Peter Geoghegan

Re: snapshot too old issues, first around wraparound and then more.

2020-04-01 Thread Peter Geoghegan
have to wait around for a minute or two to reproduce it each time. Makes it hard to get to a minimal test case. -- Peter Geoghegan

Re: snapshot too old issues, first around wraparound and then more.

2020-04-01 Thread Peter Geoghegan
On Wed, Apr 1, 2020 at 3:00 PM Peter Geoghegan wrote: > I like that idea. I think that I've spotted what may be an independent > bug, but I have to wait around for a minute or two to reproduce it > each time. Makes it hard to get to a minimal test case. I now have simple steps to r

Re: snapshot too old issues, first around wraparound and then more.

2020-04-01 Thread Peter Geoghegan
antum? That made reproducing the bug *very* tedious. -- Peter Geoghegan

Re: snapshot too old issues, first around wraparound and then more.

2020-04-01 Thread Peter Geoghegan
ages? We only ever call PredicateLockPage() on a leaf nbtree page. Why the inconsistency between the two similar-seeming cases? -- Peter Geoghegan

Re: pg_stat_statements issue with parallel maintenance (Was Re: WAL usage calculation patch)

2020-04-01 Thread Peter Geoghegan
ces, overall. Can Dilip demonstrate that the "extra" buffer accesses are proportionate to the number of workers launched in some constant, predictable way? -- Peter Geoghegan

Re: snapshot too old issues, first around wraparound and then more.

2020-04-02 Thread Peter Geoghegan
here. There are glaring problems with how we manipulate the data structure that controls the effective horizon for pruning. Maybe they can be fixed while leaving the code that manages the OldSnapshotControl circular buffer in something resembling its current form, but I doubt it. In my opini

Re: snapshot too old issues, first around wraparound and then more.

2020-04-02 Thread Peter Geoghegan
On Thu, Apr 2, 2020 at 11:28 AM Peter Geoghegan wrote: > In conclusion, I share Andres' concerns here. There are glaring > problems with how we manipulate the data structure that controls the > effective horizon for pruning. Maybe they can be fixed while leaving > the code

Re: snapshot too old issues, first around wraparound and then more.

2020-04-02 Thread Peter Geoghegan
ng as they could still access ODBC's "100 rows in a cache" through the cursor. The docs say that an old_snapshot_threshold setting in the hours is about the lowest reasonable setting for production use, which seems rather high to me. It almost seems as if the feature specifically targets misbehaving applications already. -- Peter Geoghegan

Re: vacuum_defer_cleanup_age inconsistently applied on replicas

2020-04-03 Thread Peter Geoghegan
. OTOH, I wonder if it's possible that vacuum_defer_cleanup_age was deliberately intended to affect the behavior of XLogWalRcvSendHSFeedback(), which is probably one of the most common reasons why GetOldestXmin() is called on standbys. -- Peter Geoghegan

Re: vacuum_defer_cleanup_age inconsistently applied on replicas

2020-04-03 Thread Peter Geoghegan
On Fri, Apr 3, 2020 at 4:18 PM Peter Geoghegan wrote: > OTOH, I wonder if it's possible that vacuum_defer_cleanup_age was > deliberately intended to affect the behavior of > XLogWalRcvSendHSFeedback(), which is probably one of the most common > reasons why GetOldestXmin() is c

Re: Reinitialize stack base after fork (for the benefit of rr)?

2020-04-04 Thread Peter Geoghegan
owerful. I agree that rr is very useful. It would be great if we had a totally smooth workflow for debugging using rr. -- Peter Geoghegan

Re: Thoughts on "killed tuples" index hint bits support on standby

2020-04-05 Thread Peter Geoghegan
e can probably push this general approach forward in a number of different ways. I just started with unique indexes because that seemed most promising. I have only worked on the project for a few days. I don't really know how it will evolve. -- Peter Geoghegan 0001-Non-opportunistically-delete-B-Tree-items.patch Description: Binary data

Re: nbtree: assertion failure in _bt_killitems() for posting tuple

2020-04-05 Thread Peter Geoghegan
time we reach _bt_killitems(), even though _bt_killitems() does get to kill items. I am thinking about pushing a fix along the lines of the attached patch. This preserves the assertion, while avoiding the check in cases where it doesn't apply, such as when a dirty snapshot is in use. -- Pet

Re: Reinitialize stack base after fork (for the benefit of rr)?

2020-04-05 Thread Peter Geoghegan
recording (e.g. pointers, PIDs) is stable, it seems possible to treat a recording as a totally self contained thing. Other resources: https://github.com/mozilla/rr/wiki/Usage https://github.com/mozilla/rr/wiki/Debugging-protips [1] https://github.com/mozilla/rr/issues/91 -- Peter Geoghegan

Re: Reinitialize stack base after fork (for the benefit of rr)?

2020-04-05 Thread Peter Geoghegan
ack base - > but didn't end up checking stack depth except for expression indexes). No, just a personal preference for things like this. -- Peter Geoghegan

Using the rr debugging tool to debug Postgres

2020-04-06 Thread Peter Geoghegan
nce you're already familiar with gdb. I have written a Wiki page on how to use rr to record and replay Postgres executions: https://wiki.postgresql.org/wiki/Getting_a_stack_trace_of_a_running_PostgreSQL_backend_on_Linux/BSD#Recording_Postgres_using_rr_Record_and_Replay_Framework -- Peter Geoghegan

Re: pg_stat_statements issue with parallel maintenance (Was Re: WAL usage calculation patch)

2020-04-06 Thread Peter Geoghegan
On Mon, Apr 6, 2020 at 2:21 AM Amit Kapila wrote: > AFAIU, it uses heapam_index_build_range_scan but for writing to index, > it doesn't use buffer manager. Right. It doesn't need to use the buffer manager to write to the index, unlike (say) GIN's CREATE INDEX. -- Peter Geoghegan

Re: nbtree: assertion failure in _bt_killitems() for posting tuple

2020-04-06 Thread Peter Geoghegan
On Sun, Apr 5, 2020 at 5:15 PM Peter Geoghegan wrote: > I am thinking about pushing a fix along the lines of the attached > patch. This preserves the assertion, while avoiding the check in cases > where it doesn't apply, such as when a dirty snapshot is in use. Pushed. Than

Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations

2022-01-17 Thread Peter Geoghegan
On Mon, Jan 17, 2022 at 7:12 AM Robert Haas wrote: > On Thu, Jan 13, 2022 at 4:27 PM Peter Geoghegan wrote: > > 1. Cases where our inability to get a cleanup lock signifies nothing > > at all about the page in question, or any page in the same table, with > > the sam

Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations

2022-01-17 Thread Peter Geoghegan
On Mon, Jan 17, 2022 at 2:13 PM Robert Haas wrote: > On Mon, Jan 17, 2022 at 4:28 PM Peter Geoghegan wrote: > > Updating relfrozenxid should now be thought of as a continuous thing, > > not a discrete thing. > > I think that's pretty nearly 100% wrong. The most simp

Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations

2022-01-17 Thread Peter Geoghegan
On Mon, Jan 17, 2022 at 8:13 PM Robert Haas wrote: > On Mon, Jan 17, 2022 at 5:41 PM Peter Geoghegan wrote: > > That just seems like semantics to me. The very next sentence after the > > one you quoted in your reply was "And so it's highly unlikely that any > > giv

Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations

2022-01-18 Thread Peter Geoghegan
On Tue, Jan 18, 2022 at 6:11 AM Robert Haas wrote: > On Tue, Jan 18, 2022 at 12:14 AM Peter Geoghegan wrote: > > I quite clearly said that you'll only get an anti-wraparound VACUUM > > with the patch applied when the only factor that *ever* causes *any* > > autovacuum

Re: A qsort template

2022-01-18 Thread Peter Geoghegan
lasses with abbreviated keys encoded as unsigned integers. Just a thought. -- Peter Geoghegan

Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations

2022-01-19 Thread Peter Geoghegan
arlier, we now know that the patient almost certainly has a brain tumor. What new risk is implied by delaying the wait like this? Very little, I believe. Let's say we derive FreezeLimit from autovacuum_freeze_max_age/2 (instead of vacuum_freeze_min_age). We still ought to have the opportunity to wait for the cleanup lock for rather a long time -- if the XID consumption rate is so high that that isn't true, then we're doomed anyway. All told, there seems to be a huge net reduction in risk with this design. -- Peter Geoghegan

Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations

2022-01-20 Thread Peter Geoghegan
in_age to 0 internally) continue to work. So maybe its default should be changed to -1, which is interpreted as "whatever autovacuum_freeze_max_age/2 is". But it should still be greatly deemphasized in user docs. -- Peter Geoghegan
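
The proposed "-1 means whatever autovacuum_freeze_max_age/2 is" default can be sketched as follows. This is only an illustration of the sentinel idea: `effective_freeze_min_age()` is a hypothetical helper, not a PostgreSQL function, and the sketch deliberately ignores any other clamping that the real code may apply.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (an assumption for illustration): resolve the
 * effective freeze-min-age value from a setting where -1 is a sentinel
 * meaning "derive it from autovacuum_freeze_max_age".
 */
int32_t
effective_freeze_min_age(int32_t freeze_min_age_setting,
                         int32_t autovacuum_freeze_max_age)
{
    assert(autovacuum_freeze_max_age >= 0);
    assert(freeze_min_age_setting >= -1);

    /* -1: "whatever autovacuum_freeze_max_age/2 is" */
    if (freeze_min_age_setting == -1)
        return autovacuum_freeze_max_age / 2;

    /*
     * Any explicit value is used as given. In particular 0 keeps working,
     * which matters for callers that set the value to 0 internally.
     */
    return freeze_min_age_setting;
}
```

Under this sketch an explicit setting always overrides the derived value, so existing configurations would keep their behavior.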

Re: autovacuum prioritization

2022-01-20 Thread Peter Geoghegan
. The needs of queries matters, but controlling costs matters too. One of the most effective techniques is to manually VACUUM when the system is naturally idle, like at night time. If that could be quasi-automated, or if the criteria used by autovacuum scheduling gave just a little weight to how busy the system is right now, then we would have more slack when the system becomes very busy. -- Peter Geoghegan

Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations

2022-01-21 Thread Peter Geoghegan
o highlight that they're now closely related concepts. Now if you VACUUM a table that is either empty or has only frozen tuples, VACUUM will set relfrozenxid to oldestxmin/removable cutoff. Internally, oldestxmin is the "starting point" for our final/target relfrozenxid for the table. We ratchet it back dynamically, whenever we see an older-than-current-target XID that cannot be immediately frozen (e.g., when we can't easily get a cleanup lock on the page). -- Peter Geoghegan
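
The "ratchet back" behavior described above can be sketched in a few lines. The names `Xid`, `RelfrozenxidTracker`, `tracker_init()`, and `tracker_saw_unfrozen_xid()` are hypothetical stand-ins, not the patch's actual code. The wraparound-aware comparison follows the usual modulo-2^32 technique for normal XIDs and ignores special/permanent XIDs for simplicity.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t Xid;           /* stand-in for a 32-bit transaction ID */

/* Wraparound-aware "a is older than b" for normal XIDs (modulo 2^32). */
bool
xid_precedes(Xid a, Xid b)
{
    return (int32_t) (a - b) < 0;
}

/* Hypothetical tracker for the final/target relfrozenxid of one VACUUM. */
typedef struct
{
    Xid target;
} RelfrozenxidTracker;

/* The target starts out at OldestXmin (the removable cutoff). */
void
tracker_init(RelfrozenxidTracker *t, Xid oldest_xmin)
{
    t->target = oldest_xmin;
}

/*
 * Called for each XID that VACUUM had to leave unfrozen (for example
 * because no cleanup lock was available). The target only ever moves
 * backwards, to the oldest such XID seen so far.
 */
void
tracker_saw_unfrozen_xid(RelfrozenxidTracker *t, Xid xid)
{
    if (xid_precedes(xid, t->target))
        t->target = xid;
}
```

If `tracker_saw_unfrozen_xid()` is never called, as for an empty table or one containing only frozen tuples, the target stays at OldestXmin, which matches the behavior described in the message.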

Re: autovacuum prioritization

2022-01-22 Thread Peter Geoghegan
-- just wait (say) another 60 seconds, and then launch a new autovacuum worker on the same table if it became larger by some smallish fixed amount (stop caring about percentage table growth). Constant mini-vacuums against such a table make sense, since costs are almost exactly proportional to the number of heap pages appended since the last VACUUM. -- Peter Geoghegan
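
A trigger based on a fixed amount of growth, rather than percentage growth, might look like the sketch below. `append_only_needs_vacuum()` and its parameters are hypothetical names chosen for illustration. Autovacuum's real scheduling logic is not structured this way.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical check (an assumption for illustration): should another
 * "mini-vacuum" be launched against an append-only table?
 *
 * The decision depends only on how many heap pages were appended since
 * the last VACUUM, not on the table's total size, so the work per VACUUM
 * stays roughly proportional to fixed_page_threshold.
 */
bool
append_only_needs_vacuum(uint32_t pages_now,
                         uint32_t pages_at_last_vacuum,
                         uint32_t fixed_page_threshold)
{
    /* The table did not grow (or shrank): nothing new to process. */
    if (pages_now <= pages_at_last_vacuum)
        return false;

    return (pages_now - pages_at_last_vacuum) >= fixed_page_threshold;
}
```

A caller would re-evaluate this periodically (the message suggests something like every 60 seconds), recording `pages_now` as the new baseline after each VACUUM completes.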

Re: autovacuum prioritization

2022-01-25 Thread Peter Geoghegan
. Even busy production DBs should usually only be vacuuming one large table at a time. Also might make sense to strategically align the work with the beginning of a new checkpoint. -- Peter Geoghegan

Re: autovacuum prioritization

2022-01-26 Thread Peter Geoghegan
> On Wed, Jan 26, 2022 at 10:55 AM Robert Haas wrote: > On Tue, Jan 25, 2022 at 3:32 PM Peter Geoghegan wrote: > > For example, a > > page that has 5 dead heap-only tuples is vastly different to a similar > > page that has 5 LP_DEAD items instead -- and yet our curre

Re: autovacuum prioritization

2022-01-26 Thread Peter Geoghegan
'll be able to notice the inherent futility of an anti-wraparound VACUUM that runs against a table whose relfrozenxid is already exactly equal to the VACUUM's OldestXmin (say because of a leaked replication slot -- anything that makes vacuuming fundamentally unable to advance relfrozenxid, really). -- Peter Geoghegan

Why is INSERT-driven autovacuuming based on pg_class.reltuples?

2022-01-27 Thread Peter Geoghegan
for an append-only table, relative to the documented behavior of autovacuum_vacuum_insert_scale_factor? -- Peter Geoghegan
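For reference, the documented rule for insert-driven autovacuum is that a VACUUM is triggered once the number of inserted tuples exceeds autovacuum_vacuum_insert_threshold plus autovacuum_vacuum_insert_scale_factor times the table size as recorded in pg_class.reltuples. A rough sketch of that arithmetic follows; the function names are made up, and the defaults shown (1000 and 0.2) are the documented defaults as far as I know.

```python
def insert_vacuum_threshold(reltuples, base_threshold=1000, scale_factor=0.2):
    """Tuples that must be inserted before an insert-driven autovacuum.

    reltuples stands in for pg_class.reltuples, which is only an estimate
    and may have been set by either VACUUM or ANALYZE.
    """
    return base_threshold + scale_factor * reltuples


def needs_insert_vacuum(inserts_since_vacuum, reltuples,
                        base_threshold=1000, scale_factor=0.2):
    """True once inserts since the last VACUUM exceed the threshold."""
    threshold = insert_vacuum_threshold(reltuples, base_threshold, scale_factor)
    return inserts_since_vacuum > threshold
```

For a table with reltuples of 1,000,000 this puts the threshold at about 201,000 inserted tuples, so the result is only as good as the reltuples estimate fed into it.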

Re: Why is INSERT-driven autovacuuming based on pg_class.reltuples?

2022-01-27 Thread Peter Geoghegan
On Thu, Jan 27, 2022 at 12:20 PM Peter Geoghegan wrote: > Both VACUUM and ANALYZE update pg_class.reltuples. But this code seems > to assume that it's only something that VACUUM can ever do. Why > wouldn't we expect a plain ANALYZE to have actually been the las

Re: Why is INSERT-driven autovacuuming based on pg_class.reltuples?

2022-01-28 Thread Peter Geoghegan
eemed related to this messiness with statistics and pg_class. -- Peter Geoghegan

Re: Why is INSERT-driven autovacuuming based on pg_class.reltuples?

2022-02-02 Thread Peter Geoghegan
ils than the truth of what's going on in the table). So that also recommends "relativistically interpreting" the values later on. This specific issue has even less to do with autovacuum_vacuum_scale_factor than the main point, of course. I agree with you that the insert-driven stuff isn't a special case. -- Peter Geoghegan

Re: Stats collector's idx_blks_hit value is highly misleading in practice

2022-02-04 Thread Peter Geoghegan
On Thu, Feb 3, 2022 at 7:08 PM John Naylor wrote: > Is this a TODO candidate? What would be a succinct title for it? I definitely think that it's worth working on. I suppose it follows that it should go on the TODO list. -- Peter Geoghegan

Re: decoupling table and index vacuum

2022-02-04 Thread Peter Geoghegan
't want to set LP_DEAD bits for, since _bt_check_unique() tends to do a good job of setting LP_DEAD bits, independent of the kill_prior_tuple thing. You can avoid using kill_prior_tuple by forcing bitmap scans, of course. -- Peter Geoghegan

Re: should vacuum's first heap pass be read-only?

2022-02-04 Thread Peter Geoghegan
le point during VACUUM, based on considerations about the actual number of dead items that we now need to remove from indexes, as well as metadata from any preexisting conveyor belt. -- Peter Geoghegan

Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations

2022-02-04 Thread Peter Geoghegan
of the workload, and compare an imaginary ideal to the actual behavior of the system. In particular, there is really only one way that the free space management can work for the two big tables that will perform acceptably -- the orders have to be stored in the same place to begin with, and stay in the same place forever (at least to the extent that that's possible). -- Peter Geoghegan

Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations

2022-02-04 Thread Peter Geoghegan
etimes sharing the machine, > or only being on the edge of running out of memory. I think in general > people tend to avoid such things in benchmarking scenarios, but even > if include stuff like this, it's hard to know what to include that > would be representative of real life, because just about anything > *could* happen in real life. Then what could you have confidence in? -- Peter Geoghegan

Re: should vacuum's first heap pass be read-only?

2022-02-04 Thread Peter Geoghegan
table, in terms of how much vacuuming each index requires. And so the thing that drives us to perform heap vacuuming will probably be heap vacuuming itself, and not the fact that each and every index has become "sufficiently bloated". > If this isn't entirely making sense, it may well be because I'm a > little fuzzy on all of it myself. I'm in no position to judge. :-) -- Peter Geoghegan
