Re: Commitfest 2023-03 starting tomorrow!

2023-03-18 Thread Peter Geoghegan
at least 10 minutes into looking at something. -- Peter Geoghegan

Re: Commitfest 2023-03 starting tomorrow!

2023-03-18 Thread Peter Geoghegan
t. If something is made extremely easy, and requires little or no context to get going with, then people tend to do much more of it. Even when they theoretically don't have a good reason to do so. And even when they theoretically already had a good reason to do so, before the improved tooling/workflow was in place. -- Peter Geoghegan

Re: Add pg_walinspect function with block info columns

2023-03-17 Thread Peter Geoghegan
ould be wrong. It really is that simple. > This said, your point about having rec_blk_ref reported as an empty > string rather than NULL if there are no block references does not feel > natural to me, either. Reporting NULL would be better. You have it backwards. It outputs an empty string right now. I want to change that, so that it outputs NULLs instead. -- Peter Geoghegan

Re: Add pg_walinspect function with block info columns

2023-03-17 Thread Peter Geoghegan
's patch. As I said, it will become a lot closer to pg_get_wal_records_info(). We should be clear on that. -- Peter Geoghegan

Re: Add n_tup_newpage_upd to pg_stat table views

2023-03-17 Thread Peter Geoghegan
On Fri, Jan 27, 2023 at 3:23 PM Corey Huinker wrote: > This patch adds the n_tup_newpage_upd to all the table stat views. I think that this is pretty close to being committable already. I'll move on that early next week, barring any objections. -- Peter Geoghegan

Re: Add pg_walinspect function with block info columns

2023-03-17 Thread Peter Geoghegan
so enables describing the relationship between the two functions with reference to block_ref. It seems particularly helpful to me to be able to say that pg_get_wal_block_info() doesn't show anything for precisely those WAL records whose block_ref is NULL according to pg_get_wal_records_info(). -- Peter Geoghegan

Re: Add pg_walinspect function with block info columns

2023-03-16 Thread Peter Geoghegan
ueries or such, but just can run the function, which actually > takes way less time (3sec) to scan the same 5mn WAL records [3]. That's exactly my concern, yes. As you say, it's not just the performance aspect. Requiring users to write a needlessly ornamental query is actively misleading. It suggests that block_ref is distinct information from the blocks output by pg_get_wal_block_info(). -- Peter Geoghegan

Re: Amcheck verification of GiST and GIN

2023-03-16 Thread Peter Geoghegan
On Thu, Mar 16, 2023 at 4:48 PM Peter Geoghegan wrote: > Some feedback on the GiST patch: I see that the Bloom filter that's used to implement heapallindexed verification fingerprints index tuples that are formed via calls to gistFormTuple(), without any attempt to normalize-away differ

Re: Amcheck verification of GiST and GIN

2023-03-16 Thread Peter Geoghegan
If it never happens anyway, then the fact that we handle it with an error won't matter -- so the error is harmless. If it does happen then we'll want to hear about it as soon as possible -- so the error is useful. * I suggest using c99 style variable declarations in loops. Especially for
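
As an illustration of the C99 suggestion above, here is a minimal sketch of a loop that declares its counter in the for statement itself. The helper name sum_offsets() and its arguments are hypothetical and are not taken from the patch under review.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not from the patch): sums an array of offsets.
 */
static long
sum_offsets(const int *offsets, size_t noffsets)
{
    long total = 0;

    /*
     * C99 style: the loop counter is declared in the for statement,
     * so its scope ends when the loop ends.
     */
    for (size_t i = 0; i < noffsets; i++)
        total += offsets[i];

    return total;
}
```

Limiting the counter's scope this way makes it harder to accidentally reuse a stale value after the loop.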

Re: Add pg_walinspect function with block info columns

2023-03-14 Thread Peter Geoghegan
On Tue, Mar 14, 2023 at 5:34 PM Melanie Plageman wrote: > On Tue, Mar 14, 2023 at 6:57 PM Peter Geoghegan wrote: > > Why doesn't it already work like this? Why do we need a separate > > pg_get_wal_block_info() function at all? > > Well, I think if you only care

Re: Add pg_walinspect function with block info columns

2023-03-14 Thread Peter Geoghegan
uery you end up writing must do two passes over the WAL records, but its structure almost suggests that it's necessary to do two separate passes over distinct "streams". Why doesn't it already work like this? Why do we need a separate pg_get_wal_block_info() function at all? -- Peter Geoghegan

Re: Show various offset arrays for heap WAL records

2023-03-13 Thread Peter Geoghegan
terse mode, and making no such promise otherwise. Terse mode wouldn't just truncate the output of verbose mode -- it would never display information that could in principle exceed the 30 character allowance, even with records that happen to fall under the limit. I can't feel too bad about putting this part off. A pager like pspg is already table stakes when using pg_walinspect in any sort of serious way. As I said upthread, absurdly wide output is already reasonably common in most cases. -- Peter Geoghegan

Re: Testing autovacuum wraparound (including failsafe)

2023-03-13 Thread Peter Geoghegan
On Mon, Mar 13, 2023 at 3:25 PM Jacob Champion wrote: > Does https://commitfest.postgresql.org/42/4128/ address that > independently enough? I wasn't aware of that patch. It looks like it does exactly what I was arguing in favor of. So yes. -- Peter Geoghegan

Re: Testing autovacuum wraparound (including failsafe)

2023-03-11 Thread Peter Geoghegan
t I did have a real point: once we have tests for the xidStopLimit mechanism, why not take the opportunity to correct the long standing issue with the documentation advising the use of single user mode? -- Peter Geoghegan

Re: Testing autovacuum wraparound (including failsafe)

2023-03-07 Thread Peter Geoghegan
't want to make life harder by (say) connecting it to the single user mode problem. But...the single user mode thing really needs to go away. It's just terrible advice, and actively harms users. -- Peter Geoghegan

pg_walinspect memory leaks

2023-02-13 Thread Peter Geoghegan
his could be avoided by using a separate memory context that is reset periodically, or something else along the same lines. -- Peter Geoghegan
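
The general idea of a scratch allocation area that is reset periodically can be sketched with a toy arena in plain C. This is only an illustration under stated assumptions: the Arena type and every function below are hypothetical stand-ins, not pg_walinspect code. Inside PostgreSQL the same role would be played by a dedicated memory context that is cleared with MemoryContextReset().

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical toy arena: a fixed buffer plus a bump pointer. */
typedef struct Arena
{
    char   *base;
    size_t  size;
    size_t  used;
} Arena;

static int
arena_init(Arena *a, size_t size)
{
    a->base = malloc(size);
    a->size = size;
    a->used = 0;
    return a->base != NULL;
}

static void *
arena_alloc(Arena *a, size_t n)
{
    /* Round the request up to a multiple of 8 bytes. */
    size_t aligned = (n + 7) & ~(size_t) 7;

    if (aligned < n || aligned > a->size - a->used)
        return NULL;            /* overflow, or arena exhausted */

    void *p = a->base + a->used;

    a->used += aligned;
    return p;
}

/* Releases everything allocated since the last reset, all at once. */
static void
arena_reset(Arena *a)
{
    a->used = 0;
}

static void
arena_destroy(Arena *a)
{
    free(a->base);
    a->base = NULL;
    a->size = a->used = 0;
}

/*
 * Simulates per-record scratch allocations, resetting the arena every
 * reset_every records. Returns the peak number of bytes in use.
 */
static size_t
process_records(Arena *a, int nrecords, int reset_every)
{
    size_t peak = 0;

    for (int i = 0; i < nrecords; i++)
    {
        (void) arena_alloc(a, 64);  /* scratch space for one record */
        if (a->used > peak)
            peak = a->used;
        if ((i + 1) % reset_every == 0)
            arena_reset(a);
    }
    return peak;
}
```

With a reset every 10 records, peak usage stays at 10 allocations' worth of memory no matter how many records are processed, which is the property that bounds the leak.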

Re: Minor meson gripe

2023-02-09 Thread Peter Geoghegan
make the way that we run a subset of test suites against a running server similar to the way that we run a subset of test suites against a throwaway installation (ala "make check"). > The only restriction I see wrt add_test_setup() is that it's not entirely > trivial to use a "runtime-variable" path to an installation. I personally have no problem with that, though of course I could have easily overlooked something. -- Peter Geoghegan

Re: tests against running server occasionally fail, postgres_fdw & tenk1

2023-02-09 Thread Peter Geoghegan
much sense, even from a GiST point of view. It can follow exactly the same approach as B-Tree here, since its approach to page deletion is already directly based on nbtree. -- Peter Geoghegan

Re: Minor meson gripe

2023-02-09 Thread Peter Geoghegan
setup, like --setup running, used whenever you want to just run one or two tests against an ad-hoc temporary installation? Offhand it seems as if add_test_setup() could support that requirement? -- Peter Geoghegan

Re: Minor meson gripe

2023-02-09 Thread Peter Geoghegan
te name to tmp_install? That immediately reminds me of what's really going on here, since I'm used to seeing that directory name. And it clashes with "--suite setup" in a way that seems useful. -- Peter Geoghegan

Minor meson gripe

2023-02-09 Thread Peter Geoghegan
that name, given that we also need to use the unrelated --setup flag for some nearby testing recipes? * Why do we actually need a "setup" suite? Offhand it appears that a simple "meson test -v --suite regress" works just as well. Have I missed something? -- Peter Geoghegan

Re: tests against running server occasionally fail, postgres_fdw & tenk1

2023-02-08 Thread Peter Geoghegan
order that actually made sense. As I said, I don't mind making VACUUM VERBOSE behave a little bit more like a progress indicator, which is how it used to work. Maybe I went a little too far in the direction of neatly summarizing the whole VACUUM operation in one go. But I doubt that I went too far with it by all that much. Overall, the old VACUUM VERBOSE was extremely hard to use, and was poorly maintained -- let's not go back to that. (See commit ec196930 for evidence of how sloppily it was maintained.) -- Peter Geoghegan

Re: tests against running server occasionally fail, postgres_fdw & tenk1

2023-02-08 Thread Peter Geoghegan
s really don't matter much at this level. But seeing something about the number of WAL records written while vacuuming each index is another story. That's a cost that is likely to vary in possibly-interesting ways amongst indexes on the table, unlike IndexBulkDeleteResult.tuples_removed, which is very noisy, and signifies almost nothing important on its own. -- Peter Geoghegan

Re: [PATCH] Make ON CONFLICT DO NOTHING and ON CONFLICT DO UPDATE consistent

2023-02-08 Thread Peter Geoghegan
do that without locking rows and dirtying heap pages. If somebody were to argue that we should make DO NOTHING lock rows and throw similar errors now then I'd also disagree with them, but to a much lesser degree. I don't think that this patch is a good idea. -- Peter Geoghegan

Re: GUCs to control abbreviated sort keys

2023-02-06 Thread Peter Geoghegan
comparisons have further optimizations such as strcoll caching and the memcmp equality fast path. It's also required to actually fix the test case at hand -- 100k isn't enough to avoid the performance issue Jeff reported. I think that this should be committed to HEAD only. -- Peter
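
The memcmp equality fast path mentioned above can be sketched roughly as follows. This is a simplified illustration, not PostgreSQL's actual comparator: compare_text() is a hypothetical name, it works on NUL-terminated strings, and it leaves out strcoll caching entirely.

```c
#include <string.h>

/*
 * Hypothetical comparator illustrating the fast path: bytewise-identical
 * strings must compare as equal under any collation, so the expensive
 * strcoll() call can be skipped for them.
 */
static int
compare_text(const char *a, const char *b)
{
    size_t lena = strlen(a);
    size_t lenb = strlen(b);

    /* Fast path: same length and same bytes means equal. */
    if (lena == lenb && memcmp(a, b, lena) == 0)
        return 0;

    /* Slow path: locale-aware comparison. */
    int result = strcoll(a, b);

    /*
     * Some locales report non-identical strings as equal; break such
     * ties bytewise so that the ordering stays deterministic.
     */
    if (result == 0)
        result = strcmp(a, b);

    return result;
}
```

The fast path pays off most when the input contains many duplicate values, since each duplicate pair costs one memcmp() instead of a full collation-aware comparison.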

Re: BUG: Postgres 14 + vacuum_defer_cleanup_age + FOR UPDATE + UPDATE

2023-02-04 Thread Peter Geoghegan
ons, and if you use vacuum_defer_cleanup_age. It's likely that most GiST indexes never have any page deletions due to the workload characteristics. -- Peter Geoghegan

Re: Amcheck verification of GiST and GIN

2023-02-03 Thread Peter Geoghegan
On Thu, Feb 2, 2023 at 12:15 PM Peter Geoghegan wrote: > * Why are there only WARNINGs, never ERRORs here? Attached revision v22 switches all of the WARNINGs over to ERRORs. It has also been re-indented, and now uses a non-generic version of PageGetItemIdCareful() in both verify_gin.c

Re: Amcheck verification of GiST and GIN

2023-02-02 Thread Peter Geoghegan
as pg_amcheck's approach is (it's doing nothing that you couldn't replicate in a shell script), in practice its standardized approach probably makes things a lot smoother, especially in terms of how VACUUM is impacted. -- Peter Geoghegan

Re: Move defaults toward ICU in 16?

2023-02-02 Thread Peter Geoghegan
investment. Most of the work is actually done by natural language scholars, not technologists. That effort is very unlikely to be duplicated by some other group with its own conflicting goals. AFAICT there is no great need for any schisms, since differences of opinion can usually be accommodated under the umbrella of Unicode. -- Peter Geoghegan

Re: Amcheck verification of GiST and GIN

2023-02-02 Thread Peter Geoghegan
early advise users that they should probably just use pg_amcheck. Using the SQL interface directly should now mostly be something that only a tiny minority of experts need to do -- and even the experts won't do it that way unless they have a good reason to. -- Peter Geoghegan

Re: Amcheck verification of GiST and GIN

2023-02-02 Thread Peter Geoghegan
On Thu, Feb 2, 2023 at 11:51 AM Peter Geoghegan wrote: > I also have some questions about the verification functionality itself: I forgot to include another big concern here: * Why are there only WARNINGs, never ERRORs here? It's far more likely that you'll run into problems

Re: Amcheck verification of GiST and GIN

2023-02-02 Thread Peter Geoghegan
nctions for GiST and GIN -- though the locking/naming situation must be resolved before we decide what to do here, for pg_amcheck. -- Peter Geoghegan

Re: pg_dump versus hash partitioning

2023-02-01 Thread Peter Geoghegan
make sense. And they probably won't come up very often -- collation updates don't often contain enormous gratuitous differences that are liable to create dump/reload hazards with range partitioning. It is the least worst approach, overall. In theory, and in practice. -- Peter Geoghegan

Re: pg_dump versus hash partitioning

2023-02-01 Thread Peter Geoghegan
On Wed, Feb 1, 2023 at 2:12 PM Robert Haas wrote: > On Wed, Feb 1, 2023 at 4:44 PM Peter Geoghegan wrote: > > This is a misrepresentation of Tom's words. It isn't actually > > self-evident what "we end up with all of the same objects, each > > defined in the

Re: pg_dump versus hash partitioning

2023-02-01 Thread Peter Geoghegan
TE INDEX time. ISTM that the requirements are rather similar here -- perhaps even identical. See: https://www.postgresql.org/docs/devel/btree-support-funcs.html -- Peter Geoghegan

Re: pg_dump versus hash partitioning

2023-02-01 Thread Peter Geoghegan
nd complicated -- it's an inherently tricky area. You seem to be saying that the way that this stuff currently works is correct by definition, except when it isn't. -- Peter Geoghegan

Re: pg_dump versus hash partitioning

2023-02-01 Thread Peter Geoghegan
lent for some purposes, but not other purposes. The indirection between "logical and physical collations" is underdeveloped. There isn't even an official name for that idea. -- Peter Geoghegan

Re: Show various offset arrays for heap WAL records

2023-02-01 Thread Peter Geoghegan
ws/records in LSN order, and want to be able to easily compare the LSNs (or other details) of groups of adjoining records. -- Peter Geoghegan

Re: Show various offset arrays for heap WAL records

2023-01-31 Thread Peter Geoghegan
On Tue, Jan 31, 2023 at 1:52 PM Peter Geoghegan wrote: > Obviously what you're doing here will lead to a significant increase > in the verbosity of the output for affected WAL records. I don't feel > too bad about that, though. It's really an existing problem, and on

Re: Show various offset arrays for heap WAL records

2023-01-31 Thread Peter Geoghegan
On Tue, Jan 31, 2023 at 1:52 PM Peter Geoghegan wrote: > > I would also like to see functions like XLogRecGetBlockRefInfo() pass > > something more useful than a stringinfo buffer so that we could easily > > extract out the relfilenode in pg_walinspect. > > That does see

Re: Show various offset arrays for heap WAL records

2023-01-31 Thread Peter Geoghegan
till work, but I think it's okay to care less about pg_waldump usability. > > BTW, while playing around with this patch today, I noticed that it > > won't display the number of elements in each offset array directly. > > Perhaps it's worth including that, too? > > I believe I have addressed this in the attached patch. Thanks for taking care of that. -- Peter Geoghegan

Re: Add n_tup_newpage_upd to pg_stat table views

2023-01-27 Thread Peter Geoghegan
out basic HOT safety issues first.) If you see one particular index that gets a far larger number of non-hot updates that are reported as "logical changes to the indexed columns", then dropping that index has the potential to make the HOT update situation far better. -- Peter Geoghegan

Re: Add n_tup_newpage_upd to pg_stat table views

2023-01-27 Thread Peter Geoghegan
e a patch to add just that? Do you mean something more specific, like a tracker for when an UPDATE leaves a page full, without needing to go to a new page itself? If so, then that does require defining what that really means, because it isn't trivial. Do you assume that all updates have a successor version that is equal in size to that of the UPDATE that gets counted by this hypothetical other counter of yours? -- Peter Geoghegan

Re: GUCs to control abbreviated sort keys

2023-01-27 Thread Peter Geoghegan
em, having run the perl program as outlined in your test case: $ ls -l /tmp/strings.txt -rw-r--r-- 1 pg pg 431886574 Jan 27 11:13 /tmp/strings.txt $ sha1sum /tmp/strings.txt 22f60dc12527c215c8e3992e49d31dc531261a83 /tmp/strings.txt Does that match what you see on your system? -- Peter Geoghegan

Re: New strategies for freezing, advancing relfrozenxid early

2023-01-27 Thread Peter Geoghegan
I think that that's good, but > > you didn't seem to. > > I think that, if we had something like the recency test I was talking about, > we could afford to always freeze when the page is already dirty and not very > recently modified. I.e. not even insist on a WAL record having been generated > during pruning/HTSV. But I need to think through the dangers of that more. Now I'm confused. I thought that the recency test you talked about was purely to be used to do something a bit like the FPI thing, but using some high level context. Now I don't know what to think. -- Peter Geoghegan

Re: New strategies for freezing, advancing relfrozenxid early

2023-01-27 Thread Peter Geoghegan
iterally said I'm done with VACUUM for good, and that I just want to put a line under this. Yet you still persist in doing this sort of thing. I'm not fighting you, I'm not fighting Andres. I was making a point about the need to do something in this area in general. That's all. -- Peter Geoghegan

Re: New strategies for freezing, advancing relfrozenxid early

2023-01-26 Thread Peter Geoghegan
been completely > > wrong -- even then there generally will indeed be a second FPI later > > on for the same page, to go with everything else. This makes the > > wasted freezing even less significant, on a comparative basis! > > This is precisely why I think that we can afford to be quite aggressive about > freezing already dirty pages... I'm beginning to warm to this idea, now that I understand it a little better. -- Peter Geoghegan

Re: New strategies for freezing, advancing relfrozenxid early

2023-01-26 Thread Peter Geoghegan
ive basis! It's also likely true that an FPI in lazy_scan_prune is a much stronger signal, but I think that the important dynamic is that we're reasoning about "costs now vs costs later on". The asymmetry is really important. -- Peter Geoghegan

Re: New strategies for freezing, advancing relfrozenxid early

2023-01-26 Thread Peter Geoghegan
e-level freezing works will make freezing of pages on databases with page-level checksums similar to an equivalent case without checksums enabled. Even assuming that that's an important goal, you won't be much closer to achieving it under your scheme, since hint bits being set during VACUUM and requiring an FPI still make a huge difference. Tables like pgbench_history have pages that generally aren't pruned, that don't need to log an FPI just to set PD_ALL_VISIBLE once checksums are disabled. That's the difference that users are going to notice between checksums enabled vs disabled, if they notice any -- it's the most important one by far. -- Peter Geoghegan

Re: New strategies for freezing, advancing relfrozenxid early

2023-01-26 Thread Peter Geoghegan
On Thu, Jan 26, 2023 at 1:22 PM Robert Haas wrote: > On Thu, Jan 26, 2023 at 4:06 PM Peter Geoghegan wrote: > > There is very good reason to believe that the large majority of all > > data that people store in a system like Postgres is extremely cold > > data: > >

Re: New strategies for freezing, advancing relfrozenxid early

2023-01-26 Thread Peter Geoghegan
You've said that you agree that it sucks, but somehow I still can't shake the feeling that you don't fully understand just how much it sucks. -- Peter Geoghegan

Re: New strategies for freezing, advancing relfrozenxid early

2023-01-26 Thread Peter Geoghegan
h, then you'd only need to include the relfilenode and block number (and so on) once. It would be tricky to handle Multis, so what you'd probably do is just freezing xmin, and possibly aborted and locker XIDs in xmax. So you wouldn't completely get rid of the main freeze reco

Re: New strategies for freezing, advancing relfrozenxid early

2023-01-26 Thread Peter Geoghegan
s been modified, and direct our > freezing activity toward the ones less-recently modified on the theory > that they're not so likely to be modified again in the near future, > but in reality we have no such system. So I don't really feel like I > know what the right answer is here, yet. So we need to come up with a way of getting reliable information from the future, about an application that we have no particular understanding of. As opposed to just eating the cost to some degree, and making it configurable. -- Peter Geoghegan

Re: New strategies for freezing, advancing relfrozenxid early

2023-01-26 Thread Peter Geoghegan
t bits is on, than without them. Which I think is how using either > of pgWalUsage.wal_fpi, pgWalUsage.wal_records ends up working? Which part is the odd part? Is it odd that page-level freezing works that way, or is it odd that page-level checksums work that way? In any case this seems like a

Re: New strategies for freezing, advancing relfrozenxid early

2023-01-26 Thread Peter Geoghegan
> XLogInsert() would be likely to generate an FPI would make more sense. The > rare race case of a checkpoint starting concurrently doesn't matter IMO. That's going to be very significantly more aggressive. For example it'll impact small tables very differently. -- Peter Geoghegan

Re: New strategies for freezing, advancing relfrozenxid early

2023-01-26 Thread Peter Geoghegan
On Thu, Jan 26, 2023 at 5:41 AM Robert Haas wrote: > On Wed, Jan 25, 2023 at 11:25 PM Peter Geoghegan wrote: > > On Wed, Jan 25, 2023 at 7:41 PM Robert Haas wrote: > > > Both Andres and I have repeatedly expressed concern about how much is > > > being changed in the

Re: New strategies for freezing, advancing relfrozenxid early

2023-01-26 Thread Peter Geoghegan
lready eligible to be set all-visible. > > The only reason there is a substantial difference is because of pgbench's > uniform access pattern. Most real-world applications don't have that. It's not pgbench! It's TPC-C. It's actually an adversarial case for the patch series. -- Peter Geoghegan

Re: Decoupling antiwraparound autovacuum from special rules around auto cancellation

2023-01-25 Thread Peter Geoghegan
e some or all of the patches yourself, in part or in full, I have no objections. -- Peter Geoghegan

Re: New strategies for freezing, advancing relfrozenxid early

2023-01-25 Thread Peter Geoghegan
On Wed, Jan 25, 2023 at 8:24 PM Peter Geoghegan wrote: > > I think we're on a very dangerous path here. I want VACUUM to be > > better as the next person, but I really don't believe that's the > > direction we're headed. I think if we release like this, we

Re: New strategies for freezing, advancing relfrozenxid early

2023-01-25 Thread Peter Geoghegan
ough. An elegant analogy. -- Peter Geoghegan

Re: New strategies for freezing, advancing relfrozenxid early

2023-01-25 Thread Peter Geoghegan
s my choice to work on VACUUM in general, I still wanted to finish off what I'd started. I don't see how that'll be possible now -- I'm just not in a position to be in the center of another controversy, and I just don't seem to be able to avoid them here, as a practical matte

Re: New strategies for freezing, advancing relfrozenxid early

2023-01-25 Thread Peter Geoghegan
will have a relatively young average, a more append only table > will have an increasing average age. > > > It might also make sense to look at the age of relfrozenxid - there's really > no point in being overly eager if the relation is quite young. I don't think that's true. What about bulk loading? It's a totally valid and common requirement. -- Peter Geoghegan

Re: New strategies for freezing, advancing relfrozenxid early

2023-01-25 Thread Peter Geoghegan
able that was too small. Obviously the way that eager freezing strategy avoids freezing concurrently modified pages isn't perfect. It's one approach to limiting the downside from eager freezing, in tables (or even individual pages) where it's inappropriate. Of course that isn't perfect, but it's a significant factor. -- Peter Geoghegan

Re: New strategies for freezing, advancing relfrozenxid early

2023-01-25 Thread Peter Geoghegan
e part of the data is transient and only settles after a time. In neither > case eager freezing is ok. It sounds like you're not willing to accept any kind of trade-off. How, in general, can we detect what kind of 1TB table it will be, in the absence of user input? And in the absence of user input, why would we prefer to default to a behavior that is highly destabilizing when we get it wrong? -- Peter Geoghegan

Re: New strategies for freezing, advancing relfrozenxid early

2023-01-25 Thread Peter Geoghegan
es eager freezing when the failsafe is in effect. > I don't see an alternative to reverting this for now. I want to see your test case before acting. -- Peter Geoghegan

Re: New strategies for freezing, advancing relfrozenxid early

2023-01-25 Thread Peter Geoghegan
iteria to trigger freezing pages. > > That's only true because vacuum_freeze_min_age has been fairly radically > redefined recently. So? This part of the commit message is a simple statement of fact. -- Peter Geoghegan

Re: New strategies for freezing, advancing relfrozenxid early

2023-01-25 Thread Peter Geoghegan
always intended to be provisional. Something that I explicitly noted would be reviewed after the beta period is over, once we gained more experience with the setting. I think that a far higher setting could be almost as effective. 32GB, or even 64GB could work quite well, since you'll still have the FPI optimization. -- Peter Geoghegan

Re: [PATCH] Make ON CONFLICT DO NOTHING and ON CONFLICT DO UPDATE consistent

2023-01-25 Thread Peter Geoghegan
was always understood to be more susceptible to certain issues (when in READ COMMITTED mode) as a result. There are some halfway reasonable arguments against this sort of behavior, but I believe that we made the right trade-off. -- Peter Geoghegan

Re: Update comments in multixact.c

2023-01-24 Thread Peter Geoghegan
detail is that it can happen if certain rules are not followed. Thanks -- Peter Geoghegan

Re: Decoupling antiwraparound autovacuum from special rules around auto cancellation

2023-01-24 Thread Peter Geoghegan
teed to work or not work. I only claim that we can meaningfully reduce the absolute risk by using a fairly simple approach, principally by not needlessly coupling the auto-cancellation behavior to *all* autovacuums that are specifically triggered by age(relfrozenxid). As Andres said at one point, doing those two things at exactly the same time is just arbitrary. -- Peter Geoghegan

Re: Decoupling antiwraparound autovacuum from special rules around auto cancellation

2023-01-23 Thread Peter Geoghegan
(autovac_table *tab) > Somehow the added "for ..." sounds a bit awkward. "autovacuum for table XID > age". Maybe "autovacuum due to ..."? That works just as well IMV. I'll change it to that. Anything else for 0001? Would be nice to get it committed tomorrow. -- Peter Geoghegan

Re: New strategies for freezing, advancing relfrozenxid early

2023-01-23 Thread Peter Geoghegan
/Freezing/skipping_strategies_patch:_motivating_examples#Patch_3 [2] https://wiki.postgresql.org/wiki/Freezing/skipping_strategies_patch:_motivating_examples#Opportunistically_advancing_relfrozenxid_with_bursty.2C_real-world_workloads -- Peter Geoghegan

Re: run pgindent on a regular basis / scripted manner

2023-01-22 Thread Peter Geoghegan
g pgindent tends to produce better results than just running pgindent, at least when working on a new patch. -- Peter Geoghegan

Re: run pgindent on a regular basis / scripted manner

2023-01-21 Thread Peter Geoghegan
On Sat, Jan 21, 2023 at 2:05 PM Peter Geoghegan wrote: > There is one thing about clang-format that I find mildly infuriating: > it can indent function declarations in the way that I want it to, and > it can indent variable declarations in the way that I want it to. It > just can'

Re: Decoupling antiwraparound autovacuum from special rules around auto cancellation

2023-01-21 Thread Peter Geoghegan
On Sat, Jan 21, 2023 at 6:54 PM Andres Freund wrote: > Is > https://www.postgresql.org/message-id/CAH2-WzmytCuSpaMEhv8H-jt8x_9whTi0T5bjNbH2gvaR0an2Pw%40mail.gmail.com > the last / relevant version of the patch to look at? Yes. I'm mostly just asking about v5-0001-* right n

Re: run pgindent on a regular basis / scripted manner

2023-01-21 Thread Peter Geoghegan
e this out recently? If this is true, then it's certainly discouraging. I don't have a problem with the current pgindent alignment of function parameters, so not sure what you mean about that. It *was* terrible prior to commit e3860ffa, but that was back in 2017 (pg_bsd_indent 2.0 fixed that problem). -- Peter Geoghegan

Re: Decoupling antiwraparound autovacuum from special rules around auto cancellation

2023-01-21 Thread Peter Geoghegan
atch) out of the way soon. -- Peter Geoghegan

Re: run pgindent on a regular basis / scripted manner

2023-01-21 Thread Peter Geoghegan
E posted several years back, I believe. Plus the timescaledb one in one or two places. I worked a couple of things out through trial and error. It's relatively hard to follow the documentation, and there have been features added to newer LLVM versions. -- Peter Geoghegan [attachment: clang-format]

Re: run pgindent on a regular basis / scripted manner

2023-01-21 Thread Peter Geoghegan
ily about my fixed preferences (though it can be hard to tell!). It's really not surprising that clang-format cannot quite perfectly simulate pgindent. How flexible can we be about stuff like that? Obviously there is no clear answer right now. -- Peter Geoghegan

Re: run pgindent on a regular basis / scripted manner

2023-01-21 Thread Peter Geoghegan
and tools like meson in return. It would probably make it practical to have much stronger rules about how committed code must be indented -- rules that are practical, and can actually be enforced. That trade-off seems likely to be worth it in my view, though it's not something that I feel too strongly about. -- Peter Geoghegan

Re: Decoupling antiwraparound autovacuum from special rules around auto cancellation

2023-01-20 Thread Peter Geoghegan
havior. Aggressive/antiwraparound VACUUMs are naturally much more likely to coincide with periodic DDL, just because they take so much longer. That is a dangerous combination. -- Peter Geoghegan

Re: Decoupling antiwraparound autovacuum from special rules around auto cancellation

2023-01-20 Thread Peter Geoghegan
h a default as low as 1.1x or even 1.05x. That's going to make very little difference to those users that really rely on the no-auto-cancellation behavior, while at the same time making things a lot safer for scenarios like the Joyent/Manta "DROP TRIGGER" outage (not perfectly safe, by any means, but meaningfully safer). -- Peter Geoghegan
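
To make the proposed multipliers concrete, here is a small arithmetic sketch. It assumes the multiplier is applied to autovacuum_freeze_max_age to get the table age at which the no-auto-cancellation behavior begins, and it uses the stock default of 200 million for that setting. The helper name and the percent-based interface are hypothetical, chosen only for illustration.

```c
#include <stdint.h>

/* Stock default of autovacuum_freeze_max_age, used here for illustration. */
#define DEFAULT_FREEZE_MAX_AGE INT64_C(200000000)

/*
 * Hypothetical helper: table age at which autovacuum would stop being
 * auto-cancellable, given a multiplier expressed in percent
 * (110 means 1.1x, 105 means 1.05x).
 */
static int64_t
no_cancel_threshold(int64_t freeze_max_age, int multiplier_pct)
{
    return freeze_max_age * multiplier_pct / 100;
}
```

Under these assumptions a 1.1x multiplier gives a threshold of 220 million XIDs and 1.05x gives 210 million, so auto-cancellation would remain possible for the first 20 million or 10 million XIDs past the point where the age-based autovacuum is triggered.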

Re: Decoupling antiwraparound autovacuum from special rules around auto cancellation

2023-01-19 Thread Peter Geoghegan
ovacuum. Even if it's valuable to maintain this kind of VACUUM/autovacuum parity (which I tend to doubt), doesn't the same argument work almost as well with whatever stripped down version you come up with? It's also confusing that a manual VACUUM command will be doing an ANALYZE-like thing. Especially in cases where it's really expensive relative to the work of VACUUM, because VACUUM scanned so few pages. You just have to make some kind of trade-off. -- Peter Geoghegan

Re: Decoupling antiwraparound autovacuum from special rules around auto cancellation

2023-01-19 Thread Peter Geoghegan
t down at all, in the end. Presumably you'll want to add the same I/O prefetching logic to this cut-down version, just for example. Since without that there will be no competition between it and ANALYZE proper. Besides which, isn't it kinda wasteful to not just do a full ANALYZE? Sure, you can avoid detoasting overhead that way. But even still. -- Peter Geoghegan

Re: Decoupling antiwraparound autovacuum from special rules around auto cancellation

2023-01-19 Thread Peter Geoghegan
I now understand that you're in favor of addressing the root problem directly. I am also in favor of that approach. I'd be more than happy to get rid of the band aid as part of that whole effort. -- Peter Geoghegan

Re: Decoupling antiwraparound autovacuum from special rules around auto cancellation

2023-01-19 Thread Peter Geoghegan
only thing that can be done is to either make VACUUM behave somewhat like ANALYZE in at least some cases, or to have it invoke ANALYZE directly (or indirectly) in those same cases. -- Peter Geoghegan

Re: Decoupling antiwraparound autovacuum from special rules around auto cancellation

2023-01-18 Thread Peter Geoghegan
vacuum is frequent enough. > > As a demo: The attached sql script ends up with a table containing 10k rows, > but relpages being set to 1 million. I saw that too. But then I checked again a few seconds later, and autoanalyze had run, so reltuples was 10k. Just like it would have been if there were no VACUUM statements in your script. -- Peter Geoghegan

Re: Decoupling antiwraparound autovacuum from special rules around auto cancellation

2023-01-18 Thread Peter Geoghegan
directed, ndead, nunused) > around heap_page_prune() and a > pgstat_count_heap_vacuum(nunused) > in lazy_vacuum_heap_page(), we'd likely end up with a better approximation > than what vac_estimate_reltuples() does, in the "partially scanned" case. What does vac_estimate_reltuples() have to do with dead tuples? -- Peter Geoghegan

Re: Decoupling antiwraparound autovacuum from special rules around auto cancellation

2023-01-18 Thread Peter Geoghegan
On Wed, Jan 18, 2023 at 5:49 PM Andres Freund wrote: > On 2023-01-18 15:28:19 -0800, Peter Geoghegan wrote: > > Perhaps we should make vac_estimate_reltuples focus on the pages that > > VACUUM newly set all-visible each time (not including all-visible > > pages that got scan

Re: Decoupling antiwraparound autovacuum from special rules around auto cancellation

2023-01-18 Thread Peter Geoghegan
What if it was just a simple multiplier on freeze_max_age/multixact_freeze_max_age, without changing any other detail? -- Peter Geoghegan [Attachments: v5-0001-Add-autovacuum-trigger-instrumentation.patch, v5-0002-Add-table-age-trigger-concept-to-autovacuum.patch]
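
To make the "simple multiplier" idea concrete, here is a minimal sketch of the arithmetic, assuming the multiplier only decides the table age at which an autovacuum stops being auto-cancellable. The function names are hypothetical and are not taken from the v5 patches; the two constants are the stock defaults for autovacuum_freeze_max_age and autovacuum_multixact_freeze_max_age.

```python
# Stock PostgreSQL defaults for the two settings named in the message.
AUTOVACUUM_FREEZE_MAX_AGE = 200_000_000
AUTOVACUUM_MULTIXACT_FREEZE_MAX_AGE = 400_000_000


def no_cancel_cutoff(max_age: int, multiplier: float) -> int:
    """Table age at which auto-cancellation would be switched off.

    Hypothetical helper: assumes the cutoff is simply max_age * multiplier.
    """
    if multiplier < 1.0:
        raise ValueError("multiplier below 1.0 would fire before max_age")
    return round(max_age * multiplier)


def autocancel_disabled(table_age: int, max_age: int, multiplier: float) -> bool:
    """True once the table is old enough that autovacuum should not yield."""
    return table_age >= no_cancel_cutoff(max_age, multiplier)


if __name__ == "__main__":
    for m in (1.05, 1.1):
        xid = no_cancel_cutoff(AUTOVACUUM_FREEZE_MAX_AGE, m)
        mxid = no_cancel_cutoff(AUTOVACUUM_MULTIXACT_FREEZE_MAX_AGE, m)
        print(f"multiplier {m}: xid cutoff {xid}, multixact cutoff {mxid}")
```

With a multiplier of 1.1 and the default freeze_max_age, an autovacuum would stay cancellable for the first 20 million XIDs of table age past the antiwraparound trigger point, under the assumptions above.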

Re: Decoupling antiwraparound autovacuum from special rules around auto cancellation

2023-01-18 Thread Peter Geoghegan
ings change while VACUUM runs. The other problem is that the thing that is counted isn't broken down into distinct subcategories of things -- things are bunched together that shouldn't be. Oh wait, you were thinking of what I said before -- my "awkward but logical question". Is that it? -- Peter Geoghegan

Re: Decoupling antiwraparound autovacuum from special rules around auto cancellation

2023-01-18 Thread Peter Geoghegan
here errors can grow without bound. But you have to draw the line somewhere, unless you're willing to replace the whole approach with something that stores historic metadata. What kind of tradeoff do you want to make here? I think that you have to make one. -- Peter Geoghegan

Re: Decoupling antiwraparound autovacuum from special rules around auto cancellation

2023-01-18 Thread Peter Geoghegan
On Wed, Jan 18, 2023 at 3:28 PM Peter Geoghegan wrote: > The problems in this area tend to be that vac_estimate_reltuples() > behaves as if it sees a random sample, when in fact it's far from > random -- it's the same scanned_pages as last time, and the ten other > times
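
The non-random-sample concern quoted above can be illustrated with a toy model. This is a simplified sketch, not the real function: it assumes the estimate is "old tuple density times unscanned pages, plus tuples counted on scanned pages" and leaves out any guards the actual vac_estimate_reltuples() applies. The table shape and numbers are invented for illustration.

```python
def estimate_reltuples(old_reltuples: float, total_pages: int,
                       scanned_pages: int, scanned_tuples: float) -> float:
    """Toy extrapolation: trust the old density for pages we did not scan."""
    if scanned_pages >= total_pages:
        return scanned_tuples  # full scan, nothing to extrapolate
    old_density = old_reltuples / total_pages
    unscanned_pages = total_pages - scanned_pages
    return old_density * unscanned_pages + scanned_tuples


def repeated_vacuums(rounds: int, start_estimate: float, total_pages: int,
                     scanned_pages: int, scanned_tuples: float) -> float:
    """Feed each estimate back in, scanning the SAME pages every round."""
    estimate = start_estimate
    for _ in range(rounds):
        estimate = estimate_reltuples(estimate, total_pages,
                                      scanned_pages, scanned_tuples)
    return estimate


if __name__ == "__main__":
    # Invented table: 1000 pages; 900 hold 100 tuples each, while the 100
    # pages that every VACUUM rescans hold only 10 tuples each.
    true_total = 900 * 100 + 100 * 10  # 91000, and it never changes
    for rounds in (1, 10, 50, 200):
        est = repeated_vacuums(rounds, true_total, 1000, 100, 100 * 10)
        print(f"after {rounds:3d} VACUUMs: estimate {est:10.1f} (true {true_total})")
```

Even though the starting estimate is exactly right and the table never changes, each round pulls the whole-table density a little closer to the density of the pages that keep getting scanned. In this model the estimate drifts toward 10000, which is what the table would hold if every page looked like the rescanned ones.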

Re: Decoupling antiwraparound autovacuum from special rules around auto cancellation

2023-01-18 Thread Peter Geoghegan
On Wed, Jan 18, 2023 at 2:37 PM Peter Geoghegan wrote: > Maybe you're right to be concerned to the degree that you're concerned > -- I'm not sure. I'm just adding what I see as important context. The problems in this area tend to be that vac_estimate_reltuples() beha

Re: Decoupling antiwraparound autovacuum from special rules around auto cancellation

2023-01-18 Thread Peter Geoghegan
to accumulate over time. Maybe you're right to be concerned to the degree that you're concerned -- I'm not sure. I'm just adding what I see as important context. -- Peter Geoghegan

Re: Decoupling antiwraparound autovacuum from special rules around auto cancellation

2023-01-18 Thread Peter Geoghegan
ch for this? I'd be very interested in seeing this through. Could definitely review it. -- Peter Geoghegan

Re: Decoupling antiwraparound autovacuum from special rules around auto cancellation

2023-01-18 Thread Peter Geoghegan
On Wed, Jan 18, 2023 at 1:02 PM Peter Geoghegan wrote: > Some of what I'm proposing arguably amounts to deliberately adding a > bias. But that's not an unreasonable thing in itself. I think of it as > related to the bias-variance tradeoff, which is a concept that comes >

Re: Decoupling antiwraparound autovacuum from special rules around auto cancellation

2023-01-18 Thread Peter Geoghegan
big part of the overall picture, but not everything. It tells us relatively little about the benefits, except perhaps when most pages are all-visible. -- Peter Geoghegan

Re: Decoupling antiwraparound autovacuum from special rules around auto cancellation

2023-01-18 Thread Peter Geoghegan
On Wed, Jan 18, 2023 at 11:02 AM Robert Haas wrote: > On Wed, Jan 18, 2023 at 1:31 PM Peter Geoghegan wrote: > > pgstat_report_analyze() will totally override the > > tabentry->dead_tuples information that drives autovacuum.c, based on > > an estimate derived from a rand

Re: Decoupling antiwraparound autovacuum from special rules around auto cancellation

2023-01-18 Thread Peter Geoghegan
ch. One of the advantages of running VACUUM sooner is that it provides us with relatively reliable information about the needs of the table. We can also cheat, sort of. If we find another justification for autovacuuming (e.g., it's a quiet time for the system as a whole), and it works out to help with this other problem, it may be just as good for users. -- Peter Geoghegan
