Re: RecoveryInProgress() has critical side effects
On Tue, Dec 7, 2021 at 5:55 PM Robert Haas wrote: > On Sat, Dec 4, 2021 at 7:44 PM Michael Paquier wrote: > > My main worry here is that this changes slightly the definition of > > doPageWrites across stable branches at the end of recovery as there > > could be records generated there. Note that GetFullPageWriteInfo() > > gets called in XLogInsert(), while Insert->fullPageWrites gets updated > > before CleanupAfterArchiveRecovery(). And it may influence > > the value of doPageWrites in the startup process. > > But ... so what? All the code that uses it retries if the value that > was tentatively used turns out to be wrong. Despite the fact that this question didn't get further discussion, and the fact that nobody seems quite sure what the best way of proceeding here is, my interpretation of the comments made so far is that nobody thinks that what we have now is superior to either of the proposed alternatives, and there's a weak preference for v2 over v1. So I committed that one. -- Robert Haas EDB: http://www.enterprisedb.com
Re: RecoveryInProgress() has critical side effects
On Sat, Dec 4, 2021 at 7:44 PM Michael Paquier wrote: > My main worry here is that this changes slightly the definition of > doPageWrites across stable branches at the end of recovery as there > could be records generated there. Note that GetFullPageWriteInfo() > gets called in XLogInsert(), while Insert->fullPageWrites gets updated > before CleanupAfterArchiveRecovery(). And it may influence > the value of doPageWrites in the startup process. But ... so what? All the code that uses it retries if the value that was tentatively used turns out to be wrong. -- Robert Haas EDB: http://www.enterprisedb.com
Re: RecoveryInProgress() has critical side effects
On Tue, Nov 16, 2021 at 09:41:58AM -0500, Robert Haas wrote: > Maybe I'm not understanding you properly here, but it seems like you > might be forgetting that this is a local variable and thus every > backend is going to have something different. In the startup process, > it will be initialized by StartupXLOG(); in other processes, it's > currently initialized by RecoveryInProgress(), but with this patch it > wouldn't be. Either way, it's then updated by future calls to > XLogInsertRecord() as required. XLOG_FPW_CHANGE records might affect > the new value that gets set the next time XLogInsertRecord(), but they > don't directly affect doPageWrites. My main worry here is that this changes slightly the definition of doPageWrites across stable branches at the end of recovery as there could be records generated there. Note that GetFullPageWriteInfo() gets called in XLogInsert(), while Insert->fullPageWrites gets updated before CleanupAfterArchiveRecovery(). And it may influence the value of doPageWrites in the startup process. -- Michael signature.asc Description: PGP signature
Re: RecoveryInProgress() has critical side effects
There are a couple typos. "uninitalized," in one place. Also, +* If this is either a bootstrap process nor a standalone backend, start Is "either .. nor" correct? It looks very wrong to me. I think you meant "either .. or". Overall, this patch seems mostly about documenting some undesirable existing behavior. Is part of your plan to get such things fixed/ replaced entirely? -- Álvaro Herrera Valdivia, Chile — https://www.EnterpriseDB.com/ "We're here to devour each other alive"(Hobbes)
Re: RecoveryInProgress() has critical side effects
On Sat, Nov 20, 2021 at 3:22 PM Heikki Linnakangas wrote: > But here's yet another idea: We could initialize RedoRecPtr and > doPageWrites in InitXLogInsert(). It would seem like a pretty natural > place for it. I considered that, but it seemed like an abstraction violation to me, because that code is part of xloginsert.c, which is mostly concerned with assembly of write-ahead log records, not the associated shared-memory state. Also, I don't think there's any guarantee that the state in shared memory is initialized at the time we call that function, so we might just be copying uninitialized memory into other uninitialized memory. > PS. typo in patch v2: s/prepard/prepared/ Thanks, fixed. -- Robert Haas EDB: http://www.enterprisedb.com v3-0001-Remove-InitXLOGAccess.patch Description: Binary data
Re: RecoveryInProgress() has critical side effects
On 17/11/2021 00:04, Andres Freund wrote: On 2021-11-16 16:30:27 -0500, Robert Haas wrote: I'm still not entirely clear on whether you prefer v1-0002, v2-0002, or something else. I think it basically doesn't matter much. It's such a small piece of the cost compared to either the cost of a single insert or the ratio between RedoRecPtr/FPW changes and the number of inserted records. I guess v2-0002 is mildly simpler, so I very weakly lean towards that. I was going to argue for v1, because I was worried about generating full-page writes unnecessarily. But looking closer, it doesn't do that. It will initially think that no full page images are needed, and perform the extra loop in XLogInsert() to try again with full page images. Similarly in heap_update(), it will fail in the direction of sometimes doing the "prefix-suffix compression" unnecessarily, which means a little bit of extra CPU work, but it won't affect the WAL record that gets written. So either way is fine, really. That's a bit subtle, though, a comment somewhere would be good. But here's yet another idea: We could initialize RedoRecPtr and doPageWrites in InitXLogInsert(). It would seem like a pretty natural place for it. - Heikki PS. typo in patch v2: s/prepard/prepared/
Re: RecoveryInProgress() has critical side effects
On 2021-11-16 16:30:27 -0500, Robert Haas wrote: > I'm still not entirely clear on whether you prefer v1-0002, v2-0002, > or something else. I think it basically doesn't matter much. It's such a small piece of the cost compared to either the cost of a single insert or the ratio between RedoRecPtr/FPW changes and the number of inserted records. I guess v2-0002 is mildly simpler, so I very weakly lean towards that.
Re: RecoveryInProgress() has critical side effects
On Tue, Nov 16, 2021 at 3:42 PM Andres Freund wrote: > On 2021-11-16 15:19:19 -0500, Robert Haas wrote: > > > Hm. I think this might included a bunch of convoluting factors that make > > > it > > > hard to judge the actual size of the performance difference. > > > > Yes, I think so, too. > > FWIW I ran that pgench thing I presented upthread, and I didn't see any > meaningful and repeatable performance difference 354a1f8d220, ad26ee28250 and > 0002 applied ontop of ad26ee28250. The run-to-run variance is way higher than > the difference between the changes. Thanks. I suspected that the results I was seeing were not meaningful, but it's hard to be sure when the results seem to be repeatable locally. I'm still not entirely clear on whether you prefer v1-0002, v2-0002, or something else. -- Robert Haas EDB: http://www.enterprisedb.com
Re: RecoveryInProgress() has critical side effects
Hi, On 2021-11-16 15:19:19 -0500, Robert Haas wrote: > > Hm. I think this might included a bunch of convoluting factors that make it > > hard to judge the actual size of the performance difference. > > Yes, I think so, too. FWIW I ran that pgench thing I presented upthread, and I didn't see any meaningful and repeatable performance difference 354a1f8d220, ad26ee28250 and 0002 applied ontop of ad26ee28250. The run-to-run variance is way higher than the difference between the changes. Greetings, Andres Freund
Re: RecoveryInProgress() has critical side effects
On Tue, Nov 16, 2021 at 2:27 PM Andres Freund wrote: > Is this with assertions enabled? Because on my workstation (which likely is > slower on a single-core basis than your laptop) I get around half of this with > an optimized build (and a non-optimized build rarely is worth using as a > measuring stick). It could also be that your psqlrc does something > heavyweight? Assertions disabled. ~/.psqlrc does not exist. > Hm. I think this might included a bunch of convoluting factors that make it > hard to judge the actual size of the performance difference. Yes, I think so, too. -- Robert Haas EDB: http://www.enterprisedb.com
Re: RecoveryInProgress() has critical side effects
Hi, On 2021-11-15 16:29:28 -0500, Robert Haas wrote: > v1: 0.378 ms > v2: 0.391 ms > common base commit (10eae82b2): 0.376 ms Is this with assertions enabled? Because on my workstation (which likely is slower on a single-core basis than your laptop) I get around half of this with an optimized build (and a non-optimized build rarely is worth using as a measuring stick). It could also be that your psqlrc does something heavyweight? > methodology: > for i in `seq 1 1000`; do psql -c '\timing' -c 'select > txid_current();'; done | grep '^Time:' | awk '{total+=$2} END {print > total/NR}' > run twice, discarding the first result, since the very first > connection attempt after starting a new server process is notably > slower Hm. I think this might included a bunch of convoluting factors that make it hard to judge the actual size of the performance difference. It'll e.g. include a fair bit of syscache initialization overhead that won't change between the two approaches. This could be addressed by doing something like -c "SELECT 'txid_current'::regproc;" first. Also, the psql invocations itself use up a fair bit of time, even if you ignored them from the result with the \timing trick. A pgbench -C doing 1k SELECT 1;s is about ~1.5s for me, whereas psql is ~5.7s. Just to size up the order of magnitude of overhead of the connection establishment and syscache initialization, I ran a pgbench with a script of SELECT 1; SELECT 1; SELECT 'txid_current()'::regprocedure; SELECT 'txid_current()'::regprocedure; SELECT txid_current(); SELECT txid_current(); and ran that with pgbench -n -f /tmp/txid.sql -C -t1000 -P1 --report-latencies and got statement latencies in milliseconds: 0.125 SELECT 1; 0.024 SELECT 1; 0.095 SELECT 'txid_current()'::regprocedure; 0.025 SELECT 'txid_current()'::regprocedure; 0.033 SELECT txid_current(); 0.024 SELECT txid_current(); which nicely shows how much of the overhead is not related to the actual txid_current() invocation. Greetings, Andres Freund
Re: RecoveryInProgress() has critical side effects
Hi, On 2021-11-11 12:15:03 -0500, Robert Haas wrote: > The reason why RecoveryInProgress() has critical side effects is that > it calls InitXLOGAccess(). It turns out that the only critical thing > that InitXLOGAccess() does is call InitXLogInsert().[1] If a backend > doesn't call InitXLogInsert(), max_rdatas won't be initialized, and > the first attempt to insert WAL will probably fail with something like > "ERROR: too much WAL data". But there's no real reason why > InitXLogInsert() needs to happen only after recovery is finished. I think we should consider allocating that memory in postmaster, on !windows at least. Or, perhaps even better, to initially use some static variable, and only use a separate memory context when XLogEnsureRecordSpace(). Reserving all that memory in every process just needlessly increases our per-connection memory usage, for no good reason. > The > work that it does is just setting up data structures in backend-local > memory, and not much is lost if they are set up and not used. So, in > 0001, I propose to move the call to InitXLogInsert() from > InitXLOGAccess() to BaseInit(). That means every backend will do it > the first time it starts up. It also means that CreateCheckPoint() > will no longer need a special case call to InitXLogInsert(), which > seems like a good thing. Yes. Seems making BaseInit() being executed at a halfway consistent point is starting to have some benefits at least ;) > Nevertheless, what I did in 0002 is remove InitXLOGAccess() but move > the initialization of RedoRecPtr and doPageWrites into > GetFullPageWritesInfo(), only in the case where RedoRecPtr hasn't been > set yet. This has one property that I really really like, which is > that makes the code more understandable. There is no suggestion that > the correctness of WAL insertion somehow depends on a > RecoveryInProgress() call which may or may not have occurred at some > arbitrarily distant time in the past. Initializing the value at the > point of first use is just way clearer than initializing it as a > side-effect of some other function that's not even that tightly > connected. However, it does have the effect of introducing a branch > into GetFullPageWritesInfo(), which I'm not 100% sure is cheap enough > not to matter. Hm. Compared to the other costs of the XLogInsert do/while loop it probably doesn't matter. One nearly-always-false branch is vastly cheaper than going through the loop unnecessarily. Sure, the unnecessarily iteration saved will only be the first (for now, it might be worth to refresh the values more often), but there's enough branches in the body of the loop that I can't really see it mattering. Maybe kinda conceivably the cost of that branch could be an argument if GetFullPageWritesInfo() were inline in XLogInsert(). But it isn't. > I think the obvious alternative is to just not > initialize RedoRecPtr and doPageWrites at all and document in the > comments that the first time each backend does something with them > it's going to end up retrying once; perhaps that is preferable. Going > the other way, we could get rid of the global variables altogether and > have GetFullPageWrites() read the data from shared memory. However, > unless 8-byte atomicity is guaranteed, that requires a spinlock, so it > seems likely unappealing. I think it'd be fine to just make them 8byte atomics, which'd be lock-free on relevant platforms (at least once the aarch64 thing is fixed, there's a patch). XLogCtlInsert already takes care to ensure that RedoRecPtr/fullPageWrites are on a cacheline that's not constantly modified. If we really wanted to optimize the no-8-byte-single-copy-atomicity path, we could introduce a counter counting how many times RedoRecPtr/fullPageWrites have changed. But it seems unlikely to be worth it. Greetings, Andres Freund
Re: RecoveryInProgress() has critical side effects
On Mon, Nov 15, 2021 at 9:34 PM Michael Paquier wrote: > Patch 0001 means that the startup process would set up the structures > to be able to build WAL records before running any kind of recovery > action rather than doing it when it really needs it. That's fine by > me. Great, thanks. I think I'll go ahead and commit this one, then. > Is patch 0002 actually right regarding the handling of doPageWrites? > Once applied, we finish by setting it when the startup process starts > and not anymore at the end of recovery based on its the state of > Insert, but this could have changed while in recovery when replaying > one or more XLOG_FPW_CHANGE records. Maybe I'm not understanding you properly here, but it seems like you might be forgetting that this is a local variable and thus every backend is going to have something different. In the startup process, it will be initialized by StartupXLOG(); in other processes, it's currently initialized by RecoveryInProgress(), but with this patch it wouldn't be. Either way, it's then updated by future calls to XLogInsertRecord() as required. XLOG_FPW_CHANGE records might affect the new value that gets set the next time XLogInsertRecord(), but they don't directly affect doPageWrites. > > I'm personally not too worried about a ~4% regression in this > > particular benchmark... > > This is not a hot code path, that should be fine. OK. I'll wait a while and see if anyone else wants to weigh in. -- Robert Haas EDB: http://www.enterprisedb.com
Re: RecoveryInProgress() has critical side effects
On Mon, Nov 15, 2021 at 10:09:17PM +, Bossart, Nathan wrote: > On 11/15/21, 1:30 PM, "Robert Haas" wrote: >> Here's a new version that does it that way. Any other opinions? > > LGTM Patch 0001 means that the startup process would set up the structures to be able to build WAL records before running any kind of recovery action rather than doing it when it really needs it. That's fine by me. Is patch 0002 actually right regarding the handling of doPageWrites? Once applied, we finish by setting it when the startup process starts and not anymore at the end of recovery based on its the state of Insert, but this could have changed while in recovery when replaying one or more XLOG_FPW_CHANGE records. > I'm personally not too worried about a ~4% regression in this > particular benchmark... This is not a hot code path, that should be fine. -- Michael signature.asc Description: PGP signature
Re: RecoveryInProgress() has critical side effects
On 11/15/21, 1:30 PM, "Robert Haas" wrote: > Here's a new version that does it that way. Any other opinions? LGTM > The best thing I could come up with for a test case for this was to > try repeatedly making a new connection and running "SELECT > txid_current()", which will cause just one WAL record to be generated. > Unfortunately that path has overhead from a lot of other causes so I'm > not sure the results are very meaningful, but here they are: > > v1: 0.378 ms > v2: 0.391 ms > common base commit (10eae82b2): 0.376 ms I'm personally not too worried about a ~4% regression in this particular benchmark... Nathan
Re: RecoveryInProgress() has critical side effects
On Fri, Nov 12, 2021 at 10:23 AM Bossart, Nathan wrote: > 0001 looks reasonable to me. Cool. > Unless there's a demonstrable performance benefit from adding the > branch, my preference is to leave it out. I could be off-base, but it > seems possible that future changes might end up depending on any side > effects from this new branch, which is exactly what you're trying to > fix. Plus, always using the retry path is a good way to test that it > works as expected. Here's a new version that does it that way. Any other opinions? The best thing I could come up with for a test case for this was to try repeatedly making a new connection and running "SELECT txid_current()", which will cause just one WAL record to be generated. Unfortunately that path has overhead from a lot of other causes so I'm not sure the results are very meaningful, but here they are: v1: 0.378 ms v2: 0.391 ms common base commit (10eae82b2): 0.376 ms methodology: for i in `seq 1 1000`; do psql -c '\timing' -c 'select txid_current();'; done | grep '^Time:' | awk '{total+=$2} END {print total/NR}' run twice, discarding the first result, since the very first connection attempt after starting a new server process is notably slower -- Robert Haas EDB: http://www.enterprisedb.com v2-0002-Remove-InitXLOGAccess.patch Description: Binary data v2-0001-Move-InitXLogInsert-call-from-InitXLOGAccess-to-B.patch Description: Binary data
Re: RecoveryInProgress() has critical side effects
On 11/11/21, 9:20 AM, "Robert Haas" wrote: > memory, and not much is lost if they are set up and not used. So, in > 0001, I propose to move the call to InitXLogInsert() from > InitXLOGAccess() to BaseInit(). That means every backend will do it > the first time it starts up. It also means that CreateCheckPoint() > will no longer need a special case call to InitXLogInsert(), which > seems like a good thing. 0001 looks reasonable to me. > connected. However, it does have the effect of introducing a branch > into GetFullPageWritesInfo(), which I'm not 100% sure is cheap enough > not to matter. I think the obvious alternative is to just not > initialize RedoRecPtr and doPageWrites at all and document in the > comments that the first time each backend does something with them > it's going to end up retrying once; perhaps that is preferable. Going Unless there's a demonstrable performance benefit from adding the branch, my preference is to leave it out. I could be off-base, but it seems possible that future changes might end up depending on any side effects from this new branch, which is exactly what you're trying to fix. Plus, always using the retry path is a good way to test that it works as expected. Nathan