Re: Fsync (flush) all inserted WAL records
Dear All,

I would propose a new function like GetXLogInsertRecPtr(), but with some modifications (please see the attached patch). The resulting LSN can be passed to XLogFlush() safely; I believe it will not raise an error in any case. XLogFlush(GetXLogLastInsertEndRecPtr()) will flush (fsync) all records inserted up to that moment, which is what I would like to get.

I'm not sure we need an SQL function counterpart for this new C function, but it is not a big deal to implement.

With best regards,
Vitaly

On Monday, August 19, 2024 09:35 MSK, Michael Paquier wrote:

> On Wed, Aug 07, 2024 at 06:00:45PM +0300, Aleksander Alekseev wrote:
> > Assuming the function has value, as you claim, I see no reason not to
> > expose it similarly to pg_current_wal_*(). On top of that you will
> > have to test-cover it anyway. The easiest way to do it will be to have
> > an SQL-wrapper.
>
> I cannot be absolutely sure without seeing a patch, but adding SQL
> functions in this area is usually very useful for monitoring purposes
> of external solutions.
> --
> Michael

From ba82d6c6f8570fbbff14b4b52fa7720122bfb8ad Mon Sep 17 00:00:00 2001
From: Vitaly Davydov
Date: Tue, 20 Aug 2024 18:03:11 +0300
Subject: [PATCH] Add function to return the end LSN of the last inserted WAL record

---
 src/backend/access/transam/xlog.c | 19 +++
 src/include/access/xlog.h         |  1 +
 2 files changed, 20 insertions(+)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index ee0fb0e28f..1430aea6d5 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -9425,6 +9425,25 @@ GetXLogWriteRecPtr(void)
 	return LogwrtResult.Write;
 }
 
+/*
+ * Get the end pointer of the last inserted WAL record.
+ * The returned value will differ from the current insert pointer
+ * returned by GetXLogInsertRecPtr() if the last WAL record ends
+ * up at a page boundary.
+ */
+XLogRecPtr
+GetXLogLastInsertEndRecPtr(void)
+{
+	XLogCtlInsert *Insert = &XLogCtl->Insert;
+	uint64		current_bytepos;
+
+	SpinLockAcquire(&Insert->insertpos_lck);
+	current_bytepos = Insert->CurrBytePos;
+	SpinLockRelease(&Insert->insertpos_lck);
+
+	return XLogBytePosToEndRecPtr(current_bytepos);
+}
+
 /*
  * Returns the redo pointer of the last checkpoint or restartpoint. This is
  * the oldest point in WAL that we still need, if we have to restart recovery.
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index 083810f5b4..e98a825642 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -226,6 +226,7 @@ extern RecoveryState GetRecoveryState(void);
 extern bool XLogInsertAllowed(void);
 extern XLogRecPtr GetXLogInsertRecPtr(void);
 extern XLogRecPtr GetXLogWriteRecPtr(void);
+extern XLogRecPtr GetXLogLastInsertEndRecPtr(void);
 extern uint64 GetSystemIdentifier(void);
 extern char *GetMockAuthenticationNonce(void);
-- 
2.34.1
Re: Fsync (flush) all inserted WAL records
On Wednesday, August 07, 2024 16:55 MSK, Aleksander Alekseev wrote:

> Perhaps you could give more context on the use cases for this function?
> The value of it is not quite clear. What people typically need is making
> sure if a given LSN was fsync'ed and/or replicated and/or applied on a
> replica. Your case(s) however is different and I don't fully understand it.

I use asynchronous commit (no XLogFlush/fsync at commit). At some point I would like to XLogFlush (fsync) all transactions that have already been committed asynchronously, that is, WAL records that are inserted but not yet flushed. Assume that there are no active transactions at this moment and no potential race conditions. My problem is finding a proper LSN to pass to XLogFlush. I can't use GetXLogInsertRecPtr() because its result may be "in the future": when the last record ends at a page boundary, the page header size is added to it. XLogFlush will fail in this case.

> In any case you will need to implement an SQL-wrapper in order to make
> the function available to DBAs, cover it with tests and provide
> documentation.

Well, I would like to use such a function in C code, in some solution, not as a function to be used by users.

With best regards,
Vitaly

> Hi Vitaly,
>
> > I would propose a new function to fulfill my requirements like this (see
> > below) but I prefer not to create new functions unreasonably:
> >
> > XLogRecPtr
> > GetXLogLastInsertEndRecPtr(void)
> > {
> >     XLogCtlInsert *Insert = &XLogCtl->Insert;
> >     uint64 current_bytepos;
> >     SpinLockAcquire(&Insert->insertpos_lck);
> >     current_bytepos = Insert->CurrBytePos;
> >     SpinLockRelease(&Insert->insertpos_lck);
> >     return XLogBytePosToEndRecPtr(current_bytepos);
> > }
> >
> > This function differs from the existing GetXLogInsertRecPtr() by calling
> > XLogBytePosToEndRecPtr instead of XLogBytePosToRecPtr.
>
> In any case you will need to implement an SQL-wrapper in order to make
> the function available to DBAs, cover it with tests and provide
> documentation.
>
> --
> Best regards,
> Aleksander Alekseev
Re: Fsync (flush) all inserted WAL records
Hi Aleksander,

On Wednesday, August 07, 2024 12:19 MSK, Aleksander Alekseev wrote:

> > Does pg_current_wal_flush_lsn() [1] return what you need?
> >
> > [1]: https://www.postgresql.org/docs/current/functions-admin.html#FUNCTIONS-RECOVERY-CONTROL
>
> If not, take a look at its implementation and functions around,
> GetInsertRecPtr() and others. I believe you will find all you need for
> the task.

Thank you for the response. I need the LSN of the last WAL record that has been inserted but not yet flushed. The function pg_current_wal_flush_lsn() doesn't help: it returns the current flush position. GetInsertRecPtr() doesn't help either, because it returns XLogCtl->LogwrtRqst.Write, which is updated when a record crosses a page boundary. I looked at the code and haven't found any suitable function except GetLastImportantRecPtr(), but it returns the start LSN of the last inserted important record, and I need the end LSN.

I would propose a new function to fulfill my requirements like this (see below), but I prefer not to create new functions unreasonably:

XLogRecPtr
GetXLogLastInsertEndRecPtr(void)
{
    XLogCtlInsert *Insert = &XLogCtl->Insert;
    uint64 current_bytepos;

    SpinLockAcquire(&Insert->insertpos_lck);
    current_bytepos = Insert->CurrBytePos;
    SpinLockRelease(&Insert->insertpos_lck);

    return XLogBytePosToEndRecPtr(current_bytepos);
}

This function differs from the existing GetXLogInsertRecPtr() by calling XLogBytePosToEndRecPtr instead of XLogBytePosToRecPtr.

With best regards,
Vitaly
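The difference between the two conversions discussed above can be illustrated with a small standalone model. This is only a sketch, not the PostgreSQL source: the function names are hypothetical, and the constants (8 kB pages, 24-byte short and 40-byte long page headers, 16 MB segments) are assumed typical defaults rather than values read from a server. The model maps a "usable byte position" (WAL bytes excluding page headers) to an LSN in two ways: as the position where the next record would start, which skips the header of a fresh page, and as the position where the last record ended, which does not.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout constants (common defaults; not read from a real server). */
#define TOY_PAGE_SIZE 8192u                 /* WAL page size */
#define TOY_SHORT_HDR 24u                   /* header of an ordinary page */
#define TOY_LONG_HDR  40u                   /* header of a segment's first page */
#define TOY_SEG_SIZE  (16u * 1024u * 1024u) /* WAL segment size */

#define TOY_USABLE_PER_PAGE (TOY_PAGE_SIZE - TOY_SHORT_HDR)
#define TOY_USABLE_PER_SEG \
    ((uint64_t) (TOY_SEG_SIZE / TOY_PAGE_SIZE) * TOY_USABLE_PER_PAGE \
     - (TOY_LONG_HDR - TOY_SHORT_HDR))

/*
 * Hypothetical helper: convert a usable byte position to an LSN.
 * With as_end_of_record = true, a position that falls exactly on a page
 * boundary maps to the boundary itself; otherwise the header of the next
 * page is skipped, because that is where the next record would begin.
 */
uint64_t
toy_bytepos_to_lsn(uint64_t bytepos, bool as_end_of_record)
{
    uint64_t fullsegs = bytepos / TOY_USABLE_PER_SEG;
    uint64_t left = bytepos % TOY_USABLE_PER_SEG;
    uint64_t offset;

    if (left < TOY_PAGE_SIZE - TOY_LONG_HDR)
    {
        /* Position lies on the first page of the segment (long header). */
        offset = (as_end_of_record && left == 0) ? 0 : left + TOY_LONG_HDR;
    }
    else
    {
        /* Skip the first page, then count whole pages with short headers. */
        left -= TOY_PAGE_SIZE - TOY_LONG_HDR;
        uint64_t fullpages = left / TOY_USABLE_PER_PAGE;

        left %= TOY_USABLE_PER_PAGE;
        offset = TOY_PAGE_SIZE + fullpages * TOY_PAGE_SIZE;
        if (!(as_end_of_record && left == 0))
            offset += left + TOY_SHORT_HDR;
    }
    return fullsegs * TOY_SEG_SIZE + offset;
}

/* Models the "insert pointer" style of conversion. */
uint64_t
toy_insert_lsn(uint64_t bytepos)
{
    return toy_bytepos_to_lsn(bytepos, false);
}

/* Models the "end of last record" style of conversion. */
uint64_t
toy_last_record_end_lsn(uint64_t bytepos)
{
    return toy_bytepos_to_lsn(bytepos, true);
}

/* Small self-check that can be called from a test program. */
void
toy_wal_self_check(void)
{
    uint64_t first_page_full = TOY_PAGE_SIZE - TOY_LONG_HDR;

    /* In the middle of a page both conversions agree. */
    assert(toy_insert_lsn(100) == toy_last_record_end_lsn(100));

    /* At a page boundary the insert LSN is ahead by one page header. */
    assert(toy_last_record_end_lsn(first_page_full) == TOY_PAGE_SIZE);
    assert(toy_insert_lsn(first_page_full) == TOY_PAGE_SIZE + TOY_SHORT_HDR);
}
```

In this model the two results differ only when the byte position falls exactly on a page boundary, which matches the case described in the thread where the insert pointer is "in the future" relative to the WAL that actually exists.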
Fsync (flush) all inserted WAL records
Hi Hackers,

I use async commits. At some point, I would like to make sure that all inserted WAL records are fsync-ed. I can use the XLogFlush function, but I have some doubts about which LSN to specify. There are a number of functions that return write or insert LSNs, but they are not applicable.

I can't use GetXLogInsertRecPtr() because it returns the real insert LSN, not the end LSN of the last record. XLogFlush may fail with such an LSN because it may be "in the future" if the last WAL record ends at a page boundary (the page header size is added to the real insert LSN).

I can't use GetXLogWriteRecPtr() because it seems to be bound to page boundaries. Some inserted WAL records may not be fsync-ed.

Some other functions seem not to be applicable either.

The first idea is to use GetLastImportantRecPtr(), but this function returns the start LSN of the last important WAL record. I would use XLogFlush(GetLastImportantRecPtr() + 1), but I'm not sure that this way is conventional.

Another idea is to create a new function like GetXLogInsertRecPtr() that calls XLogBytePosToEndRecPtr() instead of XLogBytePosToRecPtr().

Could you please advise which way to go?

With best regards,
Vitaly
RE: Slow catchup of 2PC (twophase) transactions on replica in LR
Hi Kuroda-san,

Thank you very much for the patch. In general, it seems to work well for me, but there seems to be a memory access problem in libpqrcv_alter_slot -> quote_identifier when slot_name is NULL. It happens if the two_phase option is altered on a subscription without a slot. I think a simple check for NULL may fix the problem. I guess the same problem may exist for the failover option.

Another possible problem is related to my use case. I haven't reproduced this case; these are just some thoughts. I guess that when two_phase is ON, the PREPARE statement may be truncated from the WAL at checkpoint while COMMIT PREPARED is still kept in the WAL. On catchup, I would ask the master to send transactions from some restart LSN. I would like to get all such transactions completely, with their bodies, not only the COMMIT PREPARED messages. One solution is to have a slot option that keeps the WAL as with two_phase = OFF, independently of the slot's two_phase option. It is just an idea.

With best regards,
Vitaly
RE: Slow catchup of 2PC (twophase) transactions on replica in LR
Dear Hayato,

On Monday, April 22, 2024 15:54 MSK, "Hayato Kuroda (Fujitsu)" wrote:

> Dear Vitaly,
>
> > I looked at the patch and realized that I can't try it easily in the near
> > future because the solution I'm working on is based on PG16 or earlier.
> > This patch is not easily applicable to the older releases. I have to port
> > my solution to the master, which is not done yet.
>
> We also tried to port our patch for PG16, but the largest barrier was that a
> replication command ALTER_SLOT is not supported. Since the slot option
> two_phase can't be modified, it is difficult to skip decoding PREPARE command
> even when altering the option from true to false.
> IIUC, Adding a new feature (e.g., replication command) for minor updates is
> generally prohibited
> ...
>
> Attached patch set is a ported version for PG16, which breaks ABI. This can
> be used for testing purpose, but it won't be pushed to REL_16_STABLE. At
> least, this patchset can pass my github CI. Can you apply and check whether
> your issue is solved?

It is fantastic. Thank you for your help! I will definitely try your patch. I need some time to test and incorporate it. I also plan to port my stuff to the master branch to simplify testing of patches.

With best regards,
Vitaly Davydov
Re: How to accurately determine when a relation should use local buffers?
Hi Aleksander,

Thank you for the reply.

> Could you please provide a specific example when the current code will do
> something wrong/unintended?

I can't say that something is wrong in vanilla. But if you decide to replicate DDL in some solution like multimaster, you might want to replicate CREATE TEMPORARY TABLE. Furthermore, there is a possible inconsistency in the code shown below (REL_16_STABLE), in the bufmgr.c file:

- FlushRelationBuffers and PrefetchBuffer use RelationUsesLocalBuffers(rel).
- ExtendBufferedRel_common ultimately uses BufferManagerRelation.relpersistence, which is actually rd_rel->relpersistence, so it works like RelationUsesLocalBuffers.
- ReadBuffer_common uses isLocalBuf = SmgrIsTemp(smgr), which checks rlocator.backend against InvalidBackendId.

I would like to clarify: do we completely refuse the use of temporary tables in contexts other than backends, or is there some work in progress to allow other usage contexts? If we refuse it, the check of rd_rel->relpersistence is enough. I'm not sure why we use SmgrIsTemp instead of RelationUsesLocalBuffers in ReadBuffer_common.

With best regards,
Vitaly Davydov

On Tue, Nov 21, 2023 at 11:52, Aleksander Alekseev wrote:

> Hi,
>
> > I would like to clarify, what the correct way is to determine that a
> > given relation is using local buffers. Local buffers, as far as I know,
> > are used for temporary tables in backends. There are two functions/macros
> > (bufmgr.c): SmgrIsTemp, RelationUsesLocalBuffers. The first function
> > verifies that the current process is a regular session backend, while the
> > other macro verifies the relation persistence characteristic. It seems,
> > the use of each function independently is not correct. I think, these
> > functions should be applied in pair to check for local buffers use, but,
> > it seems, these functions are used independently. It works until
> > temporary tables are allowed only in session backends.
>
> Could you please provide a specific example when the current code will
> do something wrong/unintended?
>
> > I'm concerned, how to determine the use of local buffers in some other
> > theoretical cases? For example, if we decide to replicate temporary
> > tables? Are there the other cases, when local buffers can be used with
> > relations in the Vanilla? Do we allow the use of relations with
> > RELPERSISTENCE_TEMP not only in session backends?
>
> Temporary tables, by definition, are visible only within one session.
> I can't imagine how and why they would be replicated.
>
> --
> Best regards,
> Aleksander Alekseev

-- 
Best regards,
Vitaly Davydov
http://www.vdavydov.ru
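The two checks compared in this message can be put side by side in a toy model. This is a sketch under stated assumptions, not PostgreSQL code: ToyRelation and the toy_* functions are hypothetical stand-ins, and only the conventions they mimic (RELPERSISTENCE_TEMP being 't', an invalid backend id being -1, and the two predicates testing persistence versus the locator's backend) follow the description in the thread for REL_16_STABLE. The third case in the self-check is the theoretical one raised above, a temporary relation handled outside its owning session backend, which vanilla is not claimed to produce.

```c
#include <assert.h>
#include <stdbool.h>

/* Values follow PostgreSQL 16 conventions; the struct is a simplified stand-in. */
#define TOY_RELPERSISTENCE_PERMANENT 'p'
#define TOY_RELPERSISTENCE_TEMP      't'
#define TOY_INVALID_BACKEND_ID       (-1)

typedef struct ToyRelation
{
    char relpersistence;  /* stands in for rd_rel->relpersistence */
    int  locator_backend; /* stands in for smgr_rlocator.backend */
} ToyRelation;

/* Mimics the persistence-based check (RelationUsesLocalBuffers style). */
bool
toy_uses_local_buffers(const ToyRelation *rel)
{
    return rel->relpersistence == TOY_RELPERSISTENCE_TEMP;
}

/* Mimics the locator-based check (SmgrIsTemp style). */
bool
toy_smgr_is_temp(const ToyRelation *rel)
{
    return rel->locator_backend != TOY_INVALID_BACKEND_ID;
}

/* True when both checks give the same answer for this relation. */
bool
toy_checks_agree(const ToyRelation *rel)
{
    return toy_uses_local_buffers(rel) == toy_smgr_is_temp(rel);
}

/* Small self-check that can be called from a test program. */
void
toy_local_buffer_self_check(void)
{
    /* Temporary table in its owning session backend (backend id 3 is arbitrary). */
    ToyRelation session_temp = {TOY_RELPERSISTENCE_TEMP, 3};
    /* Ordinary permanent table. */
    ToyRelation permanent = {TOY_RELPERSISTENCE_PERMANENT, TOY_INVALID_BACKEND_ID};
    /* Hypothetical: temp persistence seen without an owning backend. */
    ToyRelation foreign_temp = {TOY_RELPERSISTENCE_TEMP, TOY_INVALID_BACKEND_ID};

    assert(toy_checks_agree(&session_temp));
    assert(toy_checks_agree(&permanent));
    assert(!toy_checks_agree(&foreign_temp));
}
```

As long as temporary relations exist only in session backends, the first two cases are the only ones that occur and the checks are interchangeable; the question in the thread is what should happen if the third case ever becomes possible.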