Thanks Rahila for reviewing this patch. On Tue, Mar 14, 2017 at 8:13 PM, Rahila Syed <rahilasye...@gmail.com> wrote:
> Hello, > > I applied and tested this patch on latest sources and it works fine. > > Following are some comments, > > >+ /* Wait event for SNRU */ > >+ WAIT_EVENT_READ_SLRU_PAGE, > Typo in the comment. > > Fixed. > >FileWriteback(v->mdfd_vfd, seekpos, (off_t) BLCKSZ * nflush, > WAIT_EVENT_FLUSH_DATA_BLOCK); > This call is inside mdwriteback() which can flush more than one block so > should WAIT_EVENT _FLUSH_DATA_BLOCK > be renamed to WAIT_EVENT_FLUSH_DATA_BLOCKS? > > Changed with WAIT_EVENT_FLUSH_DATA_BLOCKS. > Should calls to write() in following functions be tracked too? > qtext_store() - This is related to pg_stat_statements > > I am not quite sure about this, as this is for stat statements. Also part from the place you found there are many other fwrite() call into pg_stat_statements, and I intentionally haven't added event here as it is very small write about stat, and doesn't look like we should add for those call. > dsm_impl_mmap() - This is in relation to creating dsm segments. > > Added new event here. Actually particular write call is zero-filling the DSM file. > write_auto_conf_file()- This is called when updated configuration > parameters are > written to a temp file. > > write_auto_conf_file() is getting called during the ALTER SYSTEM call. Here write happen only when someone explicitly run the ALTER SYSTEM call. This is administrator call and so doesn't seem like necessary to add separate wait event for this. PFA latest patch with other fixes. > > On Wed, Mar 8, 2017 at 4:50 PM, Rushabh Lathia <rushabh.lat...@gmail.com> > wrote: > >> >> >> On Wed, Mar 8, 2017 at 8:23 AM, Robert Haas <robertmh...@gmail.com> >> wrote: >> >>> On Tue, Mar 7, 2017 at 9:32 PM, Amit Kapila <amit.kapil...@gmail.com> >>> wrote: >>> > On Tue, Mar 7, 2017 at 9:16 PM, Robert Haas <robertmh...@gmail.com> >>> wrote: >>> >> On Mon, Mar 6, 2017 at 9:09 PM, Amit Kapila <amit.kapil...@gmail.com> >>> wrote: >>> >>> Sure, if you think both Writes and Reads at OS level can have some >>> >>> chance of blocking in obscure cases, then we should add a wait event >>> >>> for them. >>> >> >>> >> I think writes have a chance of blocking in cases even in cases that >>> >> are not very obscure at all. >>> > >>> > Point taken for writes, but I think in general we should have some >>> > criteria based on which we can decide whether to have a wait event for >>> > a particular call. It should not happen that we have tons of wait >>> > events and out of which, only a few are helpful in most of the cases >>> > in real-world scenarios. >>> >>> Well, the problem is that if you pick and choose which wait events to >>> add based on what you think will be common, you're actually kind of >>> hosing yourself. Because now when something uncommon happens, suddenly >>> you don't get any wait event data and you can't tell what's happening. >>> I think the number of new wait events added by Rushabh's patch is >>> wholly reasonable. Yeah, some of those are going to be a lot more >>> common than others, but so what? We add wait events so that we can >>> find out what's going on. I don't want to sometimes know when a >>> backend is blocked on an I/O. I want to ALWAYS know. >>> >>> >> Yes, I agree with Robert. Knowing what we want and what we don't >> want is difficult to judge. Something which we might think its not useful >> information, and later of end up into situation where we re-think about >> adding those missing stuff is not good. Having more information about >> the system, specially for monitoring purpose is always good. >> >> I am attaching another version of the patch, as I found stupid mistake >> in the earlier version of patch, where I missed to initialize initial >> value to >> WaitEventIO enum. Also earlier version was not getting cleanly apply on >> the current version of sources. >> >> >> >> -- >> Rushabh Lathia >> EnterpriseDB: http://www.enterprisedb.com >> The Enterprise PostgreSQL Company >> >> >> >> -- >> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) >> To make changes to your subscription: >> http://www.postgresql.org/mailpref/pgsql-hackers >> >> > Regards, Rushabh Lathia www.EnterpriseDB.com The Enterprise PostgreSQL Company
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml index 4d03531..45fedec 100644 --- a/doc/src/sgml/monitoring.sgml +++ b/doc/src/sgml/monitoring.sgml @@ -716,6 +716,12 @@ postgres 27093 0.0 0.0 30096 2752 ? Ss 11:34 0:00 postgres: ser point. </para> </listitem> + <listitem> + <para> + <literal>IO</>: The server process is waiting for a IO to complete. + <literal>wait_event</> will identify the specific wait point. + </para> + </listitem> </itemizedlist> </entry> </row> @@ -1272,6 +1278,276 @@ postgres 27093 0.0 0.0 30096 2752 ? Ss 11:34 0:00 postgres: ser <entry><literal>RecoveryApplyDelay</></entry> <entry>Waiting to apply WAL at recovery because it is delayed.</entry> </row> + <row> + <entry morerows="65"><literal>IO</></entry> + <entry><literal>ReadDataBlock</></entry> + <entry>Waiting during relation data block read.</entry> + </row> + <row> + <entry><literal>WriteDataBlock</></entry> + <entry>Waiting during relation data block write.</entry> + </row> + <row> + <entry><literal>SyncDataBlock</></entry> + <entry>Waiting during relation data block sync.</entry> + </row> + <row> + <entry><literal>ExtendDataBlock</></entry> + <entry>Waiting during add a block to the relation.</entry> + </row> + <row> + <entry><literal>FlushDataBlocks</></entry> + <entry>Waiting during write pages back to storage.</entry> + </row> + <row> + <entry><literal>PrefetchDataBlock</></entry> + <entry>Waiting during asynchronous read of the specified block of a relation.</entry> + </row> + <row> + <entry><literal>TruncateDataBlock</></entry> + <entry>Waiting during truncate relation to specified number of blocks.</entry> + </row> + <row> + <entry><literal>SyncRelation</></entry> + <entry>Waiting during sync writes to stable storage.</entry> + </row> + <row> + <entry><literal>SyncImmedRelation</></entry> + <entry>Waiting during immediate sync a relation to stable storage.</entry> + </row> + <row> + <entry><literal>WriteRewriteDataBlock</></entry> + <entry>Waiting to write data block during rewrite heap.</entry> + </row> + <row> + <entry><literal>SyncRewriteDataBlock</></entry> + <entry>Waiting to sync data block during rewrite heap.</entry> + </row> + <row> + <entry><literal>ReadBuffile</></entry> + <entry>Waiting during buffile read operation.</entry> + </row> + <row> + <entry><literal>WriteBuffile</></entry> + <entry>Waiting during buffile write operation.</entry> + </row> + <row> + <entry><literal>ReadXLog</></entry> + <entry>Waiting during read the XLOG page.</entry> + </row> + <row> + <entry><literal>ReadCopyXLog</></entry> + <entry>Wait to read the XLOG page during create a new XLOG file segment by copying a pre-existing one.</entry> + </row> + <row> + <entry><literal>WriteXLog</></entry> + <entry>Waiting during write the XLOG page.</entry> + </row> + <row> + <entry><literal>WriteInitXLogFile</></entry> + <entry>Waiting to write the XLOG page during XLOG file initialization.</entry> + </row> + <row> + <entry><literal>WriteCopyXLogFile</></entry> + <entry>Waiting to write the XLOG page during create a new XLOG file segment by copying a pre-existing one.</entry> + </row> + <row> + <entry><literal>WriteBootstrapXLog</></entry> + <entry>Waiting to write the XLOG page during bootstrap.</entry> + </row> + <row> + <entry><literal>SyncInitXLogFile</></entry> + <entry>Waiting to sync the XLOG page during XLOG file initialization.</entry> + </row> + <row> + <entry><literal>SyncCopyXLogFile</></entry> + <entry>Waiting to sync the XLOG page during create a new XLOG file segment by copying a pre-existing one.</entry> + </row> + <row> + <entry><literal>SyncBootstrapXLog</></entry> + <entry>Waiting to sync the XLOG page during bootstrap.</entry> + </row> + <row> + <entry><literal>SyncAssignXLogMethod</></entry> + <entry>Waiting to assign xlog sync method.</entry> + </row> + <row> + <entry><literal>WriteControlFile</></entry> + <entry>Waiting to write the control file.</entry> + </row> + <row> + <entry><literal>WriteUpdateControlFile</></entry> + <entry>Waiting to write the control file during update control file.</entry> + </row> + <row> + <entry><literal>ReadControlFile</></entry> + <entry>Waiting to read the control file.</entry> + </row> + <row> + <entry><literal>SyncWriteControlFile</></entry> + <entry>Waiting to sync the control file.</entry> + </row> + <row> + <entry><literal>SyncUpdateControlFile</></entry> + <entry>Waiting to sync the control file during update control file.</entry> + </row> + <row> + <entry><literal>ReadApplyLogicalMapping</></entry> + <entry>Waiting to read logical mapping during apply a single mapping file.</entry> + </row> + <row> + <entry><literal>WriteLogicalMappingRewrite</></entry> + <entry>Waiting to write logical mapping during xlog logical rewrite.</entry> + </row> + <row> + <entry><literal>SyncLogicalMappingRewrite</></entry> + <entry>Waiting to sync logical mapping during xlog logical rewrite.</entry> + </row> + <row> + <entry><literal>SyncLogicalMappingRewriteHeap</></entry> + <entry>Waiting to sync logical mapping during a checkpoint for logical rewrite mappings.</entry> + </row> + <row> + <entry><literal>TruncateLogicalMappingRewrite</></entry> + <entry>Waiting to truncate logical mapping during xlog logical rewrite.</entry> + </row> + <row> + <entry><literal>WriteSnapbuildSerialize</></entry> + <entry>Waiting to write snapshot during serialize the snapshot.</entry> + </row> + <row> + <entry><literal>ReadSnapbuildSerialize</></entry> + <entry>Waiting to read snapshot during serialize the snapshot.</entry> + </row> + <row> + <entry><literal>SyncSnapbuildSerialize</></entry> + <entry>Waiting to sync snapshot during serialize the snapshot.</entry> + </row> + <row> + <entry><literal>ReadSLRUPage</></entry> + <entry>Waiting to read page during physical read of a page into a buffer slot.</entry> + </row> + <row> + <entry><literal>WriteSLRUPage</></entry> + <entry>Waiting to write page during physical write of a page from a buffer slot.</entry> + </row> + <row> + <entry><literal>SyncSLRUWritePage</></entry> + <entry>Waiting to sync page during physical write of a page from a buffer slot.</entry> + </row> + <row> + <entry><literal>SyncSLRUFlush</></entry> + <entry>Waiting to sync page during flush dirty pages to disk during checkpoint or database shutdown.</entry> + </row> + <row> + <entry><literal>ReadTimelineHistoryWrite</></entry> + <entry>Waiting to read timeline history during write timeline history.</entry> + </row> + <row> + <entry><literal>ReadTimelineHistoryWalsender</></entry> + <entry>Waiting to read timeline history during walsander timeline command.</entry> + </row> + <row> + <entry><literal>WriteTimelineHistory</></entry> + <entry>Waiting to write timeline history.</entry> + </row> + <row> + <entry><literal>WriteTimelineHistoryFile</></entry> + <entry>Waiting to write timeline history during a history file write for given timeline and contents.</entry> + </row> + <row> + <entry><literal>SyncTimelineHistoryWrite</></entry> + <entry>Waiting to sync timeline history during write timeline history</entry> + </row> + <row> + <entry><literal>SyncTimelineHistoryFile</></entry> + <entry>Waiting to sync timeline history during a history file write for given timeline and contents.</entry> + </row> + <row> + <entry><literal>ReadTwophaseFile</></entry> + <entry>Waiting to read two phase file.</entry> + </row> + <row> + <entry><literal>WriteRecreateTwophaseFile</></entry> + <entry>Waiting to write two phase file during recreate two phase file.</entry> + </row> + <row> + <entry><literal>SyncRecreateTwophaseFile</></entry> + <entry>Waiting to sync two phase file during recreate two phase file.</entry> + </row> + <row> + <entry><literal>ReadSysloggerFile</></entry> + <entry>Wait during read syslogger file.</entry> + </row> + <row> + <entry><literal>WriteSysloggerFile</></entry> + <entry>Wait during write syslogger file.</entry> + </row> + <row> + <entry><literal>ReadRestorREPLSlot</></entry> + <entry>Wait to read REPL slot during load a single slot from disk into memory.</entry> + </row> + <row> + <entry><literal>WriteREPLSlot</></entry> + <entry>Wait to write REPL slot.</entry> + </row> + <row> + <entry><literal>SyncRestoreREPLSlot</></entry> + <entry>Wait to sync REPL slot during load a single slot from disk into memory.</entry> + </row> + <row> + <entry><literal>ReadCopyFile</></entry> + <entry>Waiting to read during copy file.</entry> + </row> + <row> + <entry><literal>WriteCopyFile</></entry> + <entry>Waiting to write during copy file.</entry> + </row> + <row> + <entry><literal>ReadLoadRELMAPFile</></entry> + <entry>Waiting to read RELMAP file.</entry> + </row> + <row> + <entry><literal>WriteLoadRELMAPFile</></entry> + <entry>Waiting to write RELMAP file.</entry> + </row> + <row> + <entry><literal>SyncLoadRELMAPFile</></entry> + <entry>Waiting to sync RELMAP file.</entry> + </row> + <row> + <entry><literal>ReadCreateLockFile</></entry> + <entry>Wait to read lock file during create lock file.</entry> + </row> + <row> + <entry><literal>ReadAddToDataDirLockFile</></entry> + <entry>Wait to read lock file during add a line in the data directory lock file.</entry> + </row> + <row> + <entry><literal>ReadRecheckDataDirLockFile</></entry> + <entry>Wait to read lock file during recheck that the data directory lock file.</entry> + </row> + <row> + <entry><literal>WriteCreateLockFile</></entry> + <entry>Wait to write lock file during create lock file.</entry> + </row> + <row> + <entry><literal>WriteAddToDataDirLockFile</></entry> + <entry>Wait to write lock file during add a line in the data directory lock file.</entry> + </row> + <row> + <entry><literal>SyncCreateLockFile</></entry> + <entry>Wait to sync lock file during create lock file.</entry> + </row> + <row> + <entry><literal>SyncAddToDataDirLockFile</></entry> + <entry>Wait to sync lock file during add a line in the data directory lock file.</entry> + </row> + <row> + <entry><literal>WriteZeroFillDSM</></entry> + <entry>Wait to write during zero-fill of DSM file.</entry> + </row> + </tbody> </tgroup> </table> diff --git a/src/backend/access/heap/rewriteheap.c b/src/backend/access/heap/rewriteheap.c index c7b283c..2456500 100644 --- a/src/backend/access/heap/rewriteheap.c +++ b/src/backend/access/heap/rewriteheap.c @@ -119,6 +119,8 @@ #include "lib/ilist.h" +#include "pgstat.h" + #include "replication/logical.h" #include "replication/slot.h" @@ -916,7 +918,8 @@ logical_heap_rewrite_flush_mappings(RewriteState state) * Note that we deviate from the usual WAL coding practices here, * check the above "Logical rewrite support" comment for reasoning. */ - written = FileWrite(src->vfd, waldata_start, len); + written = FileWrite(src->vfd, waldata_start, len, + WAIT_EVENT_WRITE_REWRITE_DATA_BLOCK); if (written != len) ereport(ERROR, (errcode_for_file_access(), @@ -957,7 +960,7 @@ logical_end_heap_rewrite(RewriteState state) hash_seq_init(&seq_status, state->rs_logical_mappings); while ((src = (RewriteMappingFile *) hash_seq_search(&seq_status)) != NULL) { - if (FileSync(src->vfd) != 0) + if (FileSync(src->vfd, WAIT_EVENT_SYNC_REWRITE_DATA_BLOCK) != 0) ereport(ERROR, (errcode_for_file_access(), errmsg("could not fsync file \"%s\": %m", src->path))); @@ -1141,11 +1144,13 @@ heap_xlog_logical_rewrite(XLogReaderState *r) * Truncate all data that's not guaranteed to have been safely fsynced (by * previous record or by the last checkpoint). */ + pgstat_report_wait_start(WAIT_EVENT_TRUNCATE_LOGICAL_MAPPING_REWRITE); if (ftruncate(fd, xlrec->offset) != 0) ereport(ERROR, (errcode_for_file_access(), errmsg("could not truncate file \"%s\" to %u: %m", path, (uint32) xlrec->offset))); + pgstat_report_wait_end(); /* now seek to the position we want to write our data to */ if (lseek(fd, xlrec->offset, SEEK_SET) != xlrec->offset) @@ -1159,20 +1164,24 @@ heap_xlog_logical_rewrite(XLogReaderState *r) len = xlrec->num_mappings * sizeof(LogicalRewriteMappingData); /* write out tail end of mapping file (again) */ + pgstat_report_wait_start(WAIT_EVENT_WRITE_LOGICAL_MAPPING_REWRITE); if (write(fd, data, len) != len) ereport(ERROR, (errcode_for_file_access(), errmsg("could not write to file \"%s\": %m", path))); + pgstat_report_wait_end(); /* * Now fsync all previously written data. We could improve things and only * do this for the last write to a file, but the required bookkeeping * doesn't seem worth the trouble. */ + pgstat_report_wait_start(WAIT_EVENT_SYNC_LOGICAL_MAPPING_REWRITE); if (pg_fsync(fd) != 0) ereport(ERROR, (errcode_for_file_access(), errmsg("could not fsync file \"%s\": %m", path))); + pgstat_report_wait_end(); CloseTransientFile(fd); } @@ -1266,10 +1275,12 @@ CheckPointLogicalRewriteHeap(void) * changed or have only been created since the checkpoint's start, * but it's currently not deemed worth the effort. */ - else if (pg_fsync(fd) != 0) + pgstat_report_wait_start(WAIT_EVENT_SYNC_LOGICAL_MAPPING_REWRITE_HEAP); + if (pg_fsync(fd) != 0) ereport(ERROR, (errcode_for_file_access(), errmsg("could not fsync file \"%s\": %m", path))); + pgstat_report_wait_end(); CloseTransientFile(fd); } } diff --git a/src/backend/access/transam/slru.c b/src/backend/access/transam/slru.c index a66ef5c..237e8eb 100644 --- a/src/backend/access/transam/slru.c +++ b/src/backend/access/transam/slru.c @@ -54,6 +54,7 @@ #include "access/slru.h" #include "access/transam.h" #include "access/xlog.h" +#include "pgstat.h" #include "storage/fd.h" #include "storage/shmem.h" #include "miscadmin.h" @@ -675,13 +676,16 @@ SlruPhysicalReadPage(SlruCtl ctl, int pageno, int slotno) } errno = 0; + pgstat_report_wait_start(WAIT_EVENT_READ_SLRU_PAGE); if (read(fd, shared->page_buffer[slotno], BLCKSZ) != BLCKSZ) { + pgstat_report_wait_end(); slru_errcause = SLRU_READ_FAILED; slru_errno = errno; CloseTransientFile(fd); return false; } + pgstat_report_wait_end(); if (CloseTransientFile(fd)) { @@ -834,8 +838,10 @@ SlruPhysicalWritePage(SlruCtl ctl, int pageno, int slotno, SlruFlush fdata) } errno = 0; + pgstat_report_wait_start(WAIT_EVENT_WRITE_SLRU_PAGE); if (write(fd, shared->page_buffer[slotno], BLCKSZ) != BLCKSZ) { + pgstat_report_wait_end(); /* if write didn't set errno, assume problem is no disk space */ if (errno == 0) errno = ENOSPC; @@ -845,6 +851,7 @@ SlruPhysicalWritePage(SlruCtl ctl, int pageno, int slotno, SlruFlush fdata) CloseTransientFile(fd); return false; } + pgstat_report_wait_end(); /* * If not part of Flush, need to fsync now. We assume this happens @@ -852,6 +859,7 @@ SlruPhysicalWritePage(SlruCtl ctl, int pageno, int slotno, SlruFlush fdata) */ if (!fdata) { + pgstat_report_wait_start(WAIT_EVENT_SYNC_SLRU_WRITE_PAGE); if (ctl->do_fsync && pg_fsync(fd)) { slru_errcause = SLRU_FSYNC_FAILED; @@ -859,6 +867,7 @@ SlruPhysicalWritePage(SlruCtl ctl, int pageno, int slotno, SlruFlush fdata) CloseTransientFile(fd); return false; } + pgstat_report_wait_end(); if (CloseTransientFile(fd)) { @@ -1126,6 +1135,7 @@ SimpleLruFlush(SlruCtl ctl, bool allow_redirtied) ok = true; for (i = 0; i < fdata.num_files; i++) { + pgstat_report_wait_start(WAIT_EVENT_SYNC_SLRU_FLUSH); if (ctl->do_fsync && pg_fsync(fdata.fd[i])) { slru_errcause = SLRU_FSYNC_FAILED; @@ -1133,6 +1143,7 @@ SimpleLruFlush(SlruCtl ctl, bool allow_redirtied) pageno = fdata.segno[i] * SLRU_PAGES_PER_SEGMENT; ok = false; } + pgstat_report_wait_end(); if (CloseTransientFile(fdata.fd[i])) { diff --git a/src/backend/access/transam/timeline.c b/src/backend/access/transam/timeline.c index 1fdc591..5ca811f 100644 --- a/src/backend/access/transam/timeline.c +++ b/src/backend/access/transam/timeline.c @@ -38,6 +38,7 @@ #include "access/xlog.h" #include "access/xlog_internal.h" #include "access/xlogdefs.h" +#include "pgstat.h" #include "storage/fd.h" /* @@ -338,7 +339,9 @@ writeTimeLineHistory(TimeLineID newTLI, TimeLineID parentTLI, for (;;) { errno = 0; + pgstat_report_wait_start(WAIT_EVENT_READ_TIMELINE_HISTORY_WRITE); nbytes = (int) read(srcfd, buffer, sizeof(buffer)); + pgstat_report_wait_end(); if (nbytes < 0 || errno != 0) ereport(ERROR, (errcode_for_file_access(), @@ -346,6 +349,7 @@ writeTimeLineHistory(TimeLineID newTLI, TimeLineID parentTLI, if (nbytes == 0) break; errno = 0; + pgstat_report_wait_start(WAIT_EVENT_WRITE_TIMELINE_HISTORY); if ((int) write(fd, buffer, nbytes) != nbytes) { int save_errno = errno; @@ -365,6 +369,7 @@ writeTimeLineHistory(TimeLineID newTLI, TimeLineID parentTLI, (errcode_for_file_access(), errmsg("could not write to file \"%s\": %m", tmppath))); } + pgstat_report_wait_end(); } CloseTransientFile(srcfd); } @@ -400,10 +405,12 @@ writeTimeLineHistory(TimeLineID newTLI, TimeLineID parentTLI, errmsg("could not write to file \"%s\": %m", tmppath))); } + pgstat_report_wait_start(WAIT_EVENT_SYNC_TIMELINE_HISTORY_WRITE); if (pg_fsync(fd) != 0) ereport(ERROR, (errcode_for_file_access(), errmsg("could not fsync file \"%s\": %m", tmppath))); + pgstat_report_wait_end(); if (CloseTransientFile(fd)) ereport(ERROR, @@ -460,6 +467,7 @@ writeTimeLineHistoryFile(TimeLineID tli, char *content, int size) errmsg("could not create file \"%s\": %m", tmppath))); errno = 0; + pgstat_report_wait_start(WAIT_EVENT_WRITE_TIMELINE_HISTORY_FILE); if ((int) write(fd, content, size) != size) { int save_errno = errno; @@ -475,11 +483,14 @@ writeTimeLineHistoryFile(TimeLineID tli, char *content, int size) (errcode_for_file_access(), errmsg("could not write to file \"%s\": %m", tmppath))); } + pgstat_report_wait_end(); + pgstat_report_wait_start(WAIT_EVENT_SYNC_TIMELINE_HISTORY_FILE); if (pg_fsync(fd) != 0) ereport(ERROR, (errcode_for_file_access(), errmsg("could not fsync file \"%s\": %m", tmppath))); + pgstat_report_wait_end(); if (CloseTransientFile(fd)) ereport(ERROR, diff --git a/src/backend/access/transam/twophase.c b/src/backend/access/transam/twophase.c index 5cefc43..35f2a25 100644 --- a/src/backend/access/transam/twophase.c +++ b/src/backend/access/transam/twophase.c @@ -1200,8 +1200,10 @@ ReadTwoPhaseFile(TransactionId xid, bool give_warnings) */ buf = (char *) palloc(stat.st_size); + pgstat_report_wait_start(WAIT_EVENT_READ_TWOPHASE_FILE); if (read(fd, buf, stat.st_size) != stat.st_size) { + pgstat_report_wait_end(); CloseTransientFile(fd); if (give_warnings) ereport(WARNING, @@ -1212,6 +1214,7 @@ ReadTwoPhaseFile(TransactionId xid, bool give_warnings) return NULL; } + pgstat_report_wait_end(); CloseTransientFile(fd); hdr = (TwoPhaseFileHeader *) buf; @@ -1542,8 +1545,10 @@ RecreateTwoPhaseFile(TransactionId xid, void *content, int len) path))); /* Write content and CRC */ + pgstat_report_wait_start(WAIT_EVENT_WRITE_RECREATE_TWOPHASE_FILE); if (write(fd, content, len) != len) { + pgstat_report_wait_end(); CloseTransientFile(fd); ereport(ERROR, (errcode_for_file_access(), @@ -1551,16 +1556,19 @@ RecreateTwoPhaseFile(TransactionId xid, void *content, int len) } if (write(fd, &statefile_crc, sizeof(pg_crc32c)) != sizeof(pg_crc32c)) { + pgstat_report_wait_end(); CloseTransientFile(fd); ereport(ERROR, (errcode_for_file_access(), errmsg("could not write two-phase state file: %m"))); } + pgstat_report_wait_end(); /* * We must fsync the file because the end-of-replay checkpoint will not do * so, there being no GXACT in shared memory yet to tell it to. */ + pgstat_report_wait_start(WAIT_EVENT_SYNC_RECREATE_TWOPHASE_FILE); if (pg_fsync(fd) != 0) { CloseTransientFile(fd); @@ -1568,6 +1576,7 @@ RecreateTwoPhaseFile(TransactionId xid, void *content, int len) (errcode_for_file_access(), errmsg("could not fsync two-phase state file: %m"))); } + pgstat_report_wait_end(); if (CloseTransientFile(fd) != 0) ereport(ERROR, diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index c0e5362..ba18954 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -2456,7 +2456,9 @@ XLogWrite(XLogwrtRqst WriteRqst, bool flexible) do { errno = 0; + pgstat_report_wait_start(WAIT_EVENT_WRITE_XLOG); written = write(openLogFile, from, nleft); + pgstat_report_wait_end(); if (written <= 0) { if (errno == EINTR) @@ -3207,6 +3209,7 @@ XLogFileInit(XLogSegNo logsegno, bool *use_existent, bool use_lock) for (nbytes = 0; nbytes < XLogSegSize; nbytes += XLOG_BLCKSZ) { errno = 0; + pgstat_report_wait_start(WAIT_EVENT_WRITE_INIT_XLOG_FILE); if ((int) write(fd, zbuffer, XLOG_BLCKSZ) != (int) XLOG_BLCKSZ) { int save_errno = errno; @@ -3225,8 +3228,10 @@ XLogFileInit(XLogSegNo logsegno, bool *use_existent, bool use_lock) (errcode_for_file_access(), errmsg("could not write to file \"%s\": %m", tmppath))); } + pgstat_report_wait_end(); } + pgstat_report_wait_start(WAIT_EVENT_SYNC_INIT_XLOG_FILE); if (pg_fsync(fd) != 0) { close(fd); @@ -3234,6 +3239,7 @@ XLogFileInit(XLogSegNo logsegno, bool *use_existent, bool use_lock) (errcode_for_file_access(), errmsg("could not fsync file \"%s\": %m", tmppath))); } + pgstat_report_wait_end(); if (close(fd)) ereport(ERROR, @@ -3360,6 +3366,7 @@ XLogFileCopy(XLogSegNo destsegno, TimeLineID srcTLI, XLogSegNo srcsegno, if (nread > sizeof(buffer)) nread = sizeof(buffer); errno = 0; + pgstat_report_wait_start(WAIT_EVENT_READ_COPY_XLOG); if (read(srcfd, buffer, nread) != nread) { if (errno != 0) @@ -3372,8 +3379,10 @@ XLogFileCopy(XLogSegNo destsegno, TimeLineID srcTLI, XLogSegNo srcsegno, (errmsg("not enough data in file \"%s\"", path))); } + pgstat_report_wait_end(); } errno = 0; + pgstat_report_wait_start(WAIT_EVENT_WRITE_COPY_XLOG_FILE); if ((int) write(fd, buffer, sizeof(buffer)) != (int) sizeof(buffer)) { int save_errno = errno; @@ -3389,12 +3398,15 @@ XLogFileCopy(XLogSegNo destsegno, TimeLineID srcTLI, XLogSegNo srcsegno, (errcode_for_file_access(), errmsg("could not write to file \"%s\": %m", tmppath))); } + pgstat_report_wait_end(); } + pgstat_report_wait_start(WAIT_EVENT_SYNC_COPY_XLOG_FILE); if (pg_fsync(fd) != 0) ereport(ERROR, (errcode_for_file_access(), errmsg("could not fsync file \"%s\": %m", tmppath))); + pgstat_report_wait_end(); if (CloseTransientFile(fd)) ereport(ERROR, @@ -4414,6 +4426,7 @@ WriteControlFile(void) XLOG_CONTROL_FILE))); errno = 0; + pgstat_report_wait_start(WAIT_EVENT_WRITE_CONTROL_FILE); if (write(fd, buffer, PG_CONTROL_SIZE) != PG_CONTROL_SIZE) { /* if write didn't set errno, assume problem is no disk space */ @@ -4423,11 +4436,14 @@ WriteControlFile(void) (errcode_for_file_access(), errmsg("could not write to control file: %m"))); } + pgstat_report_wait_end(); + pgstat_report_wait_start(WAIT_EVENT_SYNC_WRITE_CONTROL_FILE); if (pg_fsync(fd) != 0) ereport(PANIC, (errcode_for_file_access(), errmsg("could not fsync control file: %m"))); + pgstat_report_wait_end(); if (close(fd)) ereport(PANIC, @@ -4453,10 +4469,12 @@ ReadControlFile(void) errmsg("could not open control file \"%s\": %m", XLOG_CONTROL_FILE))); + pgstat_report_wait_start(WAIT_EVENT_READ_CONTROL_FILE); if (read(fd, ControlFile, sizeof(ControlFileData)) != sizeof(ControlFileData)) ereport(PANIC, (errcode_for_file_access(), errmsg("could not read from control file: %m"))); + pgstat_report_wait_end(); close(fd); @@ -4634,6 +4652,7 @@ UpdateControlFile(void) XLOG_CONTROL_FILE))); errno = 0; + pgstat_report_wait_start(WAIT_EVENT_WRITE_UPDATE_CONTROL_FILE); if (write(fd, ControlFile, sizeof(ControlFileData)) != sizeof(ControlFileData)) { /* if write didn't set errno, assume problem is no disk space */ @@ -4643,11 +4662,14 @@ UpdateControlFile(void) (errcode_for_file_access(), errmsg("could not write to control file: %m"))); } + pgstat_report_wait_end(); + pgstat_report_wait_start(WAIT_EVENT_SYNC_UPDATE_CONTROL_FILE); if (pg_fsync(fd) != 0) ereport(PANIC, (errcode_for_file_access(), errmsg("could not fsync control file: %m"))); + pgstat_report_wait_end(); if (close(fd)) ereport(PANIC, @@ -5036,6 +5058,7 @@ BootStrapXLOG(void) /* Write the first page with the initial record */ errno = 0; + pgstat_report_wait_start(WAIT_EVENT_WRITE_BOOTSTRAP_XLOG); if (write(openLogFile, page, XLOG_BLCKSZ) != XLOG_BLCKSZ) { /* if write didn't set errno, assume problem is no disk space */ @@ -5045,11 +5068,14 @@ BootStrapXLOG(void) (errcode_for_file_access(), errmsg("could not write bootstrap transaction log file: %m"))); } + pgstat_report_wait_end(); + pgstat_report_wait_start(WAIT_EVENT_SYNC_BOOTSTRAP_XLOG); if (pg_fsync(openLogFile) != 0) ereport(PANIC, (errcode_for_file_access(), errmsg("could not fsync bootstrap transaction log file: %m"))); + pgstat_report_wait_end(); if (close(openLogFile)) ereport(PANIC, @@ -9999,11 +10025,13 @@ assign_xlog_sync_method(int new_sync_method, void *extra) */ if (openLogFile >= 0) { + pgstat_report_wait_start(WAIT_EVENT_SYNC_ASSIGN_XLOG_SYNC_METHOD); if (pg_fsync(openLogFile) != 0) ereport(PANIC, (errcode_for_file_access(), errmsg("could not fsync log segment %s: %m", XLogFileNameP(ThisTimeLineID, openLogSegNo)))); + pgstat_report_wait_end(); if (get_sync_bit(sync_method) != get_sync_bit(new_sync_method)) XLogFileClose(); } @@ -11456,6 +11484,7 @@ retry: goto next_record_is_invalid; } + pgstat_report_wait_start(WAIT_EVENT_READ_XLOG); if (read(readFile, readBuf, XLOG_BLCKSZ) != XLOG_BLCKSZ) { char fname[MAXFNAMELEN]; @@ -11467,6 +11496,7 @@ retry: fname, readOff))); goto next_record_is_invalid; } + pgstat_report_wait_end(); Assert(targetSegNo == readSegNo); Assert(targetPageOff == readOff); diff --git a/src/backend/access/transam/xlogutils.c b/src/backend/access/transam/xlogutils.c index 8b99b78..d17d541 100644 --- a/src/backend/access/transam/xlogutils.c +++ b/src/backend/access/transam/xlogutils.c @@ -24,6 +24,7 @@ #include "access/xlogutils.h" #include "catalog/catalog.h" #include "miscadmin.h" +#include "pgstat.h" #include "storage/smgr.h" #include "utils/guc.h" #include "utils/hsearch.h" @@ -728,7 +729,9 @@ XLogRead(char *buf, TimeLineID tli, XLogRecPtr startptr, Size count) else segbytes = nbytes; + pgstat_report_wait_start(WAIT_EVENT_READ_XLOG); readbytes = read(sendFile, p, segbytes); + pgstat_report_wait_end(); if (readbytes <= 0) { char path[MAXPGPATH]; diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c index 7cacb1e..00a0a55 100644 --- a/src/backend/postmaster/pgstat.c +++ b/src/backend/postmaster/pgstat.c @@ -280,6 +280,7 @@ static const char *pgstat_get_wait_activity(WaitEventActivity w); static const char *pgstat_get_wait_client(WaitEventClient w); static const char *pgstat_get_wait_ipc(WaitEventIPC w); static const char *pgstat_get_wait_timeout(WaitEventTimeout w); +static const char *pgstat_get_wait_io(WaitEventIO w); static void pgstat_setheader(PgStat_MsgHdr *hdr, StatMsgType mtype); static void pgstat_send(void *msg, int len); @@ -3176,6 +3177,9 @@ pgstat_get_wait_event_type(uint32 wait_event_info) case PG_WAIT_TIMEOUT: event_type = "Timeout"; break; + case PG_WAIT_IO: + event_type = "IO"; + break; default: event_type = "???"; break; @@ -3246,6 +3250,13 @@ pgstat_get_wait_event(uint32 wait_event_info) event_name = pgstat_get_wait_timeout(w); break; } + case PG_WAIT_IO: + { + WaitEventIO w = (WaitEventIO) wait_event_info; + + event_name = pgstat_get_wait_io(w); + break; + } default: event_name = "unknown wait event"; break; @@ -3439,6 +3450,251 @@ pgstat_get_wait_timeout(WaitEventTimeout w) } /* ---------- + * pgstat_get_wait_io() - + * + * Convert WaitEventIO to string. + * ---------- + */ +static const char * +pgstat_get_wait_io(WaitEventIO w) +{ + const char *event_name = "unknown wait event"; + + switch (w) + { + case WAIT_EVENT_READ_DATA_BLOCK: + event_name = "ReadDataBlock"; + break; + case WAIT_EVENT_WRITE_DATA_BLOCK: + event_name = "WriteDataBlock"; + break; + case WAIT_EVENT_SYNC_DATA_BLOCK: + event_name = "SyncDataBlock"; + break; + case WAIT_EVENT_EXTEND_DATA_BLOCK: + event_name = "ExtendDataBlock"; + break; + case WAIT_EVENT_FLUSH_DATA_BLOCKS: + event_name = "FlushDataBlocks"; + break; + case WAIT_EVENT_PREFETCH_DATA_BLOCK: + event_name = "PrefetchDataBlock"; + break; + case WAIT_EVENT_TRUNCATE_RELATION_DATA_BLOCKS: + event_name = "TruncateDataBlock"; + break; + case WAIT_EVENT_SYNC_RELATION: + event_name = "SyncRelation"; + break; + case WAIT_EVENT_SYNC_IMMED_RELATION: + event_name = "SyncImmedRelation"; + break; + case WAIT_EVENT_WRITE_REWRITE_DATA_BLOCK: + event_name = "WriteRewriteDataBlock"; + break; + case WAIT_EVENT_SYNC_REWRITE_DATA_BLOCK: + event_name = "SyncRewriteDataBlock"; + break; + case WAIT_EVENT_READ_BUFFILE: + event_name = "ReadBuffile"; + break; + case WAIT_EVENT_WRITE_BUFFILE: + event_name = "WriteBuffile"; + break; + /* XLOG wait event */ + case WAIT_EVENT_READ_XLOG: + event_name = "ReadXLog"; + break; + case WAIT_EVENT_READ_COPY_XLOG: + event_name = "ReadCopyXLog"; + break; + case WAIT_EVENT_WRITE_XLOG: + event_name = "WriteXLog"; + break; + case WAIT_EVENT_WRITE_INIT_XLOG_FILE: + event_name = "WriteInitXLogFile"; + break; + case WAIT_EVENT_WRITE_COPY_XLOG_FILE: + event_name = "WriteCopyXLogFile"; + break; + case WAIT_EVENT_WRITE_BOOTSTRAP_XLOG: + event_name = "WriteBootstrapXLog"; + break; + case WAIT_EVENT_SYNC_INIT_XLOG_FILE: + event_name = "SyncInitXLogFile"; + break; + case WAIT_EVENT_SYNC_COPY_XLOG_FILE: + event_name = "SyncCopyXLogFile"; + break; + case WAIT_EVENT_SYNC_BOOTSTRAP_XLOG: + event_name = "SyncBootStrapXLog"; + break; + case WAIT_EVENT_SYNC_ASSIGN_XLOG_SYNC_METHOD: + event_name = "SyncAssignXLogMethod"; + break; + /* Control file wait events */ + case WAIT_EVENT_WRITE_CONTROL_FILE: + event_name = "WriteControlFile"; + break; + case WAIT_EVENT_WRITE_UPDATE_CONTROL_FILE: + event_name = "WriteUpdateControlFile"; + break; + case WAIT_EVENT_READ_CONTROL_FILE: + event_name = "ReadControlFile"; + break; + case WAIT_EVENT_SYNC_WRITE_CONTROL_FILE: + event_name = "SyncWriteControlFile"; + break; + case WAIT_EVENT_SYNC_UPDATE_CONTROL_FILE: + event_name = "SyncUpdateControlFile"; + break; + /* reorder buffer wait event */ + case WAIT_EVENT_READ_REORDER_BUFFER: + event_name = "ReadReorderBuffer"; + break; + case WAIT_EVENT_WRITE_REORDER_BUFFER: + event_name = "WriteReorderBuffer"; + break; + /* logical mapping wait event */ + case WAIT_EVENT_READ_APPLY_LOGICAL_MAPPING: + event_name = "ReadApplyLogicalMapping"; + break; + case WAIT_EVENT_WRITE_LOGICAL_MAPPING_REWRITE: + event_name = "WriteLogicalMappingRewrite"; + break; + case WAIT_EVENT_SYNC_LOGICAL_MAPPING_REWRITE: + event_name = "SyncLogicalMappingRewrite"; + break; + case WAIT_EVENT_SYNC_LOGICAL_MAPPING_REWRITE_HEAP: + event_name = "SyncLogicalMappingRewriteHeap"; + break; + case WAIT_EVENT_TRUNCATE_LOGICAL_MAPPING_REWRITE: + event_name = "TruncateLogicalMappingRewrite"; + break; + /* Snapbuild wait event */ + case WAIT_EVENT_WRITE_SNAPBUILD_SERIALIZE: + event_name = "WriteSnapbuildSerialize"; + break; + case WAIT_EVENT_READ_SNAPBUILD_RESTORE: + event_name = "ReadSnapbuildRestore"; + break; + case WAIT_EVENT_SYNC_SNAPBUILD_SERIALIZE: + event_name = "SyncSnapbuildSerialize"; + break; + /* SLRU wait event */ + case WAIT_EVENT_READ_SLRU_PAGE: + event_name = "ReadSLRUPage"; + break; + case WAIT_EVENT_WRITE_SLRU_PAGE: + event_name = "WriteSLRUPage"; + break; + case WAIT_EVENT_SYNC_SLRU_FLUSH: + event_name = "SyncSLRUFlush"; + break; + case WAIT_EVENT_SYNC_SLRU_WRITE_PAGE: + event_name = "SyncSLRUWritePage"; + break; + /* TIMELINE HISTORY wait event */ + case WAIT_EVENT_READ_TIMELINE_HISTORY_WALSENDER: + event_name = "ReadTimelineHistoryWalsender"; + break; + case WAIT_EVENT_WRITE_TIMELINE_HISTORY: + event_name = "WriteTimelineHistory"; + break; + case WAIT_EVENT_WRITE_TIMELINE_HISTORY_FILE: + event_name = "WriteTimelineHistoryFile"; + break; + case WAIT_EVENT_READ_TIMELINE_HISTORY_WRITE: + event_name = "ReadTimelineHistoryWrite"; + break; + case WAIT_EVENT_SYNC_TIMELINE_HISTORY_WRITE: + event_name = "SyncTimelineHistoryWrite"; + break; + case WAIT_EVENT_SYNC_TIMELINE_HISTORY_FILE: + event_name = "SyncTimelineHistoryFile"; + break; + /* TWOPHASE FILE wait event */ + case WAIT_EVENT_READ_TWOPHASE_FILE: + event_name = "ReadTwophaseFile"; + break; + case WAIT_EVENT_WRITE_RECREATE_TWOPHASE_FILE: + event_name = "WriteRecreateTwophaseFile"; + break; + case WAIT_EVENT_SYNC_RECREATE_TWOPHASE_FILE: + event_name = "SyncRecreateTwophaseFile"; + break; + /* SYSLOGGER wait event */ + case WAIT_EVENT_READ_SYSLOGGER_FILE: + event_name = "ReadSysloggerFile"; + break; + case WAIT_EVENT_WRITE_SYSLOGGER_FILE: + event_name = "WriteSysloggerFile"; + break; + /* REPLSLOT wait event */ + case WAIT_EVENT_READ_RESTORE_REPLSLOT: + event_name = "ReadRestorREPLSlot"; + break; + case WAIT_EVENT_WRITE_REPLSLOT: + event_name = "WriteREPLSlot"; + break; + case WAIT_EVENT_SYNC_RESTORE_REPLSLOT: + event_name = "SyncRestoreREPLSlot"; + break; + case WAIT_EVENT_SYNC_SAVE_REPLSLOT: + event_name = "SyncSaveREPLSlot"; + break; + /* COPYDIR IO wait event */ + case WAIT_EVENT_READ_COPY_FILE: + event_name = "ReadCopyFile"; + break; + case WAIT_EVENT_WRITE_COPY_FILE: + event_name = "WriteCopyFile"; + break; + /* RELMAP IO wait event */ + case WAIT_EVENT_READ_LOAD_RELMAP_FILE: + event_name = "ReadLoadRELMAPFile"; + break; + case WAIT_EVENT_WRITE_RELMAP_FILE: + event_name = "WriteRELMAPFile"; + break; + case WAIT_EVENT_SYNC_WRITE_RELMAP_FILE: + event_name = "SyncWriteRELMAFile"; + break; + /* LOCK FILE IO wait event */ + case WAIT_EVENT_READ_CREATE_LOCK_FILE: + event_name = "ReadCreateLockFile"; + break; + case WAIT_EVENT_READ_ADDTODATEDIR_LOCK_FILE: + event_name = "ReadAddToDataDirLockFile"; + break; + case WAIT_EVENT_READ_RECHECKDATADIR_LOCK_FILE: + event_name = "ReadRecheckDataDirLockFile"; + break; + case WAIT_EVENT_WRITE_CREATE_LOCK_FILE: + event_name = "WriteCreateLockFile"; + break; + case WAIT_EVENT_WRITE_ADDTODATEDIR_LOCK_FILE: + event_name = "WriteAddToDataDirLockFile"; + break; + case WAIT_EVENT_SYNC_ADDTODATEDIR_LOCK_FILE: + event_name = "SyncAddToDataDirLockFile"; + break; + case WAIT_EVENT_SYNC_CREATE_LOCK_FILE: + event_name = "SyncCreateLockFile"; + break; + /* DSM IO wait event */ + case WAIT_EVENT_WRITE_ZERO_FILL_DSM: + event_name = "WriteZeroFillDSM"; + break; + + /* no default case, so that compiler will warn */ + } + + return event_name; +} + + +/* ---------- * pgstat_get_backend_current_activity() - * * Return a string representing the current activity of the backend with diff --git a/src/backend/postmaster/syslogger.c b/src/backend/postmaster/syslogger.c index aaefdae..9328814 100644 --- a/src/backend/postmaster/syslogger.c +++ b/src/backend/postmaster/syslogger.c @@ -441,9 +441,11 @@ SysLoggerMain(int argc, char *argv[]) { int bytesRead; + pgstat_report_wait_start(WAIT_EVENT_READ_SYSLOGGER_FILE); bytesRead = read(syslogPipe[0], logbuffer + bytes_in_logbuffer, sizeof(logbuffer) - bytes_in_logbuffer); + pgstat_report_wait_end(); if (bytesRead < 0) { if (errno != EINTR) @@ -1001,7 +1003,9 @@ write_syslogger_file(const char *buffer, int count, int destination) open_csvlogfile(); logfile = destination == LOG_DESTINATION_CSVLOG ? csvlogFile : syslogFile; + pgstat_report_wait_start(WAIT_EVENT_WRITE_SYSLOGGER_FILE); rc = fwrite(buffer, 1, count, logfile); + pgstat_report_wait_end(); /* can't use ereport here because of possible recursion */ if (rc != count) diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c index 8aac670..767ea5e 100644 --- a/src/backend/replication/logical/reorderbuffer.c +++ b/src/backend/replication/logical/reorderbuffer.c @@ -58,6 +58,7 @@ #include "catalog/catalog.h" #include "lib/binaryheap.h" #include "miscadmin.h" +#include "pgstat.h" #include "replication/logical.h" #include "replication/reorderbuffer.h" #include "replication/slot.h" @@ -2275,6 +2276,7 @@ ReorderBufferSerializeChange(ReorderBuffer *rb, ReorderBufferTXN *txn, ondisk->size = sz; + pgstat_report_wait_start(WAIT_EVENT_WRITE_REORDER_BUFFER); if (write(fd, rb->outbuf, ondisk->size) != ondisk->size) { int save_errno = errno; @@ -2286,6 +2288,7 @@ ReorderBufferSerializeChange(ReorderBuffer *rb, ReorderBufferTXN *txn, errmsg("could not write to data file for XID %u: %m", txn->xid))); } + pgstat_report_wait_end(); Assert(ondisk->change.action == change->action); } @@ -2366,7 +2369,9 @@ ReorderBufferRestoreChanges(ReorderBuffer *rb, ReorderBufferTXN *txn, * end of this file. */ ReorderBufferSerializeReserve(rb, sizeof(ReorderBufferDiskChange)); + pgstat_report_wait_start(WAIT_EVENT_READ_REORDER_BUFFER); readBytes = read(*fd, rb->outbuf, sizeof(ReorderBufferDiskChange)); + pgstat_report_wait_end(); /* eof */ if (readBytes == 0) @@ -2393,8 +2398,10 @@ ReorderBufferRestoreChanges(ReorderBuffer *rb, ReorderBufferTXN *txn, sizeof(ReorderBufferDiskChange) + ondisk->size); ondisk = (ReorderBufferDiskChange *) rb->outbuf; + pgstat_report_wait_start(WAIT_EVENT_READ_REORDER_BUFFER); readBytes = read(*fd, rb->outbuf + sizeof(ReorderBufferDiskChange), ondisk->size - sizeof(ReorderBufferDiskChange)); + pgstat_report_wait_end(); if (readBytes < 0) ereport(ERROR, @@ -3047,7 +3054,9 @@ ApplyLogicalMappingFile(HTAB *tuplecid_data, Oid relid, const char *fname) memset(&key, 0, sizeof(ReorderBufferTupleCidKey)); /* read all mappings till the end of the file */ + pgstat_report_wait_start(WAIT_EVENT_READ_APPLY_LOGICAL_MAPPING); readBytes = read(fd, &map, sizeof(LogicalRewriteMappingData)); + pgstat_report_wait_end(); if (readBytes < 0) ereport(ERROR, diff --git a/src/backend/replication/logical/snapbuild.c b/src/backend/replication/logical/snapbuild.c index 52601a5..077878c 100644 --- a/src/backend/replication/logical/snapbuild.c +++ b/src/backend/replication/logical/snapbuild.c @@ -115,6 +115,8 @@ #include "access/transam.h" #include "access/xact.h" +#include "pgstat.h" + #include "replication/logical.h" #include "replication/reorderbuffer.h" #include "replication/snapbuild.h" @@ -1580,6 +1582,7 @@ SnapBuildSerialize(SnapBuild *builder, XLogRecPtr lsn) ereport(ERROR, (errmsg("could not open file \"%s\": %m", path))); + pgstat_report_wait_start(WAIT_EVENT_WRITE_SNAPBUILD_SERIALIZE); if ((write(fd, ondisk, needed_length)) != needed_length) { CloseTransientFile(fd); @@ -1587,6 +1590,7 @@ SnapBuildSerialize(SnapBuild *builder, XLogRecPtr lsn) (errcode_for_file_access(), errmsg("could not write to file \"%s\": %m", tmppath))); } + pgstat_report_wait_end(); /* * fsync the file before renaming so that even if we crash after this we @@ -1596,6 +1600,7 @@ SnapBuildSerialize(SnapBuild *builder, XLogRecPtr lsn) * some noticeable overhead since it's performed synchronously during * decoding? */ + pgstat_report_wait_start(WAIT_EVENT_SYNC_SNAPBUILD_SERIALIZE); if (pg_fsync(fd) != 0) { CloseTransientFile(fd); @@ -1603,6 +1608,7 @@ SnapBuildSerialize(SnapBuild *builder, XLogRecPtr lsn) (errcode_for_file_access(), errmsg("could not fsync file \"%s\": %m", tmppath))); } + pgstat_report_wait_end(); CloseTransientFile(fd); fsync_fname("pg_logical/snapshots", true); @@ -1677,7 +1683,9 @@ SnapBuildRestore(SnapBuild *builder, XLogRecPtr lsn) /* read statically sized portion of snapshot */ + pgstat_report_wait_start(WAIT_EVENT_READ_SNAPBUILD_RESTORE); readBytes = read(fd, &ondisk, SnapBuildOnDiskConstantSize); + pgstat_report_wait_end(); if (readBytes != SnapBuildOnDiskConstantSize) { CloseTransientFile(fd); @@ -1703,7 +1711,9 @@ SnapBuildRestore(SnapBuild *builder, XLogRecPtr lsn) SnapBuildOnDiskConstantSize - SnapBuildOnDiskNotChecksummedSize); /* read SnapBuild */ + pgstat_report_wait_start(WAIT_EVENT_READ_SNAPBUILD_RESTORE); readBytes = read(fd, &ondisk.builder, sizeof(SnapBuild)); + pgstat_report_wait_end(); if (readBytes != sizeof(SnapBuild)) { CloseTransientFile(fd); @@ -1717,7 +1727,9 @@ SnapBuildRestore(SnapBuild *builder, XLogRecPtr lsn) /* restore running xacts information */ sz = sizeof(TransactionId) * ondisk.builder.running.xcnt_space; ondisk.builder.running.xip = MemoryContextAllocZero(builder->context, sz); + pgstat_report_wait_start(WAIT_EVENT_READ_SNAPBUILD_RESTORE); readBytes = read(fd, ondisk.builder.running.xip, sz); + pgstat_report_wait_end(); if (readBytes != sz) { CloseTransientFile(fd); @@ -1731,7 +1743,9 @@ SnapBuildRestore(SnapBuild *builder, XLogRecPtr lsn) /* restore committed xacts information */ sz = sizeof(TransactionId) * ondisk.builder.committed.xcnt; ondisk.builder.committed.xip = MemoryContextAllocZero(builder->context, sz); + pgstat_report_wait_start(WAIT_EVENT_READ_SNAPBUILD_RESTORE); readBytes = read(fd, ondisk.builder.committed.xip, sz); + pgstat_report_wait_end(); if (readBytes != sz) { CloseTransientFile(fd); diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c index 10d69d0..d4ebcd9 100644 --- a/src/backend/replication/slot.c +++ b/src/backend/replication/slot.c @@ -43,6 +43,7 @@ #include "access/xlog_internal.h" #include "common/string.h" #include "miscadmin.h" +#include "pgstat.h" #include "replication/slot.h" #include "storage/fd.h" #include "storage/proc.h" @@ -1100,10 +1101,12 @@ SaveSlotToPath(ReplicationSlot *slot, const char *dir, int elevel) SnapBuildOnDiskChecksummedSize); FIN_CRC32C(cp.checksum); + pgstat_report_wait_start(WAIT_EVENT_WRITE_REPLSLOT); if ((write(fd, &cp, sizeof(cp))) != sizeof(cp)) { int save_errno = errno; + pgstat_report_wait_end(); CloseTransientFile(fd); errno = save_errno; ereport(elevel, @@ -1112,8 +1115,10 @@ SaveSlotToPath(ReplicationSlot *slot, const char *dir, int elevel) tmppath))); return; } + pgstat_report_wait_end(); /* fsync the temporary file */ + pgstat_report_wait_start(WAIT_EVENT_SYNC_SAVE_REPLSLOT); if (pg_fsync(fd) != 0) { int save_errno = errno; @@ -1126,6 +1131,7 @@ SaveSlotToPath(ReplicationSlot *slot, const char *dir, int elevel) tmppath))); return; } + pgstat_report_wait_end(); CloseTransientFile(fd); @@ -1202,6 +1208,7 @@ RestoreSlotFromDisk(const char *name) * Sync state file before we're reading from it. We might have crashed * while it wasn't synced yet and we shouldn't continue on that basis. */ + pgstat_report_wait_start(WAIT_EVENT_SYNC_RESTORE_REPLSLOT); if (pg_fsync(fd) != 0) { CloseTransientFile(fd); @@ -1210,6 +1217,7 @@ RestoreSlotFromDisk(const char *name) errmsg("could not fsync file \"%s\": %m", path))); } + pgstat_report_wait_end(); /* Also sync the parent directory */ START_CRIT_SECTION(); @@ -1217,7 +1225,9 @@ RestoreSlotFromDisk(const char *name) END_CRIT_SECTION(); /* read part of statefile that's guaranteed to be version independent */ + pgstat_report_wait_start(WAIT_EVENT_READ_RESTORE_REPLSLOT); readBytes = read(fd, &cp, ReplicationSlotOnDiskConstantSize); + pgstat_report_wait_end(); if (readBytes != ReplicationSlotOnDiskConstantSize) { int saved_errno = errno; @@ -1253,9 +1263,11 @@ RestoreSlotFromDisk(const char *name) path, cp.length))); /* Now that we know the size, read the entire file */ + pgstat_report_wait_start(WAIT_EVENT_READ_RESTORE_REPLSLOT); readBytes = read(fd, (char *) &cp + ReplicationSlotOnDiskConstantSize, cp.length); + pgstat_report_wait_end(); if (readBytes != cp.length) { int saved_errno = errno; diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c index dd3a936..9be0185 100644 --- a/src/backend/replication/walsender.c +++ b/src/backend/replication/walsender.c @@ -462,7 +462,9 @@ SendTimeLineHistory(TimeLineHistoryCmd *cmd) char rbuf[BLCKSZ]; int nread; + pgstat_report_wait_start(WAIT_EVENT_READ_TIMELINE_HISTORY_WALSENDER); nread = read(fd, rbuf, sizeof(rbuf)); + pgstat_report_wait_end(); if (nread <= 0) ereport(ERROR, (errcode_for_file_access(), @@ -2076,7 +2078,9 @@ retry: else segbytes = nbytes; + pgstat_report_wait_start(WAIT_EVENT_READ_XLOG); readbytes = read(sendFile, p, segbytes); + pgstat_report_wait_end(); if (readbytes <= 0) { ereport(ERROR, diff --git a/src/backend/storage/file/buffile.c b/src/backend/storage/file/buffile.c index 7ebd636..971cfd1 100644 --- a/src/backend/storage/file/buffile.c +++ b/src/backend/storage/file/buffile.c @@ -37,6 +37,7 @@ #include "postgres.h" #include "executor/instrument.h" +#include "pgstat.h" #include "storage/fd.h" #include "storage/buffile.h" #include "storage/buf_internals.h" @@ -254,7 +255,10 @@ BufFileLoadBuffer(BufFile *file) /* * Read whatever we can get, up to a full bufferload. */ - file->nbytes = FileRead(thisfile, file->buffer, sizeof(file->buffer)); + file->nbytes = FileRead(thisfile, + file->buffer, + sizeof(file->buffer), + WAIT_EVENT_READ_BUFFILE); if (file->nbytes < 0) file->nbytes = 0; file->offsets[file->curFile] += file->nbytes; @@ -317,7 +321,10 @@ BufFileDumpBuffer(BufFile *file) return; /* seek failed, give up */ file->offsets[file->curFile] = file->curOffset; } - bytestowrite = FileWrite(thisfile, file->buffer + wpos, bytestowrite); + bytestowrite = FileWrite(thisfile, + file->buffer + wpos, + bytestowrite, + WAIT_EVENT_WRITE_BUFFILE); if (bytestowrite <= 0) return; /* failed to write */ file->offsets[file->curFile] += bytestowrite; diff --git a/src/backend/storage/file/copydir.c b/src/backend/storage/file/copydir.c index 101da47..2eda42d 100644 --- a/src/backend/storage/file/copydir.c +++ b/src/backend/storage/file/copydir.c @@ -25,7 +25,7 @@ #include "storage/copydir.h" #include "storage/fd.h" #include "miscadmin.h" - +#include "pgstat.h" /* * copydir: copy a directory @@ -169,7 +169,9 @@ copy_file(char *fromfile, char *tofile) /* If we got a cancel signal during the copy of the file, quit */ CHECK_FOR_INTERRUPTS(); + pgstat_report_wait_start(WAIT_EVENT_READ_COPY_FILE); nbytes = read(srcfd, buffer, COPY_BUF_SIZE); + pgstat_report_wait_end(); if (nbytes < 0) ereport(ERROR, (errcode_for_file_access(), @@ -177,8 +179,10 @@ copy_file(char *fromfile, char *tofile) if (nbytes == 0) break; errno = 0; + pgstat_report_wait_start(WAIT_EVENT_WRITE_COPY_FILE); if ((int) write(dstfd, buffer, nbytes) != nbytes) { + pgstat_report_wait_end(); /* if write didn't set errno, assume problem is no disk space */ if (errno == 0) errno = ENOSPC; @@ -186,6 +190,7 @@ copy_file(char *fromfile, char *tofile) (errcode_for_file_access(), errmsg("could not write to file \"%s\": %m", tofile))); } + pgstat_report_wait_end(); /* * We fsync the files later but first flush them to avoid spamming the diff --git a/src/backend/storage/file/fd.c b/src/backend/storage/file/fd.c index fd02fc0..16f0c92 100644 --- a/src/backend/storage/file/fd.c +++ b/src/backend/storage/file/fd.c @@ -1550,7 +1550,7 @@ FileClose(File file) * to read into. */ int -FilePrefetch(File file, off_t offset, int amount) +FilePrefetch(File file, off_t offset, int amount, uint32 wait_event_info) { #if defined(USE_POSIX_FADVISE) && defined(POSIX_FADV_WILLNEED) int returnCode; @@ -1565,8 +1565,10 @@ FilePrefetch(File file, off_t offset, int amount) if (returnCode < 0) return returnCode; + pgstat_report_wait_start(wait_event_info); returnCode = posix_fadvise(VfdCache[file].fd, offset, amount, POSIX_FADV_WILLNEED); + pgstat_report_wait_end(); return returnCode; #else @@ -1576,7 +1578,7 @@ FilePrefetch(File file, off_t offset, int amount) } void -FileWriteback(File file, off_t offset, off_t nbytes) +FileWriteback(File file, off_t offset, off_t nbytes, uint32 wait_event_info) { int returnCode; @@ -1597,11 +1599,13 @@ FileWriteback(File file, off_t offset, off_t nbytes) if (returnCode < 0) return; + pgstat_report_wait_start(wait_event_info); pg_flush_data(VfdCache[file].fd, offset, nbytes); + pgstat_report_wait_end(); } int -FileRead(File file, char *buffer, int amount) +FileRead(File file, char *buffer, int amount, uint32 wait_event_info) { int returnCode; Vfd *vfdP; @@ -1619,6 +1623,7 @@ FileRead(File file, char *buffer, int amount) vfdP = &VfdCache[file]; + pgstat_report_wait_start(wait_event_info); retry: returnCode = read(vfdP->fd, buffer, amount); @@ -1658,12 +1663,13 @@ retry: /* Trouble, so assume we don't know the file position anymore */ vfdP->seekPos = FileUnknownPos; } + pgstat_report_wait_end(); return returnCode; } int -FileWrite(File file, char *buffer, int amount) +FileWrite(File file, char *buffer, int amount, uint32 wait_event_info) { int returnCode; Vfd *vfdP; @@ -1719,6 +1725,7 @@ FileWrite(File file, char *buffer, int amount) } } + pgstat_report_wait_start(wait_event_info); retry: errno = 0; returnCode = write(vfdP->fd, buffer, amount); @@ -1777,12 +1784,13 @@ retry: /* Trouble, so assume we don't know the file position anymore */ vfdP->seekPos = FileUnknownPos; } + pgstat_report_wait_end(); return returnCode; } int -FileSync(File file) +FileSync(File file, uint32 wait_event_info) { int returnCode; @@ -1795,7 +1803,11 @@ FileSync(File file) if (returnCode < 0) return returnCode; - return pg_fsync(VfdCache[file].fd); + pgstat_report_wait_start(wait_event_info); + returnCode = pg_fsync(VfdCache[file].fd); + pgstat_report_wait_end(); + + return returnCode; } off_t @@ -1887,7 +1899,7 @@ FileTell(File file) #endif int -FileTruncate(File file, off_t offset) +FileTruncate(File file, off_t offset, uint32 wait_event_info) { int returnCode; @@ -1900,7 +1912,9 @@ FileTruncate(File file, off_t offset) if (returnCode < 0) return returnCode; + pgstat_report_wait_start(wait_event_info); returnCode = ftruncate(VfdCache[file].fd, offset); + pgstat_report_wait_end(); if (returnCode == 0 && VfdCache[file].fileSize > offset) { diff --git a/src/backend/storage/ipc/dsm_impl.c b/src/backend/storage/ipc/dsm_impl.c index b2c9cdc..9aafea6 100644 --- a/src/backend/storage/ipc/dsm_impl.c +++ b/src/backend/storage/ipc/dsm_impl.c @@ -60,6 +60,7 @@ #ifdef HAVE_SYS_SHM_H #include <sys/shm.h> #endif +#include "pgstat.h" #include "portability/mem.h" #include "storage/dsm_impl.h" @@ -911,10 +912,12 @@ dsm_impl_mmap(dsm_op op, dsm_handle handle, Size request_size, if (goal > ZBUFFER_SIZE) goal = ZBUFFER_SIZE; + pgstat_report_wait_start(WAIT_EVENT_WRITE_ZERO_FILL_DSM); if (write(fd, zbuffer, goal) == goal) remaining -= goal; else success = false; + pgstat_report_wait_end(); } if (!success) diff --git a/src/backend/storage/smgr/md.c b/src/backend/storage/smgr/md.c index 6c17b54..1003f6b 100644 --- a/src/backend/storage/smgr/md.c +++ b/src/backend/storage/smgr/md.c @@ -28,6 +28,7 @@ #include "miscadmin.h" #include "access/xlog.h" #include "catalog/catalog.h" +#include "pgstat.h" #include "portability/instr_time.h" #include "postmaster/bgwriter.h" #include "storage/fd.h" @@ -536,7 +537,7 @@ mdextend(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum, errmsg("could not seek to block %u in file \"%s\": %m", blocknum, FilePathName(v->mdfd_vfd)))); - if ((nbytes = FileWrite(v->mdfd_vfd, buffer, BLCKSZ)) != BLCKSZ) + if ((nbytes = FileWrite(v->mdfd_vfd, buffer, BLCKSZ, WAIT_EVENT_EXTEND_DATA_BLOCK)) != BLCKSZ) { if (nbytes < 0) ereport(ERROR, @@ -667,7 +668,7 @@ mdprefetch(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum) Assert(seekpos < (off_t) BLCKSZ * RELSEG_SIZE); - (void) FilePrefetch(v->mdfd_vfd, seekpos, BLCKSZ); + (void) FilePrefetch(v->mdfd_vfd, seekpos, BLCKSZ, WAIT_EVENT_PREFETCH_DATA_BLOCK); #endif /* USE_PREFETCH */ } @@ -716,7 +717,7 @@ mdwriteback(SMgrRelation reln, ForkNumber forknum, seekpos = (off_t) BLCKSZ *(blocknum % ((BlockNumber) RELSEG_SIZE)); - FileWriteback(v->mdfd_vfd, seekpos, (off_t) BLCKSZ * nflush); + FileWriteback(v->mdfd_vfd, seekpos, (off_t) BLCKSZ * nflush, WAIT_EVENT_FLUSH_DATA_BLOCKS); nblocks -= nflush; blocknum += nflush; @@ -753,7 +754,7 @@ mdread(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum, errmsg("could not seek to block %u in file \"%s\": %m", blocknum, FilePathName(v->mdfd_vfd)))); - nbytes = FileRead(v->mdfd_vfd, buffer, BLCKSZ); + nbytes = FileRead(v->mdfd_vfd, buffer, BLCKSZ, WAIT_EVENT_READ_DATA_BLOCK); TRACE_POSTGRESQL_SMGR_MD_READ_DONE(forknum, blocknum, reln->smgr_rnode.node.spcNode, @@ -829,7 +830,7 @@ mdwrite(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum, errmsg("could not seek to block %u in file \"%s\": %m", blocknum, FilePathName(v->mdfd_vfd)))); - nbytes = FileWrite(v->mdfd_vfd, buffer, BLCKSZ); + nbytes = FileWrite(v->mdfd_vfd, buffer, BLCKSZ, WAIT_EVENT_WRITE_DATA_BLOCK); TRACE_POSTGRESQL_SMGR_MD_WRITE_DONE(forknum, blocknum, reln->smgr_rnode.node.spcNode, @@ -967,7 +968,7 @@ mdtruncate(SMgrRelation reln, ForkNumber forknum, BlockNumber nblocks) * This segment is no longer active. We truncate the file, but do * not delete it, for reasons explained in the header comments. */ - if (FileTruncate(v->mdfd_vfd, 0) < 0) + if (FileTruncate(v->mdfd_vfd, 0, WAIT_EVENT_TRUNCATE_RELATION_DATA_BLOCKS) < 0) ereport(ERROR, (errcode_for_file_access(), errmsg("could not truncate file \"%s\": %m", @@ -993,7 +994,7 @@ mdtruncate(SMgrRelation reln, ForkNumber forknum, BlockNumber nblocks) */ BlockNumber lastsegblocks = nblocks - priorblocks; - if (FileTruncate(v->mdfd_vfd, (off_t) lastsegblocks * BLCKSZ) < 0) + if (FileTruncate(v->mdfd_vfd, (off_t) lastsegblocks * BLCKSZ, WAIT_EVENT_TRUNCATE_RELATION_DATA_BLOCKS) < 0) ereport(ERROR, (errcode_for_file_access(), errmsg("could not truncate file \"%s\" to %u blocks: %m", @@ -1037,7 +1038,7 @@ mdimmedsync(SMgrRelation reln, ForkNumber forknum) { MdfdVec *v = &reln->md_seg_fds[forknum][segno - 1]; - if (FileSync(v->mdfd_vfd) < 0) + if (FileSync(v->mdfd_vfd, WAIT_EVENT_SYNC_IMMED_RELATION) < 0) ereport(ERROR, (errcode_for_file_access(), errmsg("could not fsync file \"%s\": %m", @@ -1232,7 +1233,7 @@ mdsync(void) INSTR_TIME_SET_CURRENT(sync_start); if (seg != NULL && - FileSync(seg->mdfd_vfd) >= 0) + FileSync(seg->mdfd_vfd, WAIT_EVENT_SYNC_RELATION) >= 0) { /* Success; update statistics about sync timing */ INSTR_TIME_SET_CURRENT(sync_end); @@ -1443,7 +1444,7 @@ register_dirty_segment(SMgrRelation reln, ForkNumber forknum, MdfdVec *seg) ereport(DEBUG1, (errmsg("could not forward fsync request because request queue is full"))); - if (FileSync(seg->mdfd_vfd) < 0) + if (FileSync(seg->mdfd_vfd, WAIT_EVENT_SYNC_DATA_BLOCK) < 0) ereport(ERROR, (errcode_for_file_access(), errmsg("could not fsync file \"%s\": %m", diff --git a/src/backend/utils/cache/relmapper.c b/src/backend/utils/cache/relmapper.c index c9d6e44..ee5ded9 100644 --- a/src/backend/utils/cache/relmapper.c +++ b/src/backend/utils/cache/relmapper.c @@ -50,6 +50,7 @@ #include "catalog/pg_tablespace.h" #include "catalog/storage.h" #include "miscadmin.h" +#include "pgstat.h" #include "storage/fd.h" #include "storage/lwlock.h" #include "utils/inval.h" @@ -658,11 +659,15 @@ load_relmap_file(bool shared) * look, the sinval signaling mechanism will make us re-read it before we * are able to access any relation that's affected by the change. */ + pgstat_report_wait_start(WAIT_EVENT_READ_LOAD_RELMAP_FILE); if (read(fd, map, sizeof(RelMapFile)) != sizeof(RelMapFile)) + { ereport(FATAL, (errcode_for_file_access(), errmsg("could not read relation mapping file \"%s\": %m", mapfilename))); + } + pgstat_report_wait_end(); CloseTransientFile(fd); @@ -774,6 +779,7 @@ write_relmap_file(bool shared, RelMapFile *newmap, } errno = 0; + pgstat_report_wait_start(WAIT_EVENT_WRITE_RELMAP_FILE); if (write(fd, newmap, sizeof(RelMapFile)) != sizeof(RelMapFile)) { /* if write didn't set errno, assume problem is no disk space */ @@ -784,6 +790,7 @@ write_relmap_file(bool shared, RelMapFile *newmap, errmsg("could not write to relation mapping file \"%s\": %m", mapfilename))); } + pgstat_report_wait_end(); /* * We choose to fsync the data to disk before considering the task done. @@ -791,11 +798,13 @@ write_relmap_file(bool shared, RelMapFile *newmap, * issue, but it would complicate checkpointing --- see notes for * CheckPointRelationMap. */ + pgstat_report_wait_start(WAIT_EVENT_SYNC_WRITE_RELMAP_FILE); if (pg_fsync(fd) != 0) ereport(ERROR, (errcode_for_file_access(), errmsg("could not fsync relation mapping file \"%s\": %m", mapfilename))); + pgstat_report_wait_end(); if (CloseTransientFile(fd)) ereport(ERROR, diff --git a/src/backend/utils/init/miscinit.c b/src/backend/utils/init/miscinit.c index e0298ee..90cc9bf 100644 --- a/src/backend/utils/init/miscinit.c +++ b/src/backend/utils/init/miscinit.c @@ -35,6 +35,7 @@ #include "libpq/libpq.h" #include "mb/pg_wchar.h" #include "miscadmin.h" +#include "pgstat.h" #include "postmaster/autovacuum.h" #include "postmaster/postmaster.h" #include "storage/fd.h" @@ -856,11 +857,13 @@ CreateLockFile(const char *filename, bool amPostmaster, errmsg("could not open lock file \"%s\": %m", filename))); } + pgstat_report_wait_start(WAIT_EVENT_READ_CREATE_LOCK_FILE); if ((len = read(fd, buffer, sizeof(buffer) - 1)) < 0) ereport(FATAL, (errcode_for_file_access(), errmsg("could not read lock file \"%s\": %m", filename))); + pgstat_report_wait_end(); close(fd); if (len == 0) @@ -1009,6 +1012,7 @@ CreateLockFile(const char *filename, bool amPostmaster, strlcat(buffer, "\n", sizeof(buffer)); errno = 0; + pgstat_report_wait_start(WAIT_EVENT_WRITE_CREATE_LOCK_FILE); if (write(fd, buffer, strlen(buffer)) != strlen(buffer)) { int save_errno = errno; @@ -1021,6 +1025,9 @@ CreateLockFile(const char *filename, bool amPostmaster, (errcode_for_file_access(), errmsg("could not write lock file \"%s\": %m", filename))); } + pgstat_report_wait_end(); + + pgstat_report_wait_start(WAIT_EVENT_SYNC_CREATE_LOCK_FILE); if (pg_fsync(fd) != 0) { int save_errno = errno; @@ -1032,6 +1039,7 @@ CreateLockFile(const char *filename, bool amPostmaster, (errcode_for_file_access(), errmsg("could not write lock file \"%s\": %m", filename))); } + pgstat_report_wait_end(); if (close(fd) != 0) { int save_errno = errno; @@ -1164,7 +1172,9 @@ AddToDataDirLockFile(int target_line, const char *str) DIRECTORY_LOCK_FILE))); return; } + pgstat_report_wait_start(WAIT_EVENT_READ_ADDTODATEDIR_LOCK_FILE); len = read(fd, srcbuffer, sizeof(srcbuffer) - 1); + pgstat_report_wait_end(); if (len < 0) { ereport(LOG, @@ -1217,6 +1227,7 @@ AddToDataDirLockFile(int target_line, const char *str) */ len = strlen(destbuffer); errno = 0; + pgstat_report_wait_start(WAIT_EVENT_WRITE_ADDTODATEDIR_LOCK_FILE); if (lseek(fd, (off_t) 0, SEEK_SET) != 0 || (int) write(fd, destbuffer, len) != len) { @@ -1230,6 +1241,8 @@ AddToDataDirLockFile(int target_line, const char *str) close(fd); return; } + pgstat_report_wait_end(); + pgstat_report_wait_start(WAIT_EVENT_SYNC_ADDTODATEDIR_LOCK_FILE); if (pg_fsync(fd) != 0) { ereport(LOG, @@ -1237,6 +1250,7 @@ AddToDataDirLockFile(int target_line, const char *str) errmsg("could not write to file \"%s\": %m", DIRECTORY_LOCK_FILE))); } + pgstat_report_wait_end(); if (close(fd) != 0) { ereport(LOG, @@ -1293,7 +1307,9 @@ RecheckDataDirLockFile(void) return true; } } + pgstat_report_wait_start(WAIT_EVENT_READ_RECHECKDATADIR_LOCK_FILE); len = read(fd, buffer, sizeof(buffer) - 1); + pgstat_report_wait_end(); if (len < 0) { ereport(LOG, diff --git a/src/include/pgstat.h b/src/include/pgstat.h index 60c78d1..032f0c4 100644 --- a/src/include/pgstat.h +++ b/src/include/pgstat.h @@ -723,6 +723,7 @@ typedef enum BackendState #define PG_WAIT_EXTENSION 0x07000000U #define PG_WAIT_IPC 0x08000000U #define PG_WAIT_TIMEOUT 0x09000000U +#define PG_WAIT_IO 0x0A000000U /* ---------- * Wait Events - Activity @@ -806,6 +807,100 @@ typedef enum } WaitEventTimeout; /* ---------- + * Wait Events - IO + * + * Use this category when a process is waiting for a IO. + * ---------- + */ +typedef enum +{ + WAIT_EVENT_READ_DATA_BLOCK = PG_WAIT_IO, + WAIT_EVENT_WRITE_DATA_BLOCK, + WAIT_EVENT_SYNC_DATA_BLOCK, + WAIT_EVENT_EXTEND_DATA_BLOCK, + WAIT_EVENT_FLUSH_DATA_BLOCKS, + WAIT_EVENT_PREFETCH_DATA_BLOCK, + WAIT_EVENT_WRITE_REWRITE_DATA_BLOCK, + WAIT_EVENT_SYNC_REWRITE_DATA_BLOCK, + WAIT_EVENT_TRUNCATE_RELATION_DATA_BLOCKS, + WAIT_EVENT_SYNC_RELATION, + WAIT_EVENT_SYNC_IMMED_RELATION, + WAIT_EVENT_READ_BUFFILE, + WAIT_EVENT_WRITE_BUFFILE, + /* Wait event for XLOG */ + WAIT_EVENT_READ_XLOG, + WAIT_EVENT_READ_COPY_XLOG, + WAIT_EVENT_WRITE_XLOG, + WAIT_EVENT_WRITE_INIT_XLOG_FILE, + WAIT_EVENT_WRITE_COPY_XLOG_FILE, + WAIT_EVENT_WRITE_BOOTSTRAP_XLOG, + WAIT_EVENT_SYNC_INIT_XLOG_FILE, + WAIT_EVENT_SYNC_COPY_XLOG_FILE, + WAIT_EVENT_SYNC_BOOTSTRAP_XLOG, + WAIT_EVENT_SYNC_ASSIGN_XLOG_SYNC_METHOD, + /* Wait event for CONTROL_FILE */ + WAIT_EVENT_WRITE_CONTROL_FILE, + WAIT_EVENT_WRITE_UPDATE_CONTROL_FILE, + WAIT_EVENT_SYNC_WRITE_CONTROL_FILE, + WAIT_EVENT_SYNC_UPDATE_CONTROL_FILE, + WAIT_EVENT_READ_CONTROL_FILE, + /* Wait event for REORDER BUFFER */ + WAIT_EVENT_READ_REORDER_BUFFER, + WAIT_EVENT_WRITE_REORDER_BUFFER, + /* Wait event for LOGICAL MAPPING */ + WAIT_EVENT_READ_APPLY_LOGICAL_MAPPING, + WAIT_EVENT_WRITE_LOGICAL_MAPPING_REWRITE, + WAIT_EVENT_SYNC_LOGICAL_MAPPING_REWRITE, + WAIT_EVENT_SYNC_LOGICAL_MAPPING_REWRITE_HEAP, + WAIT_EVENT_TRUNCATE_LOGICAL_MAPPING_REWRITE, + /* Wait event for SNAPBUILD */ + WAIT_EVENT_WRITE_SNAPBUILD_SERIALIZE, + WAIT_EVENT_READ_SNAPBUILD_RESTORE, + WAIT_EVENT_SYNC_SNAPBUILD_SERIALIZE, + /* Wait event for SLRU */ + WAIT_EVENT_READ_SLRU_PAGE, + WAIT_EVENT_WRITE_SLRU_PAGE, + WAIT_EVENT_SYNC_SLRU_FLUSH, + WAIT_EVENT_SYNC_SLRU_WRITE_PAGE, + /* Wait event for TIMELINE HISTORY */ + WAIT_EVENT_READ_TIMELINE_HISTORY_WALSENDER, + WAIT_EVENT_READ_TIMELINE_HISTORY_WRITE, + WAIT_EVENT_WRITE_TIMELINE_HISTORY, + WAIT_EVENT_WRITE_TIMELINE_HISTORY_FILE, + WAIT_EVENT_SYNC_TIMELINE_HISTORY_WRITE, + WAIT_EVENT_SYNC_TIMELINE_HISTORY_FILE, + /* Wait event for TWOPHASE FILE */ + WAIT_EVENT_READ_TWOPHASE_FILE, + WAIT_EVENT_WRITE_RECREATE_TWOPHASE_FILE, + WAIT_EVENT_SYNC_RECREATE_TWOPHASE_FILE, + /* Wait event for SYSLOGGER */ + WAIT_EVENT_READ_SYSLOGGER_FILE, + WAIT_EVENT_WRITE_SYSLOGGER_FILE, + /* Wait event for REPLSLOT */ + WAIT_EVENT_READ_RESTORE_REPLSLOT, + WAIT_EVENT_WRITE_REPLSLOT, + WAIT_EVENT_SYNC_RESTORE_REPLSLOT, + WAIT_EVENT_SYNC_SAVE_REPLSLOT, + /* Wait event for copydir */ + WAIT_EVENT_READ_COPY_FILE, + WAIT_EVENT_WRITE_COPY_FILE, + /* Wait event RELMAP FILE */ + WAIT_EVENT_READ_LOAD_RELMAP_FILE, + WAIT_EVENT_WRITE_RELMAP_FILE, + WAIT_EVENT_SYNC_WRITE_RELMAP_FILE, + /* Wait event for LOCK FILE */ + WAIT_EVENT_READ_CREATE_LOCK_FILE, + WAIT_EVENT_READ_ADDTODATEDIR_LOCK_FILE, + WAIT_EVENT_READ_RECHECKDATADIR_LOCK_FILE, + WAIT_EVENT_WRITE_CREATE_LOCK_FILE, + WAIT_EVENT_WRITE_ADDTODATEDIR_LOCK_FILE, + WAIT_EVENT_SYNC_ADDTODATEDIR_LOCK_FILE, + WAIT_EVENT_SYNC_CREATE_LOCK_FILE, + /* Wait event for DSM */ + WAIT_EVENT_WRITE_ZERO_FILL_DSM +} WaitEventIO; + +/* ---------- * Command type for progress reporting purposes * ---------- */ diff --git a/src/include/storage/fd.h b/src/include/storage/fd.h index 1a43a2c..ac37502 100644 --- a/src/include/storage/fd.h +++ b/src/include/storage/fd.h @@ -68,13 +68,13 @@ extern int max_safe_fds; extern File PathNameOpenFile(FileName fileName, int fileFlags, int fileMode); extern File OpenTemporaryFile(bool interXact); extern void FileClose(File file); -extern int FilePrefetch(File file, off_t offset, int amount); -extern int FileRead(File file, char *buffer, int amount); -extern int FileWrite(File file, char *buffer, int amount); -extern int FileSync(File file); +extern int FilePrefetch(File file, off_t offset, int amount, uint32 wait_event_info); +extern int FileRead(File file, char *buffer, int amount, uint32 wait_event_info); +extern int FileWrite(File file, char *buffer, int amount, uint32 wait_event_info); +extern int FileSync(File file, uint32 wait_event_info); extern off_t FileSeek(File file, off_t offset, int whence); -extern int FileTruncate(File file, off_t offset); -extern void FileWriteback(File file, off_t offset, off_t nbytes); +extern int FileTruncate(File file, off_t offset, uint32 wait_event_info); +extern void FileWriteback(File file, off_t offset, off_t nbytes, uint32 wait_event_info); extern char *FilePathName(File file); extern int FileGetRawDesc(File file); extern int FileGetRawFlags(File file);
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers