Hi, On 2023-04-07 12:17:38 -0400, Melanie Plageman wrote: > Attached v9 addresses review feedback as well as resolving merge > conflicts with recent relation extension patchset.
I've edited it a bit more: - removed pgstat_tracks_io_time() and replaced it by returning the new IO_COL_INVALID = -1 from pgstat_get_io_time_index() when there's no time - moved PgStat_Counter count, time into the respective branches. It feels somewhat wrong to access the time when we then decide there is no time. - s/io_object/io_obj/ in pgstat_count_io_op_time(), combined with added linebreaks, got the code to under 80 chars - renamed pg_stat_microseconds_to_milliseconds to pg_stat_us_to_ms - removed a spurious newline - the times reported by pg_stat_io had their fractional part removed, due to pg_stat_us_to_ms returning an integer Verifying this, I saw that the write time visible in pg_stat_io didn't quite match what I saw in log_checkpoints. But not always. Eventually I figured out that that's not pg_stat_io's fault - log_checkpoint's write includes a lot of things, including several other CheckPoint* routines, flushing WAL, asking the kernel to flush things to disk... The biggest portion in my case were the smgrwriteback() calls - which pg_stat_io doesn't track - oops. Pushed up to and including 0003. > I've changed pgstat_count_io_op_time() to take a count and call > pgstat_count_io_op_n() so it can be used with smgrzeroextend(). I do > wish that the parameter to pgstat_count_io_op_n() was called "count" and > not "cnt"... Heh. > I've also reordered the call site of pgstat_count_io_op_time() in a few > locations, but I have some questions about this. > > Before, I didn't think it mattered much that we didn't finish counting > IO time until after setting BM_VALID or BM_DIRTY and unsetting > BM_IO_IN_PROGRESS. With the relation extension code doing this for many > buffers at once, though, I wondered if this will make the IO timing too > inaccurate. > As such, I've moved pgstat_count_io_op_time() to before we set those > flags in all locations. I did wonder if it is bad to prolong having the > buffer pinned and not having those flags set, though. I went back and forth about this before. I think it's ok the way you did it. I think 0004 needs a bit more work. At the very least we would have to swap the order of pgstat_flush_pending_entries() and pgstat_flush_io() - entirely doable. Unlike 0003, this doesn't make pg_stat_io more complete, or such, so I'm inclined to leave it for 17. I think there might be some more opportunities for having counts "flow down", like the patch does. Greetings, Andres Freund