From: Alvaro Herrera [mailto:alvhe...@alvh.no-ip.org]
> Tsunakawa, Takayuki wrote:
>
> > (Although unrelated to this, I've also been wondering why PostgreSQL
> > flushes WAL to disk when writing a page in the shared buffer, because
> > PostgreSQL doesn't use WAL for undo.)
>
> The reason is that if the system crashes after writing the data page to
> disk, but before writing the WAL, the data page would be inconsistent with
> data in pages that weren't flushed, since there is no WAL to update those
> other pages.  Also, if the system crashes after partially writing the page
> (say it writes the first 4kB) then the page is downright corrupted with
> no way to fix it.
>
> So there has to be a barrier that ensures that the WAL is flushed up to
> the last position that modified a page (i.e. that page's LSN) before actually
> writing that page to disk.  And this is why we can't use mmap() for shared
> buffers -- there is no mechanism to force the WAL down if the operating
> system has the liberty to flush pages whenever it likes.
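
To make the barrier described above concrete, here is a minimal toy sketch in Python. It is not PostgreSQL's code: the names Wal, Page, modify_page and write_page are hypothetical, the "disk" is a dict, and flush() merely stands in for writing and fsyncing the log.

```python
from dataclasses import dataclass, field


@dataclass
class Wal:
    """Toy write-ahead log: records are appended in memory, then flushed."""
    records: list = field(default_factory=list)
    flushed_upto: int = 0  # LSN up to which the log is considered durable

    def append(self, payload):
        self.records.append(payload)
        return len(self.records)  # LSN = 1-based position of the record

    def flush(self, upto_lsn):
        # Stand-in for write() + fsync() of the log up to upto_lsn.
        self.flushed_upto = max(self.flushed_upto, upto_lsn)


@dataclass
class Page:
    data: str = ""
    lsn: int = 0  # LSN of the last WAL record that modified this page


def modify_page(wal, page, new_data):
    # Log the change first, then stamp the page with that record's LSN.
    page.lsn = wal.append(new_data)
    page.data = new_data


def write_page(wal, page, disk, page_no):
    # The barrier: the log must be durable up to the page's LSN
    # before the page itself may reach disk.
    if wal.flushed_upto < page.lsn:
        wal.flush(page.lsn)
    disk[page_no] = (page.data, page.lsn)
```

In this sketch, if a table page is modified at LSN 1 and an index page at LSN 2, writing only the index page forces the log down through LSN 2. The table page's record is then durable as well, so recovery could redo the table change even though that page never reached disk.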

I see.  The latter is the torn page problem, which is solved by full page image WAL records.  I understand that an example of the former problem is an inconsistency between a table page and an index page: if an index page is flushed to disk without flushing the WAL and the corresponding table page, an index entry would point to a wrong table record after recovery.

Thanks, my long-standing question has been solved.

Regards
Takayuki Tsunakawa
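
P.S. For illustration only, the torn page case can be sketched as follows. This is a toy model, not PostgreSQL's recovery code: torn_write and recover are hypothetical names, and the 8 kB page size and the 4 kB tear point are assumptions chosen to match the "first 4kB" example in the quoted text.

```python
PAGE_SIZE = 8192  # assumed page size for this sketch
TEAR_AT = 4096    # assumed amount written before the crash


def torn_write(old_page: bytes, new_page: bytes) -> bytes:
    # Simulate a crash after only the first 4 kB of the new page reached
    # disk: the result mixes new and old halves and matches neither version.
    return new_page[:TEAR_AT] + old_page[TEAR_AT:]


def recover(on_disk: bytes, full_page_image: bytes) -> bytes:
    # Replaying a full page image overwrites the whole page, so the torn
    # on-disk content is never read or trusted.
    return full_page_image
```

The point of the sketch is that an ordinary redo record, which describes a change relative to the existing page content, would have nothing sane to apply to the mixed page, whereas a full page image does not depend on what is on disk.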