Re: [HACKERS] Avoiding unnecessary reads in recovery

Jim Nasby Thu, 26 Apr 2007 00:54:16 -0700

On Apr 25, 2007, at 2:48 PM, Heikki Linnakangas wrote:

In recovery, with full_page_writes=on, we read in each page only to overwrite the contents with a full page image. That's a waste of time, and can have a surprisingly large effect on recovery time.
As a quick test on my laptop, I initialized a DBT-2 test with 5 warehouses, and let it run for 2 minutes without think-times to generate some WAL. Then I did a "kill -9 postmaster", and took a copy of the data directory to use for testing recovery.
With CVS HEAD, the recovery took ~ 2 minutes. With the attached patch, it took 5 seconds. (yes, I used the same not-yet-recovered data directory in both tests, and cleared the os cache with "echo 1 > /proc/sys/vm/drop_caches").
I was surprised how big a difference it makes, but when you think about it, it's logical. Without the patch, it's doing roughly the same I/O as the test itself, reading in pages, modifying them, and writing them back. With the patch, all the reads are done sequentially from the WAL, and then written back in a batch at the end of the WAL replay, which is a lot more efficient.
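
To make the I/O difference concrete, here is a minimal sketch in C of the replay decision described above. It is not the attached patch and does not use PostgreSQL's real structures: DemoWalRecord, DemoBuffer, DemoStats, and demo_replay are hypothetical names, and the "disk" is a single in-memory page. It only illustrates that a record carrying a full page image can populate the buffer without first reading the old page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define DEMO_PAGE_SIZE 8192

/* Hypothetical WAL record for one page; not PostgreSQL's XLogRecord. */
typedef struct DemoWalRecord {
    bool has_full_page_image;
    const unsigned char *image;  /* DEMO_PAGE_SIZE bytes if has_full_page_image */
    size_t offset;               /* otherwise: byte offset to modify */
    unsigned char value;         /* otherwise: new byte value */
} DemoWalRecord;

/* Hypothetical one-page buffer cache entry. */
typedef struct DemoBuffer {
    unsigned char data[DEMO_PAGE_SIZE];
    bool valid;                  /* true once the buffer holds page contents */
} DemoBuffer;

typedef struct DemoStats {
    size_t disk_reads;           /* number of simulated random page reads */
} DemoStats;

/* Simulated read of the page from the data file. */
static void demo_read_page(const unsigned char *disk, DemoBuffer *buf,
                           DemoStats *stats)
{
    memcpy(buf->data, disk, DEMO_PAGE_SIZE);
    buf->valid = true;
    stats->disk_reads++;
}

/*
 * Replay one record. With skip_read_for_image = false the page is read
 * before being overwritten (the behavior described for CVS HEAD); with
 * true, the full page image is copied straight into the buffer.
 */
static void demo_replay(const DemoWalRecord *rec, const unsigned char *disk,
                        DemoBuffer *buf, bool skip_read_for_image,
                        DemoStats *stats)
{
    if (rec->has_full_page_image) {
        assert(rec->image != NULL);
        if (!skip_read_for_image && !buf->valid)
            demo_read_page(disk, buf, stats);   /* wasted: overwritten below */
        memcpy(buf->data, rec->image, DEMO_PAGE_SIZE);
        buf->valid = true;
    } else {
        assert(rec->offset < DEMO_PAGE_SIZE);
        if (!buf->valid)
            demo_read_page(disk, buf, stats);   /* needed: delta applies to old page */
        buf->data[rec->offset] = rec->value;
    }
}
```

In this sketch, replaying a full page image followed by a delta record costs one page read without the shortcut and none with it, and the resulting buffer contents are identical either way.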
It's interesting that (with the patch) full_page_writes can *shorten* your recovery time. I've always thought it to have a purely negative effect on performance.
I'll leave it up to the jury if this tiny little change is appropriate after feature freeze...
While working on this, this comment in ReadBuffer caught my eye:
        /*
         * During WAL recovery, the first access to any data page should
         * overwrite the whole page from the WAL; so a clobbered page
         * header is not reason to fail.  Hence, when InRecovery we may
         * always act as though zero_damaged_pages is ON.
         */
        if (zero_damaged_pages || InRecovery)
        {
But that assumption only holds if full_page_writes is enabled, right? I changed that in the attached patch as well, but if it isn't accepted, that part of it should still be applied, I think.

So what happens if a backend is running with full_page_writes = off, someone edits postgresql.conf to turn it on and forgets to reload/restart, and then we crash? You'll come up in recovery mode thinking that f_p_w was turned on, when in fact it wasn't.

ISTM that we need to somehow log what the status of full_page_writes is, if it's going to affect how recovery works.
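
A rough sketch in C of what I mean, assuming the setting is stamped into something like the checkpoint record. DemoCheckpoint and both function names are made up for illustration; this is not an existing PostgreSQL structure or API. The point is only that recovery consults the value that was in effect when the WAL was written, not whatever postgresql.conf says at startup.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical checkpoint record; not PostgreSQL's real checkpoint struct. */
typedef struct DemoCheckpoint {
    bool full_page_writes;   /* value in effect when the WAL was generated */
} DemoCheckpoint;

/* At checkpoint time, record the setting the running server actually uses. */
static DemoCheckpoint demo_make_checkpoint(bool active_full_page_writes)
{
    DemoCheckpoint ckpt = { active_full_page_writes };
    return ckpt;
}

/*
 * During recovery, decide whether the first touch of each page can be
 * assumed to be a full page image. The value from the config file is
 * deliberately ignored: it may have been edited without a reload.
 */
static bool demo_recovery_can_assume_full_pages(const DemoCheckpoint *ckpt,
                                                bool config_file_value)
{
    assert(ckpt != NULL);
    (void) config_file_value;
    return ckpt->full_page_writes;
}
```

With the scenario above (running with off, file edited to on, then a crash), the recorded value is false, so recovery would not assume full page images regardless of the file.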

--
Jim Nasby                                            [EMAIL PROTECTED]
EnterpriseDB      http://enterprisedb.com      512.569.9461 (cell)


