On 12-01-17 05:38 AM, Fujii Masao wrote:
On Fri, Jan 13, 2012 at 5:02 PM, Fujii Masao<masao.fu...@gmail.com>  wrote:
The amount of code changes to allow pg_basebackup to make a backup from
the standby seems to be small. So I ended up merging that changes and the
infrastructure patch. WIP patch attached. But I'd happy to split the patch again
if you want.
Attached is the updated version of the patch. I wrote the limitations of
standby-only backup in the document and changed the error messages.


Here is my review of this verison of the patch. I think this patch has been in every CF for 9.2 and I feel it is getting close to being committed. The only issue of significants is a crash I encountered while testing, see below.

I am fine with including the pg_basebackup changes in the patch it also makes testing some of the other changes possible.


The documentation updates you have are good

I don't see any issues looking at the code.



Testing Review
--------------------------------

I encountered this on my first replica (the one based on the master). I am not sure if it is related to this patch, it happened after the pg_basebackup against the replica finished.

TRAP: FailedAssertion("!(((xid) != ((TransactionId) 0)))", File: "twophase.c", Line: 1238)
LOG:  startup process (PID 12222) was terminated by signal 6: Aborted
LOG:  terminating any other active server processes

A little earlier this postmaster had printed.

LOG:  restored log file "00000001000000000000001F" from archive
LOG:  restored log file "000000010000000000000020" from archive
cp: cannot stat `/usr/local/pgsql92git/archive/000000010000000000000021': No such file or directory
LOG:  unexpected pageaddr 0/19000000 in log file 0, segment 33, offset 0
cp: cannot stat `/usr/local/pgsql92git/archive/000000010000000000000021': No such file or directory


I have NOT been able to replicate this error and I am not sure exactly what I had done in my testing prior to that point.


In another test run I had

- set full page writes=off and did a checkpoint
- Started the pg_basebackup
- set full_page_writes=on and did a HUP + some database activity that might have forced a checkpoint.

I got this message from pg_basebackup.
./pg_basebackup -D ../data3 -l foo -h localhost -p 5438
pg_basebackup: could not get WAL end position from server

I point this out because the message is different than the normal "could not initiate base backup: FATAL: WAL generated with full_page_writes=off" thatI normally see. We might want to add a PQerrorMessage(conn)) to pg_basebackup to print the error details. Since this patch didn't actually change pg_basebackup I don't think your required to improve the error messages in it. I am just mentioning this because it came up in testing.


The rest of the tests I did involving changing full_page_writes with/without checkpoints and sighups and promoting the replica seemed to work as expected.




Regards,





Reply via email to