[HACKERS] Online backup cause boot failure, anyone know why?
I want to create a database backup while PG is running, so I call pg_start_backup(''), scp the data to a backup directory, and then call pg_stop_backup. Then I rebooted PG, and it failed to start with log messages like "unexpected pageaddr X/X in log file X, segment X, offset X" and "WAL ends before end time of backup dump". I then checked the failing XLOG file and found that the page in error contains a pageaddr 8K before where it should be, and that the failing XLOG record is an ONLINE CHECKPOINT with 60 bytes in the preceding page and the other 4 bytes missing. Has anyone seen this before? Please help me!

-- Richard 2010-08-05

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Online backup cause boot failure, anyone know why?
PS: I am using PG 8.3.7.

-- Richard 2010-08-05
Re: [HACKERS] Online backup cause boot failure, anyone know why?
This question really belongs on the pgsql-general list, not the -hackers list.

If all you copied was the data directory, then you haven't done this right anyway. See http://www.postgresql.org/docs/8.3/static/continuous-archiving.html#BACKUP-TIPS

Why did you reboot postgres after taking your backup?

cheers

andrew
[HACKERS] Re: Re: [HACKERS] Online backup cause boot failure, anyone know why?
I rebooted PG because I found that the PG recovery end point is far away from the actual end point of the XLOG in the backup directory, so I wanted to test whether the original DB is OK. Unfortunately, I got the same PG log on the original DB. I don't understand what you said. What am I missing?

-- Richard 2010-08-05
Re: [HACKERS] Online backup cause boot failure, anyone know why?
Richard husttrip...@vip.sina.com writes:
> PS : I am using PG 8.3.7

I believe there's a related bug fix in 8.3.8. BTW, -hackers is not the place for this type of question.

regards, tom lane
Re: [HACKERS] Re: Re: [HACKERS] Online backup cause boot failure, anyone know why?
On Thu, Aug 5, 2010 at 9:50 AM, Richard husttrip...@vip.sina.com wrote:
> I don't understand what you said. What am I missing?

The transaction logs archived during the backup?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
Re: [HACKERS] Online backup cause boot failure, anyone know why?
Thanks for replying. But I could not find any relation between the RequestXLogSwitch function and the error I encountered. For performance reasons, I changed the pg_start_backup checkpoint type from CHECKPOINT_WAIT to CHECKPOINT_IMMEDIATE. Does that matter?

-- Richard 2010-08-05
[HACKERS] Re: Re: [HACKERS] Re: Re: [HACKERS] Online backup cause boot failure,anyone know why?
Oh sorry, I missed something. I turned off XLOG archiving in the code after pg_start_backup, so the pg_xlog directory contains all the xlog files. And for performance reasons, I changed the checkpoint type in pg_start_backup to CHECKPOINT_IMMEDIATE. Does that matter? The PG log I mentioned above is the server's error log, not the XLOG.

-- Richard 2010-08-05
Re: [HACKERS] Re: Re: [HACKERS] Online backup cause boot failure,anyone know why?
On Thu, Aug 5, 2010 at 10:20 AM, Richard husttrip...@vip.sina.com wrote:
> Oh sorry, I missed something. I turned off the XLOG archive in code after pg_start_backup so the pg_xlog directory contains all the xlog files. And for performance purpose, I change the checkpoint type in pg_start_backup to CHECKPOINT_IMMEDIATE, does it matter? The PG log I mentioned above is the running error log not the XLOG.

Well, it's pretty clear that you're missing some WAL; otherwise, you wouldn't be getting an error that says "WAL ends before end time of backup dump". It's hard to speculate as to whether that's a configuration problem or a result of your custom modifications to the source code, since you haven't provided many details about either.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
Re: [HACKERS] Online backup cause boot failure, anyone know why?
Richard husttrip...@vip.sina.com writes:
> For performance purpose, I change the pg_start_backup checkpoint type from CHECKPOINT_WAIT to CHECKPOINT_IMMEDIATE, does it matter?

Oh, so this isn't so much 8.3.7 as randomly-hacked-up 8.3.7. Yes, that'd break it, I believe. CHECKPOINT_IMMEDIATE doesn't imply waiting.

regards, tom lane
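To illustrate the point that CHECKPOINT_IMMEDIATE does not imply waiting: the two options are independent request flags, so asking for one says nothing about the other, and combining them with a bitwise OR asks for both. The following is a minimal Python sketch, not PostgreSQL source code. The flag names follow the thread; the numeric values and the request_checkpoint helper are hypothetical and only model the behavior described here.

```python
from enum import IntFlag


class CheckpointFlag(IntFlag):
    # Names follow the thread; the numeric values are illustrative only.
    IMMEDIATE = 0x02  # perform the checkpoint as fast as possible
    WAIT = 0x08       # block the caller until the checkpoint has completed


def request_checkpoint(flags: CheckpointFlag) -> str:
    """Toy model: state of the checkpoint at the moment the request returns."""
    if flags & CheckpointFlag.WAIT:
        # The caller was blocked until the checkpoint finished.
        return "completed"
    # The request was only signalled; the checkpoint may still be running.
    return "pending"


if __name__ == "__main__":
    for flags in (
        CheckpointFlag.WAIT,
        CheckpointFlag.IMMEDIATE,
        CheckpointFlag.WAIT | CheckpointFlag.IMMEDIATE,
    ):
        print(repr(flags), "->", request_checkpoint(flags))
```

In this model, a caller that passes only IMMEDIATE continues while the checkpoint is still pending, which is the situation a backup start must avoid.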
Re: [HACKERS] Online backup cause boot failure, anyone know why?
I am sorry, my English is poor. I was confused by what you said. What do you mean by "that'd break it"?

-- Richard 2010-08-05
Re: [HACKERS] Online backup cause boot failure, anyone know why?
On 05/08/10 17:56, Richard wrote:
> I am sorry, my English is poor. I was confused by what you said. What do you mean by saying that'd break it!

Replacing CHECKPOINT_WAIT with CHECKPOINT_IMMEDIATE broke it. Don't do that.

If you want to change the behavior of pg_start_backup() to perform the checkpoint immediately, change CHECKPOINT_WAIT to CHECKPOINT_WAIT | CHECKPOINT_IMMEDIATE.

The usual work-around, though, is not to hack the source code, but to perform a manual CHECKPOINT just before calling pg_start_backup(). That makes the checkpoint performed by pg_start_backup() finish quickly.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
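The work-around described above (an explicit CHECKPOINT issued just before pg_start_backup()) could be scripted by a client roughly as follows. This is only a sketch: the online_backup_steps helper and its step labels are hypothetical, and the copy step is left as a placeholder. Only the CHECKPOINT command and the pg_start_backup()/pg_stop_backup() calls come from the thread, and WAL archiving must still be configured separately.

```python
from typing import List, Tuple


def online_backup_steps(label: str) -> List[Tuple[str, str]]:
    """Return the ordered (kind, command) steps for an online backup.

    Hypothetical helper: it only builds the sequence and executes nothing.
    """
    # Double any single quotes so the label is a valid SQL string literal.
    quoted = label.replace("'", "''")
    return [
        # Manual checkpoint first, so the one inside pg_start_backup() is quick.
        ("sql", "CHECKPOINT;"),
        ("sql", f"SELECT pg_start_backup('{quoted}');"),
        # Placeholder: copy the data directory with the tool of your choice.
        ("copy", "copy the data directory to the backup location"),
        ("sql", "SELECT pg_stop_backup();"),
    ]


if __name__ == "__main__":
    for kind, command in online_backup_steps("nightly"):
        print(f"{kind}: {command}")
```

Keeping the steps as data makes the ordering easy to check before a client runs them over its own database connection.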
Re: [HACKERS] Online backup cause boot failure, anyone know why?
All jods are done by client code, not manually. I still don't understand what you said. What breaks what? Thanks!

-- Richard 2010-08-05
Re: [HACKERS] Online backup cause boot failure, anyone know why?
2010/8/5 Richard husttrip...@vip.sina.com:
> All jods are done by client code, not manually.

What is a jod?

> I still did't not understand what you said. What break what?

The fact that you replaced CHECKPOINT_WAIT with CHECKPOINT_IMMEDIATE is the cause of your problem. You broke the correctness of the system by doing so.

Nicolas
[HACKERS] Re: Re: [HACKERS] Online backup cause boot failure, anyone know why?
Sorry, wrong word, it should be "job". Do you mean that the wrong type of checkpoint causes XLOG file recovery to fail? I am confused: the XLOG files seem corrupted. Is that also caused by the checkpoint type? If so, how can it do this?

-- Richard 2010-08-05
Re: [HACKERS] Re: Re: [HACKERS] Online backup cause boot failure,anyone know why?
Let's be clear. If you change the postgres code and then things break, I think you're pretty much on your own. We can accept some responsibility for helping you if you're running our code, but not if you're running our code which you have subsequently mangled. If you break things, you get to fix them.

cheers

andrew