Re: [HACKERS] pg_standby: Question about truncation of trigger file in fast failover

2014-02-20 Thread Neil Thombre
On Wed, Feb 19, 2014 at 1:42 PM, Heikki Linnakangas hlinnakan...@vmware.com
 wrote:

 On 02/19/2014 11:15 PM, Neil Thombre wrote:

 And that is where I have a question. I noticed that in pg_standby.c, when
 we detect the word "fast" in the trigger file, we truncate the file.

 https://github.com/postgres/postgres/blob/REL9_1_11/contrib/pg_standby/pg_standby.c#L456

 There is also a comment above it about not upsetting the server.

 https://github.com/postgres/postgres/blob/REL9_1_11/contrib/pg_standby/pg_standby.c#L454

 What is the purpose of truncating the file? To do a smart failover once
 you come out of standby? But, when I look at xlog.c, when we come out of
 standby due to a failure returned by restore_command, we call
 CheckForStandbyTrigger() here:

 https://github.com/postgres/postgres/blob/REL9_1_11/src/backend/access/transam/xlog.c#L10441

 Now, CheckForStandbyTrigger() unlinks the trigger file. I noticed through
 the debugger that the unlinking happens before xlog.c makes a call to the
 next restore_command. So, what is the reason for truncating the "fast"
 word from the trigger file if the file is going to be deleted soon after
 it is discovered? How will we upset the server if we don't?


 At end-of-recovery, the server will fetch again the last WAL file that was
 replayed. If it can no longer find it, because restore_command now returns
 an error even though it succeeded for the same file a few seconds earlier,
 it will throw an error and refuse to start up.


The restore_command returns an error exactly once in my setup. So the next
time around, it does go back and is able to fetch the last successfully
applied segment. Let me go through the steps:

New restore_command = ! fgrep -qsi fast trigger_file && Old restore_command

1. Until it finds "fast" in the trigger file, it continues running the
(old) restore_command. It applies 00030C310099 successfully
from the archive.
2. I run: echo fast > trigger_file
3. The next restore_command returns failure (because of the first part of
my command, before the &&), so it never ends up applying the next segment
it was supposed to apply, i.e., 00030C31009A
4. The db comes out of standby and checks for the trigger file in
CheckForStandbyTrigger(), which unlinks it. Now the trigger file is gone!
5. Next, it tries to read the last applied segment 00030C310099
again. The new restore_command WILL NOT return failure, because the trigger
file is gone and the first part (before the &&) is true, so it runs the
old restore_command and tries to get the same file 00030C310099
6. It is while reapplying this file that I get the following error:

Feb  7 00:37:45  LOG:  restored log file 00030C310099 from archive

Feb  7 00:37:45  FATAL:  WAL ends before consistent recovery point

This error comes from:

https://github.com/postgres/postgres/blob/REL9_1_11/src/backend/access/transam/xlog.c#L6782-L6783

Therefore, I feel that something is amiss in my setup. I wanted to
understand the motive/tribal knowledge behind the truncation part of
pg_standby's fast failover, which is there so as not to upset the server.
In other words, I have a feeling that by not truncating the trigger file I
am inadvertently upsetting the server, and that this is the cause of my
FATAL error.
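
For reference, steps 1 to 5 above can be reproduced outside the server with
a short shell sketch. Everything named in it is a stand-in and not part of
my actual setup: TRIGGER is a temporary file, old_restore is a placeholder
that always succeeds, and rm plays the role of the unlink done by
CheckForStandbyTrigger(). grep -F is used as the equivalent of fgrep. It
only shows that the gate fails exactly once; it does not reproduce the
FATAL error itself.

```shell
#!/bin/sh
# Hypothetical stand-ins: TRIGGER and old_restore are illustrative only.
TRIGGER=$(mktemp)

# Placeholder for the real (old) restore_command; always succeeds here.
old_restore() { return 0; }

# The gate: fail if the trigger file contains "fast", else run the old command.
# -s keeps grep quiet when the trigger file does not exist.
gated_restore() {
    ! grep -F -qsi fast "$TRIGGER" && old_restore
}

gated_restore; r1=$?        # step 1: no "fast" yet, old command runs
echo fast > "$TRIGGER"      # step 2: request fast failover
gated_restore; r2=$?        # step 3: gate reports failure
rm -f "$TRIGGER"            # step 4: stands in for the unlink in the server
gated_restore; r3=$?        # step 5: trigger gone, old command runs again

echo "$r1 $r2 $r3"          # prints "0 1 0"
```

So the re-fetch of the last applied segment in step 5 does reach the old
restore_command, which matches the "restored log file" line in the log.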






 That's the way it used to be until 9.2, anyway. In 9.2, the behavior was
 changed, so that the server keeps all the files restored from archive, in
 pg_xlog, so that it can access them again. I haven't tried, but it's
 possible that the truncation is no longer necessary. Try it, with 9.1 and
 9.3, and see what happens.


It may very well be that 9.3 does not have this problem. I will definitely
try this out when I have a chance. The problem is pretty obscure and rarely
happens on our customer databases, so we work with whatever database
version we have at the moment to gather as much forensic data as we can and
to try out possible solutions.


 - Heikki


Thanks a lot for your help.


[HACKERS] pg_standby: Question about truncation of trigger file in fast failover

2014-02-19 Thread Neil Thombre
I was trying to understand (and then perhaps mimic) how pg_standby does a
fast failover.

My current understanding is that when a secondary db is in standby mode, it
will replay all the archive logs available from the primary and then
start streaming. It is at this point that xlog.c checks for the existence
of a trigger file to promote the secondary. This has been a cause of some
irritation for some of our customers who do not really care about catching
up all the way. I want to achieve the exact semantics of pg_standby's fast
failover option.

I manipulated the restore command to return 'failure' when the word "fast"
is present in the trigger file (see below), hoping that when I want a
secondary database to come out fast, I can just echo the word "fast" into
the trigger file, thereby simulating pg_standby's fast failover behavior.
However, that did not work. Technically, I did not truncate the trigger
file the way pg_standby does.

New restore_command = ! fgrep -qsi fast trigger_file && Old restore_command


And that is where I have a question. I noticed that in pg_standby.c, when we
detect the word "fast" in the trigger file, we truncate the file.

https://github.com/postgres/postgres/blob/REL9_1_11/contrib/pg_standby/pg_standby.c#L456

There is also a comment above it about not upsetting the server.

https://github.com/postgres/postgres/blob/REL9_1_11/contrib/pg_standby/pg_standby.c#L454

What is the purpose of truncating the file? To do a smart failover once you
come out of standby? But, when I look at xlog.c, when we come out of
standby due to a failure returned by restore_command, we call
CheckForStandbyTrigger() here:

https://github.com/postgres/postgres/blob/REL9_1_11/src/backend/access/transam/xlog.c#L10441

Now, CheckForStandbyTrigger() unlinks the trigger file. I noticed through
the debugger that the unlinking happens before xlog.c makes a call to the
next restore_command. So, what is the reason for truncating the "fast"
word from the trigger file if the file is going to be deleted soon after it
is discovered? How will we upset the server if we don't?


Assuming this question is answered and I get a better understanding, I have
a follow-up question. If truncation is indeed necessary, can I simulate
the truncation by manipulating restore_command and achieve the same effect
as a fast failover in pg_standby?
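
To make the follow-up question concrete, here is a rough sketch of what I
mean by simulating the truncation from restore_command. It is untested
against a real server, and the names TRIGGER, old_restore, and fast_gate
are hypothetical. The idea is only this: on seeing "fast", empty the
trigger file and report failure once, so that any later request falls
through to the old restore_command.

```shell
#!/bin/sh
# Hypothetical names: TRIGGER, old_restore, and fast_gate are illustrative only.
TRIGGER=$(mktemp)

# Placeholder for the real (old) restore_command; always succeeds here.
old_restore() { return 0; }

fast_gate() {
    if grep -F -qsi fast "$TRIGGER"; then
        : > "$TRIGGER"      # truncate to zero length, keep the file itself
        return 1            # report failure once
    fi
    old_restore             # otherwise behave like the old command
}

echo fast > "$TRIGGER"
fast_gate; first=$?         # sees "fast": truncates and fails
fast_gate; second=$?        # file is now empty: old command runs

if [ -s "$TRIGGER" ]; then empty=no; else empty=yes; fi
echo "$first $second $empty"    # prints "1 0 yes"
rm -f "$TRIGGER"
```

Whether this is enough depends on what the server does with the trigger
file in between, which is really what I am asking about.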



Thanks in advance for the help.

Neil


Re: [HACKERS] pg_standby: Question about truncation of trigger file in fast failover

2014-02-19 Thread Heikki Linnakangas

On 02/19/2014 11:15 PM, Neil Thombre wrote:

And that is where I have a question. I noticed that in pg_standby.c, when we
detect the word "fast" in the trigger file, we truncate the file.

https://github.com/postgres/postgres/blob/REL9_1_11/contrib/pg_standby/pg_standby.c#L456

There is also a comment above it about not upsetting the server.

https://github.com/postgres/postgres/blob/REL9_1_11/contrib/pg_standby/pg_standby.c#L454

What is the purpose of truncating the file? To do a smart failover once you
come out of standby? But, when I look at xlog.c, when we come out of
standby due to a failure returned by restore_command, we call
CheckForStandbyTrigger() here:

https://github.com/postgres/postgres/blob/REL9_1_11/src/backend/access/transam/xlog.c#L10441

Now, CheckForStandbyTrigger() unlinks the trigger file. I noticed through
the debugger that the unlinking happens before xlog.c makes a call to the
next restore_command. So, what is the reason for truncating the "fast"
word from the trigger file if the file is going to be deleted soon after it
is discovered? How will we upset the server if we don't?


At end-of-recovery, the server will fetch again the last WAL file that
was replayed. If it can no longer find it, because restore_command now
returns an error even though it succeeded for the same file a few seconds
earlier, it will throw an error and refuse to start up.


That's the way it used to be until 9.2, anyway. In 9.2, the behavior was 
changed, so that the server keeps all the files restored from archive, 
in pg_xlog, so that it can access them again. I haven't tried, but it's 
possible that the truncation is no longer necessary. Try it, with 9.1 
and 9.3, and see what happens.


- Heikki

