On Thursday, October 18, 2012 8:49 PM Fujii Masao wrote:
On Wed, Oct 17, 2012 at 8:46 PM, Amit Kapila <amit.kap...@huawei.com> wrote:
>> On Monday, October 15, 2012 3:43 PM Heikki Linnakangas wrote:
>> On 13.10.2012 19:35, Fujii Masao wrote:
>> > On Thu, Oct 11, 2012 at 11:52 PM, Heikki Linnakangas
>> > <hlinnakan...@vmware.com>  wrote:
>> >> Ok, thanks. Committed.
>> >
>> > I found one typo. The attached patch fixes that typo.
>>
>> Thanks, fixed.
>>
>> > ISTM you need to update the protocol.sgml because you added
>> > the field 'replyRequested' to WalSndrMessage and StandbyReplyMessage.
>
>
>>
>> > Is it worth adding the same mechanism (send back the reply immediately
>> > if walsender request a reply) into pg_basebackup and pg_receivexlog?
>>
>> Good catch. Yes, they should be taught about this too. I'll look into
>> doing that too.
>
> If you have not started and you don't have objection, I can pickup this to
> complete it.
>
> For both (pg_basebackup and pg_receivexlog), we need to get a timeout
> parameter from user in command line, as
> there is no conf file here. New Option can be -t (parameter name can be
> recvtimeout).
>
> The main changes will be in function ReceiveXlogStream(), it is a common
> function for both
> Pg_basebackup and pg_receivexlog. Handling will be done in same way as we
> have done in walreceiver.
>
> Suggestions/Comments?

>Before implementing the timeout parameter, I think that it's better to change
>both pg_basebackup background process and pg_receivexlog so that they
>send back the reply message immediately when they receive the keepalive
>message requesting the reply. Currently, they always ignore such keepalive
>message, so status interval parameter (-s) in them always must be set to
>the value less than replication timeout. We can avoid this troublesome
>parameter setting by introducing the same logic of walreceiver into both
>pg_basebackup background process and pg_receivexlog.

Please find attached a patch that makes the change you suggested
(send an immediate reply to a keepalive that requests one).
Both pg_basebackup and pg_receivexlog use the same function,
ReceiveXlogStream(), so a single change covers both.
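
For illustration only, the immediate-reply check could look roughly like the
sketch below. It is not the attached patch. The message layout (offsets and
lengths) and the helper name keepalive_requests_reply() are assumptions made
for this example; the actual wire format should be taken from protocol.sgml.

```c
#include <stddef.h>
#include <stdbool.h>

/*
 * ASSUMED layout, for this sketch only (not taken from the patch):
 *   byte 0        message type, 'k' for keepalive
 *   bytes 1..16   two 8-byte fields (WAL end position, send time)
 *   byte 17       replyRequested flag, nonzero means "reply now"
 */
#define KEEPALIVE_TYPE      'k'
#define KEEPALIVE_FLAG_OFF  17
#define KEEPALIVE_MIN_LEN   18

/* Return true when buf holds a keepalive that asks for an immediate reply. */
static bool
keepalive_requests_reply(const char *buf, size_t len)
{
    if (buf == NULL || len < KEEPALIVE_MIN_LEN)
        return false;       /* too short to carry the flag */
    if (buf[0] != KEEPALIVE_TYPE)
        return false;       /* some other message, e.g. WAL data */
    return buf[KEEPALIVE_FLAG_OFF] != 0;
}
```

In the receive loop, a true result would mean sending the status reply right
away instead of waiting for the status interval (-s) to expire.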


Next, regarding a timeout for pg_basebackup and pg_receivexlog:
We can use a mechanism similar to the walreceiver timeout while streaming WAL
from the server, but the same logic cannot be used if the network goes down
while the other database files are being fetched from the server.
The reason is that, to receive the data files, PQgetCopyData() is called
in synchronous mode, so it waits indefinitely until it gets some
data.
To solve this, I can think of the following options:
1. Make this call asynchronous as well (but I am not sure about the impact of
   this).
2. In function pqWait, instead of passing the hard-coded value -1 (i.e.
   infinite wait), pass a finite time. This time can be received as a
   command-line argument by the respective utility and set in the PGconn
   structure.
   To hold the timeout value in PGconn, we can either:
        a. Add a new parameter in PGconn to indicate the receive timeout.
        b. Reuse the existing connect_timeout parameter for the receive
           timeout as well, but this may lead to confusion.
3. Any other better option?
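
As a rough sketch of what a finite wait means in options 1 and 2, the helper
below waits on a descriptor with a timeout using plain POSIX poll(). It is not
libpq's internal pqWait code, and the name wait_readable() is made up for this
example. With libpq, the descriptor could come from PQsocket(conn), and
PQgetCopyData() would then be called with async set to 1 after the wait.

```c
#include <poll.h>
#include <errno.h>

/*
 * Wait until fd is ready for reading or timeout_ms elapses.
 * Returns 1 if the descriptor is ready (data, EOF, or a pending error),
 * 0 on timeout, and -1 on failure.
 * A negative timeout_ms waits forever, like the current hard-coded -1.
 */
static int
wait_readable(int fd, int timeout_ms)
{
    struct pollfd pfd;
    int           rc;

    pfd.fd = fd;
    pfd.events = POLLIN;
    pfd.revents = 0;

    do
    {
        /* Note: a retry after EINTR restarts the full timeout. */
        rc = poll(&pfd, 1, timeout_ms);
    } while (rc < 0 && errno == EINTR);

    if (rc < 0)
        return -1;
    return rc > 0 ? 1 : 0;
}
```

A return value of 0 is the case that matters here: the caller can treat it as
"server not responding" and give up instead of blocking forever.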
        
Apart from the above issue, there is a possibility of a hang if the network
goes down at connect time, because connect_timeout is NULL by default and
connectDBComplete() will wait indefinitely for the connection to
succeed.
So shall we have a separate command-line argument for this as well, or is
there another way you would suggest?
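
To make the connect-time case concrete: connect_timeout is an existing libpq
connection parameter (in seconds), so a utility could pass it in the conninfo
string. The sketch below only builds such a string. The helper name
conninfo_with_timeout() is hypothetical, and how the utility would obtain the
value (for example from a new command-line option) is left open.

```c
#include <stdio.h>

/*
 * Append "connect_timeout=<secs>" to a base conninfo string.
 * Returns 0 on success, -1 if secs is not positive or out is too small.
 */
static int
conninfo_with_timeout(char *out, size_t outlen, const char *base, int secs)
{
    int n;

    if (secs <= 0)
        return -1;          /* reject values that would not limit the wait */

    n = snprintf(out, outlen, "%s connect_timeout=%d", base, secs);
    if (n < 0 || (size_t) n >= outlen)
        return -1;          /* formatting error or truncated output */
    return 0;
}
```

The resulting string could then be handed to PQconnectdb().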

Suggestions/Comments?

With Regards,
Amit Kapila.

Attachment: pg_basebackup_keepalive_reply.patch
