Re: [HACKERS] Re: [bug fix] Cascading standby cannot catch up and get stuck emitting the same message repeatedly

2017-01-24 Thread Fujii Masao
On Thu, Nov 24, 2016 at 4:24 PM, Amit Kapila wrote:
> On Thu, Nov 24, 2016 at 10:29 AM, Tsunakawa, Takayuki wrote:
>> From: pgsql-hackers-ow...@postgresql.org
>>> [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Amit Kapila
>>> Thanks for the clarification.  I could reproduce the issue and confirm that
>>> the patch fixes it.  Please find attached the cascading standby logs for PG 9.2
>>> Head and Patch (I have truncated a few lines at the end of the server log
>>> generated on Head, as they were repetitive).  I don't think explaining the bug
>>> steps directly in code comments is right (think how the code would look if we
>>> started writing bug steps for each bug fix).  So I have modified the comment
>>> to explain the situation and the reason for the check; please see whether
>>> that looks okay to you.
>>
>> Thank you, I'm happy with your comment.
>>
>
> Okay, I have marked the patch as 'Ready for Committer'.

Pushed. Thanks!

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [bug fix] Cascading standby cannot catch up and get stuck emitting the same message repeatedly

2016-11-23 Thread Amit Kapila
On Thu, Nov 24, 2016 at 10:29 AM, Tsunakawa, Takayuki wrote:
> From: pgsql-hackers-ow...@postgresql.org
>> [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Amit Kapila
>> Thanks for the clarification.  I could reproduce the issue and confirm that
>> the patch fixes it.  Please find attached the cascading standby logs for PG 9.2
>> Head and Patch (I have truncated a few lines at the end of the server log
>> generated on Head, as they were repetitive).  I don't think explaining the bug
>> steps directly in code comments is right (think how the code would look if we
>> started writing bug steps for each bug fix).  So I have modified the comment
>> to explain the situation and the reason for the check; please see whether
>> that looks okay to you.
>
> Thank you, I'm happy with your comment.
>

Okay, I have marked the patch as 'Ready for Committer'.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] Re: [bug fix] Cascading standby cannot catch up and get stuck emitting the same message repeatedly

2016-11-23 Thread Tsunakawa, Takayuki
From: pgsql-hackers-ow...@postgresql.org
> [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Amit Kapila
> Thanks for the clarification.  I could reproduce the issue and confirm that
> the patch fixes it.  Please find attached the cascading standby logs for PG 9.2
> Head and Patch (I have truncated a few lines at the end of the server log
> generated on Head, as they were repetitive).  I don't think explaining the bug
> steps directly in code comments is right (think how the code would look if we
> started writing bug steps for each bug fix).  So I have modified the comment
> to explain the situation and the reason for the check; please see whether
> that looks okay to you.

Thank you, I'm happy with your comment.

Regards
Takayuki Tsunakawa





Re: [HACKERS] Re: [bug fix] Cascading standby cannot catch up and get stuck emitting the same message repeatedly

2016-11-23 Thread Amit Kapila
On Tue, Nov 22, 2016 at 8:48 AM, Tsunakawa, Takayuki wrote:
> From: pgsql-hackers-ow...@postgresql.org
>
> If the problem occurs, the following pair of lines appear in the server log 
> of the cascading standby.  Could you check it?
>
> LOG:  restored log file "00020003" from archive
> LOG:  out-of-sequence timeline ID 1 (after 2) in log file 0, segment 3, 
> offset 0
>

Thanks for the clarification.  I could reproduce the issue and confirm
that the patch fixes it.  Please find attached the cascading standby
logs for PG 9.2 Head and Patch (I have truncated a few lines at the
end of the server log generated on Head, as they were repetitive).  I
don't think explaining the bug steps directly in code comments is
right (think how the code would look if we started writing bug steps
for each bug fix).  So I have modified the comment to explain the
situation and the reason for the check; please see whether that looks
okay to you.


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


postgresql_Thu_Patch.log
Description: Binary data


postgresql_Thu_Head.log
Description: Binary data


cascading_standby_stuck_v3.patch
Description: Binary data



Re: [HACKERS] Re: [bug fix] Cascading standby cannot catch up and get stuck emitting the same message repeatedly

2016-11-21 Thread Tsunakawa, Takayuki
From: pgsql-hackers-ow...@postgresql.org
> [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Amit Kapila
> I have tried the attached script multiple times on the latest 9.2 code, but
> couldn't reproduce the issue.  Please find the log attached to this mail.
> Apart from the log file, the following output appears:
> 
> WARNING: enabling "trust" authentication for local connections
> You can change this by editing pg_hba.conf or using the option -A, or
> --auth-local and --auth-host, the next time you run initdb.
> 20075/20075 kB (100%), 1/1 tablespace
> NOTICE:  pg_stop_backup complete, all required WAL segments have been
> archived
> 20079/20079 kB (100%), 1/1 tablespace
> 
> Let me know if some parameters need to be tweaked to reproduce the issue.
> 
> 
> The proposed patch seems good, but it would be better if somebody other than
> you could reproduce the issue and verify that the patch fixes it.
> 

Thank you for reviewing the code and testing.  Hmm, we could reproduce the 
problem on PostgreSQL 9.2.19.  The script's stdout is attached as test.log, and 
the stderr is as follows:

WARNING: enabling "trust" authentication for local connections
You can change this by editing pg_hba.conf or using the option -A, or
--auth-local and --auth-host, the next time you run initdb.
20099/20099 kB (100%), 1/1 tablespace
NOTICE:  pg_stop_backup complete, all required WAL segments have been archived
20103/20103 kB (100%), 1/1 tablespace

The sizes that pg_basebackup reports are a bit different from yours.  I don't 
see a reason for this.  The test script explicitly specifies the database 
encoding and locale, so an encoding difference doesn't seem to be the cause.  
The target problem occurs only when a WAL record crosses a WAL segment boundary, 
so a subtle change in WAL record volume would prevent the problem from happening.
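
As a rough illustration of why a small change in WAL volume matters, here is a 
minimal Python sketch (not taken from the PostgreSQL source or from the patch) 
that checks whether a record of a given length would spill over a segment 
boundary.  The function name, the flat byte-position model, and the 16 MB 
segment size are assumptions made for the example; real WAL positions also 
account for page headers, which this sketch ignores.

```python
# Assumption: 16 MB, the default WAL segment size; not stated in this thread.
WAL_SEGMENT_SIZE = 16 * 1024 * 1024


def crosses_segment_boundary(start: int, length: int,
                             segment_size: int = WAL_SEGMENT_SIZE) -> bool:
    """Return True if a record of `length` bytes that starts at byte
    position `start` ends in a later segment than the one it starts in.

    Hypothetical helper for illustration only: positions are treated as a
    flat byte offset and page headers are not modelled.
    """
    if length <= 0:
        raise ValueError("length must be positive")
    first_segment = start // segment_size
    last_segment = (start + length - 1) // segment_size
    return first_segment != last_segment
```

With this model, a 100-byte record starting 10 bytes before the end of a 
segment crosses the boundary, while the same record starting 100 bytes before 
the end does not, so a few bytes more or less of WAL can decide whether the 
test hits the problem.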

Anyway, could you retry with the attached test.sh?  It just changes 
restore_command.

If the problem occurs, the following pair of lines appear in the server log of 
the cascading standby.  Could you check it?

LOG:  restored log file "00020003" from archive
LOG:  out-of-sequence timeline ID 1 (after 2) in log file 0, segment 3, offset 0
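
To check a long server log for that pair, something like the following Python 
sketch might help.  The function name and the regular expressions are 
illustrative assumptions based only on the two messages shown above; adjust 
them if your log_line_prefix or message wording differs.

```python
import re

# Patterns derived from the two log lines quoted in this mail (assumption:
# the wording in your server log is the same).
RESTORED = re.compile(r'LOG:\s+restored log file "([^"]+)" from archive')
OUT_OF_SEQ = re.compile(
    r'LOG:\s+out-of-sequence timeline ID (\d+) \(after (\d+)\)')


def find_stuck_pairs(lines):
    """Return (file_name, timeline_id, previous_timeline_id) for every
    'restored log file' line that is immediately followed by an
    'out-of-sequence timeline ID' line."""
    pairs = []
    previous_restored = None
    for line in lines:
        restored = RESTORED.search(line)
        if restored:
            previous_restored = restored.group(1)
            continue
        out_of_seq = OUT_OF_SEQ.search(line)
        if out_of_seq and previous_restored is not None:
            pairs.append((previous_restored,
                          int(out_of_seq.group(1)),
                          int(out_of_seq.group(2))))
        # Any other line breaks the pair.
        previous_restored = None
    return pairs
```

For example, find_stuck_pairs(open("postgresql.log")) would list each 
occurrence; the file name here is only a placeholder for your own server log. 
Many repeated entries for the same file would match the symptom in the subject 
line.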

Regards
Takayuki Tsunakawa




test.sh
Description: test.sh


test.log
Description: test.log



Re: [HACKERS] Re: [bug fix] Cascading standby cannot catch up and get stuck emitting the same message repeatedly

2016-11-18 Thread Amit Kapila
On Mon, Sep 5, 2016 at 8:42 AM, Tsunakawa, Takayuki wrote:
>
> From: pgsql-hackers-ow...@postgresql.org
> > [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Tsunakawa,
> > Our customer hit a problem of cascading replication, and we found the cause.
> > They are using the latest PostgreSQL 9.2.18.  The bug seems to have been
> > fixed in 9.4 and higher during the big modification of xlog.c, but it's
> > not reflected in older releases.
> >
> > The attached patch is for 9.2.18.  This just borrows the idea from 9.4 and
> > higher.
> >
> > But we haven't been able to reproduce the problem.  Could you review the
> > patch and help to test it?  I would very much appreciate it if you could
> > figure out how to reproduce the problem easily.
>
> We could successfully reproduce the problem and confirm that the patch fixes 
> it.  Please use the attached script to reproduce the problem.
>

I have tried the attached script multiple times on the latest 9.2 code,
but couldn't reproduce the issue.  Please find the log attached to
this mail.  Apart from the log file, the following output appears:

WARNING: enabling "trust" authentication for local connections
You can change this by editing pg_hba.conf or using the option -A, or
--auth-local and --auth-host, the next time you run initdb.
20075/20075 kB (100%), 1/1 tablespace
NOTICE:  pg_stop_backup complete, all required WAL segments have been archived
20079/20079 kB (100%), 1/1 tablespace

Let me know if some parameters need to be tweaked to reproduce the issue.


The proposed patch seems good, but it would be better if somebody
other than you could reproduce the issue and verify that the patch
fixes it.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


test.log
Description: Binary data



[HACKERS] Re: [bug fix] Cascading standby cannot catch up and get stuck emitting the same message repeatedly

2016-09-04 Thread Tsunakawa, Takayuki
From: pgsql-hackers-ow...@postgresql.org
> [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Tsunakawa,
> Our customer hit a problem of cascading replication, and we found the cause.
> They are using the latest PostgreSQL 9.2.18.  The bug seems to have been
> fixed in 9.4 and higher during the big modification of xlog.c, but it's
> not reflected in older releases.
> 
> The attached patch is for 9.2.18.  This just borrows the idea from 9.4 and
> higher.
> 
> But we haven't been able to reproduce the problem.  Could you review the
> patch and help to test it?  I would very much appreciate it if you could
> figure out how to reproduce the problem easily.

We could successfully reproduce the problem and confirm that the patch fixes 
it.  Please use the attached script to reproduce the problem.  Place it in an 
empty directory and just run "./test.sh" with no arguments.  It creates three 
database clusters (primary, standby, and cascading standby) in the current 
directory.

Could you review the patch and commit it for the next release?  If you think I 
should register the patch with the CommitFest even though the problem occurs 
only in 9.2 and 9.3, please say so.  I'll do so if there's no comment.

Regards
Takayuki Tsunakawa




test.sh
Description: test.sh
