On Tue, Jul 30, 2024 at 9:25 AM Amit Kapila <amit.kapil...@gmail.com> wrote:
>
> On Tue, Jul 30, 2024 at 1:48 AM Tom Lane <t...@sss.pgh.pa.us> wrote:
> >
> > Robert Haas <robertmh...@gmail.com> writes:
> > > On Sun, Jun 30, 2024 at 2:40 PM Tom Lane <t...@sss.pgh.pa.us> wrote:
> > >> ... However, I added a new open item about how the
> > >> 040_pg_createsubscriber.pl test is slow and still unstable.
> > >
> > > But that said, I see no commits in the commit history which purport to
> > > improve performance, so I guess the performance is probably still not
> > > what you want, though I am not clear on the details.
> >
> > My concern is described at [1]:
> >
> > >> I have a different but possibly-related complaint: why is
> > >> 040_pg_createsubscriber.pl so miserably slow?  On my machine it
> > >> runs for a bit over 19 seconds, which seems completely out of line
> > >> (for comparison, 010_pg_basebackup.pl takes 6 seconds, and the
> > >> other test scripts in this directory take much less).  It looks
> > >> like most of the blame falls on this step:
> > >>
> > >> [12:47:22.292](14.534s) ok 28 - run pg_createsubscriber on node S
> > >>
> > >> AFAICS the amount of data being replicated is completely trivial,
> > >> so that it doesn't make any sense for this to take so long --- and
> > >> if it does, that suggests that this tool will be impossibly slow
> > >> for production use.  But I suspect there is a logic flaw causing
> > >> this.  Speculating wildly, perhaps that is related to the failure
> > >> Alexander spotted?
> >
> > The followup discussion in that thread made it sound like there's
> > some fairly fundamental deficiency in how wait_for_end_recovery()
> > detects end-of-recovery.  I'm not too conversant with the details
> > though, and it's possible that pg_createsubscriber is just falling
> > foul of a pre-existing infelicity.
> >
> > If the problem can be correctly described as "pg_createsubscriber
> > takes 10 seconds or so to detect end-of-stream",
> >
>
> The problem can be defined as: "pg_createsubscriber waits for an
> additional (new) WAL record to be generated on primary before it
> considers the standby is ready for becoming a subscriber". Now, on
> busy systems, this shouldn't be a problem but for idle systems, the
> time to detect end-of-stream can't be easily defined.
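
To make the idle-system case concrete, here is a minimal sketch of the
wait as a toy model. It is not pg_createsubscriber code. It assumes
that the only WAL generated on an otherwise idle primary is one
periodic record every `interval` seconds (15 seconds is the figure
suggested in the reply below), and that the tool has to see the next
such record after it starts waiting. The function name and the
fixed-interval model are hypothetical.

```python
def wait_until_next_record(wait_start: float, interval: float = 15.0) -> float:
    """Seconds until the next periodic WAL record, given when waiting began.

    Hypothetical model: the idle primary emits one record at
    t = 0, interval, 2 * interval, ... and nothing else generates WAL.
    """
    if interval <= 0:
        raise ValueError("interval must be positive")
    elapsed = wait_start % interval
    # A wait that begins exactly on a tick still needs the *next* record,
    # so the result is always in the range (0, interval].
    return interval - elapsed


if __name__ == "__main__":
    for start in (0.0, 0.5, 7.5, 14.9):
        waited = wait_until_next_record(start)
        print(f"wait starts at {start:5.1f}s -> waits {waited:5.2f}s")
```

Under these assumptions the wait is bounded by the interval but can be
almost as long as it, depending on where in the cycle the wait begins.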
AFAIU, the server emits a running-transactions WAL record at least once
every 15 seconds, so the subscriber should not have to wait longer than
15 seconds. I understand that this would be a problem for tests, but
will it be a problem for end users? Sorry for the repetition if this has
already been discussed.

--
Best Wishes,
Ashutosh Bapat