> On 11 Aug 2016, at 17:43, Petr Jelinek <p...@2ndquadrant.com> wrote:
> 
>> 
>> * Also I wasn’t able actually to run replication itself =) While regression 
>> tests passes, TAP
>> tests and manual run stuck. pg_subscription_rel.substate never becomes ‘r’. 
>> I’ll investigate
>> that more and write again.
> 
> Interesting, please keep me posted. It's possible for tables to stay in 's' 
> state for some time if there is nothing happening on the server, but that 
> should not mean anything is stuck.

Slightly played around, it seems that apply worker waits forever for substate 
change.

(lldb) bt
* thread #1: tid = 0x183e00, 0x00007fff88c7f2a2 libsystem_kernel.dylib`poll + 
10, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
    frame #0: 0x00007fff88c7f2a2 libsystem_kernel.dylib`poll + 10
    frame #1: 0x00000001017ca8a3 
postgres`WaitEventSetWaitBlock(set=0x00007fd2dc816b30, cur_timeout=10000, 
occurred_events=0x00007fff5e7f67d8, nevents=1) + 51 at latch.c:1108
    frame #2: 0x00000001017ca438 
postgres`WaitEventSetWait(set=0x00007fd2dc816b30, timeout=10000, 
occurred_events=0x00007fff5e7f67d8, nevents=1) + 248 at latch.c:941
    frame #3: 0x00000001017c9fde 
postgres`WaitLatchOrSocket(latch=0x000000010ab208a4, wakeEvents=25, sock=-1, 
timeout=10000) + 254 at latch.c:347
    frame #4: 0x00000001017c9eda postgres`WaitLatch(latch=0x000000010ab208a4, 
wakeEvents=25, timeout=10000) + 42 at latch.c:302
  * frame #5: 0x0000000101793352 
postgres`wait_for_sync_status_change(tstate=0x0000000101e409b0) + 178 at 
tablesync.c:228
    frame #6: 0x0000000101792bbe 
postgres`process_syncing_tables_apply(slotname="subbi", 
end_lsn=140734778796592) + 430 at tablesync.c:436
    frame #7: 0x00000001017928c1 
postgres`process_syncing_tables(slotname="subbi", end_lsn=140734778796592) + 81 
at tablesync.c:518
    frame #8: 0x000000010177b620 
postgres`LogicalRepApplyLoop(last_received=140734778796592) + 704 at 
apply.c:1122
    frame #9: 0x000000010177bef4 postgres`ApplyWorkerMain(main_arg=0) + 1044 at 
apply.c:1353
    frame #10: 0x000000010174cb5a postgres`StartBackgroundWorker + 826 at 
bgworker.c:729
    frame #11: 0x0000000101762227 
postgres`do_start_bgworker(rw=0x00007fd2db700000) + 343 at postmaster.c:5553
    frame #12: 0x000000010175d42b postgres`maybe_start_bgworker + 427 at 
postmaster.c:5761
    frame #13: 0x000000010175bccf 
postgres`sigusr1_handler(postgres_signal_arg=30) + 383 at postmaster.c:4979
    frame #14: 0x00007fff9ab2352a libsystem_platform.dylib`_sigtramp + 26
    frame #15: 0x00007fff88c7e07b libsystem_kernel.dylib`__select + 11
    frame #16: 0x000000010175d5ac postgres`ServerLoop + 252 at postmaster.c:1665
    frame #17: 0x000000010175b2e0 postgres`PostmasterMain(argc=3, 
argv=0x00007fd2db403840) + 5968 at postmaster.c:1309
    frame #18: 0x000000010169507f postgres`main(argc=3, 
argv=0x00007fd2db403840) + 751 at main.c:228
    frame #19: 0x00007fff8d45c5ad libdyld.dylib`start + 1
(lldb) p state
(char) $1 = 'c'
(lldb) p tstate->state
(char) $2 = ‘c’

Also I’ve noted that some lsn position looks wrong on publisher:

postgres=# select restart_lsn, confirmed_flush_lsn from pg_replication_slots;
 restart_lsn | confirmed_flush_lsn 
-------------+---------------------
 0/1530EF8   | 7FFF/5E7F6A30
(1 row)

postgres=# select sent_location, write_location, flush_location, 
replay_location from pg_stat_replication;
 sent_location | write_location | flush_location | replay_location 
---------------+----------------+----------------+-----------------
 0/1530F30     | 7FFF/5E7F6A30  | 7FFF/5E7F6A30  | 7FFF/5E7F6A30
(1 row)



-- 
Stas Kelvich
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to