Hello,

Thank you, Christopher, for the very professional answer. I believe it was the right solution, but it came too late: I had already recreated the node, and now it is working.
Why does slon try to get through all the SYNCs within one timeout? Wouldn't it be better to process them one by one?

Thanks for the help,
Lukas

> Lukas wrote:
>> Hello,
>>
>> no, there were no schema changes at all. The Postgres logfile does not show
>> anything interesting; on 01-24 it looks like there was a power failure, but
>> postgres started successfully after that..
>>
>>
>> One more thing, from time to time I am getting from slon:
>> ERROR remoteListenThread_1: timeout for event selection
>> that is the only error, no more errors at all..
>> I am using slon version 1.2.0
>>
>> Any ideas?
>>
>>
> Yes, that was exactly the error message I was expecting...
>
> The problem here is that the node has been disconnected for WAY WAY too
> long.
>
> When the slon managing that node connects, it tries reading thru
> sl_event to determine the list of relevant events that need to be applied.
>
> After 9-odd days, this has evidently grown to 350K events, and this
> evidently takes more than the 300 seconds that the code in
> src/slon/remote_listener.c allows for.
>
> You could, in principle, alter src/slon/remote_listener.c to change this
> time:
>
> At about line 830:
>     time(&timeout);
>     timeout += 300;
>
> You might change 300, which is 5 minutes, to something higher; 30000
> would doubtless be enough time to let the slon get through the query on
> sl_event.
>
> If that did work out, you'd want to set sync grouping (-g) to some
> Relatively Large Number; 10000 would probably be good...
>
> Alternatively, you'll need to treat the node as failed, and
> drop/recreate it.
>
> In future, you need to have some sort of monitoring in place so that it
> doesn't take a week to notice that the node isn't working.
>
>>> Lukas wrote:
>>>
>>>> and nothing is changing..
>>>> Table sl_event has 350 000 records, sl_log_1 has 3800 records, sl_log_2
>>>> has 900 and sl_seqlog has 70000 records on master side..
>>>>
>>>> Where can be the problem? What we can do?
>>>>
>>> Have you checked the slon logs for all nodes as well as the PostgreSQL
>>> logs for all nodes?
>>>
>>> Using the information available to determine when this started, did you
>>> make any data model or schema changes about that time?
>>>
>
> --
> This message has been scanned for viruses and
> dangerous content, and is believed to be clean.

_______________________________________________
Slony1-general mailing list
[email protected]
http://gborg.postgresql.org/mailman/listinfo/slony1-general
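
For reference, the timeout discussed in this thread can be pictured with a small C sketch. Only the two lines `time(&timeout); timeout += 300;` come from the thread; the helper names `make_deadline` and `deadline_expired` are hypothetical and are not taken from src/slon/remote_listener.c, and the exact comparison slon uses is an assumption here. The sketch only shows why a scan of sl_event that needs more than 300 seconds overruns the default window, while a larger window such as 30000 seconds would leave room for it.

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/*
 * Hypothetical helper (not slon code): turn "now" plus a window in
 * seconds into an absolute deadline, in the spirit of the quoted
 * "time(&timeout); timeout += 300;".
 */
time_t make_deadline(time_t now, long window_seconds)
{
    assert(window_seconds > 0);   /* a zero or negative window makes no sense */
    return now + (time_t) window_seconds;
}

/*
 * Hypothetical helper (not slon code): report whether the deadline has
 * passed. Treating the deadline second itself as still valid is an
 * assumption of this sketch.
 */
bool deadline_expired(time_t now, time_t deadline)
{
    return now > deadline;
}
```

With these helpers, a query that started at time t and is still running at t + 400 has expired against `make_deadline(t, 300)` but not against `make_deadline(t, 30000)`.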
