It's probably a firewall timing out your PostgreSQL connection while the indexes are being built on the replica.
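
Going by your log, the transfer finishes well before slony moves on, and if nothing crosses the wire in between, that is the window a stateful firewall would drop. Here is a rough sketch of the arithmetic, plus one way to append libpq's keepalive parameters to a conninfo string. The host, database and user names are made up, the keepalive values are only illustrative, and I'm assuming the conninfo reaches libpq unchanged:

```python
from datetime import datetime

# Timestamps from the first attempt in the quoted slon log.
copy_done = datetime(2014, 2, 15, 16, 46, 45)   # "bytes copied" line
table_done = datetime(2014, 2, 15, 17, 34, 10)  # "seconds to copy table" line

# Time spent after the data transfer finished. The hypothesis is that the
# connection carries no traffic during this window.
idle_gap_seconds = (table_done - copy_done).total_seconds()


def with_keepalives(conninfo, idle=60, interval=10, count=6):
    """Append libpq TCP keepalive parameters to a conninfo string.

    The default values are illustrative, not recommendations.
    """
    return (
        f"{conninfo} keepalives=1 keepalives_idle={idle} "
        f"keepalives_interval={interval} keepalives_count={count}"
    )


# Hypothetical conninfo; substitute the real origin connection string.
base = "dbname=mydb host=origin.example.com user=slony"
print(f"gap after COPY: {idle_gap_seconds:.0f} s")
print(with_keepalives(base))
```

The server side has matching knobs in postgresql.conf (tcp_keepalives_idle, tcp_keepalives_interval, tcp_keepalives_count). Either way the idle value needs to be shorter than whatever timeout the firewall applies.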
Look into the TCP keepalive settings.

> On Feb 15, 2014, at 22:09, Tory M Blue <[email protected]> wrote:
>
> So I've been fighting with this for a few months. I had someone on slony Dev
> attempt to lend a hand but others in the group felt it was more of a
> postgres issue. While this may be true, I'm still looking for some
> assistance. Everything points to a disconnect in slony.
>
> Wide area replication fails on one of my largest tables. Now the table will
> copy over complete, no issues (using standard pgsql commands); it's the post
> processing after the data is copied that seems to cause a sig term or
> something on the connection, since slony states that the set failed and tries
> again, and fails at the same place:
>
> 2014-02-15 15:23:00 PST CONFIG remoteWorkerThread_1: Begin COPY of table "tracking"."spotlightimp"
> 2014-02-15 16:46:45 PST CONFIG remoteWorkerThread_1: 5643041332 bytes copied for table "tracking"."spotlightimp" <--- Completes transfer
> 2014-02-15 17:34:10 PST CONFIG remoteWorkerThread_1: 7870.124 seconds to copy table "tracking"."spotlightimp" <-- At this point it finishes the index creation and everything else
> 2014-02-15 17:34:10 PST CONFIG remoteWorkerThread_1: copy table "tracking"."adimp"
> 2014-02-15 17:34:10 PST CONFIG remoteWorkerThread_1: Begin COPY of table "tracking"."adimp"
> 2014-02-15 17:34:10 PST ERROR remoteWorkerThread_1: "select "_slonyschema".copyFields(19);" <--- FAILS but adimp table is there, this is a red herring. The issue is above!
> 2014-02-15 17:34:10 PST WARN remoteWorkerThread_1: data copy for set 2 failed 1 times - sleep 15 seconds
> NOTICE: Slony-I: Logswitch to sl_log_1 initiated
> CONTEXT: SQL statement "SELECT "_slonyschema".logswitch_start()"
> PL/pgSQL function _slonyschema.cleanupevent(interval) line 96 at PERFORM
> 2014-02-15 17:34:14 PST INFO cleanupThread: 7209.360 seconds for cleanupEvent()
>
> I've brought my work_mem to over 40GB and that's not helping the length of
> time for this large table. I have even removed the index statement; it still
> doesn't cut the time. The copy is fine, all the data comes over. It's
> something in the processing of the table. There is a disconnect at some point
> between when slony finishes up the copy of spotlightimp, Postgres
> processes the rules in the table, and slony starts on the next table.
>
> 2014-02-15 18:48:27 PST CONFIG remoteWorkerThread_1: copy table "tracking"."spotlightimp"
> 2014-02-15 18:48:27 PST CONFIG remoteWorkerThread_1: Begin COPY of table "tracking"."spotlightimp"
> 2014-02-15 20:11:07 PST CONFIG remoteWorkerThread_1: 5643067207 bytes copied for table "tracking"."spotlightimp"
> 2014-02-15 20:59:46 PST CONFIG remoteWorkerThread_1: 7878.124 seconds to copy table "tracking"."spotlightimp"
> 2014-02-15 20:59:46 PST CONFIG remoteWorkerThread_1: copy table "tracking"."adimp"
> 2014-02-15 20:59:46 PST CONFIG remoteWorkerThread_1: Begin COPY of table "tracking"."adimp"
> 2014-02-15 20:59:46 PST ERROR remoteWorkerThread_1: "select "_slonyschema".copyFields(19);"
> 2014-02-15 20:59:46 PST WARN remoteWorkerThread_1: data copy for set 2 failed 1 times - sleep 15 seconds
> NOTICE: Slony-I: log switch to sl_log_2 complete - truncate sl_log_1
> CONTEXT: PL/pgSQL function _slonyschema.cleanupevent(interval) line 94 at assignment
> 2014-02-15 20:59:50 PST INFO cleanupThread: 7203.435 seconds for cleanupEvent()
>
> I do feel incredibly strongly it's the size of the table and how long the
> process takes; the network / postgres is either reaping the connection or
> otherwise causing slony to be in an unknown state, which causes the error the
> minute we try to move forward from the spotlightimp table. If I could cut down
> the processing after the table is copied that might solve it, but removing the
> index part has not helped the situation as I hoped it would. This is a
> complicated table, as well as a large one.
>
> I would love to get this sorted out. slony should allow for this remote
> replication, but something is going wrong and man would I love to get this
> resolved!
>
> CentOS6.2
> Postgres 9.2.4 slony 2.1.3
>
> Thanks
> Tory
> _______________________________________________
> Slony1-general mailing list
> [email protected]
> http://lists.slony.info/mailman/listinfo/slony1-general
