Re: [Slony1-general] Uninterrupted Slony Replication

Steve Singer Mon, 08 Aug 2011 17:12:44 -0700

On Mon, 8 Aug 2011, Dilraj Singh wrote:

Hi,


Yup, it works for 2.0.7. Thanks.

But I tried version 2.0.4 as well, and it still gives the same errors. We are
somewhat inclined to use version 2.0.4, as it is the current version available
with apt-get on Debian and hence can be updated easily using apt-get. So is
there any way I can make this work in version 2.0.4 too?

Also, I noticed that after rebooting the machine, it does not work even when I
kill the slon process started on reboot and manually start the slon process
like ./slon conninfo=.

Once your network and postgresql instances are up you should just be able to restart all of your slon processes and replication should resume. With 2.0.4, slon should recover from the dropped connections when it restarts.

How are you starting slon? Are you using a slon.conf file or passing the conninfo on the command line? (You need to be doing one of the two.)
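
As a rough illustration of the two ways of starting slon mentioned above, here is a minimal sketch. The cluster name is taken from the schema name in the quoted log; the database name, host, user, and file path are hypothetical placeholders, not values from this thread. The script only writes a config file and prints the two start commands; it does not launch slon itself.

```shell
#!/bin/sh
# Sketch only. CLUSTER comes from the "_four_node_rep_cluster20" schema in the
# quoted log; CONNINFO and CONF are hypothetical placeholders for illustration.
CLUSTER=four_node_rep_cluster20
CONNINFO="dbname=mydb host=localhost user=slony"
CONF=./slon.conf

# Option 1: put the cluster name and this node's conninfo in a slon.conf file.
printf "cluster_name='%s'\nconn_info='%s'\n" "$CLUSTER" "$CONNINFO" > "$CONF"
CMD_WITH_FILE="slon -f $CONF"

# Option 2: pass the cluster name and conninfo directly on the command line.
CMD_WITH_ARGS="slon $CLUSTER \"$CONNINFO\""

# Print both commands so they can be checked before running either one.
echo "$CMD_WITH_FILE"
echo "$CMD_WITH_ARGS"
```

Either form should give slon the same two pieces of information; which one to use is mostly a matter of how the init script on the node is set up.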


Steve


Regards
Dilraj Singh

On Sat, Aug 6, 2011 at 8:42 AM, Steve Singer <[email protected]>
wrote:
      On Fri, 5 Aug 2011, Dilraj Singh wrote:

            Hi,

            I am using postgresql-8.4 and slony1-1.2.0.3 and I
            have been able to implement
            a 4 node replication cluster where nodes communicate
            successfully with each


Try upgrading to 2.0.7 and see if it fixes your problem.

1) 2.0.3 has a bug (unrelated to your current issue) that isn't
present in 2.0.2 or 2.0.4 so that release should be avoided

2) 2.0.7 has some fixes related to recovering from dropped connections
that might fix your issue. The error you paste below looks familiar.

<snip>


      2011-08-05 09:25:40 PDTERROR  remoteListenThread_3: "select con_origin,
      con_received, max(con_seqno) as con_seqno, max(con_timestamp) as
      con_timestamp from "_four_node_rep_cluster20".sl_confirm where
      con_received <> 2 group by con_origin, con_received"
      2011-08-05 09:25:42 PDTERROR  remoteListenThread_3: "select ev_origin,
      ev_seqno, ev_timestamp, ev_snapshot,
      "pg_catalog".txid_snapshot_xmin(ev_snapshot),
      "pg_catalog".txid_snapshot_xmax(ev_snapshot), ev_type, ev_data1,
      ev_data2, ev_data3, ev_data4, ev_data5, ev_data6, ev_data7, ev_data8
      from "_four_node_rep_cluster20".sl_event e where (e.ev_origin = '3' and
      e.ev_seqno > '5000000005') or (e.ev_origin = '4' and e.ev_seqno >
      '5000000039') order by e.ev_origin, e.ev_seqno limit 40" - no
      connection to the server

      and then the replication won't start working again until I
      reboot all the nodes. I am guessing that the provider node gets
      reinitialized on rebooting, and that's why the replication
      starts again. I know Slony is used for automated database
      replication, so I was wondering whether there is any way I can
      make this work without rebooting all the nodes, which will be
      inconvenient if the number of nodes increases or on a
      production server.

      Any inputs on the above error will be greatly appreciated.

      Regards
      Dilraj Singh

_______________________________________________
Slony1-general mailing list
[email protected]
http://lists.slony.info/mailman/listinfo/slony1-general
