Re: [Slony1-general] Uninterrupted Slony Replication

Steve Singer Tue, 09 Aug 2011 04:32:14 -0700

On Mon, 8 Aug 2011, Dilraj Singh wrote:

Hi Steve,
Yeah, I am sorry, but I missed the cluster name definition while writing the
mail. Thanks for pointing that out. It will definitely not work unless I
define the cluster name to be the same as the one in the cluster setup and
subscription scripts. The exact initialization script is:
 
#!/bin/sh
bash -u postgres -c '/usr/lib/postgresql/8.4/bin/pg_ctl start
-D /var/lib/postgresql/8.4/main" '
bash -u postgres -c '/usr/bin/slon four_node_replication_cluster20
"dbname=testdb user=postgres" '


As I said in my earlier mail, I even manually started the slon processes on
the rebooted machine, but even then replication does not start.

You need to restart the slon processes on all the *other* machines, not the
rebooted one. The slon process on the rebooted one gets restarted by the act
of rebooting.


Steve

Regards
Dilraj Singh


On Mon, Aug 8, 2011 at 8:33 PM, Steve Singer <[email protected]>
wrote:
      On Mon, 8 Aug 2011, Dilraj Singh wrote:

      Hi Steve,
      I have placed a script in the /etc/init.d folder of my Debian
      machine, which has the following commands:


To restart slon manually after your rebooted node is back up, try

slon four_node_rep_cluster20 'dbname=testdb user=postgres'

on all the other nodes.
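
If you would rather not log in to each machine by hand, the restart above could be scripted roughly as follows. This is only a sketch under assumptions: the host names node2, node3 and node4 are hypothetical placeholders for the nodes that were not rebooted, ssh access to them is assumed, and pkill is assumed to be available there. It prints the commands by default and only runs them when RUN is set to an empty value.

```shell
#!/bin/sh
# Hypothetical host names for the nodes that were NOT rebooted.
OTHER_NODES="${OTHER_NODES:-node2 node3 node4}"
CLUSTER="four_node_rep_cluster20"
CONNINFO="dbname=testdb user=postgres"
# Dry run by default: the commands are only printed.
# Call the script with RUN= (empty) to actually execute them.
RUN="${RUN-echo}"

restart_remote_slons() {
    for host in $OTHER_NODES; do
        # Stop any slon still running on that node, then start a fresh one
        # with the cluster name first and the conninfo second.
        $RUN ssh "$host" "pkill -x slon; nohup slon $CLUSTER '$CONNINFO' >/dev/null 2>&1 &"
    done
}

restart_remote_slons
```

Check the printed commands first, and run them as the same user that normally owns the slon processes on those nodes.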


      #!/bin/sh
      bash -u postgres -c '/usr/lib/postgresql/8.4/bin/pg_ctl
      start -D
      /var/lib/postgresql/8.4/main" '
      bash -u postgres -c '/usr/bin/slon conninfo=
      "dbname=testdb user=postgres" '


What the above line does is start slon with a cluster name of
'conninfo=', because slon takes the cluster name as its first argument
and the conninfo string as its second. In your previous email you pasted
output indicating that your cluster name is 'four_node_rep_cluster20'.

I suspect that the slon started by your init script isn't actually
the slon instance doing the work, and that you have something somewhere
else that is starting up slon with the cluster name
'four_node_rep_cluster20'. I suspect that other slon instance recovers
properly from the reboot of the remote node (since 2.0.7 tends to
recover properly), while with 2.0.4 you need to manually restart the
remote slons.





      I have configured this script on each of the 4 machines to run at
      reboot time, which will start the database and then run the slon
      process. I am passing conninfo on the command line itself, and
      before doing the reboot I have also run the cluster_setup and
      subscriptions for the four nodes. So replication is already going
      on when I reboot one of the machines.

      As you pointed out, this whole procedure works fine in 2.0.7 but
      fails in version 2.0.4. Also, comparing the output of the
      "ps aux | grep postgres" command with the connection broken and
      not broken, I can see entries for the processes related to the
      databases of the other machines (which are connected to it as
      described in the subscription script) in the not-broken case,
      whereas the broken case (after reboot) has only local database
      entries in the command output.

      Thanks for helping me out. :)

      Regards
      Dilraj Singh

      On Mon, Aug 8, 2011 at 5:08 PM, Steve Singer <[email protected]>
      wrote:
           Once your network and postgresql instances are up you should
           just be able to restart all of your slon processes and
           replication should resume (with 2.0.4) it should recover from
           the dropped connections when slon is restarting.

      How are you starting slon?  Are you using a slon.conf file or
      passing the conninfo on the command line? (you need to be doing
      one of the two).
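
For the slon.conf route, a minimal sketch might look like the following. The file location ./slon.conf is a hypothetical choice, and the values are the cluster name and conninfo quoted elsewhere in this thread; cluster_name and conn_info are the slon runtime settings that correspond to the two command-line arguments.

```shell
#!/bin/sh
# Hypothetical location; adjust to wherever your packaging expects the file.
SLON_CONF="${SLON_CONF:-./slon.conf}"

cat > "$SLON_CONF" <<'EOF'
# Must match the cluster name used in the setup and subscription scripts.
cluster_name='four_node_rep_cluster20'
# Connection string for this node's local database.
conn_info='dbname=testdb user=postgres'
EOF

# slon can then be started from the file instead of positional arguments:
#   slon -f "$SLON_CONF"
```

Keeping the cluster name in one file makes it harder for an init script to end up with a different name than the one used when the cluster was set up.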

      Steve

_______________________________________________
Slony1-general mailing list
[email protected]
http://lists.slony.info/mailman/listinfo/slony1-general
