Hello 

I'm testing a setup with Pg 8.3.0 and Slony 1.2.13 on RHEL4.

I've set up three boxes running Slony. Everything works (replication, move set, 
drop node, store node, ...) until I try a failover. 

The script is:

#!/bin/sh
slonik <<_EOF_
cluster name = foo_repl;
node 1 admin conninfo = 'dbname=bar host=10.99.29.38 user=postgres';
node 2 admin conninfo = 'dbname=bar host=10.99.29.49 user=postgres';
node 3 admin conninfo = 'dbname=bar host=10.99.29.120 user=postgres';

failover (id = 2, backup node = 1);

_EOF_

And the output is:

[EMAIL PROTECTED] ~]# ./slonik_failover.sh
<stdin>:7: NOTICE:  failedNode: set 1 has other direct receivers - change 
providers only
<stdin>:7: PGRES_FATAL_ERROR select "_qsr_repl".failedNode(2, 1);  - ERROR:  
null value in column "li_provider" violates not-null constraint
CONTEXT:  SQL statement "INSERT INTO "_qsr_repl".sl_listen (li_origin, 
li_provider, li_receiver) select distinct set_origin, sub_provider,  $1  from 
"_qsr_repl".sl_set, "_qsr_repl".sl_subscribe where set_origin =  $2  and 
sub_set = set_id and sub_receiver =  $3  and sub_active"
PL/pgSQL function "rebuildlistenentries" line 75 at SQL statement
SQL statement "SELECT  "_qsr_repl".RebuildListenEntries()"
PL/pgSQL function "failednode" line 155 at PERFORM

I debugged this PL/pgSQL function and saw that Slony tries to use node 2 (the 
old master in the failover), which I think is not correct. Nothing is changed, 
and if I reactivate the old master, replication works again. 
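
One way to check which provider Slony has recorded for each subscription is to
list the rows that the failing INSERT selects from. This is only a sketch: the
helper name build_query is made up for illustration, the schema name _qsr_repl
is taken from the error output above, and the column names are the ones that
appear in the quoted SQL statement.

```sh
#!/bin/sh
# Hypothetical helper: print a read-only diagnostic query for one Slony schema.
# $1 is the schema name, i.e. "_" followed by the cluster name (assumption:
# _qsr_repl, as seen in the error output).
build_query() {
    cat <<_EOF_
select set_id, set_origin, sub_provider, sub_receiver, sub_active
  from "$1".sl_set, "$1".sl_subscribe
 where sub_set = set_id
 order by set_id, sub_receiver;
_EOF_
}

# Print the query; it changes nothing in the database.
build_query "${1:-_qsr_repl}"
```

The output can be piped to psql on each node, for example
`./listen_check.sh | psql -h 10.99.29.38 -U postgres bar` (the script file name
is hypothetical), to compare sub_provider across the three nodes.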

I tried a switchover with move set, which works, but a new failover from the 
new master fails with the same error. 
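
For reference, a controlled switchover with move set can be sketched as
follows. The exact commands used in the test are not shown in this post, so
this is an assumption: the helper name switchover_script is hypothetical, the
set id 1 comes from the NOTICE above, and the node roles mirror the failover
attempt (node 2 as old origin, node 1 as new origin).

```sh
#!/bin/sh
# Hypothetical helper: emit a slonik switchover script for one set.
# Arguments: $1 = set id, $2 = current origin node, $3 = new origin node.
switchover_script() {
    cat <<_EOF_
cluster name = foo_repl;
node 1 admin conninfo = 'dbname=bar host=10.99.29.38 user=postgres';
node 2 admin conninfo = 'dbname=bar host=10.99.29.49 user=postgres';
node 3 admin conninfo = 'dbname=bar host=10.99.29.120 user=postgres';

lock set (id = $1, origin = $2);
move set (id = $1, old origin = $2, new origin = $3);
_EOF_
}

# Only prints the script; nothing is sent to the cluster here.
switchover_script 1 2 1
```

Piping the output into slonik (`./switchover.sh | slonik`, file name
hypothetical) would run it. Unlike failover, move set needs the old origin to
be reachable.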

I am unable to reproduce this bug on Debian boxes with the same versions of 
Slony and PostgreSQL: it works there!

WTF ?

--
Sébastien 

_______________________________________________
Slony1-general mailing list
[email protected]
http://lists.slony.info/mailman/listinfo/slony1-general
