Hi,

I found an annoying problem with the PCP command pcp_detach_node. I have
3 computers, each running a PostgreSQL instance, in a streaming
replication setup. pgpool-II is running on the first node, which is the
master. The problem occurs when you pass a node id outside the range of
existing nodes.

As explained above, I have only 3 nodes, so valid node ids go from 0 to
2. If I use node id 3, which doesn't exist, here are the results:

/usr/bin/pcp_detach_node -d 10 192.168.1.11 9898 postgres postgres 3

DEBUG: send: tos="R", len=46
DEBUG: recv: tos="r", len=21, data=AuthenticationOK
DEBUG: send: tos="D", len=6
DEBUG: recv: tos="d", len=20, data=CommandComplete
DEBUG: send: tos="X", len=4
------------- log file ----------------
LOG: notice_backend_error: node 0 is not valid backend.
LOG: starting degeneration. shutdown host 192.168.1.13(5432)
LOG: execute command: /home/postgres/bin/failover.sh 2 192.168.1.13 192.168.1.11 /home/postgres/data/postgres.trigger
LOG: failover_handler: set new master node: 0
LOG: failover done. shutdown host 192.168.1.13(5432)
LOG: find_primary_node: primary node id is 0
 
[postg...@vm1 ~]$ psql -p 9999 -c "SHOW pool_nodes;"
 node_id |   hostname   | port | status | lb_weight | state
---------+--------------+------+--------+-----------+-------
 0       | 192.168.1.11 | 5432 | 2      | 0.333333  | P
 1       | 192.168.1.12 | 5432 | 2      | 0.333333  | S
 2       | 192.168.1.13 | 5432 | 3      | 0.333333  | S
(3 rows)

As you can see, node 2 has been detached instead of the command aborting
and displaying an error. I have also seen node 0 get detached in the
same situation, which is worse.

I've attached a patch that produces the following result instead:

/usr/bin/pcp_detach_node -d 10 192.168.1.11 9898 postgres postgres 3
DEBUG: send: tos="R", len=46
DEBUG: recv: tos="r", len=21, data=AuthenticationOK
DEBUG: send: tos="D", len=6
EOFError
DEBUG: send: tos="X", len=4
------------- log file ----------------
LOG: pcp_child: node id 3 is not valid
LOG: PCP child 32232 exits with status 256
LOG: fork a new PCP child pid 32299
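
For illustration, here is a minimal standalone sketch of the kind of
range check involved. It is not the patch itself: parse_node_id() is a
hypothetical helper that does not exist in pgpool-II, and num_backends
stands in for pool_config->backend_desc->num_backends. Unlike a bare
atoi() call, it also rejects negative ids and non-numeric input.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/*
 * Hypothetical helper, not part of pgpool-II: parse a node id received
 * from a PCP client and check it against the number of configured
 * backends. Returns true and stores the id in *node_id only when the
 * text is a decimal integer in the range 0 .. num_backends - 1.
 */
static bool parse_node_id(const char *buf, int num_backends, int *node_id)
{
	char *end;
	long value;

	if (buf == NULL || *buf == '\0')
		return false;

	errno = 0;
	value = strtol(buf, &end, 10);

	/* Reject overflow and trailing garbage, which atoi() cannot report. */
	if (errno == ERANGE || *end != '\0')
		return false;

	/* Reject ids below 0 or at/after the backend count. */
	if (value < 0 || value >= num_backends)
		return false;

	*node_id = (int) value;
	return true;
}
```

With 3 backends, "0" through "2" are accepted, while "3", "-1" and "abc"
are refused before anything is detached.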


Regards,

-- 
Gilles Darold
http://dalibo.com - http://dalibo.org

--- pgpool-II/pcp_child.c	2010-08-06 01:37:43.000000000 +0200
+++ pgpool-II-test/pcp_child.c	2011-01-06 23:23:34.000000000 +0100
@@ -700,6 +700,11 @@
 					gracefully = true;
 
 				node_id = atoi(buf);
+				if ( (node_id < 0) || (node_id >= pool_config->backend_desc->num_backends) )
+				{
+					pool_error("pcp_child: node id %d is not valid", node_id);
+					exit(1);
+				}
 				pool_debug("pcp_child: detaching Node ID %d", node_id);
 				pool_detach_node(node_id, gracefully);
 
_______________________________________________
Pgpool-hackers mailing list
[email protected]
http://pgfoundry.org/mailman/listinfo/pgpool-hackers
