I tested this again with pgpool_walrecrunning() installed in every database, and with a valid health_check_user setting (didn't have that the first time around either), and now it works as expected. Thanks everybody!
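For reference, a minimal sketch of that setup, assuming the SQL script that ships in the pgpool-II source tree under sql/pgpool-walrecrunning/ and a superuser named postgres; paths and names will differ per installation:

    # Install pgpool_walrecrunning() into every connectable database.
    # The shared library built in the same source directory must already be
    # installed into PostgreSQL (e.g. via "make install" there).
    for db in $(psql -U postgres -At -c "SELECT datname FROM pg_database WHERE datallowconn"); do
        psql -U postgres -f pgpool-walrecrunning.sql "$db"
    done

    # pgpool.conf -- health_check_user must be a role that can really
    # connect to the backends:
    health_check_user = 'postgres'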
-- Matt

On Jun 17, 2011, at 12:14 PM, Matt Solnit wrote:

It's possible that I did not have pgpool_walrecrunning() installed in every database. I am trying again now to make certain my setup is correct.

-- Matt

On Jun 17, 2011, at 12:10 PM, [email protected] wrote:

According to Matt, he is using pgpool-II 3.0.4 built from source. I have not tried either.

From: [email protected] On Behalf Of Anton Koldaev
Sent: Friday, June 17, 2011 2:39 PM
To: Matt Solnit
Cc: [email protected]
Subject: Re: [Pgpool-general] Can a failed master rejoin as a slave?

Hmm... it seems to me your problem was resolved in 3.0.3:

3.0.3 (umiyameboshi) 2011/02/23

* Version 3.0.3

This version fixes various bugs since 3.0.1. Please note that 3.0.2 was canceled due to a packaging problem.

- Fix online recovery problem in the streaming replication mode (Tatsuo). Consider the following scenario. Suppose node 0 is the initial primary server and node 1 is the initial standby server.
  1) Node 0 goes down and node 1 is promoted to the new primary.
  2) Node 0 is recovered as the new standby.
  3) pgpool-II assumes that node 0 is the new primary.
  This problem happens because pgpool-II unconditionally regarded the youngest node as the primary. pgpool-II 3.0.3 now checks each node with pgpool_walrecrunning() to see whether it is actually the primary, so it avoids the problem and correctly regards the node as a standby. A new variable "%P" can also be used in the recovery script.

If you do not install the function, the above problem is not resolved.

On Fri, Jun 17, 2011 at 8:02 PM, Matt Solnit <[email protected]> wrote:

On Jun 17, 2011, at 8:17 AM, [email protected] wrote:

Hi, Matt

> pgpool-II immediately attempts to use it as a master again. This doesn't work, obviously, because it's no longer a master.

I don't understand why it doesn't work. AFAIK the node with the youngest ID (backendX in pgpool.conf) and status 2 (psql -c 'show pool_nodes;') will always become the primary node. Check this out:

The backend which was given the DB node ID of 0 will be called "Master DB". When multiple backends are defined, the service can be continued even if the Master DB is down (not true in some modes). In this case, the youngest DB node ID alive will be the new Master DB.

http://pgpool.projects.postgresql.org/pgpool-II/doc/pgpool-en.html

The problem Matt points out is precisely when the primary DB *is re-attached*. After re-attaching the primary DB (node ID 0), it is "back online", and therefore pgpool treats it as the master again, according to your cited explanation. So I agree with Matt: the just re-attached node 0 should be a slave from now on, since it was technically attached AFTER the new master (node 1 at this point) was selected.

-Daniel

Exactly. With streaming replication, only the "true" master can accept DML statements (insert/update/delete), so if pgpool-II attempts to send them to the wrong node, you get a "cannot execute XYZ in a read-only transaction" error. This thread seems to cover the same question, but I couldn't really tell what the resolution was:

http://lists.pgfoundry.org/pipermail/pgpool-general/2011-April/003568.html

-- Matt

--
Best regards,
Koldaev Anton
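As a footnote to the 3.0.3 behaviour quoted above, the primary/standby check can also be reproduced by hand once the function is installed; a minimal sketch with placeholder host names (node0, node1, pgpool_host):

    # pgpool_walrecrunning() should return true on a standby that is streaming
    # from the primary, and false on the real primary (hosts are placeholders).
    psql -h node0 -p 5432 -c "SELECT pgpool_walrecrunning();" postgres
    psql -h node1 -p 5432 -c "SELECT pgpool_walrecrunning();" postgres

    # pgpool's own view of the backends:
    psql -h pgpool_host -p 9999 -c "show pool_nodes;"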
