>
> On Fri, May 20, 2011 at 3:42 AM, Eamon Roque <eamon.ro...@lex-com.net> wrote:
>
> > Hi,
> >
> > >> On Thu, May 19, 2011 at 5:05 AM, Eamon Roque <eamon.ro...@lex-com.net> wrote:
> > >
> > >> Hi,
> > >>
> > >> I've put together a cluster of two nodes running a databank without shared
> > >> storage. Both nodes replicate data between them, which is taken care of by
> > >> the databank itself.
> > >>
> > >> I have a resource for the databank and ip. I then created a stateful clone
> > >> from the databank resource. I created colocation rules joining the
> > >> databank-ms-clone and ip:
> > >>
> > >> node pgsqltest1
> > >> node pgsqltest2
> > >> primitive Postgres-IP ocf:heartbeat:IPaddr2 \
> > >>         params ip="10.19.57.234" cidr_netmask="32" \
> > >>         op monitor interval="30s" \
> > >>         meta is-managed="false"
> > >> primitive resPostgres ocf:heartbeat:pgsql \
> > >>         params pgctl="/opt/PostgreSQL/9.0/bin/pg_ctl" pgdata="/opt/PostgreSQL/9.0/data" psql="/opt/PostgreSQL/9.0/bin/psql" pgdba="postgres" \
> > >>         op monitor interval="1min" \
> > >>         meta is-managed="false"
> > >> ms msPostgres resPostgres \
> > >>         meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true" target-role="started"
> > >> colocation colPostgres inf: Postgres-IP msPostgres:Master
> > >> order ordPostgres inf: msPostgres:promote Postgres-IP:start
> > >> property $id="cib-bootstrap-options" \
> > >>         dc-version="1.1.2-2e096a41a5f9e184a1c1537c82c6da1093698eb5" \
> > >>         cluster-infrastructure="openais" \
> > >>         expected-quorum-votes="2" \
> > >>         stonith-enabled="false" \
> > >>         no-quorum-policy="ignore" \
> > >>         last-lrm-refresh="1302707146"
> > >> rsc_defaults $id="rsc-options" \
> > >>         resource-stickiness="200"
> > >> op_defaults $id="op_defaults-options" \
> > >>         record-pending="false"
> > >>
> > >> The normal postgres agent doesn't support this functionality, but I've put
> > >> together my own using the mysql agent as a model.
> > >> Before running the script through ocf-tester, I unmanage the postgres
> > >> resource.
> > >>
> >
> > > Could you show how you implemented promote/demote for pgsql?
> >
> > Sure, let's start with the ultra-simple "promote" function:
> >
> > #
> > # These variables are higher up in the file, but they will probably help
> > # with understanding the error of my ways.
> >
> > CRM_MASTER="${HA_SBIN_DIR}/crm_master"
> > ATTRD_UPDATER="${HA_SBIN_DIR}/attrd_updater"
> >
> > pgsql_promote() {
> >     local output
> >     local rc
> >     local CHECK_PG_SQL
> >     local COMPLETE_STANDBY_QUERY
> >     local PROMOTE_SCORE_HIGH
> >     local MOD_PSQL_M_FORMAT
> >
> >     PROMOTE_SCORE_HIGH=1000
> >     CHECK_PG_SQL="SELECT pg_is_in_recovery()"
> >     MOD_PSQL_M_FORMAT="$OCF_RESKEY_psql -Atc"
> >     COMPLETE_STANDBY_QUERY="$MOD_PSQL_M_FORMAT \"$CHECK_PG_SQL\""
> >
> >     output=$(su - $OCF_RESKEY_pgdba -c "$COMPLETE_STANDBY_QUERY" 2>&1)
> >     echo $output
> >
> >     rc=$?
> >
> >     case $output in
> >         f)
> >             ocf_log debug "PostgreSQL Node is running in Master mode..."
> >             return $OCF_RUNNING_MASTER
> >             ;;
> >
> >         t)
> >             ocf_log debug "PostgreSQL Node is in Hot_Standby mode..."
> >             return $OCF_SUCCESS
> >             ;;
> >
> >         *)
> >             ocf_log err "Critical error in $CHECK_PG_SQL: $output"
> >             return $OCF_ERR_GENERIC
> >             ;;
> >     esac
> >
> >     #
> >     # "Real" promotion is handled here.
> >     # The trigger file is created and we check for "recovery.conf" on the host.
> >     # If we can't find it, then the file will be copied from the HA-Config into
> >     # postgres' data folder.
> >     #
> >
> >     if ! touch $OCF_RESKEY_trigger_file; then
> >         ocf_log err "$OCF_RESKEY_trigger_file could not be created!"
> >         return $OCF_ERR_GENERIC
> >     fi
> >
> >     if [ ! -f $OCF_RESKEY_recovery_conf ]; then
> >         ocf_log err "$OCF_RESKEY_recovery_conf doesn't exist!"
> >         cp $OCF_RESKEY_recovery_conf_ersatz $OCF_RESKEY_pgdata
> >         return $OCF_SUCCESS
> >     fi
> >
>
> Why do you need this?
> As far as I know when you switch standby database to
> primary using trigger file recovery.conf gets renamed to recovery.done. If
> you rename it back DB will be put into standby mode after restart. We are
> talking about streaming replication, right?
>

Right. The order is wrong. According to the Binary Replication tutorial on the postgres wiki, when I perform a failover with a trigger file, it wants to find a "recovery.conf", which it then processes (checking the archive for missing updates etc.) and renames (after noticing the trigger file).
I assumed that this would work in exactly the same way with Streaming Replication. Am I wrong?

> >
> >     # If both file exist or can be created, then the failover fun can start.
> >
> >     ocf_log info "$OCF_RESKEY_trigger_file was created."
> >     ocf_log info "$OCF_RESKEY_recovery_conf exists and can be copied to the correct location."
> >
> >     # Sometimes, the master needs a bit of time to take the reins. So...
> >
> >     while :
> >     do
> >         pgsql_monitor warn
> >         rc=$?
> >
> >         if [ $rc -eq $OCF_RUNNING_MASTER ]; then
> >             break;
> >         fi
> >
> >         ocf_log debug "Postgres Server could not be promoted. Please wait..."
> >
> >         sleep 1
> >
> >     done
> >
> >     ocf_log info "Postgres Server has been promoted. Please check on the previous master."
> >
> >     #################################
> >     #Attributes Update: #
> >     #################################
> >
> >     $ATTRD_UPDATER -n $PGSQL_STATUS_NAME -v \"PRI\" || exit $(echo "Eh! Attrd_updater is not working!")
> >
> >     #############################################
> >     # Resource stickiness pumped up to 1000 : #
> >     #############################################
> >
> >     $CRM_MASTER -v $PROMOTE_WERT_HOCH || exit $(echo "crm_master could not change the Master's status!")
> >
> >     ############
> >     # Success! #
> >     ############
> >
> >     return $OCF_SUCCESS
> >
> > }
> >
> > ######################################################################################################
> >
> > Thanks!
> >
>
> And what about demote? Switching standby into primary using trigger files
> changes TIMELINE in the DB and that invalidates all other standby databases
> as well as previous master database. After that you have to restore them
> from a fresh backup made on new master. This particular behavior stopped me
> from implementing Master/Slave functionality in pgsql RA so far.
>
> BTW, why pgsql is set to is-managed="false" in your configuration. With this
> setting cluster will keep monitoring it but won't take any other actions
> AFAIK.

Demote? Well, seeing as neither promote nor demote actually worked for me, I thought I would start small.

As far as the trigger file switching goes, you're of course completely right. This behavior isn't really a big deal in my environment, as it's meant as more of test and we want to bring back the demoted servers up manually, but I can see that it would cause a lot of problems in a more complex environment. When I tested the failover functionality without pacemaker, I have to perform a fresh backup even if I waited less than 30s to bring the old master back up as a standby. I guess that with 9.1 this will be easier...

I unmanaged the resources so that my test agent would handle them. Is this incorrect?

> >
> > Éamon
> >
> > >> Unfortunately, promote/demote doesn't work. ocf-tester tries to use the
> > >> "crm_attribute -N pgsql1 -n master-pgrql-replication-agent -l reboot -v
> > >> 100", but the (unmanaged) resources don't accept the score change.
> > >>
> > >> I'm pretty sure that I just need to be hit with a clue stick and would be
> > >> grateful for any help.
> > >>
> > >> Thanks,
> > >>
> > >> Éamon
> > >>
> >
> > --
> > Serge Dubrouski.
> > _______________________________________________
> > Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> > http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> >
> > Project Home: http://www.clusterlabs.org
> > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> > Bugs:
> > http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker
> >
>
> --
> Serge Dubrouski.
> -------------- next part --------------
> An HTML attachment was scrubbed...
> URL: <http://oss.clusterlabs.org/pipermail/pacemaker/attachments/20110520/e1f26230/attachment.html>
>
> ------------------------------
>
> _______________________________________________
> Pacemaker mailing list
> Pacemaker@oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>
>
> End of Pacemaker Digest, Vol 42, Issue 53
> *****************************************
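
A note on the quoted pgsql_promote function, going only by the code as posted: every branch of its case statement returns, so the trigger-file and recovery.conf steps after "esac" are never reached; "rc=$?" is read after "echo $output", so it holds echo's status rather than psql's; "$PROMOTE_WERT_HOCH" is not defined in the quoted code (PROMOTE_SCORE_HIGH is); and "exit $(echo ...)" hands the message to exit as an argument instead of logging it. What follows is a minimal sketch of one possible control flow, not the agent's actual implementation. The names recovery_state_to_rc and promote_sketch are hypothetical, the psql call and the trigger-file step are replaced by commands passed in as arguments, and the OCF return codes are hard-coded so that the sketch runs outside a cluster.

```shell
#!/bin/sh
# Sketch only. The OCF return codes are hard-coded so that the example runs
# outside a cluster; a real agent takes them from ocf-shellfuncs.
OCF_SUCCESS=0
OCF_ERR_GENERIC=1
OCF_RUNNING_MASTER=8

# Hypothetical helper: map the output of "SELECT pg_is_in_recovery()"
# ("f" or "t" with psql -Atc) to a status code.
recovery_state_to_rc() {
    case "$1" in
        f) return "$OCF_RUNNING_MASTER" ;;   # not in recovery: primary
        t) return "$OCF_SUCCESS" ;;          # in recovery: hot standby
        *) return "$OCF_ERR_GENERIC" ;;      # anything else is an error
    esac
}

# Hypothetical promote skeleton.
#   $1: command that prints the recovery state (stands in for the psql call)
#   $2: command that requests promotion (stands in for touching the trigger file)
#   $3: maximum number of polls (default 5)
promote_sketch() {
    query_cmd=$1
    trigger_cmd=$2
    max_tries=${3:-5}

    # Capture the status directly from the check, with nothing in between.
    rc=0
    recovery_state_to_rc "$($query_cmd)" || rc=$?
    case "$rc" in
        "$OCF_RUNNING_MASTER") return "$OCF_SUCCESS" ;;  # already primary
        "$OCF_SUCCESS") ;;                               # standby: carry on
        *) return "$OCF_ERR_GENERIC" ;;
    esac

    # Only a standby reaches this point, so the promotion step can run.
    $trigger_cmd || return "$OCF_ERR_GENERIC"

    tries=0
    while [ "$tries" -lt "$max_tries" ]; do
        rc=0
        recovery_state_to_rc "$($query_cmd)" || rc=$?
        if [ "$rc" -eq "$OCF_RUNNING_MASTER" ]; then
            return "$OCF_SUCCESS"
        fi
        tries=$((tries + 1))
        # A real agent would sleep between polls here.
    done

    # Bounded loop: report failure instead of waiting forever.
    return "$OCF_ERR_GENERIC"
}
```

In an agent, the first argument would presumably wrap the "su - $OCF_RESKEY_pgdba -c ..." query and the second the "touch $OCF_RESKEY_trigger_file" step from the quoted function. The sketch leaves out the timeline problem raised above: the old master still has to be rebuilt from a fresh backup.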