2013/7/3 Andrey Groshev <gre...@yandex.ru>:
>
>
> 03.07.2013, 16:26, "Takatoshi MATSUO" <matsuo....@gmail.com>:
>> Hi Andrey
>>
>> 2013/7/3 Andrey Groshev <gre...@yandex.ru>:
>>
>>> 03.07.2013, 06:43, "Takatoshi MATSUO" <matsuo....@gmail.com>:
>>>> Hi Stefano
>>>>
>>>> 2013/7/2 Stefano Sasso <stesa...@gmail.com>:
>>>>> Hello folks,
>>>>> I have the following setup in mind, but I need some advice and a
>>>>> hint on how to realize a particular function.
>>>>>
>>>>> I have an N (>= 2) node cluster, with data storage on PostgreSQL.
>>>>> I would like to manage postgres master-slave replication in this
>>>>> way: one node is the "master", one is the "slave", and the others
>>>>> are "standby" nodes.
>>>>> If the master fails, the slave becomes the master, and one of the
>>>>> standbys becomes the slave.
>>>>> If the slave fails, one of the standbys becomes the new slave.
>>>> Does "standby" mean that PostgreSQL is stopped?
>>>> If the master doesn't have the WAL files which the new slave needs,
>>>> the new slave can't connect to the master.
>>>>
>>>> How do you solve that?
>>>> Copy the data or WAL archive automatically on start?
>>>> That may time out if PostgreSQL has a large database.
>>>>> If one of the "standby" nodes fails, no problem :)
>>>>> I can correctly manage this configuration with an ms resource and a
>>>>> custom script (using ocf:pacemaker:Stateful as an example). If the
>>>>> cluster is already operational, the failover works fine.
>>>>>
>>>>> My problem is about cluster start-up: in fact, only the previously
>>>>> running master and slave own the most up-to-date data; so I would
>>>>> like the new master to be the "old master" (or even the old slave),
>>>>> and the new slave to be the "old slave" (but this one is not
>>>>> mandatory). The important thing is that the new master should have
>>>>> up-to-date data.
>>>>> This should happen even if the servers are booted up with some
>>>>> minutes of delay between them. (Users are very stupid sometimes.)
>>>> The latest pgsql RA embraces these ideas to manage replication.
>>>>
>>>> 1. First boot
>>>> The RA compares data and promotes the PostgreSQL instance which has
>>>> the latest data.
>>>> The number of comparisons can be changed using the xlog_check_count
>>>> parameter.
>>>> If the monitor interval is 10 sec and xlog_check_count is 360, the
>>>> RA can wait 1 hour to promote :)
>>> But in this case, when the master dies, the election of a new master
>>> will also take one hour.
>>> Is that right?
>>
>> No. If the slave's data is up to date, the master changes the slave's
>> master-score.
>> So Pacemaker stops the master and promotes the slave immediately when
>> the master dies.
>>
>
> Wait.... in the function have_master_right():
>
> ....snip....
>     # get xlog locations of all nodes
>     for node in ${NODE_LIST}; do
>         output=`$CRM_ATTR_REBOOT -N "$node" -n \
>                 "$PGSQL_XLOG_LOC_NAME" -G -q 2>/dev/null`
> ....snip....
>     if [ "$new" -ge "$OCF_RESKEY_xlog_check_count" ]; then
>         newestXlog=`printf "$newfile\n" | sort -t " " -k 2,3 -r | \
>                     head -1 | cut -d " " -f 2`
>         if [ "$newestXlog" = "$mylocation" ]; then
>             ocf_log info "I have a master right."
>             $CRM_MASTER -v $PROMOTE_ME
>             return 0
>         fi
>         change_data_status "$NODENAME" "DISCONNECT"
>         ocf_log info "I don't have correct master data."
>         # reset counter
>         rm -f ${XLOG_NOTE_FILE}.*
>         printf "$newfile\n" > ${XLOG_NOTE_FILE}.0
>     fi
>
>     return 1
> }
>
> As I understand it, the xlog is checked on all nodes
> $OCF_RESKEY_xlog_check_count or more times.
> And this function is called from pgsql_replication_monitor(), which in
> turn is called from pgsql_monitor().
> That is, until "monitor" has been called $OCF_RESKEY_xlog_check_count
> times, have_master_right() will not return true.
> I am recalling the structure of your code from memory :)
> Or am I wrong?
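The newest-xlog selection quoted above can be reduced to a small stand-alone sketch. The node names and locations below are made-up examples, and the zero-padded hex form is an assumption standing in for the normalized xlog locations the RA stores as node attributes (a plain-text `sort` only orders them correctly in such a fixed-width form):

```shell
#!/bin/sh
# Stand-alone sketch of the selection step in have_master_right().
# "$newfile" holds one "<node> <xlog_location>" line per node; in the
# real RA these are gathered via crm_attribute. Values are made up.
newfile='node1 0000000005000090
node2 0000000005000148
node3 0000000004FFFFF0'

# Same pipeline as the RA: reverse-sort on the location column and keep
# the location from the top entry.
newestXlog=$(printf '%s\n' "$newfile" | sort -t " " -k 2,3 -r | \
             head -1 | cut -d " " -f 2)
echo "$newestXlog"    # prints 0000000005000148 (node2 holds the newest data)
```

A node only claims the master right when this newest location equals its own $mylocation; otherwise it marks its data status DISCONNECT and resets the counter, exactly as in the quoted snippet.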
have_master_right() doesn't change the master score. So PostgreSQL is
promoted immediately if the slave has a master-score > 0, regardless of
the return code of have_master_right().

Note that it makes an exception when using rep_mode=async and the number
of nodes is >= 3, because the RA can not know which node should be
promoted.

control_slave_status()
------------------------------------------------------------------
    if [ $number_of_nodes -le 2 ]; then
        change_master_score "$target" "$CAN_PROMOTE"
    else
        # I can't determine which slave's data is newest in async mode.
        change_master_score "$target" "$CAN_NOT_PROMOTE"
    fi
------------------------------------------------------------------

>
>
>>>> 2. Second boot
>>>> The master manages the slaves' data status using an attribute with
>>>> the "-l forever" option.
>>>> So the RA can't start PostgreSQL if the node doesn't have the latest
>>>> data.
>>>>> My idea is the following:
>>>>> the MS resource is not started when the cluster comes up; on
>>>>> startup there is only one "arbitrator" resource (started on only
>>>>> one node).
>>>>> This resource reads from somewhere which node was the previous
>>>>> master and which was the previous slave, and it waits up to 5
>>>>> minutes to see if one of them comes up.
>>>>> In the positive case, it forces the MS master resource to run on
>>>>> that node (and starts it); in the negative case, when the wait
>>>>> timer expires, it starts the master resource on a random node.
>>>>>
>>>>> Is that possible? How can I keep a single resource from starting
>>>>> on cluster boot?
>>>>> Or could you advise another way to do this setup?
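The control_slave_status() rule quoted earlier can be sketched as a stand-alone decision function. The function name and the numeric score are illustrative only; the real RA passes the chosen value to crm_master for the target slave node:

```shell
#!/bin/sh
# Minimal sketch of the promotion-score rule from control_slave_status().
# Score values are illustrative; "-INFINITY" is Pacemaker's notation for
# a score that forbids promotion.
CAN_PROMOTE=100
CAN_NOT_PROMOTE="-INFINITY"

slave_master_score() {
    number_of_nodes=$1
    if [ "$number_of_nodes" -le 2 ]; then
        # One master, one slave: the lone slave is always a safe
        # promotion candidate.
        echo "$CAN_PROMOTE"
    else
        # Three or more nodes in async mode: the RA can't tell which
        # slave's data is newest, so no slave gets a master score.
        echo "$CAN_NOT_PROMOTE"
    fi
}

slave_master_score 2    # prints 100
slave_master_score 3    # prints -INFINITY
```

This is why, with three or more async nodes, no slave is promoted immediately on master failure and the first-boot comparison in have_master_right() has to arbitrate instead.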
>>>>>
>>>>> I hope I was clear, my English is not so good :)
>>>>> thank you so much,
>>>>> stefano
>>>>>
>>>>> --
>>>>> Stefano Sasso
>>>>> http://stefano.dscnet.org/
>>>>
>>>> Regards,
>>>> Takatoshi MATSUO

_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org