14.02.2013 10:03, Takatoshi MATSUO wrote:
> Hi
>
> 2013/2/13 Andrew <ni...@seti.kr.ua>:
>> 12.02.2013 02:35, Takatoshi MATSUO wrote:
>>
>>> Hi
>>>
>>> 2013/2/9 Andrew <ni...@seti.kr.ua>:
>>>> Hi all.
>>>> For what reason is PGSQL.lock implemented in the RA, and what problems
>>>> may happen if it is removed from the RA code?
>>> It may cause data inconsistency.
>>> If the file exists on a node, you need to copy data from the new master.
>> I noticed that during master migration the lock still remains and PostgreSQL
>> isn't started on the old master; demote also fails with the lock file. Also,
>> if the cluster fails (for ex., a power failure occurs), the old master will
>> not start, and the slave will be promoted to master after startup - that's
>> OK when both nodes crashed simultaneously, and it's really bad when the old
>> slave crashed earlier. If postgres crashed or was killed by OOM/etc, it also
>> will not be restarted...
> The existence of the lock file does not necessarily mean that data is
> inconsistent.
> The RA can't know the detailed data status.
>
> If you know that data is valid, you can delete the lock file and clear
> failcount.

Really - the RA can check the last log replay position and choose its behaviour:
start the old 'master' as master if its log position is ahead of the 'old-slave'
one; or, if its log position is behind the 'old-slave' one, either fail, or try
to start as slave and fail if it isn't synced by a timeout, or force a sync.

>> Maybe it'll be better to watch the log files on a slave that tries to sync
>> with the master/to check the slave timeline, and if the slave can't sync
>> because of an error that the timeline differs - to fail it with an error (or
>> even to sync with the master using pg_basebackup - it supports connection to
>> a remote server and works quickly:
>> http://sharingtechknowledge.blogspot.com/2011/12/postgresql-pgbasebackup-forget-about.html
>> - example)?
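
To make the log-position idea above concrete, here is a minimal sketch in Python of the decision I have in mind. It is not code from the pgsql RA: the function names and the returned action strings are hypothetical, and it assumes both positions are already available as text in the "X/Y" hexadecimal form that PostgreSQL reports for WAL locations. What to do when the positions are equal is my own assumption.

```python
def parse_xlog_location(text):
    """Turn an 'X/Y' hexadecimal WAL location into a comparable (hi, lo) pair."""
    hi, lo = text.strip().split("/")
    return int(hi, 16), int(lo, 16)


def choose_action(old_master_location, old_slave_location):
    """Decide how the old master should come back (hypothetical action names)."""
    old_master = parse_xlog_location(old_master_location)
    old_slave = parse_xlog_location(old_slave_location)

    if old_master > old_slave:
        # The old master has data the old slave never received.
        return "start-as-master"
    if old_master == old_slave:
        # Assumption: nothing was lost, so joining as a slave is safe.
        return "start-as-slave"
    # The old master is behind: it has to be resynchronised from the peer
    # (for example with pg_basebackup) before it may rejoin.
    return "resync-from-peer"


if __name__ == "__main__":
    print(choose_action("0/3000158", "0/3000060"))
```

The positions are compared as numbers, not as strings, because a plain string comparison would order "0/10" before "0/9".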
>>
>>
>>>> Also, 2nd question: how can I prevent the pgsql RA from promoting a
>>>> master before both nodes are brought up OR before a timeout is reached
>>>> (for ex., if the 2nd node is dead)?
>>> You can use the xlog_check_count parameter set to a large number.
>>> The RA retries comparing data the specified number of times in Slave.
>> Thanks; I'll try this.
>>
>>> Or you can use "target-role" such as below too.
>>> ----
>>> ms msPostgresql pgsql \
>>>     meta master-max="1" master-node-max="1" clone-max="2" \
>>>     clone-node-max="1" notify="true" target-role="Slave"
>>> ----
>> In that case, how can I choose which node to promote to master (the one
>> with the fresher WAL position) - should I do this manually, or can I just
>> run promote?
>>
> In a master/slave configuration, the RA decides which node can be promoted
> using master-score,
> and Pacemaker promotes it considering "colocation", "order", "rule" and so on.
> So you can't promote it manually.
>
> But as far as the pgsql RA goes, you can do it as below:
>
> 1. stop all pacemakers
> 2. clear all settings of pacemaker such as "rm
>    /var/lib/heartbeat/crm/cib*" in both nodes.
> 3. start pacemaker in one server which should be Master.
>    -> RA certainly increments master-score in Slave and PostgreSQL is promoted
>    because there is no pgsql-data-status and no other node.
>

Ok, thanks. I'm not too familiar with pacemaker, so some operation details are
still hidden from me.
But for master migration there is a much easier solution: to migrate the
colocated master IP.

_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org