>>> Andrew Beekhof <and...@beekhof.net> wrote on 26.11.2012 at 01:39 in message
<caedlwg0cimdhxernd_7h1gm3-bwhpgd7bwz1lgmr9d_hemo...@mail.gmail.com>:
> On Fri, Nov 23, 2012 at 3:08 AM, Rafał Radecki <radecki.ra...@gmail.com> wrote:
> > Hi all.
> >
> > I am currently building a Pacemaker/Corosync cluster which serves a Tomcat
> > resource in master/slave mode. This Tomcat serves a Solr Java application.
> > My configuration is:
> >
> > node storage1
> > node storage2
> >
> > primitive TSVIP ocf:heartbeat:IPaddr2 \
> >         params ip="192.168.100.204" cidr_netmask="32" nic="eth0" \
> >         op monitor interval="30s"
> >
> > primitive TomcatSolr ocf:polskapresse:tomcat6 \
> >         op start interval="0" timeout="60" on-fail="stop" \
> >         op stop interval="0" timeout="60" on-fail="stop" \
> >         op monitor interval="31" role="Slave" timeout="60" on-fail="stop" \
> >         op monitor interval="30" role="Master" timeout="60" on-fail="stop"
> >
> > ms TomcatSolrClone TomcatSolr \
> >         meta master-max="1" master-node-max="1" clone-max="2" \
> >         clone-node-max="1" notify="false" globally-unique="true" \
> >         ordered="false" target-role="Master"
> >
> > colocation TomcatSolrClone_with_TSVIP inf: TomcatSolrClone:Master TSVIP:Started
> > order TomcatSolrClone_after_TSVIP inf: TSVIP:start TomcatSolrClone:promote
> >
> > property $id="cib-bootstrap-options" \
> >         dc-version="1.1.7-6.el6-148fccfd5985c5590cc601123c6c16e966b85d14" \
> >         cluster-infrastructure="openais" \
> >         expected-quorum-votes="4" \
> >         stonith-enabled="false" \
> >         no-quorum-policy="ignore" \
> >         symmetric-cluster="true" \
> >         default-resource-stickiness="1" \
> >         last-lrm-refresh="1353594420"
> > rsc_defaults $id="rsc-options" \
> >         resource-stickiness="10" \
> >         migration-threshold="1000000"
> >
> > So logically I have:
> > - one node with TSVIP and TomcatSolrClone Master;
> > - one node with TomcatSolrClone Slave.
> > I have set up replication between the Solr instances on the TomcatSolrClone
> > Master and Slave, and written an OCF agent (attached).
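As background for the failure handling discussed below: when a monitor operation with on-fail="stop" fails, Pacemaker records the failure in the resource's per-node fail count, which can be inspected and cleared from the command line. A minimal sketch, assuming a standard Pacemaker 1.1 installation with the crm shell:

```
# Show cluster status once, including per-node resource fail counts
crm_mon -1 --failcounts

# Clear the recorded failures for the ms resource so the cluster
# may place it again
crm resource cleanup TomcatSolrClone
```

The cleanup step is what lets a resource stopped by on-fail="stop" (or held by migration-threshold) run again without restarting the cluster stack.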
> > A few moments ago, when I killed the Slave resource with 'pkill java', the
> > resource was restarted on the same node, despite the fact that the monitor
> > action returned $OCF_ERR_GENERIC and I have on-fail="stop" set for
> > TomcatSolr (I have also tried "block", with the same effect).
> >
> > Then I added a migration threshold:
> >
> > ms TomcatSolrClone TomcatSolr \
> >         meta master-max="1" master-node-max="1" clone-max="2" \
> >         clone-node-max="1" notify="false" globally-unique="true" \
> >         ordered="false" target-role="Started" \
> >         params migration-threshold="1"
> >
> > and now when I kill java on the Slave it does not start anymore (the Master
> > is fine). But when I then kill java on the Master (no resource running on
> > either node), everything gets restarted by the cluster, and Master and
> > Slave are running afterwards.
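One detail worth double-checking in the snippet above: in Pacemaker, migration-threshold is a resource meta attribute, not an instance parameter, so declaring it under `params` may not be interpreted as intended. A hedged sketch of the same ms definition with the threshold moved under `meta` (same attribute values as in the original post):

```
# migration-threshold belongs with the other meta attributes
ms TomcatSolrClone TomcatSolr \
        meta master-max="1" master-node-max="1" clone-max="2" \
        clone-node-max="1" notify="false" globally-unique="true" \
        ordered="false" target-role="Started" \
        migration-threshold="1"
```

With the threshold as a meta attribute, one recorded failure is enough for the policy engine to ban the instance from that node until the fail count is cleaned up.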
Hi!

May I guess? The Slave wants to migrate after one failure, but may not start on
the Master's node. If the Master also fails, the Master migrates to the Slave's
node, and then the Slave is free to start on the Master's former node.

> > How to stop this restart when Slave and Master both fail?

Why would you want to stop restarts?

Regards,
Ulrich

> Could you file a bug (https://bugs.clusterlabs.org) for this and
> include a crm_report for your testcase?
> It's likely that you've hit a bug.
>
> > Best regards,
> > Rafal.

_______________________________________________
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
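Ulrich's guess about placement can be checked directly on a live cluster: the policy engine's allocation scores show whether the Slave instance is banned (score -INFINITY) from the Master's node after the failure. A minimal sketch, assuming Pacemaker 1.1's crm_simulate is installed:

```
# Run the policy engine against the live CIB and dump the
# per-node allocation scores for every resource instance
crm_simulate --live-check --show-scores
```

A -INFINITY score for the failed TomcatSolr instance on one node, combined with clone-node-max="1" blocking a second instance on the surviving node, would reproduce exactly the behaviour described: the Slave stays down until the Master's failure reshuffles placement.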