Hello. I have two servers (radius1, radius2). I've set up a cluster resource of type IPaddr2. I used the following commands to set up this resource:
# crm configure property stonith-enabled="false"
# crm configure property no-quorum-policy="ignore"
# crm configure primitive raddb_ip ocf:heartbeat:IPaddr2 params ip="10.99.2.57" cidr_netmask="32" op monitor interval="15s"
# crm configure group raddb raddb_ip
# crm configure location raddb-prefers-radius1 raddb inf: radius1
# crm configure rsc_defaults resource-stickiness=1000001

Everything works, but sometimes the load on radius1 increases and the server starts swapping, and at that moment the resource becomes "(unmanaged) FAILED". Here is an example of the "unmanaged" resource:

# crm_mon
============
Last updated: Wed Dec 7 14:56:20 2011
Stack: openais
Current DC: radius1 - partition with quorum
Version: 1.1.5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f
2 Nodes configured, 2 expected votes
1 Resources configured.
============

Online: [ radius2 radius1 ]

 Resource Group: raddb
     raddb_ip (ocf::heartbeat:IPaddr2): Started radius1 (unmanaged) FAILED

Failed actions:
    raddb_ip_monitor_15000 (node=radius1, call=4, rc=-2, status=Timed Out): unknown exec error
    raddb_ip_stop_0 (node=radius1, call=5, rc=-2, status=Timed Out): unknown exec error

Part of /var/log/syslog from radius1 is here: http://paste.org/41963

At that moment the IP address 10.99.2.57 is still up and the server responds to requests coming to it. However, sometimes the resource becomes completely unavailable and I have to restart corosync, which is very bad.

I think the resource becomes unmanaged because the server is using swap and some of the corosync processes are swapped out. I tested this hypothesis: when the server is using a lot of swap, the resource becomes "unmanaged".
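For reference, the swap hypothesis could be checked per process with something like the sketch below. This is only a rough sketch under some assumptions: the function names (swap_kb, report_cluster_swap) are made up for illustration, the list of daemon names is my guess at what runs under corosync/pacemaker on this setup, and it relies on the kernel reporting a VmSwap field in /proc/<pid>/status, which older kernels may not do (in that case it simply prints 0).

```shell
#!/bin/sh
# Print the VmSwap value (in kB) from a /proc/<pid>/status-style file.
# Prints 0 if the field is missing (older kernels may not report VmSwap).
swap_kb() {
    awk '/^VmSwap:/ { v = $2 } END { print v + 0 }' "$1"
}

# Report swap usage for each cluster daemon that is currently running.
# The daemon names below are an assumption; adjust them to the local setup.
report_cluster_swap() {
    for name in corosync crmd lrmd cib pengine attrd stonithd; do
        for pid in $(pidof "$name" 2>/dev/null); do
            printf '%s (pid %s): %s kB in swap\n' \
                "$name" "$pid" "$(swap_kb "/proc/$pid/status")"
        done
    done
}
```

Running report_cluster_swap as root on radius1 while the load is high should print one line per running daemon, which would show whether lrmd or the other cluster processes are really swapped out when the monitor operation times out.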
I use Debian GNU/Linux 5.x and these packages from http://people.debian.org/~madkiss/ha/:

# dpkg -l |grep cluster
ii  cluster-glue     1.0.7+hg2618-2~bpo50+1  The reusable cluster components for Linux HA
ii  corosync         1.4.2-1~bpo50+1         Standards-based cluster framework (daemon an
ii  libcluster-glue  1.0.7+hg2618-2~bpo50+1  Reusable cluster libraries (transitional pac
ii  libcorosync4     1.4.2-1~bpo50+1         Standards-based cluster framework (libraries
ii  libcrmcluster1   1.1.5-3~bpo50+1         Pacemaker libraries - CRM
ii  liblrm2          1.0.7+hg2618-2~bpo50+1  Reusable cluster libraries -- liblrm2
ii  libpils2         1.0.7+hg2618-2~bpo50+1  Reusable cluster libraries -- libpils2
ii  libplumb2        1.0.7+hg2618-2~bpo50+1  Reusable cluster libraries -- libplumb2
ii  libplumbgpl2     1.0.7+hg2618-2~bpo50+1  Reusable cluster libraries -- libplumbgpl2
ii  libstonith1      1.0.7+hg2618-2~bpo50+1  Reusable cluster libraries -- libstonith1
ii  pacemaker        1.1.5-3~bpo50+1         HA cluster resource manager

I can't increase the RAM on these servers. How can I prevent the resource from becoming "unmanaged/failed"?

With Best Regards.
Aleksey V. Kashin