Hello everyone,

For a project that I plan to present in class, I am experimenting with DRBD and heartbeat to learn the basics of high availability. I am a beginner with Linux, and even more so with Debian (I started with Ubuntu). I followed several tutorials to set up DRBD and heartbeat. DRBD works perfectly. Heartbeat, on the other hand, looks simple, but nothing works. I am also using the V1-style configuration, so it is easy to read.

When I start heartbeat via:

/etc/init.d/heartbeat restart

I get the following message:

Stopping High-Availability services:
Done.

Waiting to allow resource takeover to complete:
Done.

Starting High-Availability services:
2010/05/31_22:12:33 INFO: Resource is stopped
Done.

And the IP alias is not created.

My structure looks like this:

LAN0 on eth0: 192.168.0.0 /24   # LAN users + heartbeat
LAN1 on eth1: 192.168.1.0 /30   # LAN DRBD: working
LAN2 on eth2: 192.168.2.0 /28   # LAN app servers: not yet used

Heartbeat has 2 nodes: frontal1 and frontal2

  frontal1                            frontal2
------------                        ------------
    eth1 ----------- DRBD ----------- eth1
    eth0 ---------------------------- eth0

frontal1 eth0: 192.168.0.2
frontal2 eth0: 192.168.0.3
heartbeat IP alias eth0:0: 192.168.0.1
frontal1 eth1: 192.168.1.1
frontal2 eth2: 192.168.1.2

My software is Debian 5.0 Lenny and heartbeat 2.1.3-6lenny4.

I followed these guides without any success:

http://howtoforge.net/highly-available-nfs-server-using-drbd-and-heartbeat-on-debian-5.0-lenny
http://doc.ubuntu-fr.org/tutoriel/mirroring_sur_deux_serveurs
http://www.drbd.org/users-guide/ch-heartbeat.html
http://www.linux-ha.org/doc/

Below are my logs and command results.

When I run BasicSanityCheck (2), I see a problem with IPaddr. But when I manually launch the IPaddr or IPaddr2 script, the IP alias is created and available on the network.
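For anyone who wants to reproduce the check, here is a minimal sketch of how the presence of the alias can be tested from `ip addr show` output. It is only an illustration: `has_inet` is a hypothetical helper name, not part of heartbeat, and it assumes the address is printed with a prefix length (as in `inet 192.168.0.2/24` in output (5) below).

```shell
#!/bin/sh
# Hypothetical helper (not part of heartbeat): succeed if ADDRESS appears as
# an IPv4 address in `ip addr show` output read from standard input.
# Assumption: the address is printed as "inet ADDRESS/PREFIX".
has_inet() {
    # -F: match the address literally (dots are not regex wildcards)
    # -q: print nothing, only set the exit status
    # The trailing "/" keeps 192.168.0.1 from matching 192.168.0.10.
    grep -Fq "inet $1/"
}
```

It could be used like this on the node that should hold the alias:

    ip addr show dev eth0 | has_inet 192.168.0.1 && echo "alias present"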
I looked at a few forums on the subject and did not find any solution to my problem.

Thanks for your help

(1) vim /etc/ha.d/ha.cf
Code:
autojoin none
mcast eth0 239.0.0.43 694 1 0
warntime 5
deadtime 5
initdead 15
keepalive 2
node frontal1
node frontal2

(2) sh /usr/share/heartbeat/BasicSanityCheck
Code:
RTNETLINK answers: Network is unreachable
Using interface: eth0
Should not run tests with heartbeat already running.
Starting base64 and md5 algorithm tests
base64 and md5 algorithm tests succeeded.
Starting Resource Agent tests
Testing RA: Dummy
Testing RA: IPaddr
ERROR: IPaddr RA failed
Starting IPC tests
That's weird. Heartbeat seems to be running...
Stopping heartbeat
Stopping High-Availability services:
Done.
Starting heartbeat
Starting High-Availability services:
2010/05/31_22:16:04 INFO: Resource is stopped
Done.
Does not look like we ARPed the address
Looks like monitor operation failed
Reloading heartbeat
Reloading heartbeat
Stopping heartbeat
Stopping High-Availability services:
Done.
Checking STONITH basic sanity.
Performing apphbd success case tests
Performing apphbd failure case tests
Starting LRM tests
Starting heartbeat
Starting High-Availability services:
2010/05/31_22:18:25 INFO: Resource is stopped
Done.

(3) sh /usr/share/heartbeat/ResourceManager listkeys frontal1
192.168.0.1

(4) sh /usr/share/heartbeat/ResourceManager listkeys frontal2

(5) ip addr show
Code:
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN qlen 1000
    link/ether 00:0c:29:cb:86:45 brd ff:ff:ff:ff:ff:ff
    inet 192.168.0.2/24 brd 192.168.0.255 scope global eth0
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN qlen 1000
    link/ether 00:0c:29:cb:86:4f brd ff:ff:ff:ff:ff:ff
    inet 192.168.1.1/30 brd 192.168.1.3 scope global eth1
4: eth2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN qlen 1000
    link/ether 00:0c:29:cb:86:59 brd ff:ff:ff:ff:ff:ff

(6) /etc/ha.d/resource.d/IPaddr 192.168.0.1 start
Code:
2010/05/31_22:30:37 INFO: Success

(7) /etc/ha.d/resource.d/IPaddr2 192.168.0.1 start
Code:
2010/05/31_22:30:24 INFO: Using calculated nic for 192.168.0.1: eth0
2010/05/31_22:30:24 INFO: Using calculated netmask for 192.168.0.1: 255.255.255.0
2010/05/31_22:30:25 INFO: eval ifconfig eth0:0 192.168.0.1 netmask 255.255.255.0 broadcast 192.168.0.255
2010/05/31_22:30:25 INFO: Success
INFO: Success

(8) cat /etc/ha.d/haresources
frontal1 IPaddr2::192.168.0.1/24/eth0/192.168.0.255
or
frontal1 drbddisk::r0 Filesystem::/dev/drbd1::/serveur::ext3 dhcp3-server

(9) cat /var/log/heartbeat/log
Code:
heartbeat[7179]: 2010/05/31_22:38:40 info: Version 2 support: false
heartbeat[7179]: 2010/05/31_22:38:40 WARN: Deprecated 'legacy' auto_failback option selected.
heartbeat[7179]: 2010/05/31_22:38:40 WARN: Please convert to 'auto_failback on'.
heartbeat[7179]: 2010/05/31_22:38:40 WARN: See documentation for conversion details.
heartbeat[7179]: 2010/05/31_22:38:40 WARN: Logging daemon is disabled --enabling logging daemon is recommended
heartbeat[7179]: 2010/05/31_22:38:40 info: **************************
heartbeat[7179]: 2010/05/31_22:38:40 info: Configuration validated. Starting heartbeat 2.1.3
heartbeat[7180]: 2010/05/31_22:38:40 info: heartbeat: version 2.1.3
heartbeat[7180]: 2010/05/31_22:38:40 info: Heartbeat generation: 1275221613
heartbeat[7180]: 2010/05/31_22:38:40 info: glib: UDP multicast heartbeat started for group 239.0.0.43 port 694 interface eth0 (ttl=1 loop=0)
heartbeat[7180]: 2010/05/31_22:38:40 info: G_main_add_TriggerHandler: Added signal manual handler
heartbeat[7180]: 2010/05/31_22:38:40 info: G_main_add_TriggerHandler: Added signal manual handler
heartbeat[7180]: 2010/05/31_22:38:40 info: G_main_add_SignalHandler: Added signal handler for signal 17
heartbeat[7180]: 2010/05/31_22:38:40 info: Local status now set to: 'up'
heartbeat[7180]: 2010/05/31_22:38:41 info: Link frontal2:eth0 up.
heartbeat[7180]: 2010/05/31_22:38:41 info: Status update for node frontal2: status active
harc[7188]: 2010/05/31_22:38:41 info: Running /etc/ha.d/rc.d/status status
heartbeat[7180]: 2010/05/31_22:38:42 info: Comm_now_up(): updating status to active
heartbeat[7180]: 2010/05/31_22:38:42 info: Local status now set to: 'active'
IPaddr2[7242]: 2010/05/31_22:38:42 INFO: Resource is stopped
heartbeat[7204]: 2010/05/31_22:38:42 info: Local Resource acquisition completed.
harc[7337]: 2010/05/31_22:39:06 info: Running /etc/ha.d/rc.d/ip-request-resp ip-request-resp
ip-request-resp[7337]: 2010/05/31_22:39:06 received ip-request-resp IPaddr2::192.168.0.1/24/eth0/192.168.0.255 OK no
ResourceManager[7356]: 2010/05/31_22:39:06 info: Acquiring resource group: frontal1 IPaddr2::192.168.0.1/24/eth0/192.168.0.255
IPaddr2[7382]: 2010/05/31_22:39:06 INFO: Resource is stopped
ResourceManager[7356]: 2010/05/31_22:39:06 info: Running /etc/ha.d/resource.d/IPaddr2 192.168.0.1/24/eth0/192.168.0.255 start
IPaddr2[7491]: 2010/05/31_22:39:07 INFO: ip -f inet addr add 192.168.0.1/24 brd 192.168.0.255 dev eth0
IPaddr2[7491]: 2010/05/31_22:39:07 INFO: ip link set eth0 up
IPaddr2[7491]: 2010/05/31_22:39:07 INFO: /usr/lib/heartbeat/send_arp -i 200 -r 5 -p /var/run/heartbeat/rsctmp/send_arp/send_arp-192.168.0.1 eth0 192.168.0.1 auto not_used not_used
IPaddr2[7462]: 2010/05/31_22:39:07 INFO: Success
heartbeat[7180]: 2010/05/31_22:39:07 info: Initial resource acquisition complete (ip-request-resp)
harc[7549]: 2010/05/31_22:39:07 info: Running /etc/ha.d/rc.d/ip-request-resp ip-request-resp
ip-request-resp[7549]: 2010/05/31_22:39:07 received ip-request-resp drbddisk::r0 OK no
ResourceManager[7568]: 2010/05/31_22:39:07 info: Acquiring resource group: frontal1 drbddisk::r0 Filesystem::/dev/drbd1::/serveur::ext3 dhcp3-server tftpd-hpa
ResourceManager[7568]: 2010/05/31_22:39:07 info: Running /etc/ha.d/resource.d/drbddisk r0 start
Filesystem[7633]: 2010/05/31_22:39:07 INFO: Resource is stopped
ResourceManager[7568]: 2010/05/31_22:39:07 info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd1 /serveur ext3 start
Filesystem[7711]: 2010/05/31_22:39:07 INFO: Running start for /dev/drbd1 on /serveur
Filesystem[7700]: 2010/05/31_22:39:07 INFO: Success
ResourceManager[7568]: 2010/05/31_22:39:07 info: Running /etc/init.d/dhcp3-server start
ResourceManager[7568]: 2010/05/31_22:39:09 info: Running /etc/init.d/tftpd-hpa start
ResourceManager[7568]: 2010/05/31_22:39:09 ERROR: Return code 71 from /etc/init.d/tftpd-hpa
ResourceManager[7568]: 2010/05/31_22:39:09 CRIT: Giving up resources due to failure of tftpd-hpa
ResourceManager[7568]: 2010/05/31_22:39:09 info: Releasing resource group: frontal1 drbddisk::r0 Filesystem::/dev/drbd1::/serveur::ext3 dhcp3-server tftpd-hpa
ResourceManager[7568]: 2010/05/31_22:39:09 info: Running /etc/init.d/tftpd-hpa stop
ResourceManager[7568]: 2010/05/31_22:39:09 info: Running /etc/init.d/dhcp3-server stop
ResourceManager[7568]: 2010/05/31_22:39:09 info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd1 /serveur ext3 stop
Filesystem[7898]: 2010/05/31_22:39:09 INFO: Running stop for /dev/drbd1 on /serveur
Filesystem[7898]: 2010/05/31_22:39:09 INFO: Trying to unmount /serveur
Filesystem[7898]: 2010/05/31_22:39:10 INFO: unmounted /serveur successfully
Filesystem[7887]: 2010/05/31_22:39:10 INFO: Success
ResourceManager[7568]: 2010/05/31_22:39:10 info: Running /etc/ha.d/resource.d/drbddisk r0 sto

_______________________________________________
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems