Ok. I started from scratch, using your kind replies as a guideline. Yet there is still no failover when bringing down the first MGS. Below are the steps I've taken to set it up; hopefully someone here can spot my error. I got rid of keepalived and DRBD (was this wise, or should I keep DRBD for the MGS/MDT syncing?) and set up just Lustre.
Two nodes for MGS/MDT, and two nodes for OSTs.

fs-mgs-001:~# mkfs.lustre --mgs --failnode=fs-mgs-...@tcp --reformat /dev/VG1/mgs
fs-mgs-001:~# mkfs.lustre --mdt --mgsnode=fs-mgs-...@tcp --failnode=fs-mgs-...@tcp --fsname=datafs --reformat /dev/VG1/mdt
fs-mgs-001:~# mount -t lustre /dev/VG1/mgs /mnt/mgs/
fs-mgs-001:~# mount -t lustre /dev/VG1/mdt /mnt/mdt/

fs-mgs-002:~# mkfs.lustre --mgs --failnode=fs-mgs-...@tcp --reformat /dev/VG1/mgs
fs-mgs-002:~# mkfs.lustre --mdt --mgsnode=fs-mgs-...@tcp --failnode=fs-mgs-...@tcp --fsname=datafs --reformat /dev/VG1/mdt
fs-mgs-002:~# mount -t lustre /dev/VG1/mgs /mnt/mgs/
fs-mgs-002:~# mount -t lustre /dev/VG1/mdt /mnt/mdt/

fs-ost-001:~# mkfs.lustre --ost --mgsnode=fs-mgs-...@tcp --mgsnode=fs-mgs-...@tcp --failnode=fs-ost-...@tcp --reformat --fsname=datafs /dev/VG1/ost1
fs-ost-001:~# mount -t lustre /dev/VG1/ost1 /mnt/ost/

fs-ost-002:~# mkfs.lustre --ost --mgsnode=fs-mgs-...@tcp --mgsnode=fs-mgs-...@tcp --failnode=fs-ost-...@tcp --reformat --fsname=datafs /dev/VG1/ost1
fs-ost-002:~# mount -t lustre /dev/VG1/ost1 /mnt/ost/

fs-mgs-001:~# lctl dl
  0 UP mgs MGS MGS 7
  1 UP mgc mgc192.168.21...@tcp 5b8fb365-ae8e-9742-f374-539d8876276f 5
  2 UP mgc mgc127.0....@tcp 380bc932-eaf3-9955-7ff0-af96067a2487 5
  3 UP mdt MDS MDS_uuid 3
  4 UP lov datafs-mdtlov datafs-mdtlov_UUID 4
  5 UP mds datafs-MDT0000 datafs-MDT0000_UUID 5
  6 UP osc datafs-OST0000-osc datafs-mdtlov_UUID 5
  7 UP osc datafs-OST0001-osc datafs-mdtlov_UUID 5

fs-mgs-001:~# lctl list_nids
192.168.21...@tcp

client:~# mount -t lustre 192.168.21...@tcp:192.168.21...@tcp:/datafs /data
client:~# time cp test.file /data/
real    0m47.793s
user    0m0.001s
sys     0m3.155s

So far, so good. Let's try that again, now bringing down mgs-001:

client:~# time cp test.file /data/
fs-mgs-001:~# umount /mnt/mdt && umount /mnt/mgs
fs-mgs-002:~# mount -t lustre /dev/VG1/mgs /mnt/mgs
fs-mgs-002:~# mount -t lustre /dev/VG1/mdt /mnt/mdt
fs-mgs-002:~# lctl dl
  0 UP mgs MGS MGS 5
  1 UP mgc mgc192.168.21...@tcp 82b34916-ed89-f5b9-026e-7f8e1370765f 5
  2 UP mdt MDS MDS_uuid 3
  3 UP lov datafs-mdtlov datafs-mdtlov_UUID 4
  4 UP mds datafs-MDT0000 datafs-MDT0000_UUID 3

The OSTs are missing here, so I (try to..) remount those too:

fs-ost-001:~# umount /mnt/ost/
fs-ost-001:~# mount -t lustre /dev/VG1/ost1 /mnt/ost/
mount.lustre: mount /dev/mapper/VG1-ost1 at /mnt/ost failed: No such device or address
The target service failed to start (bad config log?) (/dev/mapper/VG1-ost1).
See /var/log/messages.

After this I can only get back to a running state by unmounting everything on mgs-002 and remounting on mgs-001.

What am I missing here? Am I messing things up by creating two MGSes, one on each MGS node?
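(Side note, in case it helps anyone debugging the same thing: what --failnode/--mgsnode actually wrote into each target can be checked with tunefs.lustre before mounting. A minimal sketch, assuming the same devices as above, with the output abridged; the exact lines vary by Lustre version:

fs-ost-001:~# tunefs.lustre --dryrun /dev/VG1/ost1
   ...
   Parameters: mgsnode=fs-mgs-...@tcp mgsnode=fs-mgs-...@tcp failover.node=fs-ost-...@tcp
   ...

If the second mgsnode= or the failover.node= entry is missing there, the clients and servers were never told about the failover pair.)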
Leen

On 05/20/2010 03:40 PM, Gabriele Paciucci wrote:
> For clarification, in a two-server configuration:
>
> server1 -> 192.168.2.20 MGS+MDT+OST0
> server2 -> 192.168.2.22 OST1
> /dev/sdb is a LUN shared between server1 and server2
>
> from server1: mkfs.lustre --mgs --failnode=192.168.2.22 --reformat /dev/sdb1
> from server1: mkfs.lustre --reformat --mdt --mgsnode=192.168.2.20 --fsname=prova --failover=192.168.2.22 /dev/sdb4
> from server1: mkfs.lustre --reformat --ost --mgsnode=192.168.2.20 --failover=192.168.2.22 --fsname=prova /dev/sdb2
> from server2: mkfs.lustre --reformat --ost --mgsnode=192.168.2.20 --failover=192.168.2.20 --fsname=prova /dev/sdb3
>
> from server1: mount -t lustre /dev/sdb1 /lustre/mgs_prova
> from server1: mount -t lustre /dev/sdb4 /lustre/mdt_prova
> from server1: mount -t lustre /dev/sdb2 /lustre/ost0_prova
> from server2: mount -t lustre /dev/sdb3 /lustre/ost1_prova
>
> from client:
> modprobe lustre
> mount -t lustre 192.168.2...@tcp:192.168.2...@tcp:/prova /prova
>
> Now halt server1 and mount the MGS, MDT, and OST0 on server2; the client
> should continue its activity without problems.
>
> On 05/20/2010 02:55 PM, Kevin Van Maren wrote:
>> leen smit wrote:
>>> Ok, no VIPs then.. But how does failover work in Lustre then?
>>> If I set everything up using the real IP and then mount from a client and
>>> bring down the active MGS, the client will just sit there until it comes
>>> back up again.
>>> As in, there is no failover to the second node. So how does this
>>> internal Lustre failover mechanism work?
>>>
>>> I've been going through the docs, and I must say there is very little on
>>> the failover mechanism, apart from mentions that a separate app should
>>> take care of that. That's the reason I'm implementing keepalived..
>>>
>> Right: the external service needs to keep the "mount" active/healthy on
>> one of the servers.
>> Lustre handles reconnecting clients/servers as long as the volume is
>> mounted where it expects
>> (i.e., the mkfs node or the --failover node).
>>
>>> At this stage I really am clueless, and can only think of creating a TUN
>>> interface, which will have the VIP address (thus, it becomes a real IP,
>>> not just a VIP).
>>> But I've got a feeling that isn't the right approach either...
>>> Are there any docs available where an active/passive MGS setup is described?
>>> Is it sufficient to define a --failnode=nid,... at creation time?
>>>
>> Yep. See Johann's email on the MGS, but for the MDTs and OSTs that's
>> all you have to do
>> (besides listing both MGS NIDs at mkfs time).
>>
>> For the clients, you specify both MGS NIDs at mount time, so they can
>> mount regardless of which
>> node has the active MGS.
>>
>> Kevin
>>
>>> Any help would be greatly appreciated!
>>>
>>> Leen
>>>
>>> On 05/20/2010 01:45 PM, Brian J. Murrell wrote:
>>>> On Thu, 2010-05-20 at 12:46 +0200, leen smit wrote:
>>>>> Keepalived uses a VIP in an active/passive state. In a failover situation
>>>>> the VIP gets transferred to the passive one.
>>>>>
>>>> Don't use virtual IPs with Lustre. Lustre clients know how to deal with
>>>> failover nodes that have different IP addresses, and using a virtual,
>>>> floating IP address will just confuse it.
>>>>
>>>> b.
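To make Kevin's "external service" point concrete: the role keepalived was meant to play can instead be filled by something like Pacemaker keeping each Lustre target mounted on exactly one node of its failover pair, with no VIP involved. A rough, untested sketch in crm shell syntax, assuming corosync/Pacemaker already runs on both server nodes and reusing the device and mount-point names from Gabriele's example:

# Keep the shared MGS and MDT devices mounted on one node at a time.
primitive mgs_prova ocf:heartbeat:Filesystem \
    params device="/dev/sdb1" directory="/lustre/mgs_prova" fstype="lustre" \
    op monitor interval="30s" timeout="60s"
primitive mdt_prova ocf:heartbeat:Filesystem \
    params device="/dev/sdb4" directory="/lustre/mdt_prova" fstype="lustre" \
    op monitor interval="30s" timeout="60s"
# The MGS should be up before the MDT mounts; a group keeps them
# ordered and colocated on the same node.
group lustre_mgs_mdt mgs_prova mdt_prova

On failure, Pacemaker unmounts (or fences) the dead node and mounts the same shared devices on the survivor; the --failnode/--mgsnode settings written at mkfs time are what let the clients follow the move to the other NID.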