Wouldn't it be easier, then, to use drbd on the MGS disk, so you don't have to move the LVM volume group over to a new node?
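Something along these lines, say. This is an untested sketch of a DRBD 8.x resource for the MGS backing device; the hostnames, addresses, and port are placeholders (the "on" names must match each node's uname -n), and /dev/VG1/mgs is taken from the setup quoted below:

  resource mgs {
    protocol C;                      # synchronous replication, so either node can take over
    on fs-mgs-001 {
      device    /dev/drbd0;
      disk      /dev/VG1/mgs;        # backing LVM volume on this node
      address   192.168.21.1:7788;   # placeholder IP:port
      meta-disk internal;
    }
    on fs-mgs-002 {
      device    /dev/drbd0;
      disk      /dev/VG1/mgs;
      address   192.168.21.2:7788;   # placeholder IP:port
      meta-disk internal;
    }
  }

After "drbdadm create-md mgs" and "drbdadm up mgs" on both nodes, you would run mkfs.lustre and mount against /dev/drbd0 instead of /dev/VG1/mgs; on failover you just do "drbdadm primary mgs" on the surviving node and mount there, rather than exporting/importing the VG.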
On 05/21/2010 12:14 PM, Gabriele Paciucci wrote:
> Hi,
> be careful with LVM: you must export and import the volume group when
> you move it from one machine to another!
>
> Please refer to: http://kbase.redhat.com/faq/docs/DOC-4124
>
> On 05/21/2010 11:57 AM, leen smit wrote:
>> Ok. I started from scratch, using your kind replies as a guideline.
>> Yet, still no failover when bringing down the first MGS.
>> Below are the steps I've taken to set it up; hopefully someone here
>> can spot my error.
>> I got rid of keepalived and drbd (was this wise, or should I keep
>> these for the MGS/MDT syncing?) and set up just Lustre.
>>
>> Two nodes for MGS/MDT, and two nodes for OSTs.
>>
>> fs-mgs-001:~# mkfs.lustre --mgs --failnode=fs-mgs-...@tcp --reformat /dev/VG1/mgs
>> fs-mgs-001:~# mkfs.lustre --mdt --mgsnode=fs-mgs-...@tcp --failnode=fs-mgs-...@tcp --fsname=datafs --reformat /dev/VG1/mdt
>> fs-mgs-001:~# mount -t lustre /dev/VG1/mgs /mnt/mgs/
>> fs-mgs-001:~# mount -t lustre /dev/VG1/mdt /mnt/mdt/
>>
>> fs-mgs-002:~# mkfs.lustre --mgs --failnode=fs-mgs-...@tcp --reformat /dev/VG1/mgs
>> fs-mgs-002:~# mkfs.lustre --mdt --mgsnode=fs-mgs-...@tcp --failnode=fs-mgs-...@tcp --fsname=datafs --reformat /dev/VG1/mdt
>> fs-mgs-002:~# mount -t lustre /dev/VG1/mgs /mnt/mgs/
>> fs-mgs-002:~# mount -t lustre /dev/VG1/mdt /mnt/mdt/
>
> This is an error ^ ... don't do it!!!
>
>> fs-ost-001:~# mkfs.lustre --ost --mgsnode=fs-mgs-...@tcp --mgsnode=fs-mgs-...@tcp --failnode=fs-ost-...@tcp --reformat --fsname=datafs /dev/VG1/ost1
>> fs-ost-001:~# mount -t lustre /dev/VG1/ost1 /mnt/ost/
>>
>> fs-ost-002:~# mkfs.lustre --ost --mgsnode=fs-mgs-...@tcp --mgsnode=fs-mgs-...@tcp --failnode=fs-ost-...@tcp --reformat --fsname=datafs /dev/VG1/ost1
>> fs-ost-002:~# mount -t lustre /dev/VG1/ost1 /mnt/ost/
>
> This is an error ^ ... don't do it!!!
>
> The correct way is (WARNING: please use the IP address):
>
> fs-mgs-001:~# mkfs.lustre --mgs --failnode=fs-mgs-...@tcp --reformat /dev/VG1/mgs
> fs-mgs-001:~# mount -t lustre /dev/VG1/mgs /mnt/mgs/
>
> fs-mgs-001:~# mkfs.lustre --mdt --mgsnode=fs-mgs-...@tcp --failnode=fs-mgs-...@tcp --fsname=datafs --reformat /dev/VG1/mdt
> fs-mgs-001:~# mount -t lustre /dev/VG1/mdt /mnt/mdt/
>
> fs-ost-001:~# mkfs.lustre --ost --mgsnode=fs-mgs-...@tcp --failnode=fs-ost-...@tcp --reformat --fsname=datafs /dev/VG1/ost1
> fs-ost-001:~# mount -t lustre /dev/VG1/ost1 /mnt/ost/
>
> Just this; nothing to do on the second node!!!
>
> mount -t lustre fs-mgs-...@tcp:fs-mgs-...@tcp:/datafs /data
>
> Bye
>
>> fs-mgs-001:~# lctl dl
>> 0 UP mgs MGS MGS 7
>> 1 UP mgc mgc192.168.21...@tcp 5b8fb365-ae8e-9742-f374-539d8876276f 5
>> 2 UP mgc mgc127.0....@tcp 380bc932-eaf3-9955-7ff0-af96067a2487 5
>> 3 UP mdt MDS MDS_uuid 3
>> 4 UP lov datafs-mdtlov datafs-mdtlov_UUID 4
>> 5 UP mds datafs-MDT0000 datafs-MDT0000_UUID 5
>> 6 UP osc datafs-OST0000-osc datafs-mdtlov_UUID 5
>> 7 UP osc datafs-OST0001-osc datafs-mdtlov_UUID 5
>>
>> fs-mgs-001:~# lctl list_nids
>> 192.168.21...@tcp
>>
>> client:~# mount -t lustre 192.168.21...@tcp:192.168.21...@tcp:/datafs /data
>> client:~# time cp test.file /data/
>> real    0m47.793s
>> user    0m0.001s
>> sys     0m3.155s
>>
>> So far, so good.
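(An aside on Gabriele's LVM warning above: moving a volume group that lives on shared storage from one node to the other, per that kbase article, is roughly the following. This is an untested sketch using the host and device names from this thread:

  fs-mgs-001:~# umount /mnt/mdt && umount /mnt/mgs
  fs-mgs-001:~# vgchange -an VG1        # deactivate the volume group
  fs-mgs-001:~# vgexport VG1            # mark it as exported
  fs-mgs-002:~# pvscan                  # rescan physical volumes on the other node
  fs-mgs-002:~# vgimport VG1
  fs-mgs-002:~# vgchange -ay VG1        # activate it here
  fs-mgs-002:~# mount -t lustre /dev/VG1/mgs /mnt/mgs

Note also that running mkfs.lustre --reformat on both nodes, as in the steps above, creates two independent targets rather than one failover pair, which is the error Gabriele flags.)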
>> Let's try that again, now bringing down mgs-001:
>>
>> client:~# time cp test.file /data/
>>
>> fs-mgs-001:~# umount /mnt/mdt && umount /mnt/mgs
>>
>> fs-mgs-002:~# mount -t lustre /dev/VG1/mgs /mnt/mgs
>> fs-mgs-002:~# mount -t lustre /dev/VG1/mdt /mnt/mdt
>> fs-mgs-002:~# lctl dl
>> 0 UP mgs MGS MGS 5
>> 1 UP mgc mgc192.168.21...@tcp 82b34916-ed89-f5b9-026e-7f8e1370765f 5
>> 2 UP mdt MDS MDS_uuid 3
>> 3 UP lov datafs-mdtlov datafs-mdtlov_UUID 4
>> 4 UP mds datafs-MDT0000 datafs-MDT0000_UUID 3
>>
>> The OSTs are missing here, so I (try to...) remount them too:
>>
>> fs-ost-001:~# umount /mnt/ost/
>> fs-ost-001:~# mount -t lustre /dev/VG1/ost1 /mnt/ost/
>> mount.lustre: mount /dev/mapper/VG1-ost1 at /mnt/ost failed: No such
>> device or address
>> The target service failed to start (bad config log?)
>> (/dev/mapper/VG1-ost1). See /var/log/messages.
>>
>> After this I can only get back to a running state by unmounting
>> everything on mgs-002 and remounting on mgs-001.
>> What am I missing here? Am I messing things up by creating two MGS
>> targets, one on each MGS node?
>>
>> Leen
>>
>> On 05/20/2010 03:40 PM, Gabriele Paciucci wrote:
>>> For clarification, in a two-server configuration:
>>>
>>> server1 -> 192.168.2.20  MGS+MDT+OST0
>>> server2 -> 192.168.2.22  OST1
>>> /dev/sdb is a LUN shared between server1 and server2
>>>
>>> from server1: mkfs.lustre --mgs --failnode=192.168.2.22 --reformat /dev/sdb1
>>> from server1: mkfs.lustre --reformat --mdt --mgsnode=192.168.2.20 --fsname=prova --failover=192.168.2.22 /dev/sdb4
>>> from server1: mkfs.lustre --reformat --ost --mgsnode=192.168.2.20 --failover=192.168.2.22 --fsname=prova /dev/sdb2
>>> from server2: mkfs.lustre --reformat --ost --mgsnode=192.168.2.20 --failover=192.168.2.20 --fsname=prova /dev/sdb3
>>>
>>> from server1: mount -t lustre /dev/sdb1 /lustre/mgs_prova
>>> from server1: mount -t lustre /dev/sdb4 /lustre/mdt_prova
>>> from server1: mount -t lustre /dev/sdb2 /lustre/ost0_prova
>>> from server2: mount -t lustre /dev/sdb3 /lustre/ost1_prova
>>>
>>> from client:
>>> modprobe lustre
>>> mount -t lustre 192.168.2...@tcp:192.168.2...@tcp:/prova /prova
>>>
>>> Now halt server1 and mount the MGS, MDT, and OST0 on server2; the
>>> client should continue its activity without problems.
>>>
>>> On 05/20/2010 02:55 PM, Kevin Van Maren wrote:
>>>> leen smit wrote:
>>>>> Ok, no VIPs then... But how does failover work in Lustre, then?
>>>>> If I set everything up using the real IP and then mount from a
>>>>> client and bring down the active MGS, the client will just sit
>>>>> there until it comes back up again.
>>>>> As in, there is no failover to the second node. So how does this
>>>>> internal Lustre failover mechanism work?
>>>>>
>>>>> I've been going through the docs, and I must say there is very
>>>>> little on the failover mechanism, apart from mentions that a
>>>>> separate app should take care of that. That's the reason I'm
>>>>> implementing keepalived...
>>>>
>>>> Right: the external service needs to keep the "mount" active/healthy
>>>> on one of the servers. Lustre handles reconnecting clients/servers
>>>> as long as the volume is mounted where it expects (i.e., the mkfs
>>>> node or the --failover node).
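(To illustrate the "external service" Kevin describes: with Pacemaker, for example, the target mounts can be modelled as Filesystem resources that run on exactly one node at a time. A hypothetical crm snippet, untested, reusing the device and mount-point names from this thread:

  primitive mgs-fs ocf:heartbeat:Filesystem \
      params device="/dev/VG1/mgs" directory="/mnt/mgs" fstype="lustre" \
      op monitor interval="120s" timeout="60s"
  primitive mdt-fs ocf:heartbeat:Filesystem \
      params device="/dev/VG1/mdt" directory="/mnt/mdt" fstype="lustre" \
      op monitor interval="120s" timeout="60s"
  group mds-group mgs-fs mdt-fs          # keep MGS and MDT together
  location prefer-001 mds-group 100: fs-mgs-001

The cluster manager then performs the umount/mount on failover, and Lustre's own recovery handles reconnecting the clients, exactly as Kevin says.)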
>>>>> At this stage I really am clueless, and can only think of creating
>>>>> a TUN interface which would carry the VIP address (thus it becomes
>>>>> a real IP, not just a VIP).
>>>>> But I have a feeling that isn't the right approach either...
>>>>> Are there any docs available where an active/passive MGS setup is
>>>>> described?
>>>>> Is it sufficient to define a --failnode=nid,... at creation time?
>>>>
>>>> Yep. See Johann's email on the MGS, but for the MDTs and OSTs that's
>>>> all you have to do (besides listing both MGS NIDs at mkfs time).
>>>>
>>>> For the clients, you specify both MGS NIDs at mount time, so the
>>>> client can mount regardless of which node has the active MGS.
>>>>
>>>> Kevin
>>>>
>>>>> Any help would be greatly appreciated!
>>>>>
>>>>> Leen
>>>>>
>>>>> On 05/20/2010 01:45 PM, Brian J. Murrell wrote:
>>>>>> On Thu, 2010-05-20 at 12:46 +0200, leen smit wrote:
>>>>>>> Keepalived uses a VIP in an active/passive state. In a failover
>>>>>>> situation the VIP gets transferred to the passive node.
>>>>>>
>>>>>> Don't use virtual IPs with Lustre. Lustre clients know how to deal
>>>>>> with failover nodes that have different IP addresses, and using a
>>>>>> virtual, floating IP address will just confuse it.
>>>>>>
>>>>>> b.

_______________________________________________
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss