Re: [Lustre-discuss] How smart is Lustre?
In my experience, if there is a particular multipathing driver from the vendor, go for that. In our setup we have Oracle/Sun disk arrays, and with the standard Linux multipathing daemon I would get lots of strange I/O errors. It turned out the disk arrays had picked their preferred path, but Linux was trying to talk to the LUNs on both paths and would only receive a response on the preferred one. There is an RDAC driver that can be installed instead. Simply disable the multipathing daemon, or configure it to ignore the disk arrays, and use the vendor solution. After that I had no more I/O errors (which had only served to slow down the boot process anyway).

On Wed, Dec 19, 2012 at 11:36 AM, Jason Brooks wrote:

> Hello,
>
> I am building a 2.3.x filesystem right now, and I am looking at setting up
> some active-active failover abilities for my OSSes. I have been looking at
> Dell's MD3xxx arrays, as they have redundant controllers and allow up to
> four hosts to connect to each controller.
>
> I can see how Linux multipath can be used with redundant disk
> controllers. I can even (slightly) understand how Lustre fails over when
> an OSS goes down.
>
> 1. Is Lustre smart enough to use redundant paths, or to fail over OSSes
>    if an OSS is congested? (It would be cool, no?)
> 2. Does the Linux multipath module slow performance?
> 3. How much does a RAID array such as the one listed above act as a
>    bottleneck, say if I have as many volumes available on the RAID
>    controllers as there are OSS hosts?
> 4. Are there arrays similar to Dell's model that would work?
>
> Thanks!
>
> --jason

___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss
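The "configure it to ignore the disk arrays" step is normally done with a blacklist stanza in /etc/multipath.conf. A minimal sketch, assuming a hypothetical WWID prefix (the thread does not give the actual WWIDs; check yours with `multipath -ll`):

```
# /etc/multipath.conf -- stop multipathd from touching the vendor-managed
# arrays so the RDAC driver handles their paths instead.
blacklist {
    # Hypothetical WWID pattern for illustration; substitute the WWIDs
    # that `multipath -ll` reports for your arrays.
    wwid "3600a0b80.*"
}
```

After editing, reload the daemon (e.g. `service multipathd reload` on a RHEL/CentOS-era system) so the blacklist takes effect.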
[Lustre-discuss] Errors after restarting: start index not found?
Working with a set of test systems: 4 OST machines and 1 combined MDT/MGT, all running the 2.6.32-279.2.1.el6_lustre.gc46c389.x86_64 kernel on CentOS 6.3.

The systems lost power due to a UFU. When they were brought back up, the IP of the MDT had changed. To get them working again I followed the steps listed at:

http://wiki.lustre.org/manual/LustreManual20_HTML/LustreMaintenance.html

The MDT reported everyone up and happy, and 'lctl dl' shows what I would expect is OK. Mounting a client seems to work, in that pre-existing files and folders are found. *But* when I try to 'touch tmp' I get:

touch: setting times of 'tmp': no such file or directory

When I look at 'dmesg' on the MDT machine I see these errors:

LustreError: 2937:0:(lov_qos.c:721:alloc_specific()) Start index 0 not found in pool ''

There weren't any pools set up prior to the reboot/rebuild. Did I miss a step in fixing things? Is there something I can do to fix this now? I'm preparing to deploy a much larger setup, so I'm hoping to get the various processes understood soon.
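Since the error mentions a pool, one thing worth checking is what pools the MGS actually knows about. A sketch using the standard lctl pool interface, run on the node where the MGS is mounted, assuming the filesystem is named "lustre" (substitute your own fsname; this is illustrative, not a confirmed fix for this error):

```shell
# List any OST pools defined for the hypothetical filesystem "lustre".
lctl pool_list lustre

# If a stale pool turns up, it can be emptied and removed, e.g.:
#   lctl pool_remove lustre.<poolname> <fsname-OSTxxxx ...>
#   lctl pool_destroy lustre.<poolname>
```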
Re: [Lustre-discuss] Restoration of mds is of a different size!
On 2012-12-18, at 16:39, "Jason Brooks" wrote:

> I am currently using lustre 1.8. I am relatively new with lustre. At the
> moment, I am trying to move an mds/mgs system from one disk to another.
>
> I used the version of tar modified by Whamcloud in order to dump and restore
> the filesystem. I used mkfs.lustre to create a new mds/mgs filesystem on my
> /dev/sdc. I then mounted the old mds at /mnt/mdsold and the new mds at
> /mnt/mdsnew using filesystem type "ldiskfs".
>
> I replicated the filesystem with the following command sequence:
>
> cd /mnt/mdsold && tar --sparse --xattrs -cf - . | (cd /mnt/mdsnew && tar --sparse --xattrs -xvpf -)

Looks correct.

> However, look at the output of df:
>
> Filesystem   Size  Used  Avail  Use%  Mounted on
> /dev/sdd     2.3T  7.9G  2.1T   1%    /mnt/mdsold
> /dev/sdc     815G  3.1G  765G   1%    /mnt/mdsnew
>
> Now, why would the new filesystem use 3.1 gigabytes when the old one takes up
> 7.9G?

Perhaps different inode size, or fewer total inodes?

> ===
>
> Side note question: I have used the same version of tar to get a file-level
> backup of my mds:
>
> tar --xattrs --sparse -cf - . > sdd.tar
>
> The file created was 61 gigabytes in size.
>
> Filesystem   Size  Used  Avail  Use%  Mounted on
> /dev/sdd     2.3T  7.9G  2.1T   1%    /mnt/mdsold
>
> Why is this? Do the attributes take up extra space?

Tar is padding each file up to 20kB or something like that.

Cheers, Andreas
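The dump-and-restore pipeline can be tried on ordinary directories first. A minimal sketch, assuming GNU tar; /tmp/mdsold-demo and /tmp/mdsnew-demo are stand-ins for illustration, not the real ldiskfs mount points, and the real procedure uses the Whamcloud-patched tar:

```shell
# Create stand-in source and target directories (hypothetical paths).
mkdir -p /tmp/mdsold-demo /tmp/mdsnew-demo
echo "object index" > /tmp/mdsold-demo/oi.16

# Pipe a create (-c) archive straight into an extract (-xp) on the target,
# preserving sparse regions and extended attributes, as in the original command.
(cd /tmp/mdsold-demo && tar --sparse --xattrs -cf - .) \
  | (cd /tmp/mdsnew-demo && tar --sparse --xattrs -xpf -)

cat /tmp/mdsnew-demo/oi.16
```

On the 61 GB archive: tar stores every file as a 512-byte header plus data rounded up to a 512-byte block, so millions of tiny MDS files inflate the archive far beyond the data they hold, which is consistent with Andreas's padding explanation.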
Re: [Lustre-discuss] LNET over multiple NICs
On 12/19/12 2:03 PM, "Alexander Oltu" wrote:

>> I have no experience in doing multirail on ethernet, sorry. The
>> principle is exactly the same as for Infiniband, but as Infiniband
>> interfaces cannot be bonded (except for IPoIB, which is not of
>> interest when considering performance), I cannot tell.
>
> Looks like the --network option is not helping. I have tried to unmount
> the OSTs and run for half of the OSTs:
>
> tunefs.lustre --network=tcp0
>
> and for the other half:
>
> tunefs.lustre --network=tcp1
>
> Then I mounted the OSTs back. The client still sends all traffic through
> tcp0. I have a suspicion that the --network option is meant to separate
> different network types, like tcp, o2ib, etc.
>
> Probably I will go the bonding way.

For Ethernet, bonding is the preferred method. Use the standard Linux method, and then point Lustre at the 'bond' interface. The modprobe option 'networks=' is used to bind hardware interfaces; for kernel bonding you would normally use:

options lnet networks=tcp(bond0)

See
http://wiki.lustre.org/manual/LustreManual20_HTML/SettingUpBonding.html#50438258_99571
for more (perhaps slightly dated) information.

cliffw

> Thank you for your help,
> Alex.
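The two pieces Cliff describes (a standard Linux bond, then LNET pointed at it) can be sketched as config fragments. The bonding mode, interface names, and address below are illustrative assumptions, not taken from the thread:

```
# /etc/modprobe.d/bonding.conf -- standard Linux bonding driver;
# mode and miimon are illustrative choices, tune for your network.
options bonding mode=balance-rr miimon=100

# /etc/sysconfig/network-scripts/ifcfg-bond0 (RHEL/CentOS style),
# with eth0/eth1 enslaved via their own ifcfg files:
#   DEVICE=bond0
#   IPADDR=192.168.1.10   <- hypothetical address
#   BOOTPROTO=none
#   ONBOOT=yes

# /etc/modprobe.d/lustre.conf -- point LNET at the bonded interface
# rather than at either physical NIC:
options lnet networks=tcp(bond0)
```

With this in place, Lustre sees a single tcp NID on bond0 and the bonding driver, not LNET, handles spreading traffic across the physical links.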