On May 10, 2009, at 7:12 AM, Arden Wiebe <[email protected]> wrote:
> Mag, you're welcome. From the page referenced first in a search for
> Linux bonding, it states:
>
>   How many bonding devices can I have?
>   There is no limit.
>
>   How many slaves can a bonding device have?
>   This is limited only by the number of network interfaces Linux
>   supports and/or the number of network cards you can place in your
>   system. In practice, most configurations are limited to the
>   (typical) 4 or 8 maximum supported by the switch you are using.
>
> --- On Sun, 5/10/09, Mag Gam <[email protected]> wrote:
>
>> From: Mag Gam <[email protected]>
>> Subject: Re: [Lustre-discuss] tcp network load balancing understanding lustre 1.8
>> To: "Arden Wiebe" <[email protected]>
>> Cc: "Andreas Dilger" <[email protected]>, "Michael Ruepp" <[email protected]>, [email protected]
>> Date: Sunday, May 10, 2009, 5:48 AM
>>
>> Thanks for the screenshot, Arden.
>>
>> What is the maximum number of slaves you can have on a bonded
>> interface?
>>
>> On Sun, May 10, 2009 at 12:15 AM, Arden Wiebe <[email protected]> wrote:
>>>
>>> Bond0 knows which interfaces to utilize because all the other
>>> eth0-5 interfaces are designated as slaves in their configuration
>>> files. The manual is fairly clear on that.
>>>
>>> In the screenshot, the memory used in GNOME System Monitor is
>>> 452.4 MiB of 7.8 GiB, and the sustained bandwidth to the OSS and
>>> OST is 404.2 MiB/s, which corresponds roughly to what collectl
>>> shows as KBWrite for Disks. Collectl shows a few different results
>>> for Disks, Network, and Lustre OST, and I believe it is measuring
>>> the other OST on the network at around 170 MiB/s, if you view the
>>> other screenshot for OST1 or lustrethree.
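[Editor's note: the slave designation described above typically lives in per-interface config files on Red Hat-style systems of that era. A minimal sketch, assuming six NICs enslaved to bond0 — device names, addresses, and the balance-rr mode are illustrative, not taken from the thread:]

```ini
# /etc/sysconfig/network-scripts/ifcfg-bond0  (illustrative address)
DEVICE=bond0
IPADDR=10.111.20.35
NETMASK=255.255.255.0
ONBOOT=yes
BOOTPROTO=none
BONDING_OPTS="mode=balance-rr miimon=100"

# /etc/sysconfig/network-scripts/ifcfg-eth0  (repeat for eth1..eth5)
DEVICE=eth0
MASTER=bond0
SLAVE=yes
ONBOOT=yes
BOOTPROTO=none
```

[Once the bond is up, `cat /proc/net/bonding/bond0` lists the active slaves and their link status, which answers Andreas's question below about how bond0 knows which interfaces to use.]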
>>> In the screenshots: Lustreone=MGS, Lustretwo=MDT,
>>> Lustrethree=OSS+RAID10 target, Lustrefour=OSS+RAID10 target.
>>>
>>> To help clarify the entire network, the stress testing I did with
>>> all the clients I could give it is at
>>> www.ioio.ca/Lustre-tcp-bonding/images/html and
>>> www.ioio.ca/Lustre-tcp-bonding/Lustre-notes/images.html
>>>
>>> Proper benchmarking would be nice, though; I just hit it with
>>> everything I could and it lived, so I was happy. I found the
>>> manual to be lacking on benchmarking and really wanted to make
>>> nice graphs of it all, but for some reason failed to do so with
>>> iozone.
>>>
>>> I'll take a run at upgrading everything to 1.8 in the coming week
>>> or so, and when I do I'll grab some new screenshots and post the
>>> relevant items to the wiki. Otherwise, if someone else wants to
>>> post the existing screenshots, you're welcome to use them, as they
>>> do detail a ground-up build. Apparently 1.8 is great with small
>>> files now, so it should work even better with
>>> www.oil-gas.ca/phpsysinfo and www.linuxguru.ca/phpsysinfo
>>>
>>> --- On Sat, 5/9/09, Andreas Dilger <[email protected]> wrote:
>>>
>>>> From: Andreas Dilger <[email protected]>
>>>> Subject: Re: [Lustre-discuss] tcp network load balancing understanding lustre 1.8
>>>> To: "Arden Wiebe" <[email protected]>
>>>> Cc: [email protected], "Michael Ruepp" <[email protected]>
>>>> Date: Saturday, May 9, 2009, 11:31 AM
>>>>
>>>> On May 09, 2009 09:18 -0700, Arden Wiebe wrote:
>>>>> This might help answer some questions.
>>>>> http://ioio.ca/Lustre-tcp-bonding/OST2.png shows my mostly
>>>>> untuned OSS and OSTs pulling 400+ MiB/s over TCP bonding
>>>>> provided by the kernel, complete with a cat of the modprobe.conf
>>>>> file. You have the other links I've sent you, but the picture
>>>>> above is relevant to your questions.
>>>>
>>>> Arden, thanks for sharing this info. Any chance you could post
>>>> it to wiki.lustre.org?
>>>> It would seem there is one bit of info missing somewhere -
>>>> how does bond0 know which interfaces to use?
>>>>
>>>> Also, another oddity - the network monitor is showing 450 MiB/s
>>>> received, yet the disk is showing only about 170 MiB/s going to
>>>> the disk. Either something is wacky with the monitoring (e.g. it
>>>> is counting Received for both the eth* networks AND bond0), or
>>>> Lustre is doing something very weird and retransmitting the bulk
>>>> data like crazy (seems unlikely).
>>>>
>>>>> --- On Thu, 5/7/09, Michael Ruepp <[email protected]> wrote:
>>>>>
>>>>>> From: Michael Ruepp <[email protected]>
>>>>>> Subject: [Lustre-discuss] tcp network load balancing understanding lustre 1.8
>>>>>> To: [email protected]
>>>>>> Date: Thursday, May 7, 2009, 5:50 AM
>>>>>>
>>>>>> Hi there,
>>>>>>
>>>>>> I have configured a simple tcp Lustre 1.8 setup with one MDC
>>>>>> (one NIC) and two OSSes (four NICs per OSS). As in the 1.6
>>>>>> documentation, the multihomed section is a little bit unclear
>>>>>> to me.
>>>>>>
>>>>>> I give every NID an IP in the same subnet, e.g.:
>>>>>> 10.111.20.35-38 - oss0
>>>>>> and 10.111.20.39-42 - oss1
>>>>>>
>>>>>> Do I have to make modprobe.conf.local look like this to force
>>>>>> Lustre to use all four interfaces in parallel:
>>>>>>
>>>>>> options lnet networks=tcp0(eth0,eth1,eth2,eth3)
>>>>>>
>>>>>> Because on page 138 the 1.8 manual says:
>>>>>> "Note – In the case of TCP-only clients, the first available
>>>>>> non-loopback IP interface is used for tcp0 since the
>>>>>> interfaces are not specified."
>>>>>>
>>>>>> Or do I have to specify it like this:
>>>>>> options lnet networks=tcp
>>>>>>
>>>>>> Because on page 112 the Lustre 1.6 manual says:
>>>>>> "Note – In the case of TCP-only clients, all available IP
>>>>>> interfaces are used for tcp0 since the interfaces are not
>>>>>> specified.
>>>>>> If there is more than one, the IP of the first one found is
>>>>>> used to construct the tcp0 ID."
>>>>>>
>>>>>> Which is the opposite of the 1.8 manual.
>>>>>>
>>>>>> My goal is to let Lustre utilize all four Gb links in
>>>>>> parallel. And my Lustre clients are equipped with two Gb links
>>>>>> (eth0, eth1), which should be utilized by the clients as well.
>>>>>>
>>>>>> Or is bonding the better solution in terms of performance?
>>>>>>
>>>>>> Thanks very much for input,
>>>>>>
>>>>>> Michael Ruepp
>>>>>> Schwarzfilm AG
>>>>>>
>>>>>> _______________________________________________
>>>>>> Lustre-discuss mailing list
>>>>>> [email protected]
>>>>>> http://lists.lustre.org/mailman/listinfo/lustre-discuss
>>>>
>>>> Cheers, Andreas
>>>> --
>>>> Andreas Dilger
>>>> Sr. Staff Engineer, Lustre Group
>>>> Sun Microsystems of Canada, Inc.

_______________________________________________
Lustre-discuss mailing list
[email protected]
http://lists.lustre.org/mailman/listinfo/lustre-discuss
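[Editor's note: the two configurations debated in this thread are mutually exclusive ways to use four NICs. A sketch of the corresponding modprobe.conf.local entries — the `tcp0(eth0,eth1,eth2,eth3)` line is quoted from the thread; the `tcp0(bond0)` variant is an assumption based on the bonding setup described above, and the interface names are illustrative:]

```ini
# /etc/modprobe.conf.local - two alternatives, do not combine

# (a) Linux bonding: eth0-eth3 are enslaved to bond0 by the OS,
#     and LNET is pointed at the single bonded interface:
options lnet networks=tcp0(bond0)

# (b) No bonding: hand all four interfaces to LNET directly:
options lnet networks=tcp0(eth0,eth1,eth2,eth3)
```

[With (a), load balancing across the physical links is done by the kernel bonding driver and is invisible to Lustre; with (b), LNET itself sees four interfaces on tcp0. The manual passages quoted above only describe what happens when no interfaces are listed at all.]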
