Re: [Lustre-discuss] OST unavailable tests

2009-07-07 Thread Shantanu S Pavgi
Hello again, I am struggling to understand the recovery procedure (or how to abort it) after an OST becomes unavailable. I have tried deactivating the corresponding OSC on the MDS, but it hasn't worked so far. - How do I stop an already connected client from hanging on the df command? If a new client (not connected
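
For reference, taking a failed OST out of the picture is normally done with lctl; a minimal sketch, assuming Lustre 1.6 and that the device indices below are placeholders for your own configuration:

  # On the MDS: find the device index of the OSC that points at the failed OST
  lctl dl | grep osc

  # Deactivate it so the MDS stops waiting on that OST
  lctl --device 11 deactivate

  # The same deactivate, run on an already-mounted client, should let df/stat
  # skip the dead OST instead of hanging (the index will differ per node)
  lctl --device 7 deactivate

  # If recovery itself hangs once the OST is back, it can be aborted on the
  # server holding that target (hedged: check your version's lctl help)
  lctl --device <ost-device-index> abort_recovery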

Re: [Lustre-discuss] Bonded client interfaces and 10GbE server

2009-07-07 Thread Isaac Huang
On Tue, Jul 07, 2009 at 11:44:39AM -0400, Isaac Huang wrote:
> ..
> > If I would attach the OSS with a single 10GbE link, could
> > a client then use the second link, when striping over targets
> > on same OSS?
>
> There's a rather complex way of static configuration to allow for
> better over
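
As a rough illustration of the kind of static LNET configuration being alluded to (not a complete recipe; the interface names and address ranges below are made up):

  # On the OSS: present each NIC as its own LNET network, giving the OSS two NIDs
  options lnet networks="tcp0(eth0),tcp1(eth1)"

  # Clients can then be split statically across the two networks, e.g. by their
  # own IP address, so that in aggregate both OSS links get used
  options lnet ip2nets="tcp0 192.168.1.[2-127]; tcp1 192.168.1.[128-253]"

This spreads different clients across the two links; it still does not let a single client exceed one link's bandwidth to one OSS.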

Re: [Lustre-discuss] Bonded client interfaces and 10GbE server

2009-07-07 Thread Isaac Huang
On Tue, Jul 07, 2009 at 03:44:32PM +0200, Ralf Utermann wrote:
> Dear list,
>
> we have setup of OSS and some clients with a dual Gigabit
> trunk (miimon=100 mode=802.3ad xmit_hash_policy=layer3+4).

If I understand it correctly, xmit_hash_policy=layer3+4 would not allow a single TCP connection to

[Lustre-discuss] One client node freezes at random

2009-07-07 Thread Jeremy Mann
Lustre 1.6.7.1 - Kernel 2.6.22.14. I have one client that randomly loses its Lustre mount. I can still SSH to the client, and df reports "df: `/lustre: Input/output error". However, dmesg, /var/log/kern and /var/log/message do not show any kind of error that can tell me why it is losing the Lustre mount
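
A few things that are usually worth checking on such a client before digging deeper; a minimal sketch, assuming the standard Lustre 1.6 /proc layout:

  # Lustre's own health view of the node
  cat /proc/fs/lustre/health_check

  # Device list with connection state for the MDC/OSCs
  lctl dl

  # Evictions and reconnects usually land in the kernel ring buffer
  dmesg | egrep -i 'lustre|lnet'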

[Lustre-discuss] Bonded client interfaces and 10GbE server

2009-07-07 Thread Ralf Utermann
Dear list, we have a setup of OSS and some clients with a dual Gigabit trunk (miimon=100 mode=802.3ad xmit_hash_policy=layer3+4). If the clients stripe over targets on different OSSs, they see dual-link bandwidth. If, however, they stripe over targets on the same OSS, they only get the bandwidth of one
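
For anyone wanting to reproduce this setup, the quoted parameters are ordinary bonding module options; a minimal sketch, assuming eth0/eth1 are the slaves and a RHEL-style network configuration:

  # /etc/modprobe.conf (or a file under /etc/modprobe.d/)
  alias bond0 bonding
  options bonding miimon=100 mode=802.3ad xmit_hash_policy=layer3+4

  # /etc/sysconfig/network-scripts/ifcfg-eth0 (and likewise for eth1)
  DEVICE=eth0
  MASTER=bond0
  SLAVE=yes
  ONBOOT=yes

Note that with a layer3+4 hash any single TCP connection still maps to exactly one slave, which matches the single-OSS behaviour discussed in this thread.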

Re: [Lustre-discuss] MDT crash: ll_mdt at 100%

2009-07-07 Thread Mag Gam
So, are you all good now? Thanks for the explanation, BTW!

On Tue, Jul 7, 2009 at 7:42 AM, Thomas Roth wrote:
> Hi,
>
> Mag Gam wrote:
>> Exactly the symptoms I had. How long were you running this for? Also,
>> how easy is it for you to reproduce this error?
>
> the MDS-going-on-strike - instances

Re: [Lustre-discuss] MDT crash: ll_mdt at 100%

2009-07-07 Thread Thomas Roth
Hi,

Mag Gam wrote:
> Exactly the symptoms I had. How long were you running this for? Also,
> how easy is it for you to reproduce this error?

The MDS-going-on-strike instances have happened only twice since we upgraded the cluster from Lustre 1.6.5.1 to 1.6.7.1 at the end of April. Since last week everything

Re: [Lustre-discuss] [ROMIO Req #940] a new Lustre ADIO driver]

2009-07-07 Thread pascal . deveze
Rob,

>> 2) The parameter len_list_ptr has been modified in include/adioi.h, so I
>> propose to change:
>>     int **len_list_ptr;
>> to
>>     ADIO_Offset **len_list_ptr;
>>
>> in ad_lustre_aggregate.c and ad_lustre_wrcoll.c
>
> No argument here. That's clearly the right