Re: [Lustre-discuss] Problems with MDS Crashing

2010-05-20 Thread Andrew Godziuk
We have had another hang, but this time we had KVM access to the machine (and the screen blanker wasn't on). I took some screenshots: the first one is an error I got after reboot, the BMP one is what I saw when I first logged in to KVM, and the others are what I saw when trying to type 'root'

Re: [Lustre-discuss] Problems with MDS Crashing

2010-05-18 Thread Gregory Matthews
Gary Brooks wrote:
> Then, all of a sudden the MDS stops responding, ssh sessions die and only a hard restart helps. After the restart, /var/log/messages contains normal information (some timeout chit-chat).

Is your hardware using the bnx2 NIC driver? We've just been seeing very similar

Re: [Lustre-discuss] Problems with MDS Crashing

2010-05-18 Thread Gary Brooks
Greg, we are using CENT and 3ware RAID adapters. We had the MDS on RAID 0 (by accident) and changed it out for a RAID 10 machine with a 3ware card. We did update the firmware on the 3ware cards and changed the stripe size down to 64k. It was at 256. After changing out the equipment the

[Lustre-discuss] Problems with MDS Crashing

2010-05-12 Thread Gary Brooks
Help Needed: We're having trouble with our MDS server. There is nothing suspicious in the logs; at some point they simply stop being written. The scenario is as follows: we have an MDS running on DRBD, 2 OSS and ca. 10 clients. The traffic pattern is lots of small file reads and writes. We

Re: [Lustre-discuss] Problems with MDS Crashing

2010-05-12 Thread Andreas Dilger
On 2010-05-12, at 15:42, Gary Brooks wrote:
> We're having trouble with our MDS server. There is nothing suspicious in the logs; at some point they simply stop being written. The scenario is as follows: we have an MDS running on DRBD, 2 OSS and ca. 10 clients. The traffic pattern is lots