In parallel to working through MOS, I thought I would mention a serious problem 
we have encountered with QFS clients crashing and see if anyone else has 
experienced anything similar.

In particular, with fully patched S10x86 and SAM-QFS 5.2.2 we have found that a 
QFS shared client of an ma filesystem operating as an NFS server can hard lock, 
or kernel panic, when there is sustained NFS write traffic to that server in 
excess of the QFS disk bandwidth available to store those writes. What appears 
to happen is that excessive NFS write traffic uses up the memory on the QFS 
client until network drivers (and other kernel tasks) start complaining about 
unavailable memory and the system falls over.
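
As a rough illustration of the mechanism described above, the following Python 
sketch estimates how long a client could absorb writes before free memory is 
gone. It is a back-of-the-envelope model only: the function name and all three 
parameters are hypothetical, and it assumes every byte received over NFS but 
not yet written to disk consumes one byte of free memory, which is a 
simplification rather than a measured property of QFS.

```python
def seconds_until_memory_exhausted(free_bytes, nfs_in_bytes_per_s,
                                   disk_out_bytes_per_s):
    """Estimate seconds until free memory is used up by unwritten NFS data.

    Hypothetical model: the backlog grows at (incoming NFS write rate minus
    disk write rate) and is held entirely in memory. Returns None when the
    disk keeps up and no backlog accumulates.
    """
    backlog_rate = nfs_in_bytes_per_s - disk_out_bytes_per_s
    if backlog_rate <= 0:
        return None  # disk bandwidth covers the incoming writes
    return free_bytes / backlog_rate


if __name__ == "__main__":
    # Illustrative numbers only, not measurements from our servers.
    estimate = seconds_until_memory_exhausted(
        free_bytes=8 * 2**30,            # 8 GiB free
        nfs_in_bytes_per_s=110e6,        # 110 MB/s arriving over NFS
        disk_out_bytes_per_s=80e6,       # 80 MB/s reaching QFS disk
    )
    print(estimate)
```

Under this model the only stable state is one where the incoming rate does not 
exceed the disk rate; any sustained excess exhausts memory in finite time.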

We have seen this on more than one server, and have tried tuning some of the 
obvious QFS write parameters to no avail. If the NFS write traffic is 
temporarily throttled when the QFS client is observed to be running low on 
memory then the kernel free memory drifts back up and all is good when the 
writing is unthrottled (until it runs low on memory again). However, if the 
free memory (as reported by top) drops to near 0 then the kernel crashes (or 
deadlocks).
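
The manual throttling described above amounts to a two-threshold (hysteresis) 
rule, which can be sketched in Python as follows. This is only an illustration 
of the decision logic, not something we run in production: the function name 
and the low_water and high_water thresholds are hypothetical, and how free 
memory is sampled and how the writers are actually slowed down are left out 
entirely.

```python
def next_throttle_state(throttled, free_bytes, low_water, high_water):
    """Decide whether NFS writers should be throttled.

    Hypothetical hysteresis rule:
      - start throttling once free memory falls to low_water or below;
      - stop throttling once free memory recovers to high_water or above;
      - otherwise keep the current state, to avoid rapid flapping.
    """
    if low_water >= high_water:
        raise ValueError("low_water must be below high_water")
    if free_bytes <= low_water:
        return True
    if free_bytes >= high_water:
        return False
    return throttled


if __name__ == "__main__":
    # Illustrative free-memory samples in MB, not real measurements.
    samples = [4000, 1500, 400, 900, 2500, 3500]
    state = False
    for free_mb in samples:
        state = next_throttle_state(state, free_mb,
                                    low_water=500, high_water=3000)
        print(free_mb, "throttled" if state else "unthrottled")
```

The gap between the two thresholds gives the kernel free memory time to drift 
back up before the writers are released again.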

The writes come from NFSv3 Linux clients, but we are not yet sure whether Linux 
or NFSv3 is necessary to expose the failure mode.

Note that there does not seem to be a corresponding problem with aggressive NFS 
reads overloading a QFS shared client. That simply saturates the network, disk, 
and/or CPU, and everything keeps running as expected.

If anyone has any additional data readily available related to shared QFS 
clients/NFS servers crashing (or not) with aggressive NFS writes that would be 
appreciated.

Thanks.

--
Stuart Anderson  [email protected]
http://www.ligo.caltech.edu/~anderson



_______________________________________________
sam-qfs-discuss mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/sam-qfs-discuss