On Wednesday, 03 June 2009, Florian Philipp wrote:
>
> Do you have a spare network adapter, maybe an older 100MBit PCI card?
> Maybe we should rule out a hardware fault on your ethernet chipset first.
>
I already thought about this, but the results of my tests don't indicate a
hardware fault on the ethernet chipset, because:

* I can run ping -f against the machine; it runs for hours without the
slightest problem.
* As long as the files transferred are small enough (i.e. they fit in the
cache buffer on the server) and the server has enough time to write them
back to the disk, there is no problem.
* If I explicitly force the ethernet link to 100FD instead of gigabit,
there is also no problem. So I wouldn't expect any errors with another
100MBit card either.

To me it looks as if the following is happening:

* Memory fills up with cached files; no problem so far.
* When no more physical RAM is available, the system tries to free some
memory internally, e.g. by flushing the caches.
* If releasing cache entries and writing the data back to their respective
files is not fast enough, an internal memory allocation may fail, and I
see the "page allocation failure" messages, with different
processes/kernel threads in the first line.
* I assume that most of the internal kernel threads cope with a failed
allocation in this situation, but there may be some critical parts that
don't. Hence, it might just be a matter of probability whether the kernel
hits such a critical part, and the probability increases with the MB/s at
which data is written to the NFS server.
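
One way to check the hypothesis above would be to watch the memory counters
on the server while a large transfer is running. The following is only a
minimal sketch, assuming the server exposes the usual Linux /proc/meminfo
with the MemFree, Cached, Dirty and Writeback fields; the function names
(parse_meminfo, snapshot, watch) are made up for illustration and are not
part of any existing tool.

```python
import time


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict: field name -> number."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Name: value" pairs
        parts = rest.split()
        if parts and parts[0].isdigit():
            # Most values are in kB; the unit suffix is ignored here.
            fields[name.strip()] = int(parts[0])
    return fields


def snapshot(path="/proc/meminfo",
             keys=("MemFree", "Cached", "Dirty", "Writeback")):
    """Read the given file and return only the counters of interest."""
    with open(path) as f:
        info = parse_meminfo(f.read())
    return {key: info.get(key) for key in keys}


def watch(samples=60, interval=1.0):
    """Print a timestamped snapshot every `interval` seconds."""
    for _ in range(samples):
        print(time.strftime("%H:%M:%S"), snapshot())
        time.sleep(interval)
```

Calling watch() on the server during a transfer and comparing the
timestamps with those of the "page allocation failure" messages in the
kernel log should show whether MemFree is close to zero and Dirty/Writeback
are high at the moment the failures appear.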

Greetings
    Alex
