Rick Macklem <rmack...@uoguelph.ca> wrote in <1914428061.1617223.1357133079421.javamail.r...@erie.cs.uoguelph.ca>:
rm> Hiroki Sato wrote:
rm> > Hello,
rm> >
rm> > I have been in a trouble about my NFS server for a long time. The
rm> > symptom is that it stops working in one or two weeks after a boot. I
rm> > could not track down the cause yet, but it is reproducible and only
rm> > occurred under a very high I/O load.
rm> >
rm> > It did not panic, just stopped working---while it responded to ping,
rm> > userland programs seemed not working. I could break it into DDB and
rm> > get a kernel dump. The following URLs are a log of ps, trace, and
rm> > etc.:
rm> >
rm> > http://people.allbsd.org/~hrs/FreeBSD/pool.log.20130102
rm> > http://people.allbsd.org/~hrs/FreeBSD/pool.dmesg.20130102
rm> >
rm> > Does anyone see how to debug this? I guess this is due to a deadlock
rm> > somewhere. I have suffered from this problem for almost two years.
rm> > The above log is from stable/9 as of Dec 19, but this have persisted
rm> > since 8.X.
rm> >
rm> Well, I took a quick glance at the log and there are a lot of processes
rm> sleeping on "pfault" (in vm_waitpfault() in sys/vm/vm_page.c). I'm no
rm> vm guy, so I'm not sure when/why that will happen. The comment on the
rm> function suggests they are waiting for free pages.
rm>
rm> Maybe something as simple as running out of swap space or a problem
rm> talking to the disk(s) that has the swap partition(s) or ???
rm> (I'm talking through my hat here, because I'm not conversant with
rm> the vm side of things.)
rm>
rm> I might take a closer look this evening and see if I can spot anything
rm> in the log, rick
rm> ps: I hope Alan and Kostik don't mind being added to the cc list.

Thank you. This machine has 24GB of RAM and 30GB of swap. 16GB of the
RAM is used for the ZFS ARC, and I see about 1.5GB free on average.
However, swapouts happen frequently and on a regular basis even when
the I/O load is low. Only 20-30MB of swap was in use, regardless of
the load.

On several occasions I recorded vm.stats and the output of vmstat -z
and vmstat -m every 10 seconds up to the freeze, but
vm.stats.vm.v_free_count was around 300,000 (>1GB) even just before
the freeze.

-- Hiroki
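
For anyone who wants to repeat the sampling described above, here is a minimal sketch in Python. It is not the script used for the measurements in this message. The function names (pages_to_gib, read_sysctl_int, watch_free_pages) are made up for illustration, and the 4096-byte default page size is an assumption; on a live system the sketch reads the real value from hw.pagesize. With 4096-byte pages, 300,000 free pages come to roughly 1.14 GiB, which is consistent with the ">1GB" figure.

```python
import subprocess
import time


def pages_to_gib(pages, page_size=4096):
    """Convert a page count to GiB (page_size defaults to an assumed 4096)."""
    return pages * page_size / 2**30


def read_sysctl_int(name):
    """Read one integer sysctl value by running `sysctl -n <name>`."""
    out = subprocess.run(
        ["sysctl", "-n", name],
        capture_output=True, text=True, check=True,
    ).stdout
    return int(out.strip())


def watch_free_pages(interval=10, samples=None,
                     read=read_sysctl_int, sleep=time.sleep):
    """Print vm.stats.vm.v_free_count every `interval` seconds.

    Runs until interrupted when `samples` is None, otherwise takes
    `samples` readings. Returns the list of readings taken.
    `read` and `sleep` are injectable so the loop can be tried
    without a FreeBSD host.
    """
    page_size = read("hw.pagesize")
    readings = []
    while samples is None or len(readings) < samples:
        free = read("vm.stats.vm.v_free_count")
        readings.append(free)
        stamp = time.strftime("%Y-%m-%d %H:%M:%S")
        print(f"{stamp} v_free_count={free} "
              f"({pages_to_gib(free, page_size):.2f} GiB)")
        # Do not sleep after the final sample.
        if samples is None or len(readings) < samples:
            sleep(interval)
    return readings
```

Calling watch_free_pages() with no arguments on the server would log one line every 10 seconds until interrupted. Redirecting the output to a file on a local disk (not on the NFS export) keeps the last samples available after a freeze.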