On 5 Mar 2010, at 01:20, eric.hagb...@morganstanley.com wrote:

I've found that if you run a program that generates tokens and PAGs frequently (about once per second), CPU system time on the machine soon begins to eat into performance. It takes a little while to become noticeable, but if you keep it up long enough, the machine eventually grinds to a halt. This behavior started between OpenAFS 1.4.1 and 1.4.2, when keyring support was enabled. Some experimentation has shown that the problem is related to PAG garbage collection being effectively disabled when keyring support is compiled in.
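
A rough sketch of the kind of reproducer described above, for anyone who wants to try this on their own machine. It is not the program used for the original report. It assumes a Linux client where `pagsh -c aklog` creates a fresh PAG and obtains a token (any equivalent command would do), and it reads the aggregate `cpu` line of /proc/stat to watch the share of time spent in system mode. The function names are made up for illustration.

```python
import subprocess
import time


def read_cpu_times(stat_text):
    """Return (system_jiffies, total_jiffies) from the text of /proc/stat."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            values = [int(v) for v in fields[1:]]
            # Field order: user nice system idle iowait irq softirq steal ...
            # Only the first eight are summed, since guest time is already
            # counted inside user time.
            return values[2], sum(values[:8])
    raise ValueError("no aggregate 'cpu' line found")


def system_share(before, after):
    """Fraction of elapsed CPU time spent in system mode between two samples."""
    d_sys = after[0] - before[0]
    d_total = after[1] - before[1]
    return d_sys / d_total if d_total > 0 else 0.0


def run(iterations, interval=1.0, command=("pagsh", "-c", "aklog")):
    """Create a new PAG and token every `interval` seconds.

    `command` is an assumption: substitute whatever sets up a PAG and
    token at your site.
    """
    with open("/proc/stat") as f:
        previous = read_cpu_times(f.read())
    for i in range(iterations):
        subprocess.run(list(command), check=False)
        time.sleep(interval)
        with open("/proc/stat") as f:
            current = read_cpu_times(f.read())
        print(f"{i}: system share {system_share(previous, current):.3f}")
        previous = current
```

Calling `run(3600)` would generate roughly one PAG per second for an hour. If the problem is present, the printed system share should creep upward over time.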

I've put this in RT as #126669.

Interestingly, just changing the code so that OpenAFS with keyring support still does PAG GC makes the problem go away: with afs.GCPAGs=1 the system time no longer spikes or grows without bound, but switching to afs.GCPAGs=0 brings the problem back. So if PAG GC can make things better, keyrings aren't really doing everything they should.
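
For flipping the setting between test runs, something like the following might be convenient. It is only a sketch: the path /proc/sys/afs/GCPAGs is inferred from the sysctl name afs.GCPAGs using the usual Linux sysctl-to-procfs mapping, the helper names are invented, and writing the value needs root.

```python
from pathlib import Path

# Assumed location, derived from the sysctl name afs.GCPAGs.
GCPAGS_PATH = Path("/proc/sys/afs/GCPAGs")


def get_gcpags(path=GCPAGS_PATH):
    """Return the current setting as an integer (0 or 1)."""
    return int(Path(path).read_text().strip())


def set_gcpags(enabled, path=GCPAGS_PATH):
    """Turn PAG garbage collection on (True) or off (False)."""
    Path(path).write_text("1\n" if enabled else "0\n")
```

Running the same load once after `set_gcpags(True)` and once after `set_gcpags(False)` gives the comparison described above.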

There's obviously something going awry here. In theory, you don't need to garbage collect keyring PAGs, because the keyrings are reference counted by the kernel, and our destructor is called when the keyring goes away. However, there are a number of known problems with this in 1.4, in particular races between the group information and the establishment of the keyring.
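
To make the "in theory" part concrete, here is a toy model of the idea. It is not OpenAFS or kernel code, and every name in it is hypothetical. A keyring holds a reference count, and the destructor that releases the PAG runs when the last reference is dropped, so no separate garbage collection pass is needed as long as every reference is eventually released.

```python
class Keyring:
    """Toy reference-counted keyring that owns one PAG."""

    def __init__(self, pag_id, destructor):
        self.pag_id = pag_id
        self._destructor = destructor
        self._refs = 1  # the creator holds the first reference

    def get(self):
        """Take an additional reference."""
        self._refs += 1

    def put(self):
        """Drop a reference; the destructor runs when none remain."""
        self._refs -= 1
        if self._refs == 0:
            self._destructor(self.pag_id)


class PagTable:
    """Toy table of live PAGs, cleaned up only by keyring destructors."""

    def __init__(self):
        self.live = set()

    def new_pag(self, pag_id):
        self.live.add(pag_id)
        return Keyring(pag_id, self.live.discard)
```

In this model, a PAG stays in the table forever if any reference is never dropped or the destructor is never reached.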

Things are quite different in 1.5 - keyrings are the authoritative source of PAG information. If you have time, it would be great if you could do the same tests with 1.5, and see if you experience similar problems.

S.

_______________________________________________
OpenAFS-info mailing list
OpenAFS-info@openafs.org
https://lists.openafs.org/mailman/listinfo/openafs-info
