Hi Andrew,

The patch seems to work -- I can't reproduce the hang as described initially.
I tested using the head of the openafs-stable-1_6_x branch as of commit
2b2b647e3299c2dfeb30d2986290e1121d6cb5f3 with your patch applied. Applying
the patch to the 1.6.0pre6 caused the machine to kernel panic.

Thanks!

-Aaron

On Wed, Jun 29, 2011 at 9:51 PM, Aaron Knister <[email protected]> wrote:

> That's great Andrew, thank you! I'll try it out and report back.
>
>
> On Wed, Jun 29, 2011 at 4:16 PM, Andrew Deason <[email protected]> wrote:
>
>> On Tue, 14 Jun 2011 17:56:44 -0400
>> Aaron Knister <[email protected]> wrote:
>>
>> > Good afternoon!
>> >
>> > I'm writing to report a deadlock issue I'm seeing on Solaris 10.
>>
>> This issue should be fixed by this: <http://gerrit.openafs.org/4896>
>> which you can get the current version of in patch form here:
>> <http://git.openafs.org/?p=openafs.git;a=commitdiff_plain;h=94483f566ff624a8d7fd7455359703b4525ec05a>
>> (Comments on that are welcome, too, for anyone familiar with the Solaris
>> VM system)
>>
>> That should apply to a recent 1.6 and possibly 1.5. If it does in fact
>> cause the system to not hang, you can verify you're actually hitting the
>> problematic condition by running something like this:
>>
>> $ dtrace -n 'fbt::osi_VM_MultiPageConflict:return { @["conflict"] = quantize(arg1); }'
>>
>> Run that before the copy, and after the copy completes, ctrl-C the
>> dtrace process and it should spit something like this out at you:
>>
>>   conflict
>>            value  ------------- Distribution ------------- count
>>               -1 |                                         0
>>                0 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 353
>>                1 |                                         0
>>
>> which shows that osi_VM_MultiPageConflict returned '0' 353 times. You
>> may get some 1 return values that show up:
>>
>>   conflict
>>            value  ------------- Distribution ------------- count
>>               -1 |                                         0
>>                0 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@    344
>>                1 |@@@                                      31
>>                2 |                                         0
>>
>> But I could only get that to happen if I somewhat forced the client to
>> choose the "wrong" entry to evict from the cache. If all of the 'count's
>> are zero, you didn't trigger the condition that was causing the original
>> problem.
>>
>> Can you let us know if that fixes the problem for you, or changes
>> anything about it?
>>
>> --
>> Andrew Deason
>> [email protected]
>>
>> _______________________________________________
>> OpenAFS-info mailing list
>> [email protected]
>> https://lists.openafs.org/mailman/listinfo/openafs-info
>>
>
>
>
> --
> Aaron Knister
> Systems Administrator
> Division of Information Technology
> University of Maryland, Baltimore County
> [email protected]
>

--
Aaron Knister
Systems Administrator
Division of Information Technology
University of Maryland, Baltimore County
[email protected]
