Hi Andrew,

The patch seems to work -- I can no longer reproduce the hang as
initially described.

I tested using the head of the openafs-stable-1_6_x branch as of
commit 2b2b647e3299c2dfeb30d2986290e1121d6cb5f3 with your patch applied.
Applying the patch to 1.6.0pre6 caused the machine to kernel panic.

Thanks!

-Aaron

On Wed, Jun 29, 2011 at 9:51 PM, Aaron Knister <[email protected]> wrote:

> That's great Andrew, thank you! I'll try it out and report back.
>
>
> On Wed, Jun 29, 2011 at 4:16 PM, Andrew Deason <[email protected]> wrote:
>
>> On Tue, 14 Jun 2011 17:56:44 -0400
>> Aaron Knister <[email protected]> wrote:
>>
>> > Good afternoon!
>> >
>> > I'm writing to report a deadlock issue I'm seeing on Solaris 10.
>>
>> This issue should be fixed by this: <http://gerrit.openafs.org/4896>
>> which you can get the current version of in patch form here:
>> <http://git.openafs.org/?p=openafs.git;a=commitdiff_plain;h=94483f566ff624a8d7fd7455359703b4525ec05a>
>> (Comments on that are welcome, too, for anyone familiar with the Solaris
>> VM system)
>>
>> That should apply to a recent 1.6 and possibly 1.5. If it does in fact
>> cause the system to not hang, you can verify you're actually hitting the
>> problematic condition by running something like this:
>>
>> $ dtrace -n 'fbt::osi_VM_MultiPageConflict:return { @["conflict"] = quantize(arg1); }'
>>
>> Run that before the copy, and after the copy completes, ctrl-C the
>> dtrace process and it should spit something like this out at you:
>>
>>  conflict
>>           value  ------------- Distribution ------------- count
>>              -1 |                                         0
>>               0 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 353
>>               1 |                                         0
>>
>> which shows that osi_VM_MultiPageConflict returned '0' 353 times. You
>> may also see some return values of 1:
>>
>>  conflict
>>           value  ------------- Distribution ------------- count
>>              -1 |                                         0
>>               0 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@    344
>>               1 |@@@                                      31
>>               2 |                                         0
>>
>> But I could only get that to happen if I somewhat forced the client to
>> choose the "wrong" entry to evict from the cache. If all of the 'count's
>> are zero, you didn't trigger the condition that was causing the original
>> problem.
>>
>> Can you let us know if that fixes the problem for you, or changes
>> anything about it?
>>
>> --
>> Andrew Deason
>> [email protected]
>>
>> _______________________________________________
>> OpenAFS-info mailing list
>> [email protected]
>> https://lists.openafs.org/mailman/listinfo/openafs-info
>>
>
>
>
> --
> Aaron Knister
> Systems Administrator
> Division of Information Technology
> University of Maryland, Baltimore County
> [email protected]
>



-- 
Aaron Knister
Systems Administrator
Division of Information Technology
University of Maryland, Baltimore County
[email protected]
