Hi,

I spent some time yesterday trying to reproduce a crash we have been
observing at work. While I failed in my objective, I do believe to have
found a separate, unrelated issue.

The scenario is simple: a machine gently swapping, with many processes
doing punctual, small file system transactions over a large set of
files. I wrote a tiny script to simulate these conditions:

find / -type f -print0 | xargs -0 -P 64 /bin/sh -c 'for i in $*; do
       tmpfile=$(mktemp -u /stage/XXXXXXXXXXXXXXXX)
       nc -Ul ${tmpfile} >/dev/null &
       cat $i | nc -UN ${tmpfile}
       pkill -P $$ ^nc
       rm ${tmpfile}
done' >>/tmp/x 2>&1

Running 10 instances of this script, my machine will invariably crash in
uvm_fault(), always in the same spot:

uvm_fault(0xffffff0010898d98, 0x0, 0, 1) -> e
kernel: page fault trap, code=0
Stopped at      uvmfault_unlockall+0x2f:        movq    0x10(%rsi),%rax
ddb{6}> tr
uvmfault_unlockall() at uvmfault_unlockall+0x2f [1]
uvm_fault() at uvm_fault+0x838 [2]
trap() at trap+0x463
--- trap (number 6) ---

The contents of %rsi change from crash to crash, but are always garbage.
In the particular case of the crash quoted above, %rsi was 0xd.

At this point it is quite clear that uvm_fault() is calling
uvmfault_unlockall() with an uninitialised 'oanon'. The correct fix,
however, goes a little bit further than initialising oanon to NULL.

Considering the case handled by the 'if' in line 1046 [3], there are two
subcases: one where a new anon is allocated and oanon is set (case A),
and another where anon is left untouched (case B).

For case A, we already unlocked oanon in line 1079 [4]. If the
pmap_enter() call in line 1088 [5] fails, there is no need to call
uvmfault_unlockall() on it. On the contrary: we need to call
uvmfault_unlockall() on the newly allocated anon, which is locked and
will be reused when we jump to ReFault to try again. Since anon is now
newly allocated, its reference count will be 1, and given that we have
added it to the amap in line 1073 [6], it will be picked up by the
amap_lookup() call in line 665 [7] _after_ we jump, and we will hit the
'else' case in line 1080 [8].

For case B, anon will remain the same, and needs to be unlocked by
uvmfault_unlockall() before we jump to ReFault if the aforementioned
pmap_enter() call [5] fails.

That said, we should always call uvmfault_unlockall(..., anon) in case
of pmap_enter() failure. It's a one line change which I have implemented
in pedro_uvm_fault:

https://github.com/bitrig/bitrig/compare/pedro_uvm_fault

With that fix, my machine no longer panics. The tests are able to run
until processes start to die due to failing page faults (very likely due
to pool_get() making that pmap_enter() call fail, but I haven't
verified).

-p.

[1] https://github.com/bitrig/bitrig/blob/master/sys/uvm/uvm_fault.c#L1665
[2] https://github.com/bitrig/bitrig/blob/master/sys/uvm/uvm_fault.c#L1098
[3] https://github.com/bitrig/bitrig/blob/master/sys/uvm/uvm_fault.c#L1046
[4] https://github.com/bitrig/bitrig/blob/master/sys/uvm/uvm_fault.c#L1079
[5] https://github.com/bitrig/bitrig/blob/master/sys/uvm/uvm_fault.c#L1088
[6] https://github.com/bitrig/bitrig/blob/master/sys/uvm/uvm_fault.c#L1073
[7] https://github.com/bitrig/bitrig/blob/master/sys/uvm/uvm_fault.c#L665
[8] https://github.com/bitrig/bitrig/blob/master/sys/uvm/uvm_fault.c#L1080

Reply via email to