[Bug 237470] [ppc][pseries] panic: Error evicting page: -7

2022-11-08 Thread bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=237470

Kubilay Kocak  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |crash

-- 
You are receiving this mail because:
You are the assignee for the bug.


[Bug 237470] [ppc][pseries] panic: Error evicting page: -7

2019-04-23 Thread bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=237470

Leandro Lupori  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|In Progress                 |Closed

___
freebsd-bugs@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-bugs
To unsubscribe, send any mail to "freebsd-bugs-unsubscr...@freebsd.org"


[Bug 237470] [ppc][pseries] panic: Error evicting page: -7

2019-04-23 Thread bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=237470

--- Comment #2 from commit-h...@freebsd.org ---
A commit references this bug:

Author: luporl
Date: Tue Apr 23 17:11:45 UTC 2019
New revision: 346600
URL: https://svnweb.freebsd.org/changeset/base/346600

Log:
  [PPC64] Fix wrong KASSERT in mphyp_pte_insert()

  As mphyp_pte_unset() can also remove PTE entries, and as this can
  happen in parallel with PTEs evicted by mphyp_pte_insert(), there
  is a (rare) chance the PTE being evicted gets removed before
  mphyp_pte_insert() is able to do so. Thus, the KASSERT should
  check whether the result is H_SUCCESS or H_NOT_FOUND, to avoid
  panics if the situation described above occurs.

  More details about this issue can be found in PR 237470.

  PR:                     237470
  Reviewed by:            jhibbits
  Differential Revision:  https://reviews.freebsd.org/D20012

Changes:
  head/sys/powerpc/pseries/mmu_phyp.c
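
The relaxed check described in the log can be illustrated with a small userland sketch. This is not the code from mmu_phyp.c: evict_result_ok() and check_evict_result() are hypothetical helpers, assert() stands in for KASSERT, H_SUCCESS is assumed to be 0, and H_NOT_FOUND is taken to be -7 based on the value in the panic message.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed hypervisor return codes; -7 matches the panic message. */
#define H_SUCCESS	0
#define H_NOT_FOUND	(-7)

/*
 * Hypothetical helper: returns true when the result of removing the
 * evicted PTE is acceptable. H_NOT_FOUND is tolerated because another
 * thread may already have removed the same entry.
 */
static bool
evict_result_ok(long result)
{
	return (result == H_SUCCESS || result == H_NOT_FOUND);
}

/* Userland stand-in for the KASSERT: aborts on any other result. */
static void
check_evict_result(long result)
{
	assert(evict_result_ok(result));
}
```

With this check, a result of -7 no longer aborts, while any other error code still does.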



[Bug 237470] [ppc][pseries] panic: Error evicting page: -7

2019-04-22 Thread bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=237470

Mark Linimon  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |lini...@freebsd.org
           Keywords|                            |panic, patch



[Bug 237470] [ppc][pseries] panic: Error evicting page: -7

2019-04-22 Thread bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=237470

--- Comment #1 from Leandro Lupori  ---
Proposed fix: https://reviews.freebsd.org/D20012



[Bug 237470] [ppc][pseries] panic: Error evicting page: -7

2019-04-22 Thread bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=237470

Leandro Lupori  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|New                         |In Progress



[Bug 237470] [ppc][pseries] panic: Error evicting page: -7

2019-04-22 Thread bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=237470

            Bug ID: 237470
           Summary: [ppc][pseries] panic: Error evicting page: -7
           Product: Base System
           Version: CURRENT
          Hardware: powerpc
                OS: Any
            Status: New
          Severity: Affects Only Me
          Priority: ---
         Component: kern
          Assignee: b...@freebsd.org
          Reporter: lup...@freebsd.org

I have seen this panic a couple of times. It is difficult to reproduce. In my
case, it became more frequent when running large parallel builds with a clang
that was itself built with debug info.

This is the panic message observed:
panic: Error evicting page: -7
cpuid = 10
time = 131979
KDB: stack backtrace:
0xe00033634910: at .kdb_backtrace+0x5c
0xe00033634a40: at .vpanic+0x1b4
0xe00033634b00: at .panic+0x38
0xe00033634b90: at .mphyp_pte_insert+0x304
0xe00033634cb0: at .moea64_pvo_enter+0x164
0xe00033634d40: at .moea64_enter+0x520
0xe00033634e40: at .moea64_enter_object+0xa8
0xe00033634ef0: at .pmap_enter_object+0xa8
0xe00033634fa0: at .vm_map_pmap_enter+0x2d0
0xe00033635070: at .vm_map_insert+0x550
0xe00033635170: at .vm_map_fixed+0x134
0xe00033635240: at .vm_mmap_object+0x484
0xe00033635350: at .vn_mmap+0x190
0xe00033635430: at .kern_mmap+0x474
0xe00033635550: at .sys_mmap+0x30
0xe000336355d0: at .trap+0x654
0xe00033635770: at .powerpc_interrupt+0x290
0xe00033635810: user SC trap by 0x81004e768: srr1=0x8000d032
r1=0x3fffb790 cr=0x22024024 xer=0 ctr=0x81004e940
r2=0x810075d80 frame=0xe00033635840
KDB: enter: panic


This seems to indicate that the PTE to be evicted was not found.

After some debugging, there appears to be a race condition between
mphyp_pte_unset() and mphyp_pte_insert(): the page chosen for eviction may be
removed by mphyp_pte_unset() before mphyp_pte_insert() gets to remove it.

This can be explained as follows:
- mphyp_pte_insert() locks the pvo to be inserted
- mphyp_pte_insert() obtains read access to mphyp_eviction_lock
- mphyp_pte_insert() tries to insert the corresponding pte but fails
- mphyp_pte_insert() releases mphyp_eviction_lock
- mphyp_pte_insert() acquires mphyp_eviction_lock for write
- mphyp_pte_insert() chooses a pte to evict - let's call it p
- mphyp_pte_unset(), on another thread, locks the pvo that corresponds to pte p
- mphyp_pte_unset() removes p (without holding mphyp_eviction_lock)
- mphyp_pte_insert() tries to remove p, but fails, because it was already
  removed by another thread
- the system panics
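
The sequence above can be replayed deterministically with a toy model. This is only a single-threaded sketch of the interleaving, not kernel code: struct toy_pt, toy_remove() and replay_race() are hypothetical names, the locks are omitted, H_SUCCESS is assumed to be 0, and H_NOT_FOUND is taken to be -7 to match the panic message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed hypervisor return codes; -7 matches the panic message. */
#define H_SUCCESS	0
#define H_NOT_FOUND	(-7)

#define NSLOTS		4

/* Toy stand-in for the page table: true means the slot holds a valid PTE. */
struct toy_pt {
	bool valid[NSLOTS];
};

/* Toy stand-in for the hypervisor call that removes a PTE. */
static long
toy_remove(struct toy_pt *pt, size_t slot)
{
	assert(pt != NULL);
	if (slot >= NSLOTS || !pt->valid[slot])
		return (H_NOT_FOUND);
	pt->valid[slot] = false;
	return (H_SUCCESS);
}

/*
 * Replays the interleaving from the list above and returns the result
 * seen by the inserting thread when it tries to remove the victim.
 */
static long
replay_race(bool unset_runs_first)
{
	struct toy_pt pt = { .valid = { true, true, true, true } };
	size_t victim = 2;	/* insert: chooses pte p to evict */

	if (unset_runs_first)
		(void)toy_remove(&pt, victim);	/* unset: removes p first */

	return (toy_remove(&pt, victim));	/* insert: tries to remove p */
}
```

In this model replay_race(false) yields H_SUCCESS, while replay_race(true) yields H_NOT_FOUND, which is the -7 that the original assertion treated as fatal.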

KDB's acttrace on this panic supports the hypothesis above:

Tracing command clang-8 pid 44763 tid 100504 td 0xcf22 (CPU 10)
0xe00033634a40: at .vpanic+0x1d4
0xe00033634b00: at .panic+0x38  
0xe00033634b90: at .mphyp_pte_insert+0x304
0xe00033634cb0: at .moea64_pvo_enter+0x164
0xe00033634d40: at .moea64_enter+0x520 
0xe00033634e40: at .moea64_enter_object+0xa8
0xe00033634ef0: at .pmap_enter_object+0xa8
0xe00033634fa0: at .vm_map_pmap_enter+0x2d0
0xe00033635070: at .vm_map_insert+0x550   
0xe00033635170: at .vm_map_fixed+0x134  
0xe00033635240: at .vm_mmap_object+0x484
0xe00033635350: at .vn_mmap+0x190
0xe00033635430: at .kern_mmap+0x474
0xe00033635550: at .sys_mmap+0x30 
0xe000336355d0: at .trap+0x654
0xe00033635770: at .powerpc_interrupt+0x290
0xe00033635810: user SC trap by 0x81004e768: srr1=0x8000d032
r1=0x3fffb790 cr=0x22024024 xer=0 ctr=0x81004e940
r2=0x810075d80 frame=0xe00033635840

Tracing command clang-8 pid 44640 tid 100690 td 0xc0004913b000 (CPU 15)
0xe0003404fad0: at .intr_event_handle+0xf0
0xe0003404fb90: at .powerpc_dispatch_intr+0xf0
0xe0003404fc40: at .xicp_dispatch+0x274
0xe0003404fd00: at .powerpc_interrupt+0xc8
0xe0003404fda0: kernel EXI trap by .lock_delay+0x5c:
srr1=0x80009032
r1=0xe00034050050 cr=0x2000f084 xer=0 ctr=0x1b78
r2=0xc155e890 frame=0xe0003404fdd0
0xe00034050050: at 0xc000fa530c1c
0xe000340500e0: at .__mtx_lock_sleep+0x238
0xe000340501d0: at .__mtx_lock_flags+0x160
0xe00034050280: at .moea64_remove_pages+0x134
0xe00034050340: at .pmap_remove_pages+0x78
0xe000340503d0: at .vmspace_exit+0xf8
0xe00034050480: at .exit1+0x6d8
0xe00034050550: at .sys_sys_exit+0x1c
0xe000340505d0: at .trap+0x654
0xe00034050770: at .powerpc_interrupt+0x290
0xe00034050810: user SC trap by 0x8133eafd8: srr1=0x8000f032
r1=0x3fffd890 cr=0x22000242 xer=0x2000 ctr=0x8133eafd0
r2=0x8134bdbf0 frame=0xe00034050840


Process 44640 is exiting and removin