Re: UVMHIST, pmap_get_physpage panic
On Mon, Dec 17, 2018 at 08:29:47AM +0100, Maxime Villard wrote:
> Do you also get out-of-memory when you disable UVMHIST?

I added UVMHIST to debug the "X server being killed a lot" problem I
posted about on current-users over the last few months.

I didn't have out-of-memory panics of the kernel before, when I was
running with KASAN but without UVMHIST.
 Thomas
Re: UVMHIST, pmap_get_physpage panic
On 17/12/2018 at 08:10, Thomas Klausner wrote:
> On Mon, Dec 17, 2018 at 08:06:36AM +0100, Maxime Villard wrote:
> > On 16/12/2018 at 09:09, Thomas Klausner wrote:
> > > [ 16674.534547] panic: pmap_get_physpage: out of memory
> >
> > Well, out of memory means out of memory. KASAN consumes a bit more than
> > 1/8 of the KVA. So if in normal times your system would use 8GB of ram,
> > KASAN adds an extra ~1.1GB.
>
> So why doesn't it kill userland processes? I don't believe my kernel
> needs all 32GB of RAM.

I don't know. In fact I don't understand how it is normal to get this:

[ 16674.544550] pmap_growkernel() at netbsd:pmap_growkernel
[ 16674.544550] kasan_shadow_map() at netbsd:kasan_shadow_map+0xff
[ 16674.544550] pmap_growkernel() at netbsd:pmap_growkernel+0x283

pmap_growkernel() does

	mutex_enter(kpm->pm_lock);

So if it's called recursively I think we have a problem. The call path is:

	pmap_growkernel -> kasan_shadow_map -> pmap_get_physpage ->
	[somewhere we need to allocate KVA] -> pmap_growkernel

This problem is not KASAN-specific, because KASAN just duplicates the
existing logic:

	pmap_growkernel -> pmap_alloc_level -> pmap_get_physpage

Maybe KASAN makes the problem more visible.

Do you also get out-of-memory when you disable UVMHIST?
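The re-entry concern above can be illustrated in userland. What follows is
only an analogy and not NetBSD kernel code: grow() and get_page() are
hypothetical stand-ins for pmap_growkernel() and pmap_get_physpage(), and a
POSIX error-checking mutex stands in for the pmap lock so that the second
acquisition is reported as EDEADLK rather than hanging. What mutex_enter()
actually does on re-entry in the kernel is not modelled here.

```c
#define _XOPEN_SOURCE 700	/* for PTHREAD_MUTEX_ERRORCHECK */
#include <assert.h>
#include <errno.h>
#include <pthread.h>

static pthread_mutex_t grow_lock;	/* stand-in for the pmap lock */

int grow(int need_more);

/* Create grow_lock as an error-checking mutex; returns 0 on success. */
int
demo_init(void)
{
	pthread_mutexattr_t attr;
	int error;

	error = pthread_mutexattr_init(&attr);
	if (error != 0)
		return error;
	error = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
	if (error == 0)
		error = pthread_mutex_init(&grow_lock, &attr);
	pthread_mutexattr_destroy(&attr);
	return error;
}

/* Hypothetical page allocation step that may itself need to grow. */
int
get_page(int need_more)
{
	return need_more ? grow(0) : 0;
}

/* Takes the lock, then allocates; re-entry from get_page() fails. */
int
grow(int need_more)
{
	int error, unlock_error;

	error = pthread_mutex_lock(&grow_lock);
	if (error != 0)
		return error;	/* EDEADLK: this thread already holds the lock */
	error = get_page(need_more);
	unlock_error = pthread_mutex_unlock(&grow_lock);
	assert(unlock_error == 0);
	return error;
}
```

After demo_init(), grow(0) succeeds, while grow(1) takes the cycle
grow -> get_page -> grow and returns EDEADLK from the inner call.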
Re: UVMHIST, pmap_get_physpage panic
On Mon, Dec 17, 2018 at 08:06:36AM +0100, Maxime Villard wrote:
> On 16/12/2018 at 09:09, Thomas Klausner wrote:
> > [ 16674.534547] panic: pmap_get_physpage: out of memory
>
> Well, out of memory means out of memory. KASAN consumes a bit more than
> 1/8 of the KVA. So if in normal times your system would use 8GB of ram,
> KASAN adds an extra ~1.1GB.

So why doesn't it kill userland processes? I don't believe my kernel
needs all 32GB of RAM.
 Thomas
Re: UVMHIST, pmap_get_physpage panic
On 16/12/2018 at 09:09, Thomas Klausner wrote:
> [ 16674.534547] panic: pmap_get_physpage: out of memory

Well, out of memory means out of memory. KASAN consumes a bit more than
1/8 of the KVA. So if in normal times your system would use 8GB of ram,
KASAN adds an extra ~1.1GB.
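As a rough check of the 1/8 figure, here is a minimal sketch of the shadow
arithmetic. It assumes one shadow byte per 8 bytes of covered memory (a
shift of 3), which matches the ratio quoted above; SHADOW_SCALE_SHIFT,
shadow_bytes(), shadow_offset() and region_base are illustrative names and
not the kernel's own.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: one shadow byte tracks 8 bytes of kernel memory. */
#define SHADOW_SCALE_SHIFT 3

/* Shadow bytes needed to cover `covered` bytes, rounded up. */
uint64_t
shadow_bytes(uint64_t covered)
{
	const uint64_t mask = (UINT64_C(1) << SHADOW_SCALE_SHIFT) - 1;

	return (covered >> SHADOW_SCALE_SHIFT) + ((covered & mask) != 0);
}

/*
 * Offset into the shadow region for an address in the covered region.
 * `region_base` is a hypothetical start of the covered range.
 */
uint64_t
shadow_offset(uint64_t addr, uint64_t region_base)
{
	assert(addr >= region_base);
	return (addr - region_base) >> SHADOW_SCALE_SHIFT;
}
```

With these assumptions, shadow_bytes(8 GiB) is exactly 1 GiB. The "bit
more" in the ~1.1GB estimate is not modelled by this sketch.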
UVMHIST, pmap_get_physpage panic
Hi!

I've added UVMHIST to my kernel config (now it's GENERIC + KASAN +
UVMHIST).

I noticed that UVMHIST slowed the machine down a bit (not by a factor
of two, but in the ballpark, for bulk builds).

I've also had two panics since. The machine is doing a bulk build (in a
tmpfs) and some file I/O (via NFS mostly).

The first panic was the usual SPL NOT LOWERED gibberish (attached).

The second was:

[ 16674.534547] panic: pmap_get_physpage: out of memory
[ 16674.534547] cpu10: Begin traceback...
[ 16674.534547] vpanic() at netbsd:vpanic+0x221
[ 16674.534547] snprintf() at netbsd:snprintf
[ 16674.544550] pmap_growkernel() at netbsd:pmap_growkernel
[ 16674.544550] kasan_shadow_map() at netbsd:kasan_shadow_map+0xff
[ 16674.544550] pmap_growkernel() at netbsd:pmap_growkernel+0x283
[ 16674.554553] uvm_map_prepare() at netbsd:uvm_map_prepare+0xe14
[ 16674.554553] uvm_map() at netbsd:uvm_map+0xec
[ 16674.564557] uvm_km_alloc() at netbsd:uvm_km_alloc+0x466
[ 16674.564557] pool_grow() at netbsd:pool_grow+0xbb
[ 16674.574561] pool_catchup() at netbsd:pool_catchup+0x46
[ 16674.574561] pool_get() at netbsd:pool_get+0x7e1
[ 16674.584564] allocbuf() at netbsd:allocbuf+0x119
[ 16674.584564] getblk() at netbsd:getblk+0x185
[ 16674.584564] bio_doread() at netbsd:bio_doread+0x1b
[ 16674.594568] bread() at netbsd:bread+0x18
[ 16674.594568] ffs_init_vnode() at netbsd:ffs_init_vnode+0x1cd
[ 16674.604572] ffs_loadvnode() at netbsd:ffs_loadvnode+0xc8
[ 16674.604572] vcache_get() at netbsd:vcache_get+0x4f4
[ 16674.604572] ufs_lookup() at netbsd:ufs_lookup+0x1320
[ 16674.614575] VOP_LOOKUP() at netbsd:VOP_LOOKUP+0xb6
[ 16674.614575] lookup_once() at netbsd:lookup_once+0x34b
[ 16674.624579] namei_tryemulroot() at netbsd:namei_tryemulroot+0x87d
[ 16674.624579] namei() at netbsd:namei+0x65
[ 16674.634583] fd_nameiat.isra.2() at netbsd:fd_nameiat.isra.2+0xd1
[ 16674.634583] do_sys_statat() at netbsd:do_sys_statat+0x111
[ 16674.644586] sys___lstat50() at netbsd:sys___lstat50+0x85
[ 16674.644586] syscall() at netbsd:syscall+0x308
[ 16674.644586] --- syscall (number 441) ---
[ 16674.644586] 761a961145aa:
[ 16674.644586] cpu10: End traceback...

I have a kernel core dump for this one.

Is this a bug or do I need to get more RAM?

Comments on UVMHIST performance cost and the first panic are also
appreciated.

Thanks,
 Thomas

panic.gz
Description: application/gunzip