RE: [Bug 227404] UP FreeBSD VM always hangs on reboot since 20180329-r331740
> From: Bruce Evans
> Sent: Tuesday, April 10, 2018 13:09
> > Here the bug is that UP FreeBSD VM hangs on reboot or power-off, and
> > I'm sure this recent patch (which was committed by Jeff on Mar 26) caused
> > this bug:
> > r331561:Fix a bug introduced in r329612 that slowly invalidates all clean
> > bufs.
> >
> > However, SMP VM with 2 or more CPUs doesn't hang on reboot/power-off
> > according to our tests.
>
> Actually, r329612 is what causes this bug.  I already did the bisection
> to find almost this bug a couple of weeks ago.  The hang occurs on amd64
> with 4 CPUs but not on amd64 with 8 CPUs or i386 with 4 or 8 CPUs.  I
> just checked that it occurs on i386 with 1 CPU.  All on the same machine.
> But r329611 doesn't hang for any of these cases.

So it looks to me like r329612 introduced a hang, Jeff committed r331561 to
try to fix it, and the issue is still not completely fixed (at least for me).

I didn't test r329612. We noticed that our amd64 VM (which has a single CPU)
hung. The VM kernel was built from yesterday's latest kernel code with the
default GENERIC kernel config. However, with the same kernel binary, if we
give the VM 2 or more CPUs, it doesn't hang on reboot.

If I use the latest code but manually remove the changes made by r331561,
the hang with our single-CPU VM goes away.

I hope the info is helpful.

> I still think there is an older bug, but now think it is related.  I
> only tested with SCHED_4BSD.  For SCHED_4BSD, I suspect that the bug
> is from pinning a thread to a CPU and then stopping that CPU.  Pure
> UP has no problems since pinning is null for it.  SCHED_4BSD has especially
> special handling for SMP (a separate runq for each CPU.  I have been
> modifying SCHED_4BSD and the separate queues mostly get in the way).
>
> Bruce

I always use the default GENERIC kernel options, so I guess I'm using
SCHED_4BSD(?).

Thanks,
-- Dexuan
___
freebsd-bugs@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-bugs
To unsubscribe, send any mail to "freebsd-bugs-unsubscr...@freebsd.org"

RE: [Bug 227404] UP FreeBSD VM always hangs on reboot since 20180329-r331740
On Tue, 10 Apr 2018, Dexuan Cui wrote:

> > From: Bruce Evans
> > Sent: Tuesday, April 10, 2018 00:45
> >
> > On Tue, 10 Apr 2018 a bug that doesn't want repl...@freebsd.org wrote:
> >
> > (The bug didn't even Cc freebsd-bugs for this followup.)
>
> Thanks for the reminder! I Cc'd bugs@ just now.
>
> > > --- Comment #4 from Dexuan Cui ---
> > ...
> >
> > I think I saw this a few months before that.
> >
> > My only history of this is that I built a UP kernel on 17 Dec 2017 to see
> > if UP kernels had the bug.  So SMP kernels probably had the bug then.
> >
> > Bruce
>
> Here the bug is that UP FreeBSD VM hangs on reboot or power-off, and
> I'm sure this recent patch (which was committed by Jeff on Mar 26) caused
> this bug:
> https://github.com/freebsd/freebsd/commit/63a483ed5f4eaadb8979992c7a5de24c7a471c61
> ("Fix a bug introduced in r329612 that slowly invalidates all clean bufs").
>
> However, SMP VM with 2 or more CPUs doesn't hang on reboot/power-off
> according to our tests.

Actually, r329612 is what causes this bug.  I already did the bisection
to find almost this bug a couple of weeks ago.  The hang occurs on amd64
with 4 CPUs but not on amd64 with 8 CPUs or i386 with 4 or 8 CPUs.  I
just checked that it occurs on i386 with 1 CPU.  All on the same machine.
But r329611 doesn't hang for any of these cases.

XX From b...@optusnet.com.au Fri Mar 23 20:06:40 2018 +1100
XX Date: Fri, 23 Mar 2018 20:06:39 +1100 (EST)
XX From: Bruce Evans
XX X-X-Sender: b...@besplex.bde.org
XX To: j...@freebsd.org
XX Subject: r329612 breaks sync for shutdown
XX Message-ID: <20180323192409.f1...@besplex.bde.org>
XX MIME-Version: 1.0
XX Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
XX Status: O
XX X-Status:
XX X-Keywords:
XX X-UID: 7935
XX
XX r329612 (with or without later changes) sometimes (consistently in one
XX hardware configuration) hangs in clean shutdowns by init:
XX
XX i386 with 8 or 4 CPUs: it hasn't failed yet
XX amd64 with 8 CPUs: it hasn't failed yet
XX amd64 with 4 CPUs (by turning off HTT): it always or almost always hangs
XX
XX The hang is usually in "Syncing disks, vnodes remaining... 0 ".  Less than
XX 10% of the time it hangs earlier in "Waiting ... for syncer' ...".
XX It is just waiting for a wakeup that never arrives: In the working case,
XX it prints about 2 more 0's, with about half a second between each.  If it
XX reaches the second 0, it always completes.
XX
XX This is with SCHED_4BSD.  SCHED_ULE seems to work.  I recently looked for
XX missing wakeups and found some for idle threads in mwait.  This affects
XX both schedulers but fixing it makes little difference.  The bug is invariant
XX under other large changes in options and code.
XX
XX XX KDB: enter: Break to debugger
XX XX [ thread pid 9 tid 100069 ]
XX XX Stopped at kdb_enter+0x3a: movq $0,0x700c97(%rip)
XX XX db> ps
XX XX    pid  ppid  pgrp   uid  state  wmesg   wchan           cmd
XX XX     18     0     0     0  DL     -       0x80be8462      [schedcpu]
XX XX     17     0     0     0  DL     kpsusp  0xf80003e1a6e0  [vnlru]
XX XX     16     0     0     0  RL                             [syncer]
XX XX      9     0     0     0  RL     (threaded)              [bufdaemon]
XX XX 100059                    RunQ                           [bufdaemon]
XX XX 100064                    Run    CPU 1                   [bufspacedaemon-0]
XX XX 100065                    Run    CPU 3                   [bufspacedaemon-1]
XX XX 100066                    RunQ                           [bufspacedaemon-2]
XX XX 100067                    RunQ                           [bufspacedaemon-3]
XX XX 100068                    Run    CPU 0                   [bufspacedaemon-4]
XX XX 100069                    Run    CPU 2                   [bufspacedaemon-5]
XX XX 100070                    CanRun                         [bufspacedaemon-6]
XX XX      8     0     0     0  DL     (threaded)              [pagedaemon]
XX XX 100058                    D      psleep  0x80c7b82d      [pagedaemon]
XX XX 100062                    D      launds  0x80c7b834      [laundry: dom0]
XX XX 100063                    D      umarcl  0x80676967      [uma]
XX XX      7     0     0     0  DL     -       0x80c73dd4      [soaiod4]
XX XX      6     0     0     0  DL     -       0x80c73dd4      [soaiod3]
XX XX      5     0     0     0  DL     -       0x80c73dd4      [soaiod2]
XX XX --More--     4     0     0     0  DL     -       0x80c73dd4      [soaiod1]
XX XX     15     0     0     0  DL     cooling 0xf8000186a758  [acpi_cooling1]
XX XX     14     0     0     0  DL     tzpoll  0x80aaa110      [acpi_thermal]
XX XX      3     0     0     0  DL     -       0x80aab218      [rand_harvestq]
XX XX     13     0     0     0  DL     (threaded)              [usb]
XX XX 100023                    D      -       0xfe00839d4460  [usbus0]
XX XX 100024                    D      -       0xfe00839d44b8
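
For readers following the revision hunt in this message (r329611 does not hang, r329612 does), the bisection can be sketched roughly as below. This is only an illustration, not the procedure anyone in the thread actually ran: `first_bad_revision`, `simulated_hang`, and the starting revision 329000 are hypothetical names and values, and the predicate stands in for "build that revision, boot it, and try to reboot". It also assumes the hang is deterministic for a given revision and machine configuration.

```python
from typing import Callable


def first_bad_revision(good: int, bad: int, hangs: Callable[[int], bool]) -> int:
    """Return the lowest revision in (good, bad] for which hangs(rev) is true.

    Assumes `good` does not hang, `bad` hangs, and every revision after the
    first bad one hangs as well (the usual bisection assumption).
    """
    if good >= bad:
        raise ValueError("good must be an older revision than bad")
    while bad - good > 1:
        mid = (good + bad) // 2
        if hangs(mid):
            bad = mid    # first bad revision is mid or earlier
        else:
            good = mid   # first bad revision is after mid
    return bad


# Hypothetical stand-in for the real test (build, boot, reboot, watch for a
# hang). Here every revision from r329612 onward is treated as hanging.
def simulated_hang(rev: int) -> bool:
    return rev >= 329612


if __name__ == "__main__":
    # 329000 is an arbitrary, assumed-good starting point for illustration.
    print(first_bad_revision(329000, 331740, simulated_hang))
```

Because the hang is reported as "always or almost always" on some configurations, a real predicate would probably need to repeat the reboot test several times before declaring a revision good.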
RE: [Bug 227404] UP FreeBSD VM always hangs on reboot since 20180329-r331740
> From: Bruce Evans
> Sent: Tuesday, April 10, 2018 00:45
>
> On Tue, 10 Apr 2018 a bug that doesn't want repl...@freebsd.org wrote:
>
> (The bug didn't even Cc freebsd-bugs for this followup.)

Thanks for the reminder! I Cc'd bugs@ just now.

> > --- Comment #4 from Dexuan Cui ---
> ...
>
> I think I saw this a few months before that.
>
> My only history of this is that I built a UP kernel on 17 Dec 2017 to see
> if UP kernels had the bug.  So SMP kernels probably had the bug then.
>
> Bruce

Here the bug is that UP FreeBSD VM hangs on reboot or power-off, and
I'm sure this recent patch (which was committed by Jeff on Mar 26) caused
this bug:
https://github.com/freebsd/freebsd/commit/63a483ed5f4eaadb8979992c7a5de24c7a471c61
("Fix a bug introduced in r329612 that slowly invalidates all clean bufs").

However, SMP VM with 2 or more CPUs doesn't hang on reboot/power-off
according to our tests.

Thanks,
-- Dexuan

Re: [Bug 227404] UP FreeBSD VM always hangs on reboot since 20180329-r331740
On Tue, 10 Apr 2018 a bug that doesn't want repl...@freebsd.org wrote:

(The bug didn't even Cc freebsd-bugs for this followup.)

> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=227404
>
> --- Comment #4 from Dexuan Cui ---
> I think the first bad patch is this one:
> https://github.com/freebsd/freebsd/commit/63a483ed5f4eaadb8979992c7a5de24c7a471c61
> (Fix a bug introduced in r329612 that slowly invalidates all clean bufs.):
>
> Today's
> https://github.com/freebsd/freebsd/commit/66e8725e8d24141506bc4f458ec7d1a51e86304c
> is broken, but if I revert 63a483ed5f4eaadb8979992c7a5de24c7a471c61, the bug
> cannot be reproduced.
>
> Cc bde & jeff.

I think I saw this a few months before that.

My only history of this is that I built a UP kernel on 17 Dec 2017 to see
if UP kernels had the bug.  So SMP kernels probably had the bug then.

Bruce

Re: [Bug 227404] UP FreeBSD VM always hangs on reboot since 20180329-r331740
On Tue, 10 Apr 2018 a bug that doesn't want repl...@freebsd.org wrote:

> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=227404
> ...
> --- Comment #1 from Dexuan Cui ---
> When the issue happens, the cpu utilization of the UP VM is 100%.
>
> While we're trying to find the first bad revision, it would be great if
> somebody can report if the issue also happens to bare metal or other
> hypervisors.

This has been happening for at least several months on real hardware too.
SMP kernels hang on a 1-CPU system and on an 8-CPU system with all except
1 CPU turned off in the BIOS.  They don't hang on the 8-CPU system with at
least 2 CPUs turned on.  UP kernels don't hang.

I use SCHED_4BSD.  SCHED_ULE is apparently not much different for this.

Bruce