Re: New syscons bugs: shutdown -r doesn't execute rc.d sequence and others
On Fri, 31 Mar 2017, Andrey Chernov wrote:

> On 30.03.2017 21:53, Bruce Evans wrote:
>> I think it was the sizing. The non-updated mode is 80x25, so the row
>> address can be out of bounds in the teken layer.
>
> I have text 80x30 mode set at rc stage, and _after_ that may have many
> kernel messages on console, all without causing reboot. How it is
> different from shutdown stage? Syscons mode is unchanged since rc stage.

Probably just because there weren't enough messages to go past row 24.

I had no difficulty reproducing the crash today for entering ddb and
reboot starting 80x30 and rows > 24, after removing just the window size
update in the fix. I missed seeing it the other day because I tested
with 80x60 to see the smaller console window more clearly, but must have
only tried rebooting with row <= 24.

Another recent fix for sc reduced the problem a little. Mode changes
are supposed to clear the screen and move the cursor to home, but they
only clear the screen. You should have noticed the ugliness from that
after the switch to 80x30. There are enough boot messages to reach
row 24 and messages continued from there. Now they start at the top of
the screen again. Clearing the messages is not ideal, but syscons
always did it.

Syscons also has new and old bugs preserving colors across mode changes:
- it never preserved changes to the palette (FBIO_SETPALETTE ioctl).
  Some mode changes should reset the palette, but some should not.
  Especially not ones for a vt switch
- BIOSes should reset the palette for mode changes (even to the same
  mode). Some BIOSes are confused by syscons setting the DAC to 8 bit
  mode and reset to a garbage (dark) palette then. They always switch
  back to 6 bit mode
- syscons used to maintain the current colors and didn't change them
  for mode changes. This was slightly broken, since for a mode change
  from a mode with full color to one with less color, the
  interpretation of the color indexes might change. The colors are now
  maintained by teken and syscons tells teken to do a full window size
  change which resets the entire teken state including colors. This
  bug is normally hidden by vidcontrol refreshing the colors.

vidcontrol could be held responsible for refreshing or resetting
everything after a mode change ioctl, but I think this is backwards
since there are many low-level details that are better handled in the
driver. Switching to graphics modes is already a complicated 2-ioctl
process with not enough options and poor error handling. Like a
too-simple wrapper for fork-exec.

vt has some interesting related bugs. It doesn't support mode switches
of course, and even changing the font seems to be unsupported in text
mode. But in graphics mode, changing the font works and even redraws
the screen where syscons would clear it for the mode change. But there
are bugs redrawing the screen -- often old history is redrawn. This
should work like in xterm or a general X window refresh where the
redrawing must be done for lots of other events than resize (exposure,
etc.).

>> - sysctl debug.kdb.break_to_debugger. This is documented in ddb(4), but
>> only as equivalent to the unbroken BREAK_TO_DEBUGGER.
>
> Thanx. Setting debug.kdb.break_to_debugger=1 makes both Ctrl-Alt-ESC and
> Ctrl-PrtScr works in sc only mode and "c" exit don't cause all chars
> beeps like in vt. I.e. it works. But I don't understand why debugging
> via serial involved in sc case while not involved in vt case and fear
> that some serial noise may provoke break.

This is because only syscons has full conflation of serial line breaks
with entering the debugger via a breakpoint instruction. Syscons does:

    kdb_break();

for its KDB keys, while vt does:

    kdb_enter(KDB_WHY_BREAK, ...)

for its KDB keys. The latter bypasses KDB's permissions on entering
the debugger with a BREAK. It is unclear if this is a layering
violation in vt or incorrect use of kdb_break() in syscons. It is
certainly wrong for vt to use the KDB_WHY_BREAK code if it is avoiding
using kdb_break() to fix the conflation.

> Is there a chance to untie serial and sc console debuggers?

This is easy to do by copying vt's arguable layering violation. A
little more is necessary to unconflate serial breaks:
- agree that kdb_break() and KDB_WHY_BREAK are only for serial line
  breaks
- don't use kdb_break() and KDB_WHY_BREAK for console KDB keys of
  course. vt already has a string saying that the entry is a "manual
  escape to debugger". Here "to debugger" is redundant, "manual
  escape" means "DDB key hit manually by the user" and the driver that
  saw the key is left out. "vt KDB key" would be a more useful
  message. syscons used to print a similar message, but it now calls
  kdb_break() which produces the conflated code KDB_WHY_BREAK and the
  consistently conflated message "Break to debugger". This is also
  used for serial line breaks. Capitalization is also inconsistent.
- remove kdb_break(). The only
Re: New syscons bugs: shutdown -r doesn't execute rc.d sequence and others
On 30.03.2017 21:53, Bruce Evans wrote:
> I think it was the sizing. The non-updated mode is 80x25, so the row
> address can be out of bounds in the teken layer.

I have text 80x30 mode set at rc stage, and _after_ that may have many
kernel messages on console, all without causing reboot. How it is
different from shutdown stage? Syscons mode is unchanged since rc stage.

> - sysctl debug.kdb.break_to_debugger. This is documented in ddb(4), but
> only as equivalent to the unbroken BREAK_TO_DEBUGGER.

Thanx. Setting debug.kdb.break_to_debugger=1 makes both Ctrl-Alt-ESC and
Ctrl-PrtScr works in sc only mode and "c" exit don't cause all chars
beeps like in vt. I.e. it works. But I don't understand why debugging
via serial involved in sc case while not involved in vt case and fear
that some serial noise may provoke break.

Is there a chance to untie serial and sc console debuggers?

___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
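[Editorial illustration] The sizing explanation quoted above (a terminal layer that still believes the screen is 80x25 while the console is really in 80x30 mode) can be pictured with a toy model. This is only a sketch in Python, not the syscons or teken code; the class, method and function names are invented for illustration. It shows why output that stays on rows 0..24 is harmless, and why a row past 24 is out of bounds until the window size is updated.

```python
class ToyTerminal:
    """Toy stand-in for a terminal layer that keeps its own window size."""

    def __init__(self, rows=25, cols=80):
        # Default models the "non-updated" 80x25 mode.
        self.rows = rows
        self.cols = cols

    def set_winsize(self, rows, cols):
        # Models telling the terminal layer about the new video mode.
        self.rows = rows
        self.cols = cols

    def row_in_bounds(self, row):
        return 0 <= row < self.rows


def put_at_row(term, row):
    """Pretend to write on a row; refuse rows the terminal layer does not know."""
    if not term.row_in_bounds(row):
        raise IndexError(f"row {row} is outside the {term.rows}-row window")
    return row


term = ToyTerminal()                 # still thinks the mode is 80x25
last_row_ok = put_at_row(term, 24) == 24

try:
    put_at_row(term, 27)             # cursor position valid only on 80x30
    stale_ok = True
except IndexError:
    stale_ok = False

term.set_winsize(30, 80)             # size update applied
updated_ok = put_at_row(term, 27) == 27
```

In this model nothing goes wrong as long as messages stay within the first 25 rows, which matches the observation that the problem only shows up once output goes past row 24.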
Re: New syscons bugs: shutdown -r doesn't execute rc.d sequence and others
On Thu, 30 Mar 2017, Andrey Chernov wrote:

> On 30.03.2017 18:13, Bruce Evans wrote:
>> On Thu, 30 Mar 2017, Andrey Chernov wrote:
>>> ...
>>> Finally I have good news and bad news with today's -current:
>>>
>>> 1) It seems your latest commit r316136 fix premature reboot issue.
>>
>> Now I need to know how that helped. Did you use a non-default mode?
>
> Perhaps it isn't really helped, but just hide the problem, changing some
> another race time parameters. I use 80x30 text mode on all screens.

I think it was the sizing. The non-updated mode is 80x25, so the row
address can be out of bounds in the teken layer.

>>> 2) I still can't enter KDB using Ctrl-Alt-ESC, while booting, after
>>> booting, after login and while shutdown - nothing happens.
>>> boot -d enters KDB normally, but the keyboard sequence handler is
>>> broken, not boot -d.
>>
>> Try "~b".
>
> What? It just prints \n, new csh prompt and ~b

This takes ALT_BREAK_TO_DEBUGGER.

>> It is an old bug that Ctrl-Alt-ESC (and Ctrl-PrtScr)

GENERIC is even more broken than I remembered. It doesn't even have
ALT_BREAK_TO_DEBUGGER.

In old versions, this didn't affect the syscons key. The key was
controlled by the SC_DISABLE_DDBKEY option so defaulted to enabled.
There was no tunable or sysctl to change the default. Serial consoles
had a BREAK_TO_DEBUGGER option to control entering the debugger on a
serial line break. This was not per-device or even per-driver.

Things were broken by conflating serial line BREAKs with entering the
debugger using a breakpoint instruction. Now there are many sysctls
and tunables, but the basic enable is the conflated BREAK_TO_DEBUGGER.
This now gives the default setting for entering kdb using a breakpoint
instruction. Syscons calls the function kdb_break() which calls
kdb_enter() which does the breakpoint instruction. Arches that don't
have such an instruction must have a virtual one. The default setting
can be modified using a tunable or sysctl.

So to have a chance of the syscons debugger keys working, you first
have to configure this setting, using either:
- BREAK_TO_DEBUGGER in static config file. This is documented in
  ddb(4), but only for its unbroken meaning for serial consoles
- tunable debug.kdb.break_to_debugger. This seems to be undocumented
- sysctl debug.kdb.break_to_debugger. This is documented in ddb(4), but
  only as equivalent to the unbroken BREAK_TO_DEBUGGER.

You have to set the variable using 1 or more of these knobs if you want
the syscons and vt debugger keys to work, but this also enables debugger
entry for serial line breaks and thus breaks the reason for existence of
the unbroken BREAK_TO_DEBUGGER option. Normally you don't want to enter
the debugger for serial line breaks, since then unplugging the cable or
noise on the cable may enter the debugger, and the option exists to
enable the entry for the rare cases where it is safe.

Next there are the sysctl and vt knobs to set, but these have correct
defaults so are enabled automatically.

SC_DISABLE_DDBKEY is now named SC_DISABLE_KDBKEY. It always disabled
not only the key, but the code to enable it. It actually controls 2
keys and 1 sequence of keys. When it is not configured, the Ctrl-PrtScr
and Ctrl-Alt-ESC keys are enabled by default. This can be changed by a
sysctl but not by a tunable. The sysctl is confusingly named with "kbd"
(keyboard) in its name, while the config option has KDB (kernel
debugger) in its name. The variable for this also controls the
sequences of keys which are more than ddb keys and are controlled by the
ALT_BREAK_TO_DEBUGGER option and its knobs.

vt doesn't have a static config knob to enable the enables. It has a
tunable as well as a sysctl. This sysctl only controls the keys, not
key sequences. (There may be more than 2 debugger keys. keymap allows
any key to be a debugger key.)

syscons and/or vt also have knobs to control halt, poweroff, reboot and
panic, but not suspend. Many of these are defeated by the sequences
enabled by ALT_BREAK_TO_DEBUGGER. This is a larger bug in vt. In vt,
ALT_BREAK_TO_DEBUGGER is limited by the sysctl for the kdb keys. If kdb
entry is allowed, then there is no point in disallowing anything since
anything can be done using kdb if it has a backend.

This complexity is not enough to give enough control. The control
should be per-device. You might have 1 secure console and 1 insecure
console. Then enable kdb on at most the secure console. Or 1 remote
serial console with a good cable and 1 serial console with a bad cable.
Then enable kdb entry for serial line breaks on at most the one with the
good cable. With per-device control, the 6 knobs for controlling entry
at the kdb level would be sillier, but at least 1 knob is needed there
to prevent all ddb use.

> Ctrl-PrtScr does nothing too.
>
>> But I think the misconfiguration is the same for vt.
>
> No, Ctrl-Alt-ESC works for vt at every phase of the system lifecycle.

My point is that it is easy to misconfigure the maze of knobs.
However, since sc used to work
Re: New syscons bugs: shutdown -r doesn't execute rc.d sequence and others
On 30.03.2017 18:13, Bruce Evans wrote:
> On Thu, 30 Mar 2017, Andrey Chernov wrote:
>
>> On 30.03.2017 12:34, Andrey Chernov wrote:
>>> On 30.03.2017 12:23, Andrey Chernov wrote:
>>>> Yes, only for reboot/shutdown. The system does not do anything wrong
>>>> even under high load. On reboot or hang those lines are never printed:
>>>>
>>>> kernel: Waiting (max 60 seconds) for system process `vnlru' to stop...done
>>>> kernel: Waiting (max 60 seconds) for system process `bufdaemon' to stop...done
>>>> kernel: Waiting (max 60 seconds) for system process `syncer' to stop...
>>>> kernel: Syncing disks, vnodes remaining...5 3 0 1 0 0 done
>>>> kernel: All buffers synced.
>>>> (it is from 10-stable sample, old -current samples are lost)
>>>>
>>>> Moreover, GELI swap deactivation lines are never printed too (I already
>>>> mention that I change swap to normal, but nothing is changed).
>>>
>>> I start to have raw guess that _any_ kernel printf in shutdown mode
>>> cause not printf but premature reboot.
>>
>> Finally I have good news and bad news with today's -current:
>>
>> 1) It seems your latest commit r316136 fix premature reboot issue.
>
> Now I need to know how that helped. Did you use a non-default mode?

Perhaps it isn't really helped, but just hide the problem, changing some
another race time parameters. I use 80x30 text mode on all screens.

>> 2) I still can't enter KDB using Ctrl-Alt-ESC, while booting, after
>> booting, after login and while shutdown - nothing happens.
>> boot -d enters KDB normally, but the keyboard sequence handler is
>> broken, not boot -d.
>
> Try "~b".

What? It just prints \n, new csh prompt and ~b

> It is an old bug that Ctrl-Alt-ESC (and Ctrl-PrtScr)

Ctrl-PrtScr does nothing too.

> But I think the misconfiguration is the
> same for vt.

No, Ctrl-Alt-ESC works for vt at every phase of the system lifecycle.
I use Russian keymap for syscons, but Ctrl, Alt, ESC of course are not
remapped. I surely remember it works for syscons long time ago.

Just to not forget it, I use PS/2 keyboard and have no vt* lines in the
kernel config.
Re: New syscons bugs: shutdown -r doesn't execute rc.d sequence and others
On Thu, 30 Mar 2017, Andrey Chernov wrote:

> On 30.03.2017 12:34, Andrey Chernov wrote:
>> On 30.03.2017 12:23, Andrey Chernov wrote:
>>> Yes, only for reboot/shutdown. The system does not do anything wrong
>>> even under high load. On reboot or hang those lines are never printed:
>>>
>>> kernel: Waiting (max 60 seconds) for system process `vnlru' to stop...done
>>> kernel: Waiting (max 60 seconds) for system process `bufdaemon' to stop...done
>>> kernel: Waiting (max 60 seconds) for system process `syncer' to stop...
>>> kernel: Syncing disks, vnodes remaining...5 3 0 1 0 0 done
>>> kernel: All buffers synced.
>>> (it is from 10-stable sample, old -current samples are lost)
>>>
>>> Moreover, GELI swap deactivation lines are never printed too (I already
>>> mention that I change swap to normal, but nothing is changed).
>>
>> I start to have raw guess that _any_ kernel printf in shutdown mode
>> cause not printf but premature reboot.
>
> Finally I have good news and bad news with today's -current:
>
> 1) It seems your latest commit r316136 fix premature reboot issue.

Now I need to know how that helped. Did you use a non-default mode?
The change had 2 parts and I should have split it for testing. It
fixes the window sizing and constructors.

> 2) I still can't enter KDB using Ctrl-Alt-ESC, while booting, after
> booting, after login and while shutdown - nothing happens.
> boot -d enters KDB normally, but the keyboard sequence handler is
> broken, not boot -d.

Try "~b".

It is an old bug that Ctrl-Alt-ESC (and Ctrl-PrtScr) are misconfigured
by default. But I think the misconfiguration is the same for vt.
There are about 3 layers of options that have to be set to "enable" or
not set to "disable" to enable these keys.

Bruce
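[Editorial illustration] The "about 3 layers" remark can be pictured as a chain of independent switches, every one of which has to allow the key. The sketch below is a toy model in Python, not kernel code. The fields are named after knobs discussed elsewhere in this thread (BREAK_TO_DEBUGGER / debug.kdb.break_to_debugger, SC_DISABLE_KDBKEY, and the syscons runtime sysctl), but the default values, the field names and the function name are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class ToyKnobs:
    # kdb-level enable (BREAK_TO_DEBUGGER option or the
    # debug.kdb.break_to_debugger tunable/sysctl); assumed off by default.
    break_to_debugger: bool = False
    # Compile-time SC_DISABLE_KDBKEY; when set, the key support is compiled out.
    sc_disable_kdbkey: bool = False
    # Syscons runtime sysctl for the debugger keys; assumed on by default.
    sc_key_sysctl: bool = True


def syscons_key_enters_debugger(knobs):
    """Return True only if every layer in the toy model allows the key."""
    if knobs.sc_disable_kdbkey:
        return False          # layer compiled out, nothing else matters
    if not knobs.sc_key_sysctl:
        return False          # driver-level runtime switch is off
    return knobs.break_to_debugger   # final gate at the kdb level


default_result = syscons_key_enters_debugger(ToyKnobs())
enabled_result = syscons_key_enters_debugger(ToyKnobs(break_to_debugger=True))
compiled_out = syscons_key_enters_debugger(
    ToyKnobs(break_to_debugger=True, sc_disable_kdbkey=True)
)
```

With these assumed defaults the key does nothing until the kdb-level switch is turned on, which is one way to read the "nothing happens" reports in this thread.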
Re: New syscons bugs: shutdown -r doesn't execute rc.d sequence and others
On Thu, 30 Mar 2017, Andrey Chernov wrote:

> On 30.03.2017 14:23, Andriy Gapon wrote:
>> On 30/03/2017 12:34, Andrey Chernov wrote:
>>> On 30.03.2017 12:23, Andrey Chernov wrote:
>>>> Yes, only for reboot/shutdown. The system does not do anything wrong
>>>> even under high load. On reboot or hang those lines are never printed:
>>>>
>>>> kernel: Waiting (max 60 seconds) for system process `vnlru' to stop...done
>>>> kernel: Waiting (max 60 seconds) for system process `bufdaemon' to stop...done
>>>> kernel: Waiting (max 60 seconds) for system process `syncer' to stop...
>>>> kernel: Syncing disks, vnodes remaining...5 3 0 1 0 0 done
>>>> kernel: All buffers synced.
>>>> (it is from 10-stable sample, old -current samples are lost)
>>>>
>>>> Moreover, GELI swap deactivation lines are never printed too (I already
>>>> mention that I change swap to normal, but nothing is changed).
>>>
>>> I start to have raw guess that _any_ kernel printf in shutdown mode
>>> cause not printf but premature reboot.
>>
>> This sounds somewhat familiar...
>> I vaguely recall an opposite issue that happened in the past. After
>> one of my changes the reboot started hanging for one user. Turned out
>> that the actual bug was always there, but previously the system
>> rebooted because of a printf that caused a LOR (between spinlocks,
>> AFAIR), witness tried to report it... using printf, and that recursed
>> and there was a triple fault in the end.
>>
>> Let me try to dig some details, maybe the current issue is related in
>> some ways.
>>
>> By chance, do you have WITNESS but not WITNESS_SKIPSPIN in your kernel
>> config?
>
> No, I don't have WITNESS*
> I think removing all vt* lines from the kernel config (and leaving sc)
> will be enough to reproduce it, but I am not sure.

INVARIANTS with WITNESS is not a bad way to debug problems :-). I just
remembered to try it with recent changes. It didn't find any problems
for rebooting.

The problems reported in Andriy's 2012 threads are almost exactly the
ones that I have mostly fixed in syscons -- LORs and deadlocks, and
endless recursion in WITNESS to report the problem. Syscons now detects
and handles most LORs and deadlocks in itself, but I haven't committed
the fixes for upper layers yet, so syscons mostly doesn't get called.
cnputs() was "fixed" to silently drop the output.

There is still an annoying LOR for devfs vs ufs in reboot. This is
reported with no problems since it is not related to consoles.

Bruce
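[Editorial illustration] The recursion described above (a checker that reports a problem through the same output path that triggered the check) and the "drop the output" style of fix can be sketched with a simple re-entrancy guard. This is a toy model in Python, not the cnputs() or WITNESS code; all names are invented for illustration.

```python
class ToyConsole:
    """Toy output path that refuses to recurse into itself."""

    def __init__(self):
        self.busy = False      # True while a write is in progress
        self.shown = []        # messages that reached the "screen"
        self.dropped = 0       # nested messages that were discarded

    def puts(self, msg, on_write=None):
        if self.busy:
            # Nested call from inside a write: drop instead of recursing.
            self.dropped += 1
            return False
        self.busy = True
        try:
            self.shown.append(msg)
            if on_write is not None:
                # Stands in for a checker that runs during the write and
                # wants to print a report of its own.
                on_write(self)
        finally:
            self.busy = False
        return True


def noisy_checker(console):
    # The report goes through the same console, so it arrives as a nested call.
    console.puts("lock order reversal detected")


con = ToyConsole()
outer_printed = con.puts("Syncing disks", on_write=noisy_checker)
```

The guard stops the endless recursion, but the price is visible in the model: the nested report is counted in `dropped` and never shown, so the diagnostic is lost exactly when it would be useful.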
Re: New syscons bugs: shutdown -r doesn't execute rc.d sequence and others
On Thu, 30 Mar 2017, Andrey Chernov wrote:

>> We don't understand the bug yet. It might not even be in sc. Do you
>> only see problems for shutdown? The shutdown environment is special
>> for locking.
>
> Yes, only for reboot/shutdown. The system does not do anything wrong
> even under high load. On reboot or hang those lines are never printed:
>
> kernel: Waiting (max 60 seconds) for system process `vnlru' to stop...done
> kernel: Waiting (max 60 seconds) for system process `bufdaemon' to stop...done
> kernel: Waiting (max 60 seconds) for system process `syncer' to stop...
> kernel: Syncing disks, vnodes remaining...5 3 0 1 0 0 done
> kernel: All buffers synced.
> (it is from 10-stable sample, old -current samples are lost)
>
> Moreover, GELI swap deactivation lines are never printed too (I already
> mention that I change swap to normal, but nothing is changed).
>
>> A hang in sc means that deadlock occurred and sc's new deadlock
>> detection didn't work.
>
> Hangs are rare. Most common are premature reboots.
>
>> Check that ddb works before shutdown, or just put a lot of printfs in
>
> I can't check it ddb because I can't enter ddb in sc mode, as I already
> write, nothing happens. Only vt mode allows Ctrl-Alt-ESC, but the bug
> does not exist in vt mode, so it is pointless.

That is significant. My changes were initially all about making ddb
work almost perfectly with sc. ddb is entered by kdb first calling
cngrab(), which does much the same things as cnputc(), but more to set
up for using the keyboard. If the sc part of cngrab() detects a
problem, it should return and then the sc part of cnputc() should detect
the same problem and do emergency output which might be just to buffer
it.

Nothing at all happening looks like a simpler problem, with Ctrl-Alt-ESC
not being recognized. There are too many ways to enable/disable this
entry, but I didn't change this.

>>>> You might have entered ddb in a context which used to race or
>>>> deadlock.
>>>
>>> No. I try about 20 times on machine which does nothing and can't
>>> enter KDB in sc only mode, but got one dead hang instead, when start
>>> to repeat it too fast.
>>
>> Even earlier than shutdown, and when booting?
>
> I mean in normal operation mode after booting, earlier than shutdown.
> Shutdown with premature reboot is too fast to press anything at the
> right time. I don't try to enter ddb when booting yet, but tell you
> results later.

Look early in kern_reboot(), where it does print_uptime() then
cngrab(). Console output before this cngrab() should work normally, and
I suspect that something in cngrab() reboots.

But syncing the file systems is done before this. I think they are
unmounted later, so are fscked but don't need more than fsck -p if they
have been synced.

Bruce
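[Editorial illustration] The idea above that the output path should detect a problem and "do emergency output which might be just to buffer it" can be sketched as a try-lock with a fallback queue. This is a toy model in Python using a user-space lock, not the cngrab()/cnputc() code; the class and method names are invented for illustration.

```python
import threading


class ToyGrabConsole:
    """Toy console: print if the lock is free, otherwise buffer for later."""

    def __init__(self):
        self.lock = threading.Lock()
        self.screen = []     # lines that were really printed
        self.pending = []    # lines saved by the emergency path

    def put_line(self, line):
        # Never block: a blocked writer in the same context would deadlock.
        if not self.lock.acquire(blocking=False):
            self.pending.append(line)
            return "buffered"
        try:
            # Flush anything saved earlier so ordering is preserved.
            self.screen.extend(self.pending)
            self.pending.clear()
            self.screen.append(line)
        finally:
            self.lock.release()
        return "printed"


con = ToyGrabConsole()
con.lock.acquire()                       # simulate a context that already holds the lock
first = con.put_line("uptime: 1h")       # cannot take the lock, so it is buffered
con.lock.release()
second = con.put_line("Rebooting...")    # lock is free: flush, then print
```

The design choice shown here is to prefer delayed output over a deadlock; whether buffered output is ever flushed depends on some later call succeeding, which is a limitation of this sketch.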
Re: New syscons bugs: shutdown -r doesn't execute rc.d sequence and others
On 30.03.2017 12:34, Andrey Chernov wrote:
> On 30.03.2017 12:23, Andrey Chernov wrote:
>> Yes, only for reboot/shutdown. The system does not do anything wrong
>> even under high load. On reboot or hang those lines are never printed:
>>
>> kernel: Waiting (max 60 seconds) for system process `vnlru' to stop...done
>> kernel: Waiting (max 60 seconds) for system process `bufdaemon' to stop...done
>> kernel: Waiting (max 60 seconds) for system process `syncer' to stop...
>> kernel: Syncing disks, vnodes remaining...5 3 0 1 0 0 done
>> kernel: All buffers synced.
>> (it is from 10-stable sample, old -current samples are lost)
>>
>> Moreover, GELI swap deactivation lines are never printed too (I already
>> mention that I change swap to normal, but nothing is changed).
>
> I start to have raw guess that _any_ kernel printf in shutdown mode
> cause not printf but premature reboot.

Finally I have good news and bad news with today's -current:

1) It seems your latest commit r316136 fix premature reboot issue.

2) I still can't enter KDB using Ctrl-Alt-ESC, while booting, after
booting, after login and while shutdown - nothing happens.
boot -d enters KDB normally, but the keyboard sequence handler is
broken, not boot -d.
Re: New syscons bugs: shutdown -r doesn't execute rc.d sequence and others
On 30.03.2017 14:23, Andriy Gapon wrote:
> On 30/03/2017 12:34, Andrey Chernov wrote:
>> On 30.03.2017 12:23, Andrey Chernov wrote:
>>> Yes, only for reboot/shutdown. The system does not do anything wrong
>>> even under high load. On reboot or hang those lines are never printed:
>>>
>>> kernel: Waiting (max 60 seconds) for system process `vnlru' to stop...done
>>> kernel: Waiting (max 60 seconds) for system process `bufdaemon' to stop...done
>>> kernel: Waiting (max 60 seconds) for system process `syncer' to stop...
>>> kernel: Syncing disks, vnodes remaining...5 3 0 1 0 0 done
>>> kernel: All buffers synced.
>>> (it is from 10-stable sample, old -current samples are lost)
>>>
>>> Moreover, GELI swap deactivation lines are never printed too (I already
>>> mention that I change swap to normal, but nothing is changed).
>>
>> I start to have raw guess that _any_ kernel printf in shutdown mode
>> cause not printf but premature reboot.
>
> This sounds somewhat familiar...
> I vaguely recall an opposite issue that happened in the past. After
> one of my changes the reboot started hanging for one user. Turned out
> that the actual bug was always there, but previously the system
> rebooted because of a printf that caused a LOR (between spinlocks,
> AFAIR), witness tried to report it... using printf, and that recursed
> and there was a triple fault in the end.
>
> Let me try to dig some details, maybe the current issue is related in
> some ways.
>
> By chance, do you have WITNESS but not WITNESS_SKIPSPIN in your kernel
> config?

No, I don't have WITNESS*
I think removing all vt* lines from the kernel config (and leaving sc)
will be enough to reproduce it, but I am not sure.
Re: New syscons bugs: shutdown -r doesn't execute rc.d sequence and others
On Thu, 30 Mar 2017, Andrey Chernov wrote:

> On 30.03.2017 9:51, Andrey Chernov wrote:
>> On 30.03.2017 8:53, Bruce Evans wrote:
>>> The escape sequences in dmesg are very interesting. You should debug
>>> those.
>>
>> I'll send you them a bit later. Since I don't want vt at all, I don't
>> want to debug or fix it, let it die.
>
> Here it is:
> kernel: allscreens_kbd cursor^[[=0A^[[=7F^[[=0G^[[=0H^[[=7Ividcontrol: setting cursor type: Inappropriate ioctl for device
>
> It is caused by vidcontrol call which left from previous sc setup.

This turns out to be uninteresting then.

I think you have to configure something specially to get console
messages in dmesg, but I get them in console.log, which also requires
special configuration (turn this on in syslog.conf). In my
configuration, vidcontrol only does ioctls in rc.d, so there are no
escape sequences for vidcontrol in console.log, and only 1 error message
(for changing the font to a syscons font). There should be more
failures, but some ioctls are null instead of working.

"vidcontrol show >/dev/console" works to show the colors and also to
show that escape sequences end up in console.log.

Bruce
Re: New syscons bugs: shutdown -r doesn't execute rc.d sequence and others
On 30/03/2017 14:23, Andriy Gapon wrote:
> On 30/03/2017 12:34, Andrey Chernov wrote:
>> On 30.03.2017 12:23, Andrey Chernov wrote:
>>> Yes, only for reboot/shutdown. The system does not do anything wrong
>>> even under high load. On reboot or hang those lines are never printed:
>>>
>>> kernel: Waiting (max 60 seconds) for system process `vnlru' to stop...done
>>> kernel: Waiting (max 60 seconds) for system process `bufdaemon' to stop...done
>>> kernel: Waiting (max 60 seconds) for system process `syncer' to stop...
>>> kernel: Syncing disks, vnodes remaining...5 3 0 1 0 0 done
>>> kernel: All buffers synced.
>>> (it is from 10-stable sample, old -current samples are lost)
>>>
>>> Moreover, GELI swap deactivation lines are never printed too (I already
>>> mention that I change swap to normal, but nothing is changed).
>>
>> I start to have raw guess that _any_ kernel printf in shutdown mode
>> cause not printf but premature reboot.
>
> This sounds somewhat familiar...
> I vaguely recall an opposite issue that happened in the past. After
> one of my changes the reboot started hanging for one user. Turned out
> that the actual bug was always there, but previously the system
> rebooted because of a printf that caused a LOR (between spinlocks,
> AFAIR), witness tried to report it... using printf, and that recursed
> and there was a triple fault in the end.
>
> Let me try to dig some details, maybe the current issue is related in
> some ways.

Here they are:
https://lists.freebsd.org/pipermail/freebsd-hackers/2012-May/038812.html
Turns out I remembered them quite wrong.

> By chance, do you have WITNESS but not WITNESS_SKIPSPIN in your kernel
> config?

--
Andriy Gapon
Re: New syscons bugs: shutdown -r doesn't execute rc.d sequence and others
On 30/03/2017 12:34, Andrey Chernov wrote:
> On 30.03.2017 12:23, Andrey Chernov wrote:
>> Yes, only for reboot/shutdown. The system does not do anything wrong
>> even under high load. On reboot or hang those lines are never printed:
>>
>> kernel: Waiting (max 60 seconds) for system process `vnlru' to stop...done
>> kernel: Waiting (max 60 seconds) for system process `bufdaemon' to stop...done
>> kernel: Waiting (max 60 seconds) for system process `syncer' to stop...
>> kernel: Syncing disks, vnodes remaining...5 3 0 1 0 0 done
>> kernel: All buffers synced.
>> (it is from 10-stable sample, old -current samples are lost)
>>
>> Moreover, GELI swap deactivation lines are never printed too (I already
>> mention that I change swap to normal, but nothing is changed).
>
> I start to have raw guess that _any_ kernel printf in shutdown mode
> cause not printf but premature reboot.

This sounds somewhat familiar...
I vaguely recall an opposite issue that happened in the past. After one
of my changes the reboot started hanging for one user. Turned out that
the actual bug was always there, but previously the system rebooted
because of a printf that caused a LOR (between spinlocks, AFAIR),
witness tried to report it... using printf, and that recursed and there
was a triple fault in the end.

Let me try to dig some details, maybe the current issue is related in
some ways.

By chance, do you have WITNESS but not WITNESS_SKIPSPIN in your kernel
config?

--
Andriy Gapon
Re: New syscons bugs: shutdown -r doesn't execute rc.d sequence and others
On Thu, 30 Mar 2017, Andrey Chernov wrote:

> On 29.03.2017 6:29, Bruce Evans wrote:
> ...
>>> I just found the cause, it is new syscons bug (bde@ cc'ed). I never
>>> compile vt driver into kernel, i.e. I don't have these lines in the
>>> kernel config:
>>>
>>> device vt
>>> device vt_vga
>>> device vt_efifb
>>>
>>> When I add them, the bug described is gone. It seems syscons goes off too
>>> early, provoking reboot.
>>
>> Bah, I only have vt and vt_vga to check that I didn't break them.
>>
>> Unfortunately, syscons still works right when I remove these lines.
>
> Maybe two will be enough too, I don't check. I just don't need _any_ of
> vt lines. What is matter it is that syscons only mode (without any vt)
> was recently broken, causing shutdown problems and file system damage
> each time. Syscons only mode works for years until you break it recently.

Actually, I fixed it not so recently (over the last few months), partly
with much older local fixes.

>> Kernel messages in syscons are now supposed to be colorized by CPU. The
>
> It looks really crazy on 8-core CPU and should not be default. And I
> don't see colors in vt mode (which should be parallel at that point, at
> least), but what about invisible escapes on vidcontrol errors (f.e.
> invalid argument) in vt mode?

It is tuned for an 8-core CPU :-). 16 CPUs don't get unique colors by
default, but could get 16 unique foreground ones and 1 reverse video
(reverse video indeed looks crazier for short messages). 2 CPUs don't
get the best choice of colors by default. More than 16 CPUs would need
to use lots of reverse video, except in graphics mode, where I'm
considering expanding to 256 or 64K colors.

vt doesn't support colorized kernel messages since I don't want to touch
it more than necessary. See subr_terminal.c:termcn_putc(). This is
almost exactly the same as scteken_puts() where the color change and some
bugs were. It has to switch to the kernel color, and does this by abusing
the user state. User escape sequences get corrupted by kernel output,
and kernel escape sequences to change the color change the user's color
but not the kernel's if they are atomic and not part of a user escape
sequence.

The escape sequences in dmesg are very interesting. You should debug
those. They might be caused by misparsing of kernel escape sequences, or
more likely by corruption of user escape sequences. This might happen
when:
- the user prints "foo" and the terminal starts to parse it
- the kernel interrupts this and prints "bar"; "foo" is a supported
  sequence but "bar" isn't
- the error handling is to print the entire escape sequence (that would
  be the interleaved message "bar" up to the point where the error is
  detected). Kernel console drivers seem to discard the entire mess.
  Userland xterm seems to print the entire message.
Usually there aren't enough kernel messages interleaved with user ones to
make the problem obvious.

My changes should fix the problem for syscons, not cause it. But if they
are slightly wrong, then they might cause it.

>>> Moreover, I can't enter KDB via Ctrl-Alt-ESC in the syscons only mode
>>> anymore - nothing happens. In the vt mode I can, but can't exit via "c"
>>> properly, all chars typed after "c" produce beep unless I switch to
>>> another screen and back.
>>
>> Try backing out r315984 only. This is supposed to fix parsing of output.
>
> I'll try. thanx. But most dangerous new syscons bug is the first one,
> damaging file system on each reboot. I try to go to KDB to debug it, but
> seeing that I can't even enter KDB I understand that all that bugs,
> including nasty one, are introduced by your syscons changes, it was a
> hint to add completely unneeded and unused vt to my kernel config file.

It's normal to have a slightly damaged file system after a panic.

You might have entered ddb in a context which used to race or deadlock.
It might have seemed to work if it only raced. After the fix, when in
this mode the following happens:
- in graphics mode, no output is done. The races and deadlocks are not
  all fixed in the keyboard driver, and it might work in this mode.
- in text mode, output is done specially, direct to the frame buffer, in
  a horizontal window 2/3 of the screen size. This doesn't use a full
  terminal driver so is hard to use at first. Even the reduced window
  causes problems. The colorization was originally to make this mode
  more usable.
This mode is rarely active, except for debugging the console driver
itself, or for low-level trap handlers. Put a breakpoint almost anywhere
in the console driver to see it. sc_puts() is a good choice.

> vt is real downgrade. Its default console font is plain ugly, it is
> impossible to work with it. I can't find proper TERM for it to make
> function keys and pseudographics works in ncurses apps (not with xterm,
> a little better with xterm-sco), lynx can't display all things properly,
> etc.

I agree, but started testing vt a few years ago, and have workarounds for
some of its deficiencies.

After committing only half of my fixes for low-level console drivers, I
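
The interleaving failure described in this message (a user escape sequence interrupted by kernel output) can be illustrated with a toy parser. This is only a sketch in Python, not syscons or teken code: the `ToyParser` class, its single supported sequence (ESC [ digits m) and its discard-on-error policy are assumptions made for illustration. It contrasts one parser state shared by user and kernel output with a separate state per source.

```python
ESC = "\x1b"

class ToyParser:
    """Toy terminal parser: understands only ESC [ <digits> m (hypothetical subset)."""

    def __init__(self):
        self.pending = ""   # partially parsed escape sequence
        self.screen = ""    # printable characters that reached the screen
        self.colors = []    # arguments of completed color sequences

    def put(self, ch):
        if not self.pending:
            if ch == ESC:
                self.pending = ESC
            else:
                self.screen += ch
            return
        self.pending += ch
        body = self.pending[1:]
        if body == "[":
            return                      # still a valid prefix
        if body.startswith("[") and body[1:].isdigit():
            return                      # still collecting digits
        if body.startswith("[") and body[1:-1].isdigit() and ch == "m":
            self.colors.append(int(body[1:-1]))
        # Otherwise the sequence is malformed: discard the whole mess.
        self.pending = ""

def feed(parser, text):
    for ch in text:
        parser.put(ch)

def shared_state(user_chunks, kernel_msg):
    """User and kernel output both go through ONE parser state."""
    p = ToyParser()
    feed(p, user_chunks[0])
    feed(p, kernel_msg)        # kernel interrupts the user mid-sequence
    feed(p, user_chunks[1])
    return p

def separate_state(user_chunks, kernel_msg):
    """Each source keeps its own parser state."""
    user, kernel = ToyParser(), ToyParser()
    feed(user, user_chunks[0])
    feed(kernel, kernel_msg)
    feed(user, user_chunks[1])
    return user, kernel

user_chunks = (ESC + "[3", "1mhello")   # "ESC[31mhello" split by an interruption
shared = shared_state(user_chunks, "panic")
user, kernel = separate_state(user_chunks, "panic")
print(repr(shared.screen), shared.colors)
print(repr(user.screen), user.colors, repr(kernel.screen))
```

With the shared state, the first character of the kernel message is swallowed as part of the broken sequence and the user's color change is lost; with separate states both outputs survive intact.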
Re: New syscons bugs: shutdown -r doesn't execute rc.d sequence and others
On Thu, 30 Mar 2017, Andrey Chernov wrote:

> On 30.03.2017 8:53, Bruce Evans wrote:
>>> Maybe two will be enough too, I don't check. I just don't need _any_ of
>>> vt lines. What is matter it is that syscons only mode (without any vt)
>>> was recently broken, causing shutdown problems and file system damage
>>> each time. Syscons only mode works for years until you break it recently.
>>
>> Actually, I fixed it not so recently (over the last few months), partly
>> with much older local fixes.
>
> Please commit your fix as soon as possible.

Committing it is what broke things for you.

> vt is broken as designed in
> many aspects (I even mention not all of them),

It is not that bad. It is much cleaner, but 10-20 times slower and too
simple to have as many features or preserve old features, and I don't like
rewrites that remove or move features. vt does well to be as compatible
as it is, so only annoys people who use the more arcane syscons features
(I don't use most of them, but find them in regression tests). Syscons
looks ugly, but much better when you look at the details.

> but from other hand I
> can't allow dirty filesystem (or hang) on each reboot using sc only mode
> as always. It is dangerous, and fsck takes big time. Moreover, using sc
> while keeping vt bloat compiled in the kernel just as the bug workaround
> is the best demotivator for perfectionist.

We don't understand the bug yet. It might not even be in sc. Do you only
see problems for shutdown? The shutdown environment is special for
locking.

A hang in sc means that deadlock occurred and sc's new deadlock detection
didn't work. sc is supposed to either drop the output or do it specially
when it detects deadlock. Deadlocks can also occur in upper layers of the
console driver, but even more rarely. I haven't committed fixes for this
yet. cnputs() detects some deadlocks and handles them by dropping the
output. This loses WITNESS output when you need it for debugging the
deadlock.

>> The escape sequences in dmesg are very interesting. You should debug
>> those.
>
> I'll send you them a bit later. Since I don't want vt at all, I don't
> want to debug or fix it, let it die.

:-)

>>> I'll try. thanx. But most dangerous new syscons bug is the first one,
>>> damaging file system on each reboot. I try to go to KDB to debug it, but
>>> seeing that I can't even enter KDB I understand that all that bugs,
>>> including nasty one, are introduced by your syscons changes, it was a
>>> hint to add completely unneeded and unused vt to my kernel config file.
>>
>> It's normal to have a slightly damaged file system after a panic.
>
> In sc only mode I have no kernel panic, i.e panic with trace on console
> or entering KDB. I have silent reboot in the middle or end of shutdown
> sequence or rare dead hang on reboot (which absolutely not acceptable
> for remote machine).

There's not much that sc does which can cause that. Maybe a wrong pointer
for the frame buffer access in emergency output. I saw reboots when I
broke this during booting.

Check that ddb works before shutdown, or just put a lot of printfs in
the shutdown sequence to see where it stops working. I usually sprinkle
ddb breakpoints instead of printf()s. This requires more console code
to work. Both should work until the final shutdown message from a
working version.

ddb breakpoints don't work properly under SMP. If all CPUs hit the same
one, then the first one corrupts the state for the others. Shutdown
should be mostly on a single CPU or with not all CPUs running the shutdown
code, so most won't hit breakpoints in shutdown code, so it is fairly safe
to put them there.

>> You might have entered ddb in a context which used to race or deadlock.
>
> No. I try about 20 times on machine which does nothing and can't enter
> KDB in sc only mode, but got one dead hang instead, when start to repeat
> it too fast.

Even earlier than shutdown, and when booting? Booting with -d gives a
simpler environment until sc is completely attached. Try testing that
first. Also, do tests before mounting file systems so that nothing needs
fsck'ing.

> In vt mode I can enter each time, but there are exit
> problems I already mention.
> I use text mode in sc.
>
>> Strings for function keys:
>> - these are just broken in both sc and vt
>
> I have all function keys working in sc only mode with TERM=cons25 and
> similar ones.
>
>> Pseudographics:
>> - I don't use it enough to see problems in it. Even finding the unicode
>> glyph for the block character took me some time.
>
> Even cp437 have it and dialog library use it for all windows frames,
> f.e. all ports config windows use pseudographics if it is available and
> working (replaced by +-| etc poor looking ASCII otherwise).

I call this line-drawing characters for cp437, and use them occasionally,
but I don't know the termcap method for using them very well.

Bruce
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to
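
The policy described in this message for cnputs() (detect a possible deadlock and drop the output instead of blocking) can be sketched with a non-blocking lock attempt. This is a Python illustration only; `console_lock`, `console_put` and the `dropped` counter are hypothetical names and do not correspond to the kernel's actual locking code.

```python
import threading

console_lock = threading.Lock()   # hypothetical lock protecting the console
screen = []                       # messages that were actually printed
dropped = 0                       # messages discarded to avoid a deadlock

def console_put(msg):
    """Print msg unless the console lock is already held; never block."""
    global dropped
    if not console_lock.acquire(blocking=False):
        dropped += 1              # would deadlock: drop the output instead
        return False
    try:
        screen.append(msg)
        return True
    finally:
        console_lock.release()

console_put("normal message")
with console_lock:
    # Simulates output attempted from a context that already holds the lock.
    console_put("WITNESS: lock order reversal")
print(screen, dropped)
```

The second message is exactly the kind of diagnostic that gets lost under this policy, which is the drawback mentioned in the message: the output is dropped just when it is needed for debugging the deadlock.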
Re: New syscons bugs: shutdown -r doesn't execute rc.d sequence and others
On 30.03.2017 12:23, Andrey Chernov wrote:
> Yes, only for reboot/shutdown. The system does not do anythings wrong
> even under high load. On reboot or hang those lines are never printed:
>
> kernel: Waiting (max 60 seconds) for system process `vnlru' to stop...done
> kernel: Waiting (max 60 seconds) for system process `bufdaemon' to
> stop...done
> kernel: Waiting (max 60 seconds) for system process `syncer' to stop...
> kernel: Syncing disks, vnodes remaining...5 3 0 1 0 0 done
> kernel: All buffers synced.
> (it is from 10-stable sample, old -current samples are lost)
>
> Moreover, GELI swap deactivation lines are never printed too (I already
> mention that I change swap to normal, but nothing is changed).

I start to have raw guess that _any_ kernel printf in shutdown mode
cause not printf but premature reboot.
Re: New syscons bugs: shutdown -r doesn't execute rc.d sequence and others
> We don't understand the bug yet. It might not even be in sc. Do you only
> see problems for shutdown? The shutdown environment is special for
> locking.

Yes, only for reboot/shutdown. The system does not do anythings wrong
even under high load. On reboot or hang those lines are never printed:

kernel: Waiting (max 60 seconds) for system process `vnlru' to stop...done
kernel: Waiting (max 60 seconds) for system process `bufdaemon' to stop...done
kernel: Waiting (max 60 seconds) for system process `syncer' to stop...
kernel: Syncing disks, vnodes remaining...5 3 0 1 0 0 done
kernel: All buffers synced.
(it is from 10-stable sample, old -current samples are lost)

Moreover, GELI swap deactivation lines are never printed too (I already
mention that I change swap to normal, but nothing is changed).

> A hang in sc means that deadlock occurred and sc's new deadlock detection
> didn't work.

Hangs are rare. Most common are premature reboots.

> Check that ddb works before shutdown, or just put a lot of printfs in

I can't check it ddb because I can't enter ddb in sc mode, as I already
write, nothing happens. Only vt mode allows Ctrl-Alt-ESC, but the bug
does not exist in vt mode, so it is pointless.

>>> You might have entered ddb in a context which used to race or deadlock.
>>
>> No. I try about 20 times on machine which does nothing and can't enter
>> KDB in sc only mode, but got one dead hang instead, when start to repeat
>> it too fast.
>
> Even earlier than shutdown, and when booting?

I mean in normal operation mode after booting, earlier than shutdown.
Shutdown with premature reboot is too fast to press anything at the
right time. I don't try to enter ddb when booting yet, but tell you
results later.

> I call this line-drawing characters for cp437, and use them occasionally,
> but I don't know the termcap method for using them very well.

See ac, as, ae.
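
For reference, `ac` is the termcap capability holding pairs of characters (a VT100 line-drawing character followed by the character the terminal actually uses for it), while `as` and `ae` start and end the alternate character set. A small Python sketch of decoding such a pair string follows. The `ac` value used here is a made-up example that maps six VT100 line-drawing characters to cp437 box-drawing bytes; it is not copied from any real cons25 entry, and `parse_ac` is a hypothetical helper.

```python
def parse_ac(ac):
    """Split an `ac` pair string into {vt100_char: terminal_byte}."""
    if len(ac) % 2:
        raise ValueError("ac must consist of character pairs")
    return {ac[i:i + 1].decode("ascii"): ac[i + 1] for i in range(0, len(ac), 2)}

# Illustrative value only: l, m, k, j are the four corners,
# q is the horizontal line and x is the vertical line.
example_ac = b"l\xdam\xc0k\xbfj\xd9q\xc4x\xb3"

pairs = parse_ac(example_ac)
for vt100, byte in pairs.items():
    glyph = bytes([byte]).decode("cp437")
    # Print the code point instead of the glyph to stay ASCII-only.
    print(vt100, hex(byte), "U+%04X" % ord(glyph))
```

An application such as dialog emits `as`, then the VT100 characters, then `ae`; the terminal description translates each one through the `ac` pairs, so the same program can draw frames on terminals with different line-drawing characters.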
Re: New syscons bugs: shutdown -r doesn't execute rc.d sequence and others
On 30.03.2017 9:51, Andrey Chernov wrote:
> On 30.03.2017 8:53, Bruce Evans wrote:
>>> Maybe two will be enough too, I don't check. I just don't need _any_ of
>>> vt lines. What is matter it is that syscons only mode (without any vt)
>>> was recently broken, causing shutdown problems and file system damage
>>> each time. Syscons only mode works for years until you break it recently.
>>
>> Actually, I fixed it not so recently (over the last few months), partly
>> with much older local fixes.
>
> Please commit your fix as soon as possible. vt is broken as designed in
> many aspects (I even mention not all of them), but from other hand I
> can't allow dirty filesystem (or hang) on each reboot using sc only mode
> as always. It is dangerous, and fsck takes big time. Moreover, using sc
> while keeping vt bloat compiled in the kernel just as the bug workaround
> is the best demotivator for perfectionist.
>
>> The escape sequences in dmesg are very interesting. You should debug
>> those.
>
> I'll send you them a bit later. Since I don't want vt at all, I don't
> want to debug or fix it, let it die.

Here it is:

kernel: allscreens_kbd cursor^[[=0A^[[=7F^[[=0G^[[=0H^[[=7Ividcontrol: setting cursor type: Inappropriate ioctl for device

It is caused by vidcontrol call which left from previous sc setup.

>>> I'll try. thanx. But most dangerous new syscons bug is the first one,
>>> damaging file system on each reboot. I try to go to KDB to debug it, but
>>> seeing that I can't even enter KDB I understand that all that bugs,
>>> including nasty one, are introduced by your syscons changes, it was a
>>> hint to add completely unneeded and unused vt to my kernel config file.
>>
>> It's normal to have a slightly damaged file system after a panic.
>
> In sc only mode I have no kernel panic, i.e panic with trace on console
> or entering KDB. I have silent reboot in the middle or end of shutdown
> sequence or rare dead hang on reboot (which absolutely not acceptable
> for remote machine).
>
>> You might have entered ddb in a context which used to race or deadlock.
>
> No. I try about 20 times on machine which does nothing and can't enter
> KDB in sc only mode, but got one dead hang instead, when start to repeat
> it too fast. In vt mode I can enter each time, but there are exit
> problems I already mention.
> I use text mode in sc.
>
>> Strings for function keys:
>> - these are just broken in both sc and vt
>
> I have all function keys working in sc only mode with TERM=cons25 and
> similar ones.
>
>> Pseudographics:
>> - I don't use it enough to see problems in it. Even finding the unicode
>> glyph for the block character took me some time.
>
> Even cp437 have it and dialog library use it for all windows frames,
> f.e. all ports config windows use pseudographics if it is available and
> working (replaced by +-| etc poor looking ASCII otherwise).
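
To pick the invisible escapes out of a log line like the one quoted in this message, a short Python sketch follows. It assumes that `^[` in the pasted text stands for a literal ESC byte (0x1b) in the log and that each sequence has the form ESC [ = digits letter; the regular expression and the helper name `split_escapes` are illustrative, and no meaning is assigned to the individual sequences here.

```python
import re

ESC = "\x1b"
# ESC [ = <digits> <letter>, the shape of the sequences seen in the log line.
SEQ = re.compile(re.escape(ESC) + r"\[=(\d+)([A-Za-z])")

def split_escapes(line):
    """Return (list of (number, letter) sequences, line with sequences removed)."""
    found = [(int(n), letter) for n, letter in SEQ.findall(line)]
    return found, SEQ.sub("", line)

raw = ("kernel: allscreens_kbd cursor^[[=0A^[[=7F^[[=0G^[[=0H^[[=7I"
       "vidcontrol: setting cursor type: Inappropriate ioctl for device")
line = raw.replace("^[", ESC)   # undo the caret notation used in the mail

seqs, visible = split_escapes(line)
print(seqs)
print(visible)
```

Run over /var/log/messages, a filter like this shows which lines carry escapes that a terminal would silently consume, without having to view the file in a hex dump.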
Re: New syscons bugs: shutdown -r doesn't execute rc.d sequence and others
On 30.03.2017 8:53, Bruce Evans wrote:
>> Maybe two will be enough too, I don't check. I just don't need _any_ of
>> vt lines. What is matter it is that syscons only mode (without any vt)
>> was recently broken, causing shutdown problems and file system damage
>> each time. Syscons only mode works for years until you break it recently.
>
> Actually, I fixed it not so recently (over the last few months), partly
> with much older local fixes.

Please commit your fix as soon as possible. vt is broken as designed in
many aspects (I even mention not all of them), but from other hand I
can't allow dirty filesystem (or hang) on each reboot using sc only mode
as always. It is dangerous, and fsck takes big time. Moreover, using sc
while keeping vt bloat compiled in the kernel just as the bug workaround
is the best demotivator for perfectionist.

> The escape sequences in dmesg are very interesting. You should debug
> those.

I'll send you them a bit later. Since I don't want vt at all, I don't
want to debug or fix it, let it die.

>> I'll try. thanx. But most dangerous new syscons bug is the first one,
>> damaging file system on each reboot. I try to go to KDB to debug it, but
>> seeing that I can't even enter KDB I understand that all that bugs,
>> including nasty one, are introduced by your syscons changes, it was a
>> hint to add completely unneeded and unused vt to my kernel config file.
>
> It's normal to have a slightly damaged file system after a panic.

In sc only mode I have no kernel panic, i.e panic with trace on console
or entering KDB. I have silent reboot in the middle or end of shutdown
sequence or rare dead hang on reboot (which absolutely not acceptable
for remote machine).

> You might have entered ddb in a context which used to race or deadlock.

No. I try about 20 times on machine which does nothing and can't enter
KDB in sc only mode, but got one dead hang instead, when start to repeat
it too fast. In vt mode I can enter each time, but there are exit
problems I already mention.
I use text mode in sc.

> Strings for function keys:
> - these are just broken in both sc and vt

I have all function keys working in sc only mode with TERM=cons25 and
similar ones.

> Pseudographics:
> - I don't use it enough to see problems in it. Even finding the unicode
> glyph for the block character took me some time.

Even cp437 have it and dialog library use it for all windows frames,
f.e. all ports config windows use pseudographics if it is available and
working (replaced by +-| etc poor looking ASCII otherwise).
Re: New syscons bugs: shutdown -r doesn't execute rc.d sequence and others
On 29.03.2017 6:29, Bruce Evans wrote:
>>>>> Using rc_debug=yes I see that it is the kernel problem, not rc
>>>>> problem.
>>>>> Sometimes rc backward sequence executed even fully, sometimes only
>>>>> partly, but in unpredictable moment inside rc sequence the kernel
>>>>> decide
>>>>> to reboot quickly (or even deadly hang in rare cases). Always without
>>>>> any "Syncing buffers..." leaving FS dirty. No zfs etc. just normal
>>>>> UFS,
>>>>> no EFI, no GPT.
>>>>> I change GELI swap to normal one, but it does not help. The same
>>>>> untouched config works for years, I see this bug for the first time in
>>>>> FreeBSD.
>>>>
>>>> I forget to mention that typescript and dmesg does not survive after
>>>> this reboot (or rare hang).
>>>
>>> Good to note.
>>> The simple explanation to the problem might be r307755, depending on
>>> when you last synced/built ^/head.
>>>
>>> I have a few more questions (if reverting that doesn't pan out):
>>
>> I just found the cause, it is new syscons bug (bde@ cc'ed). I never
>> compile vt driver into kernel, i.e. I don't have these lines in the
>> kernel config:
>>
>> device vt
>> device vt_vga
>> device vt_efifb
>>
>> When I add them, the bug described is gone. It seems syscons goes off too
>> early, provoking reboot.
>
> Bah, I only have vt and vt_vga to check that I didn't break them.
>
> Unfortunately, syscons still works right when I remove these lines.

Maybe two will be enough too, I don't check. I just don't need _any_ of
vt lines. What is matter it is that syscons only mode (without any vt)
was recently broken, causing shutdown problems and file system damage
each time. Syscons only mode works for years until you break it recently.

> Kernel messages in syscons are now supposed to be colorized by CPU. The

It looks really crazy on 8-core CPU and should not be default. And I
don't see colors in vt mode (which should be parallel at that point, at
least), but what about invisible escapes on vidcontrol errors (f.e.
invalid argument) in vt mode?

>> Moreover, I can't enter KDB via Ctrl-Alt-ESC in the syscons only mode
>> anymore - nothing happens. In the vt mode I can, but can't exit via "c"
>> properly, all chars typed after "c" produce beep unless I switch to
>> another screen and back.
>
> Try backing out r315984 only. This is supposed to fix parsing of output.

I'll try. thanx. But most dangerous new syscons bug is the first one,
damaging file system on each reboot. I try to go to KDB to debug it, but
seeing that I can't even enter KDB I understand that all that bugs,
including nasty one, are introduced by your syscons changes, it was a
hint to add completely unneeded and unused vt to my kernel config file.

vt is real downgrade. Its default console font is plain ugly, it is
impossible to work with it. I can't find proper TERM for it to make
function keys and pseudographics works in ncurses apps (not with xterm,
a little better with xterm-sco), lynx can't display all things properly,
etc. All we need is KMS integration alone and not vt.

> But I suspect it is a usb keyboard problem.

No, I have PS/2 keyboard.
Re: New syscons bugs: shutdown -r doesn't execute rc.d sequence and others
On Tue, 28 Mar 2017, Ngie Cooper wrote:

> On Mar 28, 2017, at 21:40, Bruce Evans wrote:
>> On Wed, 29 Mar 2017, Bruce Evans wrote:
>>> On Wed, 29 Mar 2017, Andrey Chernov wrote:
>>>> ...
>>>> Moreover, I can't enter KDB via Ctrl-Alt-ESC in the syscons only mode
>>>> anymore - nothing happens. In the vt mode I can, but can't exit via "c"
>>>> properly, all chars typed after "c" produce beep unless I switch to
>>>> another screen and back.
>>>> All it means that syscons becomes very broken now by itself and even
>>>> damages the kernel operations.

I found a bug in screen resizing (the console context doesn't get resized).
This doesn't cause any keyboard problems.

>>> ...
>>> But I suspect it is a usb keyboard problem. Syscons now does almost
>>> correct locking for the screen, but not for the keyboard, and the usb
>>> keyboard is especially fragile, especially in ddb mode. Console input
>>> is not used in normal operation except for checking for characters on
>>> reboot.
>>>
>>> Try using vt with syscons unconfigured. Syscons shouldn't be used when
>>> vt is selected, but unconfigure it to be sure. vt has different bugs
>>> using the usb keyboard. I haven't tested usb keyboards recently.
>>
>> ...
>> I tested usb keyboards again. They sometimes work, much the same as
>> a few months ago after some fixes:
>> ...
>>
>> The above testing is with a usb keyboard, no ps/2 keyboard, and no kbdmux.
>> Other combinations and dynamic switching move the bugs around, and a
>> serial console is needed to recover in cases where the bugs prevent any
>> keyboard input.
>
> I filed a bug a few years ago about USB keyboards and usability in ddb. If
> you increase the timeout so the USB hubs have enough time to probe/attach,
> they will work.

Is that for user mode or earlier? ukb has some other fixes for ddb now, but
of course it can't work before it finds the device.

I recently found that usb boot drives sometimes don't have enough time to
probe/attach before they are used in mountroot, and the mount -a prompt
does locking that doesn't allow them enough time if they are not ready
before it. The usb maintainers already know about this.

> I haven't taken the time to follow up on that and fix the issue, or at
> least propose a bit more functional workaround.

Bruce
Re: New syscons bugs: shutdown -r doesn't execute rc.d sequence and others
On Wed, 29 Mar 2017, Bruce Evans wrote:

> On Wed, 29 Mar 2017, Andrey Chernov wrote:
> ...
>> Moreover, I can't enter KDB via Ctrl-Alt-ESC in the syscons only mode
>> anymore - nothing happens. In the vt mode I can, but can't exit via "c"
>> properly, all chars typed after "c" produce beep unless I switch to
>> another screen and back.
>> All it means that syscons becomes very broken now by itself and even
>> damages the kernel operations.
>
> ...
> But I suspect it is a usb keyboard problem. Syscons now does almost
> correct locking for the screen, but not for the keyboard, and the usb
> keyboard is especially fragile, especially in ddb mode. Console input
> is not used in normal operation except for checking for characters on
> reboot.
>
> Try using vt with syscons unconfigured. Syscons shouldn't be used when
> vt is selected, but unconfigure it to be sure. vt has different bugs
> using the usb keyboard. I haven't tested usb keyboards recently.

I tested usb keyboards again. They sometimes work, much the same as
a few months ago after some fixes:
- after booting with -d, they never work (give no input) at the ddb
  prompt with either sc or vt. usb is not initialized then, and no usb
  keyboard is attached to sc or vt
- after booting without loader with -a, sc rarely or never works (gives
  no input) at the mountroot prompt
- after booting with loader with -a, vt works at the mountroot prompt.
  I don't normally use loader but need to use it to change the
  configuration. This might be better than before. There used to be a
  screen refresh bug.
- after booting with loader with -a, sc works at the mountroot prompt too.
  I previously debugged that vt worked better because it attaches the
  keyboard before this point, while sc attaches it after. Booting with
  loader apparently fixes the order.
- after any booting, sc works for user input (except sometimes after a
  too-soft hard reset, the keyboard doesn't even work in the BIOS, and it
  takes unplugging the keyboard to fix this)
- after almost any booting, vt doesn't work for user input (gives no
  input). However, if ddb is entered using a serial console, vt does work!
  A few months ago, normal input was fixed by configuring kbdmux (the
  default in GENERIC). It is not fixed by unplugging the keyboard. kbdmux
  has a known bug of not doing nested switching for the keyboard state.
  Perhaps this "fixes" ddb mode. But I would have expected it to break
  ddb mode.
- I didn't test sc after entering ddb, except early when it doesn't work.

The above testing is with a usb keyboard, no ps/2 keyboard, and no kbdmux.
Other combinations and dynamic switching move the bugs around, and a
serial console is needed to recover in cases where the bugs prevent any
keyboard input.

Bruce
Re: New syscons bugs: shutdown -r doesn't execute rc.d sequence and others
On Wed, 29 Mar 2017, Andrey Chernov wrote:

> On 29.03.2017 0:46, Ngie Cooper (yaneurabeya) wrote:
>> On Mar 28, 2017, at 14:27, Andrey Chernov wrote:
>> …
>>>> Using rc_debug=yes I see that it is the kernel problem, not rc problem.
>>>> Sometimes rc backward sequence executed even fully, sometimes only
>>>> partly, but in unpredictable moment inside rc sequence the kernel decide
>>>> to reboot quickly (or even deadly hang in rare cases). Always without
>>>> any "Syncing buffers..." leaving FS dirty. No zfs etc. just normal UFS,
>>>> no EFI, no GPT.
>>>> I change GELI swap to normal one, but it does not help. The same
>>>> untouched config works for years, I see this bug for the first time in
>>>> FreeBSD.
>>>
>>> I forget to mention that typescript and dmesg does not survive after
>>> this reboot (or rare hang).
>>
>> Good to note.
>> The simple explanation to the problem might be r307755, depending on
>> when you last synced/built ^/head.
>>
>> I have a few more questions (if reverting that doesn't pan out):
>
> I just found the cause, it is new syscons bug (bde@ cc'ed). I never
> compile vt driver into kernel, i.e. I don't have these lines in the
> kernel config:
>
> device vt
> device vt_vga
> device vt_efifb
>
> When I add them, the bug described is gone. It seems syscons goes off too
> early, provoking reboot.

Bah, I only have vt and vt_vga to check that I didn't break them.

Unfortunately, syscons still works right when I remove these lines.

> I also find some lines of the kernel messages strange colored instead of
> white in the syscons only mode. Even in vt mode vidcontrol errors have
> invisible escapes prepended (although visible through /var/log/messages).

Kernel messages in syscons are now supposed to be colorized by CPU. The
boot messages should show all the colors. Shutdown and ddb are normally
done by a single random CPU, so are shown in a single random color. The
colors are bright (light) 8-15 foreground, except bright black (8) is not
so bright.

Configure with a non-default KERNEL_SC_CONS_ATTR (maybe yellow on black
instead of lightwhite on black) to turn off the colorization. I haven't
tested this recently. There is also a sysctl for setting all the colors.

> Moreover, I can't enter KDB via Ctrl-Alt-ESC in the syscons only mode
> anymore - nothing happens. In the vt mode I can, but can't exit via "c"
> properly, all chars typed after "c" produce beep unless I switch to
> another screen and back.
> All it means that syscons becomes very broken now by itself and even
> damages the kernel operations.

Try backing out r315984 only. This is supposed to fix parsing of output.
It switches to a state indexed by the CPU for every character, and
switches back. Screen switching does a different switch and would fix
any bug in switching back.

But I suspect it is a usb keyboard problem. Syscons now does almost
correct locking for the screen, but not for the keyboard, and the usb
keyboard is especially fragile, especially in ddb mode. Console input
is not used in normal operation except for checking for characters on
reboot.

Try using vt with syscons unconfigured. Syscons shouldn't be used when
vt is selected, but unconfigure it to be sure. vt has different bugs
using the usb keyboard. I haven't tested usb keyboards recently.

Bruce
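
The colorization scheme described in this message (bright foreground colors 8-15, chosen by CPU) can be sketched as a simple mapping. This Python sketch is an assumption, not the syscons implementation: the modulo mapping and the helper name `kernel_color_for_cpu` are hypothetical, and the real default table may order or choose colors differently. The color names follow the usual 16-color text-mode palette.

```python
# Usual 16-color text-mode palette, indexes 8..15 (the bright colors).
BRIGHT = {
    8: "bright black (dark grey)",
    9: "light blue",
    10: "light green",
    11: "light cyan",
    12: "light red",
    13: "light magenta",
    14: "yellow",
    15: "white",
}

def kernel_color_for_cpu(cpu):
    """Hypothetical mapping: CPU n gets foreground color 8 + (n mod 8)."""
    return 8 + cpu % 8

for cpu in range(16):
    color = kernel_color_for_cpu(cpu)
    print(cpu, color, BRIGHT[color])
```

Under this assumed mapping, 8 CPUs each get a unique color and CPUs 8-15 repeat them, which is consistent with the remark elsewhere in the thread that 16 CPUs don't get unique colors by default.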
Re: New syscons bugs: shutdown -r doesn't execute rc.d sequence and others
> On Mar 29, 2017, at 01:26, Bruce Evans wrote:
>
> On Tue, 28 Mar 2017, Ngie Cooper wrote:
>
>>> On Mar 28, 2017, at 21:40, Bruce Evans wrote:
>>>> On Wed, 29 Mar 2017, Bruce Evans wrote:
>>>>> On Wed, 29 Mar 2017, Andrey Chernov wrote:
>>>>> ...
>>>>> Moreover, I can't enter KDB via Ctrl-Alt-ESC in the syscons only mode
>>>>> anymore - nothing happens. In the vt mode I can, but can't exit via "c"
>>>>> properly, all chars typed after "c" produce beep unless I switch to
>>>>> another screen and back.
>>>>> All it means that syscons becomes very broken now by itself and even
>>>>> damages the kernel operations.
>
> I found a bug in screen resizing (the console context doesn't get resized).
> This doesn't cause any keyboard problems.
>
>>>> ...
>>>> But I suspect it is a usb keyboard problem. Syscons now does almost
>>>> correct locking for the screen, but not for the keyboard, and the usb
>>>> keyboard is especially fragile, especially in ddb mode. Console input
>>>> is not used in normal operation except for checking for characters on
>>>> reboot.
>>>>
>>>> Try using vt with syscons unconfigured. Syscons shouldn't be used when
>>>> vt is selected, but unconfigure it to be sure. vt has different bugs
>>>> using the usb keyboard. I haven't tested usb keyboards recently.
>>>
>>> ...
>>> I tested usb keyboards again. They sometimes work, much the same as
>>> a few months ago after some fixes:
>>> ...
>>>
>>> The above testing is with a usb keyboard, no ps/2 keyboard, and no kbdmux.
>>> Other combinations and dynamic switching move the bugs around, and a
>>> serial console is needed to recover in cases where the bugs prevent any
>>> keyboard input.
>>
>> I filed a bug a few years ago about USB keyboards and usability in ddb. If
>> you increase the timeout so the USB hubs have enough time to probe/attach,
>> they will work.
>
> Is that for user mode or earlier? ukb has some other fixes for ddb now, but
> of course it can't work before it finds the device.
>
> I recently found that usb boot drives sometimes don't have enough time to
> probe/attach before they are used in mountroot, and the mount -a prompt
> does locking that doesn't allow them enough time if they are not ready
> before it. The usb maintainers already know about this.

Ah, I misremembered my filing the bug; someone else did it:
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=133989
(it happens at mountroot, for example, because of probing order being
what it is).
-Ngie
Re: New syscons bugs: shutdown -r doesn't execute rc.d sequence and others
> On Mar 28, 2017, at 21:40, Bruce Evans wrote:
>
>> On Wed, 29 Mar 2017, Bruce Evans wrote:
>>
>>> On Wed, 29 Mar 2017, Andrey Chernov wrote:
>>> ...
>>> Moreover, I can't enter KDB via Ctrl-Alt-ESC in the syscons only mode
>>> anymore - nothing happens. In the vt mode I can, but can't exit via "c"
>>> properly, all chars typed after "c" produce beep unless I switch to
>>> another screen and back.
>>> All it means that syscons becomes very broken now by itself and even
>>> damages the kernel operations.
>>
>> ...
>> But I suspect it is a usb keyboard problem. Syscons now does almost
>> correct locking for the screen, but not for the keyboard, and the usb
>> keyboard is especially fragile, especially in ddb mode. Console input
>> is not used in normal operation except for checking for characters on
>> reboot.
>>
>> Try using vt with syscons unconfigured. Syscons shouldn't be used when
>> vt is selected, but unconfigure it to be sure. vt has different bugs
>> using the usb keyboard. I haven't tested usb keyboards recently.
>
> I tested usb keyboards again. They sometimes work, much the same as
> a few months ago after some fixes:
> - after booting with -d, they never work (give no input) at the ddb
> prompt with either sc or vt. usb is not initialized then, and no usb
> keyboard is attached to sc or vt
> - after booting without loader with -a, sc rarely or never works (gives
> no input) at the mountroot prompt
> - after booting with loader with -a, vt works at the mountroot prompt.
> I don't normally use loader but need to use it to change the configuration.
> This might be better than before. There used to be a screen refresh bug.
> - after booting with loader with -a, sc works at the mountroot prompt too.
> I previously debugged that vt worked better because it attaches the keyboard
> before this point, while sc attaches it after. Booting with loader
> apparently fixes the order.
> - after any booting, sc works for user input (except sometimes after a
> too-soft hard reset, the keyboard doesn't even work in the BIOS, and it
> takes unplugging the keyboard to fix this)
> - after almost any booting, vt doesn't work for user input (gives no input).
> However, if ddb is entered using a serial console, vt does work! A few
> months ago, normal input was fixed by configuring kbdmux (the default in
> GENERIC). It is not fixed by unplugging the keyboard. kbdmux has a known
> bug of not doing nested switching for the keyboard state. Perhaps this
> "fixes" ddb mode. But I would have expected it to break ddb mode.
> - I didn't test sc after entering ddb, except early when it doesn't work.
>
> The above testing is with a usb keyboard, no ps/2 keyboard, and no kbdmux.
> Other combinations and dynamic switching move the bugs around, and a
> serial console is needed to recover in cases where the bugs prevent any
> keyboard input.

I filed a bug a few years ago about USB keyboards and usability in ddb.
If you increase the timeout so the USB hubs have enough time to
probe/attach, they will work.

I haven't taken the time to follow up on that and fix the issue, or at
least propose a bit more functional workaround.

HTH,
-Ngie
New syscons bugs: shutdown -r doesn't execute rc.d sequence and others
On 29.03.2017 0:46, Ngie Cooper (yaneurabeya) wrote:
>
>> On Mar 28, 2017, at 14:27, Andrey Chernov wrote:
>
> …
>
>>> Using rc_debug=yes I see that it is the kernel problem, not rc problem.
>>> Sometimes rc backward sequence executed even fully, sometimes only
>>> partly, but in unpredictable moment inside rc sequence the kernel decide
>>> to reboot quickly (or even deadly hang in rare cases). Always without
>>> any "Syncing buffers..." leaving FS dirty. No zfs etc. just normal UFS,
>>> no EFI, no GPT.
>>> I change GELI swap to normal one, but it does not help. The same
>>> untouched config works for years, I see this bug for the first time in
>>> FreeBSD.
>>>
>>
>> I forget to mention that typescript and dmesg does not survive after
>> this reboot (or rare hang).
>
> Good to note.
> The simple explanation to the problem might be r307755, depending on when you
> last synced/built ^/head.
>
> I have a few more questions (if reverting that doesn't pan out):

I just found the cause: it is a new syscons bug (bde@ cc'ed).
I never compile the vt driver into the kernel, i.e. I don't have these
lines in the kernel config:

device vt
device vt_vga
device vt_efifb

When I add them, the bug described is gone. It seems syscons goes off
too early, provoking the reboot.

I also find some lines of the kernel messages strangely colored instead
of white in the syscons only mode. Even in vt mode, vidcontrol errors
have invisible escapes prepended (although they are visible through
/var/log/messages).

Moreover, I can't enter KDB via Ctrl-Alt-ESC in the syscons only mode
anymore - nothing happens. In the vt mode I can, but I can't exit via "c"
properly: all chars typed after "c" produce a beep unless I switch to
another screen and back.

All this means that syscons is now very broken by itself and even
damages kernel operation.
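For readers following along, the three device lines named in the message above would sit in a custom kernel configuration file next to the syscons driver. The fragment below is only a sketch of that arrangement: the `device sc` line and the comments are assumptions added for illustration, not something quoted from the poster's actual config.

```
# Fragment of a custom kernel configuration file (illustrative only).
device sc          # syscons console driver (assumed already present)
device vt          # vt console driver; the three lines named in the message
device vt_vga
device vt_efifb
```

When both drivers are compiled in, the `kern.vty` loader tunable (for example `kern.vty=sc` or `kern.vty=vt` in /boot/loader.conf) chooses which one is used at boot. The thread does not say which value, if any, was set on the affected machine.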
Re: shutdown -r doesn't execute rc.d sequence
> On Mar 28, 2017, at 14:27, Andrey Chernov wrote:

…

>> Using rc_debug=yes I see that it is the kernel problem, not rc problem.
>> Sometimes rc backward sequence executed even fully, sometimes only
>> partly, but in unpredictable moment inside rc sequence the kernel decide
>> to reboot quickly (or even deadly hang in rare cases). Always without
>> any "Syncing buffers..." leaving FS dirty. No zfs etc. just normal UFS,
>> no EFI, no GPT.
>> I change GELI swap to normal one, but it does not help. The same
>> untouched config works for years, I see this bug for the first time in
>> FreeBSD.
>>
>
> I forget to mention that typescript and dmesg does not survive after
> this reboot (or rare hang).

Good to note.
The simple explanation to the problem might be r307755, depending on
when you last synced/built ^/head.

I have a few more questions (if reverting that doesn't pan out):
- What make/model of x86_64 are you running AMD (Bulldozer, etc) or
  Intel (Conroe/Nehalem/Westmere/Sandybridge/Haswell)?
- Is this a custom built machine? If so, what is your motherboard? If
  not, who’s the vendor?
- Is your system firmware up to date?
- What does your make.conf/src.conf look like?
- Are you running GENERIC or a custom kernel? If custom, could you
  please include the config somewhere?
- Are you loading any drivers in loader.conf?
- Are you using Linux emulation?
- Are you running any blob drivers, like nvidia?

Thanks!
-Ngie
Re: shutdown -r doesn't execute rc.d sequence
On 29.03.2017 0:15, Andrey Chernov wrote:
> On 28.03.2017 22:33, Ngie Cooper (yaneurabeya) wrote:
>>
>>> On Mar 28, 2017, at 12:30, Andrey Chernov wrote:
>>>
>>> With latest -current amd64, reboot happens almost immediately, leaving
>>> FS dirty. No proper backward rc.d or /usr/local/etc/rc.d sequence
>>> execution is shown. No deactivating GELI swap too.
>>
>> Hi Andrey,
>> Do you have a typescript demonstrating this? Adding rc_debug=yes to
>> /etc/rc.conf would be super helpful, along with `boot -v`, to see whether
>> the issue is in userspace or rc(5).
>> I’ll double check my amd64/i386 VMs too (if possible, redirect the
>> output over serial to a typescript for analysis).
>> Are you using vanilla FreeBSD, a fork, or a packaged variant (mfsbsd,
>> nanobsd, etc)?
>> Thanks!
>> -Ngie
>>
>
> Using rc_debug=yes I see that it is the kernel problem, not rc problem.
> Sometimes rc backward sequence executed even fully, sometimes only
> partly, but in unpredictable moment inside rc sequence the kernel decide
> to reboot quickly (or even deadly hang in rare cases). Always without
> any "Syncing buffers..." leaving FS dirty. No zfs etc. just normal UFS,
> no EFI, no GPT.
> I change GELI swap to normal one, but it does not help. The same
> untouched config works for years, I see this bug for the first time in
> FreeBSD.
>

I forgot to mention that the typescript and dmesg do not survive this
reboot (or the rare hang).
Re: shutdown -r doesn't execute rc.d sequence
On 28.03.2017 22:33, Ngie Cooper (yaneurabeya) wrote:
>
>> On Mar 28, 2017, at 12:30, Andrey Chernov wrote:
>>
>> With latest -current amd64, reboot happens almost immediately, leaving
>> FS dirty. No proper backward rc.d or /usr/local/etc/rc.d sequence
>> execution is shown. No deactivating GELI swap too.
>
> Hi Andrey,
> Do you have a typescript demonstrating this? Adding rc_debug=yes to
> /etc/rc.conf would be super helpful, along with `boot -v`, to see whether the
> issue is in userspace or rc(5).
> I’ll double check my amd64/i386 VMs too (if possible, redirect the
> output over serial to a typescript for analysis).
> Are you using vanilla FreeBSD, a fork, or a packaged variant (mfsbsd,
> nanobsd, etc)?
> Thanks!
> -Ngie
>

Using rc_debug=yes I see that it is a kernel problem, not an rc problem.
Sometimes the backward rc sequence is executed fully, sometimes only
partly, but at an unpredictable moment inside the rc sequence the kernel
decides to reboot quickly (or, in rare cases, hangs hard). This always
happens without any "Syncing buffers...", leaving the FS dirty. No ZFS
etc., just normal UFS, no EFI, no GPT.
I changed the GELI swap to a normal one, but it does not help. The same
untouched config has worked for years; this is the first time I have
seen this bug in FreeBSD.
Re: shutdown -r doesn't execute rc.d sequence
On Tue, Mar 28, 2017 at 10:38:53PM +0300, Andrey Chernov wrote:
> ...
> >> With latest -current amd64, reboot happens almost immediately, leaving
> >> FS dirty. No proper backward rc.d or /usr/local/etc/rc.d sequence
> >> execution is shown. No deactivating GELI swap too.
> >
> > Hi Andrey,
> > Do you have a typescript demonstrating this? Adding rc_debug=yes to
> > /etc/rc.conf would be super helpful, along with `boot -v`, to see whether
> > the issue is in userspace or rc(5).
> > I’ll double check my amd64/i386 VMs too (if possible, redirect the
> > output over serial to a typescript for analysis).
> > Are you using vanilla FreeBSD, a fork, or a packaged variant (mfsbsd,
> > nanobsd, etc)?
> > Thanks!
> > -Ngie
>
> I don't have serial, so typescript may not work treating as dirty, but
> I'll try. As I say, it is today's -current, vanilla. It looks like
> regression because few weeks ago all things works.
>

FWIW, I did not see an issue on either my build machine or my laptop
at r316082:

FreeBSD g1-252.catwhisker.org 12.0-CURRENT FreeBSD 12.0-CURRENT #298 r316082M/316093:1200027: Tue Mar 28 06:37:15 PDT 2017 r...@g1-252.catwhisker.org:/common/S4/obj/usr/src/sys/CANARY amd64

Peace,
david
--
David H. Wolfskill da...@catwhisker.org
Who would have thought that a "hotelier" would be so ... unwelcoming? Sad.

See http://www.catwhisker.org/~david/publickey.gpg for my public key.
Re: shutdown -r doesn't execute rc.d sequence
On 28.03.2017 22:33, Ngie Cooper (yaneurabeya) wrote:
>
>> On Mar 28, 2017, at 12:30, Andrey Chernov wrote:
>>
>> With latest -current amd64, reboot happens almost immediately, leaving
>> FS dirty. No proper backward rc.d or /usr/local/etc/rc.d sequence
>> execution is shown. No deactivating GELI swap too.
>
> Hi Andrey,
> Do you have a typescript demonstrating this? Adding rc_debug=yes to
> /etc/rc.conf would be super helpful, along with `boot -v`, to see whether the
> issue is in userspace or rc(5).
> I’ll double check my amd64/i386 VMs too (if possible, redirect the
> output over serial to a typescript for analysis).
> Are you using vanilla FreeBSD, a fork, or a packaged variant (mfsbsd,
> nanobsd, etc)?
> Thanks!
> -Ngie

I don't have serial, so typescript may not work treating as dirty, but
I'll try. As I said, it is today's -current, vanilla. It looks like a
regression because a few weeks ago everything worked.
Re: shutdown -r doesn't execute rc.d sequence
> On Mar 28, 2017, at 12:30, Andrey Chernov wrote:
>
> With latest -current amd64, reboot happens almost immediately, leaving
> FS dirty. No proper backward rc.d or /usr/local/etc/rc.d sequence
> execution is shown. No deactivating GELI swap too.

Hi Andrey,
Do you have a typescript demonstrating this? Adding rc_debug=yes to
/etc/rc.conf would be super helpful, along with `boot -v`, to see
whether the issue is in userspace or rc(5).
I’ll double check my amd64/i386 VMs too (if possible, redirect the
output over serial to a typescript for analysis).
Are you using vanilla FreeBSD, a fork, or a packaged variant (mfsbsd,
nanobsd, etc)?
Thanks!
-Ngie
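The two debugging aids requested above can be sketched as configuration fragments. The rc.conf line is the setting named in the message. The loader.conf line is an assumption added here as a persistent alternative to typing `boot -v` at the loader prompt, and was not suggested in the thread.

```
# /etc/rc.conf: trace rc(8) script execution during startup and shutdown
rc_debug="YES"

# /boot/loader.conf: verbose kernel boot messages on every boot
# (assumed persistent equivalent of `boot -v`)
boot_verbose="YES"
```

Both settings are meant to be temporary and can be removed once the shutdown sequence has been captured.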
shutdown -r doesn't execute rc.d sequence
With latest -current amd64, reboot happens almost immediately, leaving FS dirty. No proper backward rc.d or /usr/local/etc/rc.d sequence execution is shown. No deactivating GELI swap too.