Re: [osol-discuss] [b 133] Catching messages on reboot from failure
Mike Gerdts mger...@gmail.com writes:

> In which case you can edit the kernel line in grub to add a -k option to the boot options. If the system panics, you will be dropped to a kmdb prompt. You can manually enter kmdb with F1-A or shift-break from a text console. You can use Ctrl-Alt-F1 to shift to the text console if you aren't there already. Once at the kmdb prompt, you can use ::msgbuf to see the things that have scrolled off the screen. You should get the output a page at a time. If you need to provide this

Thanks, very helpful info there.

I'm a bit confused, though: is all of this dependent on having the system panic, or do you mean you can enter kmdb from a text console at any time?

For example, if I were to edit /rpool/boot/grub/menu.lst at this line:

  kernel$ /platform/i86pc/kernel/$ISADIR/unix -B $ZFS-BOOTFS # console=graphics

by adding `-k' before the octothorpe above, then on reboot, from a text console, could I enter kmdb as you describe?

Oh, and by the way, what is `shift-break'? Is it on a standard US QWERTY keyboard? Maybe the `pause' key?

___
opensolaris-discuss mailing list
opensolaris-discuss@opensolaris.org
Re: [osol-discuss] [b 133] Catching messages on reboot from failure
Marion Hakanson hakan...@ohsu.edu writes:

> As someone else mentioned, most of this stuff will end up in the /var/adm/messages file. Except, of course, for the very interesting lowest-level boot problems.

Yeah... I got a little carried away there. I had long ago set up a /var/adm/debug.log that catches every log message issued, with

  *.debug /var/adm/debug.log

in /etc/syslog.conf, and then forgot I had done so. The messages are there all right, but now I'm finding they are not all that helpful... an example is at the end.

> The reason why it's so primitive is that at this point (in Solaris, OpenSolaris, Linux, or Windows), you're still pretty much dependent on whatever features the BIOS provides you, which limits you just to the old-fashioned, low-tech text-only console. There's just not enough software running yet to give you a GUI.

I'm not sure what you mean here. With the same BIOS, Linux manages to provide a text console that is vastly more useful, i.e., it has gpm running. I wasn't asking for a GUI; I like a text console, but I do like to be able to mouse-copy from the buffer.

About the console messages: they concerned how to find the output using fmdump, using the `Event ID' to do so.

Now, armed with the Event IDs that had scrolled off the screen, I still learn nothing very useful, unless pounding away on man fmdump and probably a number of other man pages can make sense of this. It is not at all obvious what any of this might mean:

  fmdump -v -u 4021154c-2d57-c54e-ae82-ea27fc2d19fa
  TIME                 UUID                                 SUNW-MSG-ID
  Oct 14 04:14:02.1292 4021154c-2d57-c54e-ae82-ea27fc2d19fa ZFS-8000-GH
    100%  fault.fs.zfs.vdev.checksum

          Problem in: zfs://pool=z3/vdev=e35825af18775fc2
             Affects: zfs://pool=z3/vdev=e35825af18775fc2
                 FRU: -
            Location: -

  Oct 25 17:06:24.7775 4021154c-2d57-c54e-ae82-ea27fc2d19fa FMD-8000-4M Repaired
    100%  fault.fs.zfs.vdev.checksum  Repair Attempted

          Problem in: zfs://pool=z3/vdev=e35825af18775fc2
             Affects: zfs://pool=z3/vdev=e35825af18775fc2
                 FRU: -
            Location: -

  Oct 25 17:06:24.8987 4021154c-2d57-c54e-ae82-ea27fc2d19fa FMD-8000-6U Resolved
    100%  fault.fs.zfs.vdev.checksum  Repair Attempted

          Problem in: zfs://pool=z3/vdev=e35825af18775fc2
             Affects: zfs://pool=z3/vdev=e35825af18775fc2
                 FRU: -
            Location: -
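One possible way to recover Event IDs that have scrolled off the console is to pull them back out of the log file. This is only a sketch: it assumes the fault messages also reach /var/adm/messages and carry an `EVENT-ID: <uuid>` field, and the function name event_ids is made up for illustration.

```shell
# Sketch: list the unique fault Event IDs recorded in a log file.
# Assumption: each fault message contains "EVENT-ID: <uuid>".
# event_ids is a hypothetical helper name, not a Solaris command.
event_ids() {
    # Print only the 36-character UUID that follows "EVENT-ID:",
    # then drop duplicates.
    sed -n 's/.*EVENT-ID: *\([0-9a-f-]\{36\}\).*/\1/p' "$1" | sort -u
}

# Possible use on the affected machine (from ssh or any terminal):
#   for id in $(event_ids /var/adm/messages); do
#       fmdump -v -u "$id"
#   done
```

Each ID printed this way can be passed to fmdump -v -u, as in the message above, without copying anything from the console by hand.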
Re: [osol-discuss] [b 133] Catching messages on reboot from failure
On Wed, Oct 27, 2010 at 9:16 AM, Harry Putnam rea...@newsguy.com wrote:

> Mike Gerdts mger...@gmail.com writes:
>
>> In which case you can edit the kernel line in grub to add a -k option to the boot options. If the system panics, you will be dropped to a kmdb prompt. You can manually enter kmdb with F1-A or shift-break from a text console. You can use Ctrl-Alt-F1 to shift to the text console if you aren't there already. Once at the kmdb prompt, you can use ::msgbuf to see the things that have scrolled off the screen. You should get the output a page at a time. If you need to provide this
>
> Thanks, very helpful info there
>
> I'm a bit confused though is all of this dependent on having the system panic, or do you mean you can enter kmdb from a text console any time?
>
> For example, if I were to edit /rpool/boot/grub/menu.lst at this line:
>
>   kernel$ /platform/i86pc/kernel/$ISADIR/unix -B $ZFS-BOOTFS # console=graphics
>
> by adding `-k' before the octothorpe above. On reboot, and from a text console I could enter kmdb as you describe?

Yes

> Oh, and by the way, what is `shift-break'... is it on a standard us qwerty keyboard? Maybe the `pause' key?

Quite likely.

Your initial description implied to me that the system was in some state where you were stuck with just the text console, potentially locked up. The same text that is available with ::msgbuf in mdb is also available from the dmesg command. The dmesg command can be run from a more capable terminal than the console - such as from a GUI login session or remotely through ssh. Most of the output that goes to the console will also be logged by syslog - which I see you found in a message you sent just a couple minutes ago.

If the system is behaving well and you want to poke around in it without pausing the kernel, you can use mdb -k as well. This can be run from any root (or otherwise appropriately privileged) shell - you don't have to be on the console. The use of mdb -k on a live system does not require that it was booted with the -k option.

Also, you can find services that were unable to start with svcs -xv. Each of them will have a log file (see the output of svcs -xv) that may have more useful information.

--
Mike Gerdts
http://mgerdts.blogspot.com/
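The commands mentioned above could be gathered into files so the output can be read or copied from any terminal. The following is a rough sketch, not a tested procedure: the function name collect_diag and the output file names are invented here, and the mdb step assumes a root (or suitably privileged) shell.

```shell
# Sketch: save kernel and service diagnostics to files.
# collect_diag and the file names below are hypothetical.
collect_diag() {
    outdir=$1
    mkdir -p "$outdir" || return 1

    # Kernel message buffer as printed by dmesg.
    if command -v dmesg >/dev/null 2>&1; then
        dmesg > "$outdir/dmesg.txt" 2>&1 ||
            echo "warning: dmesg failed" >&2
    fi

    # The same buffer read from the live kernel with mdb -k
    # (does not require booting with -k, but does need privileges).
    if command -v mdb >/dev/null 2>&1; then
        echo '::msgbuf' | mdb -k > "$outdir/msgbuf.txt" 2>&1 ||
            echo "warning: mdb -k failed" >&2
    fi

    # Services that could not start, with pointers to their logs.
    if command -v svcs >/dev/null 2>&1; then
        svcs -xv > "$outdir/svcs-xv.txt" 2>&1 ||
            echo "warning: svcs -xv failed" >&2
    fi
}

# Possible use:
#   collect_diag /var/tmp/diag
```

Commands that are not installed are skipped, so the function only writes the files it can actually produce.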
Re: [osol-discuss] [b 133] Catching messages on reboot from failure
Let's turn the problem around: what is the problem you are trying to get information on? Perhaps that way we can help you by describing or explaining...

Matthias

--
Matthias Pfützner | Tel.: +49 700 PFUETZNER
Lichtenbergstr.73 | mailto:matth...@pfuetzner.de
D-64289 Darmstadt | AIM: pfuetz, ICQ: 300967487
Germany | http://www.pfuetzner.de/matthias/

"Gerhard Schröder is the most characterless challenger I know." (Helmut Kohl, 1.3.1998)
Re: [osol-discuss] [b 133] Catching messages on reboot from failure
On Wed, Oct 27, 2010 at 9:31 AM, Harry Putnam rea...@newsguy.com wrote:

> Marion Hakanson hakan...@ohsu.edu writes:
>
>> As someone else mentioned, most of this stuff will end up in the /var/adm/messages file. Except, of course, for the very interesting lowest-level boot problems.
>
> yeah... I got a little carried away there... I had long ago setup a /var/adm/debug.log that catches everything log message issued. with
>
>   *.debug /var/adm/debug.log
>
> in /etc/syslog.conf Then forgot I had done so... The messages are there alright but now I'm finding they are not all that helpful... an example at the end.
>
>> The reason why it's so primitive is that at this point (in Solaris, OpenSolaris, Linux, or Windows), you're still pretty much dependent on whatever features the BIOS provides you, which limits you just to the old-fashioned, low-tech text-only console. There's just not enough software running yet to give you a GUI.
>
> I'm not sure what you mean here With the same bios, linux manages to provide a text console that is vastly more useful... ie, it has gpm running. I wasn't asking for a gui, I like a text console... but do like to be able to mouse copy from the buffer.
>
> About the console messages; they were concerning how to find the output using fmdump, and using the `Event ID' to do so.
>
> Now armed with the Event IDs that had scrolled off the screen: I still learn nothing very useful, that is, unless by pounding away on man fmdump and probably a number of other man pages this can be made sense of. But not at all obvious what any of this might mean:
>
>   fmdump -v -u 4021154c-2d57-c54e-ae82-ea27fc2d19fa
>   TIME                 UUID                                 SUNW-MSG-ID
>   Oct 14 04:14:02.1292 4021154c-2d57-c54e-ae82-ea27fc2d19fa ZFS-8000-GH
>     100%  fault.fs.zfs.vdev.checksum
>
>           Problem in: zfs://pool=z3/vdev=e35825af18775fc2
>              Affects: zfs://pool=z3/vdev=e35825af18775fc2
>                  FRU: -
>             Location: -

It looks to me like you got a checksum error in a zpool named z3.

>   Oct 25 17:06:24.7775 4021154c-2d57-c54e-ae82-ea27fc2d19fa FMD-8000-4M Repaired
>     100%  fault.fs.zfs.vdev.checksum  Repair Attempted
>
>           Problem in: zfs://pool=z3/vdev=e35825af18775fc2
>              Affects: zfs://pool=z3/vdev=e35825af18775fc2
>                  FRU: -
>             Location: -

It was repaired

>   Oct 25 17:06:24.8987 4021154c-2d57-c54e-ae82-ea27fc2d19fa FMD-8000-6U Resolved
>     100%  fault.fs.zfs.vdev.checksum  Repair Attempted
>
>           Problem in: zfs://pool=z3/vdev=e35825af18775fc2
>              Affects: zfs://pool=z3/vdev=e35825af18775fc2
>                  FRU: -
>             Location: -

And it has been resolved - you don't have to worry about it any more. But, I would run zpool status z3 to confirm that it says everything is healthy. It may also be advisable to run zpool scrub z3 to have zfs look for any more problems (hopefully all correctable) before they pile up to a point where they aren't correctable. If you keep having problems, it may point to hardware problems (disk, cable, disk controller, memory, etc.).

--
Mike Gerdts
http://mgerdts.blogspot.com/
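The follow-up check suggested above might look something like this. It is a minimal sketch that assumes zpool status prints a line of the form `state: ONLINE` for the pool; the helper name pool_state is hypothetical.

```shell
# Sketch: read `zpool status` output on stdin and print the pool state.
# Assumption: the output contains a line whose first field is "state:".
# pool_state is a made-up helper name.
pool_state() {
    awk '$1 == "state:" { print $2; exit }'
}

# Possible use for the pool named in the fmdump output:
#   zpool status z3 | pool_state   # check the reported state
#   zpool scrub z3                 # then have ZFS re-verify the pool
#   zpool status z3                # shows scrub progress and any errors
```

Anything other than a healthy state, or errors that keep returning after a scrub, would fit the hardware suspicion raised in the message above.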
Re: [osol-discuss] [b 133] Catching messages on reboot from failure
On Wed, Oct 27, 2010 at 7:39 AM, Mike Gerdts mger...@gmail.com wrote:

> The dmesg command can be run from a more capable terminal than the console - such as from a GUI login session or remotely through ssh.

If you need to use the text console and don't have another machine handy, the screen utility can also be quite useful. It offers multiple virtual terminals, scrollback, and cut-and-paste capabilities. The commands take a little effort to learn (they're not terribly intuitive) but this gives you most of the functionality of the multiple text consoles you'd find on FreeBSD or Linux. Screen also lets you detach from a session and reattach to it later from another terminal, which can be very useful sometimes.

--
David Brodbeck
System Administrator, Linguistics
University of Washington
Re: [osol-discuss] [b 133] Catching messages on reboot from failure
David Brodbeck bro...@uw.edu writes:

> On Wed, Oct 27, 2010 at 7:39 AM, Mike Gerdts mger...@gmail.com wrote:
>
>> The dmesg command can be run from a more capable terminal than the console - such as from a GUI login session or remotely through ssh.
>
> If you need to use the text console and don't have another machine handy, the screen utility can also be quite useful. It offers multiple virtual terminals, scrollback, and cut-and-paste capabilities. The commands take a little effort to learn (they're not terribly intuitive) but this gives you most of the functionality of the multiple text consoles you'd find on FreeBSD or Linux. Screen also lets you detach from a session and reattach to it later from another terminal, which can be very useful sometimes.

Yes, I know and use screen often. I'm not a power user by any means, but I'm familiar with it.

The trouble is that messages like those I referred to pop up as soon as you log in (in console (text) mode).

Is it possible to make `screen' your login shell? I've never looked into that, but I've never heard of anyone doing it either.
[osol-discuss] [b 133] Catching messages on reboot from failure
Why is the console interface left so primitive? It seems it would at least have a usable mouse, so one could have some chance of copy and paste when what appear to be important messages are written to the console.

This is only a problem, of course, if you cannot manage a GUI boot for whatever reason, or if you are fairly dim-witted and not very knowledgeable about where to find the problems enumerated in the console messages, or about the Solaris logging and error-tracking scheme.

I have messages on my console that came following a complete power failure and reboot. I'm told to use fmdump to find information from the `EVENT-ID', but of course those IDs have scrolled past the visible console screen.

I see no way to dump the screen or otherwise copy the messages that have now scrolled past. Further, even the ones still visible would require taking them down in longhand and moving to a different machine to log in again via ssh or such.

It seems ridiculous on the face of it to have such a primitive console in 2010.

And finally, there is one helpful bit in the messages that tells me to find more info at:

  sun.com/msg/FMD-8000-6U

That gets me redirected to:

  https://identity.sun.com/amserver/UI/Login?org=self_registered_usersgoto=http://sunsolve.sun.com/search/document.do?assetkey=1-67-FMD-8000-6U-1

which appears to be pretty useless for finding what these messages might mean, and is apparently some mess made by the new Oracle crew.

It seems these console messages should at least conclude with a final line telling the user where to find the messages that have been written to the console. Well, yeah, I suppose I'm supposed to know that already, but to my mind the logging and error records are quite complicated on Solaris, or at least it seems so given my Linux background.
Re: [osol-discuss] [b 133] Catching messages on reboot from failure
On Tue, 26 Oct 2010 17:14:24 -0500 Harry Putnam rea...@newsguy.com wrote:

> Why is the console interface left so primitive?. Seems it would at least have a usable mouse so one could have some chance of copy paste when there are what appear to be important messages written to console.

(stuff deleted)

Hi Harry,

I'm not sure if you were venting or asking a question, and if you were asking a question I may or may not have a good answer for you.

But I also have somewhat of a linux background. I think that Osol also has a dmesg file like linux does, but I'm not positive. (That's where boot-up messages would go, just type dmesg in a terminal.)

Otherwise, general system log messages in osol go in /var/adm/messages (as opposed to /var/log/messages in linux). Or poke around in /var and/or /var/adm to see if there are any other interesting log files... I found the /var/adm location by poking around because I knew there had to be some system log files around somewhere.

Then you can open the log file in your favorite text editor (gui or vi or whichever)

HTH.

Cia W
Re: [osol-discuss] [b 133] Catching messages on reboot from failure
rea...@newsguy.com said:

> Why is the console interface left so primitive?. Seems it would at least have a usable mouse so one could have some chance of copy paste when there are what appear to be important messages written to console.

Hi Harry,

As someone else mentioned, most of this stuff will end up in the /var/adm/messages file. Except, of course, for the very interesting lowest-level boot problems.

The reason why it's so primitive is that at this point (in Solaris, OpenSolaris, Linux, or Windows), you're still pretty much dependent on whatever features the BIOS provides you, which limits you just to the old-fashioned, low-tech text-only console. There's just not enough software running yet to give you a GUI.

What you need to solve such scrolling-off-the-screen problems is something even lower tech: a serial (RS232, COM-port) console. Not all desktop PC BIOSes can redirect their BIOS text to a COM port, but Solaris and Linux can be told to do so for their system console input and output. And grub itself can be told to do this as well.

Then you hook up your troublesome machine's serial (COM) port to a working machine's serial port, fire up a terminal emulator (Windows hyperterm will work; on Linux/Unix I would use conserver, but tip will do in a pinch), and watch the console messages that way.

Of course, getting the serial ports wired correctly is a bit of an art (you may need a null modem cable, for example), and there are some boot-time flags you enter via grub to tell whatever kernel you're booting to temporarily use a tty console. Telling Google something like "solaris boot serial console" turns up quite a few references.

Regards,

Marion
Re: [osol-discuss] [b 133] Catching messages on reboot from failure
On Tue, Oct 26, 2010 at 9:09 PM, Marion Hakanson hakan...@ohsu.edu wrote:

> rea...@newsguy.com said:
>
>> Why is the console interface left so primitive?. Seems it would at least have a usable mouse so one could have some chance of copy paste when there are what appear to be important messages written to console.
>
> Hi Harry,
>
> As someone else mentioned, most of this stuff will end up in the /var/adm/messages file. Except, of course, for the very interesting lowest-level boot problems.

In which case you can edit the kernel line in grub to add a -k option to the boot options. If the system panics, you will be dropped to a kmdb prompt. You can manually enter kmdb with F1-A or shift-break from a text console. You can use Ctrl-Alt-F1 to shift to the text console if you aren't there already. Once at the kmdb prompt, you can use ::msgbuf to see the things that have scrolled off the screen. You should get the output a page at a time. If you need to provide this information to mailing lists, take a somewhat low resolution picture of it (e.g. a 1 megapixel picture from your mobile phone is most likely quite adequate).

kmdb allows you to do many other interesting things, such as looking at running processes (::ptree, ::ps, ...) or looking into why a particular process is hung (::pgrep hungprocess | ::walk thread | ::findstack -v). If you are inclined to dig into this further, I would suggest perusing the mdb manual. Pretty much everything that works with mdb -k works with kmdb. The key exception that I've noticed is the lack of the ! operator to pipe dcmd output to a shell command. Considering that the OS is stopped while you are at a kmdb prompt, it's not surprising that ! doesn't work.

The serial console advice below is also quite helpful if you have suitable hardware. Unfortunately, many systems these days lack a serial port. I suspect (without testing - I may be quite wrong) that a serial port hanging off of a USB port will be a very poor/fragile console.

> What you need to solve such scrolling off the screen problems is something even lower tech: a serial (RS232, COM-port) console. Not all desktop PC BIOS'es can redirect their BIOS text to a COM port, but Solaris and Linux can be told to do so for their system console input and output. And grub itself can be told to do this as well.
>
> Then you hook up your troublesome machine's serial (COM) port to a working machine's serial port, fire up a terminal emulator (Windows hyperterm will work; On Linux/Unix I would use conserver, but tip will do in a pinch), and watch the console messages that way.
>
> Of course, getting the serial ports wired correctly is a bit of an art (you may need a null modem cable, for example); And there are some boot-time flags you enter via grub to tell whatever kernel you're booting to temporarily use a tty console. Telling Google something like solaris boot serial console turns up quite a few references.
>
> Regards,
>
> Marion

--
Mike Gerdts
http://mgerdts.blogspot.com/
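For the -k suggestion, a cautious way to preview the change is to print an edited copy of the menu file rather than modify it in place. This is a sketch only: the helper name add_kmdb_flag is invented, and it assumes kernel lines of the `kernel$ .../unix ...` form quoted elsewhere in this thread.

```shell
# Sketch: print menu.lst with -k added after the unix path on
# kernel$ lines that do not already have it. The input file is
# left untouched. add_kmdb_flag is a hypothetical helper name.
add_kmdb_flag() {
    sed '/\/unix -k/!s|^\(kernel\$ .*/unix\)|\1 -k|' "$1"
}

# Possible use (review the output before changing the real file):
#   add_kmdb_flag /rpool/boot/grub/menu.lst > /var/tmp/menu.lst.new
#   diff /rpool/boot/grub/menu.lst /var/tmp/menu.lst.new
```

The same option can also be added for a single boot by editing the kernel line from the grub menu, which avoids touching menu.lst at all.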