Re: [osol-discuss] [b 133] Catching messages on reboot from failure

2010-10-27 Thread Harry Putnam
Mike Gerdts mger...@gmail.com writes:

 In which case you can edit the kernel line in grub to add a -k option
 to the boot options.  If the system panics, you will be dropped to a
 kmdb prompt.  You can manually enter kmdb with F1-A or shift-break
 from a text console.  You can use Ctrl-Alt-F1 to shift to the text
 console if you aren't there already.  Once at the kmdb prompt, you can
 use ::msgbuf to see the things that have scrolled off the screen.  You
 should get the output a page at a time.  If you need to provide this

Thanks, very helpful info there.

  I'm a bit confused, though: is all of this dependent on having the
  system panic, or do you mean you can enter kmdb from a text console
  at any time?

  For example, if I were to edit /rpool/boot/grub/menu.lst at this
  line:

  kernel$ /platform/i86pc/kernel/$ISADIR/unix -B $ZFS-BOOTFS # console=graphics
 
 by adding `-k' before the octothorpe above, then on reboot, from a
 text console, could I enter kmdb as you describe?

Oh, and by the way, what is `shift-break'?  Is it on a standard US
QWERTY keyboard?  Maybe the `pause' key?
 

___
opensolaris-discuss mailing list
opensolaris-discuss@opensolaris.org


Re: [osol-discuss] [b 133] Catching messages on reboot from failure

2010-10-27 Thread Harry Putnam
Marion Hakanson hakan...@ohsu.edu writes:

 As someone else mentioned, most of this stuff will end up in the
 /var/adm/messages file.  Except, of course, for the very interesting
 lowest-level boot problems.

Yeah, I got a little carried away there.  I had long ago set up a
/var/adm/debug.log that catches every log message issued, with

  *.debug  /var/adm/debug.log

in /etc/syslog.conf.

Then I forgot I had done so.  The messages are there all right, but
now I'm finding they are not all that helpful; an example is at the end.

 The reason why it's so primitive is that at this point (in Solaris,
 OpenSolaris, Linux, or Windows), you're still pretty much dependent
 on whatever features the BIOS provides you, which limits you just
 to the old-fashioned, low-tech text-only console.  There's just not
 enough software running yet to give you a GUI.

I'm not sure what you mean here.  With the same BIOS, Linux manages
to provide a text console that is vastly more useful, i.e. it has gpm
running.  I wasn't asking for a GUI; I like a text console, but I do
like to be able to mouse-copy from the buffer.

About the console messages: they were about how to find the
output using fmdump, using the `Event ID' to do so.

Now, armed with the Event IDs that had scrolled off the screen,
I still learn nothing very useful, unless pounding away on
man fmdump and probably a number of other man pages would make
sense of it.  It is not at all obvious what any of this might mean:

 fmdump -v -u 4021154c-2d57-c54e-ae82-ea27fc2d19fa

TIME                 UUID                                 SUNW-MSG-ID
Oct 14 04:14:02.1292 4021154c-2d57-c54e-ae82-ea27fc2d19fa ZFS-8000-GH
  100%  fault.fs.zfs.vdev.checksum

        Problem in: zfs://pool=z3/vdev=e35825af18775fc2
           Affects: zfs://pool=z3/vdev=e35825af18775fc2
               FRU: -
          Location: -

Oct 25 17:06:24.7775 4021154c-2d57-c54e-ae82-ea27fc2d19fa FMD-8000-4M Repaired
  100%  fault.fs.zfs.vdev.checksum      Repair Attempted

        Problem in: zfs://pool=z3/vdev=e35825af18775fc2
           Affects: zfs://pool=z3/vdev=e35825af18775fc2
               FRU: -
          Location: -

Oct 25 17:06:24.8987 4021154c-2d57-c54e-ae82-ea27fc2d19fa FMD-8000-6U Resolved
  100%  fault.fs.zfs.vdev.checksum      Repair Attempted

        Problem in: zfs://pool=z3/vdev=e35825af18775fc2
           Affects: zfs://pool=z3/vdev=e35825af18775fc2
               FRU: -
          Location: -




Re: [osol-discuss] [b 133] Catching messages on reboot from failure

2010-10-27 Thread Mike Gerdts
On Wed, Oct 27, 2010 at 9:16 AM, Harry Putnam rea...@newsguy.com wrote:
 Mike Gerdts mger...@gmail.com writes:

 In which case you can edit the kernel line in grub to add a -k option
 to the boot options.  If the system panics, you will be dropped to a
 kmdb prompt.  You can manually enter kmdb with F1-A or shift-break
 from a text console.  You can use Ctrl-Alt-F1 to shift to the text
 console if you aren't there already.  Once at the kmdb prompt, you can
 use ::msgbuf to see the things that have scrolled off the screen.  You
 should get the output a page at a time.  If you need to provide this

 Thanks,, very helpful info there

  I'm a bit confused though is all of this dependent on having the
  system panic, or do you mean you can enter kmbd from a text console
  any time?

  For example, if I were to edit /rpool/boot/grub/menu.lst at this
  line:

  kernel$ /platform/i86pc/kernel/$ISADIR/unix -B $ZFS-BOOTFS # console=graphics

  by adding `-k' before the octothorpe above.

  On reboot, and from a text console I could enter kmbd as you
  describe?

Yes
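
As a rough sketch of that edit (not a tested procedure), the filter below adds -k to a kernel$ line read from standard input.  The function name add_kmdb_flag is made up for illustration, and placing -k just before the trailing comment simply follows the description quoted above.  It only rewrites text; it never touches menu.lst itself.

```shell
# Hypothetical helper: add -k to GRUB "kernel$" lines read from stdin.
add_kmdb_flag() {
    awk '
        # Only touch kernel$ lines that do not already carry a -k word.
        $1 == "kernel$" && $0 !~ /(^|[ \t])-k([ \t]|$)/ {
            # Insert before a trailing "#" comment if there is one,
            # otherwise append to the end of the line.
            if (sub(/[ \t]*#/, " -k #") == 0)
                $0 = $0 " -k"
        }
        { print }
    '
}
```

One way to use it would be to run add_kmdb_flag < /rpool/boot/grub/menu.lst, review the output, and only then copy the changed line into the real file by hand.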


 Oh, and by the way, what is `shift-break'... is it on a standard us
 qwerty keyboard?   Maybe the `pause' key?

Quite likely.

Your initial description implied to me that the system was in some
state where you were stuck with just the text console, potentially
locked up.  The same text that is available with ::msgbuf in mdb is
also available from the dmesg command.  The dmesg command can be run
from a more capable terminal than the console - such as from a GUI
login session or remotely through ssh.  Most of the output that goes
to the console will also be logged by syslog - which I see you found
in a message you sent just a couple minutes ago.

If the system is behaving well and you want to poke around in it
without pausing the kernel, you can use mdb -k as well.  This can be
run from any root (or otherwise appropriately privileged) shell - you
don't have to be on the console.  The use of mdb -k on a live system
does not require that it was booted with the -k option.
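
For illustration only, here is a small wrapper around the two approaches just described.  The function name show_msgbuf is an assumption; the commands it runs (mdb -k with the ::msgbuf dcmd, and dmesg) are the ones named above.  It needs a root or suitably privileged shell for mdb -k.

```shell
# Hypothetical helper: print the kernel message buffer on a live system.
show_msgbuf() {
    if command -v mdb >/dev/null 2>&1; then
        # Same dcmd as at the kmdb prompt, fed to the live-kernel debugger.
        echo '::msgbuf' | mdb -k
    else
        # Fall back to dmesg when mdb is not available.
        dmesg
    fi
}
```

Because this runs in an ordinary shell, the output can be paged or saved, for example with show_msgbuf > /tmp/msgbuf.txt.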

Also, you can find services that were unable to start with svcs
-xv.  Each of them will have a log file (see the output of svcs -xv)
that may have more useful information.
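
A minimal sketch of pulling those log file names out of the report follows.  It assumes the log path appears on a "See:" line as an absolute path ending in .log, which is how svcs -xv output usually looks but is not guaranteed here; the function name failed_service_logs is invented for this example.

```shell
# Hypothetical helper: list log files mentioned in `svcs -xv` output (stdin).
failed_service_logs() {
    # Keep only "See:" lines whose target is an absolute path ending in .log;
    # URLs and man page references on other "See:" lines are skipped.
    awk '$1 == "See:" && $2 ~ /^\/.*\.log$/ { print $2 }'
}
```

Typical use would be svcs -xv | failed_service_logs, then reading the end of each listed file with tail.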

-- 
Mike Gerdts
http://mgerdts.blogspot.com/


Re: [osol-discuss] [b 133] Catching messages on reboot from failure

2010-10-27 Thread Matthias Pfützner
Let's turn the problem around:

What is the problem you are trying to get information on?

Perhaps that way we can help by describing or explaining it...

   Matthias

-- 
Matthias Pfützner | Tel.: +49 700 PFUETZNER  | Gerhard Schröder ist der
Lichtenbergstr.73 | mailto:matth...@pfuetzner.de | charakterloseste Heraus-
D-64289 Darmstadt | AIM: pfuetz, ICQ: 300967487  | forderer, den ich kenne.
Germany  | http://www.pfuetzner.de/matthias/ | Helmut Kohl, 1.3.1998


Re: [osol-discuss] [b 133] Catching messages on reboot from failure

2010-10-27 Thread Mike Gerdts
On Wed, Oct 27, 2010 at 9:31 AM, Harry Putnam rea...@newsguy.com wrote:
 Marion Hakanson hakan...@ohsu.edu writes:

 As someone else mentioned, most of this stuff will end up in the
 /var/adm/messages file.  Except, of course, for the very interesting
 lowest-level boot problems.

 yeah... I got a little carried away there... I had long ago setup a
 /var/adm/debug.log that catches everything log message issued. with

  *.debug              /var/adm/debug.log

 in /etc/syslog.conf

 Then forgot I had done so... The messages are there alright but now
 I'm finding they are not all that helpful... an example at the end.

 The reason why it's so primitive is that at this point (in Solaris,
 OpenSolaris, Linux, or Windows), you're still pretty much dependent
 on whatever features the BIOS provides you, which limits you just
 to the old-fashioned, low-tech text-only console.  There's just not
 enough software running yet to give you a GUI.

 I'm not sure what you mean here With the same bios, linux manages
 to provide a text console that is vastly more useful... ie, it has gpm
 running. I wasn't asking for a gui,  I like a text console... but do
 like to be able to mouse copy from the buffer.

 About the console messages; they were concerning how to find the
 output using fmdump, and using the `Event ID' to do so.

 Now armed with the Event IDs that had scrolled off the screen:

 I still learn nothing very useful, that is, unless by pounding away on
 man fmdump and probably a number of other man pages this can be made
 sense of.  But not at all obvious what any of this might mean:

  fmdump -v -u 4021154c-2d57-c54e-ae82-ea27fc2d19fa

 TIME                 UUID                                 SUNW-MSG-ID
 Oct 14 04:14:02.1292 4021154c-2d57-c54e-ae82-ea27fc2d19fa ZFS-8000-GH
  100%  fault.fs.zfs.vdev.checksum

        Problem in: zfs://pool=z3/vdev=e35825af18775fc2
           Affects: zfs://pool=z3/vdev=e35825af18775fc2
               FRU: -
          Location: -

It looks to me like you got a checksum error in a zpool named z3.


 Oct 25 17:06:24.7775 4021154c-2d57-c54e-ae82-ea27fc2d19fa FMD-8000-4M Repaired
  100%  fault.fs.zfs.vdev.checksum      Repair Attempted

        Problem in: zfs://pool=z3/vdev=e35825af18775fc2
           Affects: zfs://pool=z3/vdev=e35825af18775fc2
               FRU: -
          Location: -

It was repaired.


 Oct 25 17:06:24.8987 4021154c-2d57-c54e-ae82-ea27fc2d19fa FMD-8000-6U Resolved
  100%  fault.fs.zfs.vdev.checksum      Repair Attempted

        Problem in: zfs://pool=z3/vdev=e35825af18775fc2
           Affects: zfs://pool=z3/vdev=e35825af18775fc2
               FRU: -
          Location: -

And it has been resolved - you don't have to worry about it any more.

But, I would run zpool status z3 to confirm that it says everything
is healthy.  It may also be advisable to run zpool scrub z3 to have
zfs look for any more problems (hopefully all correctable) before they
pile up to a point where they aren't correctable.  If you keep having
problems, it may point to hardware problems (disk, cable, disk
controller, memory, etc.).
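
As a sketch of how the fmdump report ties back to those zpool commands, the filter below extracts pool names from the "Problem in: zfs://pool=.../vdev=..." lines shown above.  The function name fault_pools is made up for illustration.

```shell
# Hypothetical helper: print each pool named in fmdump -v output (stdin).
fault_pools() {
    # Capture the text between "pool=" and the next "/" on "Problem in:" lines,
    # then collapse repeats so each pool is listed once.
    sed -n 's|.*Problem in: zfs://pool=\([^/]*\)/.*|\1|p' | sort -u
}

# Possible use on the affected machine:
#   fmdump -v -u 4021154c-2d57-c54e-ae82-ea27fc2d19fa | fault_pools |
#       while read -r pool; do zpool status -x "$pool"; done
```

A scrub is left as a separate manual step, since it generates a lot of I/O and is better started deliberately.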

-- 
Mike Gerdts
http://mgerdts.blogspot.com/


Re: [osol-discuss] [b 133] Catching messages on reboot from failure

2010-10-27 Thread David Brodbeck
On Wed, Oct 27, 2010 at 7:39 AM, Mike Gerdts mger...@gmail.com wrote:
 The dmesg command can be run
 from a more capable terminal than the console - such as from a GUI
 login session or remotely through ssh.

If you need to use the text console and don't have another machine
handy, the screen utility can also be quite useful.  It offers
multiple virtual terminals, scrollback, and cut-and-paste
capabilities.  The commands take a little effort to learn (they're not
terribly intuitive) but this gives you most of the functionality of
the multiple text consoles you'd find on FreeBSD or Linux.  Screen
also lets you detach from a session and reattach to it later from
another terminal, which can be very useful sometimes.

-- 
David Brodbeck
System Administrator, Linguistics
University of Washington


Re: [osol-discuss] [b 133] Catching messages on reboot from failure

2010-10-27 Thread Harry Putnam
David Brodbeck bro...@uw.edu writes:

 On Wed, Oct 27, 2010 at 7:39 AM, Mike Gerdts mger...@gmail.com wrote:
 The dmesg command can be run
 from a more capable terminal than the console - such as from a GUI
 login session or remotely through ssh.

 If you need to use the text console and don't have another machine
 handy, the screen utility can also be quite useful.  It offers
 multiple virtual terminals, scrollback, and cut-and-paste
 capabilities.  The commands take a little effort to learn (they're not
 terribly intuitive) but this gives you most of the functionality of
 the multiple text consoles you'd find on FreeBSD or Linux.  Screen
 also lets you detach from a session and reattach to it later from
 another terminal, which can be very useful sometimes.

Yes, I know and use screen often.  Not a superuser by any means but
I'm familiar with it.

The trouble is that messages like those I referred to pop up as soon
as you log in (in console text mode).

Is it possible to make `screen' your login shell?  I've never looked
into that, but I've never heard of anyone doing it either.



[osol-discuss] [b 133] Catching messages on reboot from failure

2010-10-26 Thread Harry Putnam
Why is the console interface left so primitive?  Seems it would at
least have a usable mouse so one could have some chance of copy and
paste when what appear to be important messages are written to the
console.

This is only a problem, of course, if you cannot manage a GUI boot for
whatever reason, or if you are fairly dim-witted and not very
knowledgeable about where to find the problems enumerated in the
console messages, or about the Solaris logging and error-tracking scheme.

I have messages on my console that came following a complete power
failure and reboot.  I'm told to use fmdump to find information from
the `EVENT-ID', but of course those IDs have scrolled past the visible
console screen.

I see no way to dump the screen or otherwise copy the messages that
have scrolled past.  Further, even the ones still visible would require
taking them down in longhand and moving to a different machine to
log in again via ssh or such.  Seems ridiculous on the face of it to
have such a primitive console in 2010.

And finally, there is one helpful bit in the messages that tells me to
find more info at:

  sun.com/msg/FMD-8000-6U

That gets me redirected to:

https://identity.sun.com/amserver/UI/Login?org=self_registered_users&goto=http://sunsolve.sun.com/search/document.do?assetkey=1-67-FMD-8000-6U-1

which appears to be pretty useless for finding what these messages
might mean, and is apparently some mess made by the new Oracle crew.

It seems these console messages should at least conclude with a final
line telling the user where to find the messages that have been written
to the console.


Well, yeah, I suppose I'm supposed to know that already, but to my mind
the logging and error records are quite complicated on Solaris, or at
least they seem so given my Linux background.



Re: [osol-discuss] [b 133] Catching messages on reboot from failure

2010-10-26 Thread Cia Watson
On Tue, 26 Oct 2010 17:14:24 -0500
Harry Putnam rea...@newsguy.com wrote:

 Why is the console interface left so primitive?.
   Seems it would at
 least have a usable mouse so one could have some chance of copy paste
 when there are what appear to be important messages written to
 console.
 (stuff deleted)
Hi Harry,

I'm not sure if you were venting or asking a question, and if you were
asking a question I may or may not have a good answer for you.  But I also
have somewhat of a Linux background.  I think that OSol also has dmesg
like Linux does, but I'm not positive.  (That's where boot-up messages
would go; just type dmesg in a terminal.)  Otherwise, general system log
messages in OSol go in /var/adm/messages (as opposed to /var/log/messages
in Linux).  Or poke around in /var and/or /var/adm to see if there are any
other interesting log files.  I found the /var/adm location by poking
around because I knew there had to be some system log files around
somewhere.  Then you can open the log file in your favorite text editor
(GUI or vi or whichever).

HTH.

Cia W


Re: [osol-discuss] [b 133] Catching messages on reboot from failure

2010-10-26 Thread Marion Hakanson
rea...@newsguy.com said:
 Why is the console interface left so primitive?.
   Seems it would at least have a usable mouse so one could have some chance
 of copy paste when there are what appear to be important messages written to
 console. 

Hi Harry,

As someone else mentioned, most of this stuff will end up in the
/var/adm/messages file.  Except, of course, for the very interesting
lowest-level boot problems.
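
If the fault messages did reach the log, the event IDs that scrolled off the screen can probably be recovered from it.  The sketch below assumes each logged message carries a field of the form "EVENT-ID: <uuid>", as the console text suggests; the function name event_ids is invented for this example.

```shell
# Hypothetical helper: list unique EVENT-ID UUIDs found in log text.
# Reads the named files, or standard input when no file is given.
event_ids() {
    # A UUID is 36 characters of lowercase hex digits and hyphens.
    sed -n 's/.*EVENT-ID: *\([0-9a-f-]\{36\}\).*/\1/p' "$@" | sort -u
}

# Possible use:
#   event_ids /var/adm/messages | while read -r id; do fmdump -v -u "$id"; done
```

Running fmdump with no arguments should also list the recorded fault events with their UUIDs, which avoids the log file entirely.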

The reason why it's so primitive is that at this point (in Solaris,
OpenSolaris, Linux, or Windows), you're still pretty much dependent
on whatever features the BIOS provides you, which limits you just
to the old-fashioned, low-tech text-only console.  There's just not
enough software running yet to give you a GUI.

What you need to solve such scrolling-off-the-screen problems is
something even lower tech:  a serial (RS232, COM-port) console.
Not all desktop PC BIOSes can redirect their BIOS text to a COM
port, but Solaris and Linux can be told to do so for their system
console input and output.  And grub itself can be told to do this
as well.

Then you hook up your troublesome machine's serial (COM) port to
a working machine's serial port, fire up a terminal emulator (Windows
hyperterm will work; on Linux/Unix I would use conserver, but tip
will do in a pinch), and watch the console messages that way.

Of course, getting the serial ports wired correctly is a bit of an
art (you may need a null modem cable, for example), and there are
some boot-time flags you enter via grub to tell whatever kernel you're
booting to temporarily use a tty console.  Telling Google something
like "solaris boot serial console" turns up quite a few references.

Regards,

Marion




Re: [osol-discuss] [b 133] Catching messages on reboot from failure

2010-10-26 Thread Mike Gerdts
On Tue, Oct 26, 2010 at 9:09 PM, Marion Hakanson hakan...@ohsu.edu wrote:
 rea...@newsguy.com said:
 Why is the console interface left so primitive?.
   Seems it would at least have a usable mouse so one could have some chance
 of copy paste when there are what appear to be important messages written to
 console.

 Hi Harry,

 As someone else mentioned, most of this stuff will end up in the
 /var/adm/messages file.  Except, of course, for the very interesting
 lowest-level boot problems.

In which case you can edit the kernel line in grub to add a -k option
to the boot options.  If the system panics, you will be dropped to a
kmdb prompt.  You can manually enter kmdb with F1-A or shift-break
from a text console.  You can use Ctrl-Alt-F1 to shift to the text
console if you aren't there already.  Once at the kmdb prompt, you can
use ::msgbuf to see the things that have scrolled off the screen.  You
should get the output a page at a time.  If you need to provide this
information to mailing lists, take a somewhat low resolution picture
of it (e.g. a 1 megapixel picture from your mobile phone is most
likely quite adequate).

kmdb allows you to do many other interesting things, such as looking
at running processes (::ptree, ::ps, ...) or finding out why a
particular process is hung (::pgrep hungprocess | ::walk thread |
::findstack -v).  If you are inclined to dig into this further, I
would suggest perusing the mdb manual.  Pretty much everything that
works with mdb -k works with kmdb.  The key exception that I've
noticed is the lack of the ! operator to pipe dcmd output to a shell
command.  Considering that the OS is stopped while you are at a kmdb
prompt, it's not surprising that ! doesn't work.
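
On a live system the same pipeline can be fed to mdb -k from a shell.  The helper below only builds the dcmd string for a given process name; findstack_cmd is a made-up name for this sketch, and the dcmds are the ones listed above.

```shell
# Hypothetical helper: build the "why is this process hung" dcmd pipeline.
findstack_cmd() {
    # $1 is the process name to look up with ::pgrep.
    printf '::pgrep %s | ::walk thread | ::findstack -v\n' "$1"
}

# Possible use from a root shell:
#   findstack_cmd sshd | mdb -k
```

At a kmdb prompt the printed line would be typed in directly instead, since there is no shell to pipe from.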

The serial console advice below is also quite helpful if you have
suitable hardware.  Unfortunately, many systems these days lack a
serial port.  I suspect (without testing - I may be quite wrong) that a
serial port hanging off of a USB port will be a very poor/fragile
console.

 What you need to solve such scrolling off the screen problems is
 something even lower tech:  a serial (RS232, COM-port) console.
 Not all desktop PC BIOS'es can redirect their BIOS text to a COM
 port, but Solaris  Linux can be told to do so for their system
 console input  output.  And grub itself can be told to do this
 as well.

 Then you hook up your troublesome machine's serial (COM) port to
 a working machine's serial port, fire up a terminal emulator (Windows
 hyperterm will work;  On Linux/Unix I would use conserver, but tip
 will do in a pinch), and watch the console messages that way.

 Of course, getting the serial ports wired correctly is a bit of an
 art (you may need a null modem cable, for example);  And there are
 some boot-time flags you enter via grub to tell whatever kernel you're
 booting to temporarily use a tty console.  Telling Google something
 like solaris boot serial console turns up quite a few references.

 Regards,

 Marion






-- 
Mike Gerdts
http://mgerdts.blogspot.com/