reassign 416098 linux-image-2.6.18-4-amd64
retitle 416098 MegaRAID SAS adapter locks up
quit

* Steinar H. Gunderson

I'm unsure if this is a munin bug or a kernel bug; I'm filing against
munin, but it should perhaps be reassigned.

I installed munin on a Dell PowerEdge 2950:

  Setting up munin-node (1.2.5-1) ...
  Initializing plugins..

At this point, ssh froze. I connected through the remote admin console,
which was filled with messages like:

  megasas: [65] waiting for 140 requests to complete
  megasas: [70] waiting for 140 commands to complete
  megasas: [75] waiting for 140 commands to complete
  megasas: [80] waiting for 140 commands to complete

  I googled this error message, and it seems a lot of people have seen
 lockups with this SAS adapter (without any mention of Munin).  Unless
 you can reproduce this, I agree with Stephen that the kernel (or
 hardware) is the prime suspect, and that the lockup happening during a
 Munin installation was just random chance.  Therefore I'm reassigning
 the bug.

  To repeat the step the init script was performing at the time of the
 crash, you can run the following command:

    munin-node-configure --shell --debug | sh -x

  That does the same, but with more debug information.
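
  If the machine locks up again before the output can be read, one
 option is to write each line to disk as it arrives.  What follows is
 only a minimal sketch, assuming a POSIX shell; the log_synced helper
 and the log path are hypothetical and not part of munin.  Note that
 this variant only records the output and does not pipe it on to sh.

```shell
# Hypothetical helper (not part of munin): append each line read from
# stdin to the file named in $1, echo it to the terminal, and call sync
# after every line so the last entries are more likely to survive a
# hard lockup.
log_synced() {
    while IFS= read -r line; do
        printf '%s\n' "$line" | tee -a "$1"
        sync
    done
}

# Usage (assumed log path; both stdout and stderr are captured because
# it is not certain here which stream carries the debug output):
#   munin-node-configure --shell --debug 2>&1 | log_synced /root/munin-debug.log
```

  If the log file sits on a disk behind the affected controller, sync
 itself may block, so a path on another device is probably safer.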

The server never recovered, and I had to (remote) reset it. In other
words, some plugin in munin-node kills the megasas driver, which in turn
kills the entire server.

  Well, I'm not aware of any plugins that could possibly do this.  The
 only plugins I know of that run with elevated privileges and have
 something to do with block devices are smart_ and hddtemp_smartctl, but
 they both rely on the helper program "smartctl", so the bug would have
 to be in that program anyway.

* Stephen Gran

Disclaimer: I am not a munin maintainer.

  Would you like to be?  :-)  I really need a comaintainer or for
 someone to take over the package completely - way too little free time
 these days.

Regards
--
Tore Anderson

