I reported:

>> I'm puzzling at an odd performance behavior that I see with various
>> versions of the OpenIPMI drivers.
>>
>> This is on a Dell PowerEdge 1800.  The OS is VMware ESX 3.0.1,
>> specifically its Console OS portion which is a modified Linux 2.4.21
>> (based on RHEL3).  The IPMI drivers are various tweaked versions based
>> on Corey's v35, v37 and v39 releases.
>>
>> I don't believe the driver is using IPMI interrupts in any incarnation.
>> Some of the drivers I'm poking at have the "kipmi0" kernel thread, some
>> do not.  Performance differs greatly between those versions, but in all
>> cases there is an anomaly.
>>
>> This anomaly is visible when watching the output of `ipmitool sdr` or
>> especially `ipmitool sensor`.  For "sdr", what I see is that it takes an
>> exceptionally long time to read one particular sensor (ECC Corr Err).  For
>> "sensor", that sensor _and_ all subsequent ones are slow.

Matt Domsch wrote:

> The hardware doesn't have an interrupt line, so no, it's not using
> interrupts. :-)
>
> The kernel thread is there exactly to trade off spare CPU cycles for
> faster response time from the BMC when interrupts aren't present.
> Ugly, but functional.

Right, I understand what it's for, just trying to puzzle out its
performance characteristics on this hardware...

> It's not that unusual for some devices to take a long time to
> respond.  That's purely a function of the BMC routine responsible for
> reporting that data.  If it needs to walk the SEL counting entries,
> that could take a while. :-)  If you want, I can try to get a
> definitive answer from the BMC firmware team.

Well, I suspect you may be talking to them after my further results; see
below.

Corey Minyard wrote:

> I really doubt the driver is the problem here, at least directly.

There's some driver complicity, as I will describe...

> My guess is that reading an ECC sensor requires sending a machine check 
> to the main processor, then the main processor reads the value and 
> returns it.  Unless the BMC has some way to directly read the registers 
> in the northbridge (JTAG maybe?).  But either way, it will be a slower 
> process than most other sensor reading, I would guess.

You're probably right about this.  There's a qualitative difference
between the sensors that are fast (fans & temperatures -- sensors that
the BMC should have fairly direct access to) and slow (various CPU and/or
chipset counters like ECC error counts, parity error counts etc.)

So that may explain the _fact_ of the performance difference, if not the
_magnitude_.

> The driver is just a conduit.  The messages are all the same size, so it 
> seems unlikely that it is the driver.
>
> If you can do a LAN connection to the box, then you can bypass the 
> driver and test it that way.  Otherwise, you will need to instrument the 
> driver to know what is going on.

I don't know how to operate IPMI via LAN.  I'm sure it's possible in my
setup, but I haven't made any attempt in that direction.

...

So.  One reason I was pursuing this anomaly was that, as I said, it got
_much_ worse with some driver changes.  I have now gone back and serially
layered on all the patches I'm trying to integrate.

The cause of the extra slowdowns is the "Retryable return codes" patch,

 
http://www.mail-archive.com/[email protected]/msg00451.html

i.e.:

ipmi_msghandler.c:ipmi_smi_msg_received():

                if ((msg->rsp_size >= 3) && (msg->rsp[2] != 0)
                    && (msg->rsp[2] != IPMI_NODE_BUSY_ERR)
-                   && (msg->rsp[2] != IPMI_LOST_ARBITRATION_ERR))
+                   && (msg->rsp[2] != IPMI_LOST_ARBITRATION_ERR)
+                   && (msg->rsp[2] != IPMI_BUS_ERR)
+                   && (msg->rsp[2] != IPMI_NAK_ON_WRITE_ERR))

I instrumented this and found that the driver is getting lots of
IPMI_NAK_ON_WRITE_ERRs.  No IPMI_BUS_ERRs.  Each of the slow sensors
hits 5 (exactly 5) IPMI_NAK_ON_WRITE_ERRs before completing.  This
number 5 corresponds to ipmi_msghandler.c:i_ipmi_request():

                    if (addr->addr_type == IPMI_IPMB_BROADCAST_ADDR_TYPE)
                        retries = 0; /* Don't retry broadcasts. */
                    else
  -->                   retries = 4;

It's retrying the command 4 times (5 attempts in total) before failing.
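
To make the interaction of the two fragments concrete, here is a minimal
sketch of what the patched condition selects and how the retry count turns
into attempts.  The helper names (send_error_is_final, total_attempts) and
the "patched" flag are hypothetical and exist only for illustration; they
are not functions in the driver.  The numeric completion codes are the
values believed to be in the driver's ipmi_msgdefs.h and should be checked
against your tree.  That a non-final code leads to a retry is taken from
the behavior described in this message, not from the driver source.

```c
#include <stdbool.h>
#include <stddef.h>

/* Completion codes; values assumed to match ipmi_msgdefs.h. */
#define IPMI_NODE_BUSY_ERR         0xc0
#define IPMI_LOST_ARBITRATION_ERR  0x81
#define IPMI_BUS_ERR               0x82
#define IPMI_NAK_ON_WRITE_ERR      0x83

/*
 * Hypothetical helper mirroring the condition in the diff above.
 * Returns true when the completion code in rsp[2] satisfies the
 * condition (the send error is treated as final), false when the
 * code is 0 or one of the codes the condition excludes.
 * "patched" selects the condition with the two added codes.
 */
static bool send_error_is_final(const unsigned char *rsp, size_t rsp_size,
                                bool patched)
{
    unsigned char cc;

    if (rsp_size < 3)
        return false;
    cc = rsp[2];
    if (cc == 0 || cc == IPMI_NODE_BUSY_ERR ||
        cc == IPMI_LOST_ARBITRATION_ERR)
        return false;
    if (patched && (cc == IPMI_BUS_ERR || cc == IPMI_NAK_ON_WRITE_ERR))
        return false;
    return true;
}

/* Hypothetical helper: one initial send plus the configured retries. */
static int total_attempts(int retries)
{
    return retries + 1;
}
```

With a response whose rsp[2] is IPMI_NAK_ON_WRITE_ERR, the unpatched
condition treats the error as final, while the patched one does not, and
total_attempts(4) gives the 5 attempts seen per slow sensor.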

Even if I set retries = 0 here, it's still much slower than without
checking for IPMI_NAK_ON_WRITE_ERR.  Total runtime for `ipmitool
sensor` goes from 5s (no IPMI_NAK_ON_WRITE_ERR checking) to 24s
(checking + 0 retries) to 112s (checking + 4 retries).

Is it right that there's a 1s delay on the failure path, even when
it's on its last (re-)try?
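
As a rough consistency check on those three runtimes, assume a simple
model in which total runtime = baseline + (number of failing sensors) x
(attempts per sensor) x (delay per attempt).  The model and the helper
name stall_per_attempt_round are assumptions made for illustration; the
number of failing sensors was not counted, so only the product of sensor
count and per-attempt delay can be estimated.

```c
/*
 * Hypothetical helper: seconds of extra runtime attributable to one
 * attempt across all failing sensors, i.e. (failing sensors) x (delay
 * per attempt) under the simple model described above.
 */
static double stall_per_attempt_round(double total_s, double baseline_s,
                                      int attempts)
{
    return (total_s - baseline_s) / attempts;
}
```

Using the 5s run as the baseline, the 24s run (1 attempt) gives
(24 - 5) / 1 = 19s per round of attempts and the 112s run (5 attempts)
gives (112 - 5) / 5 = 21.4s.  The two estimates are close, which fits a
roughly constant delay on every attempt, the last one included.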

Anyway, for my setup, that patch is very harmful to performance.

Meanwhile, on the trail of what's happening with the hardware, I
return to a slice of my original output:

##PS Redundancy           |0x0   |discrete|0x0080|na|na|na|na|na|na
##Drive                   |0x0   |discrete|0x0080|na|na|na|na|na|na
##############ECC Corr Err|na    |discrete|na    |na|na|na|na|na|na
#####ECC Uncorr Err       |na    |discrete|na    |na|na|na|na|na|na
#####I/O Channel Chk      |na    |discrete|na    |na|na|na|na|na|na

I think those "na" outputs in column 4 mean that we're not getting
any information about those sensors.  I should have noticed that
earlier...

According to `ipmitool sdr list -v all`, all of the problematic
sensors correspond to "Entity ID: 34.6 (BIOS)".  None of the happy
sensors are provided by the BIOS.

For the moment I am omitting the "Retryable return codes" patch
from my working environment; then these sensors fail immediately
instead of each suffering five 1s timeouts.

Matt, is it expected that the BMC on a PE1800 can't get any sensor
readings from the BIOS?

>Bela<

_______________________________________________
Openipmi-developer mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/openipmi-developer
