from:"Tim Small"

Re: [Linux-PowerEdge] CVE-2020-5344

2020-04-10 Thread Tim Small

Hi,

BTW, I use this type of chroot solution to deploy updates which only
target other Linux OS versions (e.g. RHEL6) on servers which run Debian
10 and Ubuntu LTS.

This will generally work, but some updates which rely on a specific
kernel version (e.g. because they ship or use "out-of-tree" kernel modules)
may still fail.  In some cases if the "out-of-tree" kernel modules have
since been "upstreamed" and included in the kernel you are running on
the server, you can instead just use modprobe to load these kernel
modules (e.g. "dell_rbu") from outside the chroot, before running the
update.

I use the Debian "schroot" tool (which takes care of bind-mounting /proc
/dev /sys /home etc. - schroot is also available for Redhat I believe),
and pre-generated root archives from https://images.linuxcontainers.org/
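
For what it's worth, a rough sketch of the sequence described above (not a tested recipe). The chroot name and the update package path are placeholders, and DRY_RUN is a made-up switch that prints the commands instead of running them:

```shell
#!/bin/sh
# Sketch only: load an upstreamed module on the host, then run a Dell
# update package inside an schroot-managed chroot.
# The chroot name and package path are placeholders; DRY_RUN is a
# hypothetical switch that prints the commands instead of running them.

run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

run_dup_in_chroot() {
    chroot_name=$1
    dup_path=$2
    # Load the module from outside the chroot, before the update runs.
    run modprobe dell_rbu
    # schroot handles the bind mounts according to its own configuration.
    run schroot -c "$chroot_name" -u root -- "$dup_path"
}
```

Something like "DRY_RUN=1 run_dup_in_chroot centos7 ./update.BIN" shows what would be run without touching the system.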

HTH,

Tim.

On 09/04/2020 21:37, Yannick PALANQUE wrote:
> Hello,
>
> On 09/04/2020 22:12, miguel.cha...@dell.com wrote:
>> Is there a solution?
>
> I think maybe running the DUP from a chrooted installation of CentOS 7 
> could work? (you should copy a big tar.gz or something like that)
>
> But it must be like using a truck to move a cup of tea one meter away...
> ___
> Linux-PowerEdge mailing list
> Linux-PowerEdge@dell.com
> https://lists.us.dell.com/mailman/listinfo/linux-poweredge

-- 
South East Open Source Solutions Limited
Registered in England and Wales with company number 06134732.  
Registered Office: 2 Powell Gardens, Redhill, Surrey, RH1 1TQ
VAT number: 900 6633 53  http://seoss.co.uk/ +44-(0)1273-808309

___
Linux-PowerEdge mailing list
Linux-PowerEdge@dell.com
https://lists.us.dell.com/mailman/listinfo/linux-poweredge

Re: [Linux-PowerEdge] iDRAC6 2.92 on PowerEdge R210 II

2020-03-13 Thread Tim Small

On 13/03/2020 17:16, josh.mo...@dell.com wrote:
> This is probably a miss

Upgrading directly worked - thanks.

It'd be good to get this fixed for the other R210 II upgrade methods. 
Are you able to raise a bug for that?

Cheers,

Tim.



[Linux-PowerEdge] iDRAC6 2.92 on PowerEdge R210 II

2020-03-13 Thread Tim Small

Hello,

I wanted to update some Dell R210 II servers from iDRAC6 firmware 2.90
to 2.92.

Strangely I get:

# ./ESM_Firmware_KPCCC_LN32_2.92_A00.BIN
This Update Package is not compatible with your system

Your system: PowerEdge R210 II

System(s) supported by this package: R710, R815, T410, R715, R210, R510,
T310, R310, T610, R610, R410


Since the fix is purely a security update for remote access, I can't see
why it wouldn't be applicable to the R210 II (especially as the R210 is
listed).  I haven't tried exploiting the security problems that 2.90 ->
2.92 addresses, but I would be extremely surprised if they aren't
present on R210 II and v2.90.

Is this exclusion a mistake?

Thanks,

Tim.



[Linux-PowerEdge] Poweredge R230 IPMI BMC bug triggers frequent error messages on RHEL 7 kernel and others

2016-07-15 Thread Tim Small

Hello,

Our EL7 machines get about 5000 messages per day saying:

ipmi_si ipmi_si.0: Could not set the global enables: 0xcc.
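
To put a number on it, a small sketch along these lines counts the matching lines in a log file. The function name is made up, and the log path is whatever your syslog setup writes kernel messages to (often /var/log/messages on EL7):

```shell
#!/bin/sh
# Sketch: count occurrences of the ipmi_si warning in a given log file.
# The function name is illustrative; pass the log file as the argument.
count_ipmi_enable_errors() {
    # grep -c prints the number of matching lines (0 if there are none).
    grep -c 'Could not set the global enables' "$1"
}
```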

The OpenIPMI developers say:

"Some BMCs don't let you clear the receive irq bit in the global
enables.  This is kind of silly, but they give an error if you
try to clear it."

Ubuntu 16.04LTS (and other distros with kernel >4.0 or a backported
patch) say:

"The BMC does not support clearing the recv irq bit, compensating, but
the BMC needs to be fixed."

I was wondering whether this is a known bug on the Dell BMCs and, if so,
when a fix is planned?

# ipmitool bmc info
Device ID         : 32
Device Revision   : 1
Firmware Revision : 2.30
IPMI Version      : 2.0
Manufacturer ID   : 674
Manufacturer Name : DELL Inc
Product ID        : 256 (0x0100)



Cheers,

Tim.




Re: Oracle Enterprise Linux

2010-11-24 Thread Tim Small

On 24/11/10 11:38, Nick Lunt wrote:

 is Oracle Enterprise Linux supported on all Dell servers, along with
 open manage, firmware updates, disk array drivers etc ?


It's not supported AFAIK, but since it's essentially a
rebuilt-from-source Redhat Enterprise Linux with some kernel performance
tweaks - you are unlikely to see any issues with OEL that you won't also
see when running RHEL, CentOS etc. on Dell hardware.

Tim.


___
Linux-PowerEdge mailing list
Linux-PowerEdge@dell.com
https://lists.us.dell.com/mailman/listinfo/linux-poweredge
Please read the FAQ at http://lists.us.dell.com/faq

Re: Oracle Enterprise Linux

2010-11-24 Thread Tim Small

On 24/11/10 12:07, Tim Small wrote:

 It's not supported AFAIK, but since it's essentially a
 rebuilt-from-source Redhat Enterprise Linux with some kernel
 performance tweaks - you are unlikely to see any issues with OEL that
 you won't also see when running RHEL, CentOS etc. on Dell hardware.


Err, actually since the latest OEL 5 release uses the 2.6.32 kernel
(same as RHEL6), but the userspace is largely RHEL5, you are likely to
hit some packaging/repository issues when using the Dell RHEL6
repository on it, I'd guess.

http://lwn.net/Articles/406242/

... but since Dell supports both RHEL5 and RHEL6, I would have thought
things will largely work.  There is also some sort of agreement between
Dell and Oracle, however it looks like you need to go to Oracle for the
actual support, and it doesn't indicate which servers are supported:

http://www.oracle.com/us/corporate/press/161333

Tim.



Re: PERC 4e/Di errors on RHEL3 server

2010-11-04 Thread Tim Small

On 04/11/10 19:55, Eric Wood wrote:
 Can I get some help on what replacement controller to buy?
  
 I have a PE 2800 with a RAID-5 with three seagate 36gig drives on a
 PERC 4e/Di.  Currently all drives are ONLINE and working.  But the
 server has been up since 2004 and has crashed three times in recent
 days with various error messages. Currently all drives are ONLINE and
 working but
  


Oof, up since 2004 - it'll be like Swiss cheese from a security
vulnerability point of view.

More likely to be the drives than the controller I think - maybe the
controller is not coping very well with various failure conditions on
the drives.  I would try:

modprobe sg

then smartctl -a /dev/sgX

(if you can get smartctl installed on this box) - where X is the
relevant letter for each drive.  You may find one or more drives with
reallocated sectors (grown defect list etc.), also check the
offline-corrected ECC count etc.
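
If there are several drives, a loop saves some typing. This is only a sketch: SG_GLOB and DRY_RUN are made-up knobs (the latter prints the commands rather than running them), and it assumes the sg module has already been loaded as above:

```shell
#!/bin/sh
# Sketch: run "smartctl -a" against every SCSI generic node that exists.
# SG_GLOB and DRY_RUN are hypothetical knobs, not smartctl features.
# Assumes "modprobe sg" has already been done.
smart_report_all() {
    for dev in ${SG_GLOB:-/dev/sg*}; do
        # Skip the unexpanded pattern if nothing matched.
        [ -e "$dev" ] || continue
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "smartctl -a $dev"
        else
            smartctl -a "$dev"
        fi
    done
}
```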

After that you might want to look into upgrading the firmware on the
controller, or possibly replacing the entire machine with a new one
(which will use considerably less electricity - the 6 month electricity
and cooling cost of a server of that vintage often exceeds its total
value by several times).


Tim.


Re: Perc6/i does not want to upgrade firmware. suggestions?

2010-10-31 Thread Tim Small

On 31/10/10 13:09, Arno van der Veen wrote:
 Hello all,

 I upgraded all firmware manually as written earlier, but I really can't
 get the perc6/i upgraded in it's firmware.. :-(
   

Don't use Dell's buggy, overly-complex scripts (self-extracting shell
scripts, which then install RPMs - makes me feel nauseous just thinking
about the concept - what do you think this is?  Microsoft Windows?) -
get the file out of them, and run the update manually instead using megactl?

Cheers,

Tim.


p.s. If anyone from Dell happens to be reading - if you do insist on
these nauseating byzantine scripts, don't assume /bin/sh is a link to
/bin/bash cos it often isn't
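
A quick way to see what /bin/sh really is on a given box, as a sketch (the function name is made up; readlink -f is the GNU coreutils option):

```shell
#!/bin/sh
# Sketch: print the basename of whatever a shell path ultimately
# resolves to, following symlinks. Defaults to /bin/sh.
resolve_shell() {
    basename "$(readlink -f "${1:-/bin/sh}")"
}
```

If that prints something other than bash (dash on Debian and Ubuntu, for example), running the extracted script explicitly with bash instead of relying on its #! line is one workaround.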


Re: also OT:: Import Raid Config from Perc6/i to SAS 6i/R

2010-10-28 Thread Tim Small

I believe MegaRAID (and hence PERC 5 / 6) uses DDF - which is an open 
standard - but SAS6 (LSI 106x/107x SAS etc.) doesn't (BICBW).

dmraid and recent mdadm will allow you to read/modify the raw metadata.

If you are just looking to move the drives and don't mind about using 
the SAS6's raid features, then dmraid, or a recent mdadm may be the 
easiest solution - the drives would just be a JBOD, with the RAID 
implemented in the Linux kernel (using either dm, or md).

Tim.



On 28/10/10 11:59, Gregor Friedrich wrote:
 Hello

 also OT, sorry

 is there a way to import (or modify and import) the raid config metadata
 on disk from Perc6i to SAS 6i/R

 Thanks Gregor





Re: using a PERC6/i (MegaRaid SAS 1078) in JBOD mode?

2010-10-25 Thread Tim Small

On 22/10/10 13:52, Louis-David Mitterrand wrote:

 On my PowerEdge 2900 III I have a PERC6/i (MegaRaid SAS 1078). From the
 controller bios it seems I have to create a raid0 virtual drive for
 each physical disk in order for them to appear in Linux.

 As I intend to use none of that controller's raid features, is it at all
 possible to switch it to plain JBOD mode?


AFAIK no, but you could script the creation of the 
whole-disk-single-drive-raid0 from within Linux, and if you need to plug 
the drives into another Linux box, dmraid (and also recent mdadm) will 
understand the meta data format, and so allow you to access the data.

Cheers,

Tim.




Re: problems with LSISAS2008 6Gb/s SAS kernel mpt2sas driver

2010-10-21 Thread Tim Small

On 21/10/10 08:31, Louis-David Mitterrand wrote:
 Hi,

 I am setting up a new Dell T610 server with 8 WD Black Caviar sata3 1TB
 disks on a LSISAS2008 controller:

   Oct 21 09:12:37 grml kernel: [   83.377388] mpt2sas0: LSISAS2008: FWVersion(02.15.63.00), ChipRevision(0x02), BiosVersion(07.01.09.00)

 My layout is as follows:

 - small un-encrypted raid1 boot partition on /dev/md0

 - dm-crypt main partition on /dev/md1 (actually /dev/mapper/cmd1)

 A recent grml64 is used to create the partitions, install the system and
 run lilo.

 When running lilo I get these errors from the controller:

   Oct 21 08:57:11 grml kernel: [40832.015207] mpt2sas0: fault_state(0x265d)!
   Oct 21 08:57:11 grml kernel: [40832.015210] mpt2sas0: sending diag reset !!



 Any suggestion on fixing that problem would be welcome. I can send more
 complete logs.


Looks like a firmware bug - do you have the latest firmware?  Drive 
firmwares?  Anything in the drive error logs (using smartctl)?

If not, then try opening a bug on the kernel bugzilla - LSI engineers 
read that (and sometimes even fix things).

Otherwise, you could try replacing with a straight SATA controller, if
that box doesn't have a SAS backplane - I've not been too impressed by
the quality of engineering for LSI controllers, and SATA-on-SAS in
general hasn't been very reliable IMO.  Just go for a well supported
SATA controller (e.g. Sil 3132 etc.).

Tim.



Re: EDAC on Dell servers

2010-10-20 Thread Tim Small

On 20/10/10 19:11, Alexander Dupuy wrote:
 This is the first time I have heard of this. When you refer to Dell
 ESM are you talking about OMSA, or the onboard firmware (ESM = embedded
 system management?) of the BMC/DRAC?
   


At the moment everything is racy when it comes to the EDAC registers -
it'd be nice to have a firmware API, so that EDAC etc. could tell the
firmware to leave those registers alone (if that's what the user wants).

On some newer Intel chips, I believe the EDAC registers are only visible
from the CPU's System Management Mode, so Linux doesn't even get a look
in.  Bah, yet more closed sourceness...

Cheers,

Tim.


Re: R310 BMC woes

2010-10-15 Thread Tim Small


On 15/10/10 20:53, Drew Weaver wrote:


We don't want the management NIC or traffic to be visible to the 
operating system for security reasons.




It won't be - the OS never sees traffic destined for the BMC, it goes
directly from the NIC chip to the BMC, and doesn't hit the PCIe bus
which is attached to the main computer, so it's pretty much invisible
(and it'd be encrypted anyway, and you can enforce encrypted-only
comms).  The BMC gets its own MAC and IP address (these days - older
IPMI implementations did it differently).


I use IPMI and SoL exclusively - don't bother with the DRACs - don't need
them (and you can't get them on Tyans or Intels or HPs, whereas I can
use ipmitool on everything).


If you're looking for low cost, did you consider the R210s?

Tim.


Re: perc6i alignment?

2010-10-14 Thread Tim Small

On 14/10/10 02:27, Eugene Vilensky wrote:
 One more question...is there anything to be concerned about regarding
 on disk geometry or does the PERC do the right thing automatically
 when using OEM drives?


Nearly all drives have 512-byte sectors, so these won't be a problem.  WD
have been shipping 1.5TB and 2TB drives with 4k sectors for a while, and
Hitachi is now also (just) doing so.  In the case of the WD drives, they
lie that they have 512-byte physical sectors, because if they don't
various BIOSes and software break (dunno about the Hitachis).

So, if you're using those drives, then the PERC had better align its
user-visible data on 4096-byte boundaries, otherwise write performance
will go down the toilet (an unaligned write of 4k of data will result in
2 reads and 2 writes instead of a single write).

Dunno if it does or not, I guess you could pull a drive and use dmraid to
work out what it's doing (the RAID metadata it uses is an open standard).
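
The arithmetic is easy to check by hand once you know where the data starts. A minimal sketch, assuming you already have the start offset in 512-byte sectors (the function name is made up):

```shell
#!/bin/sh
# Sketch: report whether a start offset, given in 512-byte sectors,
# falls on a 4096-byte boundary.
is_4k_aligned() {
    start_sector=$1
    if [ $(( start_sector * 512 % 4096 )) -eq 0 ]; then
        echo aligned
    else
        echo unaligned
    fi
}
```

For example, a start at sector 63 (the traditional DOS partition offset) comes out unaligned, whereas sector 64 or 2048 comes out aligned.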

Tim.


Re: 2.5 vs 3.5 drive performance?

2010-10-11 Thread Tim Small

On 08/10/10 23:23, Dave Sparks wrote:
 Anyone still buying their servers with 3.5 drives?


Everyone who needs large capacity storage?  If you need performance, for 
most applications SSDs would seem to be a better idea than 2.5 drives, no?

This is based on real-world prices for the drives, I haven't checked 
Dell's comedy figures recently...

Tim.


Re: ordering a T610: what options?

2010-09-30 Thread Tim Small

On 30/09/10 14:00, Louis-David Mitterrand wrote:
 - what Raid Connectivity (C0 to C16) should I select?


Probably doesn't matter, you can just reconfig it when you get it.

 - which Raid Controller is the best in plain JBOD mode? Which will
allow 'smartctl' to monitor the individual disks' health?



I believe the PERC H200 and SAS6 both use the same chip, but the SAS6 
may be more straightforward, however be aware of:

http://bugzilla.kernel.org/show_bug.cgi?id=14831

personally, I've not had good experiences of using SATA drives with 
these controllers under Linux - I'd prefer a straight AHCI controller 
like the Intel ICH10, or SiI3132 etc. instead of SATA-on-SAS...

Tim.


Re: 3.5in SAS flash drive for R900?

2010-09-10 Thread Tim Small


On 09/09/10 18:57, Philip Tait wrote:


Will the PERC-6i do TRIM passthrough?  If not, you'll need to use an
AHCI controller such as the server's onboard Intel SATA controller
etc.
(this is what we are using).

Does your kernel and filesystem support TRIM too?



Is the availability of TRIM support critical to the operation of these 
drives?


I believe all SSDs will suffer degraded performance over time, and/or
excessive wear without TRIM - they end up doing extra writes in order to
(unnecessarily) preserve the contents of deleted logical blocks.  Same
goes for running them pretty much full all the time, even with TRIM.
However the Intel drives suffer less from a lack of TRIM support than
some other designs do (i.e. performance degrades, but doesn't drop
through the floor).


This is basically due to the current generation of SSDs pretending to be
a hard disk - when really they are flash (which has some very different
physical properties).  Probably the best solution from an engineering
point of view would be to use a flash file system (Linux has several -
and they are designed to suit the physical properties of flash
storage), and have the SSDs expose the raw underlying flash.  This,
however, is not the route the industry is taking.  TRIM is a piece of
gaffer/duct-tape to fix this problem by providing a mechanism for the OS
to tell the SSD which logical blocks it no longer needs to work to
preserve.


Without TRIM a workaround is to periodically discard all data by doing 
an ATA security-erase-unit, but this might not fit in with your 
anticipated usage.
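

To see whether the kernel thinks a device (as presented through the controller) accepts discards at all, something like this sketch can help. It assumes a kernel that exposes queue/discard_max_bytes in sysfs; the function name and the optional sysfs-root argument are made up for illustration:

```shell
#!/bin/sh
# Sketch: report whether the block layer advertises discard (TRIM)
# support for a device, based on queue/discard_max_bytes in sysfs.
# Assumption: the running kernel exposes that attribute; if it is
# missing the answer is "unknown" rather than "no".
# The second argument (sysfs root) is a hypothetical hook for testing.
supports_discard() {
    dev=$1
    sysfs_root=${2:-/sys}
    f="$sysfs_root/block/$dev/queue/discard_max_bytes"
    if [ ! -r "$f" ]; then
        echo unknown
        return 0
    fi
    if [ "$(cat "$f")" -gt 0 ]; then
        echo yes
    else
        echo no
    fi
}
```

This only tells you what the block layer advertises for that device. It says nothing about whether the filesystem is actually issuing discards.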


bcache also looks very interesting, but is currently alpha-quality:
http://bcache.evilpiepirate.org/


BTW, the machine I'm using currently uses TRIM with both ext4, and btrfs 
on an Intel SSD, with an AHCI controller (Intel ICH10).


Tim.



Re: Advice for a debian server

2010-09-06 Thread Tim Small

On 06/09/10 13:57, Emmanuel Lesouef wrote:
 Reading this list, I sometimes see threads complaining about
 compatibility issue between recent Dell servers and Debian GNU/Linux.

 What is the most stable and compatible model in order to install Debian
 Stable (Lenny) ?


Anything without a SASx controller (LSI 1068 / LSI 1068E etc.), in my 
experience (just spent the morning recovering a Lenny server following 
data corruptions with these crappy controllers).

I'm quite happy with the option of using software RAID along with the 
onboard SATA controllers on the R210, R310, R410.  Not sure if they do a 
hot-swap option with SATA-only (i.e. non-SAS) configurations, but if 
they don't and you need this, then I can recommend the Intel Server 
Systems instead (e.g. Intel SR1630HGPRX,  SR1695GPRX etc.) - they are 
engineered to a similar quality, and you don't end up paying through the 
nose for large hard disks.

Make sure you use a recent Lenny kernel for the bnx2 NICs in the R210 
etc. to work - you'll need to have the firmware-bnx2 package installed.  
The Lenny installer images probably have the kernel patch in by now, but 
if they don't, just install using a USB NIC, or similar and then update 
to the latest kernel post-install.

Tim.


Re: 16tb filesystems on linux

2010-08-27 Thread Tim Small


On 27/08/10 09:18, Andrew Robert Nicols wrote:
As I say, we're primarily a Debian shop and Solaris did used to feel 
like a bit of a thorn in the side but things have improved.


Did you consider/try ZFS on Debian-kFreeBSD instead of OpenSolaris to 
try and make things less painful?


http://packages.debian.org/sid/zfsutils


Cheers,

Tim.


Re: Dell OpenManage 6.3 for Ubuntu

2010-08-23 Thread Tim Small

On 23/08/10 15:05, Johan Sjöberg wrote:
 I then tried to install it on Debian testing (Squeeze). On that version, it 
 was only the libsmbios-utils package that stopped me from installing.

The Ubuntu smbios-utils package has the same contents as the Debian 
libsmbios-bin package, AFAIK.

Tim.


Re: external ESATA drives in a R610

2010-08-23 Thread Tim Small

On 19/08/10 22:50, Bond Masuda wrote:

 Have you actually tested hot plugging a eSATA drive from a controller based
 on a Silicon Image chip that also has a PCI-E to PCI-X bridge?

Nope.

 i have one
 such card, which works as long as I don't hot plug. if I do hot plug, I get
 a machine check and instant shutdown/restart...

Not seen anything like that myself, sounds like that might be an 
electrical issue, or a design fault on that card maybe?  If you still 
have access to that setup - maybe try connecting ground on the drive 
chassis, and the PE first...  Were you able to decode the machine-check?

Cheers,

Tim.



Re: external ESATA drives in a R610

2010-08-19 Thread Tim Small

The Intel ICH10R (as used in the R210 R310 R410 R510) does support SATA
port multipliers (in which case I'd suggest a simple header to bring the
motherboard SATA out to the back-panel), but the R610 seems to use the
ICH9, which doesn't.  The Intel ICH are very good SATA chips, and will
get you a very high throughput, but as Silicon Image make most of the
port multiplier chips, they may have better compatibility.

The Sil3132s work well, but have somewhat limited throughput (300M per
second each I think), but this may not be an issue to you depending on
your application.  The Sil3124s may be a bit better - they are 4-port
chips, and although this is a PCI-X chipset you can find implementations
which put them behind a PCIe to PCI-X bridge chip - have a look on ebay
or elsewhere, they are around $70 or so, I think.

It looks like the card you found is one of those (3124 + pcie bridge),
although the fan+HS on it looks like a load of BS to me (it's on the PCI
bridge chip unless I'm mistaken - and is almost certainly there to make
it look cool only), and the price is very high..

You can find more detailed info including port multiplier (PM) support
here: https://ata.wiki.kernel.org/index.php/Hardware,_driver_status

Also ask on the linux-ide mailing list...


 I'm interested in adding hotplug ESATA capability with port multiplier
 for backup purposes to something like an R610.   Is this supported by
 any of the Dell external SAS controllers (don't need raid for this).


I don't think any SAS controllers support SATA port multipliers.  You can
use SAS multipliers instead, but I don't like LSI's SAS cards, and have
had loads of trouble with them in the past (e.g. hardware bugs in the
LSI1068E etc.) - just stick with plain commodity (well supported, and
cheaper) SATA IMO.

Let us know what you come up with...

Tim.


p.s. I found that the R300s have a bug whereby they don't reliably
detect Sil3132s.


Re: PowerEdge 2800: megaraid/scsi errors (PERC 4e/di)

2010-08-05 Thread Tim Small

On 05/08/10 08:00, Marc Petitmermet wrote:
 megaraid: aborting-12854 cmd=2ac=2 t=0 I=0
 megaraid abort: [255:128], driver owner
 megaraid: resetting the host...


 What do the above errors mean? Are the disks failing or is this an other 
 hardware issue?

 Any advise would be greatly appreciated.


I'd want to take a closer look at the general health of the drives 
themselves (grown defect list, ECC correction count, uncorrectable error 
count and the like) using a tool like smartctl - recent smartmontools 
releases have support for looking at drives behind perc 4s - search for 
megaraid in:

http://smartmontools.sourceforge.net/man/smartctl.8.html

alternatively if getting smartctl onto this box is fiddly (and you can 
easily take the drives offline) it might be easier to plug the drives 
into a plain SCSI controller on a more modern box...
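
As a sketch of the megaraid route: smartctl takes "-d megaraid,N" where N is the physical disk number on the controller. The loop below simply tries 0 to count-1, which is an assumption (the IDs on a given box may not be contiguous), and DRY_RUN is a made-up switch that prints the commands instead of running them:

```shell
#!/bin/sh
# Sketch: query drives behind a MegaRAID-family controller via smartctl.
# Arguments: the block device of a logical drive on the controller and
# how many physical disk numbers to try. Trying 0..count-1 is an
# assumption; real disk IDs may differ. DRY_RUN is hypothetical.
smart_megaraid() {
    dev=$1
    count=$2
    n=0
    while [ "$n" -lt "$count" ]; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "smartctl -a -d megaraid,$n $dev"
        else
            smartctl -a -d "megaraid,$n" "$dev"
        fi
        n=$((n + 1))
    done
}
```

For a three-drive array on /dev/sda that would be "smart_megaraid /dev/sda 3".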

HTH,

Tim.


Re: LCD Display on R710

2010-08-05 Thread Tim Small

On 17/12/09 19:00, Alexander Dupuy wrote:
 you can use [...] the Dell-modified ipmitool 
 (http://linux.dell.com/files/openipmi/ipmitool/ipmitool-1.8.6-13.4.DELL.13.src.rpm
  or
 https://launchpad.net/ubuntu/karmic/+source/ipmitool):


 # ipmitool delloem lcd help

 lcd set {mode}|{lcdqualifier}|{errordisplay}

 lcd set mode {none}|{modelname}|{ipv4address}|{macaddress}|
 {systemname}|{servicetag}|{ipv6address}|{ambienttemp}
 {systemwatt }|{assettag}|{userdefined}text  

 [...and lots more...]



Sorry to resurrect an ancient thread, but out of interest which version 
are you running to get those options?  I used the binaries from 
launchpad (1.8.9+patches and 1.8.11+patches), and also compiled the 
1.8.6 version from the src rpm, and I only get:

ipmitool-with-dell-hacks/ipmitool-1.8.6$ ./src/ipmitool delloem lcd help

lcd set {none}|{default}|{custom text}
   Set LCD text displayed during non-fault conditions

lcd info
   Show LCD text that is displayed during non-fault conditions


... and that's all.

Tim.

p.s. Any word on when Dell is going to sort their patches out and get 
them in upstream ipmitool so we don't have to put up with this bollocks?


Chassis identify light status checking

2010-08-05 Thread Tim Small

Hi,

I was wondering if there's a way to get the status of the chassis
identify light on recent PowerEdges?  If memory serves me correctly,
this pops out in amongst the output from "ipmitool chassis status" on
Intel SRxxx servers, and although I can turn the light on with "ipmitool
chassis identify force", I can't see a way of checking the light's
status, and no IPMI events seem to be generated...

Any ideas?

Cheers,

Tim.


Re: Safe to run smartctl on PowerEdge 2950 Dell PERC 5/E?

2010-07-29 Thread Tim Small

dellpowere...@semantico.com wrote:
> Hi List,
>
> I've been reading about the smartctl problems people have been having. I
> couldn't see my controller listed in any of the threads. OS is RHEL5.4
> smartmontools is smartmontools-5.38-2.el5

Safe as far as I know, but if you have a test system which you can
hammer using smartctl whilst under I/O load for a few hours, I'd
recommend doing that first.  I'd be interested to see the results, and
have been meaning to do this myself.

To be on the safe side, perhaps you'd want to review the controller and
disk firmware changelogs (from the Dell firmware update packages) too, to
see if there have been any passthrough or SMART related bug fixes since
the versions that you have installed (assuming they aren't the latest) -
however the risk of a manual one-off run of smartctl is likely to be far
lower than that of running smartd continuously...

Whilst I'm at it DELL:

PLEASE can you include the full release history in your changelogs, not
just the changes since the last release - chasing through 10 package
releases to piece together all the changes from an oldish release is
just PAINFUL.  Other vendors (e.g. Intel) include the full history, and
it makes my life a LOT easier.

Cheers,

Tim.


Intel VT-d (IOMMU) support on poweredge servers

2010-07-22 Thread Tim Small

Hi,

I'm trying to find out which PowerEdges support Intel VT-d (Directed
I/O), so that I can pass through a PCIe SCSI controller directly to a
virtualised W2K3 instance using KVM.  It looks to me like the PE1950
doesn't, but I can't seem to find a definitive answer...

Anyone have that info?

Cheers,

Tim.
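
On a machine you can get a shell on, one way to check is to look for the ACPI DMAR table, which the firmware exports when VT-d is present and enabled in the BIOS. This is a minimal sketch, assuming a Linux host with /sys mounted; the function names are made up for illustration, and a DMAR table only shows that the platform advertises VT-d, not that passing through a particular card will work.

```shell
# Succeeds if the kernel log text on stdin mentions a DMAR table.
log_mentions_dmar() {
    grep -q 'DMAR'
}

# Succeeds if the firmware exports a DMAR table via sysfs.
# The optional argument overrides the table directory (handy for testing).
firmware_has_dmar() {
    [ -e "${1:-/sys/firmware/acpi/tables}/DMAR" ]
}
```

Typical use would be "dmesg | log_mentions_dmar && echo DMAR seen" or "firmware_has_dmar && echo DMAR table exported".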


Re: [Fwd: [Bug 616517] libvirt should not use the MAC address assigned to tap devices/vnet interfaces by the TAP/TUN driver.]

2010-07-21 Thread Tim Small

Andreas Rogge wrote:
> The bridge-interface will lose network
> connectivity because it changes its MAC.
   


Perhaps this is a bug in the Linux bridging code?  Shouldn't "brctl
setportprio" be used to force the MAC of the bridge to be the same MAC
as the physical port - then it wouldn't change...  It didn't work the last
time I tried this, though (which I thought was probably a bug), and it's not
particularly well documented, so I may have the wrong end of the stick here.

Tim.




Re: Serial over LAN

2010-07-16 Thread Tim Small

Rahul Nabar wrote:
>> ** First thing you should do in BMC setup is reset to default. The BMCs
>> often ship with a weird non-default setting that will cause lots of
>> serial port feedback if you try to run a getty on the serial console.
>
> Would ipmitool, or syscfg or another equivalent tool have a way of
> setting the BMC SOL? I already set the BMC's IP and password using
> ipmitool.
>
> I have ~300 machines so doing this step manually via Ctrl+E is tedious.




SoL can be set up and used quite easily using ipmitool.  Disabling
IPMI-over-serial (without rebooting and entering the BMC-control bit of
the BIOS) is another matter, but this seemed to work on the
PE1950s which I have access to:

ipmitool raw 0x06 0x40 0x02 0xb8 0x84
ipmitool raw 0x06 0x40 0x02 0x78 0x44

AFAIK this issue was only present on some PE1950s, and was probably a
problem with Dell's production-time automated testing system (IPMI over
serial - not SoL - was set up to carry out various tests, but not turned
off again prior to shipping).
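
For a few hundred machines, the two raw commands could be generated per BMC rather than typed by hand. This is only a sketch: it assumes the raw commands behave the same when sent over the LAN with the lanplus interface as they do locally (not confirmed in this thread), that all BMCs share one user name, and that the password is in the IPMI_PASSWORD environment variable (which ipmitool reads when given -E). The function name and the "root" default are placeholders.

```shell
# For each BMC address on stdin, print the two raw commands quoted above,
# addressed to that BMC. Nothing is executed; review the output first.
emit_serial_fix() {
    while read -r bmc; do
        [ -n "$bmc" ] || continue          # skip blank lines
        for args in '0x06 0x40 0x02 0xb8 0x84' '0x06 0x40 0x02 0x78 0x44'; do
            echo "ipmitool -I lanplus -H $bmc -U ${IPMI_USER:-root} -E raw $args"
        done
    done
}
```

With one BMC address per line in a file, "emit_serial_fix < bmc-hosts.txt" prints the commands for inspection; piping that output to sh would run them.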

HTH,

Tim.


Re: 1-2TB SATA drives currently shipping with 11G servers + onboard SATA options?

2010-07-09 Thread Tim Small

Peter Kjellstrom wrote:
> We've seen both Hitachi and Seagate 2T drives from Dell.
   

Great, thanks.  Any idea of the model and firmware numbers (preferably
from "hdparm -I", or otherwise "cat /proc/scsi/scsi", "lsscsi" etc.)?
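
A small sketch for pulling just those two fields out of "hdparm -I" output, assuming the usual "Model Number:" and "Firmware Revision:" labels; model_and_firmware is a made-up helper name.

```shell
# Read "hdparm -I" output on stdin and print the model number and the
# firmware revision, one per line, with the column padding removed.
model_and_firmware() {
    awk -F ':' '
        /Model Number:|Firmware Revision:/ {
            value = $2
            gsub(/^[ \t]+|[ \t]+$/, "", value)   # trim padding
            print value
        }'
}
```

Usage would be along the lines of "hdparm -I /dev/sda | model_and_firmware", run as root.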

Cheers!

Tim.


Re: 1-2TB SATA drives currently shipping with 11G servers + onboard SATA options?

2010-07-08 Thread Tim Small

Blake Hudson wrote:
> Tim Small wrote:
>> I know they've previously been shipping Seagate ST31000340NS 1TB drives,
>> but I've no idea which vendor/model 2TB drives they're using?
>
> I recently purchased a couple Dell branded 2TB SATA drives. They are
> Hitachi's - HDS72202A28A.

Interesting - thanks for that - any idea if they were pulls from
server-class or desktop-class Dell hardware?  The reason I ask is that
whilst Google comes up blank for that part number, the non-Dell
2TB Hitachi Deskstar (i.e. desktop-class drive) is the HDS722020ALA330,
whereas their Ultrastar drives have the part number HUA722020ALA330...

Anyone else had any 1TB or 2TB SATA drives in Poweredges?

Cheers,

Tim.



1-2TB SATA drives currently shipping with 11G servers + onboard SATA options?

2010-07-07 Thread Tim Small

Hi,

I'm planning on purchasing a couple of 11G servers (prob R210 / R310 /
R410) for a client, and was wondering which 1-2TB drives Dell are
shipping with these boxes?

I know they've previously been shipping Seagate ST31000340NS 1TB drives,
but I've no idea which vendor/model 2TB drives they're using?

Also:

If you order the "On-board SATA Controller, Min. 1 Max. 4 SATA Only
Cabled Drives" type options (which is what I'm planning on doing given
all the bollocks I've had from LSI controllers recently - give me a
plain old just-works-and-fast Intel ICH10 any day) - do the servers
actually ship with all the cables and mounting hardware for adding bare
drives to non-populated non-hotswap drive bays?

... i.e. does the R310 with that option selected plus just a single SATA
drive actually ship with the brackets and cables to fit a bare 3rd party
SSD of my choice (such as the Intel ones) at a later date (I realise that
I might need to supply a suitable 2.5" to 3.5" bracket, and maybe a few
screws myself), or would I have to hunt around the net at great
length/expense to do that (or just buy the min-size SATA drives with the
server, and discard them when I needed to put in the SSD)?

The online configurator for the R410 no longer seems to have the
on-board SATA controller option - anyone know if it's still actually
available (it definitely used to be there, and is in the R410 technical
guide book)?

Thanks!

Tim.



Re: dell 2850 initrd problem.

2010-06-08 Thread Tim Small

Ron Croonenberg wrote:
> So if I issue:
> e2fsck -f -b 32768 /dev/mapper/VolGroup00-LogVol01
   


Hi,

If you're running filesystem checks, I'd strongly suggest running them
on a backup image, or (easier to achieve and quicker) running them
against a copy-on-write snapshot (see my post of about a month ago about
RAID recovery).

You might want to try the ext3-users mailing list or similar if you need
more help putting the filesystem back together.

Tim.


Re: how to find which server to buy

2010-06-08 Thread Tim Small

Tapas Mishra wrote:
> I want to know how to decide as which server of Dell will be able to
> sustain request to a particular application.
> That is what should be CPU frequency Quad Core or 2 Cpu or RAM .If it
> makes sense then stress testing of a Dell Server for my application.
> Assuming I know the  maximum load of my application.
   

Err - simulate your application on some other hardware - a desktop PC is
often fine as a starting point, and see when the following things
saturate, and the overall application performance becomes unacceptable
(e.g. latency or whatever is appropriate to measure your application
performance):

. CPU
. RAM
. disk
. network bandwidth

... then do some rough calculations using:

. The figures that you've just got from your benchmarks.

. The approximate difference between your test system, and the system
that you are thinking of buying.


... that should get you pretty close.
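
As a rough illustration of that last step, with made-up numbers: if the test box saturates at 400 requests per second, and the limiting resource on the candidate server is about 2.5 times larger, the estimate is a simple scaling. The function name and the figures are hypothetical, and real scaling is rarely perfectly linear.

```shell
# Scale a throughput figure measured on the test box by the estimated
# ratio between the candidate server and the test box, given in percent.
scale_capacity() (
    measured_rps=$1     # requests/s at saturation on the test box
    ratio_percent=$2    # candidate vs. test box, e.g. 250 means 2.5x
    echo $(( measured_rps * ratio_percent / 100 ))
)

scale_capacity 400 250   # prints 1000
```

Repeat the estimate for each resource (CPU, RAM, disk, network) and take the lowest result as the likely bottleneck.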



Tim.


Re: Serial over LAN

2010-05-21 Thread Tim Small

Adam Nielsen wrote:
> Perhaps you can answer something that's been bugging me for some time.
> How does this actually work?  I mean, what gets sent over the wire when
> you redirect a serial port?
>
> It's always bugged me that there's not much information about how this
> is done, and it seems to use a bit too much magic for my liking.  I
> mean, is it TCP?  Can you restrict access to it with a firewall?  How
> does it share the network card with the host OS, in the cases where you
> use the one NIC for both?
   


The IPMI BMC is a complete autonomous embedded computer on the
motherboard. It has various connections to the main computer, but is
otherwise distinct from it (runs all its own code, and has its own CPU
and RAM). For shared LAN access it also (typically) has a "backdoor"
into the NIC chip, so that it can tell the NIC to - for example - pass it
all traffic destined for a certain MAC address (earlier implementations
were even more strange - in that they could set the NIC up to do things
like steal all UDP traffic to the IPMI port).

The IPMI over LAN protocol is implemented as UDP (on port 623) - look at
the "LAN INTERFACE" and "LANPLUS INTERFACE" entries in a recent ipmitool
manual page for details.

With SOL, Linux sends serial data to the serial port - the output of
this serial port is then connected to the BMC which receives the traffic
on its own serial port, encapsulates it as IPMI lanplus SOL UDP packets,
and sends it out via the NIC backdoor...

Because of the way that the BMC goes straight-to-the-NIC, any iptables
firewalls under Linux aren't going to see the traffic - so you'd need to
do any firewalling before the traffic hits the NIC (i.e. outside of the
box). Another alternative is to configure the BMC to only communicate on
a separate VLAN, so that you can isolate it from other traffic using
that mechanism instead (e.g. ipmitool lan set X vlan id 888).
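
A sketch of what the "firewall outside the box" option might look like, assuming the upstream router happens to be a Linux machine using iptables; the function name and the addresses are placeholders. It only prints the rules, so they can be reviewed before being applied.

```shell
# Print FORWARD rules that allow IPMI-over-LAN (UDP port 623) to one BMC
# from a single admin network, and drop it from everywhere else.
bmc_firewall_rules() (
    bmc=$1          # BMC IP address
    admin_net=$2    # network allowed to reach the BMC, in CIDR form
    echo "iptables -A FORWARD -p udp -s $admin_net -d $bmc --dport 623 -j ACCEPT"
    echo "iptables -A FORWARD -p udp -d $bmc --dport 623 -j DROP"
)
```

For the VLAN alternative, "ipmitool lan print 1" (with the channel number adjusted to suit) shows the VLAN ID currently in effect after a "lan set".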

Tim.


Re: Serial over LAN

2010-05-21 Thread Tim Small

Jefferson Ogata wrote:
> Hmm, my impression was that Redirection After Boot only affects comms
> up to the point of getting a bootloader running, but I'm not 100% sure.
   

I believe (but again, I could be wrong), that with redirection after
boot enabled, the BIOS polls the VGA character buffers using a timer
interrupt. This mechanism continues to work until Linux switches the CPU
to protected mode very early in the kernel boot process - i.e. it works
in DOS and the boot loader - and (as has been mentioned already) it
relies on things like the boot loader NOT being configured to
communicate with the serial port directly themselves.

Personally, I normally enable redirection after boot, and disable the
native serial comms in grub.

Tim.



Re: Serial over LAN

2010-05-21 Thread Tim Small

Case van Rij wrote:
> I have this configured on 45 R410s with iDRAC Express and have tried
> it in the past on 50 R210s and a handful of 1950s and I have to say,
> even once configured it's actually rather frustrating to use and it's
> no substitute for a cyclades-like serial console server.
>
> I initially tried to use this with the ethernet port on my regular
> switch (even though the ethernet port was dedicated to PXE and IPMI,
> all normal traffic is using an add-on 10G card) and the management
> port would simply stop responding to network traffic on a daily basis.
> I've since moved the ethernet to an isolated switch and it's
> marginally better, but the serial-over-LAN is still so unresponsive
> that remote management automation frequently times out while trying to
> manage the server.
   

I've found performance on R210s to be good.  1950s and PE860s less so.
I've found reliability pretty good using recent ipmitool builds.  Here's
how I have an R210 set up (I think these were just the defaults).

# ipmitool sol info 1

Character Accumulate Level (ms) : 50
Character Send Threshold: 255
Retry Count : 7
Retry Interval (ms) : 480
Volatile Bit Rate (kbps): 115.2
Non-Volatile Bit Rate (kbps): 115.2
Payload Channel : 1 (0x01)
Payload Port: 623

You may want to try:

1. Using a recent ipmitool, if you're not already.
2. Fiddling with the first 4 values.  I'm guessing that things like the
retry interval could come down on a LAN - half a second to retry a
dropped packet seems like a long time at first consideration...  It
would perhaps have been nice if IPMI did stuff over TCP rather than UDP,
but when it was defined it didn't do SoL, so I suppose UDP was good
enough, and nice and light-weight for doing other (non-interactive) IPMI
comms.
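
A sketch of how the accumulate level and retry interval might be changed, assuming the "ipmitool sol set" units of 5 ms steps for character-accumulate-level and 10 ms steps for retry-interval (which matches the 50 ms and 480 ms shown above if the stored values are 10 and 48). The helper name is made up, and any values other than the defaults are untested guesses.

```shell
# Print "ipmitool sol set" commands for an accumulate level and a retry
# interval given in milliseconds, converting to the units ipmitool expects.
sol_tuning_cmds() (
    accumulate_ms=$1    # character accumulate level, in ms
    retry_ms=$2         # retry interval, in ms
    channel=${3:-1}     # IPMI channel, 1 as in "sol info 1" above
    echo "ipmitool sol set character-accumulate-level $(( accumulate_ms / 5 )) $channel"
    echo "ipmitool sol set retry-interval $(( retry_ms / 10 )) $channel"
)
```

For example, "sol_tuning_cmds 50 100" keeps the accumulate level as it is and brings the retry interval down to 100 ms; check the result afterwards with "ipmitool sol info 1".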

HTH,

Cheers,

Tim.



Re: Serial over LAN

2010-05-20 Thread Tim Small

Jefferson Ogata wrote:
> On 2010-05-20 16:14, Marc Moreau wrote:
>> I'm looking to setup Serial over Lan on my cluster of PowerEdge 1950's.
>> Does anyone have this setup?
>
> Sure. Use it all the time.
   

Yes - here too - using ipmitool on Debian. I use pretty much the same
setup with Dell PE 860/1950/R210, Intel SR1500/SSR212MC2, and a couple
of different Tyan boxes too.

Dell's IPMI implementation is reasonable, but for some weird reason,
they want to develop proprietary tools to manage it - what's wrong with
just working with the existing open source tools? They seem to work
pretty well, and have less wacky interfaces than the Dell-proprietary
stuff. I *really* don't get this policy.

> 2. In BMC setup (control-E during POST), enable serial over LAN, set IP
> and password.
>
> ** First thing you should do in BMC setup is reset to default. The BMCs
> often ship with a weird non-default setting that will cause lots of
> serial port feedback if you try to run a getty on the serial console.

Actually, a guy I work with (David from Positive Internet) recently
fixed that on PowerEdge 1950s with:

ipmitool raw 0x06 0x40 0x02 0xb8 0x84
ipmitool raw 0x06 0x40 0x02 0x78 0x44

He got this from tracing what the Dell proprietary binary did.

Cheers,

Tim.


Re: Manually reconstructing a RAID5 array from a PERC3/Di (Adaptec)

2010-05-12 Thread Tim Small

On 06/05/10 08:47, Support @ Technologist.si wrote:
> Hi tim,
> You gave yourself a hell of a job..
> Below here are some links.. the last 2 links are linux ways to go..
>
> http://forum.synology.com/enu/viewtopic.php?f=9&t=10346
> http://www.diskinternals.com/raid-recovery/
> http://www.chiark.greenend.org.uk/~peterb/linux/raidextract/
> http://www.intelligentedu.com/how_to_recover_from_a_broken_raid5.html


Ta to those who sent along some tips...

In the end, I did manage to persuade the controller to put the array 
back together (succeeded on the second attempt, after restoring the 
drive metadata from the backups I'd taken).  Part of the reason that I 
didn't try this originally is that I didn't have access to any spare 
SCSI/SCA drives, or the original RAID controller either!

Once I had access to the original block device, I created a COW snapshot 
in order to run fsck.ext3 on the filesystem without actually triggering 
any writes to the array (I think a write caused by replaying the journal 
killed the array the first time around).

Here are some handy instructions on using dmsetup to do this:

http://www.thelinuxsociety.org.uk/content/device-mapper-copy-on-write-filesystems

... which would also be handy in the case of any other file-system 
corruption, and is a lot faster than copying around image files!
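
In outline, the snapshot approach comes down to a device-mapper table with a "snapshot" target that sends all writes to a separate COW device. This is a minimal sketch rather than the exact commands I ran; the device names, the sparse-file size and the helper name are placeholders.

```shell
# Print a device-mapper table for a non-persistent snapshot of an origin
# device, with all writes diverted to a separate COW device.
snapshot_table() (
    origin=$1     # block device to protect (never written to)
    cow=$2        # block device that absorbs the writes
    sectors=$3    # origin size in 512-byte sectors
    # N = non-persistent snapshot, 8 = chunk size in 512-byte sectors
    echo "0 $sectors snapshot $origin $cow N 8"
)

# Possible use, as root (not run here):
#   dd if=/dev/zero of=/tmp/cow.img bs=1M count=0 seek=2048   # sparse file
#   cow=$(losetup -f --show /tmp/cow.img)
#   snapshot_table /dev/sdb "$cow" "$(blockdev --getsz /dev/sdb)" \
#       | dmsetup create fsck_cow
#   fsck.ext3 -f /dev/mapper/fsck_cow
```

Once finished, "dmsetup remove fsck_cow" and "losetup -d" on the loop device throw the changes away.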



Before that I tried the following method using Linux software RAID to 
reconstruct the array (which nearly worked):

. Take images of the 5 drives
. Work out how big the metadata is (assuming it's at the beginning of 
the drives):

for i in {0..1024} ; do dd if=/mnt/tmp/raid_0 skip=$i | file - ; done

... etc. for all 5 drive images.

. Create read-only loop-back devices from the drives using:

losetup -r -o 65536 /dev/loop0 /mnt/tmp/raid_0

... having found a valid MBR 64k into one of the drives - so assuming 
the Adaptec aacraid controller metadata was on the first 64k of the 
disk.  The loop device skips over this first 64k using the offset 
argument above.

. Create a set of 5 empty files (to hold the Linux md metadata) using 
dd, and set these up as loopX as well.
. Create a set of RAID appends (without metadata) using:

./mdadm --build /dev/md0 --force -l linear -n 2 /dev/loop0 /dev/loop10

etc. - with the idea that a to-be-created-later md RAID5 device will put
its (version 0.9) metadata into the (read/write) files which make up
the end of these RAID append arrays.  It would be handy if you could
create software RAID5s without metadata, but you can't - they wouldn't
be much practical use except for this sort of data-recovery purpose, I
suppose.

. Create a set of degraded md RAID5s using commands like:

./mdadm --create  /dev/md5 -e 0.9 --assume-clean -l 5 -n 5 /dev/md0 
/dev/md1 /dev/md2 /dev/md3 missing

... for all possible permutations of 4 out-of the 5 drives, plus one 
missing (actually it tried the all-5-drives running layouts as well, but 
I disregarded these to be on the safe side).

http://www.perlmonks.org/?node_id=29374

perl permutations.pl /dev/md0 /dev/md1 /dev/md2 /dev/md3 /dev/md4 \
    missing | xargs -n 6 ./attempt.sh 2>&1 | tee output2.txt

Where attempt.sh looks like this:

#!/bin/bash

lev=5
for layout in ls la rs ra
do
  for c in 64
  do
    echo
    echo
    echo
    echo "level: $lev  alg: $layout  chunk: $c  order: $1 $2 $3 $4 $5"
    # Build a candidate array; --assume-clean stops md resyncing the members
    echo y | ./mdadm-3.1.2/mdadm --create /dev/md5 -e 0.9 \
        --chunk=${c} -l $lev -n 5 --layout=${layout} --assume-clean \
        $1 $2 $3 $4 $5 > /dev/null 2>&1
    # Only try the read-only fsck if the partition table shows a swap partition
    sfdisk -d /dev/md5 2>&1 | grep 'Id=82' && sleep 4 && fsck.ext3 -v -n /dev/md5p1
    mdadm -S /dev/md5
  done
done


... so this assembles a v0.9 metadata md array (which puts its metadata 
at the end), and then looks for a Linux swap partition in the partition 
table, and tries a read-only fsck of the data partition.
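
The 64k offset used for the loop devices earlier came from spotting a valid MBR that far into one of the images. A small sketch of that kind of check, which looks for the 0x55 0xAA boot signature in the last two bytes of a given 512-byte sector; has_mbr_signature is a made-up helper name, and the signature alone doesn't prove that the sector really is an MBR.

```shell
# Succeeds if the given 512-byte sector of an image file ends with the
# 0x55 0xAA boot signature.
has_mbr_signature() (
    image=$1
    sector=$2
    sig=$(dd if="$image" bs=1 skip=$(( sector * 512 + 510 )) count=2 2>/dev/null \
          | od -An -tx1 | tr -d ' \n')
    [ "$sig" = "55aa" ]
)

# Possible use against one of the drive images (not run here):
#   i=0
#   while [ "$i" -le 1024 ]; do
#       has_mbr_signature /mnt/tmp/raid_0 "$i" && echo "candidate MBR at sector $i"
#       i=$(( i + 1 ))
#   done
```

A hit at sector 128 corresponds to the 65536-byte offset passed to losetup.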

A chunk size of 64 seemed to be the default for the BIOS but I did
originally try others.  Anyway, this came up with two layouts which
looked kind-of-OK (which is what I was expecting, as I assume that first
one drive failed, then a second); both used the left-asymmetric parity layout.

... but e2fsck came up with loads of errors, and although the directory
structure ended up largely intact, the contents of most files were wrong
- so there must be something else which is a bit different about the way
that these aacraids lay out their data - maybe something discontinuous
about the array or something?  After I'd completed the job, I didn't
have time to compare the linux-software-raid reconstructed image with
the aacraid-hw-raid reconstructed version, but this would be easy enough
to do using some test data.

I've posted this detail here in case someone is faced with having to 
attempt a similar job again, but can't get the controller to put the 
data back together - or perhaps someone who is trying this with drives 
from a different HW raid controller - in which case this method might 
Just Work (tm).

Similarly if anyone else can see anything

Re: Manually reconstructing a RAID5 array from a PERC3/Di (Adaptec)

2010-05-12 Thread Tim Small

On 12/05/10 14:59, J. Epperson wrote:
> Not that I'd attempt anything like this short of a
> national security issue or forensics for a particularly heinous crime


Would running a server with:

. No RAID array status monitoring, and..
. No backups at all

... be sufficiently heinous?


Tim.


Re: Manually reconstructing a RAID5 array from a PERC3/Di (Adaptec)

2010-05-05 Thread Tim Small

Tim Small wrote:
> Here's a diff between the hex-dump of the first 128 sectors of two of
> the drives
   



--- /tmp/scsi-SSEAGATE_ST336607LC_3JA760WM.raw.hd	2010-05-05 22:18:02.0 +
+++ /tmp/scsi-SSEAGATE_ST336607LC_3JA763SY.raw.hd	2010-05-05 22:18:02.0 +
@@ -1,56 +1,54 @@
   56 19 02 00 1e 00 00 00  10 00 00 00 f6 cd 3c 04  |V..|
-0010  00 00 02 00 7d d3 9e 6c  00 00 00 00 00 00 00 00  |}..l|
+0010  00 00 02 00 5a 96 f3 61  00 00 00 00 00 00 00 00  |Z..a|
 0020  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ||
 *
-01f0  00 00 00 00 00 00 00 00  00 00 00 00 9e 55 f8 0a  |.U..|
-0200  c4 55 00 00 d3 30 04 bd  01 f0 fa fa 33 03 00 00  |.U...0..3...|
+01f0  00 00 00 00 00 00 00 00  00 00 00 00 14 8f e2 44  |...D|
+0200  c4 55 00 00 d3 30 04 bd  01 f0 fa fa 2c 03 00 00  |.U...0..,...|
 0210  00 00 00 00 2c 00 00 00  00 00 00 00 2c 00 00 00  |,...,...|
-0220  00 00 00 00 32 03 00 00  00 00 00 00 00 00 00 00  |2...|
+0220  00 00 00 00 2b 03 00 00  00 00 00 00 00 00 00 00  |+...|
 0230  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ||
 *
-03f0  00 00 00 00 00 00 00 00  00 00 00 00 72 3c 4f d8  |rO.|
+03f0  00 00 00 00 00 00 00 00  00 00 00 00 72 3c 6f 23  |ro#|
 0400  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ||
 *
 0c00  98 19 13 04 00 00 01 00  d3 30 04 bd 01 f0 fa fa  |.0..|
 0c10  ff ff ff ff ff ff ff ff  00 00 00 00 ff ff ff ff  ||
 0c20  ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff  ||
 0c30  ff ff ff ff ff ff ff ff  ff ff ff ff 00 00 00 00  ||
 0c40  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ||
 *
 0df0  00 00 00 00 00 00 00 00  00 00 00 00 eb 72 03 35  |.r.5|
 0e00  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ||
 *
 1200  99 25 03 00 d3 30 04 bd  01 f0 fa fa 2c 00 00 00  |.%...0..,...|
 1210  14 00 00 00 40 00 00 00  00 00 00 00 40 00 00 00  |@...@...|
-1220  00 00 00 00 32 03 00 00  00 00 00 00 00 04 05 ff  |2...|
-1230  d0 45 97 46 00 00 01 00  02 00 01 00 04 00 00 00  |.E.F|
+1220  00 00 00 00 2b 03 00 00  00 00 00 00 00 04 05 ff  |+...|
+1230  d0 45 97 46 00 00 03 00  02 00 01 00 04 00 00 00  |.E.F|
 1240  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ||
 *
-1ff0  00 00 00 00 00 00 00 00  00 00 00 00 0b 02 09 e9  ||
-2000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ||
+1ff0  00 00 00 00 00 00 00 00  00 00 00 00 0f 02 29 ea  |..).|
+2000  00 00 00 00 01 96 01 6c  62 45 43 05 d3 30 04 bd  |...lbEC..0..|
+2010  01 f0 fa fa 22 03 00 00  80 00 00 00 00 c5 3c 04  |..|
+2020  d3 30 04 bd 01 f0 fa fa  03 00 00 00 03 00 00 00  |.0..|
+2030  04 00 00 00 00 00 00 00  05 00 00 00 01 00 00 00  ||
+2040  1a 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ||
+2050  00 00 00 00 00 00 00 00  01 00 00 00 80 00 00 00  ||
+2060  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ||
+2070  00 00 00 00 00 00 00 00  00 00 00 00 03 00 00 00  ||
+2080  00 00 00 00 00 00 00 00  01 01 00 00 ff ff 00 00  ||
+2090  52 35 20 53 79 73 74 65  6d 20 20 20 20 20 20 20  |R5 System   |
+20a0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ||
 *
-2200  00 00 00 00 01 96 01 6c  62 45 43 05 d3 30 04 bd  |...lbEC..0..|
-2210  01 f0 fa fa 33 03 00 00  80 00 00 00 00 c5 3c 04  |3..|
-2220  d3 30 04 bd 01 f0 fa fa  02 00 00 00 02 00 00 00  |.0..|
-2230  04 00 00 00 00 00 00 00  05 00 00 00 02 00 00 00  ||
-2240  1a 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ||
-2250  00 00 00 00 10 00 00 00  01 00 00 00 80 00 00 00  ||
-2260  00 00 00 00 00 00 00 00  00 00 00 00 00 00 08 00  ||
-2270  01 00 00 00 01 00 00 00  00 a6 03 a0 02 00 00 00  ||
-2280  00 00 00 00 00 00 00 00  01 01 00 00 ff ff 00 00  ||
-2290  52 35 20 53 79 73 74 65  6d 20 20 20 20 20 20 20  |R5 System   |
-22a0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ||
-*
-23f0  00 00 00 00 00 00 00 00  00 00 00 00 0d d4 d2 4b  |...K|
-2400  00 00 00 00 00 00 00 00  00 00 00

SAS5 SAS6 etc. DMA alignment (hardware/firmware bug) - can Dell investigate this one please?

2010-05-04 Thread Tim Small

Hi,

With respect to:

http://lkml.org/lkml/2010/4/26/335

https://bugzilla.kernel.org/show_bug.cgi?id=14831

It would appear that it's unsafe to carry out ATA-passthrough operations 
on SAS5* and SAS6* controllers (1068 / 1068E and others).

This definitely affects smartctl, and I'm assuming it has the potential 
to impact operations such as drive firmware updates - locking up the 
controller during a firmware update seems like it could be a Bad Thing.

Could someone at Dell take a look at this issue and see if their Linux 
firmware update packages could be impacted?  If they are impacted, I 
guess they should be withdrawn until a workaround can be put in place...

Thanks,

Tim.


Re: how to get rid of bad blocks in a file on PERC 5/I?

2010-05-01 Thread Tim Small

On 01/05/10 07:16, Adam Nielsen wrote:
> disks, as long as the hardware RAID controller can keep up with the disks
> there would be no difference in performance.


I remember reading a benchmark which showed that under random I/O
patterns, Linux software RAID performed better on a (from memory)
8-disk RAID5, due to better use of SCSI scatter/gather.  I think this was
vs. MegaRAID, but it was a while ago.  I've not carried out any benchmarks
myself, and have no idea whether this goes for SATA NCQ as well...

Tim.



Re: how to get rid of bad blocks in a file on PERC 5/I?

2010-04-30 Thread Tim Small

Adam Nielsen wrote:
> I believe that when hard disks discover they have a bad sector
> they attempt to remap it themselves, but it may not always happen right
> away.  So it's possible that by the time you rebuilt the array the
> sectors had been relocated.
   

I believe the standard behaviour is:

. Read and apply simple (fast/hardware-implemented AKA "online") error
correction.

. If that fails, try to use more complex (slow/firmware-implemented
AKA "offline") ECC - retry this a (usually configurable) number of times.

. In the case of successful correction (we have the user data),
write the data back to the sector, and then read-check it to see if it
was written successfully.

   . If the re-read-verify is OK, then continue as normal (maybe
   increment one of the SMART counters).

   . If the re-read-verify fails, then reallocate the sector
   (use a spare "hidden" reserved sector elsewhere on the disk).  Increment
   the SMART reallocated sector count.

. If the offline ECC fails, then we've really lost data, so
return a read-error to the disk controller and mark the sector as
"pending" - attempting to read the sector again may restart the
offline correction attempts.




If the controller later tries to WRITE to that sector instead of reading
it, then the drive will do the write, and verify step again as above
with the new data (i.e. see if the data can then be read, and if-not
then reallocate it).
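
To make the read and write paths described above a bit more concrete, here is a toy model in Python.  It is only a sketch of my understanding of typical drive behaviour, not any vendor's firmware logic, and every class, field and function name in it is made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Sector:
    """Toy model of one sector; all field names are hypothetical."""
    online_ecc_ok: bool       # fast hardware ECC can recover the data
    offline_ecc_ok: bool      # slow firmware ECC can recover the data
    media_ok: bool            # a fresh write to this sector reads back cleanly
    pending: bool = False     # would show up in the SMART pending count
    reallocated: bool = False # would show up in the SMART reallocated count


def rewrite_and_verify(sector: Sector) -> str:
    """Write data to the sector, read it back, and reallocate on failure."""
    sector.pending = False
    if sector.media_ok:
        return "rewritten in place"
    sector.reallocated = True  # a spare sector is used instead
    return "reallocated"


def read_sector(sector: Sector) -> str:
    """Follow the read path: online ECC, then offline ECC, then give up."""
    if sector.online_ecc_ok:
        return "ok"
    if sector.offline_ecc_ok:
        # We still have the user data, so write it back and verify it.
        return rewrite_and_verify(sector)
    sector.pending = True
    return "read error"


def write_sector(sector: Sector) -> str:
    """A later write supplies new data, so the sector can be fixed or moved."""
    return rewrite_and_verify(sector)
```

The point of the model is that an unreadable sector stays "pending" until something writes to it, which is why a RAID rebuild (or the controller rewriting reconstructed data) can make the bad blocks appear to vanish.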

In the case of a RAID controller, standard practice is for the
controller to reconstruct the data from the other drives, and then issue
the write instruction back to the original drive.  The better RAID
implementations (e.g. Linux software RAID) will actually REPORT THIS TO
YOU when it happens, so that you know the drive may be unwell.  To make
matters worse, you can't even reliably check the SMART data yourself with
some of the Dell/LSI controllers - and LSI/Dell don't seem to care
enough to fix this...

https://bugzilla.kernel.org/show_bug.cgi?id=14831

> However given the subsequent failures I would think that the drive may
> actually be fine - maybe you can run a self test on it without going
> through a RAID controller.

Using smartctl to check what's gone on with the drive itself would
be the best thing to do, I think...  Recent smartctl has support for
communicating with drives behind PERCs.
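
As a rough sketch of what that looks like in practice: smartctl's "-d megaraid,N" device type addresses physical drive N behind a MegaRAID-based controller.  The helper names below are made up, and the drive ID and block device used in the usage note are assumptions which you will need to adjust for your own box.

```python
import subprocess
from typing import List


def smartctl_args(block_device: str, megaraid_id: int) -> List[str]:
    """Build a smartctl command line for a drive behind a MegaRAID/PERC.

    block_device is any device node exported by the controller, and
    megaraid_id is the controller's ID for the physical drive.
    """
    return ["smartctl", "-a", "-d", f"megaraid,{megaraid_id}", block_device]


def smart_report(block_device: str, megaraid_id: int) -> str:
    """Run smartctl (needs root and smartmontools installed)."""
    result = subprocess.run(
        smartctl_args(block_device, megaraid_id),
        capture_output=True,
        text=True,
    )
    # smartctl uses its exit status as a bit mask, so a non-zero status
    # is not necessarily a failure to read the drive.
    return result.stdout
```

For example, smart_report("/dev/sda", 0) would query the first physical drive, assuming the controller numbers its drives from 0; if that ID doesn't answer, trying the next few IDs is usually enough to find the drives.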


> I don't know whether the situation has improved in recent years, the
> experiences were enough to persuade me to switch to software RAID which
> I have stuck with ever since.

ACK.  My conclusion is also to use AHCI, and software RAID.  It's more
reliable generally, and if you do find a bug, the maintainers are
responsive (or you can even fix it yourself, or pay someone else to -
this is Open Source right?  Presumably that's why people use Linux in
the first place?).  Oh, and it's cheaper too.

Tim.



Re: PE2950, LSI SAS - SATA very slow

2010-04-30 Thread Tim Small

Adam Nielsen wrote:
> Hi all,
>
> We have a PowerEdge 2950 with an LSISAS1068E controller, hooked up to
> two Seagate 1TB SATA disks.  For some reason the performance of these
> disks is quite poor - I'm lucky to get over 10MB/sec from them, when I
> should be getting closer to 100MB/sec.

I've seen flaky/unpredictable performance with these LSI controllers. 
Also some Seagate 1TB drives have performance bugs which affect
sequential reads - so you could try using alternative drive firmwares,
as the bug seems to have been fixed in the non-Dell firmware (see my
recent posts to this list - still no response from Dell on this issue!).

Tim.


Re: PE2950, LSI SAS - SATA very slow

2010-04-30 Thread Tim Small

Adam Nielsen wrote:
> We have a PowerEdge 2950 with an LSISAS1068E controller, hooked up to
> two Seagate 1TB SATA disks.  For some reason the performance of these
> disks is quite poor - I'm lucky to get over 10MB/sec from them, when I
> should be getting closer to 100MB/sec.

Sorry - forgot to say... You could try disabling NCQ on the disks
(lsiutil, or in the BIOS menu, I think), to see if that helps sequential
reads.


Tim.


Re: Hot disk change.

2010-04-20 Thread Tim Small

Fabio Catunda wrote:

> I'm really worried to know if Linux would be able to read the old disk
> connected on a different SATA port.


dmraid knows about the metadata formats of various hardware RAID
controllers, so do recent mdadm 3.x tools, I think.

Tim.


Seagate ST31000340NS firmware MA0D - Barracuda ES.2 1TB poor sequential read performance with NCQ enabled

2010-04-13 Thread Tim Small

Hi,

I'm seeing poor sequential read performance (as measured using hdparm
-t) using the recommended Dell firmware version MA0D on Barracuda ES.2
1TB drives using libata with ICH SATA controllers in AHCI mode with NCQ
enabled (PowerEdge R210, R410 etc.).  Similar performance bugs are seen
during RAID verifies, rebuilds etc.

If I put the non-Dell firmware version AN05 on the drives, the 
performance bug goes away.  Similarly if I reduce the NCQ depth to 2, 
sequential read performance is restored, but as this may impact 
performance under random I/O loads, I don't really want to do this...
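
For anyone wanting to try the queue depth workaround, libata exposes the NCQ depth through sysfs as /sys/block/<disk>/device/queue_depth.  A minimal sketch follows; the function names are made up, and the sysfs_root parameter only exists so that the code can be tried against a scratch directory.

```python
from pathlib import Path


def queue_depth_path(disk: str, sysfs_root: str = "/sys") -> Path:
    """Path of the queue depth attribute for a disk such as 'sda'."""
    return Path(sysfs_root) / "block" / disk / "device" / "queue_depth"


def get_queue_depth(disk: str, sysfs_root: str = "/sys") -> int:
    """Read the current queue depth."""
    return int(queue_depth_path(disk, sysfs_root).read_text().strip())


def set_queue_depth(disk: str, depth: int, sysfs_root: str = "/sys") -> None:
    """Write a new queue depth (needs root on a real system).

    With libata a depth of 1 effectively turns NCQ off.
    """
    if depth < 1:
        raise ValueError("queue depth must be at least 1")
    queue_depth_path(disk, sysfs_root).write_text(f"{depth}\n")
```

So set_queue_depth("sda", 2) applies the depth I tested with.  The setting does not survive a reboot, so it would have to be reapplied from a boot script.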

Seeing as Seagate seem to have fixed this issue for other non-Dell
firmwares, any chance they could be persuaded to do so for the Dell
firmware series too?

More here:

https://ata.wiki.kernel.org/index.php/Known_issues#Affected_devices_2

Thanks,

Tim.


Dell R300 with Silicon Image 3132 PCIe SATA/eSATA card - doesn't work with BIOS v1.4.3

2010-03-24 Thread Tim Small

Just a quick BIOS bug report -

I have this device working on a Dell R300 with BIOS version 1.2.0, but 
when I upgrade the R300 BIOS to 1.4.3, it doesn't show up in the PCI 
device listings.  I tried with Silicon Image BIOS versions 7.4.05 and 
7.7.02.

07:00.0 RAID bus controller: Silicon Image, Inc. SiI 3132 Serial ATA Raid II Controller (rev 01)
        Subsystem: Silicon Image, Inc. Device 7132
        Flags: bus master, fast devsel, latency 0, IRQ 16
        Memory at dfcfbf80 (64-bit, non-prefetchable) [size=128]
        Memory at dfcfc000 (64-bit, non-prefetchable) [size=16K]
        I/O ports at dc80 [size=128]
        Expansion ROM at dfc0 [disabled] [size=512K]
        Capabilities: [54] Power Management version 2
        Capabilities: [5c] Message Signalled Interrupts: Mask- 64bit+ Queue=0/0 Enable-
        Capabilities: [70] Express Legacy Endpoint, MSI 00
        Capabilities: [100] Advanced Error Reporting ?
        Kernel driver in use: sata_sil24
        Kernel modules: sata_sil24


I haven't tried the version of the same card which Dell sells -

http://search.dell.co.uk/1/2/13140-startech-com-2-port-pci-express-esata-controller-adapter-card-storage-controller-2-channel-esata-300-low-profile-pci-express-x1.html

but I'd expect the same results (both cards have just the sil3132 AKA 
sii3132 chip, and an EEPROM with the silicon image BIOS onboard).

Incidentally, you can program the EEPROM on this chip under Linux (I 
used Debian Squeeze) using the Silicon Image BIOS from 
http://www.siliconimage.com/docs/SiI3132_7702.zip along with flashrom 
from http://flashrom.org/

Cheers,

Tim.


Re: Adding a second Quad Core L5420 CPU to a 1950 III

2010-03-03 Thread Tim Small

PJF wrote:
> I'm top posting on my own post...
>
> I added a second quad core, everything is working fine.
>
> I noticed the second one is   Model 23 Stepping 6
> The first one                 Model 23 Stepping 10
>
> Do these need to match? Openmanage reports everything is okay...

I believe that Intel say the steppings should match.  Can you swap one
of the CPUs out from another box?  Not sure if using microcode.ctl would
help, but probably worth doing anyway (we run it on all our production
boxes).
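
If you want to check a pile of boxes for mixed steppings, the model and stepping of each socket can be pulled out of /proc/cpuinfo.  A small sketch, assuming the usual x86 field names ("physical id", "model", "stepping"); the function names are my own.

```python
from typing import Dict, Tuple


def socket_steppings(cpuinfo_text: str) -> Dict[str, Tuple[int, int]]:
    """Map each physical socket id to its (model, stepping)."""
    sockets: Dict[str, Tuple[int, int]] = {}
    current: Dict[str, str] = {}
    # The extra empty line flushes the last processor block.
    for line in cpuinfo_text.splitlines() + [""]:
        if not line.strip():
            if "physical id" in current:
                sockets[current["physical id"]] = (
                    int(current["model"]),
                    int(current["stepping"]),
                )
            current = {}
            continue
        key, _, value = line.partition(":")
        current[key.strip()] = value.strip()
    return sockets


def steppings_differ(sockets: Dict[str, Tuple[int, int]]) -> bool:
    """True if the sockets do not all report the same model and stepping."""
    return len(set(sockets.values())) > 1


if __name__ == "__main__":
    with open("/proc/cpuinfo") as f:
        found = socket_steppings(f.read())
    print(found, "MIXED" if steppings_differ(found) else "matched")
```

This only reports what the kernel sees; whether a given mix is actually supported is still down to the CPU and board documentation.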

Cheers,

Tim.


Re: Adding a second Quad Core L5420 CPU to a 1950 III

2010-03-03 Thread Tim Small

PJF wrote:
> I've been through the dell docs and intel docs, not sure if the stepping
> matters.


Out of curiosity I had another look at this.  Intel's line on older
processor lines was that the steppings shouldn't differ by more than 1.
Intel's 5400 series datasheet says:

http://www.intel.com/Assets/en_US/PDF/datasheet/318589.pdf

> Not all operating systems can support dual processors with mixed
> frequencies. Mixing processors of different steppings but the same
> model (as per CPUID instruction) is supported. Details regarding the
> CPUID instruction are provided in the AP-485 Intel® Processor
> Identification and the CPUID Instruction application note.

but OTOH, it seems that mixing both steppings of L5420 caused some
trouble on Intel S5000 boards before they updated the BIOS...

http://downloadmirror.intel.com/18075/eng/release.txt


Cheers,

Tim.


Re: help configuring a db server

2010-02-11 Thread Tim Small

John G. Heim wrote:
> But I'm confused about disk. I would think disk speed
> would be fairly important.

That entirely depends on your database usage pattern:

What is your total dataset size?
What is your working set (i.e. data which is commonly/regularly accessed)?
Can you reasonably fit your working set into RAM?
What is your read/write ratio (i.e. select vs update/insert)?

If at all possible, aim to have enough RAM for your working set.  If you
can't, then disk speed becomes important for read performance.
Write latency (time taken to commit to permanent storage) is critical if
you are doing a lot of writes - in this case getting a RAID controller
with battery-backed cache is a win - otherwise it probably isn't.

So... if you have a database (or databases) with a working set of 10G
which fits in RAM, and a high read:write ratio, then disk performance
probably isn't going to be important.
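
The rule of thumb above can be written down as a few lines of Python.  This is only a sketch of the reasoning, not a sizing tool: the function name is made up, and the 0.3 write fraction at which I call a workload "write-heavy" is an arbitrary placeholder, not a measured threshold.

```python
from typing import List


def storage_priorities(working_set_gb: float,
                       cache_ram_gb: float,
                       write_fraction: float,
                       write_heavy_at: float = 0.3) -> List[str]:
    """Turn the questions above into rough advice.

    cache_ram_gb is the RAM realistically available for database caches;
    write_fraction is writes / (reads + writes).
    """
    advice = []
    if working_set_gb > cache_ram_gb:
        advice.append("working set exceeds RAM: disk read speed matters")
    else:
        advice.append("working set fits in RAM: disk read speed is secondary")
    if write_fraction >= write_heavy_at:
        advice.append("write-heavy: a battery-backed write cache is worth having")
    return advice


# A 10G working set on a box with 16G for caches, and mostly reads.
print(storage_priorities(10, 16, 0.05))
```

The useful part is being forced to put numbers on the working set and the read/write ratio before choosing disks.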

Another hint (excuse me if you know this stuff already): you can very
readily get large performance improvements by optimising your mysql
server config (e.g. using mysqlanalyze, and this munin plugin:
http://github.com/kjellm/munin-mysql/downloads )


Tim.


Re: Third-party drives not permitted on Gen 11 servers

2010-02-08 Thread Tim Small

Philip Tait wrote:
> the supplied drives, and installed 4 Barracuda ES.2s. After doing a
> Clear Configuration in the pre-boot RAID setup utility, I can perform
> no operation with the drives - they are marked as blocked.
>
> Is Dell preventing the use of 3rd-party HDDs now?
>
> Thanks for any enlightenment.

Hi Philip,

I was wondering what the firmware version on the blocked drives is.
You could check e.g. using smartctl or hdparm -I with the drives in a
different box.  Assuming your drives are SATA rather than SAS, the
firmware in a 250G Dell-supplied ES.2 in an R200 which I have here is
MA08, whereas some third-party drives in other machines use SNxx series
firmware.  I believe it is possible to switch from one to the other
firmware series.

Whilst I think Dell's policy is probably wrong (it should be "complain
loudly" rather than "disallow"), it's possible that there are genuine
reasons for this - I spent/wasted most of last week diagnosing what is
starting to look like a firmware bug on WD 2TB green power drives on a
non-Dell server - interspersing SMART queries with other types of
transactions would appear to occasionally cause the drives to lock up!

I wouldn't be surprised if the H700 adaptor firmwares are doing various 
unusual things to the hard drives, and it's possible that Dell has got 
nervous about buggy firmware from unqualified drives reflecting badly on 
their hardware.

Some official (or non-official) comment from Dell on the *technical*
reasons for this decision would be welcome.

Cheers,

Tim.


ipmitool delloem half-baked patch

2010-02-05 Thread Tim Small

Alexander Dupuy wrote:
> You can get power statistics from ipmitool with delloem support, e.g.
> "ipmitool delloem powermonitor powerconsumptionhistory".  I realize this
> tends to fall in the category of a vendor-proprietary tool

WTF!  Why isn't this merged upstream?  It's not like there isn't already 
a precedent, what with there already being other vendor oem commands etc.

Oh, I see...

http://sourceforge.net/mailarchive/message.php?msg_id=4A4DCEA9.9070003%40cern.ch

The patch has been half baked, thrown over the wall, and then abandoned, 
by the look of it.  Great

Writing code like this seems to be a complete waste of time to me - it 
would be cheaper and easier for Dell to just release the specs, and let 
someone else implement the functionality and get it merged upstream - 
creating this sort of half-finished work just discourages other people 
from creating their own code.

So, does Dell have any plans to do any more work on this code, or is it
abandoned?


That would be a shame, since if you want this functionality, you have to:

1. Know about it.
2. Find the patch.
3. Apply, build, maintain (repeat).


"But Mr Dent, the plans have been available in the local planning office
for the last nine months."

"Oh yes, well as soon as I heard I went straight round to see them,
yesterday afternoon. You hadn't exactly gone out of your way to call
attention to them, had you? I mean, like actually telling anybody or
anything."

"But the plans were on display ..."

"On display? I eventually had to go down to the cellar to find them."

"That's the display department."

"With a flashlight."

"Ah, well the lights had probably gone."

"So had the stairs."

"But look, you found the notice didn't you?"

"Yes," said Arthur, "yes I did. It was on display in the bottom of a
locked filing cabinet stuck in a disused lavatory with a sign on the
door saying 'Beware of the Leopard'."


Tim.


Re: R210 / L3426 power consumption figures - 40w idle!

2010-02-04 Thread Tim Small

Matt Domsch wrote:
> The Dell Advanced Power Control feature is enabled by default in BIOS
> SETUP, which prevents the OS from managing it.  If you prefer to have
> the OS do it, disable this feature in BIOS.

Thanks for that information Matt - must have missed that in the BIOS 
(all the other systems which I've configured have defaulted to OS Control).

On reflection, I suppose I'm a bit surprised that this must be set in
the BIOS - I'd have thought that if the OS tries to take control of
frequency scaling, the BIOS should automatically relinquish control at
that point.

I also think that unless you take the trouble to read the manual, it's
not really obvious from the string "Active Power Control" that this
means "BIOS control only - no OS control".

Cheers,

Tim.


Re: PowerEdge 1950 ECC question...

2010-02-03 Thread Tim Small

It seems probable that the memory fault is causing the BIOS to crash 
before it gets a chance to enable ECC - thus no errors are logged.  It 
could also be a bus-loading issue with the FB-DIMMs (such that no memory 
can be issued - maybe a faulty AMB chip on one of the sticks).

Try the system with half the original RAM in at a time - you could also 
try moving each stick up by four slots (and wrapping round).

Tim.


Henrik Schmiediche wrote:
> Did that. The system is up using new RAM and OMSA and related utilities are
> running. There are no memory related entries in the ESM log. Old ram freezes
> system, but no error of any kind is generated in ESM, memtest, mpmemory,
> dell diags.



R210 / L3426 power consumption figures - 40w idle!

2010-02-03 Thread Tim Small

Hi,

I've had an R210 recently to set up for a client and thought I'd share 
the power consumption figures that I recorded with the list.

Basically, very impressive results IMO, when you consider the last R200 
(X3220) which I set up used 135-odd watts idle...


Disks: 2x 250G Seagate ES.3 SATA
RAM: 8G in 4 sticks (not sure if it was 1066, or 1333 DDR3)
CPU: L3426
Kernel: Debian 2.6.30-bpo.2-amd64
Supply voltage: 240
Ambient Temp: 20 degrees

Idle in Linux (no CPU freq scaling, HDs asleep): 0.19 amps / 33 watts

Idle in Linux (no CPU freq scaling): 0.2 amps / 40 watts

With 100% CPU usage (on all 4 cores): 0.38 amps / 82 watts

With 100% CPU usage (on all 4 cores), whilst doing a sequential read
from both hard disks with "hdparm -t": 0.38 amps / 81 watts

With 100% CPU usage (on all 4 cores), and having issued an "ipmitool mc
reset cold" command (to get the fans to full power), whilst doing a
sequential read from both hard disks with "hdparm -t": 0.46 amps / 114 watts

With 100% CPU usage (on all 4 cores), and having issued an "ipmitool mc
reset cold" command, whilst spinning-up both hard disks having
previously put them to sleep with "hdparm -Y": 0.56 amps / 131 watts

I could probably get the CPU/mem power usage up a little bit higher, as 
my CPU loading was a bit simplistic (burnBX + memtester).
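
To put the idle figures into perspective, here is the arithmetic for a year of running, as a small Python sketch.  It assumes the box idles continuously and takes the "135-odd watts" of the R200 as exactly 135 W, so treat the result as a ballpark only.

```python
HOURS_PER_YEAR = 24 * 365  # 8760 hours, ignoring leap years


def annual_kwh(watts: float) -> float:
    """Energy used in a year by a constant load, in kWh."""
    return watts * HOURS_PER_YEAR / 1000.0


r210_idle_watts = 40   # idle figure measured above (disks spinning)
r200_idle_watts = 135  # approximate idle figure for the older R200

saving_kwh = annual_kwh(r200_idle_watts - r210_idle_watts)
print(f"Idle saving per server: about {saving_kwh:.0f} kWh/year")
```

That is a difference of 95 W, or roughly 832 kWh per server per year, before counting any saving on cooling.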

The CPU frequency scaling module doesn't seem to load.  This is either
because Dell's BIOS doesn't support it, or because Dell's BIOS cpu frequency
interface doesn't work with the Debian Lenny kernel.

More to follow...

Cheers,

Tim.


Re: PowerEdge 1950 ECC question...

2010-02-03 Thread Tim Small

Henrik Schmiediche wrote:
> My original question (turning off ECC for memory testing) was hopefully going
> to narrow down the issue.

It's probably possible to disable ECC after boot time using setpci (I've
used it to read and write the ECC status registers  on Intel chipsets in
the past), but I don't know the details of the i5000 ECC implementation
(I don't even know if the ECC functionality is still controlled via PCI
Configuration space) - you'll have to check the datasheet (or the
i5000_edac driver source).
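
This is not the setpci approach, but if the i5000_edac driver does load on the box, the kernel's EDAC sysfs tree gives a quick way to see whether any corrected or uncorrected errors are being counted.  A minimal sketch, assuming the standard EDAC layout (one mcN directory per memory controller with ce_count and ue_count files); the function name is made up.

```python
from pathlib import Path
from typing import Dict, Tuple


def edac_counts(edac_root: str = "/sys/devices/system/edac/mc") -> Dict[str, Tuple[int, int]]:
    """Return {controller: (corrected errors, uncorrected errors)}."""
    counts: Dict[str, Tuple[int, int]] = {}
    for mc in sorted(Path(edac_root).glob("mc[0-9]*")):
        corrected = int((mc / "ce_count").read_text())
        uncorrected = int((mc / "ue_count").read_text())
        counts[mc.name] = (corrected, uncorrected)
    return counts


if __name__ == "__main__":
    for name, (ce, ue) in edac_counts().items():
        print(f"{name}: corrected={ce} uncorrected={ue}")
```

An empty result just means that no EDAC driver is loaded for the chipset, not that the memory is fine.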

ISTR, memtest86 and memtest86+ both had some functionality for
reading/writing ECC status bits on some chipsets as well, so you could
hack on these too (but the code was a bit messed up last time I looked -
I think "ECC no", and "ECC NO" had different meanings!)...

Tim.


Will DOS floppy firmware updates please leave the building...

2010-01-25 Thread Tim Small

Ian Forde wrote:
> Heck - even at 10 servers, PXE installations should be the
> norm.  So to be told that one has to use a DOS floppy is a little...
> well... grating...


I've had good luck in the past using memdisk from syslinux to load a
floppy, CD, or USB stick image into RAM as part of a PXE boot - and then
using IPMI serial redirection to carry out the update (except you can't
see *graphical* DOS firmware updaters (WTF?!!!) but even those seem to
have unattended operation options).
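
For reference, the pxelinux side of this is just a menu entry which names memdisk as the kernel and the disk image as the initrd.  Below is a small Python sketch which generates such an entry; the label and image path in the example are placeholders, and the function name is my own.

```python
def memdisk_stanza(label: str, image: str, iso: bool = False) -> str:
    """Return a pxelinux.cfg entry which boots a disk image via memdisk.

    image is the path of the floppy/disk image relative to the TFTP root.
    Set iso=True for a CD image, which memdisk needs to be told about.
    """
    lines = [
        f"LABEL {label}",
        "  KERNEL memdisk",
        f"  INITRD {image}",
    ]
    if iso:
        lines.append("  APPEND iso")
    return "\n".join(lines) + "\n"


# Hypothetical example: a DOS floppy image holding a firmware updater.
print(memdisk_stanza("fwupdate", "images/fwupdate.img"))
```

Generating the entries from a script is handy when there is one image per server model, since the per-host pxelinux.cfg files can then be written out in bulk.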

QEMU 0.11 + etherboot (I use the pcnet PXE rom image that ships with
recent Debian) is a very good tool for prototyping / checking images.

Still a crappy bodge, but a bit less so...  To give Dell their due, a
lot of their problems are with their suppliers, and at least this:

http://www.ducea.com/2007/08/27/dell-bios-firmware-updates-on-debian/

... is a lot easier on Dell systems than most others (yeah there is
http://packages.debian.org/sid/flashrom for other systems which I've had
100% success with, but if it breaks you're going to need to start
pulling chips).

Tim.


Re: PXE boot R900 via 10Gb Intel NIC?

2010-01-23 Thread Tim Small

Jefferson Ogata wrote:
> Maybe there's some way to do it with ethtool -E, but I don't have any
> way to find out what it is.

The Intel datasheet?  May not help, tho, I suppose.

> Any other ideas?

Unload the linux driver, and use a recent qemu/kvm to passthrough the
PCI device, and then boot the DOS floppy image within qemu/kvm?

Cheers,

Tim.


Re: massive io problems

2009-12-11 Thread Tim Small

John Hodrien wrote:
>> What about block device activity (-d option)?
>
> Looks like sar isn't configured for this at the minute, I'll see if I can sort
> that out

I'm not sure how frequently sar collects data, but I think you'll
probably want something to collect it at 1-second or less granularity -
e.g. vmstat 1, or (probably better) dstat.  dstat also has various
plugins which you may want to investigate - including a good NFS plugin.
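
If neither tool is to hand, the same per-second numbers can be had from /proc/diskstats directly.  A rough Python sketch (the function names are mine; the field positions follow the kernel's iostats documentation, where sectors are always 512 bytes):

```python
import time
from typing import Dict


def parse_diskstats(text: str) -> Dict[str, Dict[str, int]]:
    """Pick sectors read/written and time spent doing I/O for each device."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 14:
            continue
        stats[fields[2]] = {
            "sectors_read": int(fields[5]),
            "sectors_written": int(fields[9]),
            "io_ms": int(fields[12]),
        }
    return stats


def rates(before, after, interval_s: float) -> Dict[str, Dict[str, float]]:
    """Convert two samples into KB/s and percentage utilisation."""
    out = {}
    for dev, now in after.items():
        prev = before.get(dev)
        if prev is None:
            continue
        out[dev] = {
            "read_kb_s": (now["sectors_read"] - prev["sectors_read"]) * 512 / 1024 / interval_s,
            "write_kb_s": (now["sectors_written"] - prev["sectors_written"]) * 512 / 1024 / interval_s,
            "util_pct": 100.0 * (now["io_ms"] - prev["io_ms"]) / (interval_s * 1000),
        }
    return out


def watch(samples: int = 5, interval_s: float = 1.0) -> None:
    """Print one line per device per interval."""
    with open("/proc/diskstats") as f:
        prev = parse_diskstats(f.read())
    for _ in range(samples):
        time.sleep(interval_s)
        with open("/proc/diskstats") as f:
            cur = parse_diskstats(f.read())
        for dev, r in rates(prev, cur, interval_s).items():
            print(f"{dev}: r={r['read_kb_s']:.0f}KB/s w={r['write_kb_s']:.0f}KB/s util={r['util_pct']:.0f}%")
        prev = cur
```

Calling watch() prints five one-second samples.  Short bursts which saturate a disk show up here, whereas they vanish in multi-minute averages.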

BTW, are you using noatime or relatime?

Thanks,

Tim.


PE1950 IPMI BMC appears to deliver junk to the serial port

2009-11-03 Thread Tim Small

Hello,

All of our PE1950s (on two different sites) have problems with SoL...

They have been set up for IPMI remote access and SoL using ipmitool. 
When SoL sessions are not active the BMC outputs junk characters from
its serial connection (and thus into the OS-visible serial port UART) -
this behaviour is always reproducible, and can be triggered by sending a
few hundred bytes of characters from the OS to the BMC (e.g. OS boot
messages etc.).

Some relevant snippets from my notes on the issue:

I am 99.9% sure that this is a firmware bug on the BMC, and not an OS or
application software bug, since it also shows up prior to OS boot.

On Dell PowerEdge 1950s (BMC firmware version 2.37) - it has been
observed on a number of different machines that:
When IPMI SoL sessions are enabled, but NOT active, spurious characters
are received by the serial UART from the BMC (on Linux device
/dev/ttyS1).  The problem also exists outside of Linux - these spurious
characters have (on several occasions) interrupted the boot process - by
sending character sequences which interrupt the normal automatic boot
process of the BIOS and/or boot loader - as such IPMI SoL must be
disabled on these systems for reliable operation - this leaves the
systems in-question without a viable remote-access system for
BIOS/boot/OS interventions etc.


BMC settings are as follows:

arundel:~# ipmitool sol info 1
Set in progress                 : set-complete
Enabled                         : true
Force Encryption                : true
Force Authentication            : false
Privilege Level                 : ADMINISTRATOR
Character Accumulate Level (ms) : 50
Character Send Threshold        : 220
Retry Count                     : 7
Retry Interval (ms)             : 1000
Volatile Bit Rate (kbps)        : 57.6
Non-Volatile Bit Rate (kbps)    : 57.6
Payload Channel                 : 1 (0x01)
Payload Port                    : 623


All baud rates are set to 57.6k / 8bit / no parity in Linux (Linux
kernel and 'getty' processes).

BTW, I administer Intel and Tyan IPMI v2.0 machines using identical
software and the same IPMI SoL settings - without seeing these problems.

I can arrange to supply a hex-dump of the received junk characters if
that's useful.  I'm also happy to execute arbitrary IPMI commands etc. etc.
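
For what it's worth, this is roughly how I would capture and dump the junk.  It is only a sketch: it assumes that /dev/ttyS1 has already been set to the right speed (e.g. with stty) and that no getty is holding the port, and the function names are my own.

```python
def hexdump(data: bytes, width: int = 16) -> str:
    """Format bytes as offset, hex values and printable ASCII."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        text = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{offset:08x}  {hex_part:<{width * 3 - 1}}  {text}")
    return "\n".join(lines)


def capture(device: str = "/dev/ttyS1", nbytes: int = 256) -> bytes:
    """Read whatever arrives next on the port (blocks until data arrives)."""
    with open(device, "rb", buffering=0) as port:
        return port.read(nbytes)


if __name__ == "__main__":
    print(hexdump(capture()))
```

Running it with no SoL session active, and then pushing a few hundred bytes at the BMC from another shell, should be enough to catch the spurious characters.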


Thanks,

Tim.


Re: [smartmontools-support] SMART causes disks to go offline on an LSI SAS1068 controller - Dell SAS 5/iR

2009-10-27 Thread Tim Small

Hello,

Just to say that I'm seeing this bug as well, with smartmontools 5.38 
and smartctl 5.39 2009-10-10 r2955 on Debian lenny.  The machine is a 
Dell PowerEdge 860.  I'm guessing that this is either a firmware or 
driver issue.

02:08.0 SCSI storage controller: LSI Logic / Symbios Logic SAS1068 PCI-X Fusion-MPT SAS (rev 01)
        Subsystem: Dell SAS 5/iR Adapter RAID Controller
        Flags: bus master, 66MHz, medium devsel, latency 72, IRQ 1275
        I/O ports at ec00 [disabled] [size=256]
        Memory at fe9fc000 (64-bit, non-prefetchable) [size=16K]
        Memory at fe9e (64-bit, non-prefetchable) [size=64K]
        Expansion ROM at fea0 [disabled] [size=1M]
        Capabilities: [50] Power Management version 2
        Capabilities: [98] Message Signalled Interrupts: Mask- 64bit+ Queue=0/0 Enable+
        Capabilities: [68] PCI-X non-bridge device
        Capabilities: [b0] MSI-X: Enable- Mask- TabSize=1
        Kernel driver in use: mptsas
        Kernel modules: mptsas

# modinfo mptsas
filename:       /lib/modules/2.6.26-2-openvz-amd64/kernel/drivers/message/fusion/mptsas.ko
version:        3.04.06
license:        GPL
description:    Fusion MPT SAS Host driver
author:         LSI Corporation



The errors look like this:

[  428.524463] mptscsih: ioc0: attempting task abort! (sc=81021b950940)
[  428.524471] sd 0:0:0:0: [sda] CDB: ATA command pass through(16): 85 08 0e 00 d5 00 01 00 09 00 4f 00 c2 00 b0 00
[  433.199851] mptbase: ioc0: LogInfo(0x3114): Originator={PL}, Code={IO Executed}, SubCode(0x)
[  433.199851] mptsas: ioc0: removing sata device, channel 0, id 0, phy 0
[  433.199851]  port-0:0: mptsas: ioc0: delete port (0)
[  433.199851] sd 0:0:0:0: [sda] Synchronizing SCSI cache
[  433.348856] mptscsih: ioc0: task abort: SUCCESS (sc=81021b950940)
[  433.348868] mptscsih: ioc0: attempting task abort! (sc=81021b950440)
[  433.348873] sd 0:0:0:0: [sda] CDB: Synchronize Cache(10): 35 00 00 00 00 00 00 00 00 00
[  433.348885] mptscsih: ioc0: task abort: SUCCESS (sc=81021b950440)
[  433.348893] mptscsih: ioc0: attempting target reset! (sc=81021b950940)
[  433.348896] sd 0:0:0:0: [sda] CDB: ATA command pass through(16): 85 08 0e 00 d5 00 01 00 09 00 4f 00 c2 00 b0 00
[  433.605026] mptscsih: ioc0: target reset: SUCCESS (sc=81021b950940)
[  433.605034] mptscsih: ioc0: attempting bus reset! (sc=81021b950940)
[  433.605037] sd 0:0:0:0: [sda] CDB: ATA command pass through(16): 85 08 0e 00 d5 00 01 00 09 00 4f 00 c2 00 b0 00
[  434.157594] mptscsih: ioc0: bus reset: SUCCESS (sc=81021b950940)
[  444.546154] mptscsih: ioc0: attempting host reset! (sc=81021b950940)
[  444.546162] mptbase: ioc0: Initiating recovery
[  461.540429] mptscsih: ioc0: host reset: SUCCESS (sc=81021b950940)
[  461.540437] sd 0:0:0:0: Device offlined - not ready after error recovery
[  461.540440] sd 0:0:0:0: Device offlined - not ready after error recovery
[  461.540475] end_request: I/O error, dev sda, sector 15631039
[  461.540480] md: super_written gets error=-5, uptodate=0
[  461.540485] raid1: Disk failure on sda1, disabling device.



and the drives are:

Model Family:     Seagate Barracuda ES
Device Model:     ST3250620NS
Serial Number:    9QE3L9E0
Firmware Version: 3BKS

and are in JBOD mode (+ sw RAID with md).

lsiutil says:

Current active firmware version is 0.10.51
Firmware image's version is MPTFW-00.10.51.00-IE
  LSI Logic
x86 BIOS image's version is MPTBIOS-6.12.05.00 (2007.09.29)

... which is the latest on Dell's download pages for this server.

The kernel is 2.6.26-2-openvz-amd64 from Debian Lenny (same behaviour 
with non-openvz kernel).  Running smartd makes the drives disappear 
after a few hours, but doing this:

while true ; do smartctl -T permissive -d sat -a /dev/sda > /dev/null && echo -n . ; done

seems to knock them out in about a minute.

Subjectively, 5.38 seemed to upset the controller a lot quicker than 
5.39 r2955 does.  For good measure I'm currently stress-testing a PE1950 
with a SAS 6/iR (SAS1068E) in the same way (however this is using RAID 
setup through the BIOS).

smartctl 5.39-pre needs '-T permissive' on the PE860, but 5.38 doesn't 
seem to require it.


Is it worth trying a newer mptsas driver?

Regards,

Tim.

