Re: [Linux-PowerEdge] CVE-2020-5344
Hi,

BTW, I use this type of chroot solution to deploy updates which only target other Linux OS versions (e.g. RHEL6) on servers which run Debian 10 and Ubuntu LTS.

This will generally work, but some updates which rely on a specific kernel version (e.g. because they ship "out-of-tree" kernel modules) may still fail. In some cases, if the "out-of-tree" kernel modules have since been "upstreamed" and are included in the kernel you are running on the server, you can instead just use modprobe to load these kernel modules (e.g. "dell_rbu") from outside the chroot, before running the update.

I use the Debian "schroot" tool (which takes care of bind-mounting /proc, /dev, /sys, /home etc. - schroot is also available for Red Hat, I believe), and pre-generated root archives from https://images.linuxcontainers.org/

HTH,
Tim.

On 09/04/2020 21:37, Yannick PALANQUE wrote:
> Hello,
>
> On 09/04/2020 22:12, miguel.cha...@dell.com wrote:
>> Is there a solution?
>
> I think maybe running the DUP from a chrooted installation of CentOS 7 could work? (You would have to copy a big tar.gz or something like that.)
>
> But it would be like using a truck to move a cup of tea one meter away...

-- 
South East Open Source Solutions Limited
Registered in England and Wales with company number 06134732.
Registered Office: 2 Powell Gardens, Redhill, Surrey, RH1 1TQ
VAT number: 900 6633 53
http://seoss.co.uk/
+44-(0)1273-808309
___
Linux-PowerEdge mailing list
Linux-PowerEdge@dell.com
https://lists.us.dell.com/mailman/listinfo/linux-poweredge
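A minimal sketch of the "is the module already loaded?" check that could run outside the chroot before the update, assuming Python 3 is available on the host. The function names and the sample text are made up for illustration; only the module name "dell_rbu" comes from the message above ("dcdbas" is simply used as an example of a module that is not in the sample).

```python
def loaded_modules(proc_modules_text):
    """Return the set of module names found in /proc/modules-style text."""
    names = set()
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields:
            # The first whitespace-separated field is the module name.
            names.add(fields[0])
    return names


def needs_modprobe(module, proc_modules_text):
    """True if `module` is not listed as a loaded module."""
    # /proc/modules reports names with underscores, so normalise hyphens.
    return module.replace("-", "_") not in loaded_modules(proc_modules_text)


# Illustrative input only; on a real host, read the text from /proc/modules.
SAMPLE = """\
dell_rbu 20480 0 - Live 0x0000000000000000
ipmi_si 65536 0 - Live 0x0000000000000000
"""

print(needs_modprobe("dell_rbu", SAMPLE))  # False
print(needs_modprobe("dcdbas", SAMPLE))    # True
```

Note that a driver compiled directly into the kernel does not show up in /proc/modules, so a True result here only means "not loaded as a module", not necessarily "not available".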
Re: [Linux-PowerEdge] iDRAC6 2.92 on PowerEdge R210 II
On 13/03/2020 17:16, josh.mo...@dell.com wrote:
> This is probably a miss

Upgrading directly worked - thanks. It'd be good to get this fixed for the other R210 II upgrade methods. Are you able to raise a bug for that?

Cheers,
Tim.
[Linux-PowerEdge] iDRAC6 2.92 on PowerEdge R210 II
Hello,

I wanted to update some Dell R210 II servers from iDRAC6 firmware 2.90 to 2.92. Strangely I get:

# ./ESM_Firmware_KPCCC_LN32_2.92_A00.BIN
This Update Package is not compatible with your system
Your system: PowerEdge R210 II
System(s) supported by this package: R710, R815, T410, R715, R210, R510, T310, R310, T610, R610, R410

Since the fix is purely a security update for remote access, I can't see why it wouldn't be applicable to the R210 II (especially as the R210 is listed). I haven't tried exploiting the security problems that 2.90 -> 2.92 addresses, but I would be extremely surprised if they aren't present on R210 II and v2.90.

Is this exclusion a mistake?

Thanks,
Tim.
[Linux-PowerEdge] Poweredge R230 IPMI BMC bug triggers frequent error messages on RHEL 7 kernel and others
Hello,

Our EL7 machines get about 5000 messages per day saying:

ipmi_si ipmi_si.0: Could not set the global enables: 0xcc.

The OpenIPMI developers say:

"Some BMCs don't let you clear the receive irq bit in the global enables. This is kind of silly, but they give an error if you try to clear it."

Ubuntu 16.04 LTS (and other distros with kernel >4.0 or a backported patch) say:

"The BMC does not support clearing the recv irq bit, compensating, but the BMC needs to be fixed."

I was wondering if this is a known bug on the Dell BMCs, and if so, when a fix is planned?

# ipmitool bmc info
Device ID                 : 32
Device Revision           : 1
Firmware Revision         : 2.30
IPMI Version              : 2.0
Manufacturer ID           : 674
Manufacturer Name         : DELL Inc
Product ID                : 256 (0x0100)

Cheers,
Tim.
Re: Oracle Enterprise Linux
On 24/11/10 11:38, Nick Lunt wrote:
> is Oracle Enterprise Linux supported on all Dell servers, along with open manage, firmware updates, disk array drivers etc ?

It's not supported AFAIK, but since it's essentially a rebuilt-from-source Red Hat Enterprise Linux with some kernel performance tweaks, you are unlikely to see any issues with OEL that you won't also see when running RHEL, CentOS etc. on Dell hardware.

Tim.

___
Linux-PowerEdge mailing list
Linux-PowerEdge@dell.com
https://lists.us.dell.com/mailman/listinfo/linux-poweredge
Please read the FAQ at http://lists.us.dell.com/faq
Re: Oracle Enterprise Linux
On 24/11/10 12:07, Tim Small wrote:
> It's not supported AFAIK, but since it's essentially a rebuilt-from-source Redhat Enterprise Linux with some kernel performance tweaks - you are unlikely to see any issues with OEL that you won't also see when running RHEL, CentOS etc. on Dell hardware.

Err, actually since the latest OEL 5 release uses the 2.6.32 kernel (same as RHEL6), but the userspace is largely RHEL5, you are likely to hit some packaging/repository issues when using the Dell RHEL6 on it, I'd guess.

http://lwn.net/Articles/406242/

... but since Dell support both RHEL5 and RHEL6, I would have thought things will largely work.

There is also some sort of agreement between Dell / Oracle, however it looks like you need to go to Oracle for the actual support, and it doesn't indicate which servers are supported:

http://www.oracle.com/us/corporate/press/161333

Tim.
Re: PERC 4e/Di errors on RHEL3 server
On 04/11/10 19:55, Eric Wood wrote:
> Can I get some help on what replacement controller to buy? I have a PE 2800 with a RAID-5 with three seagate 36gig drives on a PERC 4e/Di. Currently all drives are ONLINE and working. But the server has been up since 2004 and has crashed three times in recent days with various error messages.

Oof, up since 2004 - it'll be like Swiss cheese from a security vulnerability point of view.

More likely to be the drives than the controller I think - maybe the controller is not coping very well with various failure conditions on the drives. I would try:

modprobe sg

then

smartctl -a /dev/sgX

(if you can get smartctl installed on this box) - where X is the relevant letter for each drive. You may find one or more drives with reallocated sectors (grown defect list etc.), also check the offline-corrected ECC count etc.

After that you might want to look into upgrading the firmware on the controller, or possibly replacing the entire machine with a new one (which will use considerably less electricity - the 6 month electricity and cooling cost of a server of that vintage often exceeds its total value by several times).

Tim.
Re: Perc6/i does not want to upgrade firmware. suggestions?
On 31/10/10 13:09, Arno van der Veen wrote:
> Hello all, I upgraded all firmware manually as written earlier, but I really can't get the perc6/i upgraded in its firmware.. :-(

Don't use Dell's buggy, overly-complex scripts (self-extracting shell scripts, which then install RPMs - makes me feel nauseous just thinking about the concept - what do you think this is? Microsoft Windows?) - get the file out of them, and run the update manually instead using megactl?

Cheers,
Tim.

p.s. If anyone from Dell happens to be reading - if you do insist on these nauseating byzantine scripts, don't assume /bin/sh is a link to /bin/bash cos it often isn't.
Re: also OT:: Import Raid Config from Perc6/i to SAS 6i/R
I believe MegaRAID (and hence PERC 5 / 6) uses DDF - which is an open standard - but SAS6 (LSI 106x/107x SAS etc.) doesn't (BICBW). dmraid and recent mdadm will allow you to read/modify the raw metadata.

If you are just looking to move the drives and don't mind about using the SAS6's raid features, then dmraid, or a recent mdadm, may be the easiest solution - the drives would just be a JBOD, with the RAID implemented in the Linux kernel (using either dm, or md).

Tim.

On 28/10/10 11:59, Gregor Friedrich wrote:
> Hello
> also OT, sorry
> is there a way to import (or modify and import) the raid config metadata on disk from Perc6i to SAS 6i/R
> Thanks
> Gregor
Re: using a PERC6/i (MegaRaid SAS 1078) in JBOD mode?
On 22/10/10 13:52, Louis-David Mitterrand wrote:
> On my PowerEdge 2900 III I have a PERC6/i (MegaRaid SAS 1078). From the controller bios it seems I have to create a raid0 virtual drive for each physical disk in order for them to appear in Linux. As I intend to use none of that controller's raid features, is it at all possible to switch it to plain JBOD mode?

AFAIK no, but you could script the creation of the whole-disk-single-drive-raid0 from within Linux, and if you need to plug the drives into another Linux box, dmraid (and also recent mdadm) will understand the metadata format, and so allow you to access the data.

Cheers,
Tim.
Re: problems with LSISAS2008 6Gb/s SAS kernel mpt2sas driver
On 21/10/10 08:31, Louis-David Mitterrand wrote:
> Hi, I am setting up a new Dell T610 server with 8 WD Black Caviar sata3 1TB disks on a LSISAS2008 controller:
>
> Oct 21 09:12:37 grml kernel: [ 83.377388] mpt2sas0: LSISAS2008: FWVersion(02.15.63.00), ChipRevision(0x02), BiosVersion(07.01.09.00)
>
> My layout is as follows:
> - small un-encrypted raid1 boot partition on /dev/md0
> - dm-crypt main partition on /dev/md1 (actually /dev/mapper/cmd1)
>
> A recent grml64 is used to create the partitions, install the system and run lilo. When running lilo I get these errors from the controller:
>
> Oct 21 08:57:11 grml kernel: [40832.015207] mpt2sas0: fault_state(0x265d)!
> Oct 21 08:57:11 grml kernel: [40832.015210] mpt2sas0: sending diag reset !!
>
> Any suggestion on fixing that problem would be welcome. I can send more complete logs.

Looks like a firmware bug - do you have the latest firmware? Drive firmwares? Anything in the drive error logs (using smartctl)? If not, then try opening a bug on the kernel bugzilla - LSI engineers read that (and sometimes even fix things).

Otherwise, you could try replacing with a straight SATA controller, if that box doesn't have a SAS backplane - I've not been too impressed by the quality of engineering for LSI controllers, and SATA-on-SAS in general hasn't been very reliable IMO. Just go for a well supported SATA controller (e.g. Sil 3132 etc.).

Tim.
Re: EDAC on Dell servers
On 20/10/10 19:11, Alexander Dupuy wrote:
> This is the first time I have heard of this. When you refer to Dell ESM are you talking about OMSA, or the onboard firmware (ESM = embedded system management?) of the BMC/DRAC?

At the moment everything is racy when it comes to the EDAC registers - it'd be nice to have a firmware API, so that EDAC etc. could tell the firmware to leave those registers alone (if that's what the user wants).

On some newer Intel chips, I believe the EDAC registers are only visible from the CPU System Management mode, so Linux doesn't even get a look in. Bah, yet more closed sourceness...

Cheers,
Tim.
Re: R310 BMC woes
On 15/10/10 20:53, Drew Weaver wrote:
> We don't want the management NIC or traffic to be visible to the operating system for security reasons.

It won't be - the OS never sees traffic destined for the BMC, it goes directly from the NIC chip to the BMC, and doesn't hit the PCIe bus which is attached to the main computer, so it's pretty much invisible (and it'd be encrypted anyway, and you can enforce encrypted-only comms). The BMC gets its own MAC and IP address (these days; older IPMI implementations did it differently).

I use IPMI and SoL exclusively - don't bother with the DRACs - don't need them (and you can't get them on Tyans or Intels or HPs, whereas I can use ipmitool on everything).

If you're looking for low cost, did you consider the R210s?

Tim.
Re: perc6i alignment?
On 14/10/10 02:27, Eugene Vilensky wrote:
> One more question...is there anything to be concerned about regarding on disk geometry or does the PERC do the right thing automatically when using OEM drives?

Nearly all drives have 512-byte sectors, so these won't be a problem. WD have been shipping 1.5TB and 2TB drives with 4k sectors for a while, and Hitachi is now also (just) doing so. In the case of the WD drives, they lie that they have 512-byte physical sectors, because if they don't, various BIOSes and software break (dunno about the Hitachis).

So, if you're using those drives, then the PERC had better align its user-visible data on 4096-byte boundaries, otherwise write performance will go down the toilet (an unaligned write of 4k of data will result in 2 reads and 2 writes instead of a single write).

Dunno if it does or not, I guess you could pull a drive and use dmraid to work out what it's doing (the RAID metadata it uses is an open standard).

Tim.
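The "2 reads and 2 writes instead of a single write" arithmetic can be sketched as below, assuming a drive with 4096-byte physical sectors that has to read-modify-write any physical sector a write only partly covers. The function name is hypothetical, and the 63-sector starting offset is just a commonly seen unaligned example, not something taken from the PERC.

```python
def write_cost(offset, length, physical=4096):
    """Return (reads, writes) in physical sectors for one logical write.

    A physical sector that is only partly covered by the write has to be
    read first (read-modify-write); a fully covered one is just written.
    """
    if length <= 0:
        raise ValueError("length must be positive")
    end = offset + length
    first = offset // physical
    last = (end - 1) // physical
    reads = 0
    for sector in range(first, last + 1):
        start = sector * physical
        # Partly covered if the write starts after, or ends before, the sector edge.
        if offset > start or end < start + physical:
            reads += 1
    return reads, last - first + 1


print(write_cost(8 * 4096, 4096))  # (0, 1): aligned 4k write
print(write_cost(63 * 512, 4096))  # (2, 2): same write starting at 512-byte sector 63
```

The same function can be fed the byte offset at which a partition or RAID data area starts to see whether 4k writes inside it would land on physical sector boundaries.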
Re: 2.5 vs 3.5 drive performance?
On 08/10/10 23:23, Dave Sparks wrote:
> Anyone still buying their servers with 3.5 drives?

Everyone who needs large capacity storage? If you need performance, for most applications SSDs would seem to be a better idea than 2.5 drives, no?

This is based on real-world prices for the drives, I haven't checked Dell's comedy figures recently...

Tim.
Re: ordering a T610: what options?
On 30/09/10 14:00, Louis-David Mitterrand wrote:
> - what Raid Connectivity (C0 to C16) should I select?

Probably doesn't matter, you can just reconfig it when you get it.

> - which Raid Controller is the best in plain JBOD mode? Which will allow 'smartctl' to monitor the individual disks' health?

I believe the PERC H200 and SAS6 both use the same chip, but the SAS6 may be more straightforward, however be aware of:

http://bugzilla.kernel.org/show_bug.cgi?id=14831

Personally, I've not had good experiences of using SATA drives with these controllers under Linux - I'd prefer a straight AHCI controller like the Intel ICH10, or SiI3132 etc. instead of SATA-on-SAS...

Tim.
Re: 3.5in SAS flash drive for R900?
On 09/09/10 18:57, Philip Tait wrote:
> Will the PERC-6i do TRIM passthrough? If not, you'll need to use an AHCI controller such as the server's onboard Intel SATA controller etc. (this is what we are using). Does your kernel and filesystem support TRIM too?
>
> Is the availability of TRIM support critical to the operation of these drives?

I believe all SSDs will suffer degraded performance over time, and/or excessive wear without TRIM - they end up doing extra writes in order to (unnecessarily) preserve the contents of deleted logical blocks. Same goes for running them pretty-much full all the time with TRIM. However the Intel drives suffer less from a lack of TRIM support than some other designs do (i.e. performance degrades, but doesn't drop through the floor).

This is basically due to the current generation of SSDs pretending to be a hard disk - when really they are flash (which has some very different physical properties). Probably the best solution from an engineering point of view would be to use a flash-file-system (Linux has several - and they are designed to suit the physical properties of flash storage), and have the SSDs expose the raw underlying flash. This, however, is not the route the industry is taking.

TRIM is a piece of gaffer/duct-tape to fix this problem by providing a mechanism for the OS to tell the SSD which logical blocks it no longer needs to work to preserve.

Without TRIM a workaround is to periodically discard all data by doing an ATA security-erase-unit, but this might not fit in with your anticipated usage.

bcache also looks very interesting, but is currently alpha-quality:

http://bcache.evilpiepirate.org/

BTW, the machine I'm using currently uses TRIM with both ext4 and btrfs on an Intel SSD, with an AHCI controller (Intel ICH10).

Tim.
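One way to check the "does TRIM make it through the controller?" question on a reasonably current Linux kernel is to look at what the block layer reports in sysfs. The sketch below assumes the `queue/discard_max_bytes` attribute is present (kernels from around the time of this thread may not expose it, in which case the function returns None); the function name and the `sysfs_root` parameter are hypothetical and only there to make the check easy to try out.

```python
from pathlib import Path


def discard_supported(device, sysfs_root="/sys"):
    """Return True/False if the kernel reports discard support, None if unknown."""
    path = Path(sysfs_root) / "block" / device / "queue" / "discard_max_bytes"
    try:
        value = int(path.read_text().strip())
    except (OSError, ValueError):
        # Attribute missing or unreadable: cannot tell either way.
        return None
    # A value of 0 means the device (as seen through its controller)
    # does not accept discard requests.
    return value > 0
```

For example, `discard_supported("sda")` would report on /dev/sda. A False result for an SSD behind a RAID controller suggests the discard requests are not being passed through, which is the situation the message above recommends avoiding by using an AHCI controller.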
Re: Advice for a debian server
On 06/09/10 13:57, Emmanuel Lesouef wrote:
> Reading this list, I sometimes see threads complaining about compatibility issue between recent Dell servers and Debian GNU/Linux. What is the most stable and compatible model in order to install Debian Stable (Lenny) ?

Anything without a SASx controller (LSI 1068 / LSI 1068E etc.), in my experience (just spent the morning recovering a Lenny server following data corruptions with these crappy controllers).

I'm quite happy with the option of using software RAID along with the onboard SATA controllers on the R210, R310, R410. Not sure if they do a hot-swap option with SATA-only (i.e. non-SAS) configurations, but if they don't and you need this, then I can recommend the Intel Server Systems instead (e.g. Intel SR1630HGPRX, SR1695GPRX etc.) - they are engineered to a similar quality, and you don't end up paying through the nose for large hard disks.

Make sure you use a recent Lenny kernel for the bnx2 NICs in the R210 etc. to work - you'll need to have the firmware-bnx2 package installed. The Lenny installer images probably have the kernel patch in by now, but if they don't, just install using a USB NIC, or similar, and then update to the latest kernel post-install.

Tim.
Re: 16tb filesystems on linux
On 27/08/10 09:18, Andrew Robert Nicols wrote:
> As I say, we're primarily a Debian shop and Solaris did used to feel like a bit of a thorn in the side but things have improved.

Did you consider/try ZFS on Debian-kFreeBSD instead of OpenSolaris to try and make things less painful?

http://packages.debian.org/sid/zfsutils

Cheers,
Tim.
Re: Dell OpenManage 6.3 for Ubuntu
On 23/08/10 15:05, Johan Sjöberg wrote:
> I then tried to install it on Debian testing (Squeeze). On that version, it was only the libsmbios-utils package that stopped me from installing.

The Ubuntu smbios-utils package has the same contents as the Debian libsmbios-bin package, AFAIK.

Tim.
Re: external ESATA drives in a R610
On 19/08/10 22:50, Bond Masuda wrote:
> Have you actually tested hot plugging a eSATA drive from a controller based on a Silicon Image chip that also has a PCI-E to PCI-X bridge?

Nope.

> i have one such card, which works as long as I don't hot plug. if I do hot plug, I get a machine check and instant shutdown/restart...

Not seen anything like that myself, sounds like that might be an electrical issue, or a design fault on that card maybe? If you still have access to that setup - maybe try connecting ground on the drive chassis, and the PE first... Were you able to decode the machine-check?

Cheers,
Tim.
Re: external ESATA drives in a R610
The Intel ICH10R (as used in the R210, R310, R410, R510) does support SATA port multipliers (in which case I'd suggest a simple header to bring the motherboard SATA out to the back-panel), but the R610 seems to use the ICH9, which doesn't.

The Intel ICH are very good SATA chips, and will get you a very high throughput, but as Silicon Image make most of the port multiplier chips, they may have better compatibility. The Sil3132s work well, but have somewhat limited throughput (300M per second each I think), but this may not be an issue to you depending on your application.

The Sil3124s may be a bit better - they are 4 port chips, and although this is a PCI-X chipset you can find implementations which put them behind a PCIe to PCI-X bridge chip - have a look on ebay or elsewhere, they are around $70 or so, I think. It looks like the card you found is one of those (3124 + PCIe bridge), although the fan+HS on it looks like a load of BS to me (it's on the PCI bridge chip unless I'm mistaken - and is almost certainly there to make it look cool only), and the price is very high.

You can find more detailed info including port multiplier (PM) support here:

https://ata.wiki.kernel.org/index.php/Hardware,_driver_status

Also ask on the linux-ide mailing list...

> I'm interested in adding hotplug ESATA capability with port multiplier for backup purposes to something like an R610. Is this supported by any of the Dell external SAS controllers (don't need raid for this).

I don't think any SAS multipliers support SATA port multipliers. You can use SAS multipliers instead, but I don't like LSI's SAS cards, and have had loads of trouble with them in the past (e.g. hardware bugs in the LSI1068E etc.) - just stick with plain commodity (well supported, and cheaper) SATA IMO.

Let us know what you come up with...

Tim.

p.s. I found that the R300s have a bug whereby they don't reliably detect Sil3132s.
Re: PowerEdge 2800: megaraid/scsi errors (PERC 4e/di)
On 05/08/10 08:00, Marc Petitmermet wrote:
> megaraid: aborting-12854 cmd=2ac=2 t=0 I=0
> megaraid abort: [255:128], driver owner
> megaraid: resetting the host...
>
> What do the above errors mean? Are the disks failing or is this another hardware issue? Any advice would be greatly appreciated.

I'd want to take a closer look at the general health of the drives themselves (grown defect list, ECC correction count, uncorrectable error count and the like) using a tool like smartctl - recent smartmontools releases have support for looking at drives behind PERC 4s - search for megaraid in:

http://smartmontools.sourceforge.net/man/smartctl.8.html

Alternatively, if getting smartctl onto this box is fiddly (and you can easily take the drives offline), it might be easier to plug the drives into a plain SCSI controller on a more modern box...

HTH,
Tim.
Re: LCD Display on R710
On 17/12/09 19:00, Alexander Dupuy wrote:
> you can use [...] the Dell-modified ipmitool (http://linux.dell.com/files/openipmi/ipmitool/ipmitool-1.8.6-13.4.DELL.13.src.rpm or https://launchpad.net/ubuntu/karmic/+source/ipmitool):
>
> # ipmitool delloem lcd help
> lcd set {mode}|{lcdqualifier}|{errordisplay}
> lcd set mode {none}|{modelname}|{ipv4address}|{macaddress}|
>     {systemname}|{servicetag}|{ipv6address}|{ambienttemp}
>     {systemwatt }|{assettag}|{userdefined}text
> [...and lots more...]

Sorry to resurrect an ancient thread, but out of interest which version are you running to get those options? I used the binaries from launchpad (1.8.9+patches and 1.8.11+patches), and also compiled the 1.8.6 version from the src rpm, and I only get:

ipmitool-with-dell-hacks/ipmitool-1.8.6$ ./src/ipmitool delloem lcd help
lcd set {none}|{default}|{custom text}
    Set LCD text displayed during non-fault conditions
lcd info
    Show LCD text that is displayed during non-fault conditions

... and that's all.

Tim.

p.s. Any word on when Dell is going to sort their patches out and get them in upstream ipmitool so we don't have to put up with this bollocks?
Chassis identify light status checking
Hi,

I was wondering if there's a way to get the status of the chassis identify light on recent PowerEdges?

If memory serves me correctly, this pops out in amongst the output from

ipmitool chassis status

on Intel SRxxx servers, and although I can turn the light on with

ipmitool chassis identify force

I can't see a way of checking the light's status, and no IPMI events seem to be generated...

Any ideas?

Cheers,
Tim.
Re: Safe to run smartctl on PowerEdge 2950 Dell PERC 5/E?
dellpowere...@semantico.com wrote:
> Hi List, I've been reading about the smartctl problems people have been having. I couldn't see my controller listed in any of the threads.
>
> OS is RHEL5.4
> smartmontools is smartmontools-5.38-2.el5

Safe as far as I know, but if you have a test system which you can hammer using smartctl whilst under I/O load for a few hours, I'd recommend doing that first. I'd be interested to see the results, and have been meaning to do this myself.

To be on the safe side, perhaps you'd want to review the controller and disk firmware changelogs (from the Dell firmware update packages) too, to see if there have been any passthrough or SMART related bug fixes since the versions that you have installed (assuming they aren't the latest) - however the risk of a manual one-off run of smartctl is likely to be far lower than that of running smartd continuously...

Whilst I'm at it - DELL: PLEASE can you include the full release history in your changelogs, not just the changes since the last release - chasing through 10 package releases to piece together all the changes from an oldish release is just PAINFUL. Other vendors (e.g. Intel) include the full history, and it makes my life a LOT easier.

Cheers,
Tim.
Intel VT-d (IOMMU) support on poweredge servers
Hi,

I'm trying to find out which PowerEdges support Intel VT-d (Directed I/O), so that I can pass through a PCIe SCSI controller directly to a virtualised W2K3 instance using KVM. It looks to me like the PE1950 doesn't, but I can't seem to find a definitive answer...

Anyone have that info?

Cheers, Tim.
Re: [Fwd: [Bug 616517] libvirt should not use the MAC address assigned to tap devices/vnet interfaces by the TAP/TUN driver.]
Andreas Rogge wrote:
> The bridge-interface will lose network connectivity because it changes its MAC.

Perhaps this is a bug in the Linux bridging code? Shouldn't brctl setportprio be used to force the MAC of the bridge to be the same as the MAC of the physical port - then it wouldn't change? It didn't work the last time I tried this though (which I thought was probably a bug), and it's not particularly well documented, so I may have the wrong end of the stick here.

Tim.
Re: Serial over LAN
Rahul Nabar wrote:
>> ** First thing you should do in BMC setup is reset to default. The BMCs often ship with a weird non-default setting that will cause lots of serial port feedback if you try to run a getty on the serial console.
>
> Would ipmitool, or syscfg or another equivalent tool have a way of setting the BMC SOL? I already set the BMC's IP and password using ipmitool. I have ~300 machines so doing this step manually via Ctrl+E is tedious.

SoL can be set up and used quite easily using ipmitool. Disabling IPMI-over-serial is another matter (without rebooting and entering the BMC-control bit of the BIOS), but this seemed to work on the PE1950s which I have access to:

    ipmitool raw 0x06 0x40 0x02 0xb8 0x84
    ipmitool raw 0x06 0x40 0x02 0x78 0x44

AFAIK this issue was only present on some PE1950s, and was probably a problem with Dell's production-time automated testing system (IPMI over serial - not SoL - was set up to carry out various tests, but not turned off again prior to shipping).

HTH, Tim.
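For a few hundred machines, something along these lines might save the Ctrl+E trips. This is only a sketch under assumptions that are not from the post: the BMCs are already reachable with -I lanplus, the user is "root", the password is in $IPMI_PASSWORD (which ipmitool -E reads), SoL lives on channel 1, and the raw commands behave the same when sent over the LAN as they do locally. The function names and the host list file are made up. With DRY_RUN=1 the commands are printed rather than run, which is worth doing first:

```sh
#!/bin/sh
# run CMD...: execute the command, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# configure_bmc HOST: apply the same settings to one BMC.
configure_bmc() {
    host=$1
    # Unquoted on purpose below so that it splits into separate arguments.
    base="ipmitool -I lanplus -H $host -U root -E"
    # Enable SoL on channel 1 (channel number is an assumption).
    run $base sol set enabled true 1
    # The two raw commands quoted above (disable IPMI-over-serial).
    run $base raw 0x06 0x40 0x02 0xb8 0x84
    run $base raw 0x06 0x40 0x02 0x78 0x44
}

# Usage, with one BMC address per line in bmc-hosts.txt (hypothetical file):
#   while read -r h; do configure_bmc "$h"; done < bmc-hosts.txt
```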
Re: 1-2TB SATA drives currently shipping with 11G servers + onboard SATA options?
Peter Kjellstrom wrote:
> We've seen both Hitachi and Seagate 2T drives from Dell.

Great, thanks. Any idea what the model and firmware numbers are (preferably from hdparm -I, or otherwise cat /proc/scsi/scsi, lsscsi etc.)?

Cheers! Tim.
Re: 1-2TB SATA drives currently shipping with 11G servers + onboard SATA options?
Blake Hudson wrote:
> Tim Small wrote:
>> I know they've previously been shipping Seagate ST31000340NS 1TB drives, but I've no idea which vendor/model 2TB drives they're using?
>
> I recently purchased a couple Dell branded 2TB SATA drives. They are Hitachi's - HDS72202A28A.

Interesting - thanks for that. Any idea if they were pulls from server-class or desktop-class Dell hardware? The reason I ask is that whilst Google comes up blank for that part number, the non-Dell 2TB Hitachi Deskstar (i.e. desktop-class drive) is the HDS722020ALA330, whereas their Ultrastar drives have the part number HUA722020ALA330...

Anyone else had any 1TB or 2TB SATA drives in PowerEdges?

Cheers, Tim.
1-2TB SATA drives currently shipping with 11G servers + onboard SATA options?
Hi,

I'm planning on purchasing a couple of 11G servers (probably R210 / R310 / R410) for a client, and was wondering which 1-2TB drives Dell are shipping with these boxes? I know they've previously been shipping Seagate ST31000340NS 1TB drives, but I've no idea which vendor/model 2TB drives they're using?

Also: if you order the "On-board SATA Controller, Min. 1 Max. 4 SATA Only Cabled Drives" type options (which is what I'm planning on doing given all the bollocks I've had from LSI controllers recently - give me a plain old just-works-and-fast Intel ICH10 any day), do the servers actually ship with all the cables and mounting hardware for adding bare drives to non-populated non-hotswap drive bays?

i.e. does the R310 with that option selected, plus just a single SATA drive, actually ship with the brackets and cables to fit a bare 3rd party SSD of my choice (such as the Intel ones) at a later date? I realise that I might need to supply a suitable 2.5-to-3.5-inch bracket, and maybe a few screws myself. Or would I have to hunt around the net at great length/expense to do that (or just buy the min-size SATA drives with the server, and discard them when I needed to put in the SSD)?

The online configurator for the R410 no longer seems to have the on-board SATA controller option - anyone know if it's still actually available (it definitely used to be there, and is in the R410 technical guide book)?

Thanks! Tim.
Re: dell 2850 initrd problem.
Ron Croonenberg wrote:
> So if I issue: e2fsck -f -b 32768 /dev/mapper/VolGroup00-LogVol01

Hi,

If you're running filesystem checks, I'd strongly suggest running them on a backup image, or (easier to achieve and quicker) running them against a copy-on-write snapshot (see my post of about a month ago about RAID recovery).

You might want to try the ext3-users mailing list or similar if you need more help putting the filesystem back together.

Tim.
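The copy-on-write approach could look roughly like the sketch below, which uses the device-mapper snapshot target directly. It assumes the origin volume is not mounted or otherwise being written to while the snapshot exists. The name "fsck-snap", the COW file path and its 1 GiB size are arbitrary choices for illustration, and both helper functions are made up:

```sh
#!/bin/sh
# snapshot_table SECTORS ORIGIN COWDEV: print a device-mapper table line for
# a non-persistent (N) snapshot using 8-sector (4 KiB) chunks.
snapshot_table() {
    echo "0 $1 snapshot $2 $3 N 8"
}

# make_snapshot ORIGIN [COWFILE]: create /dev/mapper/fsck-snap on top of
# ORIGIN, with all writes diverted into a sparse file.
make_snapshot() {
    origin=$1
    cowfile=${2:-/tmp/fsck-cow.img}
    # Sparse 1 GiB file to absorb whatever fsck writes.
    dd if=/dev/zero of="$cowfile" bs=1M count=0 seek=1024 2>/dev/null
    cowdev=$(losetup -f --show "$cowfile")
    sectors=$(blockdev --getsz "$origin")      # size in 512-byte sectors
    dmsetup create fsck-snap --table "$(snapshot_table "$sectors" "$origin" "$cowdev")"
}

# Usage (as root):
#   make_snapshot /dev/mapper/VolGroup00-LogVol01
#   e2fsck -f /dev/mapper/fsck-snap
#   dmsetup remove fsck-snap        # throw the changes away
```

If fsck writes more than the COW file can hold, the snapshot becomes invalid, so a larger file may be needed for a badly damaged filesystem.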
Re: how to find which server to buy
Tapas Mishra wrote:
> I want to know how to decide as which server of Dell will be able to sustain request to a particular application. That is what should be CPU frequency Quad Core or 2 Cpu or RAM. If it makes sense then stress testing of a Dell Server for my application. Assuming I know the maximum load of my application.

Err - simulate your application on some other hardware (a desktop PC is often fine as a starting point), and see when the following things saturate and the overall application performance becomes unacceptable (e.g. latency, or whatever is appropriate to measure your application performance):

 . CPU
 . RAM
 . disk
 . network bandwidth

... then do some rough calculations using:

 . The figures that you've just got from your benchmarks.
 . The approximate difference between your test system and the system that you are thinking of buying.

... that should get you pretty close.

Tim.
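The rough calculation could be as simple as the sketch below. All three inputs are guesses you supply yourself, and the numbers in the usage line are purely hypothetical, not measurements of any Dell hardware:

```sh
#!/bin/sh
# estimate_capacity MEASURED SPEEDUP_PCT HEADROOM_PCT
#   MEASURED:     load at which the test box saturated (e.g. requests/s)
#   SPEEDUP_PCT:  rough speed of the target relative to the test box, in percent
#   HEADROOM_PCT: share of the scaled figure you are prepared to plan for
# Integer arithmetic only; hypothetical helper for illustration.
estimate_capacity() {
    echo $(( $1 * $2 / 100 * $3 / 100 ))
}

# Usage: test box saturates at 400 req/s on its bottleneck resource, target is
# guessed to be 2.5x faster on that resource, and we plan for 70% of that:
#   estimate_capacity 400 250 70     # prints 700
```

Do this per resource (CPU, RAM, disk, network) and take the lowest result, since whichever saturates first sets the limit.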
Re: Serial over LAN
Adam Nielsen wrote:
> Perhaps you can answer something that's been bugging me for some time. How does this actually work? I mean, what gets sent over the wire when you redirect a serial port? It's always bugged me that there's not much information about how this is done, and it seems to use a bit too much magic for my liking. I mean, is it TCP? Can you restrict access to it with a firewall? How does it share the network card with the host OS, in the cases where you use the one NIC for both?

The IPMI BMC is a complete autonomous embedded computer on the motherboard. It has various connections to the main computer, but is otherwise distinct from it (it runs all its own code, and has its own CPU and RAM).

For shared LAN access it also (typically) has a backdoor into the NIC chip, so that it can tell the NIC to - for example - pass it all traffic destined for a certain MAC address (earlier implementations were even stranger, in that they could set the NIC up to do things like steal all UDP traffic to the IPMI port).

The IPMI over LAN protocol is implemented as UDP (on port 623) - look at the LAN INTERFACE and LANPLUS INTERFACE entries in a recent ipmitool manual page for details.

With SOL, Linux sends serial data to the serial port. The output of this serial port is connected to the BMC, which receives the traffic on its own serial port, encapsulates it as IPMI lanplus SOL UDP packets, and sends it out via the NIC backdoor...

Because of the way that the BMC goes straight to the NIC, any iptables firewalls under Linux aren't going to see the traffic, so you'd need to do any firewalling before the traffic hits the NIC (i.e. outside of the box). Another alternative is to configure the BMC to only communicate on a separate VLAN, so that you can isolate it from other traffic using that mechanism instead (e.g. ipmitool lan set X vlan id 888).

Tim.
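Since the filtering has to happen outside the box, here is a sketch of what it might look like on an upstream Linux router. The function only prints the iptables rules so they can be reviewed first. The addresses are documentation placeholders, the function name is made up, and it assumes the router forwards traffic between the admin network and the BMC:

```sh
#!/bin/sh
# bmc_rules BMC_IP ADMIN_NET: print rules that let only ADMIN_NET reach the
# BMC's IPMI-over-LAN port (UDP 623) and drop everything else aimed at it.
bmc_rules() {
    bmc=$1
    admin_net=$2
    echo "iptables -A FORWARD -p udp -s $admin_net -d $bmc --dport 623 -j ACCEPT"
    echo "iptables -A FORWARD -p udp -d $bmc --dport 623 -j DROP"
}

# Usage (on the router, as root, after checking the output):
#   bmc_rules 192.0.2.10 198.51.100.0/24 | sh
#
# The VLAN alternative mentioned above, assuming LAN channel 1:
#   ipmitool lan set 1 vlan id 888
```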
Re: Serial over LAN
Jefferson Ogata wrote:
> Hmm, my impression was that Redirection After Boot only affects comms up to the point of getting a bootloader running, but I'm not 100% sure.

I believe (but again, I could be wrong) that with redirection after boot enabled, the BIOS polls the VGA character buffers using a timer interrupt. This mechanism continues to work until Linux switches the CPU to protected mode very early in the kernel boot process - i.e. it works in DOS and the boot loader - and (as has been mentioned already) it relies on things like the boot loader NOT being configured to communicate with the serial port directly themselves.

Personally, I normally enable redirection after boot, and disable the native serial comms in grub.

Tim.
Re: Serial over LAN
Case van Rij wrote:
> I have this configured on 45 R410s with iDRAC Express and have tried it in the past on 50 R210s and a handful of 1950s and I have to say, even once configured it's actually rather frustrating to use and it's no substitute for a cyclades-like serial console server.
>
> I initially tried to use this with the ethernet port on my regular switch (even though the ethernet port was dedicated to PXE and IPMI, all normal traffic is using an add-on 10G card) and the management port would simply stop responding to network traffic on a daily basis. I've since moved the ethernet to an isolated switch and it's marginally better, but the serial-over-LAN is still so unresponsive that remote management automation frequently times out while trying to manage the server.

I've found performance on R210s to be good; 1950s and PE860s less so. I've found reliability pretty good using recent ipmitool builds. Here's how I have an R210 set up (I think these were just the defaults):

    # ipmitool sol info 1
    Character Accumulate Level (ms) : 50
    Character Send Threshold        : 255
    Retry Count                     : 7
    Retry Interval (ms)             : 480
    Volatile Bit Rate (kbps)        : 115.2
    Non-Volatile Bit Rate (kbps)    : 115.2
    Payload Channel                 : 1 (0x01)
    Payload Port                    : 623

You may want to try:

 1. Using a recent ipmitool, if you're not already.
 2. Fiddling with the first 4 values. I'm guessing that things like the retry interval could come down on a LAN - half a second to retry a dropped packet seems like a long time at first consideration.

It would perhaps have been nice if IPMI did stuff over TCP rather than UDP, but when it was defined it didn't do SoL, so I suppose UDP was good enough, and nice and light-weight for doing other (non-interactive) IPMI comms.

HTH, Cheers, Tim.
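If you do want to fiddle with those values, note that (as far as I recall from ipmitool's own usage text, so treat this as an assumption and check the output of "ipmitool sol set" on your build) retry-interval is given in 10 ms increments and character-accumulate-level in 5 ms increments, whereas sol info reports both in milliseconds. A small sketch that does the conversion and prints the commands rather than running them; both function names are made up:

```sh
#!/bin/sh
# ms_to_units MS STEP: convert milliseconds to raw units of STEP ms each,
# rounded down, with a minimum of 1.
ms_to_units() {
    u=$(( $1 / $2 ))
    if [ "$u" -lt 1 ]; then
        u=1
    fi
    echo "$u"
}

# print_sol_tuning RETRY_MS ACCUMULATE_MS [CHANNEL]: print the ipmitool
# commands for the requested timings (channel defaults to 1).
print_sol_tuning() {
    ch=${3:-1}
    echo "ipmitool sol set retry-interval $(ms_to_units "$1" 10) $ch"
    echo "ipmitool sol set character-accumulate-level $(ms_to_units "$2" 5) $ch"
}

# Usage: a 100 ms retry interval and 25 ms accumulate level on channel 1
#   print_sol_tuning 100 25
# Check the result afterwards with: ipmitool sol info 1
```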
Re: Serial over LAN
Jefferson Ogata wrote:
> On 2010-05-20 16:14, Marc Moreau wrote:
>> I'm looking to setup Serial over Lan on my cluster of PowerEdge 1950's. Does anyone have this setup?
>
> Sure. Use it all the time.

Yes - here too - using ipmitool on Debian. I use pretty much the same setup with Dell PE 860/1950/R210, Intel SR1500/SSR212MC2, and a couple of different Tyan boxes too.

Dell's IPMI implementation is reasonable, but for some weird reason they want to develop proprietary tools to manage it - what's wrong with just working with the existing open source tools? They seem to work pretty well, and have less wacky interfaces than the Dell-proprietary stuff. I *really* don't get this policy.

> 2. In BMC setup (control-E during POST), enable serial over LAN, set IP and password.
>
> ** First thing you should do in BMC setup is reset to default. The BMCs often ship with a weird non-default setting that will cause lots of serial port feedback if you try to run a getty on the serial console

Actually, a guy I work with (David from Positive Internet) recently fixed that on PowerEdge 1950s with:

    ipmitool raw 0x06 0x40 0x02 0xb8 0x84
    ipmitool raw 0x06 0x40 0x02 0x78 0x44

He got this from tracing what the Dell proprietary binary did.

Cheers, Tim.
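For anyone wondering what those raw bytes mean: NetFn 0x06 with command 0x40 is Set Channel Access in the IPMI spec, and the three data bytes are the channel number, the access settings and the privilege limit. The sketch below decodes them according to my reading of the IPMI 2.0 spec. That channel 2 is the serial channel on these boxes is an assumption, although it fits the description above. decode_channel_access is a made-up helper:

```sh
#!/bin/sh
# decode_channel_access CHANNEL BYTE2 BYTE3
# Interpret the data bytes of an IPMI "Set Channel Access" request
# (NetFn App 0x06, command 0x40). Arguments are hex with a 0x prefix.
decode_channel_access() {
    ch=$(( $1 & 15 ))
    b2=$(( $2 ))
    # Bits 7:6 select which copy of the setting is written.
    case $(( (b2 >> 6) & 3 )) in
        1) store="non-volatile" ;;
        2) store="volatile" ;;
        *) store="unchanged" ;;
    esac
    # Bits 2:0 are the access mode.
    case $(( b2 & 7 )) in
        0) mode="disabled" ;;
        1) mode="pre-boot only" ;;
        2) mode="always available" ;;
        3) mode="shared" ;;
        *) mode="reserved" ;;
    esac
    # Low nibble of byte 3 is the privilege level limit (4 = administrator).
    priv=$(( $3 & 15 ))
    echo "channel $ch: access mode '$mode' ($store), privilege limit $priv"
}

# Usage:
#   decode_channel_access 0x02 0xb8 0x84
#   decode_channel_access 0x02 0x78 0x44
```

So the pair appears to set the access mode of channel 2 to "disabled", once in the volatile (running) settings and once in the non-volatile ones.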
Re: Manually reconstructing a RAID5 array from a PERC3/Di (Adaptec)
On 06/05/10 08:47, Support @ Technologist.si wrote:
> Hi tim, You gave yourself a hell of a job.. Below here are some links.. the last 2 links are linux ways to go..
> http://forum.synology.com/enu/viewtopic.php?f=9&t=10346
> http://www.diskinternals.com/raid-recovery/
> http://www.chiark.greenend.org.uk/~peterb/linux/raidextract/
> http://www.intelligentedu.com/how_to_recover_from_a_broken_raid5.html

Ta for those who sent along some tips...

In the end, I did manage to persuade the controller to put the array back together (it succeeded on the second attempt, after restoring the drive metadata from the backups I'd taken). Part of the reason that I didn't try this originally is that I didn't have access to any spare SCSI/SCA drives, or the original RAID controller either!

Once I had access to the original block device, I created a COW snapshot in order to run fsck.ext3 on the filesystem without actually triggering any writes to the array (I think a write caused by replaying the journal killed the array the first time around). Here are some handy instructions on using dmsetup to do this:

http://www.thelinuxsociety.org.uk/content/device-mapper-copy-on-write-filesystems

... which would also be handy in the case of any other file-system corruption, and is a lot faster than copying around image files!

Before that I tried the following method using Linux software RAID to reconstruct the array (which nearly worked):

 . Take images of the 5 drives.

 . Work out how big the metadata is (assuming it's at the beginning of the drives):

       for i in {0..1024} ; do dd if=/mnt/tmp/raid_0 skip=$i | file - ; done

   ... etc. for all 5 drive images.

 . Create read-only loop-back devices from the drive images using:

       losetup -r -o 65536 /dev/loop0 /mnt/tmp/raid_0

   ... having found a valid MBR 64k into one of the drives - so assuming the Adaptec aacraid controller metadata was on the first 64k of the disk. The loop device skips over this first 64k using the offset argument above.

 . Create a set of 5 empty files (to hold the Linux md metadata) using dd, and set these up as loopX as well.

 . Create a set of RAID appends (linear arrays, without metadata) using:

       ./mdadm --build /dev/md0 --force -l linear -n 2 /dev/loop0 /dev/loop10

   etc. - with the idea that a to-be-created-later md RAID5 device will put its (version 0.9) metadata into the (read/write) files which make up the end of these RAID append arrays. It would be handy if you could create software RAID5s without metadata, but you can't - they wouldn't be much practical use except for this sort of data-recovery purpose, I suppose.

 . Create a set of degraded md RAID5s using commands like:

       ./mdadm --create /dev/md5 -e 0.9 --assume-clean -l 5 -n 5 /dev/md0 /dev/md1 /dev/md2 /dev/md3 missing

   ... for all possible permutations of 4 out of the 5 drives, plus one missing (actually it tried the all-5-drives running layouts as well, but I disregarded these to be on the safe side).

   http://www.perlmonks.org/?node_id=29374

       perl permutations.pl /dev/md0 /dev/md1 /dev/md2 /dev/md3 /dev/md4 missing | xargs -n 6 ./attempt.sh 2>&1 | tee output2.txt

   Where attempt.sh looks like this:

       #!/bin/bash
       lev=5
       # Try each parity layout: left/right symmetric/asymmetric.
       for layout in ls la rs ra
       do
           for c in 64
           do
               echo
               echo
               echo
               echo
               echo level: $lev alg: $layout chunk: $c order: $1 $2 $3 $4 $5
               # Assemble the candidate array without resyncing it.
               echo y | ./mdadm-3.1.2/mdadm --create /dev/md5 -e 0.9 --chunk=${c} -l $lev -n 5 --layout=${layout} --assume-clean $1 $2 $3 $4 $5 > /dev/null 2>&1
               # A swap partition in the table suggests a plausible layout.
               sfdisk -d /dev/md5 2>&1 | grep 'Id=82'
               sleep 4
               # Read-only check of the data partition.
               fsck.ext3 -v -n /dev/md5p1
               mdadm -S /dev/md5
           done
       done

   ... so this assembles a v0.9 metadata md array (which puts its metadata at the end), and then looks for a Linux swap partition in the partition table, and tries a read-only fsck of the data partition. A chunk size of 64 seemed to be the default for the BIOS, but I did originally try others.

Anyway, this came up with two layouts which looked kind-of-OK (which is what I was expecting, as I assume that first one drive failed, then a second); both used the left-asymmetric parity layout.

... but e2fsck came up with loads of errors, and although the directory structure ended up largely intact, the contents of most files were wrong - so there must be something else which is a bit different about the way that these aacraids lay out their data - maybe something discontinuous about the array or something?

After I'd completed the job, I didn't have time to compare the linux-software-raid reconstructed image with the aacraid-hw-raid reconstructed version, but this would be easy enough to do using some test data.

I've posted this detail here in case someone is faced with having to attempt a similar job again, but can't get the controller to put the data back together - or perhaps someone who is trying this with drives from a different HW raid controller - in which case this method might Just Work (tm). Similarly if anyone else can see anything
Re: Manually reconstructing a RAID5 array from a PERC3/Di (Adaptec)
On 12/05/10 14:59, J. Epperson wrote:
> Not that I'd attempt anything like this short of a national security issue or forensics for a particularly heinous crime

Would running a server with:

 . No RAID array status monitoring, and..
 . No backups at all

... be sufficiently heinous?

Tim.
Re: Manually reconstructing a RAID5 array from a PERC3/Di (Adaptec)
Tim Small wrote:

Here's a diff between the hex-dump of the first 128 sectors of two of the drives

--- /tmp/scsi-SSEAGATE_ST336607LC_3JA760WM.raw.hd 2010-05-05 22:18:02.0 +
+++ /tmp/scsi-SSEAGATE_ST336607LC_3JA763SY.raw.hd 2010-05-05 22:18:02.0 +
@@ -1,56 +1,54 @@
 0000  56 19 02 00 1e 00 00 00  10 00 00 00 f6 cd 3c 04  |V.............<.|
-0010  00 00 02 00 7d d3 9e 6c  00 00 00 00 00 00 00 00  |....}..l........|
+0010  00 00 02 00 5a 96 f3 61  00 00 00 00 00 00 00 00  |....Z..a........|
 0020  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
 *
-01f0  00 00 00 00 00 00 00 00  00 00 00 00 9e 55 f8 0a  |.............U..|
-0200  c4 55 00 00 d3 30 04 bd  01 f0 fa fa 33 03 00 00  |.U...0......3...|
+01f0  00 00 00 00 00 00 00 00  00 00 00 00 14 8f e2 44  |...............D|
+0200  c4 55 00 00 d3 30 04 bd  01 f0 fa fa 2c 03 00 00  |.U...0......,...|
 0210  00 00 00 00 2c 00 00 00  00 00 00 00 2c 00 00 00  |....,.......,...|
-0220  00 00 00 00 32 03 00 00  00 00 00 00 00 00 00 00  |....2...........|
+0220  00 00 00 00 2b 03 00 00  00 00 00 00 00 00 00 00  |....+...........|
 0230  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
 *
-03f0  00 00 00 00 00 00 00 00  00 00 00 00 72 3c 4f d8  |............r<O.|
+03f0  00 00 00 00 00 00 00 00  00 00 00 00 72 3c 6f 23  |............r<o#|
 0400  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
 *
 0c00  98 19 13 04 00 00 01 00  d3 30 04 bd 01 f0 fa fa  |.........0......|
 0c10  ff ff ff ff ff ff ff ff  00 00 00 00 ff ff ff ff  |................|
 0c20  ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff  |................|
 0c30  ff ff ff ff ff ff ff ff  ff ff ff ff 00 00 00 00  |................|
 0c40  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
 *
 0df0  00 00 00 00 00 00 00 00  00 00 00 00 eb 72 03 35  |.............r.5|
 0e00  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
 *
 1200  99 25 03 00 d3 30 04 bd  01 f0 fa fa 2c 00 00 00  |.%...0......,...|
 1210  14 00 00 00 40 00 00 00  00 00 00 00 40 00 00 00  |....@.......@...|
-1220  00 00 00 00 32 03 00 00  00 00 00 00 00 04 05 ff  |....2...........|
-1230  d0 45 97 46 00 00 01 00  02 00 01 00 04 00 00 00  |.E.F............|
+1220  00 00 00 00 2b 03 00 00  00 00 00 00 00 04 05 ff  |....+...........|
+1230  d0 45 97 46 00 00 03 00  02 00 01 00 04 00 00 00  |.E.F............|
 1240  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
 *
-1ff0  00 00 00 00 00 00 00 00  00 00 00 00 0b 02 09 e9  |................|
-2000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
+1ff0  00 00 00 00 00 00 00 00  00 00 00 00 0f 02 29 ea  |..............).|
+2000  00 00 00 00 01 96 01 6c  62 45 43 05 d3 30 04 bd  |.......lbEC..0..|
+2010  01 f0 fa fa 22 03 00 00  80 00 00 00 00 c5 3c 04  |....".........<.|
+2020  d3 30 04 bd 01 f0 fa fa  03 00 00 00 03 00 00 00  |.0..............|
+2030  04 00 00 00 00 00 00 00  05 00 00 00 01 00 00 00  |................|
+2040  1a 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
+2050  00 00 00 00 00 00 00 00  01 00 00 00 80 00 00 00  |................|
+2060  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
+2070  00 00 00 00 00 00 00 00  00 00 00 00 03 00 00 00  |................|
+2080  00 00 00 00 00 00 00 00  01 01 00 00 ff ff 00 00  |................|
+2090  52 35 20 53 79 73 74 65  6d 20 20 20 20 20 20 20  |R5 System       |
+20a0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
 *
-2200  00 00 00 00 01 96 01 6c  62 45 43 05 d3 30 04 bd  |.......lbEC..0..|
-2210  01 f0 fa fa 33 03 00 00  80 00 00 00 00 c5 3c 04  |....3.........<.|
-2220  d3 30 04 bd 01 f0 fa fa  02 00 00 00 02 00 00 00  |.0..............|
-2230  04 00 00 00 00 00 00 00  05 00 00 00 02 00 00 00  |................|
-2240  1a 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
-2250  00 00 00 00 10 00 00 00  01 00 00 00 80 00 00 00  |................|
-2260  00 00 00 00 00 00 00 00  00 00 00 00 00 00 08 00  |................|
-2270  01 00 00 00 01 00 00 00  00 a6 03 a0 02 00 00 00  |................|
-2280  00 00 00 00 00 00 00 00  01 01 00 00 ff ff 00 00  |................|
-2290  52 35 20 53 79 73 74 65  6d 20 20 20 20 20 20 20  |R5 System       |
-22a0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
-*
-23f0  00 00 00 00 00 00 00 00  00 00 00 00 0d d4 d2 4b  |...............K|
-2400  00 00 00 00 00 00 00 00  00 00 00
SAS5 SAS6 etc. DMA alignment (hardware/firmware bug) - can Dell investigate this one please?
Hi, With respect to: http://lkml.org/lkml/2010/4/26/335 https://bugzilla.kernel.org/show_bug.cgi?id=14831 It would appear that it's unsafe to carry out ATA-passthrough operations on SAS5* and SAS6* controllers (1068 / 1068E and others). This definitely affects smartctl, and I'm assuming it has the potential to impact operations such as drive firmware updates - locking up the controller during a firmware update seems like it could be a Bad Thing. Could someone at Dell take a look at this issue and see if their Linux firmware update packages could be impacted? If they are impacted, I guess they should be withdrawn until a workaround can be put in place... Thanks, Tim.
Re: how to get rid of bad blocks in a file on PERC 5/I?
On 01/05/10 07:16, Adam Nielsen wrote:
> disks, as long as the hardware RAID controller can keep up with the disks there would be no difference in performance.

I remember reading a benchmark which showed that under random I/O patterns, Linux software RAID performed better on a (from memory) 8-disk RAID5, due to better use of SCSI scatter/gather. I think this was vs. MegaRAID, but it was a while ago. I've not carried out any benchmarks myself, and have no idea whether this goes for SATA NCQ as well...

Tim.
Re: how to get rid of bad blocks in a file on PERC 5/I?
Adam Nielsen wrote:
> I believe that when hard disks discover they have a bad sector they attempt to remap it themselves, but it may not always happen right away. So it's possible that by the time you rebuilt the array the sectors had been relocated.

I believe the standard behaviour is:

 . Read, and apply simple (fast/hardware-implemented, AKA online) error correction.
 . If that fails, try to use more complex (slow/firmware-implemented, AKA offline) ECC - retry this a (usually configurable) number of times.
 . In the case of successful correction (we have the user data), write the data back to the sector, and then read-check it to see if it was written successfully.
 . If the re-read-verify is OK, then continue as normal (maybe increment one of the SMART counters).
 . If the re-read-verify fails, then reallocate the sector (use a spare hidden reserved sector elsewhere on the disk). Increment the SMART reallocated sector count.
 . If the offline ECC fails, then we've really lost data, so return a read-error to the disk controller and mark the sector as pending. Attempting to read the sector again may restart the offline correction attempts.

If the controller later tries to WRITE to that sector instead of reading it, then the drive will do the write and verify step again as above with the new data (i.e. see if the data can then be read, and if not then reallocate it).

In the case of a RAID controller, standard practice is for the controller to reconstruct the data from the other drives, and then issue the write instruction back to the original drive. The better RAID implementations (e.g. Linux software RAID) will actually REPORT THIS TO YOU when it happens, so that you know the drive may be unwell. To make matters worse, you can't even reliably check the SMART data yourself with some of the Dell/LSI controllers - and LSI/Dell don't seem to care enough to fix this...

https://bugzilla.kernel.org/show_bug.cgi?id=14831

> However given the subsequent failures I would think that the drive may actually be fine - maybe you can run a self test on it without going through a RAID controller.

Using smartctl to check what's gone on with the drive itself would be the best thing to do, I think... Recent smartctl has support for communicating with drives behind PERCs.

> I don't know whether the situation has improved in recent years, the experiences were enough to persuade me to switch to software RAID which I have stuck with ever since.

ACK. My conclusion is also to use AHCI, and software RAID. It's more reliable generally, and if you do find a bug, the maintainers are responsive (or you can even fix it yourself, or pay someone else to - this is Open Source, right? Presumably that's why people use Linux in the first place?). Oh, and it's cheaper too.

Tim.
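To watch the counters described in the list above, something like this sketch might do. It picks the raw value of a named attribute out of "smartctl -A" output. The device names in the usage lines are placeholders, "-d megaraid,N" needs a smartmontools build that supports it, and given the passthrough bug linked above it is worth trying on a non-critical box first. smart_attr_raw is a made-up helper:

```sh
#!/bin/sh
# smart_attr_raw NAME: read "smartctl -A" output on stdin and print the raw
# value (10th column of the attribute table) for the named attribute.
smart_attr_raw() {
    awk -v name="$1" '$2 == name { print $10 }'
}

# Usage, for the drive with MegaRAID device id 0 behind /dev/sda:
#   smartctl -A -d megaraid,0 /dev/sda | smart_attr_raw Reallocated_Sector_Ct
#   smartctl -A -d megaraid,0 /dev/sda | smart_attr_raw Current_Pending_Sector
```

A non-zero pending count that later drops while the reallocated count rises would match the write-and-verify behaviour described above.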
Re: PE2950, LSI SAS - SATA very slow
Adam Nielsen wrote:
> Hi all, We have a PowerEdge 2950 with an LSISAS1068E controller, hooked up to two Seagate 1TB SATA disks. For some reason the performance of these disks is quite poor - I'm lucky to get over 10MB/sec from them, when I should be getting closer to 100MB/sec.

I've seen flaky/unpredictable performance with these LSI controllers. Also, some Seagate 1TB drives have performance bugs which affect sequential reads, so you could try using alternative drive firmware, as the bug seems to have been fixed in the non-Dell firmware (see my recent posts to this list - still no response from Dell on this issue!).

Tim.
Re: PE2950, LSI SAS - SATA very slow
Adam Nielsen wrote:
> We have a PowerEdge 2950 with an LSISAS1068E controller, hooked up to two Seagate 1TB SATA disks. For some reason the performance of these disks is quite poor - I'm lucky to get over 10MB/sec from them, when I should be getting closer to 100MB/sec.

Sorry - forgot to say... You could try disabling NCQ on the disks (using lsiutil, or in the BIOS menu, I think), to see if that helps sequential reads.

Tim.
Re: Hot disk change.
Fabio Catunda wrote: I'm really worried to know if Linux would be able to read the old disk connected on a different SATA port. dmraid knows about the metadata formats of various hardware RAID controllers; so do recent mdadm 3.x tools, I think. Tim.
Seagate ST31000340NS firmware MA0D - Barracuda ES.2 1TB poor sequential read performance with NCQ enabled
Hi, I'm seeing poor sequential read performance (as measured using hdparm -t) using the recommended Dell firmware version MA0D on Barracuda ES.2 1TB drives using libata with ICH SATA controllers in AHCI mode with NCQ enabled (PowerEdge R210, R410 etc.). Similar performance bugs are seen during RAID verifies, rebuilds etc. If I put the non-Dell firmware version AN05 on the drives, the performance bug goes away. Similarly if I reduce the NCQ depth to 2, sequential read performance is restored, but as this may impact performance under random I/O loads, I don't really want to do this... Seeing as Seagate seem to have fixed this issue for other non-Dell firmwares, any chance they could be persuaded to do so for the Dell firmware series too? More here: https://ata.wiki.kernel.org/index.php/Known_issues#Affected_devices_2 Thanks, Tim.
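For anyone wanting to try the reduced NCQ depth mentioned above, here is a minimal sketch of how it might be done through the libata sysfs node. The function name and the optional sysfs-root argument are made up for illustration; the only thing assumed about the system is the standard /sys/block/DEV/device/queue_depth file. The setting does not survive a reboot.

```shell
#!/bin/sh
# Set the NCQ queue depth for one disk via sysfs (hypothetical helper).
#   $1 = block device name (e.g. sda)
#   $2 = depth (1 effectively disables NCQ; the maximum is typically 31)
#   $3 = sysfs root, defaults to /sys (overridable so the function can be
#        tried against a scratch directory first)
set_ncq_depth() {
    dev=$1
    depth=$2
    sysroot=${3:-/sys}
    node="$sysroot/block/$dev/device/queue_depth"

    # Reject anything that is not a plain number.
    case $depth in
        ''|*[!0-9]*) echo "depth must be a number" >&2; return 2 ;;
    esac
    [ -w "$node" ] || { echo "cannot write $node" >&2; return 1; }

    # Show the old value so it can be restored by hand later.
    echo "previous depth: $(cat "$node")"
    echo "$depth" > "$node"
}

# Example (as root): set_ncq_depth sda 2
```

Re-running hdparm -t before and after the change should show whether the depth is what is hurting sequential reads on a given drive/firmware combination.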
Dell R300 with Silicon Image 3132 PCIe SATA/eSATA card - doesn't work with BIOS v1.4.3
Just a quick BIOS bug report - I have this device working on a Dell R300 with BIOS version 1.2.0, but when I upgrade the R300 BIOS to 1.4.3, it doesn't show up in the PCI device listings. I tried with Silicon Image BIOS versions 7.4.05 and 7.7.02.

07:00.0 RAID bus controller: Silicon Image, Inc. SiI 3132 Serial ATA Raid II Controller (rev 01)
    Subsystem: Silicon Image, Inc. Device 7132
    Flags: bus master, fast devsel, latency 0, IRQ 16
    Memory at dfcfbf80 (64-bit, non-prefetchable) [size=128]
    Memory at dfcfc000 (64-bit, non-prefetchable) [size=16K]
    I/O ports at dc80 [size=128]
    Expansion ROM at dfc0 [disabled] [size=512K]
    Capabilities: [54] Power Management version 2
    Capabilities: [5c] Message Signalled Interrupts: Mask- 64bit+ Queue=0/0 Enable-
    Capabilities: [70] Express Legacy Endpoint, MSI 00
    Capabilities: [100] Advanced Error Reporting ?
    Kernel driver in use: sata_sil24
    Kernel modules: sata_sil24

I haven't tried the version of the same card which Dell sells - http://search.dell.co.uk/1/2/13140-startech-com-2-port-pci-express-esata-controller-adapter-card-storage-controller-2-channel-esata-300-low-profile-pci-express-x1.html but I'd expect the same results (both cards have just the sil3132 AKA sii3132 chip, and an EEPROM with the silicon image BIOS onboard). Incidentally, you can program the EEPROM on this chip under Linux (I used Debian Squeeze) using the Silicon Image BIOS from http://www.siliconimage.com/docs/SiI3132_7702.zip along with flashrom from http://flashrom.org/ Cheers, Tim.
Re: Adding a second Quad Core L5420 CPU to a 1950 III
PJF wrote: I'm top posting on my own post... I added a second quad core, everything is working fine. I noticed the second one is Model 23 Stepping 6; the first one Model 23 Stepping 10. Do these need to match? Openmanage reports everything is okay... I believe that Intel say the steppings should match. Can you swap one of the CPUs out from another box? Not sure if using microcode.ctl would help, but probably worth doing anyway (we run it on all our production boxes). Cheers, Tim.
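A quick way to see which model/stepping combinations a running box actually has is to summarise /proc/cpuinfo. The sketch below is only an illustration (the function name is invented, and the file argument exists purely so it can be pointed at a saved copy of cpuinfo from another machine); it counts logical CPUs per model/stepping pair.

```shell
#!/bin/sh
# Count logical CPUs per (model, stepping) pair.
#   $1 = cpuinfo file, defaults to /proc/cpuinfo
cpu_steppings() {
    # Split on the colon and swallow the whitespace around it, so that
    # "model" and "model name" are seen as different keys.
    awk -F'[ \t]*:[ \t]*' '
        $1 == "model"    { model = $2 }
        $1 == "stepping" { print "model " model " stepping " $2 }
    ' "${1:-/proc/cpuinfo}" | sort | uniq -c
}

# Example: cpu_steppings
```

More than one line of output means mixed steppings, as in the 6 and 10 case described above.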
Re: Adding a second Quad Core L5420 CPU to a 1950 III
PJF wrote: I've been through the dell docs and intel docs, not sure if the stepping matters. Out of curiosity I had another look at this. Intel's line on older processor lines was that the steppings shouldn't differ by more than 1. Intel's 5400 series datasheet says: http://www.intel.com/Assets/en_US/PDF/datasheet/318589.pdf "Not all operating systems can support dual processors with mixed frequencies. Mixing processors of different steppings but the same model (as per CPUID instruction) is supported. Details regarding the CPUID instruction are provided in the AP-485 Intel® Processor Identification and the CPUID Instruction application note." but OTOH, it seems that mixing both steppings of L5420 caused some trouble on Intel S5000 boards before they updated the BIOS... http://downloadmirror.intel.com/18075/eng/release.txt Cheers, Tim.
Re: help configuring a db server
John G. Heim wrote: But I'm confused about disk. I would think disk speed would be fairly important. That entirely depends on your database usage pattern: What is your total dataset size? What is your working set (i.e. data which is commonly/regularly accessed)? Can you reasonably fit your working set into RAM? What is your read/write ratio (i.e. select vs update/insert)? If at all possible, aim to have enough RAM for your working set; if you can't, then disk speed becomes important for read performance. Write latency (time taken to commit to permanent storage) is critical if you are doing a lot of writes - in this case getting a RAID controller with battery-backed cache is a win - otherwise it probably isn't. So... if you have a database (or databases) with a working set of 10G, and a high read:write ratio, then disk performance probably isn't going to be important. Another hint (excuse if you know this stuff already), but you can very readily get large performance improvements by optimising your mysql server config (e.g. using mysqlanalyze, and this munin plugin: http://github.com/kjellm/munin-mysql/downloads ) Tim.
Re: Third-party drives not permitted on Gen 11 servers
Philip Tait wrote: the supplied drives, and installed 4 Barracuda ES.2s. After doing a Clear Configuration in the pre-boot RAID setup utility, I can perform no operation with the drives - they are marked as blocked. Is Dell preventing the use of 3rd-party HDDs now? Thanks for any enlightenment. Hi Philip, I was wondering what the firmware version on the blocked drives is? e.g. using smartctl or hdparm -I on the drives when stuck in a different box? Assuming your drives are SATA rather than SAS, the firmware in a 250G Dell-supplied ES.2 in an R200 which I have here is MA08, whereas some third-party drives in other machines use SNxx series firmware. I believe it is possible to switch from one to the other firmware series. Whilst I think Dell's policy is probably wrong (it should complain loudly rather than disallow), it's possible that there are genuine reasons for this - I spent/wasted most of last week diagnosing what is starting to look like a firmware bug on WD 2TB green power drives on a non-Dell server - interspersing SMART queries with other types of transactions would appear to occasionally cause the drives to lock-up! I wouldn't be surprised if the H700 adaptor firmwares are doing various unusual things to the hard drives, and it's possible that Dell has got nervous about buggy firmware from unqualified drives reflecting badly on their hardware. Some official (or non-official) comment from Dell on the *technical* reasons for this decision would be welcome. Cheers, Tim.
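To compare firmware series across a pile of drives, the firmware string can be pulled out of the smartctl identity output. This is a small sketch only: the helper name is invented, and it assumes SATA drives, where smartctl -i prints a "Firmware Version:" line (SAS drives report a "Revision:" field instead and are not handled here).

```shell
#!/bin/sh
# Extract the firmware string from `smartctl -i` output read on stdin.
firmware_version() {
    awk -F': *' '/^Firmware Version:/ { print $2 }'
}

# Example (as root, drive moved to a box with a plain SATA port):
#   smartctl -i /dev/sda | firmware_version
```

A Dell-supplied drive of the kind described above would be expected to report an MAxx string, and a retail one an SNxx string.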
ipmitool delloem half-baked patch
Alexander Dupuy wrote: You can get power statistics from ipmitool with delloem support, e.g. ipmitool delloem powermonitor powerconsumptionhistory. I realize this tends to fall in the category of a vendor-proprietary tool WTF! Why isn't this merged upstream? It's not like there isn't already a precedent, what with there already being other vendor oem commands etc. Oh, I see... http://sourceforge.net/mailarchive/message.php?msg_id=4A4DCEA9.9070003%40cern.ch The patch has been half baked, thrown over the wall, and then abandoned, by the look of it. Great. Writing code like this seems to be a complete waste of time to me - it would be cheaper and easier for Dell to just release the specs, and let someone else implement the functionality and get it merged upstream - creating this sort of half-finished work just discourages other people from creating their own code. So, does Dell have any plans to do any more work on this code, or is it abandoned? That would be a shame, since if you want this functionality, you have to: 1. Know about it. 2. Find the patch. 3. Apply, build, maintain (repeat). But Mr Dent, the plans have been available in the local planning office for the last nine months. Oh yes, well as soon as I heard I went straight round to see them, yesterday afternoon. You hadn't exactly gone out of your way to call attention to them, had you? I mean, like actually telling anybody or anything. But the plans were on display ... On display? I eventually had to go down to the cellar to find them. That's the display department. With a flashlight. Ah, well the lights had probably gone. So had the stairs. But look, you found the notice didn't you? Yes, said Arthur, yes I did. It was on display in the bottom of a locked filing cabinet stuck in a disused lavatory with a sign on the door saying 'Beware of the Leopard'. Tim.
Re: R210 / L3426 power consumption figures - 40w idle!
Matt Domsch wrote: The Dell Advanced Power Control feature is enabled by default in BIOS SETUP, which prevents the OS from managing it. If you prefer to have the OS do it, disable this feature in BIOS. Thanks for that information Matt - must have missed that in the BIOS (all the other systems which I've configured have defaulted to OS Control). On reflection, I suppose I'm a bit surprised that this must be set in the BIOS - I'd have thought that if the OS tries to take control of frequency scaling, the BIOS should automatically relinquish control at that point. I also think that unless you take the trouble to read the manual, it's not really obvious from the string Active Power Control, that this means BIOS control only - no OS control. Cheers, Tim.
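One way to tell from inside Linux whether the OS has been handed control of frequency scaling is to look for a bound cpufreq driver in sysfs. The following is a rough sketch under that assumption; the function name and the optional sysfs-root argument are invented for illustration, and only cpu0 is inspected.

```shell
#!/bin/sh
# Report the cpufreq driver and governor for cpu0, if any.
#   $1 = sysfs root, defaults to /sys
cpufreq_status() {
    sysroot=${1:-/sys}
    dir="$sysroot/devices/system/cpu/cpu0/cpufreq"
    if [ -r "$dir/scaling_driver" ]; then
        echo "driver=$(cat "$dir/scaling_driver") governor=$(cat "$dir/scaling_governor")"
    else
        # Typically what you see when the BIOS keeps power management to itself
        # or the cpufreq module failed to load.
        echo "no cpufreq driver bound"
        return 1
    fi
}

# Example: cpufreq_status
```

If the BIOS option is left at its default, the "no cpufreq driver bound" branch is the likely result; after switching to OS control the driver and governor should appear.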
Re: PowerEdge 1950 ECC question...
It seems probable that the memory fault is causing the BIOS to crash before it gets a chance to enable ECC - thus no errors are logged. It could also be a bus-loading issue with the FB-DIMMs (such that no memory can be issued - maybe a faulty AMB chip on one of the sticks). Try the system with half the original RAM in at a time - you could also try moving each stick up by four slots (and wrapping round). Tim. Henrik Schmiediche wrote: Did that. The system is up using new RAM and OMSA and related utilities are running. There are no memory related entries in the ESM log. Old ram freezes system, but no error of any kind is generated in ESM, memtest, mpmemory, dell diags.
R210 / L3426 power consumption figures - 40w idle!
Hi, I've had an R210 recently to set up for a client and thought I'd share the power consumption figures that I recorded with the list. Basically, very impressive results IMO, when you consider the last R200 (X3220) which I set up used 135-odd watts idle...

Disks: 2x 250G Seagate ES.3 SATA
RAM: 8G in 4 sticks (not sure if it was 1066, or 1333 DDR3)
CPU: L3426
Kernel: Debian 2.6.30-bpo.2-amd64
Supply voltage: 240
Ambient Temp: 20 degrees

Idle in Linux (no CPU freq scaling, HDs asleep): 0.19 amps / 33 watts
Idle in Linux (no CPU freq scaling): 0.2 amps / 40 watts
With 100% CPU usage (on all 4 cores): 0.38 amps / 82 watts
With 100% CPU usage (on all 4 cores), whilst doing a sequential read from both hard disks with hdparm -t: 0.38 amps / 81 watts
With 100% CPU usage (on all 4 cores), and having issued an ipmitool mc reset cold command (to get the fans to full power), whilst doing a sequential read from both hard disks with hdparm -t: 0.46 amps / 114 watts
With 100% CPU usage (on all 4 cores), and having issued an ipmitool mc reset cold command, whilst spinning-up both hard disks having previously put them to sleep with hdparm -Y: 0.56 amps / 131 watts

I could probably get the CPU/mem power usage up a little bit higher, as my CPU loading was a bit simplistic (burnBX + memtester). The CPU frequency scaling module doesn't seem to load. This is either because Dell's BIOS doesn't support it, or Dell's BIOS cpu frequency interface doesn't work with the Debian Lenny kernel. More to follow... Cheers, Tim.
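The amps and watts figures above do not multiply out to exactly 240 V x amps, which is what you would expect if the meter reports real power (watts) alongside RMS current. Assuming that is the case (it is an assumption, the post does not say how the meter works), the ratio of the two gives the power factor of the supply at each load point. A small worked sketch, with an invented helper name:

```shell
#!/bin/sh
# Print apparent power (VA) and power factor from volts, amps and watts.
#   $1 = supply voltage, $2 = measured current in amps, $3 = measured watts
power_factor() {
    awk -v v="$1" -v a="$2" -v w="$3" \
        'BEGIN { printf "%.0f VA, PF %.2f\n", v * a, w / (v * a) }'
}

# Example, using the "Idle in Linux (no CPU freq scaling)" reading:
#   power_factor 240 0.2 40
```

For the idle reading this works out to 48 VA against 40 W, i.e. a power factor of roughly 0.83; the higher-load readings come out closer to 1.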
Re: PowerEdge 1950 ECC question...
Henrik Schmiediche wrote: My original question (turning off ECC for memory testing) was hopefully going to narrow down the issue. It's probably possible to disable ECC after boot time using setpci (I've used it to read and write the ECC status registers on Intel chipsets in the past), but I don't know the details of the i5000 ECC implementation (I don't even know if the ECC functionality is still controlled via PCI Configuration space) - you'll have to check the datasheet (or the i5000_edac driver source). ISTR, memtest86 and memtest86+ both had some functionality for reading/writing ECC status bits on some chipsets as well, so you could hack on these too (but the code was a bit messed up last time I looked - I think ECC no, and ECC NO had different meanings!)... Tim.
Will DOS floppy firmware updates please leave the building...
Ian Forde wrote: Heck - even at 10 servers, PXE installations should be the norm. So to be told that one has to use a DOS floppy is a little... well... grating... I've had good luck in the past using memdisk from syslinux to load a floppy, CD, or USB stick image into RAM as part of a PXE boot - and then using IPMI serial redirection to carry out the update (except you can't see *graphical* DOS firmware updaters (WTF?!!!) but even those seem to have unattended operation options). Qemu v11 + etherboot (I use the pcnet PXE rom image that ships with recent Debian) is a very good tool for prototyping / checking images. Still a crappy bodge, but a bit less so... To give Dell their due, a lot of their problems are with their suppliers, and at least this: http://www.ducea.com/2007/08/27/dell-bios-firmware-updates-on-debian/ ... is a lot easier on Dell systems than most others (yeah there is http://packages.debian.org/sid/flashrom for other systems which I've had 100% success with, but if it breaks you're going to need to start pulling chips). Tim.
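As an illustration of the memdisk approach described above, this sketch prints a pxelinux configuration entry that boots a floppy image from RAM. The helper name, the label and the image filename are placeholders, and the SERIAL line is an assumption (COM2 at 57600, chosen only so that the boot menu would be visible over serial redirection); adjust or drop it to suit the BMC setup.

```shell
#!/bin/sh
# Print a pxelinux config that boots a floppy image via memdisk.
#   $1 = image filename relative to the TFTP root (placeholder, e.g. bios.img)
memdisk_entry() {
    image=$1
    cat <<EOF
SERIAL 1 57600
DEFAULT fwupdate
LABEL fwupdate
  KERNEL memdisk
  APPEND initrd=$image
EOF
}

# Example: memdisk_entry bios.img > /srv/tftp/pxelinux.cfg/01-aa-bb-cc-dd-ee-ff
# (the output path is hypothetical; pxelinux looks up per-MAC files like this)
```

The memdisk binary itself has to be copied from the syslinux package into the TFTP root next to the image.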
Re: PXE boot R900 via 10Gb Intel NIC?
Jefferson Ogata wrote: Maybe there's some way to do it with ethtool -E, but I don't have any way to find out what it is. The Intel datasheet? May not help, tho, I suppose. Any other ideas? Unload the linux driver, and use a recent qemu/kvm to passthrough the PCI device, and then boot the DOS floppy image within qemu/kvm? Cheers, Tim.
Re: massive io problems
John Hodrien wrote: What about block device activity (-d option)? Looks like sar isn't configured for this at the minute, I'll see if I can sort that out. I'm not sure how frequently sar collects data, but I think you'll probably want something to collect it at 1-second or less granularity - e.g. vmstat 1, or (probably better) dstat. dstat also has various plugins which you may want to investigate - including a good NFS plugin. BTW, are you using noatime or relatime? Thanks, Tim.
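If neither dstat nor a suitably configured sar is to hand, 1-second block device figures can be had directly from /proc/diskstats. This is a bare-bones sketch rather than a replacement for those tools; the function names are invented, and it relies on the documented field layout where column 3 is the device name and columns 6 and 10 are sectors read and written (512-byte units).

```shell
#!/bin/sh
# Print "sectors_read sectors_written" for one device.
#   $1 = device name (e.g. sda), $2 = diskstats file (defaults to /proc/diskstats)
disk_sectors() {
    awk -v d="$1" '$3 == d { print $6, $10 }' "${2:-/proc/diskstats}"
}

# Sample once per second and print the per-second throughput in KiB.
watch_disk() {
    dev=$1
    prev=$(disk_sectors "$dev")
    [ -n "$prev" ] || { echo "no such device: $dev" >&2; return 1; }
    while sleep 1; do
        cur=$(disk_sectors "$dev")
        # Unquoted on purpose: split the two snapshots into $1..$4.
        set -- $prev $cur
        echo "$dev read_KiB/s=$(( ($3 - $1) / 2 )) write_KiB/s=$(( ($4 - $2) / 2 ))"
        prev=$cur
    done
}

# Example: watch_disk sda   (stop with Ctrl-C)
```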
PE1950 IPMI BMC appears to deliver junk to the serial port
Hello, All of our PE1950s (on two different sites) have problems with SoL... They have been set up for IPMI remote access and SoL using ipmitool. When SoL sessions are not active the BMC outputs junk characters from its serial connection (and thus into the OS-visible serial port UART) - this behaviour is always reproducible, and can be triggered by sending a few hundred bytes of characters from the OS to the BMC (e.g. OS boot messages etc.). Some relevant snippets from my notes on the issue: I am 99.9% sure that this is a firmware bug on the BMC, and not an OS or application software bug, since it also shows up prior to OS boot. On Dell PowerEdge 1950s (BMC firmware version 2.37) - it has been observed on a number of different machines that: When IPMI SoL sessions are enabled, but NOT active, spurious characters are received by the serial UART from the BMC (on Linux device /dev/ttyS1). The problem also exists outside of Linux - these spurious characters have (on several occasions) interrupted the boot process - by sending character sequences which interrupt the normal automatic boot process of the BIOS and/or boot loader - as such IPMI SoL must be disabled on these systems for reliable operation - this leaves the systems in-question without a viable remote-access system for BIOS/boot/OS interventions etc. BMC settings are as follows:

arundel:~# ipmitool sol info 1
Set in progress                 : set-complete
Enabled                         : true
Force Encryption                : true
Force Authentication            : false
Privilege Level                 : ADMINISTRATOR
Character Accumulate Level (ms) : 50
Character Send Threshold        : 220
Retry Count                     : 7
Retry Interval (ms)             : 1000
Volatile Bit Rate (kbps)        : 57.6
Non-Volatile Bit Rate (kbps)    : 57.6
Payload Channel                 : 1 (0x01)
Payload Port                    : 623

All baud rates are set to 57.6k / 8bit / no parity in Linux (Linux kernel and 'getty' processes). BTW, I administer Intel and Tyan IPMI v2.0 machines using identical software and the same IPMI SoL settings - without seeing these problems.
I can arrange to supply a hex-dump of the received junk characters if that's useful. I'm also happy to execute arbitrary IPMI commands etc. etc. Thanks, Tim.
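For reference, a hex-dump like the one offered above could be captured along these lines. This is a sketch only: the helper names are invented, it assumes GNU stty and timeout are available, and the device and line settings simply mirror the /dev/ttyS1 at 57.6k / 8bit / no parity setup described in the report. Any getty on the port would need to be stopped first so that it does not swallow the bytes.

```shell
#!/bin/sh
# Hex-dump whatever arrives on stdin, up to 16 bytes per row, no offsets.
hexdump_bytes() {
    od -An -tx1 -v
}

# Capture raw bytes from a serial device for a fixed time and hex-dump them.
#   $1 = device (e.g. /dev/ttyS1), $2 = capture time in seconds
capture_serial() {
    dev=$1
    secs=$2
    # 57600 baud, 8 data bits, no parity, no line-discipline processing.
    stty -F "$dev" 57600 cs8 -parenb raw -echo || return 1
    timeout "$secs" cat "$dev" | hexdump_bytes
}

# Example (as root, SoL enabled but no session active):
#   capture_serial /dev/ttyS1 60 > junk.hex
```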
Re: [smartmontools-support] SMART causes disks to go offline on an LSI SAS1068 controller - Dell SAS 5/iR
Hello, Just to say that I'm seeing this bug as well, with smartmontools 5.38 and smartctl 5.39 2009-10-10 r2955 on Debian lenny. The machine is a Dell PowerEdge 860. I'm guessing that this is either a firmware or driver issue. 02:08.0 SCSI storage controller: LSI Logic / Symbios Logic SAS1068 PCI-X Fusion-MPT SAS (rev 01) Subsystem: Dell SAS 5/iR Adapter RAID Controller Flags: bus master, 66MHz, medium devsel, latency 72, IRQ 1275 I/O ports at ec00 [disabled] [size=256] Memory at fe9fc000 (64-bit, non-prefetchable) [size=16K] Memory at fe9e (64-bit, non-prefetchable) [size=64K] Expansion ROM at fea0 [disabled] [size=1M] Capabilities: [50] Power Management version 2 Capabilities: [98] Message Signalled Interrupts: Mask- 64bit+ Queue=0/0 Enable+ Capabilities: [68] PCI-X non-bridge device Capabilities: [b0] MSI-X: Enable- Mask- TabSize=1 Kernel driver in use: mptsas Kernel modules: mptsas # modinfo mptsas filename: /lib/modules/2.6.26-2-openvz-amd64/kernel/drivers/message/fusion/mptsas.ko version:3.04.06 license:GPL description:Fusion MPT SAS Host driver author: LSI Corporation The errors look like this: 428.524463] mptscsih: ioc0: attempting task abort! (sc=81021b950940) 428.524471] sd 0:0:0:0: [sda] CDB: ATA command pass through(16): 85 08 0e 00 d5 00 01 00 09 00 4f 00 c2 00 b0 00 433.199851] mptbase: ioc0: LogInfo(0x3114): Originator={PL}, Code={IO Executed}, SubCode(0x) 433.199851] mptsas: ioc0: removing sata device, channel 0, id 0, phy 0 433.199851] port-0:0: mptsas: ioc0: delete port (0) 433.199851] sd 0:0:0:0: [sda] Synchronizing SCSI cache 433.348856] mptscsih: ioc0: task abort: SUCCESS (sc=81021b950940) 433.348868] mptscsih: ioc0: attempting task abort! (sc=81021b950440) 433.348873] sd 0:0:0:0: [sda] CDB: Synchronize Cache(10): 35 00 00 00 00 00 00 00 00 00 433.348885] mptscsih: ioc0: task abort: SUCCESS (sc=81021b950440) 433.348893] mptscsih: ioc0: attempting target reset! 
(sc=81021b950940) 433.348896] sd 0:0:0:0: [sda] CDB: ATA command pass through(16): 85 08 0e 00 d5 00 01 00 09 00 4f 00 c2 00 b0 00 433.605026] mptscsih: ioc0: target reset: SUCCESS (sc=81021b950940) 433.605034] mptscsih: ioc0: attempting bus reset! (sc=81021b950940) 433.605037] sd 0:0:0:0: [sda] CDB: ATA command pass through(16): 85 08 0e 00 d5 00 01 00 09 00 4f 00 c2 00 b0 00 434.157594] mptscsih: ioc0: bus reset: SUCCESS (sc=81021b950940) 444.546154] mptscsih: ioc0: attempting host reset! (sc=81021b950940) 444.546162] mptbase: ioc0: Initiating recovery 461.540429] mptscsih: ioc0: host reset: SUCCESS (sc=81021b950940) 461.540437] sd 0:0:0:0: Device offlined - not ready after error recovery 461.540440] sd 0:0:0:0: Device offlined - not ready after error recovery 461.540475] end_request: I/O error, dev sda, sector 15631039 461.540480] md: super_written gets error=-5, uptodate=0 461.540485] raid1: Disk failure on sda1, disabling device. and the drives are: Model Family: Seagate Barracuda ES Device Model: ST3250620NS Serial Number:9QE3L9E0 Firmware Version: 3BKS and are in JBOD mode (+ sw RAID with md). lsiutil says: Current active firmware version is 0.10.51 Firmware image's version is MPTFW-00.10.51.00-IE LSI Logic x86 BIOS image's version is MPTBIOS-6.12.05.00 (2007.09.29) ... which is the latest on Dell's download pages for this server. The kernel is 2.6.26-2-openvz-amd64 from Debian Lenny (same behaviour with non-openvz kernel). Running smartd makes the drives disappear after a few hours, but doing this: while true ; do smartctl -T permissive -d sat -a /dev/sda > /dev/null ; echo -n . ; done seems to knock them out in about a minute. Subjectively, 5.38 seemed to upset the controller a lot quicker than 5.39 r2955 does. For good measure I'm currently stress-testing a PE1950 with a SAS 6/iR (SAS1068E) in the same way (however this is using RAID setup through the BIOS). smartctl 5.39-pre needs '-T permissive' on the PE860, but 5.38 doesn't seem to require it. 
Is it worth trying a newer mptsas driver? Regards, Tim.
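When comparing how quickly different smartctl versions or drivers trip the controller, a bounded variant of the reproduction loop that reports how many queries got through may be handy. The helper below is a sketch with an invented name. One caveat to keep in mind: smartctl's exit status is a bitmask, so a non-zero exit does not necessarily mean the drive has dropped off the bus; check dmesg before drawing conclusions.

```shell
#!/bin/sh
# Run a command repeatedly until it fails or a cap is reached; report the count.
#   $1 = maximum number of runs, remaining arguments = command to run
repeat_until_fail() {
    max=$1
    shift
    n=0
    while [ "$n" -lt "$max" ]; do
        "$@" > /dev/null 2>&1 || { echo "failed after $n successful runs"; return 1; }
        n=$((n + 1))
    done
    echo "completed $n runs without failure"
}

# Example (as root, on a box you can afford to have drop a disk):
#   repeat_until_fail 1000 smartctl -T permissive -d sat -a /dev/sda
```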