Re: [CentOS] raid 5 install
> On 2019-07-01 10:01, Warren Young wrote:
>> On Jul 1, 2019, at 8:26 AM, Valeri Galtsev wrote:
>>>
>>> RAID function, which boils down to a simple, short, easy-to-debug
>>> program.

I didn't intend to start a software vs hardware RAID flame war when I
joined somebody else's opinion.

Now, commenting with all due respect to the famous person who Warren Young
definitely is.

>> RAID firmware will be harder to debug than Linux software RAID, if only
>> because of easier-to-use tools.

I myself debug neither firmware (or "microcode", speaking the language as
it was some 30 years ago), nor the Linux kernel. In both cases it is
someone else who does the debugging.

You are speaking as a person who routinely debugs Linux components. I
still have to stress that in debugging RAID card firmware, one has only
the small program which this firmware is.

In the case of debugging EVERYTHING that affects the reliability of
software RAID, one has to debug the following:

1. The Linux kernel itself, which is huge;

2. _all_ the drivers that are loaded when the system runs. Some of the
drivers on one's system may be binary only, like NVIDIA video card
drivers. So, even for those who, like Warren, can debug all code, these
still are not accessible.

All of the above can potentially panic the kernel (as they all run in
kernel context), so they all affect the reliability of software RAID, not
only the chunk of software doing the software RAID function.

>> Furthermore, MD RAID only had to be debugged once, rather than once per
>> company-and-product line as with hardware RAID.

Alas, MD RAID itself is not the only thing that affects the reliability of
software RAID. A panicking kernel has grave effects on software RAID, so
anything that can panic the kernel has to be debugged just as thoroughly.
And it always has to be redone once changes to the kernel or drivers are
introduced.

>> I hope you're not assuming that hardware RAID has no bugs. It's
>> basically a dedicated CPU running dedicated software that's difficult
>> to upgrade.

That's true, it is a dedicated CPU running a dedicated program, and it
keeps doing it even if the operating system crashed. Yes, hardware itself
can be unreliable, but in the case of a RAID card it is only the card
itself, the failure rate of which in my racks is much smaller than the
overall failure rate of everything else. In the case of a kernel panic,
any piece of hardware inside the computer, in some mode of failure, can
cause it.

One more thing: apart from the hardware RAID "firmware" program being
small and logically simple, there is one more factor: it usually runs on a
RISC-architecture CPU, and introducing bugs when programming for a RISC
architecture is, IMHO, more difficult than when programming for the i386
and amd64 architectures. Just my humble opinion, which I have carried
since the time I was programming.

>>> if kernel (big and buggy code) is panicked, current RAID operation
>>> will never be finished which leaves the mess.
>>
>> When was the last time you had a kernel panic? And of those times, when
>> was the last time it happened because of something other than a
>> hardware or driver fault? If it wasn't for all this hardware doing
>> strange things, the kernel would be a lot more stable. :)

Yes, I half expected that. When did we last have a kernel crash, and which
of us is unable to choose reliable hardware, and unable to insist that our
institution pays a mere 5-10% higher price for a reliable box than they
would for junk hardware? Indeed, we all run reliable boxes, and I am
retiring still reliably working machines of age 10-13 years...

However, I would rather suggest comparing not absolute probabilities,
which, exactly as you said, are infinitesimal, but relative probabilities.
There I still will go with hardware RAID.

>> You seem to be saying that hardware RAID can't lose data. You're
>> ignoring the RAID 5 write hole:
>>
>> https://en.wikipedia.org/wiki/RAID#WRITE-HOLE

Neither of our RAID cards runs without battery backup.

>> If you then bring up battery backups, now you're adding cost to the
>> system. And then some ~3-5 years later, downtime to swap the battery,
>> and more downtime. And all of that just to work around the RAID write
>> hole.

You are absolutely right about a system with hardware RAID being more
expensive than one with software RAID. I would say, for "small scale big
storage" boxes (i.e. NOT distributed file systems), hardware RAID adds
about 5-7% of cost in our case. Now, with hardware RAID, all maintenance
(what one needs to do for the routine replacement of a single failed
drive) takes about 1/10 of the time necessary to deal with a similar
failure in the case of software RAID. I deal with both, as it historically
happened, so this is my own observation. Maybe the software RAID boxes I
have to deal with are too messy (imagine almost two dozen software RAIDs
of 12-16 drives each on one machine; even the bios runs out of number
Re: [CentOS] raid 5 install
On Jul 1, 2019, at 10:10 AM, Valeri Galtsev wrote:
>
> On 2019-07-01 10:01, Warren Young wrote:
>> On Jul 1, 2019, at 8:26 AM, Valeri Galtsev wrote:
>>>
>>> RAID function, which boils down to a simple, short, easy-to-debug
>>> program.
>
> I didn't intend to start software vs hardware RAID flame war

Where is this flame war you speak of? I'm over here having a reasonable
discussion. I'll continue being reasonable, if that's all right with
you. :)

> Now, commenting with all due respect to famous person who Warren Young
> definitely is.

Since when? I'm not even Internet Famous.

>> RAID firmware will be harder to debug than Linux software RAID, if only
>> because of easier-to-use tools.
>
> I myself debug neither firmware (or "microcode", speaking the language
> as it was some 30 years ago)

There is a big distinction between those two terms; they are not
equivalent terms from different points in history. I had a big digression
explaining the difference, but I've cut it as entirely off-topic. It
suffices to say that with hardware RAID, you're almost certainly talking
about firmware, not microcode, not just today, but also 30 years ago.
Microcode is a much lower-level thing than what happens at the
user-facing product level of RAID controllers.

> In both cases it is someone else who does the debugging.

If it takes three times as much developer time to debug a RAID card's
firmware as it does to debug Linux MD RAID, and the latter has to be
debugged only once instead of multiple times as the hardware RAID firmware
is reinvented again and again, which one do you suppose ends up with more
bugs?

> You are speaking as the person who routinely debugs Linux components.

I have enough work fixing my own bugs that I rarely find time to fix
others' bugs. But yes, it does happen once in a while.

> 1. Linux kernel itself, which is huge;

…under which your hardware RAID card's driver runs, making it even more
huge than it was before that driver was added.

You can't zero out the Linux kernel code base size when talking about
hardware RAID. It's not like the card sits there and runs in a purely
isolated environment.

It is a testament to how well-debugged the Linux kernel is that your
hardware RAID card runs so well!

> All of the above can potentially panic kernel (as they all run in
> kernel context), so they all affect reliability of software RAID, not
> only the chunk of software doing software RAID function.

When the kernel panics, what do you suppose happens to the hardware RAID
card? Does it keep doing useful work, and if so, for how long?

What's more likely these days: a kernel panic or an unwanted hardware
restart? And when that happens, which is more likely to fail, a hardware
RAID without BBU/NV storage or a software RAID designed to be
always-consistent?

I'm stripping away your hardware RAID's advantage in NV storage to keep
things equal in cost: my on-board SATA ports for your stripped-down
hardware RAID card. You probably still paid more, but I'll give you that,
since you're using non-commodity hardware.

Now that they're on even footing, which one is more reliable?

> hardware RAID "firmware" program being small and logically simple

You've made an unwarranted assumption. I just did a blind web search and
found this page:

https://www.broadcom.com/products/storage/raid-controllers/megaraid-sas-9361-8i#downloads

…on which we find that the RAID firmware for the card is 4.1 MB,
compressed. Now, that's considered a small file these days, but realize
that there are no 1024 px² icon files in there, no massive XML libraries,
no language internationalization files, no high-level language runtimes…
It's just millions of low-level, highly-optimized CPU instructions.

From experience, I'd expect it to take something like 5-10 person-years
to reproduce that much code. That's far from being "small and logically
simple."

> it usually runs on RISC architecture CPU, and introduce bugs
> programming for RISC architecture IMHO is more difficult that when
> programming for i386 and amd64 architectures.

I don't think I've seen any such study, and if I did, I'd expect it to
only be talking about assembly language programming. Above that level,
you're talking about high-level language compilers, and I don't think the
underlying CPU architecture has anything to do with the error rates in
programs written in high-level languages.

I'd expect RAID firmware to be written in C, not assembly language, which
means the CPU has little or nothing to do with programmer error rates.

Thought experiment: does Linux have fewer bugs on ARM than on x86_64?

I even doubt that you can dig up a study showing that assembly language
programming on CISC is significantly more error-prone than RISC
programming in the first place. My experience says that error rates in
programs are largely a function of the number of lines of code, and that
puts RISC at a severe disadvantage
Re: [CentOS] raid 5 install
>> You seem to be saying that hardware RAID can't lose data. You're
>> ignoring the RAID 5 write hole:
>>
>> https://en.wikipedia.org/wiki/RAID#WRITE-HOLE
>>
>> If you then bring up battery backups, now you're adding cost to the
>> system. And then some ~3-5 years later, downtime to swap the battery,
>> and more downtime. And all of that just to work around the RAID write
>> hole.

Yes. Furthermore, with the huge-capacity disks in use today, rebuilding a
RAID 5 array after a disk fails, with all the necessary parity
calculations, can take days.

RAID 5 is obsolete, and I'm not the only one saying it. Needless to say,
both hardware and software RAID have the problem above.

Simon

___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos
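The days-long rebuild claim is easy to sanity-check with back-of-the-envelope arithmetic. A minimal sketch; the capacity and throughput figures are hypothetical, not from the thread:

```shell
# Rough floor on RAID rebuild time: every byte of one member disk must be
# rewritten, gated by sustained sequential throughput. Parity math and
# competing production I/O only push the real number higher.
capacity_gb=10000   # hypothetical 10 TB member disk
rate_mbs=150        # hypothetical sustained rebuild rate, MB/s
seconds=$(( capacity_gb * 1000 / rate_mbs ))
hours=$(( seconds / 3600 ))
echo "best-case rebuild: ~${hours} hours"
```

Even this idealized floor is most of a day per disk; with the array still serving load, multi-day rebuilds follow directly.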
Re: [CentOS] raid 5 install
> On Mon, 1 Jul 2019, Warren Young wrote:
>
>> If you then bring up battery backups, now you're adding cost to the
>> system. And then some ~3-5 years later, downtime to swap the battery,
>> and more downtime. And all of that just to work around the RAID write
>> hole.
>
> Although batteries have disappeared in favour of NV storage +
> capacitors, meaning you don't have to replace anything on those models.

That's what you think before you have to replace the capacitors module :-)

Simon
Re: [CentOS] raid 5 install
On Jul 1, 2019, at 9:10 AM, mark wrote:
>
> ZFS with a zpoolZ2

You mean raidz2.

> which we set up using the LSI card set to JBOD

Some LSI cards require a complete firmware re-flash to get them into "IT
mode", which completely does away with the RAID logic and turns them into
dumb SATA controllers. Consequently, you usually do this on the
lowest-end models, since there's no point paying for the expensive RAID
features on the higher-end cards when you do this.

I point this out because there's another path, which is to put each disk
into a single-target "JBOD", which is less efficient, since it means each
disk is addressed indirectly via the RAID chipset, rather than as just a
plain SATA disk.

You took the first path, I hope?

We gave up on IT-mode LSI cards when motherboards with two SFF-8087
connectors became readily available, giving easy 8-drive arrays. No need
for the extra board any more.

> took about 3 days and 8 hours for backing up a large project, while the
> same o/s, but with xfs on an LSI-hardware RAID 6, took about 10 hours
> less. Hardware RAID is faster.

I doubt the speed difference is due to hardware vs software. The real
difference you tested there is ZFS vs XFS, and you should absolutely
expect to pay some performance cost with ZFS. You're getting a lot of
features in trade.

I wouldn't expect the difference to be quite that wide, by the way. That
brings me back to my guess about IT mode vs RAID JBOD mode on your card.

Anyway, one of those compensating benefits is snapshot-based backups.
Before starting the first backup, set a ZFS snapshot. Do the backup with
a "zfs send" of the snapshot, rather than whatever file-level backup tool
you were using before. When that completes, create another snapshot and
send *that* snapshot. This will complete much faster, because ZFS uses
the two snapshots to compute the set of changed blocks between them and
sends only the changed blocks.

This is a sub-file level backup, so that if a 1 kB header changes in a
2 GB data file, you send only one block's worth of data to the backup
server, since you'll be using a block size bigger than 1 kB, and that
header, being a *header*, won't straddle two blocks. This is excellent
for filesystems with large files that change in small areas, like
databases.

You might say, "I can do that with rsync already," but with rsync, you
have to compute this delta on each backup, which means reading all of the
blocks on *both* sides of the backup. ZFS snapshots keep that information
continuously as the filesystem runs, so there is nothing to compute at
the beginning of the backup.

rsync's delta compression primarily saves time only when the link between
the two machines is much slower than the disks on either side, so that
the delta computation overhead gets swamped by the bottleneck's delays.
With ZFS, the inter-snapshot delta computation is so fast that you can
use it even when you've got two servers sitting side by side with a
high-bandwidth link between them.

Once you've got a scheme like this rolling, you can do backups very
quickly, possibly even sub-minute.

And you don't have to script all of this yourself. There are numerous
pre-built tools to automate this. We've been happy users of Sanoid, which
does both the automatic snapshot and automatic replication parts:

https://github.com/jimsalterjrs/sanoid

Another nice thing about snapshot-based backups is that they're always
consistent: just as you can reboot a ZFS-based system at any time and
have it reboot into a consistent state, you can take a snapshot and send
it to another machine, and it will be just as consistent.

Contrast something like rsync, which is making its decisions about what
to send on a per-file basis, so that it simply cannot be consistent
unless you stop all of the apps that can write to the data store you're
backing up.

Snapshot-based backups can occur while the system is under a heavy
workload. A ZFS snapshot is nearly free to create, and once set, it
freezes the data blocks in a consistent state. This benefit falls out
nearly for free with a copy-on-write filesystem.

Now that you're doing snapshot-based backups, you're immune to crypto
malware, as long as you keep your snapshots long enough to cover your
maximum detection window. Someone just encrypted all your stuff? Fine,
roll it back. You don't even have to go to the backup server.

> when one fails, "identify" rarely works, which means use smartctl
> or MegaCli64 (or the lsi script) to find the s/n of the drive, then
> guess…

It's really nice when you get a disk status report and the missing disk
is clear from the labels:

  left-1:  OK
  left-2:  OK
  left-4:  OK
  right-1: OK
  right-2: OK
  right-3: OK
  right-4: OK

Hmmm, which disk died, I wonder? Gotta be left-3!

No need to guess; the system just told you in human terms, rather than in
abstract hardware terms.
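The snapshot-and-send scheme described above boils down to a handful of commands. A minimal dry-run sketch: the dataset, snapshot, and host names are hypothetical, and the function only prints the command pipeline rather than executing it:

```shell
# Build the command pipeline for one incremental ZFS replication step.
# Emits the commands instead of running them, since this is a sketch.
plan_zfs_backup() {
    src=$1; prev=$2; cur=$3; dsthost=$4; dstds=$5
    echo "zfs snapshot $src@$cur"
    echo "zfs send -i $src@$prev $src@$cur | ssh $dsthost zfs receive $dstds"
}

plan_zfs_backup tank/projects daily-0701 daily-0702 backuphost pool/projects
```

The `-i` flag is what makes the send incremental: only blocks changed between the two snapshots cross the wire. Tools like Sanoid/Syncoid automate exactly this loop, plus snapshot retention.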
Re: [CentOS] HPE ProLiant - support Linux Vendor Firmware Service ?
On Mon, Jul 1, 2019 at 4:37 PM Jonathan Billings wrote:
> I never was able to find a bootable FreeDOS image that could run it
> from a USB boot disk.

https://lists.centos.org/pipermail/centos/2013-May/134512.html
Re: [CentOS] HPE ProLiant - support Linux Vendor Firmware Service ?
On Jul 1, 2019, at 16:47, lejeczek via CentOS wrote:
> So far it looks like not many people here, if any at all, use
> fwupd/LVFS, which is a bit surprising to me since this is what Red Hat
> promotes as a solution universally supported by increasingly more
> hardware vendors.
> I do upgrade the UEFI/BIOS on my Dell Latitude with fwupd, and have for
> the last couple of years, and it works beautifully, though my other
> Lenovo e485 is missing from fwupd.

If you check out the device list:

https://fwupd.org/lvfs/devicelist

you can see that not a lot of older hardware is supported. Dell seems to
do a better job than others, although I'm glad to see Lenovo has most of
their recent ThinkPads there now.

— Jonathan Billings
Re: [CentOS] HPE ProLiant - support Linux Vendor Firmware Service ?
On 01/07/2019 17:42, lejeczek via CentOS wrote:
> hi guys
>
> does anybody here run on HPE ProLiant?
> I was hoping you could tell whether HPE supports Linux Vendor
> Firmware Service and you actually get to upgrade ProLiants'
> BIOS/firmware via fwupdmgr?
>
> many thanks, L.

So far it looks like not many people here, if any at all, use fwupd/LVFS,
which is a bit surprising to me since this is what Red Hat promotes as a
solution universally supported by increasingly more hardware vendors.

I do upgrade the UEFI/BIOS on my Dell Latitude with fwupd, and have for
the last couple of years, and it works beautifully, though my other
Lenovo e485 is missing from fwupd.

regards, L.
Re: [CentOS] HPE ProLiant - support Linux Vendor Firmware Service ?
On Mon, Jul 01, 2019 at 12:58:12PM -0600, Frank Cox wrote:
> So what's wrong with using DOS to update firmware? DOS is a small
> and simple program loader that's unlikely to require much in the way
> of hardware to work and is unlikely to be infected by a virus in
> today's world.

Honestly, I've tried to use a .EXE to update the BIOS on my personal
system, and I never was able to find a bootable FreeDOS image that could
run it from a USB boot disk. Who has floppy disks anymore? I don't even
have a CD-ROM drive. I never ran DOS, so I honestly have no clue what I'm
doing with it.

Fortunately, newer hardware let me drop the executable in the EFI volume
for updates.

> Would you rather have to boot a multi-gigabyte image of
> who-knows-what that does ghawd-knows-what for what should be a simple
> task?

It's not a simple task. Do it wrong, and you've bricked your system.

--
Jonathan Billings
Re: [CentOS] HPE ProLiant - support Linux Vendor Firmware Service ?
On 2019-07-01 14:15, mark wrote:
> Frank Cox wrote:
>> On Mon, 1 Jul 2019 19:38:29 +0100 lejeczek via CentOS wrote:
>>
>>> I also a few years ago got Dell's tech support telling me to do
>>> MS-DOS stuff in order to update BIOS.
>>
>> So what's wrong with using DOS to update firmware? DOS is a small and
>> simple program loader that's unlikely to require much in the way of
>> hardware to work and is unlikely to be infected by a virus in today's
>> world.
>>
>> Would you rather have to boot a multi-gigabyte image of who-knows-what
>> that does ghawd-knows-what for what should be a simple task?
>
> The above is really weird. From CentOS 5, 6, and 7, I've run Dell's
> firmware update from a running system, no OMSA. Updates with no
> problems.

I really agree with Frank. The smaller the thing you run the
flash/firmware burner from, the better. So, rudimentary DOS is what I
would prefer, given a choice.

> And I have to say I really like Dell's firmware installer - it scans
> the system, and then *tells* you that a) it is for that system, and b)
> that this is newer than the current, and do you want to install.

Though I do note that tastes differ.

Valeri

--
Valeri Galtsev
Sr System Administrator
Department of Astronomy and Astrophysics
Kavli Institute for Cosmological Physics
University of Chicago
Phone: 773-702-4247
Re: [CentOS] HPE ProLiant - support Linux Vendor Firmware Service ?
Frank Cox wrote:
> On Mon, 1 Jul 2019 19:38:29 +0100 lejeczek via CentOS wrote:
>
>> I also a few years ago got Dell's tech support telling me to
>> do MS-DOS stuff in order to update BIOS.
>
> So what's wrong with using DOS to update firmware? DOS is a small and
> simple program loader that's unlikely to require much in the way of
> hardware to work and is unlikely to be infected by a virus in today's
> world.
>
> Would you rather have to boot a multi-gigabyte image of who-knows-what
> that does ghawd-knows-what for what should be a simple task?

The above is really weird. From CentOS 5, 6, and 7, I've run Dell's
firmware update from a running system, no OMSA. Updates with no problems.

And I have to say I really like Dell's firmware installer - it scans the
system, and then *tells* you that a) it is for that system, and b) that
this is newer than the current, and asks whether you want to install.

mark
Re: [CentOS] HPE ProLiant - support Linux Vendor Firmware Service ?
Hi,

We have some DL380 Gen10 in production; firmware updates are made via
iLO, then a reboot after it is done. On Dell (with OMSA) we can update
from Linux and reboot after: the same thing, from my point of view.

Regards,
Antonio.

On Mon, 1 Jul 2019 at 14:12, lejeczek via CentOS wrote:
> hi guys
>
> does anybody here run on HPE ProLiant?
> I was hoping you could tell whether HPE supports Linux Vendor
> Firmware Service and you actually get to upgrade ProLiants'
> BIOS/firmware via fwupdmgr?
>
> many thanks, L.

--
Antonio da Silva Martins Jr.
Support Analyst
NPD - Núcleo de Processamento de Dados
UEM - Universidade Estadual de Maringá
email: asmart...@uem.br
phone: +55 (44) 3011-4015 / 3011-4411
inoc-dba: 263076*100

"Real Programmers don’t need comments — the code is obvious."
Re: [CentOS] HPE ProLiant - support Linux Vendor Firmware Service ?
lejeczek via CentOS wrote:
> On 01/07/2019 18:38, mark wrote:
>> lejeczek via CentOS wrote:
>>>
>>> does anybody here run on HPE ProLiant? I was hoping you could tell
>>> whether HPE supports Linux Vendor Firmware Service and you actually
>>> get to upgrade ProLiants' BIOS/firmware via fwupdmgr?
>>
>> Dunno 'bout "Linux Vendor Firmware Service", but HPE support, ah,
>> yeah... let's not go there. And they *really* want you to use MS DOS
>> to update the firmware. Oh, and when we had support in to do repairs
>> about 6 or so months ago on our small SGI supercomputer (they bought
>> SGI), the techs were worried, because HPE was spinning off support to
>> Unisys, and how they were going to get parts
>>
>> mark "at least it's not Oracle/Sun support is all I can say"
>
> hi, thanks for the info. And you have tried fwupdmgr and no positive
> results? Which Gen are your ProLiants?

I don't remember if I ever used that. Only had one HP ProLiant, and did
not like it - a Gen 5, I think it was, and, on boot, 70 sec *before* the
logo even appeared. That system was my "why I don't care about systemd
SEE HOW FAST WE BOOT!!!", when it took almost five MINUTES before it ever
got to the grub screen.

> Dell, which I have had for many years, do their own OMSA, which is
> better than nothing, but this too is flaky at times. I also a few years
> ago got Dell's tech support telling me to do MS-DOS stuff in order to
> update BIOS.

As I just said in another post, I've never had tech support tell me that.
They give me a link for a .BIN, which I run, and it's a shell script with
embedded binary software.

> I'm thinking & hoping that maybe IBM, since they are now Red Hat, will
> supply us with premium grade software support for their hardware.
> Although IBM is a bit like Intel in my opinion - they do not innovate
> that much, are old and struggle to understand end users like us.

I dunno 'bout that. IBM hardware has always been really solid, in my
experience. And you have to understand, they do a lot of
service/consulting.

Understand us? IBM's been seriously big in Linux from very early. Hell,
around 18 years ago, one of their folks had the use of a Z-series
mainframe and maxed it out, using IBM's VM (which goes back to the
seventies, really), with 48,000 separate instances of Linux, and it ran
fine on 32,000 VMs.

Hell, I wasn't happy, a few years ago, when I found out that RH's CEO
since a few years ago was a former exec at... Delta Airlines. I'm sure he
knows *so* much about Unix, Linux, or o/s's in general.

mark
Re: [CentOS] HPE ProLiant - support Linux Vendor Firmware Service ?
On Mon, 1 Jul 2019 19:38:29 +0100 lejeczek via CentOS wrote:

> I also a few years ago got Dell's tech support telling me to
> do MS-DOS stuff in order to update BIOS.

So what's wrong with using DOS to update firmware? DOS is a small and
simple program loader that's unlikely to require much in the way of
hardware to work and is unlikely to be infected by a virus in today's
world.

Would you rather have to boot a multi-gigabyte image of who-knows-what
that does ghawd-knows-what for what should be a simple task?

--
MELVILLE THEATRE ~ Real D 3D Digital Cinema ~ www.melvilletheatre.com
Re: [CentOS] HPE ProLiant - support Linux Vendor Firmware Service ?
On 01/07/2019 18:38, mark wrote:
> lejeczek via CentOS wrote:
>> hi guys
>>
>> does anybody here run on HPE ProLiant? I was hoping you could tell
>> whether HPE supports Linux Vendor Firmware Service and you actually
>> get to upgrade ProLiants' BIOS/firmware via fwupdmgr?
>
> Dunno 'bout "Linux Vendor Firmware Service", but HPE support, ah,
> yeah... let's not go there. And they *really* want you to use MS DOS to
> update the firmware. Oh, and when we had support in to do repairs about
> 6 or so months ago on our small SGI supercomputer (they bought SGI),
> the techs were worried, because HPE was spinning off support to Unisys,
> and how they were going to get parts
>
> mark "at least it's not Oracle/Sun support is all I can say"

hi, thanks for the info. And you have tried fwupdmgr and no positive
results? Which Gen are your ProLiants?

On https://fwupd.org/ the HPE logo shows up, plus some notes, but
first-hand experience is, as always, best to have, which I do not have,
as I am only beginning to consider HPE hardware for the first time.

Dell, which I have had for many years, do their own OMSA, which is better
than nothing, but this too is flaky at times. I also a few years ago got
Dell's tech support telling me to do MS-DOS stuff in order to update
BIOS.

I'm thinking & hoping that maybe IBM, since they are now Red Hat, will
supply us with premium grade software support for their hardware.
Although IBM is a bit like Intel in my opinion - they do not innovate
that much, are old and struggle to understand end users like us.
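For anyone who hasn't tried fwupd yet, the whole LVFS update cycle is only a handful of `fwupdmgr` subcommands. A dry-run sketch (it just prints the commands, since actually flashing requires supported hardware and root):

```shell
# Typical fwupd/LVFS update cycle; echoed rather than executed here.
fwupd_steps() {
    echo "fwupdmgr refresh"       # pull the latest firmware metadata from LVFS
    echo "fwupdmgr get-devices"   # list devices fwupd recognizes on this box
    echo "fwupdmgr get-updates"   # show pending firmware updates, if any
    echo "fwupdmgr update"        # download and apply them
}
fwupd_steps
```

Whether a given ProLiant shows up at the `get-devices` step is exactly the vendor-support question being asked in this thread.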
Re: [CentOS] HPE ProLiant - support Linux Vendor Firmware Service ?
lejeczek via CentOS wrote:
> hi guys
>
> does anybody here run on HPE ProLiant? I was hoping you could tell
> whether HPE supports Linux Vendor Firmware Service and you actually get
> to upgrade ProLiants' BIOS/firmware via fwupdmgr?

Dunno 'bout "Linux Vendor Firmware Service", but HPE support, ah,
yeah... let's not go there. And they *really* want you to use MS DOS to
update the firmware. Oh, and when we had support in to do repairs about 6
or so months ago on our small SGI supercomputer (they bought SGI), the
techs were worried, because HPE was spinning off support to Unisys, and
how they were going to get parts

mark "at least it's not Oracle/Sun support is all I can say"
Re: [CentOS] raid 5 install
Warren Young wrote on 7/1/2019 9:48 AM:
> On Jul 1, 2019, at 7:56 AM, Blake Hudson wrote:
>>
>> I've never used ZFS, as its Linux support has been historically poor.
>
> When was the last time you checked? The ZFS-on-Linux (ZoL) code has
> been stable for years. In recent months, the BSDs have rebased their
> offerings from Illumos to ZoL. The macOS port, called O3X, is also
> mostly based on ZoL. That leaves Solaris as the only major OS with a
> ZFS implementation not based on ZoL.
>
>> 1) A single drive failure in a RAID4 or 5 array (desktop IDE)
>
> Can I take by "IDE" that you mean "before SATA", so you're giving a
> data point something like twenty years old?
>
>> 2) A single drive failure in a RAID1 array (Supermicro SCSI)
>
> Another dated tech reference, if by "SCSI" you mean parallel SCSI, not
> SAS. I don't mind old tech per se, but at some point the clock on bugs
> must reset.

Yes, this experience spans decades and a variety of hardware. I'm all for
giving things another try, and would love to try ZFS again now that it's
been ported to Linux. As far as mdadm goes, I'm happy with LSI hardware
RAID controllers and have no desire to retry mdadm at this time. I have
enough enterprise-class drives fail on a regular basis (I manage a
reasonable volume) that the predictability gained by standardizing on one
vendor for HW RAID cards is worth a lot. I have no problem recommending
LSI cards to folks who feel the improved availability outweighs the cost
(~$500). This would assume those folks have already covered other aspects
of availability and redundancy first (power, PSUs, cooling, backups,
etc).
[CentOS] HPE ProLiant - support Linux Vendor Firmware Service ?
hi guys

does anybody here run on HPE ProLiant?
I was hoping you could tell whether HPE supports Linux Vendor Firmware
Service and you actually get to upgrade ProLiants' BIOS/firmware via
fwupdmgr?

many thanks, L.
Re: [CentOS] raid 5 install
On 2019-07-01 10:10, mark wrote:
> I haven't been following this thread closely, but some of them have left me puzzled.
>
> 1. Hardware RAID: other than Rocket RAID, who don't seem to support a card more than about 3 years (I used to have to update and rebuild the drivers), anything LSI based, which includes Dell PERC, has been pretty good. The newer models do even better at doing the right thing.
>
> 2. ZFS seems to be ok, though we were testing it with an Ubuntu system just a month or so ago. Note: ZFS with a zpool Z2 - the equivalent of RAID 6, which we set up using the LSI card set to JBOD - took about 3 days and 8 hours for backing up a large project, while the same o/s, but with xfs on an LSI-hardware RAID 6, took about 10 hours less. Hardware RAID is faster.
>
> 3. Being in the middle of going through three days of hourly logs and the loghost reports, and other stuff, from the weekend (> 600 emails), I noted that we have something like 50 mdraids, and we've had very little trouble with them; almost all are either RAID 1 or RAID 6 (we may have a RAID 5 left), except for the system that had a h/d fail, and another starting to throw errors (I suspect the server itself...). The biggest issue for me is that when one fails, "identify" rarely works, which means use smartctl or MegaCli64 (or the lsi script) to find the s/n of the drive, then guess... and if that doesn't work, bring the system down to find the right bloody bad drive.

In my case I spend a bit of time before I roll out the system, so I know which physical drive (or which tray) the controller numbers with which number. They stay the same over the life of the system; those are just physical connections. Then when the controller tells me drive number "N" failed, I know which tray to pull.

Valeri

> But... they rebuild, no problems. Oh, and I have my own workstation at home on a mdraid 1.
>
> mark

--
Valeri Galtsev
Sr System Administrator
Department of Astronomy and Astrophysics
Kavli Institute for Cosmological Physics
University of Chicago
Phone: 773-702-4247
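For the "identify rarely works" problem above, MegaRAID cards can blink a bay's locate LED from the CLI. A hedged sketch; the enclosure:slot pair `[252:3]` is an invented example, and the real values come from `-PDList` output. The `run=echo` guard makes this a dry run.

```shell
# Dry-run sketch: blink the locate LED on an LSI/MegaRAID drive bay.
# [252:3] (enclosure:slot) is an assumed example -- read real values,
# and drive serial numbers, from -PDList first. Set run= (empty) to execute.
run=echo

$run MegaCli64 -PDList -aALL                           # list drives, [E:S] ids, serials
$run MegaCli64 -PdLocate -start -physdrv[252:3] -aALL  # start blinking the bay LED
$run MegaCli64 -PdLocate -stop -physdrv[252:3] -aALL   # stop blinking
```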
[CentOS] Was, Re: raid 5 install, is ZFS
Speaking of ZFS, got a weird one: we were testing ZFS (ok, it was on Ubuntu, but that shouldn't make a difference, I would think), and I've got a zpool z2. I pulled one drive, to simulate a drive failure, and it rebuilt with the hot spare. Then I pushed the drive I'd pulled back in... and it does not look like I've got a hot spare. zpool status shows:

config:

        NAME         STATE     READ WRITE CKSUM
        export1      ONLINE       0     0     0
          raidz2-0   ONLINE       0     0     0
            sda      ONLINE       0     0     0
            spare-1  ONLINE       0     0     0
              sdb    ONLINE       0     0     0
              sdl    ONLINE       0     0     0
            sdc      ONLINE       0     0     0
            sdd      ONLINE       0     0     0
            sde      ONLINE       0     0     0
            sdf      ONLINE       0     0     0
            sdg      ONLINE       0     0     0
            sdh      ONLINE       0     0     0
            sdi      ONLINE       0     0     0
            sdj      ONLINE       0     0     0
            sdk      ONLINE       0     0     0
        spares
          sdl        INUSE     currently in use

Does anyone know what I need to do to make the spare sdl go back to being just a hot spare?

mark
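A hedged sketch of the likely fix: while a spare is in use, ZFS keeps both disks attached as the `spare-1` pair, and detaching the spare device returns it to the AVAIL list while keeping the original disk in the vdev. Pool and device names below are taken from the status output above; `run=echo` makes it a dry run.

```shell
# Dry-run sketch: return an in-use hot spare to the spares list.
# Set run= (empty) to execute for real.
run=echo

$run zpool detach export1 sdl    # sdl leaves spare-1; sdb stays in raidz2-0
$run zpool status export1        # spare should now show AVAIL instead of INUSE
```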
Re: [CentOS] raid 5 install
On 2019-07-01 10:01, Warren Young wrote:
> On Jul 1, 2019, at 8:26 AM, Valeri Galtsev wrote:
>> RAID function, which boils down to simple, short, easy to debug well program.

I didn't intend to start a software vs hardware RAID flame war when I joined somebody else's opinion.

Now, commenting with all due respect to the famous person that Warren Young definitely is.

> RAID firmware will be harder to debug than Linux software RAID, if only because of easier-to-use tools.

I myself debug neither firmware (or "microcode", speaking the language as it was some 30 years ago), nor the Linux kernel. In both cases it is someone else who does the debugging.

You are speaking as a person who routinely debugs Linux components. I still have to stress that in debugging RAID card firmware one has the small program which this firmware is.

In the case of debugging EVERYTHING that affects reliability of software RAID, one has to debug the following:

1. the Linux kernel itself, which is huge;

2. _all_ the drivers that are loaded when the system runs. Some of the drivers on one's system may be binary only, like NVIDIA video card drivers. So, even for those who, like Warren, can debug all code, these still are not accessible.

All of the above can potentially panic the kernel (as they all run in kernel context), so they all affect reliability of software RAID, not only the chunk of software doing the software RAID function.

> Furthermore, MD RAID only had to be debugged once, rather than once per company-and-product line as with hardware RAID.

Alas, MD RAID itself is not the only thing that affects reliability of software RAID. A panicking kernel has grave effects on software RAID, so anything that can panic the kernel has also to be debugged just as thoroughly. And it always has to be redone once changes to the kernel or drivers are introduced.

> I hope you’re not assuming that hardware RAID has no bugs. It’s basically a dedicated CPU running dedicated software that’s difficult to upgrade.
That's true, it is a dedicated CPU running a dedicated program, and it keeps doing it even if the operating system crashed.

Yes, hardware itself can be unreliable. But in the case of a RAID card it is only the card itself, the failure rate of which in my racks is much smaller than the overall failure rate of everything. In the case of a kernel panic, any piece of hardware inside the computer, in some mode of failure, can cause it.

One more thing: apart from the hardware RAID "firmware" program being small and logically simple, there is one more factor: it usually runs on a RISC architecture CPU, and introducing bugs when programming for RISC architectures is IMHO more difficult than when programming for i386 and amd64 architectures. Just my humble opinion, which I carry since the time I was programming.

>> if kernel (big and buggy code) is panicked, current RAID operation will never be finished which leaves the mess.
>
> When was the last time you had a kernel panic? And of those times, when was the last time it happened because of something other than a hardware or driver fault?
>
> If it wasn’t for all this hardware doing strange things, the kernel would be a lot more stable. :)

Yes, I half expected that. When did we last have a kernel crash, and who of us is unable to choose reliable hardware, and unable to insist that our institution pays a mere 5-10% higher price for a reliable box than it would for junk hardware? Indeed, we all run reliable boxes, and I am retiring still reliably working machines of age 10-13 years...

However, I would rather suggest comparing not absolute probabilities, which, exactly as you said, are infinitesimal, but relative probabilities; and there I will still go with hardware RAID.

> You seem to be saying that hardware RAID can’t lose data. You’re ignoring the RAID 5 write hole:
>
> https://en.wikipedia.org/wiki/RAID#WRITE-HOLE

Neither of our RAID cards runs without battery backup.

> If you then bring up battery backups, now you’re adding cost to the system.
> And then some ~3-5 years later, downtime to swap the battery, and more downtime. And all of that just to work around the RAID write hole.

You are absolutely right about a system with hardware RAID being more expensive than one with software RAID. I would say, for "small scale big storage" boxes (i.e. NOT distributed file systems), hardware RAID adds about 5-7% of cost in our case. Now, with hardware RAID, all maintenance (what one needs to do in the routine case of a single failed drive replacement) takes about 1/10 of the time necessary to deal with a similar failure in the case of software RAID. I deal with both, as it historically happened, so this is my own observation. Maybe the software RAID boxes I have to deal with are too messy (imagine almost two dozen software RAIDs of 12-16 drives each on one machine; even the BIOS runs out of numbers in its attempt to enumerate all the drives...). No, I am not taking the blame for building a box like that ;-)

All in all, simpler way of routinely dealing wi
Re: [CentOS] raid 5 install
> You seem to be saying that hardware RAID can’t lose data. You’re ignoring the RAID 5 write hole:
>
> https://en.wikipedia.org/wiki/RAID#WRITE-HOLE
>
> If you then bring up battery backups, now you’re adding cost to the system. And then some ~3-5 years later, downtime to swap the battery, and more downtime. And all of that just to work around the RAID write hole.

Yes. Furthermore, with the huge capacity disks in use today, rebuilding a RAID 5 array after a disk fails, with all the necessary parity calculations, can take days. RAID 5 is obsolete, and I'm not the only one saying it.
Re: [CentOS] raid 5 install
I haven't been following this thread closely, but some of them have left me puzzled.

1. Hardware RAID: other than Rocket RAID, who don't seem to support a card more than about 3 years (I used to have to update and rebuild the drivers), anything LSI based, which includes Dell PERC, has been pretty good. The newer models do even better at doing the right thing.

2. ZFS seems to be ok, though we were testing it with an Ubuntu system just a month or so ago. Note: ZFS with a zpool Z2 - the equivalent of RAID 6, which we set up using the LSI card set to JBOD - took about 3 days and 8 hours for backing up a large project, while the same o/s, but with xfs on an LSI-hardware RAID 6, took about 10 hours less. Hardware RAID is faster.

3. Being in the middle of going through three days of hourly logs and the loghost reports, and other stuff, from the weekend (> 600 emails), I noted that we have something like 50 mdraids, and we've had very little trouble with them; almost all are either RAID 1 or RAID 6 (we may have a RAID 5 left), except for the system that had a h/d fail, and another starting to throw errors (I suspect the server itself...). The biggest issue for me is that when one fails, "identify" rarely works, which means use smartctl or MegaCli64 (or the lsi script) to find the s/n of the drive, then guess... and if that doesn't work, bring the system down to find the right bloody bad drive.

But... they rebuild, no problems. Oh, and I have my own workstation at home on a mdraid 1.

mark
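The smartctl route mentioned in point 3 can be sketched like this; `/dev/sdc` and the serial string are invented examples, and the `run=echo` guard keeps the hardware query a dry run.

```shell
# Dry-run sketch: map a failing md member to a physical drive by serial number.
# /dev/sdc is an assumed device name; set run= (empty) to execute for real.
run=echo

$run smartctl -i /dev/sdc    # identify info, including a "Serial Number:" line

# With real output, the serial can be extracted like this (sample line shown):
sample='Serial Number:    WD-WCC4N1234567'
serial=$(printf '%s\n' "$sample" | awk -F': *' '{print $2}')
printf '%s\n' "$serial"      # -> WD-WCC4N1234567, match it to the drive label
```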
Re: [CentOS] raid 5 install
On Mon, 1 Jul 2019, Warren Young wrote:
> If you then bring up battery backups, now you’re adding cost to the system. And then some ~3-5 years later, downtime to swap the battery, and more downtime. And all of that just to work around the RAID write hole.

Although batteries have disappeared in favour of NV storage + capacitors, meaning you don't have to replace anything on those models.

jh
Re: [CentOS] raid 5 install
On Jul 1, 2019, at 8:26 AM, Valeri Galtsev wrote:
> RAID function, which boils down to simple, short, easy to debug well program.

RAID firmware will be harder to debug than Linux software RAID, if only because of easier-to-use tools.

Furthermore, MD RAID only had to be debugged once, rather than once per company-and-product line as with hardware RAID.

I hope you’re not assuming that hardware RAID has no bugs. It’s basically a dedicated CPU running dedicated software that’s difficult to upgrade.

> if kernel (big and buggy code) is panicked, current RAID operation will never be finished which leaves the mess.

When was the last time you had a kernel panic? And of those times, when was the last time it happened because of something other than a hardware or driver fault?

If it wasn’t for all this hardware doing strange things, the kernel would be a lot more stable. :)

You seem to be saying that hardware RAID can’t lose data. You’re ignoring the RAID 5 write hole:

https://en.wikipedia.org/wiki/RAID#WRITE-HOLE

If you then bring up battery backups, now you’re adding cost to the system. And then some ~3-5 years later, downtime to swap the battery, and more downtime. And all of that just to work around the RAID write hole.

Copy-on-write filesystems like ZFS and btrfs avoid the write hole entirely, so that the system can crash at any point, and the filesystem is always consistent.
Re: [CentOS] raid 5 install
On Jul 1, 2019, at 7:56 AM, Blake Hudson wrote:
> I've never used ZFS, as its Linux support has been historically poor.

When was the last time you checked? The ZFS-on-Linux (ZoL) code has been stable for years. In recent months, the BSDs have rebased their offerings from Illumos to ZoL. The macOS port, called O3X, is also mostly based on ZoL. That leaves Solaris as the only major OS with a ZFS implementation not based on ZoL.

> 1) A single drive failure in a RAID4 or 5 array (desktop IDE)

Can I take it by “IDE” that you mean “before SATA”, so you’re giving a data point something like twenty years old?

> 2) A single drive failure in a RAID1 array (Supermicro SCSI)

Another dated tech reference, if by “SCSI” you mean parallel SCSI, not SAS. I don’t mind old tech per se, but at some point the clock on bugs must reset.

> We had to update the BIOS to boot from the working drive

That doesn’t sound like a problem with the Linux MD RAID feature. It sounds like the system BIOS had a strange limitation about which drives it was willing to consider bootable.

> and possibly grub had to be repaired or reinstalled as I recall

That sounds like you didn’t put GRUB on all disks in the array, which in turn means you probably set up the RAID manually, rather than through the OS installer, which should take care of details like that for you.

> 3) A single drive failure in a RAID 4 or 5 array (desktop IDE) was not clearly identified and required a bit of troubleshooting to pinpoint which drive had failed.

I don’t know about Linux MD RAID, but with ZFS, you can make it tell you the drive’s serial number when it’s pointing out a faulted disk.

Software RAID also does something that I haven’t seen in typical PC-style hardware RAID: marry GPT partition drive labels to array status reports, so that instead of seeing something that’s only of indirect value like “port 4 subunit 3” you can make it say “left cage, 3rd drive down”.
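The GPT-label trick described above can be sketched as follows. Pool, disk, and label names are invented for illustration, and `run=echo` keeps it a dry run; once the pool is built from `/dev/disk/by-partlabel/` paths, `zpool status` reports the location labels instead of raw device names.

```shell
# Dry-run sketch: name partitions after their physical location, then build
# the pool from those labels. All names are assumptions, not from the thread.
# Set run= (empty) to execute for real.
run=echo

$run sgdisk -n1:0:0 -c1:left-cage-drive3 /dev/sdd   # whole-disk partition, labeled
$run sgdisk -n1:0:0 -c1:left-cage-drive4 /dev/sde
$run zpool create tank mirror /dev/disk/by-partlabel/left-cage-drive3 /dev/disk/by-partlabel/left-cage-drive4
```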
Re: [CentOS] raid 5 install
On July 1, 2019 8:56:35 AM CDT, Blake Hudson wrote:
> Warren Young wrote on 6/28/2019 6:53 PM:
>> On Jun 28, 2019, at 8:46 AM, Blake Hudson wrote:
>>> Linux software RAID…has only decreased availability for me. This has been due to a combination of hardware and software issues that are generally handled well by HW RAID controllers, but are often handled poorly or unpredictably by desktop oriented hardware and Linux software.
>> Would you care to be more specific? I have little experience with software RAID, other than ZFS, so I don’t know what these “issues” might be.
>
> I've never used ZFS, as its Linux support has been historically poor. My comments are limited to mdadm. I've experienced three faults when using Linux software raid (mdadm) on RH/RHEL/CentOS, and I believe all of them resulted in more downtime than would have been experienced without the RAID:
>
> 1) A single drive failure in a RAID4 or 5 array (desktop IDE) caused the entire system to stop responding. The result was a degraded (from the dead drive) and dirty (from the crash) array that could not be rebuilt (either of the former conditions would have been fine, but not both, due to buggy Linux software).
>
> 2) A single drive failure in a RAID1 array (Supermicro SCSI) caused the system to be unbootable. We had to update the BIOS to boot from the working drive, and possibly grub had to be repaired or reinstalled as I recall (it's been a long time).
>
> 3) A single drive failure in a RAID 4 or 5 array (desktop IDE) was not clearly identified and required a bit of troubleshooting to pinpoint which drive had failed.
>
> Unfortunately, I've never had an experience where a drive just failed cleanly and was marked bad by Linux software RAID and could then be replaced without fanfare. This is in contrast to my HW raid experiences, where a single drive failure is almost always handled in a reliable and predictable manner with zero downtime.
>
> Your points about having to use a clunky BIOS setup or CLI tools may be true for some controllers, as are your points about needing to maintain a spare of your RAID controller, ongoing driver support, etc. I've found the LSI brand cards have good Linux driver support, CLI tools, an easy to navigate BIOS, and are backwards compatible with RAID sets taken from older cards, so I have no problem recommending them. LSI cards, by default, also regularly test all drives to predict failures (avoiding rebuild errors or double failures).

+1 in favor of hardware RAID. My usual argument is: in the case of hardware RAID, a dedicated piece of hardware runs a single task: the RAID function, which boils down to a simple, short, easy to debug program. In the case of software RAID there is no dedicated hardware, and if the kernel (big and buggy code) is panicked, the current RAID operation will never be finished, which leaves a mess. One does not need a computer science degree to follow this simple logic.

Valeri

--
Valeri Galtsev
Sr System Administrator
Department of Astronomy and Astrophysics
Kavli Institute for Cosmological Physics
University of Chicago
Phone: 773-702-4247
Re: [CentOS] raid 5 install
Warren Young wrote on 6/28/2019 6:53 PM:
> On Jun 28, 2019, at 8:46 AM, Blake Hudson wrote:
>> Linux software RAID…has only decreased availability for me. This has been due to a combination of hardware and software issues that are generally handled well by HW RAID controllers, but are often handled poorly or unpredictably by desktop oriented hardware and Linux software.
> Would you care to be more specific? I have little experience with software RAID, other than ZFS, so I don’t know what these “issues” might be.

I've never used ZFS, as its Linux support has been historically poor. My comments are limited to mdadm. I've experienced three faults when using Linux software raid (mdadm) on RH/RHEL/CentOS, and I believe all of them resulted in more downtime than would have been experienced without the RAID:

1) A single drive failure in a RAID4 or 5 array (desktop IDE) caused the entire system to stop responding. The result was a degraded (from the dead drive) and dirty (from the crash) array that could not be rebuilt (either of the former conditions would have been fine, but not both, due to buggy Linux software).

2) A single drive failure in a RAID1 array (Supermicro SCSI) caused the system to be unbootable. We had to update the BIOS to boot from the working drive, and possibly grub had to be repaired or reinstalled as I recall (it's been a long time).

3) A single drive failure in a RAID 4 or 5 array (desktop IDE) was not clearly identified and required a bit of troubleshooting to pinpoint which drive had failed.

Unfortunately, I've never had an experience where a drive just failed cleanly and was marked bad by Linux software RAID and could then be replaced without fanfare. This is in contrast to my HW raid experiences, where a single drive failure is almost always handled in a reliable and predictable manner with zero downtime.

Your points about having to use a clunky BIOS setup or CLI tools may be true for some controllers, as are your points about needing to maintain a spare of your RAID controller, ongoing driver support, etc. I've found the LSI brand cards have good Linux driver support, CLI tools, an easy to navigate BIOS, and are backwards compatible with RAID sets taken from older cards, so I have no problem recommending them. LSI cards, by default, also regularly test all drives to predict failures (avoiding rebuild errors or double failures).
[CentOS] Migrate older disk image to new UEFI
Hi All - I have a CentOS 7 (image) that is fully updated. However, it's not UEFI. Is there a way to use an older computer to migrate my current image to UEFI, so this new UEFI-only computer I have will boot? I was hoping not to have to completely re-install and set up for my needs.

Jerry
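One possible approach, sketched with heavy assumptions: a BIOS-installed CentOS 7 image can usually be made UEFI-bootable by adding an EFI System Partition and the EFI GRUB packages. The disk must already be GPT (MBR disks need converting first), and `/dev/sda` with a new partition 3 are invented placeholders. `run=echo` keeps this a dry run.

```shell
# Dry-run sketch of a BIOS-to-UEFI conversion on CentOS 7. Assumes a GPT disk
# with free space; /dev/sda and partition number 3 are assumptions.
# Set run= (empty) to execute for real.
run=echo

$run sgdisk -n3:0:+200M -t3:ef00 /dev/sda              # carve out an EFI System Partition
$run mkfs.vfat /dev/sda3                               # ESP must be FAT
$run mkdir -p /boot/efi
$run mount /dev/sda3 /boot/efi
$run yum install -y efibootmgr shim-x64 grub2-efi-x64  # EFI boot chain packages
$run grub2-mkconfig -o /boot/efi/EFI/centos/grub.cfg   # EFI grub config location
```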
[CentOS] OT, hardware: new router and a USB winprinter
Well, the router I've had for some years now, that I had DD-WRT on, borked itself. Went out and bought a new one, an Asus RT-AC66U.

Now, I've got this 11-yr-old cute HP LaserJet... and it's a winprinter. With the version of DD-WRT I had, all I had to do was send the magic M$ data to it, and it would wake up, the lights on the printer telling me that it was installing that. The new router magically actually recognizes the printer... but when I try to send the data, nothing happens.

If anyone can give me some ideas, feel free to do it offlist, as this is very OT.

PS: I will NEVER touch DD-WRT again. It's an amateur project, in the worst sense of the word. When I was first putting it on, people on its general support mailing list talked about their "favorite builder" (there were several), and their "favorite builds". There were no formal releases, etc, etc. I went that way because the OEM firmware "supported USB printers"... except, according to their tech support, "oh, not that printer". *sigh*

Thanks in advance.

mark