[CentOS] USB drive very slow on CentOS 8
I recently upgraded from CentOS 7 to CentOS 8.

I have a Mediasonic HF2-SU3S2 external drive enclosure with 4x3TB drives configured as a software RAID 5 array (mdadm) with LVM. It's connected to a USB 3.0 port.

On CentOS 7, the drive performance is reasonable. On CentOS 8, performance is extremely slow, about USB 1.0 performance. Maybe worse. Connecting an external USB SSD to the CentOS 8 system appears to perform well.

I've updated the ASUS Prime-B350-Plus motherboard with the latest BIOS, but it doesn't appear to make any difference.

Does anyone have any idea what might be different between CentOS 7 and 8 that would cause this?

--
Michael Eager    ea...@eagerm.com
1960 Park Blvd., Palo Alto, CA 94306
___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos
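One way to narrow down a report like the one above is to check which speed the kernel actually negotiated with the enclosure. The sketch below is not from the original post. It assumes a Linux host that exposes the usual `speed` attribute (in Mbit/s) under `/sys/bus/usb/devices`, and the function names are made up for illustration.

```python
from pathlib import Path

# Values the kernel writes to /sys/bus/usb/devices/*/speed, in Mbit/s.
SPEED_NAMES = {
    "1.5": "USB 1.x low speed",
    "12": "USB 1.x full speed",
    "480": "USB 2.0 high speed",
    "5000": "USB 3.0 SuperSpeed",
    "10000": "USB 3.1 SuperSpeed+",
}


def describe_speed(raw):
    """Map the raw sysfs speed string to a readable label."""
    value = raw.strip()
    return SPEED_NAMES.get(value, "unknown (%s Mbit/s)" % value)


def list_usb_speeds(root="/sys/bus/usb/devices"):
    """Return (device, product, label) for each USB device with a speed file."""
    results = []
    for speed_file in sorted(Path(root).glob("*/speed")):
        dev = speed_file.parent
        product_file = dev / "product"
        # Not every entry has a product string; fall back to "?".
        product = product_file.read_text().strip() if product_file.exists() else "?"
        results.append((dev.name, product, describe_speed(speed_file.read_text())))
    return results
```

Calling `list_usb_speeds()` on both the CentOS 7 and CentOS 8 installs and comparing the line for the enclosure would show whether it dropped from 5000 to 480 or 12. If the negotiated speed is the same on both, the slowdown lies elsewhere.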
Re: [CentOS] Problems installing CentOS 8
On 12/23/19 5:19 PM, Michael Eager wrote:
> I'm having a problem installing CentOS 8 from a USB drive.
>
> When the installer boots from the USB, it displays the language selection screen. After I select English and continue, the installer freezes. The USB drive flashes a couple times over the next minute or so, then stops. The mouse moves the cursor, but the installer is unresponsive to either selecting QUIT or HELP.

I resolved the problem.

I had a multi-TB external RAID box attached by USB to the system. Apparently, the installer was trying to analyze the available space and either failing or taking forever. Once I disconnected the external drive, the installer was able to find the local disks.

--
Michael Eager    ea...@eagerm.com
1960 Park Blvd., Palo Alto, CA 94306
Re: [CentOS] Problems installing CentOS 8
On 12/24/19 6:39 AM, Mauricio Tavares wrote:
> On Mon, Dec 23, 2019 at 8:20 PM Michael Eager wrote:
>> I'm having a problem installing CentOS 8 from a USB drive.
>>
>> When the installer boots from the USB, it displays the language selection screen. After I select English and continue, the installer freezes. The USB drive flashes a couple times over the next minute or so, then stops. The mouse moves the cursor, but the installer is unresponsive to either selecting QUIT or HELP.
>>
>> I've tried both the default and the basic graphic install with the same results.
>
> Stupid (as in I am guilty of that) question: do you know if this USB is not a bum? The latter explains why I could not get my Raspberry Pi booting. Replacing with a new SD card solved this issue.

CentOS completes the self-test without error before booting. Booting completes without apparent problem. My guess is that the USB is good.

> If you want to be lazy and have a hypervisor, create a VM guest and boot it using the USB.

I can try that. It will tell me if there is a HW/BIOS issue.

> With that said, it is possible that while you are having an uncooperative GUI you can still switch screens (i.e. keyboard still listening to you) to screen 1 or 2 and then take a look at the dmesg/log output for clues of what went boink.

Good idea -- I hadn't thought of that.

>> Details:
>> CentOS-8-x86-1905-dvd1.iso (sha256 verified)
>> ASUS Prime B350 Plus motherboard
>> AMD Ryzen 5 1600 CPU
>> 32GB DRAM
>> 4 SATA drives in RAID/LVM configuration.
>> M.2 500GB Samsung SSD (not formatted)

--
Michael Eager    ea...@eagerm.com
1960 Park Blvd., Palo Alto, CA 94306
[CentOS] Problems installing CentOS 8
I'm having a problem installing CentOS 8 from a USB drive.

When the installer boots from the USB, it displays the language selection screen. After I select English and continue, the installer freezes. The USB drive flashes a couple times over the next minute or so, then stops. The mouse moves the cursor, but the installer is unresponsive to either selecting QUIT or HELP.

I've tried both the default and the basic graphic install with the same results.

Details:
CentOS-8-x86-1905-dvd1.iso (sha256 verified)
ASUS Prime B350 Plus motherboard
AMD Ryzen 5 1600 CPU
32GB DRAM
4 SATA drives in RAID/LVM configuration.
M.2 500GB Samsung SSD (not formatted)

--
Michael Eager    ea...@eagerm.com
1960 Park Blvd., Palo Alto, CA 94306
[CentOS] Building for older versions
Hi --

I'm trying to build an application on CentOS 7 which can run on older versions of CentOS. I'm running into problems with versioning of memcpy in glibc. Executables built on CentOS 7 require memcpy from glibc-2.14, which causes the program not to load on systems with older versions of glibc.

My online search suggests adding an asm() with a .symver directive to select memcpy from glibc-2.2.5 in each of the source files which reference memcpy(). This isn't practical for a program with tens of thousands of source files.

Does anyone have a reasonable solution?

--
Michael Eager    ea...@eagercon.com
1960 Park Blvd., Palo Alto, CA 94306  650-325-8077
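Before choosing a fix, it can help to list exactly which imported symbols pin a binary to a newer glibc. The sketch below is not from the thread. It parses the text printed by `objdump -T <binary>` and reports the undefined symbols whose version tag is newer than a chosen limit. The function names and the (2, 12) limit are illustrative assumptions; 2.12 is the glibc shipped with CentOS 6.

```python
import re

# Matches a version tag such as "GLIBC_2.14" or "(GLIBC_2.2.5)" followed by the symbol name.
VERSION_RE = re.compile(r"\(?GLIBC_(\d+(?:\.\d+)*)\)?\s+(\S+)")


def required_glibc_symbols(objdump_text):
    """Map each imported, versioned symbol to its GLIBC version as a tuple of ints."""
    needs = {}
    for line in objdump_text.splitlines():
        if "*UND*" not in line:  # only undefined (imported) symbols matter here
            continue
        match = VERSION_RE.search(line)
        if match:
            version = tuple(int(part) for part in match.group(1).split("."))
            needs[match.group(2)] = version
    return needs


def symbols_newer_than(needs, limit=(2, 12)):
    """Return the symbols that need a glibc newer than `limit`, sorted by name."""
    return sorted(name for name, version in needs.items() if version > limit)
```

Feeding it the captured output of `objdump -T ./myprog` (a hypothetical binary name) and calling `symbols_newer_than(...)` would show whether memcpy is the only offender or whether other symbols would also need attention.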
Re: [CentOS] Building for older versions
On 11/23/2015 08:06 AM, Nicolas Thierry-Mieg wrote:
> On 11/23/2015 04:33 PM, Michael Eager wrote:
>> Hi --
>>
>> I'm trying to build an application on CentOS 7 which can run on older versions of CentOS. I'm running into problems with versioning of memcpy in glibc. Executables built on CentOS 7 require memcpy from glibc-2.14, which causes the program not to load on systems with older versions of glibc.
>>
>> My online search suggests adding an asm() with a .symver directive to select memcpy from glibc-2.2.5 in each of the source files which reference memcpy(). This isn't practical for a program with tens of thousands of source files.
>>
>> Does anyone have a reasonable solution?
>
> IMO you should really be building your app on an older CentOS version (5 or 6). Then your binary should run everywhere, though it may sometimes require installing a -compat package.

That causes a number of other problems, when the only issue is accessing a working version of memcpy from the installed glibc.

--
Michael Eager    ea...@eagercon.com
1960 Park Blvd., Palo Alto, CA 94306  650-325-8077
Re: [CentOS] Building for older versions
On 11/23/2015 07:43 AM, Chris Adams wrote:
> Once upon a time, Michael Eager <ea...@eagerm.com> said:
>> I'm trying to build an application on CentOS 7 which can run on older versions of CentOS. I'm running into problems with versioning of memcpy in glibc. Executables built on CentOS 7 require memcpy from glibc-2.14, which causes the program not to load on systems with older versions of glibc.
>
> Most shared libraries are "upwards compatible" but not "backwards compatible" - builds against an old version will run with the new version, but not the other way around. You've found this with glibc, but you could also run into it with other libraries.

The situation with memcpy is a bit different. This isn't really a forward/backward interface compatibility issue.

There was a patch applied to memcpy to improve performance on some architectures, but it also changed the order in which data was moved. Some programs were dependent on this and they broke with the new implementation. These programs did not conform to the non-overlapping data requirements in memcpy's specification. Programs which did conform worked with both the new and old implementation.

To prevent non-conforming programs from using the new version of memcpy, they tagged it with glibc-2.14. Unfortunately, this also prevents conforming programs, which work with either the old or new implementation, from running on systems which have older versions of glibc.

>> My online search suggests adding an asm() with a .symver directive to select memcpy from glibc-2.2.5 in each of the source files which reference memcpy(). This isn't practical for a program with tens of thousands of source files.
>>
>> Does anyone have a reasonable solution?
>
> Would it be practical to use mock and build on the oldest version you want to support? This is how EPEL packages are built, for example. It is targeted at building RPMs, but you can manually use copy-in and copy-out to do other things.

I'll look into mock.

--
Michael Eager    ea...@eagercon.com
1960 Park Blvd., Palo Alto, CA 94306  650-325-8077
Re: [CentOS] Building for older versions
On 11/23/2015 09:10 AM, Nicolas Thierry-Mieg wrote:
> On 11/23/2015 06:00 PM, Michael Eager wrote:
>> On 11/23/2015 08:06 AM, Nicolas Thierry-Mieg wrote:
>>> On 11/23/2015 04:33 PM, Michael Eager wrote:
>>>> Hi --
>>>>
>>>> I'm trying to build an application on CentOS 7 which can run on older versions of CentOS. I'm running into problems with versioning of memcpy in glibc. Executables built on CentOS 7 require memcpy from glibc-2.14, which causes the program not to load on systems with older versions of glibc.
>>>>
>>>> My online search suggests adding an asm() with a .symver directive to select memcpy from glibc-2.2.5 in each of the source files which reference memcpy(). This isn't practical for a program with tens of thousands of source files.
>>>>
>>>> Does anyone have a reasonable solution?
>>>
>>> IMO you should really be building your app on an older CentOS version (5 or 6). Then your binary should run everywhere, though it may sometimes require installing a -compat package.
>>
>> That causes a number of other problems,
>
> Can you please provide some details? I'm genuinely curious as I've been faced with this occasionally and the only problem I've encountered is having to install a few *-compat packages. Thanks.

Building on an older version of CentOS means using older compilers and libraries. Some applications require building with more current tools.

So you end up between a rock and a hard place. You can try to build on the older system for library compatibility, but then you have to use development tools from newer versions. Or you can build with the newer tools, and you have compatibility issues running on the older system.

--
Michael Eager    ea...@eagercon.com
1960 Park Blvd., Palo Alto, CA 94306  650-325-8077
Re: [CentOS] Building for older versions
On 11/23/2015 07:57 AM, Gordon Messmer wrote:
> On 11/23/2015 07:33 AM, Michael Eager wrote:
>> Does anyone have a reasonable solution?
>
> I'd start here: https://wiki.linuxfoundation.org/en/Using_lsbdev

Yeah. I know about LSB and I've worked with the LSB committee. Maybe it's time I tried using it. :-)

It does seem to be a sledgehammer to address what seems to be a minor issue.

--
Michael Eager    ea...@eagercon.com
1960 Park Blvd., Palo Alto, CA 94306  650-325-8077
[CentOS] NFS performance on CentOS 7
I am setting up a file server with CentOS 7. I'm seeing performance which is considerably slower than a similar server running CentOS 6.6.

A 3GB directory can be copied to/from the CentOS 6.6 server in about 50 seconds. The same directory takes about 270 seconds to copy to/from the CentOS 7 system.

I see the same performance difference with NFS mounted file systems or using scp, so it doesn't appear to be an NFS issue. The MTU on the NICs on both systems is 1500, and changing it to 6000 on the CentOS 7 system had no effect.

Anyone have any ideas what might cause this problem or how to fix it?

--
Michael Eager    ea...@eagercon.com
1960 Park Blvd., Palo Alto, CA 94306  650-325-8077
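The timings above translate into average transfer rates as follows. This is a back-of-the-envelope sketch, assuming the "3GB" figure means 3 x 1000 MB. The helper names are illustrative only.

```python
def throughput_mb_per_s(size_gb, seconds):
    """Average transfer rate in MB/s, taking 1 GB = 1000 MB."""
    return size_gb * 1000.0 / seconds


def to_mbit_per_s(mb_per_s):
    """Convert megabytes per second to megabits per second."""
    return mb_per_s * 8.0


centos6_rate = throughput_mb_per_s(3, 50)    # CentOS 6.6 server
centos7_rate = throughput_mb_per_s(3, 270)   # CentOS 7 server
slowdown = centos6_rate / centos7_rate       # ratio of the two rates

print("CentOS 6.6: %.1f MB/s (%.0f Mbit/s)" % (centos6_rate, to_mbit_per_s(centos6_rate)))
print("CentOS 7:   %.1f MB/s (%.0f Mbit/s)" % (centos7_rate, to_mbit_per_s(centos7_rate)))
print("Slowdown:   %.1fx" % slowdown)
```

The CentOS 7 figure works out to a little under 90 Mbit/s, which is close to what a 100 Mbit/s link can carry. That is only an observation from the arithmetic, not something the post establishes, but it may make the negotiated link speed worth checking.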
Re: [CentOS] CentOS 5.8 crash/freeze running VMware
On 07/06/2012 11:17 AM, Johnny Hughes wrote:
> On 06/29/2012 09:52 AM, Michael Eager wrote:
>> On 06/28/2012 06:33 PM, Ted Miller wrote:
>>> On 06/28/2012 12:45 PM, Michael Eager wrote:
>>>> Hi --
>>>>
>>>> I have a server running CentOS 5.8. It has a 6-core AMD processor, 16GB memory, and a RAID 5 file system. It serves as both a file server and to run several VMware virtual machines. The guest machines run Windows 7 and various versions of Linux. The system is running the latest version of VMware Workstation.
>>>>
>>>> Until recently, I started VMs using the VMware Workstation GUI. The system has been very stable and seldom crashes.
>>>>
>>>> Recently, I set up an init script to start several VMs at boot time using the vmrun command. This appeared to work correctly, but the system has become unstable, freezing at various times. When the system freezes, there is no console response and it does not respond to a ping. There is nothing in syslog to indicate any error.
>>>>
>>>> The script started 8 VMs. I've cut back to now running 4 VMs and the system appears stable.
>>>>
>>>> Is there some relation between the number of cores and the number of VMs one can run? Is there something else which might cause the system to crash when running multiple VMs? Any suggestions to identify why the system crashed?
>>>
>>> Are you staggering the startups of the VMs? The server may be choking trying to boot 8 machines at once. I suggest starting a VM every 30-60 seconds, so that you aren't trying to boot all 8 at once. Don't know if it will help, but it might.
>>
>> The crashes happen long after boot time when all of the VMs are running.
>>
>> (Actually, startup goes very smoothly, with the VMs starting in parallel in the background while system boot completes.)
>
> This sounds like the issue with the machine running out of memory and the Out of Memory killer actually killing one of the VMware instances.
>
> My experience with this on a very good machine was that there was enough memory, but it was timing that was causing the issue. The machine did not respond quickly enough to the memory request and the OOM Killer then acted. How I solved my problem was to reserve more memory as unused with this memory variable:
>
> I have had issues with VMware host server and running out of memory, maybe try setting this variable in sysctl.conf:
>
> vm.min_free_kbytes=65536
>
> (that will maintain 64MB of free RAM and should allow for enough time to prevent OOM kills)

I'll give that a try.

But the problem was not that one or more VMware instances was killed and other processes continued, but that the system hung. Nothing was running.

--
Michael Eager    ea...@eagercon.com
1960 Park Blvd., Palo Alto, CA 94306  650-325-8077
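To put the suggested vm.min_free_kbytes=65536 in proportion, the sketch below compares a reserve value against the machine's total memory. It is not from the thread. It assumes `/proc/meminfo`-style text as input, and the function names are illustrative.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: integer value}.

    Most fields are reported in kB; a few (such as HugePages_Total) are plain counts.
    """
    info = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def reserve_fraction(min_free_kb, meminfo):
    """Fraction of total RAM held back by a given vm.min_free_kbytes value."""
    return float(min_free_kb) / meminfo["MemTotal"]


# Hypothetical sample for a 16GB host like the one described in the thread.
sample = "MemTotal:       16777216 kB\nMemFree:          204800 kB\n"
fraction = reserve_fraction(65536, parse_meminfo(sample))
print("Reserve is %.2f%% of RAM" % (fraction * 100))
```

On a real host, the input would come from reading `/proc/meminfo`, and the current setting can be read from `/proc/sys/vm/min_free_kbytes`.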
[CentOS] System crash -- no clue why
Hi --

My CentOS 5.8 server crashed, leaving no clue why.

The last entry in /var/log/messages is a dhcpd notice around 4:00am, followed by the restart message when I rebooted.

The only clue that I have is that the fan was running full speed when I restarted it. The fan then slowed to normal speed.

Any ideas what I can do to find out the cause?

--
Michael Eager    ea...@eagercon.com
1960 Park Blvd., Palo Alto, CA 94306  650-325-8077
Re: [CentOS] CentOS 5.8 crash/freeze running VMware
On 06/28/2012 06:33 PM, Ted Miller wrote:
> On 06/28/2012 12:45 PM, Michael Eager wrote:
>> Hi --
>>
>> I have a server running CentOS 5.8. It has a 6-core AMD processor, 16GB memory, and a RAID 5 file system. It serves as both a file server and to run several VMware virtual machines. The guest machines run Windows 7 and various versions of Linux. The system is running the latest version of VMware Workstation.
>>
>> Until recently, I started VMs using the VMware Workstation GUI. The system has been very stable and seldom crashes.
>>
>> Recently, I set up an init script to start several VMs at boot time using the vmrun command. This appeared to work correctly, but the system has become unstable, freezing at various times. When the system freezes, there is no console response and it does not respond to a ping. There is nothing in syslog to indicate any error.
>>
>> The script started 8 VMs. I've cut back to now running 4 VMs and the system appears stable.
>>
>> Is there some relation between the number of cores and the number of VMs one can run? Is there something else which might cause the system to crash when running multiple VMs? Any suggestions to identify why the system crashed?
>
> Are you staggering the startups of the VMs? The server may be choking trying to boot 8 machines at once. I suggest starting a VM every 30-60 seconds, so that you aren't trying to boot all 8 at once. Don't know if it will help, but it might.

The crashes happen long after boot time when all of the VMs are running.

(Actually, startup goes very smoothly, with the VMs starting in parallel in the background while system boot completes.)

--
Michael Eager    ea...@eagercon.com
1960 Park Blvd., Palo Alto, CA 94306  650-325-8077
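For reference, Ted's staggering suggestion could be sketched as below. This is an illustration only, not the init script from the thread. It assumes `vmrun` is on the PATH and that the `-T ws start <vmx> nogui` form applies to the installed VMware Workstation. The `run` and `sleep` parameters exist so the logic can be exercised without starting anything.

```python
import subprocess
import time


def staggered_start(vmx_paths, delay_s=45, run=subprocess.call, sleep=time.sleep):
    """Start each VM in turn, pausing delay_s seconds between launches.

    Returns the list of commands that were issued, in order.
    """
    commands = []
    for index, vmx in enumerate(vmx_paths):
        if index > 0:
            sleep(delay_s)  # wait before every launch except the first
        cmd = ["vmrun", "-T", "ws", "start", vmx, "nogui"]
        run(cmd)
        commands.append(cmd)
    return commands
```

As the reply above notes, the hangs in this case occurred long after all VMs were up, so staggering addresses boot-time contention rather than the reported symptom.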
[CentOS] CentOS 5.8 crash/freeze running VMware
Hi --

I have a server running CentOS 5.8. It has a 6-core AMD processor, 16GB memory, and a RAID 5 file system. It serves as both a file server and to run several VMware virtual machines. The guest machines run Windows 7 and various versions of Linux. The system is running the latest version of VMware Workstation.

Until recently, I started VMs using the VMware Workstation GUI. The system has been very stable and seldom crashes.

Recently, I set up an init script to start several VMs at boot time using the vmrun command. This appeared to work correctly, but the system has become unstable, freezing at various times. When the system freezes, there is no console response and it does not respond to a ping. There is nothing in syslog to indicate any error.

The script started 8 VMs. I've cut back to now running 4 VMs and the system appears stable.

Is there some relation between the number of cores and the number of VMs one can run? Is there something else which might cause the system to crash when running multiple VMs? Any suggestions to identify why the system crashed?

--
Michael Eager    ea...@eagercon.com
1960 Park Blvd., Palo Alto, CA 94306  650-325-8077
Re: [CentOS] Server hangs on CentOS 5.5
Simon Matter wrote:
>> One fan is listed as 0 rpm. Something to look into.
>
> Hmm, much has been said now in this thread and I know how difficult it can be to find such an issue. However, I suggest not to throw in too many new tools in parallel. And, be careful of how to interpret any information gathered by tools like lm_sensors. They can only report as well as the mainboard and its sensors were designed and built; both can be suboptimal. I've seen all kinds of things like temp sensors not mounted where they should be. Of course, builtin sensors like those of a CPU should be taken very seriously.

Thanks for the suggestions.

> So, may I give some more tips on how I'd try to find what is wrong:
>
> - Take a vacuum cleaner and *carefully* clean the whole box. Dust can really do bad things because it is not a perfect insulator.
> - If you feel you have to remove any device like CPU, make sure you up everything, have a good quality heat sink paste at hand and make sure everything is seated well after mounting it again.
> - For the memory part, do you have ECC? If not, then it is really a problem, and if the box is used as a server, ECC is a must. If yes, then most errors will be corrected by ECC but, what is more important, memory errors are usually logged. You should be able to find a list of those errors in the BIOS; you may see how many times errors occur and where. Does something like that exist?

The MB docs/website don't mention ECC support, but I presume it is, as part of the DDR2 spec. I'll check whether the memory has ECC. If not, this is a reasonable upgrade.

> - For the temperatures, 87C is not so uncommon, but yes, it looks a little bit high. Someone else posted 80C to be the max for your CPU, and that seems correct; at least our 12-core Opterons have Caution: 75C; Critical: 80C, but they usually run at 45C-55C under normal load. So if 87C is really correct, under normal load, that may already be too much, and then consider what happens at peak times?

The most recent crash was overnight and not discovered until morning. Probably not related to load. But if it really is running over temp, then almost anything can happen.

> - When you look at the lm_sensors values, do they correspond with what is shown in the BIOS (if it has this kind of diagnostics)?

Something I'll check when the system is taken down.

--
Michael Eager    ea...@eagercon.com
1960 Park Blvd., Palo Alto, CA 94306  650-325-8077
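The thresholds quoted above can be turned into a small check for logged readings. The sketch below is illustrative and not from the thread. It assumes readings in the hwmon sysfs format (`temp*_input`, in millidegrees Celsius), and it reuses the 75C/80C limits that Simon quotes for his own 12-core Opterons, which may not match this server's CPU.

```python
CAUTION_C = 75.0   # "Caution" limit quoted for the 12-core Opterons
CRITICAL_C = 80.0  # "Critical" limit quoted for the same CPUs


def millideg_to_celsius(raw):
    """Convert a hwmon temp*_input string (millidegrees C) to degrees C."""
    return int(raw.strip()) / 1000.0


def classify(temp_c, caution=CAUTION_C, critical=CRITICAL_C):
    """Label a reading as 'ok', 'caution' or 'critical'."""
    if temp_c >= critical:
        return "critical"
    if temp_c >= caution:
        return "caution"
    return "ok"


# The 87C figure discussed above, written as a hwmon-style reading.
reading = millideg_to_celsius("87000\n")
print("%.1fC -> %s" % (reading, classify(reading)))
```

Logging the classified value with a timestamp at regular intervals would show whether temperatures were climbing before a hang, which is the evidence the thread says is missing.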
Re: [CentOS] Server hangs on CentOS 5.5
Alexander Arlt wrote:
> On 03/10/2011 11:04 AM, Simon Matter wrote:
>> - Take a vacuum cleaner and *carefully* clean the whole box. Dust can really do bad things because it is not a perfect insulator.
>
> Never ever do that. Especially not inside the machine. There is a real risk of simply vacuuming smaller components like SMD resistors off the board. And, as already mentioned, you also have the chance of killing components by electrostatic discharge.
>
> Always use compressed air, even if just using the canned kind. Vacuuming is pretty bad advice.

Previous cleanings have been with canned compressed air. Thanks for the caution about vacuums and static.

I may use the vacuum on the case fans from the outside. The case should provide an adequate static shield.

--
Michael Eager    ea...@eagercon.com
1960 Park Blvd., Palo Alto, CA 94306  650-325-8077
Re: [CentOS] Server hangs on CentOS 5.5
Dr. Ed Morbius wrote:
> on 09:24 Tue 08 Mar, Michael Eager (ea...@eagerm.com) wrote:
>> Hi --
>>
>> I'm running a server which is usually stable, but every once in a while it hangs. The server is used as a file store using NFS and to run VMware machines.
>>
>> I don't see anything in /var/log/messages or elsewhere to indicate any problem or offer any clue why the system was hung.
>>
>> Any suggestions where I might look for a clue?
>
> I'd very strongly recommend you configure netconsole. Though not entirely clear from the name, it's actually an in-kernel network logging module, which is very useful for kicking out kernel panics which otherwise aren't logged to disk and can't be seen on a (nonresponsive) monitor.

I'll take a look at netconsole.

> Alternately, a serial console which actually retains all output sent to it (some remote access systems support this, some don't) may help.
>
> Barring that, I'd start looking at individual HW components, starting with RAM.

The problem with randomly replacing various components, other than the downtime and nuisance, is that there's no way to know that the change actually fixed any problem. When the base rate is one unknown system hang every few weeks, how many weeks should I wait without a failure to conclude that the replaced component was the cause? A failure which happens infrequently isn't really amenable to a random diagnostic approach.

--
Michael Eager    ea...@eagercon.com
1960 Park Blvd., Palo Alto, CA 94306  650-325-8077
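The "how many weeks should I wait" question can be given a rough number under one simplifying assumption that is not from the thread: that hangs arrive independently at a constant average rate (a Poisson process). The probability of seeing no hang in t weeks on a still-broken system is then exp(-t / m), where m is the mean number of weeks between hangs. The function names below are illustrative.

```python
import math


def prob_no_failure(weeks_observed, mean_weeks_between_failures):
    """Chance that a still-faulty system shows no hang for the given period."""
    return math.exp(-float(weeks_observed) / mean_weeks_between_failures)


def weeks_needed(mean_weeks_between_failures, confidence=0.95):
    """Quiet weeks needed before a clean run is unlikely to be luck."""
    return -mean_weeks_between_failures * math.log(1.0 - confidence)


# Assuming "every few weeks" means roughly one hang per 3 weeks.
print("Chance of 3 quiet weeks by luck: %.0f%%" % (100 * prob_no_failure(3, 3)))
print("Weeks to wait for 95%% confidence: %.1f" % weeks_needed(3))
```

Under these assumptions, each component swap would need roughly three times the mean interval between hangs before a quiet period means much, which supports the point that swapping parts one at a time is a slow way to diagnose an infrequent failure.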
Re: [CentOS] Server hangs on CentOS 5.5
John Hodrien wrote:
> On Wed, 9 Mar 2011, Michael Eager wrote:
>> The problem with randomly replacing various components, other than the downtime and nuisance, is that there's no way to know that the change actually fixed any problem. When the base rate is one unknown system hang every few weeks, how many weeks should I wait without a failure to conclude that the replaced component was the cause? A failure which happens infrequently isn't really amenable to a random diagnostic approach.
>
> So you pitch the whole thing over to being a test rig, and buy all new hardware?

I'll repeat from my original post:

> I don't see anything in /var/log/messages or elsewhere to indicate any problem or offer any clue why the system was hung.
>
> Any suggestions where I might look for a clue?

I'm looking for diagnostics to focus on the cause of the crash. My thanks for the several suggestions in this area.

I'm not particularly interested in a listing of the myriad of hypothetical causes absent observable evidence, some of which are contradicted by evidence (such as overheating).

I've encountered my share of bad power supplies, bad RAM, poorly seated cards, etc. I've replaced failing capacitors in monitors (never on a motherboard). I've replaced video cards, hard drives, bad cables. And so forth. Each of these had characteristics which pointed to the problem: kernel oops, POST failures, flickering screens, etc. The problem I have is that there is a lack of diagnostic information to focus on the cause of the server failure.

I don't mean to appear unappreciative, but suggestions which amount to spending many hours making a series of unfocused modifications to the server, hoping that one of these random alterations fixes an infrequent problem, don't strike me as useful. At the other extreme, the suggestions that I not look for the cause of the system failure and instead replace the server with one or three servers don't seem to be a useful diagnostic approach either.

During the next server downtime, I'll re-seat RAM and cables, check for excess dust, and do normal maintenance as folks have suggested. I might also run a memory diag. I'll also look at the several excellent and appreciated suggestions (some of which I've already installed) on how to get a better picture on the state of the server when/if there is a future failure.

Thanks all!

--
Michael Eager    ea...@eagercon.com
1960 Park Blvd., Palo Alto, CA 94306  650-325-8077
Re: [CentOS] Server hangs on CentOS 5.5
m.r...@5-cent.us wrote:
> Michael Eager wrote:
>> John Hodrien wrote:
>>> On Wed, 9 Mar 2011, Michael Eager wrote:
>>>> The problem with randomly replacing various components, other than the downtime and nuisance, is that there's no way to know that the change actually fixed any problem. When the base rate is one unknown system hang every few weeks, how many weeks should I wait without a failure to conclude that the replaced component was the cause? A failure which happens infrequently isn't really amenable to a random diagnostic approach.
>>>
>>> So you pitch the whole thing over to being a test rig, and buy all new hardware?
>>
>> I'll repeat from my original post:
>>
>> I don't see anything in /var/log/messages or elsewhere to indicate any problem or offer any clue why the system was hung.
>>
>> Any suggestions where I might look for a clue?
>>
>> I'm looking for diagnostics to focus on the cause of the crash. My thanks for the several suggestions in this area.
>>
>> I'm not particularly interested in a listing of the myriad of hypothetical causes absent observable evidence, some of which are contradicted by evidence (such as overheating).
> snip
>
> Here's one more, off-the-wall thought: do the setterm --powersave off, and find some way to make it work, so that you can see what's on the screen when it dies.

Yes, I did this. Switched to console screen. The correct command is setterm -powersave off -blank off, otherwise the screen gets blanked. Turned the monitor off. I hope it shows something useful on the next fault.

> What may be very important here is I recently had a problem with a honkin' big server crashing... and it turned out that a user was running a parallel processing job that kicked off three? four? dozen threads, and towards the end of the job, every single thread wanted 10G... on a system with 256G RAM (which size still boggles my mind). The OOM-Killer didn't even have a chance to do its thing. Yes, he's limited what his job requests, and the system hasn't crashed since.

Strange. OOM-Killer should get priority. That's what it's for. Although it usually seems to kill the innocent bystanders before it gets around to killing the offenders.

--
Michael Eager    ea...@eagercon.com
1960 Park Blvd., Palo Alto, CA 94306  650-325-8077
Re: [CentOS] Server hangs on CentOS 5.5
Les Mikesell wrote:
> Note that overheating can be localized or a bad heat sink mounting or fan on a CPU.

I'll re-seat the CPU, heatsink, and fan on the next downtime.

Heat related problems usually present as a system which fails and will not reboot immediately, but will after they sit for a while to cool down. This system doesn't do that.

I'll install sensord to log CPU temps in case this is a problem.

> There's not really a good way to approach intermittent failures. It may only break when you aren't looking. Major component swaps or taking it offline for extended diagnostics hoping to catch a glimpse of the cause when it fails is about all you can do.
>
>> During the next server downtime, I'll re-seat RAM and cables, check for excess dust, and do normal maintenance as folks have suggested. I might also run a memory diag. I'll also look at the several excellent and appreciated suggestions (some of which I've already installed) on how to get a better picture on the state of the server when/if there is a future failure.
>
> Memory diagnostics may take days to catch a problem. Did you check for a newer BIOS for your MB? I mentioned before that it seemed strange, but I've seen that fix mysterious problems even after the machines had previously been reliable for a long time (and even more oddly, all the machines in the lot weren't affected).

Yes, most memory diagnostics are not very effective.

I'll have to stop the server to find out what the installed BIOS version is and see whether there is an update. Most BIOS updates appear to only change supported CPUs. Something else for the next downtime.

--
Michael Eager    ea...@eagercon.com
1960 Park Blvd., Palo Alto, CA 94306  650-325-8077
Re: [CentOS] Server hangs on CentOS 5.5
compdoc wrote:
>> I'll re-seat the CPU, heatsink, and fan on the next downtime.
>
> Is the CPU overheating? Pointless to reseat the CPU or even remove the heatsink, if not.

No evidence to suggest that it is.

--
Michael Eager    ea...@eagercon.com
1960 Park Blvd., Palo Alto, CA 94306  650-325-8077
Re: [CentOS] Server hangs on CentOS 5.5
m.r...@5-cent.us wrote:
> Michael Eager wrote:
> snip
>> I'll have to stop the server to find out what the installed BIOS version is and see whether there is an update. Most BIOS updates appear to only change supported CPUs. Something else for the next downtime.
>
> Nope: dmidecode, or lshw, is your friend.

Thanks. Looks like there might be a newer BIOS available, although the vendor identifies it as 'beta'.

--
Michael Eager    ea...@eagercon.com
1960 Park Blvd., Palo Alto, CA 94306  650-325-8077
Re: [CentOS] Server hangs on CentOS 5.5
Dr. Ed Morbius wrote:
> If the issue is repeated but rare system failures on one of a set of similarly configured hosts, I'd RMA the box and get a replacement. End of story.

I'll repeat: this is a house-made system. There's no vendor to RMA to.

It seems obvious to me: RMA is not a diagnostic tool.

> If you'd post details of the host, more logging information, netconsole panic logs, etc., it might be possible to narrow down possible causes.

The problem is that there are NO DIAGNOSTICS generated when the system hangs. There's no panic and nothing in the logs which indicates any problem. This is what I indicated from the get go.

> With what you've posted to date, it's not.

I could waste my time posting logs for you to tell me that they don't point to any problem. I'd rather skip that step.

--
Michael Eager    ea...@eagercon.com
1960 Park Blvd., Palo Alto, CA 94306  650-325-8077
Re: [CentOS] Server hangs on CentOS 5.5
m.r...@5-cent.us wrote: Michael Eager wrote: compdoc wrote: I'll re-seat the CPU, heatsink, and fan on the next downtime. Is the CPU overheating? Pointless to reseat the cpu or even remove the heatsink, if not. No evidence to suggest that it is. Have you used ipmitool to see what the temperatures are? No, I'm not familiar with ipmitool. I just installed it and the man page will take some time to read. It looks like it does everything and then more. According to the man page, it apparently needs a kernel driver named OpenIPMI, which it claims is installed in standard distributions. I don't find it on my system. Running ipmitool sdr type Temperature results in an error message saying that it could not open /dev/ipmi0, etc. -- Michael Eager ea...@eagercon.com 1960 Park Blvd., Palo Alto, CA 94306 650-325-8077
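The "could not open /dev/ipmi0" error usually means the IPMI kernel modules are not loaded. A hedged sketch of loading them before querying temperatures is below; ipmi_temps is a made-up helper name, and on a desktop-class board with no BMC the module load is expected to fail, in which case ipmitool cannot help.

```shell
# Load the IPMI kernel modules, then read the temperature sensors.
# ipmi_si talks to the BMC hardware; ipmi_devintf provides /dev/ipmi0.
# ipmi_temps is a hypothetical helper name.
ipmi_temps() {
    modprobe ipmi_si && modprobe ipmi_devintf || {
        echo "IPMI modules did not load; this board may have no BMC" >&2
        return 1
    }
    ipmitool sdr type Temperature
}

# Usage (as root): ipmi_temps
```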
Re: [CentOS] Server hangs on CentOS 5.5
Rudi Ahlers wrote: On Thu, Mar 10, 2011 at 12:31 AM, Michael Eager ea...@eagerm.com wrote: Dr. Ed Morbius wrote: If the issue is repeated but rare system failures on one of a set of similarly configured hosts, I'd RMA the box and get a replacement. End of story. I'll repeat: this is a house-made system. There's no vendor to RMA to. I don't know where you are, but in our country we can RMA anything and everything. Apart from CPUs. So, even a cheap desktop mobo could be RMA'd, as long as I can prove to the suppliers it's faulty, and it's within the warranty period. I responded to Dr. Morbius' suggestion that I RMA the box. There is no vendor to RMA the box to. If I knew that it was a motherboard problem, I could RMA it. Or disk, or PSU, or network card, or whatever. But, as I've mentioned, there's no indication what causes the system to hang. There is no way at this point to prove that it is a defective motherboard. -- Michael Eager ea...@eagercon.com 1960 Park Blvd., Palo Alto, CA 94306 650-325-8077
Re: [CentOS] Server hangs on CentOS 5.5
compdoc wrote: According to the man page, it apparently needs a kernel driver named OpenIPMI, which it claims is installed in standard distributions. I don't find it on my system. lm_sensors is another, and I think installs ready to use from the repos. sensors says that the three temp sensors read +36C, +39C, and +87C. These appear to be AMD K10 temp sensors, although I might be misreading sensors-detect. Low/highs are (+127/+127, +127/+90, +127/+127) respectively. (I'm not sure if these are alarm set points or something else.) One fan is listed as 0 rpm. Something to look into. -- Michael Eager ea...@eagercon.com 1960 Park Blvd., Palo Alto, CA 94306 650-325-8077
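The readings above (one sensor at +87C, one fan at 0 RPM) are the kind of thing a small filter over the sensors output can flag automatically. This is only a sketch: flag_sensors is a made-up name, the 80-degree limit in the usage line is an arbitrary assumption, and it assumes the usual "label: value" layout that sensors prints.

```shell
# Read `sensors` output on stdin and flag hot temperatures and stopped fans.
# flag_sensors is a hypothetical helper; $1 is the temperature limit in degrees C.
flag_sensors() {
    awk -F: -v limit="$1" '
        NF >= 2 {
            split($2, f, " ")             # f[1] is the reading, f[2] the unit for fans
            v = f[1]
            gsub(/[^0-9.-]/, "", v)       # strip the sign, degree mark, and unit
            if (f[1] ~ /^[+-]?[0-9.]+.*C$/ && v + 0 >= limit) print "HOT: " $1 " " v
            if (f[2] == "RPM" && v + 0 == 0)                  print "STOPPED: " $1
        }'
}

# Usage: sensors | flag_sensors 80
```

A flagged value is not proof of a fault: as noted in the thread, a sensor may be unconnected or misread, so the numbers are worth comparing against what the BIOS setup screen reports.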
Re: [CentOS] Server hangs on CentOS 5.5
Rudi Ahlers wrote: As far as I can see you were given a bucket load of advice, which you haven't even bothered to follow yet. You're the only one who could actually do anything about the problem. I have followed quite a bit of the advice, which I have appreciated and noted. I've set up the monitor so that it will not be blanked on a crash, installed monitoring software, and checked a number of conditions which people have suggested. No, I have not responded to the philosophical discussions about vendor management, nor to the suggestions to RMA something to somebody for unknown reasons. No, I'm not going to replace RAM or capacitors here and there on the off chance that something might be bad. (But I will look for capacitors which show signs of bulging or leaking.) No amount of suggestions made on this list will fix the problem for you. You need to actually take apart the server and see what's going on. I wasn't interested in anyone fixing the server for me. I did ask for suggestions on how to improve the diagnostics for the problem, which several people have responded to. Again, I appreciate their suggestions greatly. As I've said, I have a list of things to check when the server is next taken down. -- Michael Eager ea...@eagercon.com 1960 Park Blvd., Palo Alto, CA 94306 650-325-8077
Re: [CentOS] Server hangs on CentOS 5.5
compdoc wrote: Err, that should read 128C -Original Message- From: centos-boun...@centos.org [mailto:centos-boun...@centos.org] On Behalf Of compdoc Sent: Wednesday, March 09, 2011 4:50 PM To: 'CentOS mailing list' Subject: Re: [CentOS] Server hangs on CentOS 5.5 +36C and +39C are likely your cpu and motherboard temps. You have to look at the temps in the cmos and match them. The +87C is likely just a misreading by lm_sensors. Anything running that hot won't be stable. I use AMD as well, and lm_sensors tells me something is 1280C. I'll compare the values from lm_sensors with the bios temps to see if they are in line. 1280C is about the melting point of iron. Wow! -- Michael Eager ea...@eagercon.com 1960 Park Blvd., Palo Alto, CA 94306 650-325-8077
Re: [CentOS] Server hangs on CentOS 5.5
Dr. Ed Morbius wrote: You're NOT obliged to repeat information you've already posted (e.g.: home-brew system), but it's helpful to front-load data rather than have us tease it out of you. No intention to have anyone tease information out of me. The subject line says that the system is CentOS 5.5. The other info has been forthcoming, as much as I have been able to provide. Sorry it wasn't all at the same time -- I didn't think that saying the server was not a Dell or HP box was important. With what you've posted to date, it's not. I could waste my time posting logs for you to tell me that they don't point to any problem. I'd rather skip that step. Krell forfend you should post relevant and useful information which might be useful in actually diagnosing your problem (or pointing to likely candidates and/or further tests). The logs are uninformative. No messages for hours before the crash. Thanks for the help. -- Michael Eager ea...@eagercon.com 1960 Park Blvd., Palo Alto, CA 94306 650-325-8077
[CentOS] Server hangs on CentOS 5.5
Hi -- I'm running a server which is usually stable, but every once in a while it hangs. The server is used as a file store using NFS and to run VMware machines. I don't see anything in /var/log/messages or elsewhere to indicate any problem or offer any clue why the system was hung. Any suggestions where I might look for a clue? -- Michael Eager ea...@eagercon.com 1960 Park Blvd., Palo Alto, CA 94306 650-325-8077
Re: [CentOS] Server hangs on CentOS 5.5
compdoc wrote: I'm running a server which is usually stable, but every once in a while it hangs. There can be many reasons for that. One thing I'm curious about - try looking at the reallocated sector count, and current pending sector count for your drives with smartctl. Thanks for the suggestions. All disks show zero realloc sectors and pending sectors. Smartctl says no failures. Also, max temp was 48 C or less. -- Michael Eager ea...@eagercon.com 1960 Park Blvd., Palo Alto, CA 94306 650-325-8077
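The two counters mentioned above can be pulled out of the smartctl attribute table with a short filter. A sketch, assuming the standard ATA attribute table printed by `smartctl -A`, where the attribute name is the second column and the raw value the tenth; smart_sector_counts is a made-up name and /dev/sda is a placeholder device.

```shell
# Read `smartctl -A` output on stdin and print the raw values of the
# reallocated and pending sector counters.
# smart_sector_counts is a hypothetical helper name.
smart_sector_counts() {
    awk '$2 == "Reallocated_Sector_Ct" || $2 == "Current_Pending_Sector" { print $2 "=" $10 }'
}

# Usage (as root): smartctl -A /dev/sda | smart_sector_counts
```

Non-zero raw values in either counter would point toward a failing disk; zeros, as reported here, make the disks a less likely suspect.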
Re: [CentOS] Server hangs on CentOS 5.5
Brian Mathis wrote: On Tue, Mar 8, 2011 at 12:24 PM, Michael Eager ea...@eagerm.com wrote: Hi -- I'm running a server which is usually stable, but every once in a while it hangs. The server is used as a file store using NFS and to run VMware machines. I don't see anything in /var/log/messages or elsewhere to indicate any problem or offer any clue why the system was hung. Any suggestions where I might look for a clue? Please be more specific when you say it hangs. Does it just pause for a minute and then continue working, or does it freeze completely until you reboot it? Does it respond to a soft reboot like Ctrl-Alt-Del, or do you need to hard power it off? System is unresponsive. Monitor blank, no response to keyboard, no response to remote ssh. Hit reset to reboot. The only indication that I had that there was a problem (other than attached systems were not accessing files) was that the fan(s) on the server were louder than normal. Since this is an NFS server I'm going to guess there might be a lot of IO. Maybe there is some large IO load going on, like maybe all your VMs are running anti-virus scan at the same time, or something like that. At the time, should be very low NFS load. To troubleshoot, I recommend installing the 'sar' utilities (yum install sysstat) and then reviewing the collected data using the 'ksar' utility (http://sourceforge.net/projects/ksar/). sar/ksar are good for tracking down acute problems. Thanks for the suggestion. I'll look into sar. -- Michael Eager ea...@eagercon.com 1960 Park Blvd., Palo Alto, CA 94306 650-325-8077
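Once sysstat has been collecting for a while, the interval just before a hang can be reviewed from the daily data file. A rough sketch, assuming the default CentOS location /var/log/sa/saDD; sar_window is a made-up name, and the file and times in the usage line are placeholders, not values from this thread.

```shell
# Show CPU (-u), load/run queue (-q), memory (-r), and I/O (-b) reports
# for one time window from a daily sysstat data file.
# sar_window is a hypothetical helper name.
sar_window() {
    # $1: path to a daily sa file, $2: start HH:MM:SS, $3: end HH:MM:SS
    for report in -u -q -r -b; do
        sar "$report" -f "$1" -s "$2" -e "$3"
    done
}

# Usage: sar_window /var/log/sa/sa08 11:00:00 12:30:00
```

The default collection interval is ten minutes, so a sudden hang may leave no trace; a gradual climb in load or memory use before the last sample would still be visible.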
Re: [CentOS] Server hangs on CentOS 5.5
Les Mikesell wrote: On 3/8/2011 11:24 AM, Michael Eager wrote: Hi -- I'm running a server which is usually stable, but every once in a while it hangs. The server is used as a file store using NFS and to run VMware machines. I don't see anything in /var/log/messages or elsewhere to indicate any problem or offer any clue why the system was hung. Any suggestions where I might look for a clue? Probably something hardware related. Bad memory, overheating, power supply, etc. I've even seen some rare cases where a bios update would fix it although it didn't make much sense for a machine to run for years, then need a firmware change. The system is on a UPS and temps seem reasonable. Locating a transient memory problem is time consuming. Identifying a power supply which sometimes spikes is even more difficult. I'd like to have a clue about the likely problem before shutting down the server for an extended period. I'll set up sar and sensord to periodically log system status and see if this gives me a clue for the next time this happens. -- Michael Eager ea...@eagercon.com 1960 Park Blvd., Palo Alto, CA 94306 650-325-8077
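One simple way to do the periodic logging described above is a small snapshot function run from cron. This is a sketch under assumptions, not the setup actually used: log_snapshot and the log path in the usage line are made up, and sensord or sar would serve the same purpose. The sync at the end is there so the last snapshot reaches the disk before a hard hang.

```shell
# Append a timestamped snapshot of load and sensor readings to a log file.
# log_snapshot is a hypothetical helper name.
log_snapshot() {
    # $1: log file to append to
    {
        date '+%Y-%m-%d %H:%M:%S'
        uptime
        sensors 2>/dev/null
    } >> "$1"
    sync   # flush buffers so the newest entry survives a reset
}

# Usage, for example once a minute from root's crontab:
#   log_snapshot /var/log/hang-watch.log
```

After the next hang, the final entries show what the temperatures, fan speeds, and load looked like in the last minute the system was still running.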
Re: [CentOS] Server hangs on CentOS 5.5
compdoc wrote: The only indication that I had that there was a problem (other than attached systems were not accessing files) was that the fan(s) on the server were louder than normal. Are you saying the fans were running faster than normal while it was hung? Or are they louder than usual even while it's running? They were louder than normal when hung, but returned to being quiet after the reboot. Fans making noise can mean the fan isn't spinning as fast as it should because the bearing is failing. Be a good time to open the case to check to see that all fans are working... Good idea. -- Michael Eager ea...@eagercon.com 1960 Park Blvd., Palo Alto, CA 94306 650-325-8077
Re: [CentOS] Server hangs on CentOS 5.5
m.r...@5-cent.us wrote: Michael Eager wrote: Brian Mathis wrote: On Tue, Mar 8, 2011 at 12:24 PM, Michael Eager ea...@eagerm.com wrote: Hi -- I'm running a server which is usually stable, but every once in a while it hangs. The server is used as a file store using NFS and to run VMware machines. I don't see anything in /var/log/messages or elsewhere to indicate any problem or offer any clue why the system was hung. Any suggestions where I might look for a clue? snip System is unresponsive. Monitor blank, no response to keyboard, no response to remote ssh. Hit reset to reboot. Suggestion 1: -from the console-, run setterm --powersave off. That way, even if you connect a monitor (in our, uh, computer labs, we have a monitor-on-a-stick), you'll still see what's on the screen at the end, not the power save blanking. I get a message "cannot (un)set powersave mode". I'll add this to .xinitrc. The only indication that I had that there was a problem (other than attached systems were not accessing files) was that the fan(s) on the server were louder than normal. Um. Um. What make is the server? We had that on some new Suns, where after working on them, the fans would spin up and *not* spin down to normal. The answer to that was, after powering them down, pull all the plugs, and leave them out for 20 sec or so. House-built, Gigabyte MB, AMD Phenom II X6, 6Gb RAM. -- Michael Eager ea...@eagercon.com 1960 Park Blvd., Palo Alto, CA 94306 650-325-8077
Re: [CentOS] Server hangs on CentOS 5.5
Scott Silva wrote: Did you try the obvious stuff for older equipment? Remove and reseat ALL cards and memory, several times, to clean off any oxidation from contacts. Blow out any dust and collected lint. Reseat drive cables. Not yet, but that's always a good idea. -- Michael Eager ea...@eagercon.com 1960 Park Blvd., Palo Alto, CA 94306 650-325-8077
Re: [CentOS] Server hangs on CentOS 5.5
Michael Eager wrote: m.r...@5-cent.us wrote: Suggestion 1: -from the console-, run setterm --powersave off. That way, even if you connect a monitor (in our, uh, computer labs, we have a monitor-on-a-stick), you'll still see what's on the screen at the end, not the power save blanking. I get a message "cannot (un)set powersave mode". I'll add this to .xinitrc. Or better, CTRL-ALT-F1 to switch to the text (virtual) console and run setterm -powersave off. -- Michael Eager ea...@eagercon.com 1960 Park Blvd., Palo Alto, CA 94306 650-325-8077
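The "cannot (un)set powersave mode" message appears when setterm is run from a terminal that is not a Linux text console, such as an X terminal or an ssh session. A sketch of applying the setting to a specific console device instead, assuming the util-linux setterm with the single-dash option spelling used in this thread; keep_console_visible is a made-up name and /dev/tty1 is the usual first virtual console.

```shell
# Turn off screen blanking and monitor power saving on a text console,
# so the last kernel messages stay visible after a hang.
# keep_console_visible is a hypothetical helper name.
keep_console_visible() {
    # $1: console device, e.g. /dev/tty1
    setterm -blank 0 -powersave off < "$1" > "$1"
}

# Usage (as root): keep_console_visible /dev/tty1
```

The setting does not survive a reboot, so it would need to be reapplied at startup (for example from rc.local) to be in effect at the next hang.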
Re: [CentOS] Server hangs on CentOS 5.5
m.r...@5-cent.us wrote: Michael Eager wrote: House-built, Gigabyte MB, AMD Phenom II X6, 6Gb RAM. Any chance the problem's with the video card? Video is on the MB. It doesn't seem likely that it's the video, since the system doesn't respond to network when it crashes. It could be anything. That's why I'm looking for something that would give me a bit of a hint what to look at. With an infrequent failure, it's not practical to replace components piecemeal. -- Michael Eager ea...@eagercon.com 1960 Park Blvd., Palo Alto, CA 94306 650-325-8077