Re: [gentoo-user] System freezes during compiles [SOLVED]
On Wed, 2013-03-20 at 20:57 +0100, Volker Armin Hemmann wrote:
> you might just hit a thrashing situation. Linux is very bad when it
> comes to abusing swap in case of an emergency.
>
> But it also sounds like overheating or a power problem. Power problems
> might be caused by the PSU - but it could also be the power circuitry
> of your mobo.

First of all, thank you to everyone for the superb help and suggestions
regarding this problem.

Yesterday, I enabled some swap space, but the system froze on the first
attempt at compiling glibc.

The next cheapest option was to clean the case of dust. The CPU heat
sink was clogged with a thick layer of dust. After thoroughly cleaning
the case, the system compiled glibc, the kernel, qtcore and other
packages without freezing.

The only downside is that, since the fins on the heat sink are exposed
directly to the fan again, the noise level has gone up.

When I checked the RPMs in the BIOS, I noticed a setting that decreases
the CPU voltage and frequency when a temperature threshold is exceeded.
This would explain the kernel watchdog messages reporting that stalls
were detected.

For anyone that's curious, here's the output of sensors and free during
the compile of glibc. Swap wasn't being touched at all; there's still
4GB of memory free. The CPU was getting close to the threshold limit
even after the heat sink was cleaned of dust.

k10temp-pci-00c3
Adapter: PCI adapter
temp1:        +58.9°C  (high = +70.0°C)
                       (crit = +71.0°C, hyst = +66.0°C)

it8720-isa-0228
Adapter: ISA adapter
in0:          +1.49 V  (min =  +0.00 V, max =  +4.08 V)
in1:          +1.47 V  (min =  +0.00 V, max =  +4.08 V)
in2:          +3.38 V  (min =  +0.00 V, max =  +4.08 V)
+5V:          +2.96 V  (min =  +0.00 V, max =  +4.08 V)
in4:          +3.07 V  (min =  +0.00 V, max =  +4.08 V)
in5:          +3.25 V  (min =  +0.00 V, max =  +4.08 V)
in6:          +4.08 V  (min =  +0.00 V, max =  +4.08 V)  ALARM
5VSB:         +2.98 V  (min =  +0.00 V, max =  +4.08 V)
Vbat:         +3.28 V
fan1:        6750 RPM  (min =    0 RPM)
fan2:           0 RPM  (min =    0 RPM)
fan3:           0 RPM  (min =    0 RPM)
fan4:           0 RPM  (min =    0 RPM)
temp1:        +31.0°C  (low  = +127.0°C, high = +127.0°C)  sensor = thermistor
temp2:        +67.0°C  (low  = +127.0°C, high = +127.0°C)  sensor = thermal diode
temp3:        +70.0°C  (low  = +127.0°C, high = +127.0°C)  sensor = thermal diode
cpu0_vid:    +0.525 V
intrusion0:  ALARM

             total       used       free     shared    buffers     cached
Mem:       8167400    3195300    4972100          0      62024    1379256
-/+ buffers/cache:    1754020    6413380
Swap:       511996          0     511996

Once again, a big thanks for everyone's help.

Regards,
Carlos
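
A small sketch of how the figures in that report hang together, in case it helps anyone reading the archive. It assumes the Mem row of free reads as total 8167400, used 3195300, free 4972100 (KiB), with 62024 in buffers and 1379256 cached; the function names are made up for illustration and are not part of any tool.

```python
# Figures copied from the `free` output in the message above (KiB).
MEM_USED, MEM_FREE = 3195300, 4972100
BUFFERS, CACHED = 62024, 1379256


def real_usage(used, free, buffers, cached):
    """Return (used, free) with buffers and page cache counted as
    reclaimable, which is what the '-/+ buffers/cache' row reports."""
    reclaimable = buffers + cached
    return used - reclaimable, free + reclaimable


def headroom(temp_c, high_c):
    """Degrees Celsius left before the sensor's 'high' limit."""
    return round(high_c - temp_c, 1)


app_used, app_free = real_usage(MEM_USED, MEM_FREE, BUFFERS, CACHED)
print("used by applications:", app_used, "KiB")
print("available to applications:", app_free, "KiB")

# k10temp reported +58.9°C against a high limit of +70.0°C during the compile.
print("k10temp headroom:", headroom(58.9, 70.0), "°C")
```

The two memory figures match the "-/+ buffers/cache" row (1754020 and 6413380), which supports the reading that memory was not the bottleneck during this compile, while the CPU had only about 11°C to spare even after cleaning.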
Re: [gentoo-user] System freezes during compiles
The OOM killer is not instant: it can take a long time, get stuck, or
kill something vital. ...

On 21.03.2013 07:52, "Carlos Hendson" wrote:
> On Thu, 2013-03-21 at 06:45 +0100, Volker Armin Hemmann wrote:
> > You got your answer. 8gig and no swap is NOT ENOUGH.
>
> It's a strong indicator, which is going to be corrected.
>
> I am slightly confused by the resulting behaviour however. I was of the
> impression oomkiller would start to kill processes when unallocated
> memory is getting scarce?
>
> How would no free memory cause CPU stalls?
>
> Regards,
> Carlos
Re: [gentoo-user] System freezes during compiles
On Thu, 2013-03-21 at 06:45 +0100, Volker Armin Hemmann wrote:
> You got your answer. 8gig and no swap is NOT ENOUGH.

It's a strong indicator, which is going to be corrected.

I am slightly confused by the resulting behaviour, however. I was under
the impression that the OOM killer would start to kill processes when
unallocated memory gets scarce?

How would no free memory cause CPU stalls?

Regards,
Carlos
Re: [gentoo-user] System freezes during compiles
On Wed, 2013-03-20 at 16:27 -0500, Paul Hartman wrote:
>
> I had a virtual server that kept crashing/rebooting during compiles of
> large packages such as php. It ended up being because it was running
> out of memory. Added another 1GB of swap space and it has been happy
> ever since.

Thanks Paul. Volker suggested that the lack of swap was a possible
cause. I'll allocate some swap space after the smartctl self-test
finishes and try to recompile gcc a few times.

Regards,
Carlos
Re: [gentoo-user] System freezes during compiles
You got your answer. 8gig and no swap is NOT ENOUGH.

On 20.03.2013 22:51, "Carlos Hendson" wrote:
> On Wed, 2013-03-20 at 20:57 +0100, Volker Armin Hemmann wrote:
> > you might just hit a thrashing situation. Linux is very bad when it
> > comes to abusing swap in case of an emergency.
> >
> > But it also sounds like overheating or a power problem. Power problems
> > might be caused by the PSU - but it could also be the power circuitry
> > of your mobo.
>
> It's not a thrashing issue as I don't have any swap. The 8GB of ram has
> been sufficient memory for all tasks thus far. I have no objection to
> allocating some swap space if it could resolve the issue.
>
> Actually, Grant and you both suggested possible heat issues which has
> just made me think that I should check for dust build up in the CPU heat
> sink. There's so much dust where I live that I have to vacuum dust build
> up from the case.
>
> The sensors tool reports 51C, it doesn't appear to be running too hot,
> but I don't have a baseline to compare it to. I see I need to implement
> monitoring for this machine once it's stable again.
>
> k10temp-pci-00c3
> Adapter: PCI adapter
> temp1:        +51.0°C  (high = +70.0°C)
>                        (crit = +71.0°C, hyst = +66.0°C)
>
> I'll give the inside a clean this weekend and see if there's any
> improvement.
>
> Thanks for the suggestions.
>
> Regards,
> Carlos
Re: [gentoo-user] System freezes during compiles
On Wed, 2013-03-20 at 20:57 +0100, Volker Armin Hemmann wrote:
> you might just hit a thrashing situation. Linux is very bad when it
> comes to abusing swap in case of an emergency.
>
> But it also sounds like overheating or a power problem. Power problems
> might be caused by the PSU - but it could also be the power circuitry
> of your mobo.

It's not a thrashing issue, as I don't have any swap. The 8GB of RAM has
been sufficient memory for all tasks thus far. I have no objection to
allocating some swap space if it could resolve the issue.

Actually, Grant and you both suggested possible heat issues, which has
just made me think that I should check for dust build-up in the CPU heat
sink. There's so much dust where I live that I have to vacuum dust
build-up from the case.

The sensors tool reports 51C; it doesn't appear to be running too hot,
but I don't have a baseline to compare it to. I see I need to implement
monitoring for this machine once it's stable again.

k10temp-pci-00c3
Adapter: PCI adapter
temp1:        +51.0°C  (high = +70.0°C)
                       (crit = +71.0°C, hyst = +66.0°C)

I'll give the inside a clean this weekend and see if there's any
improvement.

Thanks for the suggestions.

Regards,
Carlos
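
The monitoring mentioned above could start out as small as the following sketch. It only parses text in the layout that sensors printed in this message; the 10-degree warning margin, the function names and the regular expression are assumptions chosen for illustration, not part of lm-sensors.

```python
import re

# Sample text in the layout that `sensors` printed in the message above.
SAMPLE = """\
k10temp-pci-00c3
Adapter: PCI adapter
temp1:        +51.0°C  (high = +70.0°C)
                       (crit = +71.0°C, hyst = +66.0°C)
"""

# Matches a temp1 line that carries a "high" limit directly after the reading.
_TEMP1 = re.compile(
    r"^temp1:\s*\+?(-?\d+(?:\.\d+)?)°C\s*\(high = \+?(-?\d+(?:\.\d+)?)°C\)",
    re.MULTILINE,
)


def k10temp_reading(sensors_text):
    """Return (current, high) in °C from the first matching temp1 line,
    or None if no such line is present."""
    match = _TEMP1.search(sensors_text)
    if match is None:
        return None
    return float(match.group(1)), float(match.group(2))


def too_hot(current, high, margin=10.0):
    """True once the reading is within `margin` degrees of the high limit.
    The default margin is an arbitrary illustrative choice."""
    return current >= high - margin


reading = k10temp_reading(SAMPLE)
print(reading, "warn" if too_hot(*reading) else "ok")
```

In practice the text would come from running sensors periodically (for example from cron) and logging the result, which also builds the baseline that is missing here.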
Re: [gentoo-user] System freezes during compiles
On Wed, 2013-03-20 at 08:17 +, Mick wrote:
> Stating the obvious, it seems that the kernel is struggling and indeed
> you may have come across some nasty kernel bug. However, it could well
> be that it is not related to the kernel you're running, or your kernel
> config. It could be a problem with the power supply being faulty and
> causing these lock ups.
>
> Unless someone else comes up with a better idea to troubleshoot it
> further, I would consider replacing the power supply with another of a
> known good condition.

Thanks for the good advice, Mick.

I don't have spare hardware on tap, so switching PSU, memory or
processor may prove to be tricky. It's one of those catch-22s where I
don't want to spend on components that aren't faulty, but I need to
spend on components to test whether they're faulty.

I've been given a few other tests to perform before I start moving to
hardware replacement.

Regards,
Carlos
Re: [gentoo-user] System freezes during compiles
On Wed, 2013-03-20 at 18:43 +0100, Daniel Wagener wrote:
> "Frozen" means there is no Hard Drive Activity going on right?
> And there is no other indication, that you are just running out of
> memory?

I can't categorically state if there was drive activity. I was so
fixated on regaining control of the machine that I failed to pay
attention to the state of the HDD LED. I'll make a point of checking it
the next time the machine appears to freeze.

I saw no other indications of memory exhaustion after the system came
back from the "soft-power reset" button being pressed.

Regards,
Carlos
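
One way to look for such indications after the fact is to scan the system log for both kinds of event. The sketch below is only illustrative: the lockup pattern is taken from the traces attached to the original post, while the two OOM-killer phrases are an assumption about how kernels of that era worded the message, and the function name is made up.

```python
import re
from collections import Counter

# Wording taken from the watchdog traces attached to the original post.
LOCKUP = re.compile(r"Watchdog detected hard LOCKUP on cpu (\d+)")
# Assumed wording of OOM-killer messages; adjust to what your kernel logs.
OOM = re.compile(r"invoked oom-killer|Out of memory: Kill process")


def summarize(lines):
    """Count hard-lockup reports per CPU and OOM-killer events in log lines."""
    lockups = Counter()
    oom_events = 0
    for line in lines:
        match = LOCKUP.search(line)
        if match:
            lockups[int(match.group(1))] += 1
        elif OOM.search(line):
            oom_events += 1
    return lockups, oom_events


# Two lines from the attached traces, used here as sample input.
sample = [
    "Mar 12 23:42:03 hydra kernel: [58068.673303] Watchdog detected hard LOCKUP on cpu 2",
    "Mar 12 23:58:02 hydra kernel: [59025.152939] Watchdog detected hard LOCKUP on cpu 4",
]
lockups, oom_events = summarize(sample)
print(dict(lockups), oom_events)
```

To scan a real log, pass an open file instead of the sample list, e.g. `summarize(open("/var/log/messages", errors="replace"))`. Lockup reports with no OOM events would point away from memory exhaustion.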
Re: [gentoo-user] System freezes during compiles
On Tue, Mar 19, 2013 at 11:42 PM, Carlos Hendson wrote:
> For the last few weeks or so, I've been getting intermittent hard
> lock-ups during the emerge of various packages. It appears the more
> compile intensive the package, the more likely the lock-up. These
> lock-ups have occurred under kernels 3.4.9 and 3.7.10 with gcc 4.5.4
> and 4.6.3.

I had a virtual server that kept crashing/rebooting during compiles of
large packages such as php. It ended up being because it was running
out of memory. Added another 1GB of swap space and it has been happy
ever since.
Re: [gentoo-user] System freezes during compiles
On 20.03.2013 05:42, Carlos Hendson wrote:
> Hello,
>
> For the last few weeks or so, I've been getting intermittent hard
> lock-ups during the emerge of various packages. It appears the more
> compile intensive the package, the more likely the lock-up. These
> lock-ups have occurred under kernels 3.4.9 and 3.7.10 with gcc 4.5.4
> and 4.6.3.
>
> Once the machine is in a frozen state, the only thing that responds is
> the soft power reset button. Sometimes the machine locks up again
> after the button is pressed (this is because the compile resumes once
> the system comes out of its frozen state).
>
> If the system subsequently locks up because I wasn't able to cancel the
> compile fast enough, the only option left is a hard power reset (holding
> the power button for 10+ seconds). If I cancel the compile, the system
> is perfectly responsive and functions normally.
>
> There are kernel stack traces in /var/log/messages which I'm unable to
> decipher and diagnose as to what caused the lock-up.
>
> If I had to guess, I'd blame an incorrect setting in the .config, but
> since I'm stuck in the diagnostic of what part of the kernel might be
> experiencing the problem, I need a bit of help to pinpoint the issue.
>
> I believe it to be a kernel configuration issue because when I booted
> the machine using a system rescue Live CD, I was able to chroot into the
> system and emerge packages like gcc without the lock-up problem
> occurring.
>
> That's by no means conclusive; however, I've also run a complete pass of
> memcheck for over an hour without any issues reported.
>
> I'd like to completely rule out hardware failure. What diagnostic tools
> are recommended to try to identify potential hardware issues of this
> type?
>
> The various kernel stack traces are attached in case someone wants to
> take a look. I can provide more information should it be needed.
>
> Any help or advice would be appreciated.
>
> Regards,
> Carlos

You might just have hit a thrashing situation. Linux is very bad when it
comes to abusing swap in case of an emergency.

But it also sounds like overheating or a power problem. Power problems
might be caused by the PSU - but it could also be the power circuitry of
your mobo.
Re: [gentoo-user] System freezes during compiles
On Wed, 20 Mar 2013 05:42:28 +0100 Carlos Hendson wrote:
> Hello,
>
> For the last few weeks or so, I've been getting intermittent hard
> lock-ups during the emerge of various packages. It appears the more
> compile intensive the package, the more likely the lock-up. These
> lock-ups have occurred under kernels 3.4.9 and 3.7.10 with gcc 4.5.4
> and 4.6.3.
>
> Once the machine is in a frozen state, the only thing that responds is
> the soft power reset button. Sometimes the machine locks up again
> after the button is pressed (this is because the compile resumes once
> the system comes out of its frozen state).
>
> If the system subsequently locks up because I wasn't able to cancel the
> compile fast enough, the only option left is a hard power reset (holding
> the power button for 10+ seconds). If I cancel the compile, the system
> is perfectly responsive and functions normally.
>
> There are kernel stack traces in /var/log/messages which I'm unable to
> decipher and diagnose as to what caused the lock-up.
>
> If I had to guess, I'd blame an incorrect setting in the .config, but
> since I'm stuck in the diagnostic of what part of the kernel might be
> experiencing the problem, I need a bit of help to pinpoint the issue.
>
> I believe it to be a kernel configuration issue because when I booted
> the machine using a system rescue Live CD, I was able to chroot into the
> system and emerge packages like gcc without the lock-up problem
> occurring.
>
> That's by no means conclusive; however, I've also run a complete pass of
> memcheck for over an hour without any issues reported.
>
> I'd like to completely rule out hardware failure. What diagnostic tools
> are recommended to try to identify potential hardware issues of this
> type?
>
> The various kernel stack traces are attached in case someone wants to
> take a look. I can provide more information should it be needed.
>
> Any help or advice would be appreciated.
>
> Regards,
> Carlos

"Frozen" means there is no hard drive activity going on, right?
And there is no other indication that you are just running out of
memory?
Re: [gentoo-user] System freezes during compiles
On Wed, 20 Mar 2013 08:17:11 +, Mick wrote:
> Stating the obvious, it seems that the kernel is struggling and indeed
> you may have come across some nasty kernel bug. However, it could well
> be that it is not related to the kernel you're running, or your kernel
> config. It could be a problem with the power supply being faulty and
> causing these lock ups.

That's certainly possible. It could also be failing memory, and it's
cheaper to run memtest86+ before buying a new power supply ;-)

--
Neil Bothwick

Only an idiot actually READS taglines.
Re: [gentoo-user] System freezes during compiles
On Wednesday 20 Mar 2013 04:42:28 Carlos Hendson wrote:
> Hello,
>
> For the last few weeks or so, I've been getting intermittent hard
> lock-ups during the emerge of various packages. It appears the more
> compile intensive the package, the more likely the lock-up. These
> lock-ups have occurred under kernels 3.4.9 and 3.7.10 with gcc 4.5.4
> and 4.6.3.
>
> Once the machine is in a frozen state, the only thing that responds is
> the soft power reset button. Sometimes the machine locks up again
> after the button is pressed (this is because the compile resumes once
> the system comes out of its frozen state).
>
> If the system subsequently locks up because I wasn't able to cancel the
> compile fast enough, the only option left is a hard power reset (holding
> the power button for 10+ seconds). If I cancel the compile, the system
> is perfectly responsive and functions normally.
>
> There are kernel stack traces in /var/log/messages which I'm unable to
> decipher and diagnose as to what caused the lock-up.
>
> If I had to guess, I'd blame an incorrect setting in the .config, but
> since I'm stuck in the diagnostic of what part of the kernel might be
> experiencing the problem, I need a bit of help to pinpoint the issue.
>
> I believe it to be a kernel configuration issue because when I booted
> the machine using a system rescue Live CD, I was able to chroot into the
> system and emerge packages like gcc without the lock-up problem
> occurring.
>
> That's by no means conclusive; however, I've also run a complete pass of
> memcheck for over an hour without any issues reported.
>
> I'd like to completely rule out hardware failure. What diagnostic tools
> are recommended to try to identify potential hardware issues of this
> type?
>
> The various kernel stack traces are attached in case someone wants to
> take a look. I can provide more information should it be needed.
>
> Any help or advice would be appreciated.
>
> Regards,
> Carlos

Stating the obvious, it seems that the kernel is struggling and indeed
you may have come across some nasty kernel bug. However, it could well
be that it is not related to the kernel you're running, or your kernel
config. It could be a problem with the power supply being faulty and
causing these lock-ups.

Unless someone else comes up with a better idea to troubleshoot it
further, I would consider replacing the power supply with another in
known good condition.

--
Regards,
Mick
[gentoo-user] System freezes during compiles
Hello,

For the last few weeks or so, I've been getting intermittent hard
lock-ups during the emerge of various packages. It appears the more
compile intensive the package, the more likely the lock-up. These
lock-ups have occurred under kernels 3.4.9 and 3.7.10 with gcc 4.5.4
and 4.6.3.

Once the machine is in a frozen state, the only thing that responds is
the soft power reset button. Sometimes the machine locks up again
after the button is pressed (this is because the compile resumes once
the system comes out of its frozen state).

If the system subsequently locks up because I wasn't able to cancel the
compile fast enough, the only option left is a hard power reset (holding
the power button for 10+ seconds). If I cancel the compile, the system
is perfectly responsive and functions normally.

There are kernel stack traces in /var/log/messages which I'm unable to
decipher and diagnose as to what caused the lock-up.

If I had to guess, I'd blame an incorrect setting in the .config, but
since I'm stuck in the diagnostic of what part of the kernel might be
experiencing the problem, I need a bit of help to pinpoint the issue.

I believe it to be a kernel configuration issue because when I booted
the machine using a system rescue Live CD, I was able to chroot into the
system and emerge packages like gcc without the lock-up problem
occurring.

That's by no means conclusive; however, I've also run a complete pass of
memcheck for over an hour without any issues reported.

I'd like to completely rule out hardware failure. What diagnostic tools
are recommended to try to identify potential hardware issues of this
type?

The various kernel stack traces are attached in case someone wants to
take a look. I can provide more information should it be needed.

Any help or advice would be appreciated.

Regards,
Carlos

Mar 12 23:42:03 hydra kernel: [58066.564110] ------------[ cut here ]------------
Mar 12 23:42:03 hydra kernel: [58068.663176] WARNING: at kernel/watchdog.c:241 watchdog_overflow_callback+0x93/0x9e()
Mar 12 23:42:03 hydra kernel: [58068.673235] Hardware name: GA-990FXA-D3
Mar 12 23:42:03 hydra kernel: [58068.673303] Watchdog detected hard LOCKUP on cpu 2
Mar 12 23:42:03 hydra kernel: [58068.751056] Modules linked in: usb_storage uas ipv6 it87 hwmon_vid fglrx(PO) uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_core joydev radeon i2c_algo_bit ttm drm_kms_helper drm r8169 xhci_hcd ata_generic pata_acpi i2c_piix4 mii i2c_core pata_atiixp wmi serio_raw k10temp powernow_k8 pcspkr mperf freq_table
Mar 12 23:42:03 hydra kernel: [58068.945979] Pid: 720, comm: cc1 Tainted: P O 3.4.9-gentoo #2
Mar 12 23:42:03 hydra kernel: [58068.946053] Call Trace:
Mar 12 23:42:03 hydra kernel: [58069.054704] [] ? warn_slowpath_common+0x78/0x8c
Mar 12 23:42:03 hydra kernel: [58069.231277] [] ? warn_slowpath_fmt+0x45/0x4a
Mar 12 23:42:03 hydra kernel: [58069.271020] [] ? watchdog_overflow_callback+0x93/0x9e
Mar 12 23:42:03 hydra kernel: [58069.271135] [] ? touch_nmi_watchdog+0x62/0x62
Mar 12 23:42:03 hydra kernel: [58069.293566] [] ? __perf_event_overflow+0x12c/0x1ae
Mar 12 23:42:03 hydra kernel: [58069.293689] [] ? perf_event_update_userpage+0x13/0xbf
Mar 12 23:42:03 hydra kernel: [58069.293811] [] ? x86_pmu_handle_irq+0xbe/0xf3
Mar 12 23:42:03 hydra kernel: [58069.293939] [] ? nmi_handle.isra.4+0x3e/0x61
Mar 12 23:42:03 hydra kernel: [58069.294038] [] ? do_nmi+0x9f/0x287
Mar 12 23:42:03 hydra kernel: [58069.294139] [] ? end_repeat_nmi+0x1a/0x1e
Mar 12 23:42:03 hydra kernel: [58069.294253] [] ? _raw_spin_lock_irq+0x6/0x6
Mar 12 23:42:03 hydra kernel: [58069.294357] [] ? _raw_spin_lock_irq+0x6/0x6
Mar 12 23:42:03 hydra kernel: [58069.314699] [] ? _raw_spin_lock_irq+0x6/0x6
Mar 12 23:42:03 hydra kernel: [58069.318869] <> [] ? ntp_tick_length+0x23/0x28
Mar 12 23:42:03 hydra kernel: [58069.319051] [] ? do_timer+0x89/0x465
Mar 12 23:42:03 hydra kernel: [58069.319185] [] ? tick_do_update_jiffies64+0x74/0x98
Mar 12 23:42:03 hydra kernel: [58069.319300] [] ? tick_sched_timer+0x3f/0x8d
Mar 12 23:42:03 hydra kernel: [58069.319424] [] ? __run_hrtimer.isra.27+0x4b/0xa3
Mar 12 23:42:03 hydra kernel: [58069.319547] [] ? hrtimer_interrupt+0xd9/0x1c9
Mar 12 23:42:03 hydra kernel: [58069.319655] [] ? smp_apic_timer_interrupt+0x6e/0x80
Mar 12 23:42:03 hydra kernel: [58069.319750] [] ? apic_timer_interrupt+0x67/0x70
Mar 12 23:42:03 hydra kernel: [58069.319810]
Mar 12 23:42:03 hydra kernel: [58069.324331] ---[ end trace b1a58589d91a0dec ]---
Mar 12 23:58:02 hydra kernel: [59023.803433] ------------[ cut here ]------------
Mar 12 23:58:02 hydra kernel: [59024.963950] ------------[ cut here ]------------
Mar 12 23:58:02 hydra kernel: [59025.152834] WARNING: at kernel/watchdog.c:241 watchdog_overflow_callback+0x93/0x9e()
Mar 12 23:58:02 hydra kernel: [59025.152895] Hardware name: GA-990FXA-D3
Mar 12 23:58:02 hydra kernel: [59025.152939] Watchdog detected hard LOCKUP on cpu 4
Mar 12 2