Re: Machine Check Exception on Opteron 265

2007-04-17 Thread Espen Fjellvær Olsen
Alan Cox wrote:
> On Sat, 14 Apr 2007 16:58:43 +0200
> Espen Fjellvær Olsen <[EMAIL PROTECTED]> wrote:
>
>   
>> Hi!
>> Today our Opteron 265, 2x2, panicked after many months of uptime, giving
>> only this error message:
>>
>> HARDWARE ERROR
>> CPU 2: Machine Check Exception:4 Bank 4: b60a20010813
>> TSC 6bb9fd0142921a ADDR a891e9b8
>> This is not a software problem!
>> 
*snip*
> Consult your hardware vendor, but if it's a single event in a year it might
> be anything - even cosmic rays.
>   
Yeah, we have had more crashes since then, and have removed some of our
DIMMs in the hope of getting a stable system again.
We are of course running memtest on those DIMMs. Hope it is one of those,
and not one of the CPUs =)

-- 
Mvh
Espen Fjellvær Olsen
Drift @ Tihlde
[EMAIL PROTECTED]

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Machine Check Exception on Opteron 265

2007-04-14 Thread Espen Fjellvær Olsen
Hi!
Today our Opteron 265, 2x2, panicked after many months of uptime, giving
only this error message:

HARDWARE ERROR
CPU 2: Machine Check Exception:4 Bank 4: b60a20010813
TSC 6bb9fd0142921a ADDR a891e9b8
This is not a software problem!

mcelog --ascii gives this on the above error:

HARDWARE ERROR
CPU 2 4 northbridge TSC 6bb9fd0142921a
  Northbridge ECC error
  ECC syndrome = 14
   bit32 = err cpu0
   bit45 = uncorrected ecc error
   bit57 = processor context corrupt
   bit61 = error uncorrected
  bus error 'local node origin, request didn't time out
  generic read mem transaction
  memory access, level generic'
STATUS b60a20010813 MCGSTATUS 4
This is not a software problem!
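
The bit numbers mcelog prints (32, 45, 57, 61) only line up with a 64-bit STATUS word, so the 12-digit value in this archived copy appears to have lost a run of four zeros. The Python sketch below is an illustration under that assumption: it takes the full value to be 0xb60a200100000813, reuses the flag names from the mcelog output above, and assumes the syndrome sits in bits 54:47 and the error code in the low 16 bits, laid out as 0000 1PPT RRRR IILL for a bus error. The names `decode_status`, `bus_fields` and `FLAG_BITS` are made up for this example.

```python
# Assumption: the archived "b60a20010813" dropped four zeros; the full
# 64-bit STATUS word is taken to be 0xb60a200100000813.
STATUS = 0xB60A200100000813

# Flag names copied from the mcelog output quoted above.
FLAG_BITS = {
    32: "err cpu0",
    45: "uncorrected ecc error",
    57: "processor context corrupt",
    61: "error uncorrected",
}


def decode_status(status):
    """Return (flags, syndrome, error_code) for the STATUS word."""
    flags = [name for bit, name in sorted(FLAG_BITS.items())
             if (status >> bit) & 1]
    syndrome = (status >> 47) & 0xFF   # assumed field: bits 54:47
    error_code = status & 0xFFFF       # assumed field: bits 15:0
    return flags, syndrome, error_code


def bus_fields(error_code):
    """Split a bus error code laid out as 0000 1PPT RRRR IILL."""
    pp = (error_code >> 9) & 0x3     # participation (0 = local node origin)
    t = (error_code >> 8) & 0x1      # timeout (0 = request didn't time out)
    rrrr = (error_code >> 4) & 0xF   # request type (1 = generic read)
    ii = (error_code >> 2) & 0x3     # transaction type (0 = memory)
    ll = error_code & 0x3            # cache level (3 = generic)
    return pp, t, rrrr, ii, ll


flags, syndrome, error_code = decode_status(STATUS)
print(flags)
print(f"ECC syndrome = {syndrome:x}")      # ECC syndrome = 14
print(f"error code = {error_code:04x}")    # error code = 0813
print(bus_fields(error_code))              # (0, 0, 1, 0, 3)
```

Under that assumption the decoded flags, the syndrome of 14 and the bus error fields all agree with what mcelog reported, which suggests mcelog decoded the register consistently rather than misreading it.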


As far as we know there wasn't any unusual activity on the server at the
time.
We updated glibc yesterday, but that shouldn't really cause such a problem.
So now we wonder whether this might be an MCE bug or really a HW problem,
and if so, whether it is one of the CPUs or the RAM that's faulty.
We are running 2.6.18.

--
Mvh
Espen Fjellvær Olsen
[EMAIL PROTECTED]
[EMAIL PROTECTED]


Re: 2.6.13-rc4-mm1: iptables DROP crashes the computer

2005-08-07 Thread Espen Fjellvær Olsen
On 07/08/05, Patrick McHardy <[EMAIL PROTECTED]> wrote:
> Could be related to the refcnt underflow with conntrack event
> notifications enabled. If you have CONFIG_IP_NF_CONNTRACK_EVENTS
> enabled please try this patch.
> 

I can confirm that that patch solved my problems, thank you :)
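
For anyone checking whether they are affected: the option named above can be looked up in the kernel .config. A minimal Python sketch, assuming the usual `NAME=value` and `# NAME is not set` line format; the helper name `config_enabled` and the sample text are hypothetical, not taken from my actual .config.

```python
def config_enabled(config_text, option):
    """Return True if `option` is set to y or m in kernel .config text."""
    for line in config_text.splitlines():
        line = line.strip()
        # Skip comments such as "# CONFIG_FOO is not set" and blank lines.
        if line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        if name == option:
            return value in ("y", "m")
    return False


# Hypothetical sample, for illustration only.
sample = """\
# CONFIG_PREEMPT_NONE is not set
CONFIG_PREEMPT=y
CONFIG_IP_NF_CONNTRACK_EVENTS=y
"""

print(config_enabled(sample, "CONFIG_IP_NF_CONNTRACK_EVENTS"))
print(config_enabled(sample, "CONFIG_PREEMPT_NONE"))
```

With a real tree, the text would come from reading the .config file in the kernel build directory.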

-- 
Mvh / Best regards
Espen Fjellvær Olsen
[EMAIL PROTECTED]
Norway


Re: 2.6.13-rc4-mm1: iptables DROP crashes the computer

2005-08-07 Thread Espen Fjellvær Olsen
On 07/08/05, Adrian Bunk <[EMAIL PROTECTED]> wrote:
> On Sun, Aug 07, 2005 at 07:12:00PM +0200, Espen Fjellvær Olsen wrote:
> 
> > After executing "iptables -A INPUT -j DROP" my computer crashes hard. It
> > doesn't hang immediately, but after a couple of seconds.
> > The machine is an amd64, running a clean x86_64 environment.
> > uname -a: Linux gentoo 2.6.13-rc4-mm1 #1 PREEMPT Thu Aug 4 01:01:44
> > CEST 2005 x86_64 AMD Athlon(tm) 64 Processor 3400+ AuthenticAMD
> >...
> 
> Is this reproducible or did it happen only once?

It is reproducible; it happens every time I run "iptables -A INPUT
-j DROP". Other rules, like "iptables -A INPUT -m state --state
ESTABLISHED,RELATED -p tcp --dport 22 -j ACCEPT", work well though.

> Are there any messages that might give a hint where to search for the
> problem?

The kernel log doesn't give any messages before the crash, and since the
computer crashes hard I can't check for relevant messages after the crash
;)
 
> You are reporting this against 2.6.13-rc4-mm1, but are attaching a
> .config of 2.6.13-rc5-mm1. Which kernels are affected, and which are
> not?

I'm sorry about that glitch. I recently compiled 2.6.13-rc5-mm1, but I
got a kernel panic at boot. I haven't looked into that yet, but it might
be related to ACPI.

The configs for rc4-mm1 and rc5-mm1 are similar.

> Does it still happen if you compile your kernel with preemption
> disabled?

Haven't tried this yet, but I'll do it right away.

> Please send the output of ./scripts/ver_linux .

If some fields are empty or look unusual you may have an old version.
Compare to the current minimal requirements in Documentation/Changes.

Linux gentoo 2.6.13-rc4-mm1 #1 PREEMPT Thu Aug 4 01:01:44 CEST 2005
x86_64 AMD Athlon(tm) 64 Processor 3400+ AuthenticAMD GNU/Linux

Gnu C                  3.4.4
Gnu make               3.80
binutils               2.16.1
util-linux             2.12q
mount                  2.12q
module-init-tools      3.2-pre7
e2fsprogs              1.38
reiserfsprogs          line
reiser4progs           line
xfsprogs               2.6.25
Linux C Library        2.3.5
Dynamic linker (ldd)   2.3.5
Procps                 3.2.5
Net-tools              1.60
Kbd                    1.12
Sh-utils               5.2.1
udev                   065
Modules Loaded         iptable_filter ip_tables snd_seq_midi
snd_emu10k1_synth snd_emux_synth snd_seq_virmidi snd_seq_midi_event
snd_seq_midi_emul snd_seq snd_pcm_oss snd_mixer_oss rtc ntfs
snd_emu10k1 snd_rawmidi snd_seq_device snd_ac97_codec snd_pcm
snd_timer snd_page_alloc snd_util_mem snd_hwdep snd soundcore
usb_storage uhci_hcd usbcore


-- 
Mvh / Best regards
Espen Fjellvær Olsen
[EMAIL PROTECTED]
Norway


2.6.13-rc4-mm1: iptables DROP crashes the computer

2005-08-07 Thread Espen Fjellvær Olsen
After executing "iptables -A INPUT -j DROP" my computer crashes hard. It
doesn't hang immediately, but after a couple of seconds.
The machine is an amd64, running a clean x86_64 environment.
uname -a: Linux gentoo 2.6.13-rc4-mm1 #1 PREEMPT Thu Aug 4 01:01:44
CEST 2005 x86_64 AMD Athlon(tm) 64 Processor 3400+ AuthenticAMD
GNU/Linux
lspci:
:00:00.0 Host bridge: VIA Technologies, Inc. VT8385 [K8T800 AGP]
Host Bridge (rev 01)
:00:01.0 PCI bridge: VIA Technologies, Inc. VT8237 PCI bridge
[K8T800/K8T890 South]
:00:09.0 Multimedia audio controller: Creative Labs SB Audigy (rev 04)
:00:09.1 Input device controller: Creative Labs SB Audigy
MIDI/Game port (rev 04)
:00:09.2 FireWire (IEEE 1394): Creative Labs SB Audigy FireWire
Port (rev 04)
:00:0a.0 Ethernet controller: Marvell Technology Group Ltd.
88E8001 Gigabit Ethernet Controller (rev 13)
:00:0e.0 Multimedia video controller: Conexant CX23880/1/2/3 PCI
Video and Audio Decoder (rev 05)
:00:0f.0 RAID bus controller: VIA Technologies, Inc. VIA VT6420
SATA RAID Controller (rev 80)
:00:0f.1 IDE interface: VIA Technologies, Inc.
VT82C586A/B/VT82C686/A/B/VT823x/A/C PIPC Bus Master IDE (rev 06)
:00:10.0 USB Controller: VIA Technologies, Inc. VT82x UHCI USB
1.1 Controller (rev 81)
:00:10.1 USB Controller: VIA Technologies, Inc. VT82x UHCI USB
1.1 Controller (rev 81)
:00:10.2 USB Controller: VIA Technologies, Inc. VT82x UHCI USB
1.1 Controller (rev 81)
:00:10.3 USB Controller: VIA Technologies, Inc. VT82x UHCI USB
1.1 Controller (rev 81)
:00:10.4 USB Controller: VIA Technologies, Inc. USB 2.0 (rev 86)
:00:11.0 ISA bridge: VIA Technologies, Inc. VT8237 ISA bridge
[KT600/K8T800/K8T890 South]
:00:18.0 Host bridge: Advanced Micro Devices [AMD] K8
[Athlon64/Opteron] HyperTransport Technology Configuration
:00:18.1 Host bridge: Advanced Micro Devices [AMD] K8
[Athlon64/Opteron] Address Map
:00:18.2 Host bridge: Advanced Micro Devices [AMD] K8
[Athlon64/Opteron] DRAM Controller
:00:18.3 Host bridge: Advanced Micro Devices [AMD] K8
[Athlon64/Opteron] Miscellaneous Control
:01:00.0 VGA compatible controller: ATI Technologies Inc R420 JK
[Radeon X800]
:01:00.1 Display controller: ATI Technologies Inc: Unknown device 4a6b

.config is attached.

--
Mvh / Best regards
Espen Fjellvær Olsen
[EMAIL PROTECTED]
Norway


Re: 2.6.11-rc2-mm1

2005-01-25 Thread Espen Fjellvær Olsen
On Tue, 25 Jan 2005 19:41:39 +0100, Pavel Machek <[EMAIL PROTECTED]> wrote:
> Heh, on my system, I get no cursor, and no letters, either (this is
> vga text console). I *can* see the backgrounds, for example if I run
> aumix I see colored blocks... Framebuffer does not seem to work,
> either.
> 
> Letters are present for a while during boot; not sure what makes them
> go away.

I get the same thing: the text disappears after a second or so.
Framebuffer has no effect.

-- 
Mvh / Best regards
Espen Fjellvær Olsen
[EMAIL PROTECTED]
Norway
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

