Re: Memory issues E450

2003-06-03 Thread Rene van Dijk
Hi,

 Looks to me like bad memory in one of the banks; try booting with less memory
 (remove DIMMs).

I am afraid this could be the problem; now I need to find out which bank could
be faulty.
Too bad the crashes are random, so the troubleshooting turnaround time could
be several weeks. I need a stable system before I go on my summer vacation. My
team members don't have a single clue how to reboot/boot the system... and what
to do if it's waiting for a fsck by hand...

The memory is third-party (solair), not covered by Sun warranty.


Greets,

Rene
-- 
RT[F]M van Dijk



Re: Memory issues E450

2003-06-03 Thread Igor TAmara
Hi, I'm having problems with an Enterprise 450; it shows the same error
discussed in this thread:

May 20 15:07:07 sirio kernel: CPU[1]: Correctable ECC Error AFSR[18810] AFAR[0f551f50] UDBL[dd] UDBH[1c8]
May 20 15:07:07 sirio kernel: CPU[1]: UDBH Syndrome[3] Memory Module 190x

My kernel info
Linux sirio 2.4.18 #2 SMP Thu Dec 12 10:15:11 COT 2002 sparc64 unknown

I ran test-all from the OBP, but wasn't able to run test /memory from
there.

The machine shows this message, but usually doesn't freeze.

Which other test could I run in order to get a clue?

Thanks in advance

Martin  I don't know if there is a memtest for sparc, but the Sun hardware
Martin  monitors its memory all the time.
Martin IIRC test /memory from OpenBoot should do the trick.

-- 
-- 39 things you wouldn't want to hear from your network administrator --
 (22/39) compi what?
http://www.tamarapatino.org/igor/jokes/nomyadminplease.php



Re: Memory issues E450

2003-06-03 Thread Rene van Dijk
 The machine shows this message, but usually doesn't freeze.

In my case the machine does freeze; to be exact, the system isn't reachable
over Ethernet and normal processes (mrtg collection) are not running. If you
hook up a monitor, the errors are scrolling over the display.

Regards

Rene

-- 
RT[F]M van Dijk



Memory issues E450

2003-06-02 Thread Rene van Dijk
Hi,

I have some issues with a Sun E450 running Debian woody. The system is used
as a network collection system (mrtg/rrdtool/mac-address collection
scripts). For a few months now the E450 has been crashing at random times,
with uptimes ranging from several weeks to a few hours.

Migrating the collection cron jobs from root to a regular user didn't help,
so memory is perhaps the problem.

Is there any way to check the main memory (1 GB)?

uname -a
Linux swizzy 2.4.18 #2 Thu Apr 11 14:37:17 EDT 2002 sparc64 unknown

Regards

Rene



Re: Memory issues E450

2003-06-02 Thread Rene van Dijk
Hi,

Here is the kernel message:

kern.log.0:Apr 28 18:03:45 swizzy kernel: CPU[0]: Correctable ECC Error AFSR[18830] AFAR[866f6bd0] UDBL[c8] UDBH[3f0]
kern.log.0:Apr 28 18:03:45 swizzy kernel: CPU[0]: UDBH Syndrome[4a] Memory Module 190x
kern.log.0:Apr 28 18:03:45 swizzy kernel: data_access_exception: SFSR[00801009] SFAR[f80093d1dddc], going.
kern.log.0:Apr 28 18:03:45 swizzy kernel:   \|/  \|/
kern.log.0:Apr 28 18:03:45 swizzy kernel:   @'/ .. \`@
kern.log.0:Apr 28 18:03:45 swizzy kernel:   /_| \__/ |_\
kern.log.0:Apr 28 18:03:45 swizzy kernel:  \__U_/
kern.log.0:Apr 28 18:03:45 swizzy kernel: mrtg(23378): Dax
kern.log.0:Apr 28 18:03:45 swizzy kernel: TSTATE: 009911009607 TPC: 00479da0 TNPC: 00479da4 Y: Not tainted
kern.log.0:Apr 28 18:03:45 swizzy kernel: g0: 07e2 g1: 0005 g2:  g3: 
kern.log.0:Apr 28 18:03:45 swizzy kernel: g4: f800 g5: 0002 g6: f800907ac000 g7: 0002
kern.log.0:Apr 28 18:03:45 swizzy kernel: o0: 137bc6fa o1: 03e017fc o2: f8009fa0 o3: 0001
kern.log.0:Apr 28 18:03:45 swizzy kernel: o4: 00708000 o5: 00708000 sp: f800907af141 ret_pc: 0046fd64
kern.log.0:Apr 28 18:03:45 swizzy kernel: l0: eefff80093d1dd78 l1: f8009d065b60 l2: 0011 l3: f8009fb49e90
kern.log.0:Apr 28 18:03:45 swizzy kernel: l4: 2b112b3a l5: 00625720 l6: f80002bc3024 l7: 7031c2ac
kern.log.0:Apr 28 18:03:45 swizzy kernel: i0: eefff80093d1dd60 i1: f800907afb80 i2:  i3: 
kern.log.0:Apr 28 18:03:45 swizzy kernel: i4: 1000 i5:  i6: f800907af201 i7: 0046ff68
kern.log.0:Apr 28 18:03:45 swizzy kernel: Caller[0046ff68]
kern.log.0:Apr 28 18:03:45 swizzy kernel: Caller[0047083c]
kern.log.0:Apr 28 18:03:45 swizzy kernel: Caller[00470b2c]
kern.log.0:Apr 28 18:03:45 swizzy kernel: Caller[00471214]
kern.log.0:Apr 28 18:03:45 swizzy kernel: Caller[004646ec]
kern.log.0:Apr 28 18:03:45 swizzy kernel: Caller[004319a8]
kern.log.0:Apr 28 18:03:45 swizzy kernel: Caller[00410af4]
kern.log.0:Apr 28 18:03:45 swizzy kernel: Caller[702b2090]
kern.log.0:Apr 28 18:03:45 swizzy kernel: Instruction DUMP: b0043fe8  11001895  aa122320 d006207c 80a20014  12480031  e05c  d05e2010  80a20011
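
To see whether the correctable errors cluster on one CPU, one fault address,
or one module label, the matching kern.log lines can be tallied. What follows
is only a minimal Python sketch, assuming each log entry sits on a single
line as syslog writes it. The function name summarize_ecc is hypothetical, and
accepting a "UDBL Syndrome" line is an assumption; only the UDBH form appears
in this thread.

```python
import re
from collections import Counter

# First line of a correctable ECC report, e.g.
# "CPU[0]: Correctable ECC Error AFSR[18830] AFAR[866f6bd0] UDBL[c8] UDBH[3f0]"
ECC_RE = re.compile(
    r"CPU\[(\d+)\]: Correctable ECC Error\s+"
    r"AFSR\[([0-9a-f]+)\]\s+AFAR\[([0-9a-f]+)\]",
    re.IGNORECASE,
)

# Follow-up line, e.g. "CPU[0]: UDBH Syndrome[4a] Memory Module 190x".
# Accepting UDBL as well is an assumption; the thread only shows UDBH.
SYN_RE = re.compile(
    r"CPU\[(\d+)\]: (UDB[HL]) Syndrome\[([0-9a-f]+)\]\s+Memory Module (\S+)",
    re.IGNORECASE,
)


def summarize_ecc(lines):
    """Count correctable ECC reports per CPU, per AFAR address, and per module label."""
    by_cpu, by_addr, by_module = Counter(), Counter(), Counter()
    for line in lines:
        m = ECC_RE.search(line)
        if m:
            by_cpu[int(m.group(1))] += 1
            by_addr[m.group(3).lower()] += 1
            continue
        m = SYN_RE.search(line)
        if m:
            by_module[m.group(4)] += 1
    return by_cpu, by_addr, by_module
```

Passing an open kern.log file object to summarize_ecc returns the three
counters. Errors spread over several CPUs and addresses would point away from
a single bad CPU, while errors confined to one CPU would fit the bad-CPU case.
The label "Memory Module 190x" shows up in the logs of both machines in this
thread, so on its own it may not identify a particular DIMM.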

-- 
RT[F]M van Dijk



Re: Memory issues E450

2003-06-02 Thread Bas van den Heuvel
On Tuesday 03 June 2003 00:55, Rene van Dijk wrote:
 Hi,

 Here is the kernel message:

 kern.log.0:Apr 28 18:03:45 swizzy kernel: CPU[0]: Correctable ECC Error AFSR[18830] AFAR[866f6bd0] UDBL[c8] UDBH[3f0]
 kern.log.0:Apr 28 18:03:45 swizzy kernel: CPU[0]: UDBH Syndrome[4a] Memory Module 190x
 kern.log.0:Apr 28 18:03:45 swizzy kernel: data_access_exception: SFSR[00801009] SFAR[f80093d1dddc], going.

Looks to me like bad memory in one of the banks; try booting with less memory
(remove DIMMs).

I also saw these messages (in a Solaris kind of way) and that turned out to be
a bad CPU!

I don't know if there is a memtest for sparc, but the Sun hardware monitors 
its memory all the time.

Greetings,

Bas



Re: Memory issues E450

2003-06-02 Thread Martin
 I don't know if there is a memtest for sparc, but the Sun hardware monitors 
 its memory all the time.
IIRC test /memory from OpenBoot should do the trick.

Sweet Dreams,
 - Martin
 
-- 
Martin
[EMAIL PROTECTED]
Seasons change, things come to pass