Re: Freebsd 5.3 problem

2005-03-15 Thread Amandeep Pannu
Hi Jason,

> On 03/14/05 15:34:59, Amandeep Pannu wrote:
>> Hi Kris,
>>
>> I had this problem before and I changed the MB and the memory and
>> today it did the same thing it did before.
>> memtest doesn't give any errors.
>>
>> Thanks
>> A
>
> Memtest86, right?  There is another memory tester that you run inside an
> OS like any other program.  Did you leave memtest86 running overnight or
> over the weekend?  How are your temps under load?  Do you use a UPS?

The system is in a co-lo, so no power problems. Yes, memtest86. I ran it for
the whole weekend. No errors.
I need to check the loads once the system comes up; it just shut down.
It gave me this error message:

Mar 15 03:52:14 d03 kernel: amr0: bad slot 17 completed

I have read on many lists that the amr driver dies under heavy load.
But what about the system not getting through POST and reporting

RAM R/W failure.

I am confused!! :(


-- 
Amandeep.S
[EMAIL PROTECTED]
http://aman.chamkila.org
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: Freebsd 5.3 problem

2005-03-14 Thread Amandeep Pannu

Hi all,

I am running FreeBSD 5.3-REL.

Today my system simply locked up.  There was no error sent to console, to
any logs, nor the monitor screen.  It was totally unresponsive to network,
serial console, or keyboard.  After 4 power-cycles, we were unable to get
past the BIOS as it was reporting RAM R/W error.  I have a screen shot
of this from the serial port console, but it is the same as the one from
before.  If I hit the F1 key to continue, FreeBSD seemingly reports

  ACPI-0277: *** Warning: Invalid checksum in table [APIC] (98, sum 84 is not zero)

just before booting.  It is after the boot screen, but before the
copyright is displayed by the kernel.
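
As background on that warning: an ACPI table is considered valid when all of its bytes, including the checksum byte in its header, add up to zero modulo 256. The sketch below illustrates only that rule. The function names and the sample table are hypothetical, and reading the two numbers in the warning as the stored checksum byte and the computed sum (both in hex) is an assumption about the message format. A nonzero sum on a table handed over by the BIOS would be consistent with corrupted firmware data or bad RAM, but it does not prove either.

```python
def acpi_table_sum(table: bytes) -> int:
    """Return the 8-bit sum of every byte in the table (0 means valid)."""
    return sum(table) & 0xFF


def fix_checksum(table: bytearray, checksum_offset: int = 9) -> None:
    """Recompute the checksum byte so that the whole table sums to zero.

    Offset 9 is where the checksum byte sits in the standard ACPI table
    header (4-byte signature, 4-byte length, 1-byte revision, checksum).
    """
    table[checksum_offset] = 0
    table[checksum_offset] = (-sum(table)) & 0xFF


# Hypothetical 16-byte table that only mimics the start of an ACPI header:
# signature, length, revision, checksum placeholder, 6-byte OEM ID.
table = bytearray(b"APIC" + (16).to_bytes(4, "little") + bytes([1, 0]) + b"SAMPLE")
fix_checksum(table)
assert acpi_table_sum(table) == 0  # a healthy table sums to zero

# Flip a single bit, as failing RAM or a damaged BIOS image might.
table[12] ^= 0x04
print(f"sum {acpi_table_sum(table):02X} is not zero")
```

A single flipped bit anywhere in the table is enough to trip the check, which is why the warning can come and go between power cycles.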

Finally, I turned the machine off for about 2 minutes, then turned it back
on.  It was able to get through the BIOS RAM test and reboot cleanly, and
the file systems cleaned themselves up and the database did so as well,
and it appears to be running fine.


Any ideas what is going on?
Thanks
A



Re: Freebsd 5.3 problem

2005-03-14 Thread Kris Kennaway
On Mon, Mar 14, 2005 at 12:23:59PM -0800, Amandeep Pannu wrote:
>
> Hi all,
>
> I am running FreeBSD 5.3-REL.
>
> Today my system simply locked up.  There was no error sent to console, to
> any logs, nor the monitor screen.  It was totally unresponsive to network,
> serial console, or keyboard.  After 4 power-cycles, we were unable to get
> past the BIOS as it was reporting RAM R/W error.  I have a screen shot
> of this from the serial port console, but it is the same as the one from
> before.  If I hit the F1

Looks like hardware failure.

Kris

--
In God we Trust -- all others must submit an X.509 certificate.
-- Charles Forsythe [EMAIL PROTECTED]


Re: Freebsd 5.3 problem

2005-03-14 Thread Amandeep Pannu
Hi Kris,

I had this problem before and I changed the MB and the memory, and today it
did the same thing it did before.
memtest doesn't give any errors.

Thanks
A


-- 
Amandeep.S
[EMAIL PROTECTED]
http://aman.chamkila.org


Re: Freebsd 5.3 problem

2005-03-14 Thread Kris Kennaway
On Mon, Mar 14, 2005 at 12:34:59PM -0800, Amandeep Pannu wrote:
> Hi Kris,
>
> I had this problem before and I changed the MB and the memory and today it
> did the same thing it did before.

Continue to check power supply, CPU cooling, cabling, etc.

> memtest doesn't give any errors.

OK, that doesn't prove they don't exist though.

Kris


Re: Freebsd 5.3 problem

2005-03-14 Thread Jason Henson
On 03/14/05 15:34:59, Amandeep Pannu wrote:
> Hi Kris,
> I had this problem before and I changed the MB and the memory and
> today it did the same thing it did before.
> memtest doesn't give any errors.
> Thanks
> A

Memtest86, right?  There is another memory tester that you run inside an
OS like any other program.  Did you leave memtest86 running overnight or
over the weekend?  How are your temps under load?  Do you use a UPS?




Re: Freebsd 5.3 problem: SCSI Errors ...Help

2005-03-10 Thread Amandeep Pannu
Hi all,

I am encountering these SCSI errors with FreeBSD 5.3-REL-p5.
Any ideas what is going on?

ahd0: Adaptec AIC7902 Ultra320 SCSI adapter port 0x4000-0x40ff,0x4400-0x44ff mem 0xfc30-0xfc301fff irq 28 at device 2.0 on pci3
ahd0: [GIANT-LOCKED]
aic7902: Ultra320 Wide Channel A, SCSI Id=7, PCI-X 101-133Mhz, 512 SCBs
ahd1: Adaptec AIC7902 Ultra320 SCSI adapter port 0x4800-0x48ff,0x4c00-0x4cff mem 0xfc302000-0xfc303fff irq 29 at device 2.1 on pci3
ahd1: [GIANT-LOCKED]
aic7902: Ultra320 Wide Channel B, SCSI Id=7, PCI-X 101-133Mhz, 512 SCBs


(probe1:ahd0:0:1:0): No or incomplete CDB sent to device.
(probe1:ahd0:0:1:0): Protocol violation in Message-in phase.
Attempting to abort.
(probe1:ahd0:0:1:0): Abort Message Sent
(probe1:ahd0:0:1:0): SCB 14 - Abort Tag Completed.
found == 0x1
ahd0: Invalid Sequencer interrupt occurred.
  Dump Card State Begins 
ahd0: Dumping Card State at program address 0x23b Mode 0x0
Card was paused INTSTAT[0x0] SELOID[0x1] SELID[0x0] HS_MAILBOX[0x0]
INTCTL[0x80]:(SWTMINTMASK) SEQINTSTAT[0x0] SAVED_MODE[0x11]
DFFSTAT[0x33]:(CURRFIFO_NONE|FIFO0FREE|FIFO1FREE)
SCSISIGI[0x0]:(P_DATAOUT) SCSIPHASE[0x0] SCSIBUS[0x0]
LASTPHASE[0x1]:(P_DATAOUT|P_BUSFREE) SCSISEQ0[0x0]
SCSISEQ1[0x12]:(ENAUTOATNP|ENRSELI) SEQCTL0[0x0]
SEQINTCTL[0x6]:(INTMASK1|INTMASK2)
SEQ_FLAGS[0x0] SEQ_FLAGS2[0x0] QFREEZE_COUNT[0x3]
KERNEL_QFREEZE_COUNT[0x3] MK_MESSAGE_SCB[0xff00]
MK_MESSAGE_SCSIID[0xff] SSTAT0[0x0] SSTAT1[0x0] SSTAT2[0x0]
SSTAT3[0x0] PERRDIAG[0x0]
SIMODE1[0xa4]:(ENSCSIPERR|ENSCSIRST|ENSELTIMO) LQISTAT0[0x0]
LQISTAT1[0x0] LQISTAT2[0x0] LQOSTAT0[0x0] LQOSTAT1[0x0] LQOSTAT2[0x0]

SCB Count = 16 CMDS_PENDING = 0 LASTSCB 0x CURRSCB 0x9
NEXTSCB 0xff80 qinstart = 39 qinfifonext = 40
QINFIFO: 0xe
WAITING_TID_QUEUES:
Pending list:
  14 FIFO_USE[0x0] SCB_CONTROL[0x48]:(STATUS_RCVD|DISCENB)
SCB_SCSIID[0x17]
Total 1
Kernel Free SCB list: 9 15 1 2 3 4 5 6 7 8 10 11 12 13 0
Sequencer Complete DMA-inprog list:
Sequencer Complete list:
Sequencer DMA-Up and Complete list:
Sequencer On QFreeze and Complete list:


ahd0: FIFO0 Free, LONGJMP == 0x8000, SCB 0xf
SEQIMODE[0x3f]:
(ENCFG4TCMD|ENCFG4ICMD|ENCFG4TSTAT|ENCFG4ISTAT|ENCFG4DATA|ENSAVEPTRS)
SEQINTSRC[0x0] DFCNTRL[0x0]
DFSTATUS[0x89]:(FIFOEMP|HDONE|PRELOAD_AVAIL)
SG_CACHE_SHADOW[0x2]:(LAST_SEG) SG_STATE[0x0] DFFSXFRCTL[0x0]
SOFFCNT[0x0] MDFFSTAT[0x5]:(FIFOFREE|DLZERO) SHADDR = 0x00,
SHCNT = 0x0 HADDR = 0x00, HCNT = 0x0 CCSGCTL[0x10]:(SG_CACHE_AVAIL)

ahd0: FIFO1 Free, LONGJMP == 0x8063, SCB 0x9
SEQIMODE[0x3f]:
(ENCFG4TCMD|ENCFG4ICMD|ENCFG4TSTAT|ENCFG4ISTAT|ENCFG4DATA|ENSAVEPTRS)
SEQINTSRC[0x0] DFCNTRL[0x0]
DFSTATUS[0x89]:(FIFOEMP|HDONE|PRELOAD_AVAIL)
SG_CACHE_SHADOW[0x2]:(LAST_SEG) SG_STATE[0x0] DFFSXFRCTL[0x0]
SOFFCNT[0x0] MDFFSTAT[0x5]:(FIFOFREE|DLZERO) SHADDR = 0x00,
SHCNT = 0x0 HADDR = 0x00, HCNT = 0x0 CCSGCTL[0x10]:(SG_CACHE_AVAIL)
LQIN: 0x8 0x0 0x0 0xf 0x0 0x1 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0
0x0 0x0 0x0 0x0 0x0 0x0
ahd0: LQISTATE = 0x0, LQOSTATE = 0x0, OPTIONMODE = 0x42
ahd0: OS_SPACE_CNT = 0x20 MAXCMDCNT = 0x1
ahd0: SAVED_SCSIID = 0x0 SAVED_LUN = 0x0

SIMODE0[0xc]:(ENOVERRUN|ENIOERR)
CCSCBCTL[0x4]:(CCSCBDIR)
ahd0: REG0 == 0x8060, SINDEX = 0x10e, DINDEX = 0x104
ahd0: SCBPTR == 0xf, SCB_NEXT == 0xff80, SCB_NEXT2 == 0xff33
CDB 12 20 0 80 88 86
STACK: 0x236 0x2 0x0 0x0 0x0 0x0 0x0 0x0 
Dump Card State Ends
ses0 at ahd0 bus 0 target 6 lun 0
ses0: SUPER GEM318 0 Fixed Processor SCSI-2 device
ses0: 3.300MB/s transfers
ses0: SAF-TE Compliant Device
Copied 18 bytes of sense data offset 12: 0x70 0x0 0x6 0x0 0x0 0x0 0x0 0xa 0x0 0x0 0x0 0x0 0x29 0x2 0x2 0x0 0x0 0x0
Copied 18 bytes of sense data offset 12: 0x70 0x0 0x6 0x0 0x0 0x0 0x0 0xa 0x0 0x0 0x0 0x0 0x29 0x2 0x2 0x0 0x0 0x0
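
For anyone reading the sense bytes above: they can be decoded by hand, and the following is a minimal sketch of that, assuming the 18 bytes printed are a fixed-format SCSI sense buffer starting at byte 0 (the leading 0x70 suggests so). The helper name and the abbreviated lookup tables are hypothetical; only the byte positions (sense key in the low nibble of byte 2, ASC in byte 12, ASCQ in byte 13) come from the SCSI sense data layout.

```python
# Abbreviated tables: only a few well-known entries are listed here.
SENSE_KEYS = {
    0x0: "NO SENSE",
    0x2: "NOT READY",
    0x3: "MEDIUM ERROR",
    0x4: "HARDWARE ERROR",
    0x5: "ILLEGAL REQUEST",
    0x6: "UNIT ATTENTION",
    0xB: "ABORTED COMMAND",
}
ASC_ASCQ = {
    (0x29, 0x00): "Power on, reset, or bus device reset occurred",
    (0x29, 0x02): "SCSI bus reset occurred",
}


def decode_fixed_sense(data: bytes) -> dict:
    """Pull sense key, ASC and ASCQ out of a fixed-format sense buffer."""
    if data[0] & 0x7F not in (0x70, 0x71):
        raise ValueError("not fixed-format sense data")
    if len(data) < 14:
        raise ValueError("sense buffer too short for ASC/ASCQ")
    return {"sense_key": data[2] & 0x0F, "asc": data[12], "ascq": data[13]}


# The bytes exactly as they appear in the kernel message above.
raw = ("0x70 0x0 0x6 0x0 0x0 0x0 0x0 0xa 0x0 0x0 0x0 0x0 "
       "0x29 0x2 0x2 0x0 0x0 0x0")
data = bytes(int(tok, 16) for tok in raw.split())

result = decode_fixed_sense(data)
print(SENSE_KEYS.get(result["sense_key"], "unknown sense key"))
print(ASC_ASCQ.get((result["asc"], result["ascq"]), "unknown ASC/ASCQ"))
```

Read this way, the device is reporting a unit attention after a bus reset, which is what one would expect to see once the driver has aborted a command and reset the bus. It describes the aftermath rather than the root cause.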

The system is running  5.3-REL-p5 with a custom kernel. I have
also tried GENERIC with the same results.

The Drives:

da0: SEAGATE ST336607LC 0007 Fixed Direct Access SCSI-3 device
da0: 320.000MB/s transfers (160.000MHz, offset 63, 16bit), Tagged Queueing Enabled
da0: 35003MB (71687372 512 byte sectors: 255H 63S/T 4462C)
da1 at ahd0 bus 0 target 1 lun 0
da1: SEAGATE ST336607LC 0006 Fixed Direct Access SCSI-3 device
da1: 320.000MB/s transfers (160.000MHz, offset 63, 16bit), Tagged Queueing Enabled
da1: 35003MB (71687372 512 byte sectors: 255H 63S/T 4462C)

./diskinfo -v /dev/da0s1a
/dev/da0s1a
        512             # sectorsize
        28311552000     # mediasize in bytes (26G)
        55296000        # mediasize in sectors
        3442            # Cylinders according to firmware.
        255             # Heads according to firmware.
        63              # Sectors according to firmware.

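
As a sanity check, the sizes and cylinder counts reported above are internally consistent, which suggests the drives are at least being identified correctly. The sketch below redoes the arithmetic; the function name is hypothetical, and it assumes the usual convention that the cylinder count is the sector count divided by heads times sectors per track, rounded down.

```python
SECTOR_SIZE = 512        # bytes, from the diskinfo output
HEADS = 255              # from the firmware geometry
SECTORS_PER_TRACK = 63   # from the firmware geometry


def geometry(total_sectors: int) -> tuple[int, int, int]:
    """Return (size in bytes, size in MB of 2**20 bytes, cylinders)."""
    size_bytes = total_sectors * SECTOR_SIZE
    size_mb = size_bytes // 2**20
    cylinders = total_sectors // (HEADS * SECTORS_PER_TRACK)
    return size_bytes, size_mb, cylinders


# Whole disk da0 as reported by the kernel: 71687372 sectors.
print(geometry(71687372))
# Partition da0s1a as reported by diskinfo: 55296000 sectors.
print(geometry(55296000))
```

The first call reproduces the 35003MB and 4462C from the probe messages, and the second reproduces the 28311552000 bytes and 3442 cylinders from diskinfo.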
Thanks in advance
Aman



Re: Freebsd 5.3 problem: SCSI Errors ...Help

2005-03-10 Thread Joseph Koshy
> (probe1:ahd0:0:1:0): No or incomplete CDB sent to device.
> (probe1:ahd0:0:1:0): Protocol violation in Message-in phase.
> Attempting to abort.
> (probe1:ahd0:0:1:0): Abort Message Sent
> (probe1:ahd0:0:1:0): SCB 14 - Abort Tag Completed.
> found == 0x1
> ahd0: Invalid Sequencer interrupt occurred.

I've seen these kinds of symptoms when the system was 
overheating or had electrical problems.

-- 
FreeBSD Volunteer, http://people.freebsd.org/~jkoshy