Re: Tracking down BTX halted

2001-11-20 Thread David O'Brien

On Sat, Nov 17, 2001 at 05:41:25PM -0800, Peter Wemm wrote:
 The problem is that you can't *not* get dangerously-dedicated mode.  Our
 boot1 has got a dangerously-dedicated fdisk table unconditionally compiled
 in.  You can fix it so that it doesn't crash stuff, but we still shouldn't
 be forcing it on people like that.

Why haven't we just removed it?
 
-- 
-- David  ([EMAIL PROTECTED])

To Unsubscribe: send mail to [EMAIL PROTECTED]
with unsubscribe freebsd-hackers in the body of the message



Re: Tracking down BTX halted

2001-11-18 Thread Matthew Dillon


:   There is a bug in Adaptec BIOSen that they will not tolerate DD disks.
: 
:  Which controllers have this bug?  I've got a whole bunch of 7880 and 79xx
:  controllers with disks running in DD mode and never have had this problem.
: 
: Happens to me on L440GX+ boards.
:
:Also happens on IBM Netfinity servers with aic7896/97 controllers. It
:*may* be the case that this only happens with certain Adaptec BIOS
:versions, but it's very real.

I started getting this on DELL's a year and a half ago.  The 
dangerously dedicated partition was to blame (which is what eventually
led to my fixing the disklabel auto code).  Well, ok, the Adaptec BIOS
was to blame, but there isn't much we can do about it so...

In my case it wasn't enough to repartition the disk.  The old 
data still screwed up the BIOS.  I had to physically wipe (with dd)
the first couple of sectors before repartitioning it using the
fdisk -BI / disklabel -r -w da0s1 auto combination.
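
For reference, the wipe step amounts to overwriting the first few
sectors with zeroes.  Below is a minimal Python sketch of that step
only, run against a scratch image file rather than a real disk.  The
function name zero_leading_sectors, the 512-byte sector size and the
sector counts are illustrative assumptions, not anything taken from dd
or fdisk.

```python
import os
import tempfile

SECTOR_SIZE = 512  # assumed sector size in bytes


def zero_leading_sectors(path, count=2):
    """Overwrite the first `count` sectors of `path` with zeroes, in place."""
    with open(path, "r+b") as dev:
        dev.write(bytes(SECTOR_SIZE * count))
        dev.flush()
        os.fsync(dev.fileno())


if __name__ == "__main__":
    # Demonstrate on a throwaway image file, never on a live disk.
    with tempfile.NamedTemporaryFile(delete=False) as img:
        img.write(b"\xa5" * SECTOR_SIZE * 8)
    zero_leading_sectors(img.name, count=2)
    with open(img.name, "rb") as f:
        data = f.read()
    # The first two sectors are now zero; the rest is untouched.
    print(data[:1024] == bytes(1024), data[1024] == 0xA5)
    os.unlink(img.name)
```

On a real system the dd wipe followed by the fdisk and disklabel
commands above is still what does the work; the sketch only shows
which bytes get cleared.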

-Matt
Matthew Dillon 
[EMAIL PROTECTED]

:Steinar Haug, Nethelp consulting, [EMAIL PROTECTED]
:



Re: Tracking down BTX halted

2001-11-17 Thread Doug White

On Fri, 16 Nov 2001, Matthew Emmerton wrote:

  There is a bug in Adaptec BIOSen that they will not tolerate DD disks.

 Which controllers have this bug?  I've got a whole bunch of 7880 and 79xx
 controllers with disks running in DD mode and never have had this problem.

Happens to me on L440GX+ boards.

Doug White|  FreeBSD: The Power to Serve
[EMAIL PROTECTED] |  www.FreeBSD.org





Re: Tracking down BTX halted

2001-11-17 Thread sthaug

   There is a bug in Adaptec BIOSen that they will not tolerate DD disks.
 
  Which controllers have this bug?  I've got a whole bunch of 7880 and 79xx
  controllers with disks running in DD mode and never have had this problem.
 
 Happens to me on L440GX+ boards.

Also happens on IBM Netfinity servers with aic7896/97 controllers. It
*may* be the case that this only happens with certain Adaptec BIOS
versions, but it's very real.

Steinar Haug, Nethelp consulting, [EMAIL PROTECTED]




Re: Tracking down BTX halted

2001-11-17 Thread Peter Wemm

Doug White wrote:
 On Fri, 16 Nov 2001, Sandeep Joshi wrote:
 
  I changed the disklabels on a few SCSI disks and now
  I keep getting these BTX halted messages every time
  I reboot.
 
 Lemme guess, you're running them in 'dangerously dedicated' mode.
 
 There is a bug in Adaptec BIOSen that they will not tolerate DD disks.
 
 Put proper partition tables on them and they should behave.

It's not so much a BIOS bug as the fact that we specify an illegal
geometry in the fake fdisk table.  This can cause BIOSes that use the
fdisk table to emulate C/H/S geometry to do a divide by zero.  This is
usually what shows up in the BTX faults.

Both of the posted BTX faults have int=0, which is a divide-by-zero.
See i386/i386/machdep.c for identifying the int values:
  setidt(0, IDTVEC(div),  SDT_SYS386TGT, SEL_KPL, GSEL(GCODE_SEL, SEL_KPL));
 ^^^  int 0 = 'div' (divide by zero)
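
The dumps themselves point the same way: both cs:eip byte strings
posted in this thread begin with f7 f1.  The following is a small
Python sketch that decodes just that instruction form.  It assumes a
16-bit operand size (the BIOS code is running as real-mode code) and
handles only the register form of opcode 0xF7; the helper name
decode_f7 is made up for illustration.

```python
# Opcode 0xF7 is the x86 "group 3" opcode; the reg field of the ModRM
# byte that follows it selects the actual operation.
GROUP3_OPS = ["test", "test", "not", "neg", "mul", "imul", "div", "idiv"]
REG16 = ["ax", "cx", "dx", "bx", "sp", "bp", "si", "di"]


def decode_f7(code):
    """Decode a register-form 0xF7 instruction, assuming 16-bit operands."""
    opcode, modrm = code[0], code[1]
    if opcode != 0xF7:
        raise ValueError("not a group-3 0xF7 instruction")
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if mod != 3:
        raise ValueError("memory operand forms are not handled in this sketch")
    return f"{GROUP3_OPS[reg]} {REG16[rm]}"


# First bytes at cs:eip, as posted in both BTX dumps.
faulting = bytes.fromhex("f7 f1 33 d2 8a 4e f6 f7 f1")
print(decode_f7(faulting))  # div cx
```

So the faulting instruction is a divide by cx, which fits a
divide-by-zero trap if cx holds 0 at that point.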

The specific problem is that the fake fdisk table specifies 256 heads, which
is not possible to represent in the int13 interface.  The maximum allowed
is 255 heads.  (hint: 256 in an 8-bit register wraps to 0, hence the
divide-by-zero).
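
To make the arithmetic concrete, here is a minimal Python model of the
failure.  It is only a sketch of the idea, not the code any particular
BIOS runs: the function names are invented, and the 8-bit truncation is
modelled with a mask.

```python
def sectors_per_cylinder(heads, sectors_per_track):
    """Model a BIOS that keeps the head count in an 8-bit register."""
    heads_8bit = heads & 0xFF  # 256 does not fit in 8 bits and becomes 0
    return heads_8bit * sectors_per_track


def lba_to_cylinder(lba, heads, sectors_per_track):
    """Convert a block number to a cylinder number, C/H/S style."""
    return lba // sectors_per_cylinder(heads, sectors_per_track)


print(lba_to_cylinder(50000, 255, 63))  # 3

try:
    lba_to_cylinder(50000, 256, 63)
except ZeroDivisionError:
    print("256 heads: divide by zero")
```

With 255 heads the divisor is 255 * 63 = 16065; with 256 heads the
truncated head count is 0, so the divisor is 0 and the division traps.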

This is the cause of EFI (ia64) hangs, the infamous Thinkpad T20/A20 series
system lockups, countless bios crashes, etc.

The problem is that you can't *not* get dangerously-dedicated mode.  Our
boot1 has got a dangerously-dedicated fdisk table unconditionally compiled
in.  You can fix it so that it doesn't crash stuff, but we still shouldn't
be forcing it on people like that.

Cheers,
-Peter
--
Peter Wemm - [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]
All of this is for nothing if we don't go to the stars - JMS/B5





Re: Tracking down BTX halted

2001-11-17 Thread Peter Wemm

Doug White wrote:
 On Fri, 16 Nov 2001, Matthew Emmerton wrote:
 
   There is a bug in Adaptec BIOSen that they will not tolerate DD disks.
 
  Which controllers have this bug?  I've got a whole bunch of 7880 and 79xx
  controllers with disks running in DD mode and never have had this problem.
 
 Happens to me on L440GX+ boards.

It happens randomly all over the place.  I think it also depends on the
default BIOS settings, i.e. whether you've got Large Drive support on
or off, and what mode it is in.

FWIW, this puts a legal geometry back into boot1:

+++ boot1.s 2001/11/18 01:42:26
@@ -353,7 +353,7 @@
 
                .fill 0x30,0x1,0x0
 part4:         .byte 0x80, 0x00, 0x01, 0x00
-               .byte 0xa5, 0xff, 0xff, 0xff
+               .byte 0xa5, 0xfe, 0xff, 0xff    # 1023 cyl, 255 heads, 63 sec
                .byte 0x00, 0x00, 0x00, 0x00
                .byte 0x50, 0xc3, 0x00, 0x00    # 50000 sectors long, bleh

But that stops the MBR kernel code from recognizing it as a bogus DD table 
and it will try to interpret it, thinking that you only have a 25 meg
drive.
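
The 16 bytes at part4 follow the standard MBR partition entry layout,
so they can be decoded by hand.  The Python sketch below does that for
the entry before and after the patch.  The function name decode_entry
and the dictionary keys are made up for illustration, and the 512-byte
sector size is an assumption.

```python
import struct

SECTOR_SIZE = 512  # assumed sector size in bytes

# part4 as compiled into boot1, before and after the one-byte change.
OLD = bytes([0x80, 0x00, 0x01, 0x00, 0xA5, 0xFF, 0xFF, 0xFF,
             0x00, 0x00, 0x00, 0x00, 0x50, 0xC3, 0x00, 0x00])
NEW = bytes([0x80, 0x00, 0x01, 0x00, 0xA5, 0xFE, 0xFF, 0xFF,
             0x00, 0x00, 0x00, 0x00, 0x50, 0xC3, 0x00, 0x00])


def decode_entry(raw):
    """Decode one 16-byte MBR partition table entry."""
    (flag, s_head, s_sec, s_cyl, ptype,
     e_head, e_sec, e_cyl, start, size) = struct.unpack("<8BII", raw)
    return {
        "type": ptype,
        "end_head": e_head,
        "end_sector": e_sec & 0x3F,
        # The top two bits of the sector byte are cylinder bits 8 and 9.
        "end_cylinder": ((e_sec & 0xC0) << 2) | e_cyl,
        # Heads are numbered from 0, so the last head implies the count.
        "heads_implied": e_head + 1,
        "size_sectors": size,
        "size_bytes": size * SECTOR_SIZE,
    }


for name, raw in (("old", OLD), ("new", NEW)):
    e = decode_entry(raw)
    print(name, e["heads_implied"], "heads,", e["size_sectors"], "sectors")
```

The old entry ends on head 255, which implies 256 heads; the patched
one ends on head 254, which implies 255.  The size field decodes to
50000 sectors, or 25,600,000 bytes with 512-byte sectors, which is
where the 25 meg figure comes from.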

I wish we could take this crud out with a shotgun.  If somebody
is going to boot off a disk, obey the rules with a real MBR and fdisk
table.  If somebody doesn't want a fdisk table, then don't fudge around
with a fake one, and don't pretend that it is bootable.

Cheers,
-Peter
--
Peter Wemm - [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]
All of this is for nothing if we don't go to the stars - JMS/B5





RE: Tracking down BTX halted

2001-11-16 Thread John Baldwin


On 16-Nov-01 Sandeep Joshi wrote:
 I changed the disklabels on a few SCSI disks and now 
 I keep getting these BTX halted messages every time 
 I reboot.
 
 They don't occur if I disconnect those disks.  They occur 
 even after I rewrite those labels.  It's not dedicated mode 
 or whatever now..
 
 I am currently running 4.4-REL.
 
 I am willing to post the entire system configuration
 but it would be really nice if instead someone could 
 tell me _HOW_ to determine the problem.
 
 A colleague tells me it's possible to track the problem 
 from the registers (cs, es, ..) in the message dump.

Yes, especially if you provide that info in the e-mail. :)

-- 

John Baldwin [EMAIL PROTECTED]http://www.FreeBSD.org/~jhb/
Power Users Use the Power to Serve!  -  http://www.FreeBSD.org/




Re: Tracking down BTX halted

2001-11-16 Thread Mike Smith

 I changed the disklabels on a few SCSI disks and now 
 I keep getting these BTX halted messages every time 
 I reboot.

The most likely cause of this is that you're messing up the disks to the
point that your BIOS (probably your SCSI controller BIOS) is crashing when
it tries to read them.

 They don't occur if I disconnect those disks.  They occur 
 even after I rewrite those labels.  It's not dedicated mode 
 or whatever now..

You must have missed something. 8)

 I am willing to post the entire system configuration
 but it would be really nice if instead someone could 
 tell me _HOW_ to determine the problem.

If the above guess is correct, you're looking at a BIOS bug.  You might
try updating your SCSI controller/motherboard BIOS, but there's no
guarantee that'll fix the problem.  The only real solution is to hot-plug
the disks, camcontrol rescan, then dd zeroes over the heads of the disks
and then re-label them safely.

 A colleague tells me it's possible to track the problem 
 from the registers (cs, es, ..) in the message dump.

That would tell you where, exactly, the crash is occurring, and would make 
it possible to absolutely finger the culprit.  What's the value of the cs 
register?  If it's above 0xa000, then the BIOS is at fault.
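
As a rough illustration of that check, here is a short Python sketch.
It turns a real-mode segment:offset pair into a physical address and
applies the "cs above 0xa000" rule of thumb from the paragraph above.
The function names are hypothetical; the sample cs and eip values are
the ones from the two BTX dumps posted in this thread.

```python
def physical_address(segment, offset):
    """Real-mode address translation: segment * 16 + offset."""
    return (segment << 4) + offset


def in_bios_space(cs):
    """Rule of thumb from this thread: cs above 0xa000 means BIOS/ROM code."""
    return cs > 0xA000


for cs, eip in ((0xC800, 0x1D29), (0xCD80, 0x1D29)):
    where = "BIOS/ROM" if in_bios_space(cs) else "conventional RAM"
    print(f"cs={cs:04x} eip={eip:04x} -> {physical_address(cs, eip):#x} ({where})")
```

Both sample values land well above the 640K line, in the address range
where adapter ROMs are normally mapped, rather than in conventional
memory.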

-- 
... every activity meets with opposition, everyone who acts has his
rivals and unfortunately opponents also.  But not because people want
to be opponents, rather because the tasks and relationships force
people to take different points of view.  [Dr. Fritz Todt]
   V I C T O R Y   N O T   V E N G E A N C E






Re: Tracking down BTX halted

2001-11-16 Thread Doug White

On Fri, 16 Nov 2001, Sandeep Joshi wrote:

 I changed the disklabels on a few SCSI disks and now
 I keep getting these BTX halted messages every time
 I reboot.

Lemme guess, you're running them in 'dangerously dedicated' mode.

There is a bug in Adaptec BIOSen that they will not tolerate DD disks.

Put proper partition tables on them and they should behave.

Doug White|  FreeBSD: The Power to Serve
[EMAIL PROTECTED] |  www.FreeBSD.org





Re: Tracking down BTX halted

2001-11-16 Thread Matthew Emmerton

 Doug White wrote:
  On Fri, 16 Nov 2001, Sandeep Joshi wrote:

  I changed the disklabels on a few SCSI disks and now
  I keep getting these BTX halted messages every time
  I reboot.

 Lemme guess, you're running them in 'dangerously dedicated' mode.

 There is a bug in Adaptec BIOSen that they will not tolerate DD disks.

Which controllers have this bug?  I've got a whole bunch of 7880 and 79xx
controllers with disks running in DD mode and never have had this problem.

--
Matt Emmerton





Re: Tracking down BTX halted

2001-11-16 Thread Sandeep Joshi


Mike,

Mike Smith wrote:
 The only real solution is to hot-plug
 the disks, camcontrol rescan, then dd zeroes over the heads of the disks
 and then re-label them safely.

Yep, that method works for now.  

I was hoping it would be easy enough to crack this myself (with some online 
tips & references) but John Baldwin convinced me otherwise :-)

So here's the entire configuration attached..

SUMMARY: 
The boot disk (ad0 in attachment) is not the problem.
There are two other IBM SCSI disks attached to two Adaptec cards.
It's these other two SCSI disks (da0, da1), which are empty and
whose disklabels I played with, that cause a BTX error if
they are plugged in during a boot.

TIA,
-Sandeep


---
Error message when the SCSI disk is attached to 
the AIC-7896 SCSI BIOS v2.20s1B1 

int=  err=  efl=00030246  eip=1d29
eax=  ebx=0386  ecx=  edx=
esi=9e3e  edi=1c09  ebp=038e  esp=0382
cs=c800  ds=0040  es=9e3e  fs=  gs=  ss=9e3e
cs:eip=f7 f1 33 d2 8a 4e f6 f7-f1 3d ff 03 76 03 b8 ff
ss:esp=00 00 3f 00 00 00 00 00-00 00 02 00 22 0a 00 c8
---
Error message when the SCSI disk is attached to 
the AHA2940U2W SCSI BIOS v2.20 :

int=  err=  efl=00030246  eip=1d29
eax=  ebx=038e  ecx=  edx=
esi=9e3e  edi=1a3e  ebp=0396  esp=038a
cs=cd80  ds=0040  es=9e3e  fs=  gs=  ss=9e3e
cs:eip=f7 f1 33 d2 8a 4e f6 f7-f1 3d ff 03 76 03 b8 ff
ss:esp=00 00 3f 00 00 00 00 00-00 00 02 00 df 09 80 cd
---

Dmesg:

Copyright (c) 1992-2001 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD 4.4-RELEASE #2: Mon Nov 12 13:12:46 EST 2001
[EMAIL PROTECTED]:/usr/src/sys/compile/NSTG7.SMP
Timecounter i8254  frequency 1193182 Hz
CPU: Pentium III/Pentium III Xeon/Celeron (696.41-MHz 686-class CPU)
  Origin = GenuineIntel  Id = 0x681  Stepping = 1
  
  Features=0x387fbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,PN,MMX,FXSR,SSE>
real memory  = 805240832 (786368K bytes)
avail memory = 778305536 (760064K bytes)
Programming 24 pins in IOAPIC #0
IOAPIC #0 intpin 2 - irq 0
FreeBSD/SMP: Multiprocessor motherboard
 cpu0 (BSP): apic id:  1, version: 0x00040011, at 0xfee0
 cpu1 (AP):  apic id:  0, version: 0x00040011, at 0xfee0
 io0 (APIC): apic id:  2, version: 0x00170011, at 0xfec0
Preloaded elf kernel kernel at 0xc0535000.
Pentium Pro MTRR support enabled
md0: Malloc disk
Using $PIR table, 12 entries at 0xc00fdf00
npx0: math processor on motherboard
npx0: INT 16 interface
pcib0: Intel 82443GX host to PCI bridge on motherboard
IOAPIC #0 intpin 19 - irq 2
IOAPIC #0 intpin 21 - irq 5
IOAPIC #0 intpin 16 - irq 9
pci0: PCI bus on pcib0
pcib2: Intel 82443GX (440 GX) PCI-PCI (AGP) bridge at device 1.0 on pci0
pci1: PCI bus on pcib2
pcib3: PCI to PCI bridge (vendor=1011 device=0023) at device 15.0 on pci1
IOAPIC #0 intpin 20 - irq 10
pci2: PCI bus on pcib3
pci2: unknown card (vendor=0x12ae, dev=0x0001) at 4.0 irq 10
ahc0: Adaptec aic7896/97 Ultra2 SCSI adapter port 0x2000-0x20ff mem 
0xf410-0xf4100fff irq 2 at device 12.0 on pci0
aic7896/97: Ultra2 Wide Channel A, SCSI Id=7, 32/255 SCBs
ahc1: Adaptec aic7896/97 Ultra2 SCSI adapter port 0x2400-0x24ff mem 
0xf4101000-0xf4101fff irq 2 at device 12.1 on pci0
aic7896/97: Ultra2 Wide Channel B, SCSI Id=4, 32/255 SCBs
fxp0: Intel Pro 10/100B/100+ Ethernet port 0x2c00-0x2c3f mem 
0xf400-0xf40f,0xf4102000-0xf4102fff irq 5 at device 14.0 on pci0
fxp0: Ethernet address 00:90:27:e0:18:5d
inphy0: i82555 10/100 media interface on miibus0
inphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
ahc2: Adaptec 2940 Ultra2 SCSI adapter port 0x2800-0x28ff mem 0xf4103000-0xf4103fff 
irq 9 at device 16.0 on pci0
aic7890/91: Ultra2 Wide Channel A, SCSI Id=7, 32/255 SCBs
isab0: Intel 82371AB PCI to ISA bridge at device 18.0 on pci0
isa0: ISA bus on isab0
atapci0: Intel PIIX4 ATA33 controller port 0x2c60-0x2c6f at device 18.1 on pci0
ata0: at 0x1f0 irq 14 on atapci0
ata1: at 0x170 irq 15 on atapci0
uhci0: Intel 82371AB/EB (PIIX4) USB controller port 0x2c40-0x2c5f irq 5 at device 
18.2 on pci0
usb0: Intel 82371AB/EB (PIIX4) USB controller on uhci0
usb0: USB revision 1.0
uhub0: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
Timecounter PIIX  frequency 3579545 Hz
chip1: Intel 82371AB Power management controller port 0x1040-0x104f at device 18.3 
on pci0
pci0: Cirrus Logic GD5480 SVGA controller at 20.0
pcib1: Intel 82443GX host to AGP bridge on motherboard
pci3: PCI bus on pcib1
orm0: Option ROMs at iomem 
0xc-0xc7fff,0xc8000-0xccfff,0xcd000-0xcd7ff,0xcd800-0xcdfff on isa0
fdc0: NEC 72065B or clone at port 

Re: Tracking down BTX halted

2001-11-16 Thread John Baldwin


On 16-Nov-01 Sandeep Joshi wrote:
 
 Mike,
 
 Mike Smith wrote:
 The only real solution is to hot-plug
 the disks, camcontrol rescan, then dd zeroes over the heads of the disks
 and then re-label them safely.
 
 Yep, that method works for now.  
 
 I was hoping its easy enough to crack this myself (with some online 
 tips  references) but John Baldwin convinced me otherwise :-)
 
 So here's the entire configuration attached..
 
 SUMMARY: 
 The boot disk (ad0 in attachment) is not the problem.
 There are two other IBM SCSI disks attached to two Adaptec cards.
 Its these other two SCSI disks-da0,da1 which are empty and
 whose disklabels I played with.  These cause a BTX error if
 they are plugged in during a boot.

Were these disks ever formatted in dangerously dedicated mode?  If so, you
would need to re-fdisk them to get the bogus fdisk table out of the way.
dd'ing zeros over the table is one way of doing this, although you will still
need to fdisk afterwards.

-- 

John Baldwin [EMAIL PROTECTED]http://www.FreeBSD.org/~jhb/
Power Users Use the Power to Serve!  -  http://www.FreeBSD.org/




Re: Tracking down BTX halted

2001-11-16 Thread Sandeep Joshi

John Baldwin wrote:
 
  SUMMARY:
  The boot disk (ad0 in attachment) is not the problem.
  There are two other IBM SCSI disks attached to two Adaptec cards.
  Its these other two SCSI disks-da0,da1 which are empty and
  whose disklabels I played with.  These cause a BTX error if
  they are plugged in during a boot.
 
 Were these disks ever formatted with dangerously dedicated mode.  If so, you
 would need to re-fdisk them to get the bogus fdisk table out of the way.
 dd'ing zeros over the table is one way of doing this, although you will still
 need to fdisk afterwards.

They had a vinum partition but I can't recall if they were dedicated.

I had dd'ed zeroes all over but had forgotten fdisk!
Yes, it works now :-)

thanks again
-Sandeep

 
 --
 
 John Baldwin [EMAIL PROTECTED]http://www.FreeBSD.org/~jhb/
 Power Users Use the Power to Serve!  -  http://www.FreeBSD.org/




Re: Tracking down BTX halted

2001-11-16 Thread Mike Smith

 I was hoping its easy enough to crack this myself (with some online 
 tips  references) but John Baldwin convinced me otherwise :-)

The evidence suggests that my original analysis is correct:

 Error message when the SCSI disk is attached to 
 the AIC-7896 SCSI BIOS v2.20s1B1 
...
 cs=c800  ds=0040  ed=9e3efs=  gs=  ss=9e3e
...
 Error message when the SCSI disk is attached to 
 the AHA2940U2W SCSI BIOS v2.20 :
...
 cs=cd80  ds=0040  ed=9e3efs=  gs=  ss=9e3e

Both of these are failures inside the BIOS on these SCSI cards.

You'll have to come up with a correct MBR partition table for these
controllers.  You might get away with 'disklabel auto', but if not, you'll
have to use fdisk.





Re: Tracking down BTX halted

2001-11-16 Thread John Baldwin


On 16-Nov-01 Mike Smith wrote:
 I was hoping its easy enough to crack this myself (with some online 
 tips  references) but John Baldwin convinced me otherwise :-)
 
 The evidence suggests that my original analysis is correct:
 
 Error message when the SCSI disk is attached to 
 the AIC-7896 SCSI BIOS v2.20s1B1 
 ...
 cs=c800  ds=0040  ed=9e3efs=  gs=  ss=9e3e
 ...
 Error message when the SCSI disk is attached to 
 the AHA2940U2W SCSI BIOS v2.20 :
 ...
 cs=cd80  ds=0040  ed=9e3efs=  gs=  ss=9e3e
 
 Both of these are failures inside the BIOS on these SCSI cards.
 
 You'll have to come up with a correct MBR partition table for these
 controllers.  You might get away with 'disklabel auto', but if not, you'll
 have to use fdisk.

Nah, 'disklabel auto' is DD mode, so that will hose him.  'fdisk -I da0 ;
disklabel da0s1 auto' is what you want.

-- 

John Baldwin [EMAIL PROTECTED]http://www.FreeBSD.org/~jhb/
Power Users Use the Power to Serve!  -  http://www.FreeBSD.org/




Re: Tracking down BTX halted

2001-11-16 Thread Greg Black

Doug White wrote:

| On Fri, 16 Nov 2001, Sandeep Joshi wrote:
| 
|  I changed the disklabels on a few SCSI disks and now
|  I keep getting these BTX halted messages every time
|  I reboot.
| 
| Lemme guess, you're running them in 'dangerously dedicated' mode.
| 
| There is a bug in Adaptec BIOSen that they will not tolerate DD disks.

This may be true for some Adaptec controllers, but is certainly
not true for all of them.  I run a mixture of SCSI and IDE disks
which have always been dangerously dedicated since day 1 (which
is around 10 years ago).  All my SCSI disks have always run on
Adaptec controllers of various models and not one has had any
kind of problem.

That doesn't mean that all Adaptec controllers like DD disks,
but it certainly shows that some are comfortable with them.
