Re: Tracking down BTX halted
On Sat, Nov 17, 2001 at 05:41:25PM -0800, Peter Wemm wrote:
> The problem is that you can't *not* get dangerously-dedicated mode. Our
> boot1 has got a dangerously-dedicated fdisk table unconditionally
> compiled in. You can fix it so that it doesn't crash stuff, but we
> still shouldn't be forcing it on people like that.

Why haven't we just removed it?

--
-- David ([EMAIL PROTECTED])

To Unsubscribe: send mail to [EMAIL PROTECTED]
with unsubscribe freebsd-hackers in the body of the message
Re: Tracking down BTX halted
: There is a bug in Adaptec BIOSen that they will not tolerate DD disks.
:
: Which controllers have this bug? I've got a whole bunch of 7880 and 79xx
: controllers with disks running in DD mode and never have had this problem.
:
: Happens to me on L440GX+ boards.
:
:Also happens on IBM Netfinity servers with aic7896/97 controllers. It
:*may* be the case that this only happens with certain Adaptec BIOS
:versions, but it's very real.
:
:Steinar Haug, Nethelp consulting, [EMAIL PROTECTED]

    I started getting this on Dells a year and a half ago. The
    dangerously dedicated partition was to blame (which is what
    eventually led to my fixing the disklabel auto code). Well, ok, the
    Adaptec BIOS was to blame, but there isn't much we can do about it
    so...

    In my case it wasn't enough to repartition the disk. The old data
    still screwed up the BIOS. I had to physically wipe (with dd) the
    first couple of sectors before repartitioning it using the
    fdisk -BI / disklabel -r -w da0s1 auto combination.

                                        -Matt
                                        Matthew Dillon
                                        [EMAIL PROTECTED]
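The wipe step described above can be rehearsed on a scratch image file before pointing dd at a real disk. What follows is only a minimal Python sketch, assuming 512-byte sectors; the helper names `wipe_leading_sectors` and `make_scratch_image` are hypothetical and nothing here touches an actual device.

```python
import os
import tempfile

SECTOR_SIZE = 512  # assumption: classic 512-byte sectors


def wipe_leading_sectors(path, count=2, sector_size=SECTOR_SIZE):
    """Overwrite the first `count` sectors of `path` with zeroes, in place."""
    with open(path, "r+b") as disk:
        disk.seek(0)
        disk.write(bytes(count * sector_size))
        disk.flush()
        os.fsync(disk.fileno())


def make_scratch_image(sectors=8):
    """Create a throwaway image filled with 0xff so the wipe is visible."""
    fd, path = tempfile.mkstemp(suffix=".img")
    with os.fdopen(fd, "wb") as image:
        image.write(b"\xff" * (sectors * SECTOR_SIZE))
    return path


if __name__ == "__main__":
    image_path = make_scratch_image()
    wipe_leading_sectors(image_path, count=2)
    with open(image_path, "rb") as image:
        data = image.read()
    # The first two sectors are now zero; the rest is untouched.
    print(data[:1024] == bytes(1024), data[1024:1028].hex())
    os.remove(image_path)
```

The image still needs a fresh fdisk table afterwards, exactly as the message says; zeroing only removes the stale data that upsets the BIOS.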
Re: Tracking down BTX halted
On Fri, 16 Nov 2001, Matthew Emmerton wrote:

> > There is a bug in Adaptec BIOSen that they will not tolerate DD disks.
>
> Which controllers have this bug? I've got a whole bunch of 7880 and 79xx
> controllers with disks running in DD mode and never have had this problem.

Happens to me on L440GX+ boards.

Doug White                    |  FreeBSD: The Power to Serve
[EMAIL PROTECTED]             |  www.FreeBSD.org
Re: Tracking down BTX halted
> > > There is a bug in Adaptec BIOSen that they will not tolerate DD disks.
> >
> > Which controllers have this bug? I've got a whole bunch of 7880 and 79xx
> > controllers with disks running in DD mode and never have had this problem.
>
> Happens to me on L440GX+ boards.

Also happens on IBM Netfinity servers with aic7896/97 controllers. It
*may* be the case that this only happens with certain Adaptec BIOS
versions, but it's very real.

Steinar Haug, Nethelp consulting, [EMAIL PROTECTED]
Re: Tracking down BTX halted
Doug White wrote:
> On Fri, 16 Nov 2001, Sandeep Joshi wrote:
>
> > I changed the disklabels on a few SCSI disks and now
> > I keep getting these BTX halted messages every time
> > I reboot.
>
> Lemme guess, you're running them in 'dangerously dedicated' mode.
>
> There is a bug in Adaptec BIOSen that they will not tolerate DD disks.
> Put proper partition tables on them and they should behave.

It's not so much a BIOS bug as the fact that we specify an illegal
geometry in the fake fdisk table. This can cause BIOSes that are about
to use the fdisk table to emulate C/H/S geometry to do a divide by zero.
This is usually what shows up in the BTX faults.

Both of the posted BTX faults have int= which is a divide-by-zero.

See i386/i386/machdep.c for identifying the int values:

        setidt(0, IDTVEC(div), SDT_SYS386TGT, SEL_KPL, GSEL(GCODE_SEL, SEL_KPL));
              ^^^ int 0 = 'div' (divide by zero)

The specific problem is that the fake fdisk table specifies 256 heads,
which is not possible to represent in the int13 interface. The maximum
allowed is 255 heads. (Hint: 256 in an 8-bit register wraps to 0, hence
the divide-by-zero.)

This is the cause of EFI (ia64) hangs, the infamous Thinkpad T20/A20
series system lockups, countless BIOS crashes, etc.

The problem is that you can't *not* get dangerously-dedicated mode. Our
boot1 has got a dangerously-dedicated fdisk table unconditionally
compiled in. You can fix it so that it doesn't crash stuff, but we
still shouldn't be forcing it on people like that.

Cheers,
-Peter
--
Peter Wemm - [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]
"All of this is for nothing if we don't go to the stars" - JMS/B5
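The 8-bit wrap described above is easy to reproduce. The following Python sketch is only an illustration, assuming a BIOS derives the head count as "last head index + 1" held in an 8-bit register and then divides by it during C/H/S translation; the function names are hypothetical and the real Adaptec code is not known.

```python
def heads_seen_by_8bit_bios(last_head):
    """Head count = last head index + 1, truncated to 8 bits (assumed BIOS logic)."""
    return (last_head + 1) & 0xFF


def lba_to_chs(lba, heads, sectors_per_track):
    """Textbook LBA to C/H/S translation; divides by heads * sectors."""
    cylinder = lba // (heads * sectors_per_track)
    head = (lba // sectors_per_track) % heads
    sector = lba % sectors_per_track + 1
    return cylinder, head, sector


if __name__ == "__main__":
    for last_head in (0xFE, 0xFF):
        heads = heads_seen_by_8bit_bios(last_head)
        try:
            print(hex(last_head), heads, lba_to_chs(1000, heads, 63))
        except ZeroDivisionError:
            # Python's analogue of the int 0 fault reported by BTX.
            print(hex(last_head), heads, "divide by zero")
```

An end-head byte of 0xfe gives 255 heads and a normal translation; 0xff gives 0 heads and the division traps.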
Re: Tracking down BTX halted
Doug White wrote:
> On Fri, 16 Nov 2001, Matthew Emmerton wrote:
>
> > > There is a bug in Adaptec BIOSen that they will not tolerate DD disks.
> >
> > Which controllers have this bug? I've got a whole bunch of 7880 and 79xx
> > controllers with disks running in DD mode and never have had this problem.
>
> Happens to me on L440GX+ boards.

It happens randomly all over the place. I think it also depends on the
default BIOS settings, i.e. whether you've got Large Drive support on
or off, and what mode it is in.

FWIW, this puts a legal geometry back into boot1:

+++ boot1.s     2001/11/18 01:42:26
@@ -353,7 +353,7 @@
                .fill 0x30,0x1,0x0
 part4:         .byte 0x80, 0x00, 0x01, 0x00
-               .byte 0xa5, 0xff, 0xff, 0xff
+               .byte 0xa5, 0xfe, 0xff, 0xff    # 1023 cyl, 255 heads, 63 sec
                .byte 0x00, 0x00, 0x00, 0x00
                .byte 0x50, 0xc3, 0x00, 0x00    # 50000 sectors long, bleh

But that stops the MBR kernel code from recognizing it as a bogus DD
table and it will try to interpret it, thinking that you only have a 25
meg drive.

I wish we could take this crud out with a shotgun. If somebody is going
to boot off a disk, obey the rules with a real MBR and fdisk table. If
somebody doesn't want an fdisk table, then don't fudge around with a
fake one, and don't pretend that it is bootable.

Cheers,
-Peter
--
Peter Wemm - [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]
"All of this is for nothing if we don't go to the stars" - JMS/B5
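To see what the sixteen bytes of part4 in the diff above actually say, they can be unpacked with the standard MBR partition-entry layout. This is a rough Python sketch; `decode_partition_entry` is a hypothetical helper, and 512-byte sectors are assumed for the size calculation.

```python
import struct

SECTOR_SIZE = 512  # assumption: 512-byte sectors


def decode_partition_entry(entry):
    """Decode one 16-byte MBR partition table entry into a dict."""
    (flag, s_head, s_sec_cyl, s_cyl, ptype,
     e_head, e_sec_cyl, e_cyl, start_lba, sectors) = struct.unpack("<8BII", entry)
    return {
        "active": flag == 0x80,
        "type": ptype,
        "end_head": e_head,
        "heads": e_head + 1,                 # implied head count
        "end_sector": e_sec_cyl & 0x3F,      # low 6 bits
        "end_cylinder": ((e_sec_cyl & 0xC0) << 2) | e_cyl,
        "start_lba": start_lba,
        "sectors": sectors,
        "size_bytes": sectors * SECTOR_SIZE,
    }


# part4 before and after the patch, bytes copied from the diff.
OLD_PART4 = bytes([0x80, 0x00, 0x01, 0x00, 0xa5, 0xff, 0xff, 0xff,
                   0x00, 0x00, 0x00, 0x00, 0x50, 0xc3, 0x00, 0x00])
NEW_PART4 = bytes([0x80, 0x00, 0x01, 0x00, 0xa5, 0xfe, 0xff, 0xff,
                   0x00, 0x00, 0x00, 0x00, 0x50, 0xc3, 0x00, 0x00])

if __name__ == "__main__":
    for name, raw in (("old", OLD_PART4), ("new", NEW_PART4)):
        info = decode_partition_entry(raw)
        print(name, info["heads"], info["end_cylinder"],
              info["end_sector"], info["sectors"], info["size_bytes"])
```

The old entry implies 256 heads and the patched one 255, while both claim 50000 sectors (25,600,000 bytes), which is where the "25 meg drive" misreading comes from.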
RE: Tracking down BTX halted
On 16-Nov-01 Sandeep Joshi wrote:
> I changed the disklabels on a few SCSI disks and now
> I keep getting these BTX halted messages every time
> I reboot.
>
> They don't occur if I disconnect those disks. They occur
> even after I rewrite those labels. It's not dedicated
> mode or whatever now..
>
> I am currently running 4.4-REL.
>
> I am willing to post the entire system configuration
> but it would be really nice if instead someone could
> tell me _HOW_ to determine the problem.
>
> A colleague tells me it's possible to track the problem
> from the registers (cs, es, ..) in the message dump.

Yes, especially if you provide that info in the e-mail. :)

--
John Baldwin [EMAIL PROTECTED]  http://www.FreeBSD.org/~jhb/
Power Users Use the Power to Serve! - http://www.FreeBSD.org/
Re: Tracking down BTX halted
> I changed the disklabels on a few SCSI disks and now
> I keep getting these BTX halted messages every time
> I reboot.

The most likely cause of this is that you're messing up the disks to
the point that your BIOS (probably your SCSI controller BIOS) is
crashing when it tries to read them.

> They don't occur if I disconnect those disks. They occur
> even after I rewrite those labels. It's not dedicated
> mode or whatever now..

You must have missed something. 8)

> I am willing to post the entire system configuration
> but it would be really nice if instead someone could
> tell me _HOW_ to determine the problem.

If the above guess is correct, you're looking at a BIOS bug. You
might try updating your SCSI controller/motherboard BIOS, but there's
no guarantee that'll fix the problem.

The only real solution is to hot-plug the disks, camcontrol rescan,
then dd zeroes over the heads of the disks and then re-label them
safely.

> A colleague tells me it's possible to track the problem
> from the registers (cs, es, ..) in the message dump.

That would tell you where, exactly, the crash is occurring, and would
make it possible to absolutely finger the culprit.

What's the value of the cs register? If it's above 0xa000, then the
BIOS is at fault.

--
... every activity meets with opposition, everyone who acts has his
rivals and unfortunately opponents also. But not because people want
to be opponents, rather because the tasks and relationships force
people to take different points of view. [Dr. Fritz Todt]
V I C T O R Y   N O T   V E N G E A N C E
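The cs rule of thumb above follows from real-mode addressing: a segment value is shifted left by four bits, so segment 0xa000 is linear address 0xa0000, the 640 KB mark where conventional RAM ends and video memory, option ROMs and the system BIOS begin. A small Python sketch of that check, with hypothetical function names:

```python
def real_mode_linear(segment, offset):
    """Linear address of segment:offset in real mode (segment * 16 + offset)."""
    return (segment << 4) + offset


def looks_like_bios_segment(cs):
    """True when cs points at or above the 640 KB mark (segment 0xa000)."""
    return cs >= 0xA000


if __name__ == "__main__":
    for cs in (0x0040, 0x9E3E, 0xC800, 0xF000):
        print(hex(cs), hex(real_mode_linear(cs, 0)), looks_like_bios_segment(cs))
```

A fault with cs in that upper range points at ROM code rather than at the boot blocks or loader, which live in conventional memory.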
Re: Tracking down BTX halted
On Fri, 16 Nov 2001, Sandeep Joshi wrote:

> I changed the disklabels on a few SCSI disks and now
> I keep getting these BTX halted messages every time
> I reboot.

Lemme guess, you're running them in 'dangerously dedicated' mode.

There is a bug in Adaptec BIOSen that they will not tolerate DD disks.
Put proper partition tables on them and they should behave.

Doug White                    |  FreeBSD: The Power to Serve
[EMAIL PROTECTED]             |  www.FreeBSD.org
Re: Tracking down BTX halted
Doug White wrote:
> On Fri, 16 Nov 2001, Sandeep Joshi wrote:
>
> > I changed the disklabels on a few SCSI disks and now
> > I keep getting these BTX halted messages every time
> > I reboot.
>
> Lemme guess, you're running them in 'dangerously dedicated' mode.
>
> There is a bug in Adaptec BIOSen that they will not tolerate DD disks.

Which controllers have this bug? I've got a whole bunch of 7880 and 79xx
controllers with disks running in DD mode and never have had this problem.

--
Matt Emmerton
Re: Tracking down BTX halted
Mike,

Mike Smith wrote:
> The only real solution is to hot-plug the disks, camcontrol rescan,
> then dd zeroes over the heads of the disks and then re-label them
> safely.

Yep, that method works for now.

I was hoping it's easy enough to crack this myself (with
some online tips and references) but John Baldwin convinced me
otherwise :-)
So here's the entire configuration attached..

SUMMARY:
The boot disk (ad0 in attachment) is not the problem.
There are two other IBM SCSI disks attached to two
Adaptec cards. It's these other two SCSI disks (da0, da1)
which are empty and whose disklabels I played with.
These cause a BTX error if they are plugged in during a boot.

TIA,
-Sandeep

---
Error message when the SCSI disk is attached to the
AIC-7896 SCSI BIOS v2.20s1B1

int= err= efl=00030246 eip=1d29
eax= ebx=0386 ecx= edx=
esi=9e3e edi=1c09 ebp=038e esp=0382
cs=c800 ds=0040 es=9e3e fs= gs= ss=9e3e
cs:eip=f7 f1 33 d2 8a 4e f6 f7-f1 3d ff 03 76 03 b8 ff
ss:esp=00 00 3f 00 00 00 00 00-00 00 02 00 22 0a 00 c8

---
Error message when the SCSI disk is attached to the
AHA2940U2W SCSI BIOS v2.20 :

int= err= efl=00030246 eip=1d29
eax= ebx=038e ecx= edx=
esi=9e3e edi=1a3e ebp=0396 esp=038a
cs=cd80 ds=0040 es=9e3e fs= gs= ss=9e3e
cs:eip=f7 f1 33 d2 8a 4e f6 f7-f1 3d ff 03 76 03 b8 ff
ss:esp=00 00 3f 00 00 00 00 00-00 00 02 00 df 09 80 cd

---
Dmesg:

Copyright (c) 1992-2001 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
        The Regents of the University of California. All rights reserved.
FreeBSD 4.4-RELEASE #2: Mon Nov 12 13:12:46 EST 2001
    [EMAIL PROTECTED]:/usr/src/sys/compile/NSTG7.SMP
Timecounter i8254 frequency 1193182 Hz
CPU: Pentium III/Pentium III Xeon/Celeron (696.41-MHz 686-class CPU)
  Origin = GenuineIntel Id = 0x681 Stepping = 1
  Features=0x387fbff FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,PN,MMX,FXSR,SSE
real memory = 805240832 (786368K bytes)
avail memory = 778305536 (760064K bytes)
Programming 24 pins in IOAPIC #0
IOAPIC #0 intpin 2 - irq 0
FreeBSD/SMP: Multiprocessor motherboard
 cpu0 (BSP): apic id: 1, version: 0x00040011, at 0xfee0
 cpu1 (AP): apic id: 0, version: 0x00040011, at 0xfee0
 io0 (APIC): apic id: 2, version: 0x00170011, at 0xfec0
Preloaded elf kernel kernel at 0xc0535000.
Pentium Pro MTRR support enabled
md0: Malloc disk
Using $PIR table, 12 entries at 0xc00fdf00
npx0: math processor on motherboard
npx0: INT 16 interface
pcib0: Intel 82443GX host to PCI bridge on motherboard
IOAPIC #0 intpin 19 - irq 2
IOAPIC #0 intpin 21 - irq 5
IOAPIC #0 intpin 16 - irq 9
pci0: PCI bus on pcib0
pcib2: Intel 82443GX (440 GX) PCI-PCI (AGP) bridge at device 1.0 on pci0
pci1: PCI bus on pcib2
pcib3: PCI to PCI bridge (vendor=1011 device=0023) at device 15.0 on pci1
IOAPIC #0 intpin 20 - irq 10
pci2: PCI bus on pcib3
pci2: unknown card (vendor=0x12ae, dev=0x0001) at 4.0 irq 10
ahc0: Adaptec aic7896/97 Ultra2 SCSI adapter port 0x2000-0x20ff mem 0xf410-0xf4100fff irq 2 at device 12.0 on pci0
aic7896/97: Ultra2 Wide Channel A, SCSI Id=7, 32/255 SCBs
ahc1: Adaptec aic7896/97 Ultra2 SCSI adapter port 0x2400-0x24ff mem 0xf4101000-0xf4101fff irq 2 at device 12.1 on pci0
aic7896/97: Ultra2 Wide Channel B, SCSI Id=4, 32/255 SCBs
fxp0: Intel Pro 10/100B/100+ Ethernet port 0x2c00-0x2c3f mem 0xf400-0xf40f,0xf4102000-0xf4102fff irq 5 at device 14.0 on pci0
fxp0: Ethernet address 00:90:27:e0:18:5d
inphy0: i82555 10/100 media interface on miibus0
inphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
ahc2: Adaptec 2940 Ultra2 SCSI adapter port 0x2800-0x28ff mem 0xf4103000-0xf4103fff irq 9 at device 16.0 on pci0
aic7890/91: Ultra2 Wide Channel A, SCSI Id=7, 32/255 SCBs
isab0: Intel 82371AB PCI to ISA bridge at device 18.0 on pci0
isa0: ISA bus on isab0
atapci0: Intel PIIX4 ATA33 controller port 0x2c60-0x2c6f at device 18.1 on pci0
ata0: at 0x1f0 irq 14 on atapci0
ata1: at 0x170 irq 15 on atapci0
uhci0: Intel 82371AB/EB (PIIX4) USB controller port 0x2c40-0x2c5f irq 5 at device 18.2 on pci0
usb0: Intel 82371AB/EB (PIIX4) USB controller on uhci0
usb0: USB revision 1.0
uhub0: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
Timecounter PIIX frequency 3579545 Hz
chip1: Intel 82371AB Power management controller port 0x1040-0x104f at device 18.3 on pci0
pci0: Cirrus Logic GD5480 SVGA controller at 20.0
pcib1: Intel 82443GX host to AGP bridge on motherboard
pci3: PCI bus on pcib1
orm0: Option ROMs at iomem 0xc-0xc7fff,0xc8000-0xccfff,0xcd000-0xcd7ff,0xcd800-0xcdfff on isa0
fdc0: NEC 72065B or clone at port
Re: Tracking down BTX halted
On 16-Nov-01 Sandeep Joshi wrote:
> Mike,
>
> Mike Smith wrote:
> > The only real solution is to hot-plug the disks, camcontrol rescan,
> > then dd zeroes over the heads of the disks and then re-label them
> > safely.
>
> Yep, that method works for now.
>
> I was hoping it's easy enough to crack this myself (with
> some online tips and references) but John Baldwin convinced me
> otherwise :-)
> So here's the entire configuration attached..
>
> SUMMARY:
> The boot disk (ad0 in attachment) is not the problem.
> There are two other IBM SCSI disks attached to two
> Adaptec cards. It's these other two SCSI disks (da0, da1)
> which are empty and whose disklabels I played with.
> These cause a BTX error if they are plugged in during a boot.

Were these disks ever formatted with dangerously dedicated mode? If so,
you would need to re-fdisk them to get the bogus fdisk table out of the
way. dd'ing zeros over the table is one way of doing this, although you
will still need to fdisk afterwards.

--
John Baldwin [EMAIL PROTECTED]  http://www.FreeBSD.org/~jhb/
Power Users Use the Power to Serve! - http://www.FreeBSD.org/
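Whether a disk still carries the bogus table can be guessed from a copy of its first sector. The Python sketch below is a heuristic only, built on the part4 bytes from the boot1.s diff elsewhere in this thread (type 0xa5, start 0, 50000 sectors, in the fourth slot); it is not the kernel's actual check, and the names are hypothetical.

```python
import struct

SECTOR_SIZE = 512
TABLE_OFFSET = 446          # partition table starts here in sector 0
ENTRY_SIZE = 16
FOURTH_ENTRY = TABLE_OFFSET + 3 * ENTRY_SIZE


def looks_dangerously_dedicated(sector0):
    """Heuristic: does the fourth fdisk entry match boot1's fake one?"""
    if len(sector0) < SECTOR_SIZE or sector0[510:512] != b"\x55\xaa":
        return False
    entry = sector0[FOURTH_ENTRY:FOURTH_ENTRY + ENTRY_SIZE]
    ptype = entry[4]
    start_lba, sectors = struct.unpack("<II", entry[8:16])
    return ptype == 0xA5 and start_lba == 0 and sectors == 50000


def sample_sector(entry4):
    """Build a test sector 0 holding only a fourth entry and the signature."""
    sector = bytearray(SECTOR_SIZE)
    sector[FOURTH_ENTRY:FOURTH_ENTRY + ENTRY_SIZE] = entry4
    sector[510:512] = b"\x55\xaa"
    return bytes(sector)


FAKE_ENTRY = bytes([0x80, 0x00, 0x01, 0x00, 0xa5, 0xff, 0xff, 0xff,
                    0x00, 0x00, 0x00, 0x00, 0x50, 0xc3, 0x00, 0x00])

if __name__ == "__main__":
    print(looks_dangerously_dedicated(sample_sector(FAKE_ENTRY)))
    print(looks_dangerously_dedicated(bytes(SECTOR_SIZE)))
```

A zeroed sector fails the check, which matches the advice that dd alone clears the old table but leaves nothing valid behind, so fdisk is still required.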
Re: Tracking down BTX halted
John Baldwin wrote:
> > SUMMARY:
> > The boot disk (ad0 in attachment) is not the problem.
> > There are two other IBM SCSI disks attached to two
> > Adaptec cards. It's these other two SCSI disks (da0, da1)
> > which are empty and whose disklabels I played with.
> > These cause a BTX error if they are plugged in during a boot.
>
> Were these disks ever formatted with dangerously dedicated mode? If so,
> you would need to re-fdisk them to get the bogus fdisk table out of the
> way. dd'ing zeros over the table is one way of doing this, although you
> will still need to fdisk afterwards.

They had a vinum partition but I can't recall if they were dedicated.
I had dd'ed zeroes all over but had forgotten fdisk!

Yes, it works now :-)

thanks again
-Sandeep
Re: Tracking down BTX halted
> I was hoping it's easy enough to crack this myself (with
> some online tips and references) but John Baldwin convinced me
> otherwise :-)

The evidence suggests that my original analysis is correct:

> Error message when the SCSI disk is attached to the
> AIC-7896 SCSI BIOS v2.20s1B1
...
> cs=c800 ds=0040 es=9e3e fs= gs= ss=9e3e
...
> Error message when the SCSI disk is attached to the
> AHA2940U2W SCSI BIOS v2.20 :
...
> cs=cd80 ds=0040 es=9e3e fs= gs= ss=9e3e

Both of these are failures inside the BIOS on these SCSI cards.

You'll have to come up with a correct MBR partition table for these
controllers. You might get away with 'disklabel auto', but if not,
you'll have to use fdisk.
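The reading of the dumps above can be mechanised. This is a rough Python sketch, not a general BTX parser: it uses the AIC-7896 dump as posted in this thread (blank fields left blank, and the segment register that appears as "ed" in the post written as es), pulls out the registers that have values, computes the real-mode address of the fault, and checks whether the first instruction bytes are an x86 DIV (opcode F7 with /6 in the ModRM byte). All function names are hypothetical.

```python
import re

DUMP = """\
int= err= efl=00030246 eip=1d29
eax= ebx=0386 ecx= edx=
esi=9e3e edi=1c09 ebp=038e esp=0382
cs=c800 ds=0040 es=9e3e fs= gs= ss=9e3e
cs:eip=f7 f1 33 d2 8a 4e f6 f7-f1 3d ff 03 76 03 b8 ff
"""


def parse_registers(dump):
    """Collect name=hexvalue pairs; fields with no value are skipped."""
    pairs = re.findall(r"(?<![\w:])([a-z]{2,3})=([0-9a-f]+)", dump)
    return {name: int(value, 16) for name, value in pairs}


def code_bytes(dump):
    """Return the instruction bytes listed on the cs:eip= line."""
    for line in dump.splitlines():
        if line.startswith("cs:eip="):
            return bytes.fromhex(line.split("=", 1)[1].replace("-", " "))
    return b""


def fault_address(regs):
    """Real-mode linear address of cs:eip."""
    return (regs["cs"] << 4) + regs["eip"]


def is_div_instruction(code):
    """True if the bytes start with F7 /6 (unsigned DIV, 16/32-bit operand)."""
    return len(code) >= 2 and code[0] == 0xF7 and ((code[1] >> 3) & 0x7) == 6


if __name__ == "__main__":
    regs = parse_registers(DUMP)
    print(hex(fault_address(regs)), is_div_instruction(code_bytes(DUMP)))
```

For this dump the fault lands at 0xc9d29, inside the option ROM area, and the faulting bytes f7 f1 decode as a DIV by the CX register, consistent with a divide error raised in the controller BIOS.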
Re: Tracking down BTX halted
On 16-Nov-01 Mike Smith wrote:
>> I was hoping it's easy enough to crack this myself (with
>> some online tips and references) but John Baldwin convinced me
>> otherwise :-)
>
> The evidence suggests that my original analysis is correct:
>
>> Error message when the SCSI disk is attached to the
>> AIC-7896 SCSI BIOS v2.20s1B1
> ...
>> cs=c800 ds=0040 es=9e3e fs= gs= ss=9e3e
> ...
>> Error message when the SCSI disk is attached to the
>> AHA2940U2W SCSI BIOS v2.20 :
> ...
>> cs=cd80 ds=0040 es=9e3e fs= gs= ss=9e3e
>
> Both of these are failures inside the BIOS on these SCSI cards.
>
> You'll have to come up with a correct MBR partition table for these
> controllers. You might get away with 'disklabel auto', but if not,
> you'll have to use fdisk.

Nah, 'disklabel auto' is DD mode, so that will hose him.
'fdisk -I da0 ; disklabel da0s1 auto' is what you want.

--
John Baldwin [EMAIL PROTECTED]  http://www.FreeBSD.org/~jhb/
Power Users Use the Power to Serve! - http://www.FreeBSD.org/
Re: Tracking down BTX halted
Doug White wrote:
| On Fri, 16 Nov 2001, Sandeep Joshi wrote:
|
| > I changed the disklabels on a few SCSI disks and now
| > I keep getting these BTX halted messages every time
| > I reboot.
|
| Lemme guess, you're running them in 'dangerously dedicated' mode.
|
| There is a bug in Adaptec BIOSen that they will not tolerate DD disks.

This may be true for some Adaptec controllers, but is certainly not true
for all of them. I run a mixture of SCSI and IDE disks which have always
been dangerously dedicated since day 1 (which is around 10 years ago).
All my SCSI disks have always run on Adaptec controllers of various
models and not one has had any kind of problem.

That doesn't mean that all Adaptec controllers like DD disks, but it
certainly shows that some are comfortable with them.