Re: PANIC: vinum / atacontrol (5.0-STABLE / 4.8-RC2)
On Saturday, 29 March 2003 at 10:45:43 +, james wrote:
> Hi Greg
>
> Thanks for your response!
>
>> #6  0xc0220332 in dsioctl (dev=0xc1383200, cmd=0x8004646d, data=0xcb307d58 "\001",
>>     flags=0x2, sspp=0xc1328568)
>>     at ../../kern/subr_diskslice.c:356
>> #7  0xc021fd5b in diskioctl (dev=0xc1383200, cmd=0x8004646d, data=0xcb307d58
>>     "\001", fflag=0x2, p=0xca10fee0)
>>     at ../../kern/subr_disk.c:267
>> #8  0xc140b5af in ?? ()
>> #9  0xc140969b in ?? ()
>> #10 0xc140988e in ?? ()
>> #11 0xc140c461 in ?? ()
>> #12 0xc024efa2 in spec_ioctl (ap=0xcb307de4) at ../../miscfs/specfs/spec_vnops.c:306
>>
>> This is different from the other crash.  It looks like it happens in
>> Vinum.  Take a look at vinum(4) or
>> http://www.vinumvm.org/vinum/how-to-debug.html for details of how to
>> bring life into them.
>
> I have been looking at this page, and I'm not clear what else is needed, or what
> you mean by "bringing life into them".  I followed the steps to analyse a kernel
> panic, and provided the output... unfortunately I don't have the knowledge to
> fix it myself.

You need to load the symbols from the kld.  Both pages tell you how to
do that.

>> It's not clear what's trying to write the label, but looking at the
>> locals of the dsioctl frame would help.
>
> I provided this output in gdb.txt - the code and locals for frame 6 are:
>
> (kgdb) f 6
> #6  0xc0220332 in dsioctl (dev=0xc1383200, cmd=0x8004646d, data=0xcb307d58
>     "\001", flags=0x2, sspp=0xc1328568)
>     at ../../kern/subr_diskslice.c:356
> 356             sp = &ssp->dss_slices[slice];
>
> (kgdb) l
> 351             struct diskslices *ssp;
> 352             struct partition *pp;
> 353
> 354             slice = dkslice(dev);
> 355             ssp = *sspp;
> 356             sp = &ssp->dss_slices[slice];
> 357             lp = sp->ds_label;
> 358             switch (cmd) {
> 359
> 360             case DIOCGDVIRGIN:
>
> cmd = 0x0
> data = 0xcb307d58 "\001"
> flags = 0x2
> error = 0xcb307d58
> lp = (struct disklabel *) 0x
> old_wlabel = 0x0
> openmask = 0x4b
> part = 0x2
> slice = 0x2
> sp = (struct diskslice *) 0x8c
> ssp = (struct diskslices *) 0x0
>
> I must admit, I was a little confused as to why cmd was suddenly 0x0
> considering that dsioctl was called with cmd = 0x8004646d,

This is possibly a strangeness of gdb.  It's not relevant, however.

> but as I'm no kernel hacker there's probably a very good reason for
> it!  Is it possible it's being overwritten and that's why we panic?

No.  Look at the line where it happened:

> 356             sp = &ssp->dss_slices[slice];

Now look at ssp:

> ssp = (struct diskslices *) 0x0

ssp is taken indirectly from sspp:

> 355             ssp = *sspp;

So the caller is passing invalid data in sspp.  That's potentially
Vinum; you need to get those symbols loaded and find out what Vinum is
doing.

To the rest of -questions: is this interesting?  If not, I'll take it
offline.  I need at least one "yes please" reply to continue beyond
the next exchange of messages.

Greg
--
When replying to this message, please copy the original recipients.
If you don't, I may ignore the reply or reply to the original recipients.
For more information, see http://www.lemis.com/questions.html
See complete headers for address and phone numbers
___
[EMAIL PROTECTED] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to "[EMAIL PROTECTED]"
Re: PANIC: vinum / atacontrol (5.0-STABLE / 4.8-RC2)
Hi Greg

Thanks for your response!

> #6  0xc0220332 in dsioctl (dev=0xc1383200, cmd=0x8004646d, data=0xcb307d58 "\001",
>     flags=0x2, sspp=0xc1328568)
>     at ../../kern/subr_diskslice.c:356
> #7  0xc021fd5b in diskioctl (dev=0xc1383200, cmd=0x8004646d, data=0xcb307d58 "\001",
>     fflag=0x2, p=0xca10fee0)
>     at ../../kern/subr_disk.c:267
> #8  0xc140b5af in ?? ()
> #9  0xc140969b in ?? ()
> #10 0xc140988e in ?? ()
> #11 0xc140c461 in ?? ()
> #12 0xc024efa2 in spec_ioctl (ap=0xcb307de4) at ../../miscfs/specfs/spec_vnops.c:306
>
> This is different from the other crash.  It looks like it happens in
> Vinum.  Take a look at vinum(4) or
> http://www.vinumvm.org/vinum/how-to-debug.html for details of how to
> bring life into them.

I have been looking at this page, and I'm not clear what else is needed, or what
you mean by "bringing life into them".  I followed the steps to analyse a kernel
panic, and provided the output... unfortunately I don't have the knowledge to
fix it myself.

> > The kernel is panicking in dsioctl(), kern/subr_diskslice.c:356.  I've
> > had a look in there, but I really have no idea what it's trying to
> > do - I can't even work out what the ioctl is.  I'm no kernel guy :(
>
> The ioctl is the cmd parameter passed to diskioctl, 0x8004646d.
> That's DIOCWLABEL.  Finding them isn't easy, but basically:
>
>   0x8     -> _IOW macro.  We're writing.
>   004     length to write
>   64      ioctl type ('d').  You'd go looking for a regular expression
>           _IOW.*'d'.
>   6d      ioctl number (109).  This one is in /sys/sys/disklabel.h:
>
> #define DIOCWLABEL      _IOW('d', 109, int)     /* write en/disable label */

Thanks - that's helpful :)

> It's not clear what's trying to write the label, but looking at the
> locals of the dsioctl frame would help.

I provided this output in gdb.txt - the code and locals for frame 6 are:

(kgdb) f 6
#6  0xc0220332 in dsioctl (dev=0xc1383200, cmd=0x8004646d, data=0xcb307d58
    "\001", flags=0x2, sspp=0xc1328568)
    at ../../kern/subr_diskslice.c:356
356             sp = &ssp->dss_slices[slice];

(kgdb) l
351             struct diskslices *ssp;
352             struct partition *pp;
353
354             slice = dkslice(dev);
355             ssp = *sspp;
356             sp = &ssp->dss_slices[slice];
357             lp = sp->ds_label;
358             switch (cmd) {
359
360             case DIOCGDVIRGIN:

cmd = 0x0
data = 0xcb307d58 "\001"
flags = 0x2
error = 0xcb307d58
lp = (struct disklabel *) 0x
old_wlabel = 0x0
openmask = 0x4b
part = 0x2
slice = 0x2
sp = (struct diskslice *) 0x8c
ssp = (struct diskslices *) 0x0

I must admit, I was a little confused as to why cmd was suddenly 0x0
considering that dsioctl was called with cmd = 0x8004646d, but as I'm no
kernel hacker there's probably a very good reason for it!  Is it possible
it's being overwritten and that's why we panic?

> > I appreciate this may not be a bug in Vinum, but it certainly seems
> > like it's being triggered by vinum.
>
> Yes, that's reasonable.
>
> Greg

Cheers,

James
Re: PANIC: vinum / atacontrol (5.0-STABLE / 4.8-RC2)
On Friday, 28 March 2003 at 11:32:48 +, james wrote:
> On Fri, 28 Mar 2003, Greg 'groggy' Lehey wrote:
>
>> [Format recovered--see http://www.lemis.com/email/email-format.html]
>>
>> Computer output wrapped.
>>
>> On Thursday, 27 March 2003 at 14:18:43 +, james wrote:
>>> Hi
>>>
>>> I am trying to configure hotswap-raid and vinum on my machine, and have found I
>>> can cause the kernel to panic at will.
>>>
>>> Ideally I would like to be able to stop a plex, use atacontrol attach/detach to
>>> replace the disk, and rebuild the plex.  Would this work in theory?
>>
>> Apparently.  There was a time when people claimed that ATA drives
>> couldn't be hot swapped, but that seems to be incorrect nowadays.
>>
>>> Now I stop and unload vinum, and try to run atacontrol:
>>>
>>> eddie# vinum stop
>>> vinum unloaded
>>> eddie# kldstat | grep vinum
>>> eddie#
>>> eddie# atacontrol detach 3
>>>
>>>
>>> I have built a debug kernel, and have a core.  The backtrace is below.
>>>
>>> If you need any more info please let me know!
>>>
>>> James
>>>
>>> Now follows the gdb-output:
>>>
>>> (kgdb) bt
>>> #9  0xc01a9223 in panic () at /usr/src/sys/kern/kern_shutdown.c:517
>>> #10 0xc02e311e in trap_fatal (frame=0xc0b94e00, eva=0x0) at
>>> /usr/src/sys/i386/i386/trap.c:844
>>> #11 0xc02e2e32 in trap_pfault (frame=0xc873fa74, usermode=0x0, eva=0x24) at
>>> /usr/src/sys/i386/i386/trap.c:758
>>> #12 0xc02e2a1d in trap (frame=
>>> {tf_fs = 0xc0380018, tf_es = 0xc0b90010, tf_ds = 0x10, tf_edi = 0x0,
>>> tf_esi = 0xc1857530, tf_ebp = 0xc873fab4, tf_isp = 0xc873faa0, tf_
>>> ebx = 0xc19a6c00, tf_edx = 0xe7, tf_ecx = 0xc032a340, tf_eax = 0x0, tf_trapno =
>>> 0xc, tf_err = 0x0, tf_eip = 0xc01c6de6, tf_cs = 0x8, tf_eflag
>>> s = 0x10292, tf_esp = 0xc873faf0, tf_ss = 0xc01296ae})
>>> at /usr/src/sys/i386/i386/trap.c:445
>>> #13 0xc02d44f8 in calltrap () at {standard input}:98
>>> #14 0xc01296ae in ata_command (atadev=0xc1857530, command=0xe7, lba=0x0,
>>> count=0x0, feature=0x0, flags=0x4)
>>> at bus_at386.h:526
>>> #15 0xc01396df in adclose (dev=0x0, flags=0x3, fmt=0x0, td=0x0) at
>>> /usr/src/sys/dev/ata/ata-disk.c:292
>>
>> (etc)
>>
>> The trap occurred between frames 12 and 13 at address 0xc873faa0, in
>> the ATA code.  Depending on your prowess with kernel code, you may be
>> able to find out what has gone wrong.  I'd be inclined to look at
>> frame 13:
>>
>> (gdb) f 13              select frame
>> (gdb) l                 list the code
>> (gdb) i loc             show local variables
>>
>> My guess is that something has not been initialized.  It's probably
>> worth submitting a bug report.
>
> I will be able to analyse the 5.0-stable panic a little more when I get home.
> In the meantime I've been doing similar tests with 4.8-RC2.  I get slightly more
> progress, but still a panic at the end.
>
> Sequence of events:
>
> 1, create volume, 2 plexes on 2 disks
> 2, vinum stop volume.p1
> 3, atacontrol detach 1 (drive b)
> 4, atacontrol attach 1 - this WORKS, doesn't panic like 5.0-STABLE
> 5, vinum start volume.p1
>

OK, that's good background information, but first we need to look at
the dump.

> As before, I have a debug kernel and core dump.  I can't seem to
> configure my mailer to not wrap lines, so I've posted all relevant
> information to http://web.hisser.org/vinum/4.8-crash/ .  They are
> plain text files.

You should consider getting a different MUA.

#6  0xc0220332 in dsioctl (dev=0xc1383200, cmd=0x8004646d, data=0xcb307d58 "\001",
    flags=0x2, sspp=0xc1328568)
    at ../../kern/subr_diskslice.c:356
#7  0xc021fd5b in diskioctl (dev=0xc1383200, cmd=0x8004646d, data=0xcb307d58 "\001",
    fflag=0x2, p=0xca10fee0)
    at ../../kern/subr_disk.c:267
#8  0xc140b5af in ?? ()
#9  0xc140969b in ?? ()
#10 0xc140988e in ?? ()
#11 0xc140c461 in ?? ()
#12 0xc024efa2 in spec_ioctl (ap=0xcb307de4) at ../../miscfs/specfs/spec_vnops.c:306

This is different from the other crash.  It looks like it happens in
Vinum.  Take a look at vinum(4) or
http://www.vinumvm.org/vinum/how-to-debug.html for details of how to
bring life into them.

> The kernel is panicking in dsioctl(), kern/subr_diskslice.c:356.  I've
> had a look in there, but I really have no idea what it's trying to
> do - I can't even work out what the ioctl is.  I'm no kernel guy :(

The ioctl is the cmd parameter passed to diskioctl, 0x8004646d.
That's DIOCWLABEL.  Finding them isn't easy, but basically:

  0x8     -> _IOW macro.  We're writing.
  004     length to write
  64      ioctl type ('d').  You'd go looking for a regular expression
          _IOW.*'d'.
  6d      ioctl number (109).  This one is in /sys/sys/disklabel.h:

#define DIOCWLABEL      _IOW('d', 109, int)     /* write en/disable label */

It's not clear what's trying to write the label, but looking at the
locals of the dsioctl frame would help.

> I appreciate this may not be a bug in Vinum, but it certainly seems
> like it's being triggered by vinum.

Yes, that's reasonable.

Greg
--
When replying to this me
Re: PANIC: vinum / atacontrol (5.0-STABLE / 4.8-RC2)
Hi Greg

I will be able to analyse the 5.0-stable panic a little more when I get home.
In the meantime I've been doing similar tests with 4.8-RC2.  I get slightly more
progress, but still a panic at the end.

Sequence of events:

1, create volume, 2 plexes on 2 disks
2, vinum stop volume.p1
3, atacontrol detach 1 (drive b)
4, atacontrol attach 1 - this WORKS, doesn't panic like 5.0-STABLE
5, vinum start volume.p1

As before, I have a debug kernel and core dump.  I can't seem to configure my
mailer to not wrap lines, so I've posted all relevant information to
http://web.hisser.org/vinum/4.8-crash/ .  They are plain text files.

The kernel is panicking in dsioctl(), kern/subr_diskslice.c:356.  I've had a
look in there, but I really have no idea what it's trying to do - I can't even
work out what the ioctl is.  I'm no kernel guy :(

I appreciate this may not be a bug in Vinum, but it certainly seems like it's
being triggered by vinum.  I can access the drive, query the disklabel, and
format the partition after the attach - it's only when I restart the plex that
it panics.

If I detach the plex before removing it, I get the same panic in dsioctl() when
I re-attach the plex, without even getting the opportunity to start the plex.

I would love to be able to help get this working, as I'm unable to hotswap at
all in Linux, which is what has made me move to FreeBSD.

Thanks

James

On Fri, 28 Mar 2003, Greg 'groggy' Lehey wrote:

> [Format recovered--see http://www.lemis.com/email/email-format.html]
>
> Computer output wrapped.
>
> On Thursday, 27 March 2003 at 14:18:43 +, james wrote:
> > Hi
> >
> > I am trying to configure hotswap-raid and vinum on my machine, and have found I
> > can cause the kernel to panic at will.
> >
> > Ideally I would like to be able to stop a plex, use atacontrol attach/detach to
> > replace the disk, and rebuild the plex.  Would this work in theory?
>
> Apparently.  There was a time when people claimed that ATA drives
> couldn't be hot swapped, but that seems to be incorrect nowadays.
>
> > Now I stop and unload vinum, and try to run atacontrol:
> >
> > eddie# vinum stop
> > vinum unloaded
> > eddie# kldstat | grep vinum
> > eddie#
> > eddie# atacontrol detach 3
> >
> >
> > I have built a debug kernel, and have a core.  The backtrace is below.
> >
> > If you need any more info please let me know!
> >
> > James
> >
> > Now follows the gdb-output:
> >
> > (kgdb) bt
> > #9  0xc01a9223 in panic () at /usr/src/sys/kern/kern_shutdown.c:517
> > #10 0xc02e311e in trap_fatal (frame=0xc0b94e00, eva=0x0) at
> > /usr/src/sys/i386/i386/trap.c:844
> > #11 0xc02e2e32 in trap_pfault (frame=0xc873fa74, usermode=0x0, eva=0x24) at
> > /usr/src/sys/i386/i386/trap.c:758
> > #12 0xc02e2a1d in trap (frame=
> > {tf_fs = 0xc0380018, tf_es = 0xc0b90010, tf_ds = 0x10, tf_edi = 0x0,
> > tf_esi = 0xc1857530, tf_ebp = 0xc873fab4, tf_isp = 0xc873faa0, tf_
> > ebx = 0xc19a6c00, tf_edx = 0xe7, tf_ecx = 0xc032a340, tf_eax = 0x0, tf_trapno =
> > 0xc, tf_err = 0x0, tf_eip = 0xc01c6de6, tf_cs = 0x8, tf_eflag
> > s = 0x10292, tf_esp = 0xc873faf0, tf_ss = 0xc01296ae})
> > at /usr/src/sys/i386/i386/trap.c:445
> > #13 0xc02d44f8 in calltrap () at {standard input}:98
> > #14 0xc01296ae in ata_command (atadev=0xc1857530, command=0xe7, lba=0x0,
> > count=0x0, feature=0x0, flags=0x4)
> > at bus_at386.h:526
> > #15 0xc01396df in adclose (dev=0x0, flags=0x3, fmt=0x0, td=0x0) at
> > /usr/src/sys/dev/ata/ata-disk.c:292
>
> (etc)
>
> The trap occurred between frames 12 and 13 at address 0xc873faa0, in
> the ATA code.  Depending on your prowess with kernel code, you may be
> able to find out what has gone wrong.  I'd be inclined to look at
> frame 13:
>
> (gdb) f 13              select frame
> (gdb) l                 list the code
> (gdb) i loc             show local variables
>
> My guess is that something has not been initialized.  It's probably
> worth submitting a bug report.
>
> Greg
> --
> When replying to this message, please copy the original recipients.
> If you don't, I may ignore the reply or reply to the original recipients.
> For more information, see http://www.lemis.com/questions.html
> See complete headers for address and phone numbers