Re: [PATCH] Re: Junior Kernel Hacker page updated...
Stefan Farfeleder wrote: > Imagine this scenario where CPU 0 inserts a knote kn1 (the marker) in > knote_scan and CPU 1 kn2 in kqueue_enqueue: > > CPU 0 | CPU 1 > +--- > kn1->kn_tqe.tqe_next = NULL;| > | > +--- > kn1->kn_tqe.tqe_prev = | kn2->kn_tqe.tqe_next = NULL; > kq_head.tqh_last; | > +--- > *kq_head.tqh_last = kn1;| kn2->kn_tqe.tqe_prev = > | kq_head.tqh_last; > +--- > kq_head.tqh_last = | *kq_head.tqh_last = kn2; > &kn1->kn_tqe.tqe_next; | > +--- > | kq_head.tqh_last = > | &kn2->kn_tqe.tqe_next; > > The marker would never appear on the queue. The macro is broken. Here is a fix. #define TAILQ_INSERT_TAIL(head, elm, field) do {\ void **savepp; \ (elm)->field.tqe_next = NULL; \ (elm)->field.tqe_prev = (head)->tqh_last; \ savepp = (head)->tqh_last; \ (head)->tqh_last = &(elm)->field.tqe_next; \ (elm)->field.tqe_prev = savepp; \ *savepp = (elm);\ } while (0) I'm not sure that it's possible to totally close the race; I think the way you would have to do it is to traverse the tqh_last pointer until the tqh_last->file.tqe_next == NULL, before reference it (ugly). The failure mode with the race is an inverted insertion order, which I think, in this case, is not a problem. The double setting of the prev guarantees it has a "harmless" value for a traversal during insertion; it acts as if the insertion had not occurred, if there are two that occurred simultaneously, until both sets of data are valid. This is doable in Intel assembly, I think (CMPXCHGL). Ugh. On other architectures, there's no way to avoid a mutex (e.g. MIPS). I hate tailq's. 8-(. -- Terry To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
Re: [PATCH] Re: Junior Kernel Hacker page updated...
On Wed, Oct 09, 2002 at 04:07:45PM -0700, Terry Lambert wrote: > Stefan Farfeleder wrote: > > Is it just a warning or does it pose a real problem? > > > > I think the problem with the current code is that knote_{en,de}queue can > > be executed in parallel (on another CPU, spl*() can't prevent that, can > > it?) with kqueue_scan and that kq->kq_head thus can be corrupted. > > Or am I totally wrong? > > My patch would have worked, in that case, since it would ensure > one marker entry with a unique stack address per simultaneous > scanner. > > It has to be that the queue itself is being deleted out from > under it: the problem is not the scan, nor the insert or the > delete. > > Most likely, this is for an object whose queue is not tracked > by process, or for a process queue that's being examined by > another process (e.g. kevent's on fork/exit/etc.). > > You can verify this for your own satisfaction by looking at the > pointer manipulation order for the insertion and deletion; the > insertion sets the next pointer before setting the pointer to > the inserted object, and the deletion sets the pointer that > used to point to the deleted object to the delete object's next, > before deleting the object. Thus, traversals in progress should > not result in an error. Imagine this scenario where CPU 0 inserts a knote kn1 (the marker) in knote_scan and CPU 1 kn2 in kqueue_enqueue: CPU 0 | CPU 1 +--- kn1->kn_tqe.tqe_next = NULL;| | +--- kn1->kn_tqe.tqe_prev = | kn2->kn_tqe.tqe_next = NULL; kq_head.tqh_last; | +--- *kq_head.tqh_last = kn1;| kn2->kn_tqe.tqe_prev = | kq_head.tqh_last; +--- kq_head.tqh_last = | *kq_head.tqh_last = kn2; &kn1->kn_tqe.tqe_next; | +--- | kq_head.tqh_last = | &kn2->kn_tqe.tqe_next; The marker would never appear on the queue. Regards, Stefan Farfeleder To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
Re: [PATCH] Re: Junior Kernel Hacker page updated...
On 10 Oct, Stefan Farfeleder wrote: > On Tue, Oct 08, 2002 at 09:26:29PM -0700, Don Lewis wrote: >> On 8 Oct, Stefan Farfeleder wrote: >> > However, WITNESS complains (only once) about this: >> > lock order reversal >> > 1st 0xc662140c kqueue mutex (kqueue mutex) @ >/freebsd/current/src/sys/kern/kern_event.c:714 >> > 2nd 0xc6727d00 pipe mutex (pipe mutex) @ >/freebsd/current/src/sys/kern/sys_pipe.c:1478 >> >> That's pretty similar to the lock order reversal I've seen in the pipe >> code and it's interaction with sigio, which is not suprising since >> pipeselwakeup() calls both pgsigio() and KNOTE(), often while the pipe >> lock is held. Correctly fixing this doesn't look easy ... > > Is it just a warning or does it pose a real problem? It's a warning of a potential problem. If someone else was trying to grab the locks in the opposite order at the same time, you'd get a deadlock. To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
Re: [PATCH] Re: Junior Kernel Hacker page updated...
Stefan Farfeleder wrote: > Is it just a warning or does it pose a real problem? > > I think the problem with the current code is that knote_{en,de}queue can > be executed in parallel (on another CPU, spl*() can't prevent that, can > it?) with kqueue_scan and that kq->kq_head thus can be corrupted. > Or am I totally wrong? My patch would have worked, in that case, since it would ensure one marker entry with a unique stack address per simultaneous scanner. It has to be that the queue itself is being deleted out from under it: the problem is not the scan, nor the insert or the delete. Most likely, this is for an object whose queue is not tracked by process, or for a process queue that's being examined by another process (e.g. kevent's on fork/exit/etc.). You can verify this for your own satisfaction by looking at the pointer manipulation order for the insertion and deletion; the insertion sets the next pointer before setting the pointer to the inserted object, and the deletion sets the pointer that used to point to the deleted object to the delete object's next, before deleting the object. Thus, traversals in progress should not result in an error. -- Terry To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
Re: [PATCH] Re: Junior Kernel Hacker page updated...
On Tue, Oct 08, 2002 at 09:26:29PM -0700, Don Lewis wrote: > On 8 Oct, Stefan Farfeleder wrote: > > On Mon, Oct 07, 2002 at 03:48:45AM -0700, Terry Lambert wrote: > > > Following the advice from the spl* man page I turned the spl* calls to a > > mutex and was surprised to see it working. My SMP -current survived a 'make > > -j16 buildworld' with make using kqueue() (which it did not a single > > time out of >30 times before). Further testings will follow tomorrow. Building 6 worlds in a row with -j ranging from 4 to 128 didn't crash it. > > However, WITNESS complains (only once) about this: > > lock order reversal > > 1st 0xc662140c kqueue mutex (kqueue mutex) @ >/freebsd/current/src/sys/kern/kern_event.c:714 > > 2nd 0xc6727d00 pipe mutex (pipe mutex) @ >/freebsd/current/src/sys/kern/sys_pipe.c:1478 > > That's pretty similar to the lock order reversal I've seen in the pipe > code and it's interaction with sigio, which is not suprising since > pipeselwakeup() calls both pgsigio() and KNOTE(), often while the pipe > lock is held. Correctly fixing this doesn't look easy ... Is it just a warning or does it pose a real problem? I think the problem with the current code is that knote_{en,de}queue can be executed in parallel (on another CPU, spl*() can't prevent that, can it?) with kqueue_scan and that kq->kq_head thus can be corrupted. Or am I totally wrong? @Poul: Since you are the only person who reported a kernel crash too, does the version with the mutex work for you? Regards, Stefan Farfelder To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
Re: [PATCH] Re: Junior Kernel Hacker page updated...
On 8 Oct, Stefan Farfeleder wrote: > On Mon, Oct 07, 2002 at 03:48:45AM -0700, Terry Lambert wrote: > Following the advice from the spl* man page I turned the spl* calls to a > mutex and was surprised to see it working. My SMP -current survived a 'make > -j16 buildworld' with make using kqueue() (which it did not a single > time out of >30 times before). Further testings will follow tomorrow. > > However, WITNESS complains (only once) about this: > lock order reversal > 1st 0xc662140c kqueue mutex (kqueue mutex) @ >/freebsd/current/src/sys/kern/kern_event.c:714 > 2nd 0xc6727d00 pipe mutex (pipe mutex) @ >/freebsd/current/src/sys/kern/sys_pipe.c:1478 That's pretty similar to the lock order reversal I've seen in the pipe code and it's interaction with sigio, which is not suprising since pipeselwakeup() calls both pgsigio() and KNOTE(), often while the pipe lock is held. Correctly fixing this doesn't look easy ... To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
Re: [PATCH] Re: Junior Kernel Hacker page updated...
On Mon, Oct 07, 2002 at 03:48:45AM -0700, Terry Lambert wrote: > > *** OK, it's very hard to believe you didn't break into the > *** debugger and manually call "pnaic" to get this to happen. You're right, this is exactly what I did. > I can't personally repeat the problem, so you're elected to do > the legwork on this one. 8-(. Following the advice from the spl* man page I turned the spl* calls to a mutex and was surprised to see it working. My SMP -current survived a 'make -j16 buildworld' with make using kqueue() (which it did not a single time out of >30 times before). Further testings will follow tomorrow. However, WITNESS complains (only once) about this: lock order reversal 1st 0xc662140c kqueue mutex (kqueue mutex) @ /freebsd/current/src/sys/kern/kern_event.c:714 2nd 0xc6727d00 pipe mutex (pipe mutex) @ /freebsd/current/src/sys/kern/sys_pipe.c:1478 Regards, Stefan Farfeleder Index: sys/eventvar.h === RCS file: /home/ncvs/src/sys/sys/eventvar.h,v retrieving revision 1.4 diff -u -r1.4 eventvar.h --- sys/eventvar.h 18 Jul 2000 19:31:48 - 1.4 +++ sys/eventvar.h 8 Oct 2002 16:06:46 - @@ -35,6 +35,7 @@ struct kqueue { TAILQ_HEAD(kqlist, knote) kq_head; /* list of pending event */ int kq_count; /* number of pending events */ + struct mtx kq_mtx; /* protect kq_head */ struct selinfo kq_sel; struct filedesc *kq_fdp; int kq_state; Index: kern/kern_event.c === RCS file: /home/ncvs/src/sys/kern/kern_event.c,v retrieving revision 1.46 diff -u -r1.46 kern_event.c --- kern/kern_event.c 3 Oct 2002 06:03:26 - 1.46 +++ kern/kern_event.c 8 Oct 2002 19:22:27 - @@ -375,7 +375,7 @@ if (error) goto done2; kq = malloc(sizeof(struct kqueue), M_KQUEUE, M_WAITOK | M_ZERO); - TAILQ_INIT(&kq->kq_head); + mtx_init(&kq->kq_mtx, "kqueue mutex", NULL, MTX_DEF); FILE_LOCK(fp); fp->f_flag = FREAD | FWRITE; fp->f_type = DTYPE_KQUEUE; @@ -709,13 +709,15 @@ error = 0; goto done; } + splx(s); + mtx_lock(&kq->kq_mtx); TAILQ_INSERT_TAIL(&kq->kq_head, &marker, kn_tqe); while (count) { kn = TAILQ_FIRST(&kq->kq_head); TAILQ_REMOVE(&kq->kq_head, kn, kn_tqe); if (kn == &marker) { - splx(s); + mtx_unlock(&kq->kq_mtx); if (count == maxevents) goto retry; goto done; @@ -737,10 +739,10 @@ if (kn->kn_flags & EV_ONESHOT) { kn->kn_status &= ~KN_QUEUED; kq->kq_count--; - splx(s); + mtx_unlock(&kq->kq_mtx); kn->kn_fop->f_detach(kn); knote_drop(kn, td); - s = splhigh(); + mtx_lock(&kq->kq_mtx); } else if (kn->kn_flags & EV_CLEAR) { kn->kn_data = 0; kn->kn_fflags = 0; @@ -751,19 +753,19 @@ } count--; if (nkev == KQ_NEVENTS) { - splx(s); + mtx_unlock(&kq->kq_mtx); error = copyout(&kq->kq_kev, ulistp, sizeof(struct kevent) * nkev); ulistp += nkev; nkev = 0; kevp = kq->kq_kev; - s = splhigh(); + mtx_lock(&kq->kq_mtx); if (error) break; } } TAILQ_REMOVE(&kq->kq_head, &marker, kn_tqe); - splx(s); + mtx_unlock(&kq->kq_mtx); done: if (nkev != 0) error = copyout(&kq->kq_kev, ulistp, @@ -887,6 +889,7 @@ } } FILEDESC_UNLOCK(fdp); + mtx_destroy(&kq->kq_mtx); free(kq, M_KQUEUE); fp->f_data = NULL; @@ -1051,14 +1054,14 @@ knote_enqueue(struct knote *kn) { struct kqueue *kq = kn->kn_kq; - int s = splhigh(); KASSERT((kn->kn_status & KN_QUEUED) == 0, ("knote already queued")); + mtx_lock(&kq->kq_mtx); TAILQ_INSERT_TAIL(&kq->kq_head, kn, kn_tqe); kn->kn_status |= KN_QUEUED; kq->kq_count++; - splx(s); + mtx_unlock(&kq->kq_mtx); kqueue_wakeup(kq); } @@ -1066,14 +1069,14 @@ knote_dequeue(struct knote *kn) { struct kqueue *kq = kn->kn_kq; - int s = splhigh(); KASSERT(kn->kn_status & KN_QUEUED, ("knote not queued")); + mtx_lock(&kq->kq_mtx); T
Re: [PATCH] Re: Junior Kernel Hacker page updated...
On Mon, 7 Oct 2002, Terry Lambert wrote: > Stefan Farfeleder wrote: > > > > I'm confused why marker - if it was removed by TAILQ_REMOVE - hasn't > > kn_tqe.tqe_next and kn_tqe.tqe_prev set to (void *)-1. because that only happens if the debug code in queue.h is enabled, which it is not.. > > OK, what this means is that the marker queue entry was removed > by something else going in there. > > THis shouldn't happen. > > Try adding this before the initialization of the marker data: > > bzero(&marker, sizeof(marker)); > > That should keep it from matching any removal criteria. THe only > way this could keep crashing after this mod is if the queue is > being destroyed out from under you. > > The implication here is that the queue should be protected by the > object lock for the object for which the pointer to the queue > instance is an element. > > Fixing this would be very hard (IMO). > > The next step (assuming it still panics) is to add: > > #define KQ_FREE 0x80 > > ...and set it into kq_state on a kqueue that has been freed and/or > deallocated somewhere (then check to see if it's set, after the > panic). Ugly, but it will tell you whether or not that's what's > happening (scanning a dead queue). > > The worst case is scanning a dead queue quose memory has been > reused for some other purpose. 8-(. > > I can't personally repeat the problem, so you're elected to do > the legwork on this one. 8-(. > > -- Terry > > To Unsubscribe: send mail to [EMAIL PROTECTED] > with "unsubscribe freebsd-current" in the body of the message > To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
Re: [PATCH] Re: Junior Kernel Hacker page updated...
Stefan Farfeleder wrote: > On Sun, Oct 06, 2002 at 11:14:26PM -0700, Terry Lambert wrote: > > Stefan: Did the patch fix it, or not? > > Sorry for the long delay. No, it did not. But I now have a rather > interesting core dump. I inserted a KASSERT, so that the code looks like > this: > > TAILQ_INSERT_TAIL(&kq->kq_head, &marker, kn_tqe); > while (count) { > kn = TAILQ_FIRST(&kq->kq_head); > KASSERT(kn != NULL, ("TAILQ_FIRST returned NULL")); [ ... ] > panic: bremfree: bp 0xd2adf6f0 not locked Second panic, during debugger sync. > panic: TAILQ_FIRST returned NULL See below... > panic: from debugger You, manually calling "panic" inside the debugger... > syncing disks... panic: bremfree: bp 0xd2adf6f0 not locked The second panic (again). > #2 0xc01babe7 in panic () at /freebsd/current/src/sys/kern/kern_shutdown.c:508 2nd panic. > #10 0xc01babe7 in panic () at /freebsd/current/src/sys/kern/kern_shutdown.c:508 Manual panic (no arguments). > #18 0xc01babcf in panic (fmt=0x0) > > at /freebsd/current/src/sys/kern/kern_shutdown.c:494 > > #19 0xc01a1212 in kqueue_scan (fp=0x0, maxevents=4, ulistp=0xbfbfeb90, > > tsp=0xc754f828, td=0xc6426b60) > > at /freebsd/current/src/sys/kern/kern_event.c:717 *** OK, it's very hard to believe you didn't break into the *** debugger and manually call "pnaic" to get this to happen. Why? Because the "fmt" string is 0x0, which indicates that you called the panic manually, instead of being the address of the string "TAILQ_FIRST returned NULL", like you'd expect. > #19 0xc01a1212 in kqueue_scan (fp=0x0, maxevents=4, ulistp=0xbfbfeb90, > > tsp=0xc754f828, td=0xc6426b60) > > at /freebsd/current/src/sys/kern/kern_event.c:717 > > 717 KASSERT(kn != NULL, ("TAILQ_FIRST returned NULL")); > > (kgdb) info locals > > kq = (struct kqueue *) 0xc754f800 > > kevp = (struct kevent *) 0xc754f828 > > atv = {tv_sec = 0, tv_usec = 0} > > rtv = {tv_sec = 434, tv_usec = -1070420864} > > ttv = {tv_sec = 1, tv_usec = -1070411616} > > kn = (struct knote *) 0x0 > > marker = {kn_link = {sle_next = 0xc01b0d37}, kn_selnext = { > > sle_next = 0xc0368a20}, kn_tqe = {tqe_next = 0x0, tqe_prev = 0xc6650ac8}, > > kn_kq = 0xc6426bcc, kn_kevent = {ident = 3344374324, filter = -30080, > > flags = 49206, fflags = 3224546432, data = 431, udata = 0xe2c9dca0}, > > kn_status = 16, kn_sfflags = -1070167424, kn_sdata = 8, kn_ptr = { > > p_fp = 0xc032ac80, p_proc = 0xc032ac80}, kn_fop = 0x1af, kn_hook = 0x3} > > count = 4 > > timeout = 0 > > nkev = 0 > > error = 0 > > (kgdb) p *kq > > $2 = {kq_head = {tqh_first = 0x0, tqh_last = 0xc754f800}, kq_count = 1, > > kq_sel = {si_thrlist = {tqe_next = 0x0, tqe_prev = 0x0}, si_thread = 0x0, > > si_note = {slh_first = 0x0}, si_flags = 0}, kq_fdp = 0xc7571a00, > > kq_state = 0, kq_kev = {{ident = 23, filter = -1, flags = 1, fflags = 0, > > data = 69, udata = 0x80cd800}, {ident = 23, filter = -1, flags = 1, > > fflags = 0, data = 164, udata = 0x80cd800}, {ident = 27, filter = -1, > > flags = 1, fflags = 0, data = 218, udata = 0x80cf800}, {ident = 19, > > filter = -1, flags = 1, fflags = 0, data = 182, udata = 0x80cc800}, { > > ident = 0, filter = 0, flags = 0, fflags = 0, data = 0, udata = 0x0}, { > > ident = 0, filter = 0, flags = 0, fflags = 0, data = 0, udata = 0x0}, { > > ident = 0, filter = 0, flags = 0, fflags = 0, data = 0, udata = 0x0}, { > > ident = 0, filter = 0, flags = 0, fflags = 0, data = 0, udata = 0x0}}} > > (kgdb) q > > frog# ^Dexit > > Script done on Mon Oct 7 11:32:50 2002 > > I'm confused why marker - if it was removed by TAILQ_REMOVE - hasn't > kn_tqe.tqe_next and kn_tqe.tqe_prev set to (void *)-1. OK, what this means is that the marker queue entry was removed by something else going in there. THis shouldn't happen. Try adding this before the initialization of the marker data: bzero(&marker, sizeof(marker)); That should keep it from matching any removal criteria. THe only way this could keep crashing after this mod is if the queue is being destroyed out from under you. The implication here is that the queue should be protected by the object lock for the object for which the pointer to the queue instance is an element. Fixing this would be very hard (IMO). The next step (assuming it still panics) is to add: #define KQ_FREE 0x80 ...and set it into kq_state on a kqueue that has been freed and/or deallocated somewhere (then check to see if it's set, after the panic). Ugly, but it will tell you whether or not that's what's happening (scanning a dead queue). The worst case is scanning a dead queue quose memory has been reused for some other purpose. 8-(. I can't personally repeat the problem, so you're elected to do the legwork on this one. 8-(. -- Terry To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body
Re: [PATCH] Re: Junior Kernel Hacker page updated...
On Sun, Oct 06, 2002 at 11:14:26PM -0700, Terry Lambert wrote: > > Stefan: Did the patch fix it, or not? Sorry for the long delay. No, it did not. But I now have a rather interesting core dump. I inserted a KASSERT, so that the code looks like this: TAILQ_INSERT_TAIL(&kq->kq_head, &marker, kn_tqe); while (count) { kn = TAILQ_FIRST(&kq->kq_head); KASSERT(kn != NULL, ("TAILQ_FIRST returned NULL")); /* * Skip over all markers which are not ours. This looks * unsafe, but we can't hit the end of the list without * hitting our own marker. */ while ((kn->kn_status & KN_MARKER) && (kn != &marker)) { kn = TAILQ_NEXT(kn, kn_tqe); } TAILQ_REMOVE(&kq->kq_head, kn, kn_tqe); if (kn == &marker) { [...] Script started on Mon Oct 7 11:26:10 2002 frog# ../bin/gdb -k crash/kernel.debug.3 crash/vmcore.3 GNU gdb 5.2.0 (FreeBSD) 20020627 Copyright 2002 Free Software Foundation, Inc. GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions. Type "show copying" to see the conditions. There is absolutely no warranty for GDB. Type "show warranty" for details. This GDB was configured as "i386-undermydesk-freebsd"... panic: bremfree: bp 0xd2adf6f0 not locked panic messages: --- panic: TAILQ_FIRST returned NULL cpuid = 1; lapic.id = 0100 panic: from debugger cpuid = 1; lapic.id = 0100 boot() called on cpu#1 syncing disks... panic: bremfree: bp 0xd2adf6f0 not locked cpuid = 1; lapic.id = 0100 boot() called on cpu#1 Uptime: 13m27s pfs_vncache_unload(): 1 entries remaining Dumping 1023 MB ata0: resetting devices .. done ad0: timeout sending command=c5 s=d0 e=00 ad0: error executing commandata0: resetting devices .. done 16 32 48 64 80 96 112 128 144 160 176 192 208 224 240 256 272 288 304 320 336 352 368 384 400 416 432 448 464 480 496 512 528 544 560 576 592 608 624 640 656 672 688 704 720 736 752 768 784 800 816 832 848 864 880 896 912 928 944 960 976 992 1008 --- #0 doadump () at /freebsd/current/src/sys/kern/kern_shutdown.c:223 223 dumping++; (kgdb) bt #0 doadump () at /freebsd/current/src/sys/kern/kern_shutdown.c:223 #1 0xc01ba92a in boot (howto=260) at /freebsd/current/src/sys/kern/kern_shutdown.c:355 #2 0xc01babe7 in panic () at /freebsd/current/src/sys/kern/kern_shutdown.c:508 #3 0xc01fcc77 in bremfree (bp=0xd2adf6f0) at /freebsd/current/src/sys/kern/vfs_bio.c:632 #4 0xc01fe798 in vfs_bio_awrite (bp=0x3) at /freebsd/current/src/sys/kern/vfs_bio.c:1633 #5 0xc02a7afa in ffs_fsync (ap=0xe2c9d8fc) at /freebsd/current/src/sys/ufs/ffs/ffs_vnops.c:252 #6 0xc02a7829 in VOP_FSYNC (vp=0x0, cred=0x0, waitfor=0, td=0x0) at vnode_if.h:612 #7 0xc02a6d3b in ffs_sync (mp=0xc642ba00, waitfor=2, cred=0xc22b2e80, td=0xc03643a0) at /freebsd/current/src/sys/ufs/ffs/ffs_vfsops.c:1127 #8 0xc0210998 in sync (td=0xc03643a0, uap=0x0) at /freebsd/current/src/sys/kern/vfs_syscalls.c:130 #9 0xc01ba52b in boot (howto=256) at /freebsd/current/src/sys/kern/kern_shutdown.c:264 #10 0xc01babe7 in panic () at /freebsd/current/src/sys/kern/kern_shutdown.c:508 #11 0xc013b1d2 in db_panic () at /freebsd/current/src/sys/ddb/db_command.c:450 #12 0xc013b152 in db_command (last_cmdp=0xc035db40, cmd_table=0x0, aux_cmd_tablep=0xc03577fc, aux_cmd_tablep_end=0xc0357800) at /freebsd/current/src/sys/ddb/db_command.c:346 ---Type to continue, or q to quit--- #13 0xc013b266 in db_command_loop () at /freebsd/current/src/sys/ddb/db_command.c:472 #14 0xc013deca in db_trap (type=3, code=0) at /freebsd/current/src/sys/ddb/db_trap.c:72 #15 0xc02e9f60 in kdb_trap (type=3, code=0, regs=0xe2c9db94) at /freebsd/current/src/sys/i386/i386/db_interface.c:166 #16 0xc0302027 in trap (frame= {tf_fs = 24, tf_es = 16, tf_ds = 16, tf_edi = -968725664, tf_esi = 256, tf_ebp = -490087456, tf_isp = -490087488, tf_ebx = 0, tf_edx = 0, tf_ecx = 32, tf_eax = 18, tf_trapno = 3, tf_err = 0, tf_eip = -1070685611, tf_cs = 8, tf_eflags = 658, tf_esp = -1070272669, tf_ss = -1070406694}) at /freebsd/current/src/sys/i386/i386/trap.c:605 #17 0xc02eb768 in calltrap () at {standard input}:99 #18 0xc01babcf in panic (fmt=0x0) at /freebsd/current/src/sys/kern/kern_shutdown.c:494 #19 0xc01a1212 in kqueue_scan (fp=0x0, maxevents=4, ulistp=0xbfbfeb90, tsp=0xc754f828, td=0xc6426b60) at /freebsd/current/src/sys/kern/kern_event.c:717 #20 0xc01a0ad1 in kevent (td=0xc6426b60, uap=0xe2c9dd10) at /freebsd/current/src/sys/kern/kern_event.c:470 #21 0xc030299e in syscall (frame= {tf_fs = 47, tf_es = 47, tf_ds = 47, tf_edi = -1077937792, tf_esi = 4, tf_ebp = -1077941256, tf_isp = -490087052, tf_ebx = -1077937772, tf_edx = 2184, tf_---Type to continue, or q to quit--- ecx = 0, tf_eax = 363, tf_trapno = 0, tf_err =
Re: [PATCH] Re: Junior Kernel Hacker page updated...
Terry Lambert wrote: > Stefan Farfeleder wrote: > > (kgdb) l *kqueue_scan+0x242 > > 0xc01a1212 is in kqueue_scan > > (/freebsd/current/src/sys/kern/kern_event.c:716). > > 713 TAILQ_INSERT_TAIL(&kq->kq_head, &marker, kn_tqe); > > 714 while (count) { > > 715 kn = TAILQ_FIRST(&kq->kq_head); > > translates to: mov(%edi),%ebx > > 716 TAILQ_REMOVE(&kq->kq_head, kn, kn_tqe); > > translates to: cmpl $0x0,0x8(%ebx) > > > > This line causes the page fault because %ebx is 0. [ ... ] > Please try the attached patch. > > -- Terry Stefan: Did the patch fix it, or not? -- Terry To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
[PATCH] Re: Junior Kernel Hacker page updated...
Stefan Farfeleder wrote: > (kgdb) l *kqueue_scan+0x242 > 0xc01a1212 is in kqueue_scan > (/freebsd/current/src/sys/kern/kern_event.c:716). > 713 TAILQ_INSERT_TAIL(&kq->kq_head, &marker, kn_tqe); > 714 while (count) { > 715 kn = TAILQ_FIRST(&kq->kq_head); > translates to: mov(%edi),%ebx > 716 TAILQ_REMOVE(&kq->kq_head, kn, kn_tqe); > translates to: cmpl $0x0,0x8(%ebx) > > This line causes the page fault because %ebx is 0. This can't happen, at least from an "empty list" perspective, even if kqueue_scan() is reentered, since the "marker" is an auto allocation on the stack, and a different stack means a different marker gets inserted (marker isn't static, so having more than one insert of the marker won't result in only a single insertion). I suspect that what is hapening is that the code is being reentered, and one marker is being treated as an event, because of whatever garbage happens to be on the stack in the allocated marker. The marker is removed, and then it is not found before you hit the end of the list. Please try the attached patch. -- Terry Index: sys/event.h === RCS file: /cvs/src/sys/sys/event.h,v retrieving revision 1.21 diff -c -r1.21 event.h *** sys/event.h 29 Jun 2002 19:14:52 - 1.21 --- sys/event.h 5 Oct 2002 15:12:24 - *** *** 160,165 --- 160,166 #define KN_QUEUED 0x02/* event is on queue */ #define KN_DISABLED 0x04/* event is disabled */ #define KN_DETACHED 0x08/* knote is detached */ + #define KN_MARKER 0x10/* knote is a scan marker */ #define kn_id kn_kevent.ident #define kn_filter kn_kevent.filter Index: kern/kern_event.c === RCS file: /cvs/src/sys/kern/kern_event.c,v retrieving revision 1.45 diff -c -r1.45 kern_event.c *** kern/kern_event.c 17 Aug 2002 02:36:16 - 1.45 --- kern/kern_event.c 5 Oct 2002 15:13:26 - *** *** 653,658 --- 653,659 FILE_LOCK_ASSERT(fp, MA_NOTOWNED); + marker.kn_status = KN_MARKER; kq = (struct kqueue *)fp->f_data; count = maxevents; if (count == 0) *** *** 713,718 --- 714,727 TAILQ_INSERT_TAIL(&kq->kq_head, &marker, kn_tqe); while (count) { kn = TAILQ_FIRST(&kq->kq_head); + /* +* Skip over all markers which are not ours. This looks +* unsafe, but we can't hit the end of the list without +* hitting our own marker. +*/ + while ((kn->kn_status & KN_MARKER) && (kn != &marker)) { + kn = TAILQ_NEXT(kn, kn_tqe); + } TAILQ_REMOVE(&kq->kq_head, kn, kn_tqe); if (kn == &marker) { splx(s);