Re: How to identify specific wait-state for a "DE" process?
> the arguments are printed if they're available. it's just that
> on amd64 (which i'm assuming Paul is using) doesn't give you
> those since they're passed in registers and aren't stored or
> kept anywhere. i386 shows them.. (or at least, their current
> value on their stack location):
>
> crash> bt/t 1
> trace: pid 1 lid 1 at 0xdc9ebe2c
> sleepq_block(0,1,c0b9b25c,c0d6d2f0,0,c453ad40,0,,c4c97d80,dc9ebec0) at sleepq_block+0x8f
> cv_wait_sig(c4c97d98,c4520f00,c4d7ef00,c4d7ec30,dc9ebeb0,0,0,c4d7ec30,0,) at cv_wait_sig+0x101
> do_sys_wait(dc9ebedc,dc9ebed8,0,0,0,,c4c97d80,2,dc9ebf00,40) at do_sys_wait+0x1c2
> sys___wait450(c4d80d40,dc9ebf68,dc9ebf60,2,bbba3000,c0d6cdb4,dc9ebf68,1c1,0,0) at sys___wait450+0x37
> syscall() at syscall+0x8a
> --- syscall (number 449) ---
> bba76937:

Hmmm, I've never seen crash or ddb print any arguments. Either way, this feature is certainly architecture-dependent, since each architecture has its own calling convention(s).
re: How to identify specific wait-state for a "DE" process?
> Does anyone have any good suggestions for how to arrange for another
> thread/lwp to run so it can remove the extra reference to the logging
> descriptor?

filemon(4) as written should just be replaced with a method that works without replacing system calls or borrowing fds or any of these other hacks that we have in place.

i don't think it's worth spending a lot of time working on it in the current form. it's severely broken and not really fixable without a largely complete rewrite.

i'm not 100% sure what the replacement should look like, but it will probably require real hooks in the replaced system calls at the layer that *all* emulations use. eg, do_open() not sys_open().

.mrg.
re: How to identify specific wait-state for a "DE" process?
Stephan writes:
> > # crash
> > Crash version 7.99.25, image version 7.99.25.
> > Output from a running system is unreliable.
> > crash> trace/t 0t455
> > trace: pid 455 lid 1 at 0xfe8002ff0ce0
> > sleepq_block() at sleepq_block+0xa2
> > cv_wait() at cv_wait+0x116
> > fd_close() at fd_close+0x39a
> > fd_free() at fd_free+0x178
> > exit1() at exit1+0x10a
> > sys_exit() at sys_exit+0x3a
> > syscall() at syscall+0x9c
> > --- syscall (number 1) ---
> >
> > So I guess I need to figure out which/what condvar it is waiting on...
>
> It would be great if crash and ddb could print the parameters. Also
> most of the "show" commands do only work in ddb, but not in crash,
> unfortunately.

the arguments are printed if they're available. it's just that amd64 (which i'm assuming Paul is using) doesn't give you those since they're passed in registers and aren't stored or kept anywhere. i386 shows them.. (or at least, their current value on their stack location):

    crash> bt/t 1
    trace: pid 1 lid 1 at 0xdc9ebe2c
    sleepq_block(0,1,c0b9b25c,c0d6d2f0,0,c453ad40,0,,c4c97d80,dc9ebec0) at sleepq_block+0x8f
    cv_wait_sig(c4c97d98,c4520f00,c4d7ef00,c4d7ec30,dc9ebeb0,0,0,c4d7ec30,0,) at cv_wait_sig+0x101
    do_sys_wait(dc9ebedc,dc9ebed8,0,0,0,,c4c97d80,2,dc9ebf00,40) at do_sys_wait+0x1c2
    sys___wait450(c4d80d40,dc9ebf68,dc9ebf60,2,bbba3000,c0d6cdb4,dc9ebf68,1c1,0,0) at sys___wait450+0x37
    syscall() at syscall+0x8a
    --- syscall (number 449) ---
    bba76937:

.mrg.
Re: How to identify specific wait-state for a "DE" process?
On Sat, 9 Jan 2016, Rhialto wrote:

> On Wed 06 Jan 2016 at 17:44:45 +, Taylor R Campbell wrote:
> > This only fixes the problem for certain orderings of file descriptors.
>
> I was thinking of a different hack. Given that filemon now knows there
> are issues if it has to use a fd lower than its own fd, it can avoid the
> situation. If it happens, it might dup2() the output fd so that it gets
> one that is high enough, to use instead.
>
> That ought to work, since as I understand the issue is references to the
> file descriptor, not references to the file structure.
>
> Of course I can immediately see some disadvantages to this approach:
>
> - this will use an extra fd and that is observable by the process
> - and the process might even close that fd if it is doing some blanket
>   close-all-fds action.
>
> Maybe these potential issues can be avoided somehow?

Yes, we avoid all these issues by taking the reference on the file itself, rather than on the descriptor.

+--------------------+--------------------------+------------------------+
| Paul Goyette       | PGP Key fingerprint:     | E-mail addresses:      |
| (Retired)          | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com   |
| Kernel Developer   | 0786 F758 55DE 53BA 7731 | pgoyette at netbsd.org |
+--------------------+--------------------------+------------------------+
Re: How to identify specific wait-state for a "DE" process?
On Wed 06 Jan 2016 at 17:44:45 +, Taylor R Campbell wrote:
> This only fixes the problem for certain orderings of file descriptors.

I was thinking of a different hack. Given that filemon now knows there are issues if it has to use a fd lower than its own fd, it can avoid the situation. If it happens, it might dup2() the output fd so that it gets one that is high enough, to use instead.

That ought to work, since as I understand the issue is references to the file descriptor, not references to the file structure.

Of course I can immediately see some disadvantages to this approach:

- this will use an extra fd and that is observable by the process
- and the process might even close that fd if it is doing some blanket close-all-fds action.

Maybe these potential issues can be avoided somehow?

-Olaf.
-- 
___ Olaf 'Rhialto' Seibert  -- The Doctor: No, 'eureka' is Greek for
\X/ rhialto/at/xs4all.nl    -- 'this bath is too hot.'
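For reference, in userland the "get a descriptor that is high enough" step does not need dup2() with a hand-picked target number: fcntl(F_DUPFD) returns the lowest free descriptor at or above a given minimum. The following is a minimal sketch, assuming plain POSIX. The helper name dup_at_or_above is made up for illustration, and this is not filemon code; inside the kernel, filemon would need a different mechanism.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: duplicate fd onto the lowest free descriptor
 * whose number is >= min_fd. Returns the new descriptor, or -1 with
 * errno set on failure. The original fd stays open.
 */
int
dup_at_or_above(int fd, int min_fd)
{
    return fcntl(fd, F_DUPFD, min_fd);
}
```

A caller that holds the filemon device on fd 3 could ask for dup_at_or_above(STDOUT_FILENO, 4) and hand the result to filemon as the logging descriptor. The two disadvantages listed above still apply: the extra descriptor is visible to the process, and a blanket close-all-fds loop would close it.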
Re: How to identify specific wait-state for a "DE" process?
On Wed, 6 Jan 2016, David Holland wrote:

> On Wed, Jan 06, 2016 at 08:10:36AM +0800, Paul Goyette wrote:
> > Does anyone have any good suggestions for how to arrange for another
> > thread/lwp to run so it can remove the extra reference to the logging
> > descriptor?
>
> A better suggestion: remove the broken behavior of close().

Hmm, perhaps. But that sounds like a much more intrusive, and much more risky, approach. :)
Re: How to identify specific wait-state for a "DE" process?
On Wed, Jan 06, 2016 at 08:10:36AM +0800, Paul Goyette wrote:
> Does anyone have any good suggestions for how to arrange for another
> thread/lwp to run so it can remove the extra reference to the logging
> descriptor?

A better suggestion: remove the broken behavior of close().

-- 
David A. Holland
dholl...@netbsd.org
Re: How to identify specific wait-state for a "DE" process?
   Date: Wed, 6 Jan 2016 09:22:44 -0800
   From: Brian Buhrow

   hello. Is there a particular reason file descriptors are closed in
   ascending order? Traditionally, file descriptors 2, 1 and 0 are always
   in use and it seems like it might be a good idea to have those be the
   last to get closed. I've seen some applications that close all their
   descriptors in descending order. I thought that was odd, but I think
   Paul just came up with a good reason to do such a thing.

This only fixes the problem for certain orderings of file descriptors. I think the best way to fix this properly will be to just modify sys_exit to call a new filemon_prepare_exit routine that pre-closes any filemon-owned references to files, so that filemon_close itself will not hang.
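To make the "only certain orderings" point concrete, here is a small user-space toy model. It is not kernel code and the function model_exit is invented for illustration. It assumes descriptors 0, 1, 2 plus the filemon and logging descriptors are open, that filemon holds exactly one extra reference on the logging descriptor, and that closing the filemon descriptor releases it. The prepare flag stands in for the proposed filemon_prepare_exit routine, whose real shape is not given in this thread.

```c
#include <stdbool.h>

#define NFD 8   /* size of the toy descriptor table */

/*
 * Toy model of process exit. Returns -1 if every descriptor closes,
 * otherwise the descriptor number at which the exiting process would
 * sleep forever.
 *
 * filemon_fd: slot holding /dev/filemon
 * log_fd:     slot filemon borrowed for its log (one extra reference)
 * descending: close from high to low instead of low to high
 * prepare:    drop filemon's reference before closing anything
 *             (stand-in for the proposed filemon_prepare_exit)
 */
int
model_exit(int filemon_fd, int log_fd, bool descending, bool prepare)
{
    bool in_use[NFD] = { false };
    int extra[NFD] = { 0 };
    int i, fd;

    in_use[0] = in_use[1] = in_use[2] = true;
    in_use[filemon_fd] = in_use[log_fd] = true;
    extra[log_fd] = 1;

    if (prepare)
        extra[log_fd] = 0;

    for (i = 0; i < NFD; i++) {
        fd = descending ? NFD - 1 - i : i;
        if (!in_use[fd])
            continue;
        if (extra[fd] > 0)
            return fd;          /* the close would wait forever */
        if (fd == filemon_fd)
            extra[log_fd] = 0;  /* closing filemon releases the log */
        in_use[fd] = false;
    }
    return -1;
}
```

In this model, ascending order blocks at fd 1 when filemon is on fd 3 and logs to fd 1, and descending order rescues that case. Descending order then blocks at fd 4 when filemon is on fd 3 and logs to fd 4. With prepare set, both orderings complete.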
Re: workqueue semantics [was Re: How to identify specific wait-state for a "DE" process?]
hello. Is there a particular reason file descriptors are closed in ascending order? Traditionally, file descriptors 2, 1 and 0 are always in use and it seems like it might be a good idea to have those be the last to get closed. I've seen some applications that close all their descriptors in descending order. I thought that was odd, but I think Paul just came up with a good reason to do such a thing.
-Brian

On Jan 6, 11:38am, Paul Goyette wrote:
} Subject: Re: workqueue semantics [was Re: How to identify specific wait-st
} On Wed, 6 Jan 2016, Taylor R Campbell wrote:
}
} >    Date: Tue, 5 Jan 2016 21:48:42 -0500
} >    From: Thor Lancelot Simon
} >
} >    You can probably use workqueues for this. Looking at the manual page
} >    again for the first time in years, I think it's a little misleading --
} >    what I believe is meant by "A work must not be enqueued again until the
} >    callback is called..." is really "a work item must not be re-enqueued
} >    before it has been processed by the *func callback", not the alternate,
} >    crazy reading that would imply workqueues can only have one enqueued
} >    item at a time.
} >
} > Your reading of the man page is correct: it is the struct work, not
} > the struct workqueue *, that may not be reused until the callback is
} > run.
} >
} > (I'm not sure how this would help for pgoyette's application, though.)
}
} I don't know how it would help, either. The best I can think of is to
} have a periodic task run which checks to see if the file descriptor is
} being closed; if yes, then the code could release the reference and
} wake up the condvar waiter. But is this really a good thing to do? And
} what would be an appropriate interval?
>-- End of excerpt from Paul Goyette
Re: workqueue semantics [was Re: How to identify specific wait-state for a "DE" process?]
On Tue, 5 Jan 2016, Thor Lancelot Simon wrote:

> On Wed, Jan 06, 2016 at 11:38:00AM +0800, Paul Goyette wrote:
> > On Wed, 6 Jan 2016, Taylor R Campbell wrote:
> >
> > >    Date: Tue, 5 Jan 2016 21:48:42 -0500
> > >    From: Thor Lancelot Simon
> > >
> > >    You can probably use workqueues for this. Looking at the manual page
> > >    again for the first time in years, I think it's a little misleading --
> > >    what I believe is meant by "A work must not be enqueued again until the
> > >    callback is called..." is really "a work item must not be re-enqueued
> > >    before it has been processed by the *func callback", not the alternate,
> > >    crazy reading that would imply workqueues can only have one enqueued
> > >    item at a time.
> > >
> > > Your reading of the man page is correct: it is the struct work, not
> > > the struct workqueue *, that may not be reused until the callback is
> > > run.
> > >
> > > (I'm not sure how this would help for pgoyette's application, though.)
> >
> > I don't know how it would help, either. The best I can think of is to have
> > a periodic task run which checks to see if the file descriptor is being
> > closed; if yes, then the code could release the reference and wake up the
> > condvar waiter. But is this really a good thing to do? And what would be
> > an appropriate interval?
>
> Why do you need a periodic task? When you enqueue the work to release the
> reference, the workqueue framework will run the callback function -- in a
> different thread. Then it can wake you up. Isn't that what you wanted?

The problem is, what triggers "enqueue to the work queue"? It needs to happen only _after_ entry to sys_exit(), but _before_ the exit code gets around to calling fd_free() which sequentially calls fd_close(fd) ...

I don't want to release the reference while things are running, only when they're done. Something like an atexit(3) for the kernel! :)
Re: workqueue semantics [was Re: How to identify specific wait-state for a "DE" process?]
On Wed, Jan 06, 2016 at 11:38:00AM +0800, Paul Goyette wrote:
> On Wed, 6 Jan 2016, Taylor R Campbell wrote:
>
> >    Date: Tue, 5 Jan 2016 21:48:42 -0500
> >    From: Thor Lancelot Simon
> >
> >    You can probably use workqueues for this. Looking at the manual page
> >    again for the first time in years, I think it's a little misleading --
> >    what I believe is meant by "A work must not be enqueued again until the
> >    callback is called..." is really "a work item must not be re-enqueued
> >    before it has been processed by the *func callback", not the alternate,
> >    crazy reading that would imply workqueues can only have one enqueued
> >    item at a time.
> >
> > Your reading of the man page is correct: it is the struct work, not
> > the struct workqueue *, that may not be reused until the callback is
> > run.
> >
> > (I'm not sure how this would help for pgoyette's application, though.)
>
> I don't know how it would help, either. The best I can think of is to have
> a periodic task run which checks to see if the file descriptor is being
> closed; if yes, then the code could release the reference and wake up the
> condvar waiter. But is this really a good thing to do? And what would be
> an appropriate interval?

Why do you need a periodic task? When you enqueue the work to release the reference, the workqueue framework will run the callback function -- in a different thread. Then it can wake you up. Isn't that what you wanted?

Thor
Re: workqueue semantics [was Re: How to identify specific wait-state for a "DE" process?]
On Wed, 6 Jan 2016, Taylor R Campbell wrote:

> >    Date: Tue, 5 Jan 2016 21:48:42 -0500
> >    From: Thor Lancelot Simon
> >
> >    You can probably use workqueues for this. Looking at the manual page
> >    again for the first time in years, I think it's a little misleading --
> >    what I believe is meant by "A work must not be enqueued again until the
> >    callback is called..." is really "a work item must not be re-enqueued
> >    before it has been processed by the *func callback", not the alternate,
> >    crazy reading that would imply workqueues can only have one enqueued
> >    item at a time.
>
> Your reading of the man page is correct: it is the struct work, not
> the struct workqueue *, that may not be reused until the callback is
> run.
>
> (I'm not sure how this would help for pgoyette's application, though.)

I don't know how it would help, either. The best I can think of is to have a periodic task run which checks to see if the file descriptor is being closed; if yes, then the code could release the reference and wake up the condvar waiter. But is this really a good thing to do? And what would be an appropriate interval?
workqueue semantics [was Re: How to identify specific wait-state for a "DE" process?]
   Date: Tue, 5 Jan 2016 21:48:42 -0500
   From: Thor Lancelot Simon

   You can probably use workqueues for this. Looking at the manual page
   again for the first time in years, I think it's a little misleading --
   what I believe is meant by "A work must not be enqueued again until the
   callback is called..." is really "a work item must not be re-enqueued
   before it has been processed by the *func callback", not the alternate,
   crazy reading that would imply workqueues can only have one enqueued
   item at a time.

Your reading of the man page is correct: it is the struct work, not the struct workqueue *, that may not be reused until the callback is run.

(I'm not sure how this would help for pgoyette's application, though.)
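The distinction between "one item per queue" and "one enqueue per item" can be illustrated with a user-space toy. This is not the workqueue(9) implementation or its API; every name below (toy_work, toy_queue, toy_enqueue, toy_run, toy_add_payload, toy_total) is invented for the sketch. The only rule it models is the one confirmed above: a given work item may not be enqueued again until its callback has been called, while the queue itself may hold many distinct items.

```c
#include <stdbool.h>
#include <stddef.h>

struct toy_work {
    struct toy_work *next;
    bool pending;       /* true from enqueue until the callback is called */
    int payload;
};

struct toy_queue {
    struct toy_work *head, *tail;
    void (*func)(struct toy_work *);
};

/* Sample callback for the sketch: sums the payloads it sees. */
int toy_total;

void
toy_add_payload(struct toy_work *w)
{
    toy_total += w->payload;
}

/* Refuse to enqueue an item that is still waiting for its callback. */
bool
toy_enqueue(struct toy_queue *q, struct toy_work *w)
{
    if (w->pending)
        return false;
    w->pending = true;
    w->next = NULL;
    if (q->tail != NULL)
        q->tail->next = w;
    else
        q->head = w;
    q->tail = w;
    return true;
}

/* Run every queued item in order; returns how many callbacks ran. */
int
toy_run(struct toy_queue *q)
{
    int n = 0;

    while (q->head != NULL) {
        struct toy_work *w = q->head;

        q->head = w->next;
        if (q->head == NULL)
            q->tail = NULL;
        w->pending = false;     /* the item may be reused from here on */
        q->func(w);
        n++;
    }
    return n;
}
```

Two different items can sit in the queue together, a second enqueue of the same item is rejected while it is pending, and the item becomes reusable once its callback has been invoked.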
Re: How to identify specific wait-state for a "DE" process?
On Wed, Jan 06, 2016 at 08:10:36AM +0800, Paul Goyette wrote:
>
> Does anyone have any good suggestions for how to arrange for another
> thread/lwp to run so it can remove the extra reference to the logging
> descriptor?

You can probably use workqueues for this. Looking at the manual page again for the first time in years, I think it's a little misleading -- what I believe is meant by "A work must not be enqueued again until the callback is called..." is really "a work item must not be re-enqueued before it has been processed by the *func callback", not the alternate, crazy reading that would imply workqueues can only have one enqueued item at a time.

-- 
  Thor Lancelot Simon    t...@panix.com

  "We cannot usually in social life pursue a single value or a single moral
   aim, untroubled by the need to compromise with others." - H.L.A. Hart
Re: How to identify specific wait-state for a "DE" process?
This scenario reminds me of:

https://www.sqlite.org/compile.html#minimum_file_descriptor

-bch

On 1/5/16, Paul Goyette wrote:
> [...]
> As long as the /dev/filemon file descriptor is numerically smaller than
> the logging fd, it gets closed first, and everything works fine. But we
> will hang if we try to close the logging file first because of the extra
> reference.
>
> Does anyone have any good suggestions for how to arrange for another
> thread/lwp to run so it can remove the extra reference to the logging
> descriptor?
Re: How to identify specific wait-state for a "DE" process?
On Wed, 6 Jan 2016, Paul Goyette wrote:

> I need to figure out why this is a problem when filemon(4) "borrows" the
> fd for stdout, but is not a problem when it borrows a real file.

OK, I figured out what's going on.

In the failure scenario, we have the following events:

1. Process opens /dev/filemon and gets fd #3
2. Process tells filemon to log activity to fd #1 (stdout)
3. Process calls sys_exit(), which starts process cleanup
4. Clean-up code tries to fd_close all open descriptors, in
   order, so handles fd #0 and then fd #1
5. fd #1 has another reference, so we wait on the condvar,
   which never gets broadcast since there's no other thread
   to run. We hang here forever.

In the success scenario, we have a slightly different sequence:

1. Process opens /dev/filemon and gets fd #3
2. Process opens up a temp file (or simply calls dup(stdout))
   and gets fd #4; the process tells filemon to log activity
   to fd #4
3. Process calls sys_exit(), which starts process cleanup
4. Clean-up code tries to fd_close all open descriptors, in
   order, so handles fd #0 and then fd #1
5. In this scenario, fd #1 has no extra references, so it can
   close normally.
6. Cleanup proceeds with fd #2, and then gets to fd #3, where
   /dev/filemon is open
7. We call filemon_close() which calls fd_putfile() on fd #4.
   This removes the additional reference on fd #4
8. Cleanup moves on to fd #4 which now has only a single
   reference, so it, too, can be successfully closed!

As long as the /dev/filemon file descriptor is numerically smaller than the logging fd, it gets closed first, and everything works fine. But we will hang if we try to close the logging file first because of the extra reference.

Does anyone have any good suggestions for how to arrange for another thread/lwp to run so it can remove the extra reference to the logging descriptor?
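The two sequences above can be replayed with a user-space toy model. This is only a sketch under stated assumptions, not kernel code: the names toy_fd, toy_proc, toy_setup and toy_exit are invented, descriptors 0 through 2 are assumed open, and the only behaviour modelled is "closing a descriptor with an extra reference blocks forever" plus "closing the filemon descriptor drops the extra reference on the logging descriptor".

```c
#include <stdbool.h>

#define MAX_FDS 8

/* One descriptor slot: open flag plus references held by someone else. */
struct toy_fd {
    bool open;
    int extra_refs;
};

struct toy_proc {
    struct toy_fd fds[MAX_FDS];
    int filemon_fd;     /* slot where /dev/filemon is open */
    int log_fd;         /* slot filemon borrowed for its activity log */
};

/* Open 0, 1, 2, the filemon slot and the log slot; filemon takes one
 * extra reference on the log slot. Both slots must be < MAX_FDS. */
void
toy_setup(struct toy_proc *p, int filemon_fd, int log_fd)
{
    int fd;

    for (fd = 0; fd < MAX_FDS; fd++) {
        p->fds[fd].open = (fd < 3);
        p->fds[fd].extra_refs = 0;
    }
    p->fds[filemon_fd].open = true;
    p->fds[log_fd].open = true;
    p->fds[log_fd].extra_refs = 1;
    p->filemon_fd = filemon_fd;
    p->log_fd = log_fd;
}

/* Close every open slot in ascending order. Returns -1 when all slots
 * close, or the slot number where the process would hang. */
int
toy_exit(struct toy_proc *p)
{
    int fd;

    for (fd = 0; fd < MAX_FDS; fd++) {
        if (!p->fds[fd].open)
            continue;
        if (p->fds[fd].extra_refs > 0)
            return fd;  /* nobody else will ever drop the reference */
        if (fd == p->filemon_fd)
            p->fds[p->log_fd].extra_refs--;  /* filemon lets go of the log */
        p->fds[fd].open = false;
    }
    return -1;
}
```

With filemon on fd 3 and the log on fd 1, toy_exit() stops at slot 1, matching step 5 of the failure scenario. With the log on fd 4 it runs to completion, matching the success scenario.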
Re: How to identify specific wait-state for a "DE" process?
On Tue, 5 Jan 2016, Michael van Elst wrote:

> p...@vps1.whooppee.com (Paul Goyette) writes:
>
> > cv_wait() at cv_wait+0x116
> > fd_close() at fd_close+0x39a
> > fd_free() at fd_free+0x178
> > exit1() at exit1+0x10a
> > sys_exit() at sys_exit+0x3a
> > syscall() at syscall+0x9c
> > --- syscall (number 1) ---
>
> > So I guess I need to figure out which/what condvar it is waiting on...
>
> There is only one condvar that fd_close waits for:
>
>         /*
>          * Wait for other references to drain. This is typically
>          * an application error - the descriptor is being closed
>          * while still in use.
>          * (Or just a threaded application trying to unblock its
>          * thread that sleeps in (say) accept()).
>          */
>         ...
>         while ((ff->ff_refcnt & FR_MASK) != 0) {
>                 cv_wait(&ff->ff_closing, &fdp->fd_lock);
>         }

Yep, I found that chunk of code.

Perhaps filemon needs to run in a separate thread/lwp so that it can have a chance to close its access to the "borrowed" fd?

I need to figure out why this is a problem when filemon(4) "borrows" the fd for stdout, but is not a problem when it borrows a real file.
Re: How to identify specific wait-state for a "DE" process?
On Tue, 5 Jan 2016, Stephan wrote:

> > # crash
> > Crash version 7.99.25, image version 7.99.25.
> > Output from a running system is unreliable.
> > crash> trace/t 0t455
> > trace: pid 455 lid 1 at 0xfe8002ff0ce0
> > sleepq_block() at sleepq_block+0xa2
> > cv_wait() at cv_wait+0x116
> > fd_close() at fd_close+0x39a
> > fd_free() at fd_free+0x178
> > exit1() at exit1+0x10a
> > sys_exit() at sys_exit+0x3a
> > syscall() at syscall+0x9c
> > --- syscall (number 1) ---
> >
> > So I guess I need to figure out which/what condvar it is waiting on...
>
> Can't you see that in top or "ps -l" (wchan)?

No. ps only shows

    ps axlww
    UID PID PPID CPU PRI NI VSZ RSS WCHAN STAT TTY TIME COMMAND
    0 455 4360 0 0 00 - DE tty00 0:00.00 (FM)
Re: How to identify specific wait-state for a "DE" process?
p...@vps1.whooppee.com (Paul Goyette) writes:

>cv_wait() at cv_wait+0x116
>fd_close() at fd_close+0x39a
>fd_free() at fd_free+0x178
>exit1() at exit1+0x10a
>sys_exit() at sys_exit+0x3a
>syscall() at syscall+0x9c
>--- syscall (number 1) ---

>So I guess I need to figure out which/what condvar it is waiting on...

There is only one condvar that fd_close waits for:

        /*
         * Wait for other references to drain. This is typically
         * an application error - the descriptor is being closed
         * while still in use.
         * (Or just a threaded application trying to unblock its
         * thread that sleeps in (say) accept()).
         */
        ...
        while ((ff->ff_refcnt & FR_MASK) != 0) {
                cv_wait(&ff->ff_closing, &fdp->fd_lock);
        }

-- 
-- 
                                Michael van Elst
Internet: mlel...@serpens.de
                                "A potential Snark may lurk in every tree."
Re: How to identify specific wait-state for a "DE" process?
> # crash
> Crash version 7.99.25, image version 7.99.25.
> Output from a running system is unreliable.
> crash> trace/t 0t455
> trace: pid 455 lid 1 at 0xfe8002ff0ce0
> sleepq_block() at sleepq_block+0xa2
> cv_wait() at cv_wait+0x116
> fd_close() at fd_close+0x39a
> fd_free() at fd_free+0x178
> exit1() at exit1+0x10a
> sys_exit() at sys_exit+0x3a
> syscall() at syscall+0x9c
> --- syscall (number 1) ---
>
> So I guess I need to figure out which/what condvar it is waiting on...

Can't you see that in top or "ps -l" (wchan)?
Re: How to identify specific wait-state for a "DE" process?
> # crash
> Crash version 7.99.25, image version 7.99.25.
> Output from a running system is unreliable.
> crash> trace/t 0t455
> trace: pid 455 lid 1 at 0xfe8002ff0ce0
> sleepq_block() at sleepq_block+0xa2
> cv_wait() at cv_wait+0x116
> fd_close() at fd_close+0x39a
> fd_free() at fd_free+0x178
> exit1() at exit1+0x10a
> sys_exit() at sys_exit+0x3a
> syscall() at syscall+0x9c
> --- syscall (number 1) ---
>
> So I guess I need to figure out which/what condvar it is waiting on...

It would be great if crash and ddb could print the parameters. Also, most of the "show" commands only work in ddb and not in crash, unfortunately.
Re: How to identify specific wait-state for a "DE" process?
On Tue, 5 Jan 2016, Michael van Elst wrote:

> p...@whooppee.com (Paul Goyette) writes:
>
> > I'm pretty sure that the device in question is the console terminal
> > driver /dev/console since the problem does not happen if filemon is
> > sending the entries to a "real" file. But I can't figure why it is
> > waiting, so I don't know what I should do to satisfy the wait state and
> > continue.
>
> You can use crash(8) to backtrace the process in the kernel.

Yeah, that does give me some additional clues; thanks for the hint.

    # crash
    Crash version 7.99.25, image version 7.99.25.
    Output from a running system is unreliable.
    crash> trace/t 0t455
    trace: pid 455 lid 1 at 0xfe8002ff0ce0
    sleepq_block() at sleepq_block+0xa2
    cv_wait() at cv_wait+0x116
    fd_close() at fd_close+0x39a
    fd_free() at fd_free+0x178
    exit1() at exit1+0x10a
    sys_exit() at sys_exit+0x3a
    syscall() at syscall+0x9c
    --- syscall (number 1) ---

So I guess I need to figure out which/what condvar it is waiting on...
Re: How to identify specific wait-state for a "DE" process?
p...@whooppee.com (Paul Goyette) writes:

>I'm pretty sure that the device in question is the console terminal
>driver /dev/console since the problem does not happen if filemon is
>sending the entries to a "real" file. But I can't figure why it is
>waiting, so I don't know what I should do to satisfy the wait state and
>continue.

You can use crash(8) to backtrace the process in the kernel.

-- 
-- 
                                Michael van Elst
Internet: mlel...@serpens.de
                                "A potential Snark may lurk in every tree."
How to identify specific wait-state for a "DE" process?
Continuing on the saga of using filemon(4) and specifying STDOUT_FILENO for the activity log ...

(I got the answer to my earlier question so quickly, I figure maybe I'll get lucky again!)

With the recently-committed change to spec_vnops.c rev 1.60, the filemon code now successfully writes activity entries to stdout. Everything is fine until the monitoring process (the one which has opened the filemon device) tries to exit. Whether the exit is caused by a "return" from the main() procedure, or as a result of a ^C/SIGINT, as soon as it calls sys_exit() the process hangs. A 'ps' shows the state to be "DE" (uninterruptible device wait, exiting). The process cannot be killed, and it never seems to finish whatever it is waiting for.

I'm pretty sure that the device in question is the console terminal driver /dev/console since the problem does not happen if filemon is sending the entries to a "real" file. But I can't figure why it is waiting, so I don't know what I should do to satisfy the wait state and continue.

I do have a suspicion, however! :) When the monitoring process is told to use a particular file descriptor for the activity log, filemon(4) uses fd_getfile(fd) to access the internal 'struct file'. At some point later, when it is time to write an activity entry, it calls

    (*filemon->fm_fp->f_ops->fo_write) (filemon->fm_fp,
        &(filemon->fm_fp->f_offset), &auio, curlwp->l_cred,
        FOF_UPDATE_OFFSET);

I'm guessing that there's some sort of refcount somewhere that isn't getting decremented? So the process cannot completely close its stdout descriptor. And because it hangs here, the filemon(4) exit code never gets a chance to clean up and call fd_putfile() (and decrement the ref count).

If the refcount issue is the correct diagnosis, what would be the best way to avoid it? Should the filemon(4) code install an at_exit() handler to take care of the call to fd_putfile()? Is there something better?

SIDEBAR #1: The man page for fd_getfile(9) still shows two arguments for fd_getfile():

    struct file *
    fd_getfile(struct filedesc *fdp, int fd);

SIDEBAR #2: There does not seem to be any man page for fd_putfile(), and fd_putfile() is not mentioned on the filedesc(9) page.

SIDEBAR #3: The man page entry for fd_getfile(9) does not mention the fact that a refcount is incremented! The code in kern/kern_descrip.c is, however, pretty clear in its comments:

    /*
     * Look up the file structure corresponding to a file descriptor
     * and return the file, holding a reference on the descriptor.
     */
    file_t *
    fd_getfile(unsigned fd)
    {
    ...