> From: Chris Garrigues <[EMAIL PROTECTED]>
> Date: Fri, 31 Aug 2007 13:09:52 -0500
>
> > From: Charlie Brady <[EMAIL PROTECTED]>
> > Date: Fri, 31 Aug 2007 13:49:02 -0400 (EDT)
> >
> > On Fri, 31 Aug 2007, Chris Garrigues wrote:
> >
> > >> From: Chris Garrigues <[EMAIL PROTECTED]>
> > >> Date: Wed, 29 Aug 2007 09:27:42 -0500
> > ...
> > >> Any idea what's going on here? It requires a -9 to kill the processes.
> > ...
> > > and then it hangs forever and requires a -9.
> >
> > Are you quite sure of that? What happens if you use a TERM or QUIT signal?
> > Have you attached strace (or whatever syscall tracing tool is appropriate
> > for your platform)?
>
> I just confirmed it. If I strace I just see that it's hanging on "read(0, ".
>
> Since I hadn't diagnosed what was triggering it until just now, I didn't know
> how to provide a test case. Now that I do, I'll try to get a real strace.
>
> > > Am I doing the wrong thing, is this a bug, or is there something odd
> > > about my system?
> >
> > Last time I looked the qpsmtpd timeout alarm only applied while parsing
> > SMTP or while receiving messages, but not while plugins were executing. I
> > haven't seen any discussion about possible fixes for that (but I haven't
> > checked that it hasn't been fixed). That could explain qpsmtpd waiting
> > forever, but wouldn't explain failure to terminate on TERM and QUIT
> > signals.
>
> Note that it's no longer in my plugin at this point.

Okay, I did an strace while telnetting to the smtp port. If I type "quit"
after I get the 550, it does the right thing, but if I just let it sit there,
it never times out. According to the strace:

    write(2, "21349 Plugin tarpit, hook deny r"..., 51) = 51
    write(2, "21349 550 No such user as utterl"..., 60) = 60
    write(1, "550 No such user as utterlybogus"..., 55) = 55
    alarm(120)                              = 0
    read(0, 0x8c43798, 4096)                = ? ERESTARTSYS (To be restarted)
    --- SIGALRM (Alarm clock) @ 0 (0) ---
    sigreturn()                             = ? (mask now [])
    rt_sigprocmask(SIG_BLOCK, [ALRM], NULL, 8) = 0
    rt_sigprocmask(SIG_UNBLOCK, [ALRM], NULL, 8) = 0
    read(0,

The read following the alarm does indeed wait for 120 seconds, but then the
alarm is blocked....

...and I've just concluded that the problem must be in a library of my own
called by my code. It's code that I wrote years ago...I'll have to figure out
why I blocked the signals that I did when I did.

Thanks for helping me find the problem in my own code.

Chris

--
Chris Garrigues
Trinsic Solutions
President
710-B West 14th Street
Austin, TX 78701-1798
http://www.trinsics.com/blog
http://www.trinsics.com
512-322-0180

Would you rather proactively pay for uptime or reactively pay for downtime?
Trinsic Solutions
Your Trusted Friends in Proactive IT.
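
For anyone who finds this thread later: the general effect described above
(a blocked SIGALRM cannot interrupt a wait on an idle client) can be
reproduced with a minimal Python sketch. This is not qpsmtpd code and not the
library mentioned above; wait_for_input, AlarmFired and the timing values are
made up for illustration. The sketch uses select() with a fallback timeout as
a stand-in for the blocking read(0, ...) so that the demonstration terminates
instead of hanging.

```python
import os
import select
import signal


class AlarmFired(Exception):
    """Raised by the SIGALRM handler to abandon the wait."""


def _on_alarm(signum, frame):
    raise AlarmFired


def wait_for_input(block_alarm, alarm_after=0.2, give_up_after=0.6):
    """Wait on a pipe that never receives data, like an idle SMTP client.

    Hypothetical helper: returns "timed out" if SIGALRM interrupted the
    wait, otherwise a string saying the wait was never interrupted.
    """
    r, w = os.pipe()
    old_handler = signal.signal(signal.SIGALRM, _on_alarm)
    # Optionally block SIGALRM, as some library code might do.
    to_block = {signal.SIGALRM} if block_alarm else set()
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, to_block)
    try:
        try:
            signal.setitimer(signal.ITIMER_REAL, alarm_after)
            # Stand-in for read(0, ...); the fallback timeout only exists
            # so the demonstration ends instead of waiting forever.
            select.select([r], [], [], give_up_after)
        except AlarmFired:
            return "timed out"
        if signal.SIGALRM in signal.sigpending():
            return "still waiting (SIGALRM pending)"
        return "still waiting"
    finally:
        signal.setitimer(signal.ITIMER_REAL, 0)
        # Ignoring the signal discards a pending instance, so unblocking
        # afterwards does not deliver a stale alarm.
        signal.signal(signal.SIGALRM, signal.SIG_IGN)
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)
        signal.signal(signal.SIGALRM, old_handler)
        os.close(r)
        os.close(w)


if __name__ == "__main__":
    print(wait_for_input(block_alarm=False))
    print(wait_for_input(block_alarm=True))
```

With the signal unblocked the handler fires and the wait is abandoned; with it
blocked the timer still expires, but the signal just sits pending and the wait
continues, which matches a process that never times out.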