It turns out my $transaction->body_spool; workaround didn't work around the problem at all. After getting a few blank bodies, I added more debugging code in Transaction.pm. I noticed that I was getting a 0 byte file spooled to disk. I added more debugging code and learned that _body_array was empty. Handy.
Something was deleting the file and emptying the Transaction. Since the unlink command only exists once in Transaction::DESTROY, I added some debug calls to see when and why it was getting called. 92689 dispatching DATA 92689 354 go ahead 92689 spooling message to disk 92689 (data_post) bogus_bounce: pass, not a bounce 92689 (data_post) basicheaders: pass 92689 (data_post) domainkeys: pass: good 92689 (data_post) spamassassin: pass, Ham, -0.5 < 100 92700 DESTROY called by main, ./qpsmtpd-forkserver, 0 92700 unlinked /usr/home/qpsmtpd/tmp/1338420262:92689:0 92689 (data_post) dspam: pass, agree, Innocent, 0.85 c 92689 (data_post) virus::clamdscan: pass, clamav reports clean 92703 (queue) queue::qmail_2dqueue: (for 92689) Queuing to /var/qmail/bin/qmail-queue 92689 250 Queued! 1338420264 qp 92703 <e675ded20d54ddb7d4c5b6d2b97d8...@r15.ashford.edu> 92689 DESTROY called by Qpsmtpd::SMTP, lib/Qpsmtpd/SMTP.pm, 67 92689 unlinked /usr/home/qpsmtpd/tmp/1338420262:92689:0 92689 dispatching QUIT 92689 221 mail.theartfarm.com closing connection. Have a wonderful day. 92689 click, disconnecting 92689 (disconnect) karma: unknown 92689 DESTROY called by Qpsmtpd::TcpServer, lib/Qpsmtpd/TcpServer.pm, 139 92689 (post-connection) connection_time: 8.250 s. 92689 DESTROY called by main, ./qpsmtpd-forkserver, 0 92633 cleaning up after 92689 It turns out the file exists on disk when the dspam plugin starts to run, but it's gone by the time the clamdscan plugin runs. More research revealed a code locations that tickles this bug. In the dspam program, I fork to enable bidirectional communication with the dspam command line program. (There isn't a documented way to talk directly to the dspam server). The code looks like this: my $in_fh; if ( ! open($in_fh, '-|') ) { # forks a child process open(my $out_fh, "|$filtercmd") or die "Can't run $filtercmd: $!\n"; print $out_fh $message; close $out_fh; exit(0); }; my $response = <$in_fh>; close $in_fh; This is code that I found in the perl cookbook for safely performing a bidirectional open on a piped process without deadlocking. It works perfectly and reliably. But it's somewhat subtle, as open with a hyphen (the '-|') forks the process. The child does the writing, and the parent does the reading. It's one of the few ways to perform bidirectional communication with an external process. Another is open2, and there's an alternate method in the dspam plugin that uses open2, but I noticed failures in testing that I wasn't able to resolve. When the child process calls exit(0), the qpsmtpd server detects that and runs Transaction::DESTROY. That in turn caused the failures I observed in the parent process. How can qpsmtpd be altered such that a child process exit doesn't harm the parent and subsequent plugins? Matt PS: This also affects two of the queue plugins that fork and have exit calls in a child process. You can see that in the logs above. In the queue plugins, it just happens to be late enough, and nobody bothers running two queue plugins, that this bug hasn't caused anyone else a problem. ````````````````````````````````````````````````````````````````````````` Matt Simerson http://matt.simerson.net/ Systems Engineer http://www.tnpi.net/ Mail::Toaster - http://mail-toaster.org/ NicTool - http://www.nictool.com/ ````````````````````````````````````````````````````````````````````````` On May 30, 2012, at 10:07 AM, Matt Simerson wrote: > > I don't understand why the spool file is disappearing, but I have a > workaround. I modified the get_filename method as follows: > > sub get_filename { > my $self = shift; > my $transaction = shift || $self->qp->transaction; > > my $filename = $transaction->body_filename; > > if ( ! $filename ) { > $self->log( LOGWARN, "Cannot process due to lack of filename" ); > return; > } > > if ( ! -f $filename ) { > $self->log( LOGERROR, "spool file missing! Attempting to respool" ); > $transaction->body_spool; > $filename = $transaction->body_filename; > if ( ! -f $filename ) { > $self->log( LOGERROR, "skip: failed spool to $filename! Giving up" > ); > return; > }; > $self->log( LOGINFO, "Spooled successfully as $filename" ); > } > > return $filename; > }; > > > Instead of failing to scan about 2/3 of the messages, they get respooled and > scanned as they should. But why is respooling necessary? > > Matt > > On May 29, 2012, at 10:03 PM, Matt Simerson wrote: > >> For quite a long time (as long as I've used qpsmtpd), I've been getting >> chmod errors in my logs from the clamdscan plugin. The errors look like this: >> >> (data_post) virus::clamdscan: chmod: >> /usr/home/qpsmtpd/tmp/1338343891:23427:0: No such file or directory >> >> I've also received a few virus emails, that I expected clamav to have >> caught. I'm 99% certain the reason they snuck through is because under some >> circumstances, the clamdscan plugin isn't working. Today I did some research. >> >> To make it easier to test, I abstracted the portions of the clamdscan plugin >> code that extract the filename and test it. The subroutine looks like this: >> >> sub get_filename { >> my $self = shift; >> >> my $filename = $self->qp->transaction->body_filename; >> >> if ( ! $filename ) { >> $self->log( LOGWARN, "Cannot process due to lack of filename" ); >> return; >> } >> >> if ( ! -f $filename ) { >> $self->log( LOGERROR, "skip, $filename not on disk?" ); >> return; >> } >> >> $self->log( LOGINFO, "pass, found $filename on disk." ); >> return $filename; >> }; >> >> Then I waited for a ham to get delivered, and I posted the full logs here: >> http://www.tnpi.net/internet/mail/qpsmtpd.html >> The most relevant portions are: >> 92028 354 go ahead >> 92028 spooling message to disk >> 92028 (data_post) bogus_bounce: pass, not a bounce >> 92028 (data_post) basicheaders: pass >> 92028 (data_post) domainkeys: pass, good >> 92028 (data_post) spamassassin: pass, Ham, -0.2 < 100 >> 92028 (data_post) dspam: pass, agree, Innocent, 0.99 c >> 92028 (data_post) virus::clamdscan: skip, >> /usr/home/qpsmtpd/tmp/1338351693:92028:0 not on disk? >> 92038 (queue) queue::qmail_2dqueue: (for 92028 ) Queuing qp 92038 to >> /var/qmail/bin/qmail-queue >> >> In tracking this down, I read through Transaction.pm, where I found this pod: >> >> =head2 body_filename ( ) >> >> Returns the temporary filename used to store the message contents; useful for >> virus scanners so that an additional copy doesn't need to be made. >> >> Calling C<body_filename()> also forces spooling to disk. A message is not >> spooled to disk if it's size is smaller than >> I<$self-E<gt>config("size_threshold")>, default threshold is 0, the sample >> config file sets this to 10000. >> >> and this code: >> >> sub body_filename { >> my $self = shift; >> $self->body_spool() unless $self->{_filename}; >> $self->{_body_file}->flush(); # so contents won't be cached >> return $self->{_filename}; >> } >> >> And it sure looks like that code is working as advertised. But why is the >> file disappearing from the spool directory before the clamdscan plugin can >> get at it? Or is it ever written there at all? So I copied and pasted >> that same get_filename() into the dspam plugin, and added a call to it. >> Here's the results: >> >> 2768 354 go ahead >> 2768 spooling message to disk >> 2768 (data_post) bogus_bounce: pass, not a bounce >> 2768 (data_post) basicheaders: pass >> 2768 (data_post) domainkeys: skip: unsigned >> 2768 (data_post) spamassassin: pass, Ham, 1.0 < 100 >> 2768 (data_post) dspam: pass, found /usr/home/qpsmtpd/tmp/1338352889:2768:0 >> on disk. >> 2768 (data_post) dspam: pass, agree, Innocent, 0.99 c >> 2768 (data_post) virus::clamdscan: skip, >> /usr/home/qpsmtpd/tmp/1338352889:2768:0 not on disk? >> 2779 (queue) queue::qmail_2dqueue: (for 2768 ) Queuing qp 2779 to >> /var/qmail/bin/qmail-queue >> 2768 250 Queued! 1338352891 qp 2779 >> <b77c472e-2cb4-4c9f-8096-81866b1b0...@cox.net> >> 2768 dispatching QUIT >> 2768 221 mail.theartfarm.com closing connection. Have a wonderful day. >> 2768 click, disconnecting >> >> Huh? The file is there when dspam runs but it's gone when the clamdscan >> plugin runs. >> >> Ideas? >> >> Matt >> >> PS: I am not running async. Yet. >> >> ````````````````````````````````````````````````````````````````````````` >> Matt Simerson http://matt.simerson.net/ >> Systems Engineer http://www.tnpi.net/ >> >> Mail::Toaster - http://mail-toaster.org/ >> NicTool - http://www.nictool.com/ >> ````````````````````````````````````````````````````````````````````````` >> >