Re: CVE-2015-0235 exposure via qpsmtpd?

2015-01-28 Thread Jared Johnson
It looks like QP core uses gethostbyaddr() but not gethostbyname().  The fcrdns 
plugin uses Net::DNS, and as far as I can tell Net::DNS never calls 
gethostbyname() either.  So I *think* we're good.

-Jared


From: Charlie Brady charlieb-qpsm...@budge.apana.org.au
Sent: Wednesday, January 28, 2015 7:24 AM
To: qpsmtpd@perl.org
Subject: CVE-2015-0235 exposure via qpsmtpd?

As you can see in the advisory:

http://www.openwall.com/lists/oss-security/2015/01/27/9

exim allows remote exploit of a buffer overflow in glibc.

Has anybody done an analysis of qpsmtpd to see whether there is a code
path via qpsmtpd (and plugins) and perl which allows the same exploit?


Re: Is there any example of a Qmail::Deliverable plugin

2015-01-21 Thread Jared Johnson
Looks like this might do what you need:

https://github.com/smtpd/qpsmtpd/blob/72f1a7a9620b5d732749a2c84e71998113f5d1fc/plugins/qmail_deliverable

-Jared


From: Reinhard Seifert reinh...@currerius.com
Sent: Wednesday, January 21, 2015 5:18 AM
To: qpsmtpd@perl.org
Subject: Is there any example of a Qmail::Deliverable plugin

Hi,

first of all, thank you guys for developing/maintaining qpsmtpd, really
a great piece of work.

I am a newbie qpsmtpd user and also new to Perl... and also quite new to
qmail at that.

It took me some time to dpkg-reconfigure the Debian package, but now it
is working excellently together with SpamAssassin.


I found the Qmail::Deliverable module for perl and installed it using
cpanminus.

But what now?

I looked at perldoc Qmail::Deliverable and found those code lines to use
in a qpsmtpd plugin:

<quote>
use Qmail::Deliverable ':all';

return DECLINED if not qmail_local $recip;
return DECLINED if deliverable $recip;
return DENY, "Who's that?";
</quote>

I pasted those lines into /usr/share/qpsmtpd/plugins/check_deliverable
and added check_deliverable to the /etc/qpsmtpd/plugins config file.

But when restarting the service I get an error message complaining about
the $recip variable.

I tried to find an example plugin which utilizes Qmail::Deliverable, but
did not succeed.

Can anyone give me a hint or even provide a working plugin? Also welcome
are hints regarding the startup of the qmail-deliverabled daemon. Do I
need to start it and will the plugin know about the port?

Thanks,
Reinhard


Re: Issues in the current HEAD

2015-01-03 Thread Jared Johnson
I'm about 51% sure that this will fix it, but I can't arrange a quick test just 
now because apparently xinetd is gone from OS X Mavericks.  If you feel like 
testing this I would be interested in your results:


index 3675829..141214c 100755
--- a/qpsmtpd
+++ b/qpsmtpd
@@ -17,8 +17,8 @@ delete $ENV{ENV};
 $ENV{PATH} = '/bin:/usr/bin:/var/qmail/bin';
 
 my $qpsmtpd = Qpsmtpd::TcpServer->new();
-$SIG{__WARN__} = sub { $qpsmtpd->warn_handler(@_) };
 $qpsmtpd->load_plugins();
+$SIG{__WARN__} = sub { $qpsmtpd->warn_handler(@_) };
 $qpsmtpd->start_connection();
 $qpsmtpd->run(\*STDIN);    # pass the socket like -prefork/-forkserver
 $qpsmtpd->run_hooks('post-connection');

Master has #168 reverted so that will probably work for you now as well, but if 
the above change works I can re-submit this PR in a working state.

-Jared



From: Jared Johnson jjohn...@efolder.net
Sent: Saturday, January 3, 2015 11:11 PM
To: salvi...@gmx.ch
Cc: qpsmtpd@perl.org
Subject: Re: Issues in the current HEAD


Regarding the warn_handler message, I probably managed to mess up my recently 
merged https://github.com/smtpd/qpsmtpd/pull/168 for xinetd mode. Unfortunately 
my availability will be spotty this week. As a temporary fix, you could use a 
different mode (I'd recommend prefork) or un-merge that PR. Of course this may be 
the least of your problems.

On Jan 3, 2015 5:12 PM, salvi...@gmx.ch wrote:
I'm using xinetd according to http://wiki.qpsmtpd.org/doku.php?id=deploy:start, 
i.e. no daemon mode but just running

exec qpsmtpd


-----Original Message-----
Sent: Saturday, 03 January 2015 at 23:31:06
From: Jared Johnson jjohn...@efolder.net
To: salvi...@gmx.ch salvi...@gmx.ch
Subject: Re: Issues in the current HEAD

What daemon mode are you using? Prefork?


Seeking senior perl dev with QP experience

2014-05-26 Thread Jared Johnson

Hi all,

My employer (eFolder, Inc - www.efolder.net) is seeking a senior Perl 
developer to join our team working on our email security software.  I 
respect a lot of the regulars on this list, and our software is built 
with qpsmtpd, so I'm hoping some among you might be available and 
interested in the position.  If so, you can find further details here:


http://goo.gl/rgecQJ

Just a quick plug, this is a 100% telecommute position with amazing 
benefits, and in my personal experience, eFolder is the best employer 
I've ever worked for.  Feel free to contact me directly with informal 
questions about the position, I would be working alongside whoever fills 
this role.


Thanks for reading!

-Jared


Re: qpsmtpd-dev

2012-06-29 Thread Jared Johnson
 At any rate, if a branch like this exists and someone else had the time
 and motivation to do the leg work of applying enhancements from the
 various forks to the core branch, I would most definitely ask my
 employers
 to put most of our codebase at your disposal to do this.  Even if the
 result wasn't something we could use, and we had to remain forked... at
 least it would be useful to _someone_ :)

 https://github.com/qpsmtpd-dev/qpsmtpd-dev

OK, as promised, I've asked my employers to put most of our codebase at
your disposal to do this :)  I'm waiting on a response -- my boss is a
very busy guy, but he's also very friendly to FOSS.  Hopefully I'll be
able to make most or all of our code available via public read-only SVN
soon.

-Jared



Re: qpsmtpd-dev

2012-06-25 Thread Jared Johnson
 I am to the point where I need to deploy. I was hoping that I could get my
 changes merged into the main branch, but that appears impossible. I hate
 the prospect of forking, but I see few alternatives.

 I hereby propose the creation of qpsmtpd-dev.

 The primary aim is merging the forks maintained by myself, Steve, Chris,
 Jared, and others. I am willing to do much of the work to accomplish this.
 I believe there is much to be gained by doing so, not the least of which
 is saving others the pain of learning they are wasting their time trying
 to improve QP in any way that doesn't fit with the unstated goals and
 vision of the project. A secondary goal is to provide a development branch
 of QP in which ideas can be rapidly tested, deployed, and when desired,
 backported to the main branch.

I like the idea, although I'm not absolutely sure that it would accomplish
everything it's meant to.  A primary example to me would be the
divergent methods used by various forks for supporting per-recipient
configuration.  That thread demonstrates how each of our various methods
made some sort of sense to the others, but were also completely
unacceptable to the others for various reasons.  The only solutions I can
think of would be (1) maintain more than one 'aggressive development'
fork, perhaps eventually letting the forks die that don't get used or
hacked on much; (2) somehow support multiple options for how to accomplish
the same thing, selected within the config; (3) pick a single option that
is blessed by upstream -- forcing some to remain quite forked if the
single option is not useful for them.

At any rate, if a branch like this exists and someone else had the time
and motivation to do the leg work of applying enhancements from the
various forks to the core branch, I would most definitely ask my employers
to put most of our codebase at your disposal to do this.  Even if the
result wasn't something we could use, and we had to remain forked... at
least it would be useful to _someone_ :)

-Jared



Re: validating from

2012-06-04 Thread Jared Johnson
 In addition to whatever value it might have for Bayesian filters, it may
 be useful to always add an X-From: header, so that diagnosing email
 problems like my client with the forged From: header would be easier. I
 had to grep through his server logs to see how the spammer bypassed the
 SPF and SA tests. (SA only sees From: and SPF only uses MAIL FROM).

 I wonder if X-Rcpt-To should be similarly added.

 Has this been done before?  Should it be?

Our forked stuff adds X-Envelope-Recipient and X-Envelope-Sender headers
to messages, and we've found them useful.  But note that we inject a
separate message into the queue for each recipient, which means that every
message has only one X-Envelope-Recipient header which should already be
known in some way to the real recipient of the message.  If you use the
normal method of injecting a single message into the queue for multiple
recipients, and therefore include multiple X-Envelope-Recipient headers,
then you're going to wind up violating the intentions of a sender who
BCC'd multiple recipients to avoid letting all the recipients in on each
others' email addresses.
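As a loose illustration (in Python rather than qpsmtpd's Perl; the function name is my own invention), the point is that each queued copy carries only its own recipient's address:

```python
def envelope_headers(sender, recipient):
    """Build the tracking headers for ONE queued copy of a message.

    Because each recipient gets a separate queued copy, a BCC'd recipient
    never sees the other recipients' X-Envelope-Recipient values.
    """
    return [
        'X-Envelope-Sender: %s' % sender,
        'X-Envelope-Recipient: %s' % recipient,
    ]
```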

-Jared



Re: validating from

2012-06-02 Thread Jared Johnson
 What problems might I encounter if I were to do this?

 I ask because I have a client who is currently getting spammed viciously
 by spammers who use one address in MAIL FROM (to pass SPF tests) and they
 use the senders email address in the From: header so they can get
 whitelist scoring by SpamAssassin. It's pretty clever.

Having perused a lot of spam and non-spam, I would generally expect you to
have problems if you were very strict about this.  Our solution to
self-whitelisting issues is that our whitelisting plugin ignores any
whitelist with the recipient's own domain in it.  You could perhaps do the
same thing with some sanity checks after the fact (or before the fact, if
you have a UI through which people enter SA whitelists).  We got tired of
tip-toeing around SA, though, so we simply changed our UI to save
whitelist in our own native table rather than SA, and wrote our own
plugins.

 Another way to solve part of this problem is that if MAIL FROM contains a
 local domain, reject it unless relay_client is set and the local user
 exists.

This might still be too strict, but it would probably go over better.  At
least there would be recourse if there are FPs.

 If the To header exists, shouldn't that also be validated against RCPT TO?

One would think.. but again I'd expect FP's to happen because of
legitimate senders doing strange, interesting, and foolish things.

-Jared



Re: relay plugins

2012-06-02 Thread Jared Johnson
You could ask the core developers to give you direct access to the main github 
repo :-)

It seems apparent that you're invested in moving QP forward, and have the time, 
talent, and motivation to do so while still playing by the rules. If I was on 
the core team I think I'd be fine with letting you use your own better judgment 
on when changes are ready and I'd rather review them after the fact at my 
leisure than delay them till I get around to signing off.

Just my $0.02

Matt Simerson m...@tnpi.net wrote:


Is there anything else I can do to move this forward?

Matt

On May 21, 2012, at 12:06 PM, Matt Simerson wrote:

 
 https://github.com/smtpd/qpsmtpd/pull/13
 
 
 replaces 3 relay plugins (check_relay, check_norelay, relay_only) with a 
 single plugin (relay).
 
 Same functionality. 
 Bonus: includes tests. And a brief description of how the settings in the 3 
 config files interact. :)
 
 
 `
   Matt Simerson   http://matt.simerson.net/
   Systems Engineer    http://www.tnpi.net/
 
   Mail::Toaster  - http://mail-toaster.org/
   NicTool  - http://www.nictool.com/
 `
 



Re: new plugin: naughty nice

2012-05-26 Thread Jared Johnson
 Been there, tried that. I've had mx2.freebsd.org soft 4xx blocked for 1
 day at a time for over a week. Besides the increased number of connections
 to my server, using a 4xx error makes almost no difference.  As soon as
 the penalty box expires, the queued ham and spam pour in and mx2 gets
 tossed back into the penalty box. For most messages, a 4xx error ends up
 being the same as 5xx error except with a delay.

I don't know if I'm typical, but I would personally be very hesitant to
reject or even defer a server such as mx2.freebsd.org, which is known to
send legitimate mail, even if I know that it is going to try to send some
spam as well (assuming of course that it is not intentionally originating
the spam or infected with a trojan/botnet, at which point it could no
longer be called a legitimate server).  I would be more inclined to use
the 'penalty box' to filter out illegitimate servers, and rely on content
filtering to do as much as can be done about spam on legitimate mailing
lists, etc.

A server operated or controlled by spammers or botnets is not going to
retry messages indefinitely like this, so it should be much more
susceptible to a 'penalty box' regardless of whether you reject or defer.

-Jared



Re: new plugin: naughty nice

2012-05-24 Thread Jared Johnson
We do something not exactly similar to this, but which might be somewhat
instructive.  We modified the banner delay plugin so that the amount of
delay we apply to a server depends on its previous behavior with us.  The
default delay is 15 seconds; if you send a message that we reject as spam
or virus, we increase the delay by 5 seconds.  If you send us a message
that we deliver as clean (whether that's because of spam scanners,
whitelists, or whatever), we reduce the delay by 5 seconds.  We attached
other values to some other rejections; for instance, if you try to send
mail to an invalid user, the delay is increased by 1 or 2 seconds.  We do
not increase the delay to more than around 300 seconds, so that at least
some very RFC-compliant MTA's at least have a chance to deliver, even if
they manage to get themselves horribly delayed.
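A rough sketch of that scoring rule (my Python, not qpsmtpd's Perl; the constants come from the description above, the names are mine):

```python
# Adaptive banner-delay scoring: start at a default, nudge the delay up or
# down per observed event, and clamp to a sane range.

DEFAULT_DELAY = 15   # seconds, for a client we know nothing about
MAX_DELAY = 300      # cap so patient, RFC-compliant MTAs can still deliver

EVENT_DELTA = {
    'clean': -5,          # message delivered as clean
    'spam': +5,           # message rejected as spam or virus
    'invalid_user': +2,   # mail to a nonexistent user (1-2s in the post)
}

def next_delay(current_delay, event):
    """Return the new banner delay after one observed event,
    clamped to the range [0, MAX_DELAY]."""
    return max(0, min(MAX_DELAY, current_delay + EVENT_DELTA[event]))
```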

What we've found is that the main strength of the banner delay for us is
not that it exposes early talkers (it exposes some), but that it
encourages early disconnectors :)  Some spambots give up after just 5
seconds of delay; the one that wait out our 15 second delay will often try
to send a few messages and get rejected, then once we make our way up to
20 or 30 seconds they start giving up on us altogether.

So after we implemented this we noticed that what we had on our hands was
a rough score card for all connecting IP's indicating how well we know
them and how naughty they have been.  From this we've been able to
further reduce the number of transactions for which scanning is required;
for instance, if a client gets the default delay (e.g. for all intents and
purposes we haven't heard from them), we only let them have a single
concurrent connection with us.  We can also be more lenient with clients
that have won and maintained a banner delay of zero by sending us some
good mail and not screwing it up with bad mail.  If your 'naughty' plugin
is pretty successful, perhaps we could benefit from adding a 'penalty box'
like this based on the delay -- clients that have earned, say, a 30 or 60
second delay by trying to send us too much spam and not enough ham get
sent to the penalty box and are therefore no longer able to increase our
concurrency.  I had an idea similar to this a while ago (we haven't got
around to implementing it), where we *randomly* disconnect at the outset a
client that's earned an increased banner delay, and the frequency of
disconnection depends on how high a banner delay they've earned.  This
would allow us to throw away connections while, again, still giving good
clients a chance to eventually connect to us and send legit mail, thereby
decreasing their delay.

Others may or may not find it worthwhile to implement this sort of mucking
with banner delays (I'll list a couple of problems with it below); but
simply scoring clients might be an answer to the problem of legit clients
being rejected.  Rather than putting a client in the penalty box at the
first violation, give them a few chances, and let clean mail put them back
into good graces.

For storage of these scores we have been using a Postgres table with three
fields: IP, delay (aka score), and timestamp, with entries expired after a
client has not talked to us at all for 24 hours.  We've talked about using
Cache::FastMMap for this because there is not really any benefit to
keeping it in the database (we don't even try to synchronize it across
clusters, if your score is effectively reset because you started talking
to a new node then so be it).  But honestly, we haven't seen any big
bottlenecks with the DB, so we haven't bothered doing the coding and
testing work to move it to obviously more efficient mechanisms.

For anyone considering the delay method itself, one big weakness with this
method is that since it does not get rid of naughty connections as
quickly as possible, unless you're using the async daemon (we aren't), it
can wind up taking up a lot of resources.  Obviously it will still save
plenty of CPU on account of spammers giving up instead of giving us
something to scan; but without async, the sharp increase in concurrency
will hurt memory usage.  One thing we recently added which I think will
really help in this area is that we do our RBL lookup before starting the
delay, and if a server is RBL'd we skip the delay and go right to
disconnecting them.  Since so many servers are RBL'd this should help
concurrency a lot.  If the server is not RBL'd, we subtract the amount of
time it took for us to do the RBL lookup (and anything else we might have
done first) from their banner delay -- after all, we've already been
delaying our banner for that amount of time.
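The RBL-first ordering might look roughly like this (a Python sketch under my own names; qpsmtpd is Perl):

```python
import time

def pre_banner_action(earned_delay, is_rbl_listed, precheck_started):
    """Return None to disconnect immediately (RBL-listed client), or the
    number of seconds still to wait before sending the banner.

    Time already spent on the RBL lookup (and anything else done first)
    counts toward the earned delay, as described above.
    """
    if is_rbl_listed:
        return None
    elapsed = time.monotonic() - precheck_started
    return max(0.0, earned_delay - elapsed)
```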

Aside from that, simply doing a banner delay has caused problems for a few
legit servers which give up before they even get to our default delay. 
We've pretty much completely gotten around this by excluding all private
IP's, and clients that are given relay permission by our IP-based rules,
from any banner delay whatsoever.

HTH!


Re: [PATCH 3/7] basicheaders, add reject option, loglevel

2012-05-21 Thread Jared Johnson
 There's several plugins that have something like what you've suggested:

   plugin action [ add-header | deny | denysoft ]

 And here's a sampling of the arguments that various plugins use with
 action:

   add-header, log, continue, reject, deny, denysoft, accept, delete, add

 Reject has a simple, explicit, and universal meaning in every plugin, and
 thus my preference for it.

This makes sense to me, but what I thought I had seen come down the line
(and what I just noticed in another commit, at any rate) was that 'reject'
is not exactly boolean; it's tied to the reject_type field.  It seems like
this field is just as easy to get crufty over time as the 'action' field,
although it at least has fewer potential values.  I would contend that
reject=1,reject_type=defer is not really rejecting at all, which makes it
confusing that you have to turn on rejection in order to defer.

I have to admit I haven't been reading your patches closely, so I could be
missing things that make this an obviously better idea than I'm
perceiving.  I might have just kept my mouth shut, but I thought I'd see
what you think :)

-Jared



Re: [PATCH 3/7] basicheaders, add reject option, loglevel

2012-05-18 Thread Jared Johnson
a little OT:  'reject N' seems a little clunky; why not something more
like 'action [reject | add-header | ...]'?

-Jared



Re: A basket of changes

2012-04-06 Thread Jared Johnson
 I also changed the per-user feature to only take effect if the
 message has a single recipient.

Perhaps it would be useful to give people the option of only accepting one
recipient per SMTP transaction?  This seems a bit messy/inefficient, we
don't do it ourselves, but I've heard from at least one QP user with a big
installation that has had success with this; anything after the first
(accepted) recipient is simply deferred, so the sending MTA is forced to
send messages to multiple recipients with one transaction per recipient. 
This would allow people to always take advantage of per-user spam rules,
if they wished.  I should note that our installation does institute a
higher recipient limit (since we queue once per recipient we were running
into problems with client timeouts being reached while waiting for our
response to the DATA command...) and sending MTA's have generally obeyed
this, but it has generated some questions from sending organizations as
their queued take a little longer to empty out, especially in the case
where our clients are doing outbound filtering.  ...Actually, if it was
also possible to turn of SA scanning on relay-allowed transactions, then
it would perhaps be reasonable to make it a one-inbound-recipient limit
rather than simply a one-recipient limit.  For inbound mail it's much less
common to have a whole lot of recipients in a single transactions.
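The one-inbound-recipient idea reduces to a small decision rule; a hypothetical sketch (Python rather than Perl, return values echoing qpsmtpd's constants):

```python
def rcpt_response(relay_client, inbound_accepted_so_far):
    """One-inbound-recipient policy: relay (outbound) clients are not
    limited; the first inbound recipient is accepted, and any further
    recipients are deferred with a 4xx (DENYSOFT) so the sending MTA
    retries them in separate transactions."""
    if relay_client:
        return 'OK'
    return 'OK' if inbound_accepted_so_far == 0 else 'DENYSOFT'
```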

-Jared



Re: A basket of changes

2012-04-03 Thread Jared Johnson
 6. spamassassin plugin: added support for per-user SA config settings,
 rewrote header processing logic in spamassassin plugin so it adds all the
 SA generated X-Spam-* headers to the message. Refactored the code for
 better maintainability.

Adding all the X-Spam-* headers to all recipients' messages could
technically be thought of as too transparent in some situations.  In
practical terms the worst thing I can think of would be that one recipient
would be able to see that some other recipient whose preferences they are
not allowed to look at has whitelisted or blacklisted some sender address.
 That's assuming you have added information to each header to indicate
which recipient it is for -- but if you haven't, how would each recipient
know which X-Spam-* header applies to them?

Anyway this isn't exactly an objection, just pointing it out in case
anyone can think of a more serious problem with such transparency.  The
only other option is to queue multiple messages, one for each recipient. 
This is quite a lot of trouble to go to just to avoid exposing X-Spam-*
headers to other recipients... although our organization's fork does go to
this trouble in order to deal with other problems, like Subject header to
use when different recipients have differing spam scores and subject
tagging is on :)

-Jared



Re: [PATCH] tweak QP's header handling for messages with no body

2011-08-16 Thread Jared Johnson
There's already a special case for something similar to this:

# Reject messages that have either bare LF or CR. rjkaes noticed a
# lot of spam that is malformed in the header.

($_ eq ".\n" or $_ eq ".\r")
    and $self->respond(421, "See http://smtpd.develooper.com/barelf.html")
    and return $self->disconnect;

you could just make it ( /\r\r\n$/ or $_ eq ".\n" or $_ eq ".\r" ) maybe? 
and update the link to point to a URL that explains both?

-Jared
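The combined check being discussed could be sketched like this (in Python rather than qpsmtpd's Perl, purely to show the matching logic):

```python
import re

def is_suspect_eol(line):
    """True for the bare-CR/LF cases the existing check catches, plus the
    \\r\\r\\n line endings discussed in this thread."""
    return line in ('.\n', '.\r') or re.search(r'\r\r\n$', line) is not None
```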

 Yup there's a lot of this going around right now. Just to be explicit
 though, the header lines end in \r\r\n. Worth rejecting the bloody lot,
 frankly :)

 

  Chris Lewis mailto:cle...@nortel.com
 August 15, 2011 4:21 PM




 As a FYI, I've been seeing bot-emitted spam that appears to have extra
 \r at the end of _all_ header lines, and the qpsmtpd parser seems to
 be treating all of it as part of the _body_.  IOW: except for the
 received line inserted by qpsmtpd, qpsmtpd doesn't see _any_ headers.

 This implementation is backrev (0.80 I think), and as it's only spam
 from one particular bot, we don't care about that particular weirdness
 enough to investigate further.  But it's worth being aware of.

 

  Jared Johnson mailto:jjohn...@efolder.net
 August 15, 2011 3:39 PM


 Hi,

 We got a bug report from someone using IBM's Lotus suite (I think for
 both
 their MUA and MTA). Their users would often send messages where all the
 content was in the subject and they didn't bother sending any message
 content. I'm not sure if it's due to an apparently uncommon behavior for
 their particular MUA or their MTA, but every one of these messages was
 coming through with data in a form that looked like this:

 Subject: howdy\r\n.\r\n

 Rather than including the blank line that one might expect to follow
 headers, since it's required in the event that a message body is
 present:

 Subject: howdy\r\n\r\n.\r\n

 The customer reported these messages were having their subjects
 stripped;
 additional testing indicated all existing headers were being stripped. It
 looks like this is because the loop that processes message data in
 Qpsmtpd::SMTP::data_respond() and creates a Mail::Header object which is
 later used to write out the header in delivery, only works if a blank
 line
 exists after the header, e.g. the second form above. The following is
 all
 I could find in RFC 5322 that elaborated on this blank line, which
 obviously must exist if a message body is included:

 A message consists of header fields (collectively called the header
 section of the message) followed, optionally, by a body. The header
 section is a sequence of lines of characters with special syntax as
 defined in this specification. The body is simply a sequence of
 characters that follows the header section and is separated from the
 header section by an empty line (i.e., a line with nothing preceding the
 CRLF).

 I read this as implicitly allowing the exclusion of this blank line if
 there is no message body: the specification for the blank line is only
 mentioned in the description of the body, which is itself described as
 optional. Considering we haven't run into this bug in years of usage, I
 assume it's unconventional to exclude the blank line, but it looks like
 it
 is legitimate syntax.

 At any rate this was affecting multiple legitimate end users so we put
 together the attached patch, which pulls the header building into its
 own
 sub which is then called inside the loop if we reach the blank line
 indicating the header section is complete; otherwise, it's called
 outside
 of the loop if we have no more message data, indicating the header
 section
 is complete. Sorry I'm not putting this on a github fork, I still don't
 have my git stuff together, I may never get around to it but I thought
 you
 guys might find this useful.

 -Jared
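The essence of the parsing fix described above, sketched in Python (qpsmtpd itself is Perl; this only illustrates finalizing the header section at either the blank line or end-of-data):

```python
def split_message(lines):
    """Split message lines into (header_lines, body_lines).

    Headers are finalized either at the first blank line, or at
    end-of-data when the blank separator (and the body) are absent
    entirely, as in the header-only Lotus messages described above.
    """
    headers = []
    for i, line in enumerate(lines):
        if line == '':                 # blank separator: rest is the body
            return headers, lines[i + 1:]
        headers.append(line)
    return headers, []                 # no blank line: header-only message
```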





Re: [PATCH] tweak QP's header handling for messages with no body

2011-08-16 Thread Jared Johnson
True, I was led astray by the comment that seemed to indicate it was all
about headers.  If the intention of the comment is correct, then it should
probably be if ( $in_header and ... ), whether or not the new regex is added.
That block would also need to be moved to after the block that sets
$in_header = 0 if we've reached the end, to avoid catching the blank line
that separates the headers from the body.

At any rate, Chris has a point -- I was actually thinking the other day
that code like that original deferral would be better off sitting in a
plugin, so I could muck around with it in a more straightforward manner in
order to, say, log what we did to our own database, or change it to a
rejection, or whatever :)

Anyway, this is all kind of minutia compared to the original patch, which
I feel I should remind everyone has to do with accepting legitimate mail
from Lotus software without inadvertently stripping the headers :)

-Jared

 That line of code doesn't look at the headers though, just at the final
 dot at the end-of-data.

 

  Jared Johnson mailto:jjohn...@efolder.net
 August 16, 2011 3:00 PM


 There's already a special case for something similar to this:

 # Reject messages that have either bare LF or CR. rjkaes noticed a
 # lot of spam that is malformed in the header.

 ($_ eq ".\n" or $_ eq ".\r")
     and $self->respond(421, "See http://smtpd.develooper.com/barelf.html")
     and return $self->disconnect;

 you could just make it ( /\r\r\n$/ or $_ eq ".\n" or $_ eq ".\r" )
 maybe?
 and update the link to point to a URL that explains both?

 -Jared



 

  Matt Sergeant mailto:m...@sergeant.org
 August 16, 2011 11:28 AM


 Yup there's a lot of this going around right now. Just to be explicit
 though, the header lines end in \r\r\n. Worth rejecting the bloody
 lot, frankly :)

 

  Chris Lewis mailto:cle...@nortel.com
 August 15, 2011 4:21 PM




 As a FYI, I've been seeing bot-emitted spam that appears to have extra
 \r at the end of _all_ header lines, and the qpsmtpd parser seems to
 be treating all of it as part of the _body_.  IOW: except for the
 received line inserted by qpsmtpd, qpsmtpd doesn't see _any_ headers.

 This implementation is backrev (0.80 I think), and as it's only spam
 from one particular bot, we don't care about that particular weirdness
 enough to investigate further.  But it's worth being aware of.

 

  Jared Johnson mailto:jjohn...@efolder.net
 August 15, 2011 3:39 PM


 Hi,

 We got a bug report from someone using IBM's Lotus suite (I think for
 both
 their MUA and MTA). Their users would often send messages where all the
 content was in the subject and they didn't bother sending any message
 content. I'm not sure if it's due to an apparently uncommon behavior for
 their particular MUA or their MTA, but every one of these messages was
 coming through with data in a form that looked like this:

 Subject: howdy\r\n.\r\n

 Rather than including the blank line that one might expect to follow
 headers, since it's required in the event that a message body is
 present:

 Subject: howdy\r\n\r\n.\r\n

 The customer reported these messages were having their subjects
 stripped;
 additional testing indicated all existing headers were being stripped. It
 looks like this is because the loop that processes message data in
 Qpsmtpd::SMTP::data_respond() and creates a Mail::Header object which is
 later used to write out the header in delivery, only works if a blank
 line
 exists after the header, e.g. the second form above. The following is
 all
 I could find in RFC 5322 that elaborated on this blank line, which
 obviously must exist if a message body is included:

 A message consists of header fields (collectively called the header
 section of the message) followed, optionally, by a body. The header
 section is a sequence of lines of characters with special syntax as
 defined in this specification. The body is simply a sequence of
 characters that follows the header section and is separated from the
 header section by an empty line (i.e., a line with nothing preceding the
 CRLF).

 I read this as implicitly allowing the exclusion of this blank line if
 there is no message body: the specification for the blank line is only
 mentioned in the description of the body, which is itself described as
 optional. Considering we haven't run into this bug in years of usage, I
 assume it's unconventional to exclude the blank line, but it looks like
 it
 is legitimate syntax.

 At any rate this was affecting multiple legitimate end users so we put
 together the attached patch, which pulls the header building into its
 own
 sub which is then called inside the loop if we reach the blank line
 indicating the header section

Re: [PATCH] tweak QP's header handling for messages with no body

2011-08-15 Thread Jared Johnson
Oh and FYI, this patch is well tested on our own variant of SMTP.pm, but
that variant is slightly forked, and this version is itself not
specifically tested.  I'm fairly positive it'll work though :)

 Hi,

 We got a bug report from someone using IBM's Lotus suite (I think for both
 their MUA and MTA).  Their users would often send messages where all the
 content was in the subject and they didn't bother sending any message
 content.  I'm not sure if it's due to an apparently uncommon behavior for
 their particular MUA or their MTA, but every one of these messages was
 coming through with data in a form that looked like this:

 Subject: howdy\r\n.\r\n

 Rather than including the blank line that one might expect to follow
 headers, since it's required in the event that a message body is present:

 Subject: howdy\r\n\r\n.\r\n

 The customer reported these messages were having their subjects stripped;
 additional testing indicated all existing headers were being stripped.  It
 looks like this is because the loop that processes message data in
 Qpsmtpd::SMTP::data_respond() and creates a Mail::Header object which is
 later used to write out the header in delivery, only works if a blank line
 exists after the header, e.g. the second form above.  The following is all
 I could find in RFC 5322 that elaborated on this blank line, which
 obviously must exist if a message body is included:

 A message consists of header fields (collectively called the header
 section of the message) followed, optionally, by a body.  The header
 section is a sequence of lines of characters with special syntax as
 defined in this specification.  The body is simply a sequence of
 characters that follows the header section and is separated from the
 header section by an empty line (i.e., a line with nothing preceding the
 CRLF).

 I read this as implicitly allowing the exclusion of this blank line if
 there is no message body:  the specification for the blank line is only
 mentioned in the description of the body, which is itself described as
 optional.  Considering we haven't run into this bug in years of usage, I
 assume it's unconventional to exclude the blank line, but it looks like it
 is legitimate syntax.

 At any rate this was affecting multiple legitimate end users so we put
 together the attached patch, which pulls the header building into its own
 sub which is then called inside the loop if we reach the blank line
 indicating the header section is complete; otherwise, it's called outside
 of the loop if we have no more message data, indicating the header section
 is complete.  Sorry I'm not putting this on a github fork, I still don't
 have my git stuff together, I may never get around to it but I thought you
 guys might find this useful.

 -Jared
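
The behavior the patch targets can be illustrated with a small standalone sketch (this is an illustration of the idea, not the actual data_respond()/Mail::Header code): the header collector must treat end-of-data as closing the header section, not only the blank separator line.

```perl
# Illustration only -- not the actual Qpsmtpd::SMTP::data_respond() code.
# The header section ends at the blank separator line, OR at end of
# data; a message like "Subject: howdy\r\n.\r\n" has no blank line and
# no body, but its header must still be parsed.
sub split_header_body {
    my @lines = @_;
    my ( @header, @body );
    my $in_header = 1;
    for my $line (@lines) {
        if ( $in_header and $line =~ /^\r?$/ ) {
            $in_header = 0;    # blank line: header section complete
            next;
        }
        if ($in_header) { push @header, $line }
        else            { push @body,   $line }
    }
    # Falling off the end while still in the header is fine: with no
    # body, the header is already complete -- the case the patch fixes.
    return ( \@header, \@body );
}
```

With this approach, a body-less message keeps its header lines instead of losing the subject.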




Re: thoughts about a new module called check_spammer_connect

2011-07-26 Thread Jared Johnson
 I thought about a module which learns from the plugin dnsbl.

 Maybe we call it check_known_dnsbl_spammer ;-) and use the module

 http://search.cpan.org/~robm/Cache-FastMmap-1.39/lib/Cache/FastMmap.pm

Hi Aleks,

We discussed doing this once at my organization and someone astutely
pointed out that considering that this is DNS, the better solution would
really be to run a local caching DNS server, e.g. bind9 and point
resolv.conf to 127.0.0.1 with forwarders to your ISP's nameservers.  Well,
what they actually pointed out was that since we already do this
everywhere, we're already getting in-memory caching that's designed
expressly for DNS results, for free :)

running a local caching DNS server is basically a good idea anyway, and
obsoletes the need for caching DNS results in any particular software.

-Jared
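
For reference, a minimal version of the setup described above might look like this (the forwarder addresses and the Debian-style paths are illustrative only, not from the original mail):

```
# /etc/bind/named.conf.options -- cache locally, forward upstream
options {
    directory "/var/cache/bind";
    recursion yes;
    allow-query { 127.0.0.1; };           # only local services may query
    forwarders { 192.0.2.1; 192.0.2.2; }; # your ISP's resolvers
};

# /etc/resolv.conf -- point the MTA at the local cache
nameserver 127.0.0.1
```

Every DNSBL and rDNS lookup qpsmtpd makes then hits the local cache first, with no per-plugin caching code required.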



[Fwd: Re: [Fwd: STARTTLS vulnerabilty and qmail-spamcontrol ucspi-ssl qpsmtpd]]

2011-06-05 Thread Jared Johnson
I've been otherwise occupied but I forwarded this to the rest of our dev
team and our resident security guru had this to say

 Original Message 
Subject: Re: [Fwd: STARTTLS vulnerabilty and qmail-spamcontrol ucspi-ssl
qpsmtpd]
From: Peter Samuelson psamuel...@efolder.net
Date: Thu, June 2, 2011 1:32 pm
To:   dc...@efolder.net
--


 Subject: [Fwd: STARTTLS vulnerabilty and qmail-spamcontrol ucspi-ssl
qpsmtpd]
 From: Matt Sergeant m...@sergeant.org

Heh, this class of vulnerability made the rounds some months ago -
a (perhaps) surprising amount of open source software out there that
deals in SSL/TLS had this class of bug.

 I'm pretty sure this is enough to fix it for async:

 diff --git a/plugins/tls b/plugins/tls
 index 37fbc9a..f850d2c 100644
 --- a/plugins/tls
 +++ b/plugins/tls
 @@ -275,6 +275,7 @@ sub upgrade_socket {
   my UpgradeClientSSL $self = shift;

   unless ( $self->{_ssl_started} ) {
 +    $self->{_stashed_qp}->clear_data();
   IO::Socket::SSL->start_SSL(
   $self->{_stashed_qp}->{sock}, {
   SSL_use_cert => 1,

Yeah, looks right to me.

 But for the non-async scenario I think it's a lot more complex because
 the caching would be done at the C-level, so a fix more like the fix
 (posted below) for postfix is required (switch to non-blocking, get
 whatever data is remaining, flush it, switch back off non-blocking).

I ... disagree.  From my reading of plugins/tls, it looks like there is
no problem at all, in the non-async code path.  It resets STDIN and
STDOUT to a socket created from scratch by the IO::Socket::SSL module.

I haven't looked at IO::Socket::SSL to see if it has this sort of
issue, but it seems unlikely to me.

Peter
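
The non-async flush Matt describes (switch the socket to non-blocking, drain whatever plaintext remains buffered, switch back) could be sketched like this; a hedged illustration of the technique, not code from qpsmtpd or Postfix:

```perl
use Fcntl qw(F_GETFL F_SETFL O_NONBLOCK);

# Before handing the socket to the TLS layer, read and discard any
# plaintext the client pipelined after its STARTTLS command -- those
# bytes arrived unencrypted and must not be replayed into the TLS
# session.  Returns the number of bytes discarded.
sub drain_plaintext {
    my ($sock) = @_;
    my $flags = fcntl($sock, F_GETFL, 0);
    fcntl($sock, F_SETFL, $flags | O_NONBLOCK);
    my $discarded = 0;
    while (my $n = sysread($sock, my $buf, 4096)) {
        $discarded += $n;    # pre-TLS bytes thrown away
    }
    fcntl($sock, F_SETFL, $flags);    # restore original (blocking) mode
    return $discarded;
}
```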




Re: smtp proxy to external smtp server

2011-05-20 Thread Jared Johnson
We do the sort of signing that is a huge doozy, and Matt is right, it's a
doozy :)  There are a couple of ways we've accomplished re-writing the
body from a MIME::Entity.  Honestly it seems a bit non-standard to me but
we have some special requirements, like leaving the original body around;
I could simplify the examples but I might as well show you known-working
code :P  We currently do:

use File::Temp qw(tempfile);
( $txn->{_body_file}, my $filename ) = tempfile( DIR => $self->temp_dir() );
$txn->body_fh->print("\n");
$mime_entity->print_body( $txn->body_fh );
$txn->{_body_size} = (stat($txn->{_body_file}))[7];
$txn->{_body_start} = 0;

An older method that we used to use (I can't remember whether we switched
to the above because it's more efficient, or because of some of the other
weird things we're doing -- at least the above doesn't require
stringify_body() which is probably slightly undesirable):

$txn->body_resetpos;
$txn->{_body_file}->truncate($txn->{_body_start});
$txn->{_body_size} = $txn->{_body_start};
$txn->body_write( $mime_entity->stringify_body() );

Hope this helps.

-Jared

 On 05/20/2011 04:19 AM, Matt Sergeant wrote:
 What do you mean by signed?
 Signing with a gnugp key.

 Actually by signing the original email I get a new MIME::Entity which
 then I need to pass on the queue to get delivered to the real smtp server.
 In other words, the qpsmtpd proxy signs the email and delivers it to the
 real smtp server.
 --
 Thanks for any hint.
 Mike






Re: smtp proxy to external smtp server

2011-05-20 Thread Jared Johnson
 On 05/20/2011 02:56 PM, Jared Johnson wrote:
 We do the sort of signing that is a huge doozy, and Matt is right, it's
 a
 doozy :)  There are a couple of ways we've accomplished re-writing the
 body from a MIME::Entity.  Honestly it seems a bit non-standard to me
 Why do you think this is non-standard? From a conceptual perspective or
 just from a tooling view?
 If conceptually, which other approach would you recommend?

just non-standard in that I don't use public APIs, so e.g. if they
decided to store the body file in $txn->notes('body_file') instead of
$txn->{_body_file}, for instance, you'd be SOL.  Other than that, though,
we haven't had any problems with our method :)

I'm not sure if it can be done with a public API, but if you want to look
for a method, i'd check out the POD for Qpsmtpd::Transaction, for the
various body* methods.

 I did some preliminary tests and it works.
 That also means there is no API or any other way to put an email into
 the queue, isn't it?

well, the standard way to put a message into the queue is to have the
message in a state you want it in before the queue hook gets ahold of it,
and then let the queueing plugin do its work... I think smtp-forward is an
example of a queueing plugin?

 Well, can I use this approach in a productive environment or could there
 be any side effects?

YMMV, but we've been using that method in production a while with no side
effects that I know of :)

-Jared



Tumbleweed TLS compatibility patch

2011-02-22 Thread Jared Johnson
Hi,

We've been chasing issues with Tumbleweed (now Axway) MailGate appliances
sending messages using TLS through us.  Finally Tumbleweed told one of our
clients that this was probably due to a bug in their software when the
recipient MTA makes multiple writes to the socket.  On a guess, we
patched QP to send multi-line EHLO responses in a single string rather than
multiple strings.  This seems to have cleared up the issue!

I'm attaching the patch, in hopes that (1) it is useful to you guys; and
(2) if there's anything stupid about it that we haven't been able to
catch, someone will notice and help us improve it, since we're about to
include it in our forthcoming point release :)

-Jared--- Prefork.pm.orig	2011-02-22 17:35:39.809587002 -0600
+++ Prefork.pm	2011-02-22 17:34:58.849587003 -0600
@@ -59,10 +59,12 @@ sub respond {
 return(0);
   }
 
-  while (my $msg = shift @messages) {
-    my $line = $code . (@messages ? "-" : " ") . $msg;
-    $self->log(LOGINFO, $line);
-    print "$line\r\n" or ($self->log(LOGERROR, "Could not print [$line]: $!"), return 0);
+  $messages[$_] = "$code-$messages[$_]" for 0..$#messages-1;
+  $messages[-1] = "$code $messages[-1]";
+  $self->log(LOGINFO, $_) for @messages;
+  unless (print join '', map { "$_\r\n" } @messages) {
+    $self->log(LOGERROR, "Could not print [" . join(' | ', @messages) . "]: $!");
+    return 0;
   }
   return 1;
 }
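
The effect of the patch can be shown with a standalone helper (the name build_response is mine, not the patch's): continuation lines use "CODE-" and the final line "CODE " per RFC 5321, and the whole reply is assembled as one string so a single print() produces a single write to the socket.

```perl
# Hypothetical helper mirroring what the patched respond() emits:
# "250-" on continuation lines, "250 " on the last, one joined string.
sub build_response {
    my ( $code, @messages ) = @_;
    my @lines = map { "$code-$_" } @messages[ 0 .. $#messages - 1 ];
    push @lines, "$code $messages[-1]";    # final line: space, not dash
    return join '', map { "$_\r\n" } @lines;
}
```

A client that can't handle multi-write replies sees exactly one TCP-level write for the whole EHLO response.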

Re: rolling uribl and rbl plugins into a single plugin

2011-01-28 Thread Jared Johnson
 I agree that sharing code should be possible here (and that would be
 desirable); but as Matt suggested using the plugin inheritance stuff would
 be better than (eventually) making the One True Spam Fighting plugin.  :-)

OK, I agree.  Hopefully the integration will happen for our next release
(which we're going to test pretty soon), and then someday i'll get the
async bit right and dnsbl lookups will live happily ever after :)

-Jared



rolling uribl and rbl plugins into a single plugin

2011-01-27 Thread Jared Johnson
So our organization is planning doing some big changes to the rbl plugin
and it dawned on us that it seems a lot easier to just add an earlier hook
to the existing uribl plugin (and rename it to rbl?  or bl? or
something?).  But of course I still have in mind that someday I'll get the
plugin completely in shape and it will be in QP, and we'll get to share
code and all.  Would this be an undesirable direction to take the uribl
plugin in for QP proper, in the event that the uribl plugin was integrated
into QP proper?  It seems like it would be handy in terms of both code
organization and features (the main feature I can think of is proper mask
evaluation along with custom actions per mask, just like in the uribl
plugin).

Thoughts?

-Jared



Re: lastest uribl plugin

2011-01-24 Thread Jared Johnson
Specifically regarding the quoted-printable stuff.. my diff must have been
based on an older version.  This is probably a better representation of your
changes:

@@ -574,15 +574,22 @@
 my $line;
 my @qp_continuations;
 while ( $line = $txn->body_getline ) {
+chomp $line;
 if ( $line =~ /(.*)=$/ ) {
 push @qp_continuations, $1;
 next;
 } elsif ( @qp_continuations ) {
-$line = join '', @qp_continuations, $line;
+$line = join '', @qp_continuations, $1;
 @qp_continuations = ();
 }
 $self->find_uris($line);
 }
+if ( @qp_continuations ) {
+$self->log(LOGINFO, "uribl: WARNING: scan_body exiting with line continuations left.  Bad Email?");
+$line = join('', @qp_continuations, $line);
+@qp_continuations = ();
+$self->find_uris($line);
+}
 }

The first and last change should certainly go in, but what about this:


 if ( $line =~ /(.*)=$/ ) {
 push @qp_continuations, $1;
 next;
 } elsif ( @qp_continuations ) {
-$line = join '', @qp_continuations, $line;
+$line = join '', @qp_continuations, $1;
 @qp_continuations = ();
 }

Won't $1 only be defined when the condition in the 'if' is met, thus
making your change equivalent to:

$line = join '', @qp_continuations;

I'm not sure but it seems like that change doesn't belong..

-Jared
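
For reference, the soft-break handling the diff is converging on can be written as a standalone routine (an illustration assuming lines arrive one at a time, as from $txn->body_getline): a trailing "=" is a quoted-printable soft line break, so the fragment is buffered and joined to the following line before URI scanning.

```perl
# Illustration of quoted-printable soft-break joining (not the plugin
# code itself): "http://exam=\r\nple.com/" scans as "http://example.com/".
sub join_qp_lines {
    my @in = @_;
    my ( @out, @pending );
    for my $line (@in) {
        chomp $line;
        if ( $line =~ /(.*)=$/ ) {
            push @pending, $1;    # soft break: buffer the fragment
            next;
        }
        push @out, join( '', @pending, $line );
        @pending = ();
    }
    # Flush a dangling continuation at end of body (malformed message).
    push @out, join( '', @pending ) if @pending;
    return @out;
}
```

The end-of-body flush is exactly the case the first and last hunks of the diff add.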



Re: How to get both sender and recipient address for mail-hook

2011-01-11 Thread Jared Johnson
Pretty sure he meant:

my ( $self, $transaction, $recipient) = @_;
my $sender = $transaction->sender;

Then you can do:

my $recipientdomain = $recipient->host;
my $senderdomain = $sender->host;

-Jared 

Steve Freegard steve.freeg...@fsl.com wrote:

Hi Tapio,

On 11 Jan 2011, at 11:34, Tapio Salonsaari wrote:

 -BEGIN PGP SIGNED MESSAGE-
 Hash: SHA1
 
 Hello!
 
 I'm building an web-based gui for qpsmtpd and currently I'm
 writing/modifying plugin for white/blacklists per domain.
 
 My web gui stores black / white lists into psql-database and the idea is
 to do an SQL-search like:
 SELECT * FROM blacklist WHERE sender=$senderdomain AND
 recipient=$recipientdomain;
 
 Then if the query returns any rows the plugin would return DENY. If
 there's no rows on blacklist search the same for whitelist -table and
 process the data accordingly.
 
 I'm using http://devin.com/qpsmtpd/whitelist as an base (don't know if
 there's something more suitable to start with) and currently I'm stuck
 with sub mail_handler, since I can't figure out how to pass both sender
 and recipient addresses to function.
 

You can't do anything with the recipient addresses in the mail handler because 
they haven't been received yet!  In an SMTP transaction the recipients are 
sent after the MAIL command (see rfc5321).

If you're going to do black/whitelisting using SQL like that; then you'll need 
to do it in the hook_rcpt handler.  As per the wiki docs at 
http://wiki.qpsmtpd.org/api:plugin_hooks the hook_rcpt function is passed:

my ($self,$transaction, $sender) = @_;
  # $sender: a Qpsmtpd::Address object for sender of the message

You can access the sender in the same format (as a Qpsmtpd::Address object) 
via $transaction->sender at this point.

Hope that helps.

Kind regards,
Steve.

Re: white-listing in hosts_allow plugin

2010-12-15 Thread Jared Johnson
 IIRC, the connections-per-ip code isn't in a plugin right now -- it's
 part of the main binary -- because it requires some sort of shared
 state which we don't have for plugins.

The checks do exist in the binaries but they don't have to, you can use
$arg{child_addrs} and $arg{remote_ip} in hook_pre_connection to get at it,
at least in prefork, we've been doing it forever.  Our code is heavily
modified for our db-based prefs but it's a lot like this:

my %per_ip_limits = ( # fake magic per-ip config hashref
'1.2.3.4' => 10,
);

my %class_c_limits = (
'1.2.3.4' => 20,
);

sub ip_to_int {
my ( $ip ) = @_;
$ip = inet_aton($ip) if length($ip) > 4;
return unpack 'N', $ip;
}

sub hook_pre_connection {
my ( $self, $txn, %arg ) = @_;
return NEXT unless @{ $arg{child_addrs} };
my $ip_limit = $per_ip_limits{ $arg{remote_ip} } || 5;
return DENYSOFT_DISCONNECT, 'Too many connections from your IP'
if ( grep { $_ eq $arg{remote_ip} }
 @{$arg{child_addrs}} )
 > $ip_limit;
my $c_limit = $class_c_limits{ $arg{remote_ip} } || 10;
my $ip = ip_to_int( $arg{remote_ip} );
my $mask = $ip & 0xffffff00;
return DENYSOFT_DISCONNECT, 'Too many connections from your class C'
if ( grep { (ip_to_int($_) & 0xffffff00) == $mask }
 @{$arg->{child_addrs}} )
 > $c_limit;
return NEXT;
}

another interesting thing we support is allowing people to set a custom
max concurrency override for an arbitrary cidr; when an IP matches the
CIDR, we check how many other child_addr's match the same CIDR instead of
checking the other limits.  We use Net::CIDR::Lite::Span for this business
though so it's probably less efficient than the above.  We also use
hook_pre_connection to implement global connection limits, which allows us
to defer and write an entry to our DB logs when the limit is exceeded,
rather than just not responding.  This is way less efficient though, and
we've had to contend with it snowballing when the system becomes so busy
that the db is slow.  Oh well, someday we'll switch to async :)

-Jared
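
The per-CIDR override mentioned above can also be done without Net::CIDR::Lite when only IPv4 is involved; this hypothetical helper is not from the production code described in the mail:

```perl
use Socket qw(inet_aton);

# Hypothetical pure-Perl IPv4 CIDR test: is $ip inside $cidr?
sub in_cidr {
    my ( $ip, $cidr ) = @_;
    my ( $net, $bits ) = split m{/}, $cidr;
    my $mask = $bits ? ( ~0 << ( 32 - $bits ) ) & 0xffffffff : 0;
    return ( unpack( 'N', inet_aton($ip) ) & $mask )
        == ( unpack( 'N', inet_aton($net) ) & $mask );
}
```

Counting sibling connections against a configured CIDR is then just grep { in_cidr($_, $cidr) } @{ $arg{child_addrs} }.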



Re: white-listing in hosts_allow plugin

2010-12-15 Thread Jared Johnson
 return DENYSOFT_DISCONNECT, 'Too many connections from your class C'
 if ( grep { (ip_to_int($_) & 0xffffff00) == $mask }
  @{$arg->{child_addrs}} )

Though this code is still untested in its
heavily-modified-from-our-version form, I feel like I should point out
that should be @{ $arg{child_addrs} }, not @{ $arg->{child_addrs} } :)

  > $c_limit;
 return NEXT;
 }


-Jared



Re: Transaction.pm body_fh() badly broken [was: qpsmtpd-v0.84 seems to ignore size_threshold]

2010-10-29 Thread Jared Johnson
 commit a52660a646012691f993cca821c00fe05cff08bb
 Author: jaredj jar...@nmgi.com
 Date:   Wed Mar 25 07:38:05 2009 -0500

 Spool body when $transaction-body_fh() is called

 Qpsmtpd::Transaction::body_filename() calls $self-body_spool() if
 the message body has not already been spool to disk.  This adds
 the same check to Qpsmtpd::Transaction::body_fh()

 but I don't remember the background behind it.  Jared, do you remember
 why you changed this?

Looking at our internal SVN history, it looks like this was
part of an effort to dup() body_fh.  IIRC this was because we were
passing it to an external library that close()'d it but we don't want the
original FH closed.

BUT... at some point internally we noticed that this caused bugs!  I don't
remember what sort, but our commit log says it confuses some methods of
writing to the fh.  So we reverted the changes to body_fh() and added a
dup_body_fh() to do what we want:

sub body_fh {
  return shift->{_body_file};
}

sub dup_body_fh {
  my $self = shift;
  open( my $fh, '<&=', $self->body_fh );
  return $fh;
}

I guess by the time we discovered it was buggy I had forgotten I had
submitted the buggy change upstream.  Sorry!  We only use dup_body_fh() in
one place right now, so though adding such a function might be useful
upstream, but I don't think there's any need to preserve the old,
apparently undocumented, behavior; seems safe to just revert.

-Jared



Re: Auth issues with turnpike

2010-09-17 Thread Jared Johnson
 With this in mind is it possible without having to patch qpsmtpd to
 influence the order that auth methods are presented in from 250 AUTH
 PLAIN LOGIN CRAM-MD5 such that LOGIN is the first available option?
 Apparently in turnpike you can't specify the method you use and it
 always picks the first one in the list.

 I appreciate this is a rather odd request and a much better option would
 be to not use turnpike, but can anyone help with fixing this at the
 server end without resorting to altering the core qpsmtpd code? Altering
 plugins is fine.

It looks like you would need to modify Qpsmtpd/SMTP.pm, at least with the
way things are set up now; it creates the AUTH line outside of
$txn->notes('capabilities'), and uses a hash so the sorting is random. 
This diff would probably work but is untested:

--- dc-smtpd/lib/Qpsmtpd/SMTP.pm (revision 13227)
+++ dc-smtpd/lib/Qpsmtpd/SMTP.pm (working copy)
@@ -243,7 +243,8 @@
 # Check if we should only offer AUTH after TLS is completed
 my $tls_before_auth = ($self->config('tls_before_auth') ?
($self->config('tls_before_auth'))[0] &&
$self->transaction->notes('tls_enabled') : 0);
 if ( %auth_mechanisms && !$tls_before_auth) {
-push @capabilities, 'AUTH '.join(" ", keys(%auth_mechanisms));
+push @capabilities, 'AUTH '.join(" ", grep { exists $auth_mechanisms{$_} }
+ qw( PLAIN LOGIN CRAM-MD5 ));
 $self->{_commands}->{'auth'} = "";
 }

Seems like if at all possible it would be better to have the auth plugin
figure this stuff out and then add to $txn->notes('capabilities') like
everything else does.  Then you could just do it in an extra plugin, like:

sub hook_ehlo {
my ( $self, $txn ) = @_;
my ($auth) = grep { /^AUTH / } @{ $txn->notes('capabilities') }
or return DECLINED;
my %methods = map { $_ => undef } split / /, $auth;
$auth = join( ' ', grep { exists $methods{$_} }
   qw( AUTH PLAIN LOGIN CRAM-MD5 ) );
$txn->notes( capabilities => [ ( grep { ! /^AUTH / }
 @{ $txn->notes('capabilities') } ),
   $auth ] );
}

even though it amounts to more work, it would be nice to be able to use
plugins for such things and not patch SMTP.pm according to one's desires. 
And most things you'd want to hack the AUTH line for would be simpler. 
Indeed, if you aren't worried about dealing with unpredictable states of
the AUTH line, the plugin would probably be reduced to about two lines.

-Jared



URIBL update

2010-07-31 Thread Jared Johnson
I've made a bunch more changes to the uribl plugin locally; man, we
_really_ need to get some kind of svn-to-git thing going.  Or at least I
need to re-educate myself on git and start putting things in my github
again.  If I don't manage to do this by the time (soon) that things are
settled down a bit and we have some production testing, I'll submit a new
one to the list again; but hopefully by then I'll have found time to
get git going again and be able to point people to that.

I got permission from Dallas @ URIBL to use the datafeed data, but also
got his opinion on the matter which is that using tld_lists the way I am
is not going to gain much, and introduces the risk that a new spammer
'haven' could be missed entirely.  After talking with my team we're going
to go with a full TLD list right now, and perhaps later we'll collect our
own stats to verify Dallas is right about the tiny benefit (he probably
is).  tld_lists has been updated to reflect this, though if anyone feels
more bold than me and wants updates to the 'pruned' list let me know.

I modified the parse_mime plugin as discussed previously on the list, now
the uribl plugin isa_plugin('mime_parser') and does lazy parsing.

I'll probably remove 'semicolon munging detection'; as Devin said, if real
(current) data doesn't show it's being used why bother.  I'd like to go
over a larger sampling of current data first though, which I plan to do
soon.

I've re-arranged the code slightly to allow not only the async plugin but
our own local plugin to easily take advantage of plugin inheritance to
avoid code duplication.  Our own plugin is now just 40 lines or so, thus
it gets to inherit the other 600 lines of uribl without any forking :)

There are some additional changes in the works that I'm curious for input
on if anyone cares:

- We're finally getting more URIBL datafeeds.  I'd like to use this data
to verify how static TXT results are for each service and, if applicable,
generate templates for them in the same script that generates tld_lists.pl
(and probably rename that to fit its more general purpose).  So for
services that do indeed have very static TXT templates, we could
optionally skip TXT lookups and instead generate our own response (e.g.
links) without the cost of one more DNS query.  A couple of additional
brainstorms on this topic:

 - Dynamically generate the TXT template by going ahead and doing TXT
   lookups until the first one we get back for each service, at which
   point we cache the template and don't do any more.  (This would have to
   be re-done in every new child process)
 - Dynamically verify the validity of the statically-set template by doing
   TXT lookups until we get one and then checking.  Still has to be
   per-child-process.
 - If we do either of the above dynamic checks, or if we don't choose to do
   any TXT-avoiding magic at all, we ought to launch TXT requests in the
   callback that receives matches on A lookups, that way we do two queries
   per _hit_, rather than two queries per URI.

I've already added the independent option to just turn off TXT queries,
for anyone who wants to save on DNS traffic at the cost of providing links
on rejection.

- We're interested in optionally resolving URL shortening links (e.g.
tinyurl, bit.ly, etc.) using HTTP::Async

My boss is still deliberating whether the URL shortening resolution thing
would be contributed, or if we would consider it part of the 'special
sauce'.  I'm hopeful he'll be in favor of contributing it.

- We'd like to change the check_headers directive to take more args than
just true/false.  0 would still mean 'don't check headers', 1 would still
mean 'check all headers', but 'all' would also mean 'check all headers';
anything else would be interpreted as the header(s) to be checked
(comma-delimited if in list form).  So you could do check_headers =
'subject' or check_headers = 'subject,received'.  The default should
probably be off.  This is mainly because I noticed check_headers
automatically checks the Received header, which is even more interesting
when combined with the SBL-XBL service; basically, this plugin is now a
replacement for SA's RCVD_IN_SBL rule, etc.  This is probably a good
thing... as long as you actually wanted it :)  But you should be able to
avoid it if you like and just specify the headers you're interested in.

Any comments on these?  Do they sound worthwhile?  What should the
defaults be?

-Jared



Re: per-recipient configuration

2010-07-29 Thread Jared Johnson
 I have stuff that might wind up looking like

   $rcpt->persistent->{statistics_receivedmsgs_total}++;
   
 $rcpt->persistent->{statistics_receivedstatsbysender}{$NormalizedSenderAddress}++;

Perhaps $rcpt->storage could be provided that could be tied... it just
seems like this is the less conventional accessor in qp-land. 
notes('foo') can be accessed with ->{_notes}->{foo}, but this is not the
general convention.  When I want to increment a note I just do
$foo->notes( blah => $foo->notes('blah') + 1 ).  That said, some of my
colleagues have also complained that notes() even exists as a method and
think it would be fine for it to just be a reserved namespace accessed
directly as a hash key.

 in addition to reading configuration.

Nothing's exactly stopping you from reading configuration directly with a
persistence hook, it just doesn't seem that natural to me, and I think the
needs of a hook_user_config are a subset of the needs of a persistence
hook.  I wouldn't really cry if I had to change every $rcpt->config() call
in my codebase to $rcpt->get() or $rcpt->storage() etc, it just seems less
intuitive and less in line with the existing convention of $qp->config()

-Jared



Re: per-recipient configuration

2010-07-29 Thread Jared Johnson
 colleagues have also complained that notes() even exists as a method and
 think it would be fine for it to just be a reserved namespace accessed
 directly as a hash key.

or at least that notes() should return the {_notes} hashref if it isn't
passed any arguments

-Jared



Re: per-recipient configuration

2010-07-29 Thread Jared Johnson
 Another thing that nothing stops us from, in a late-bound programming
 language without private namespaces, is adding additional methods to
 base objects without changing the code that declares the base objects,
 as long as the implementation of the base objects promises to remain
 the same.

I do think it would be best though to come to decisions upstream to
provide methods that most people will find sufficient and intuitive,
because once they find it insufficient or unintuitive and add their own
methods, they'll start using them in their code and the code becomes that
much more forked.  I find the $qp->config() style method intuitive for
this myself; adding a hashref accessor to at least the notes() method and
perhaps to this
persistence-method-that-might-also-want-to-cover-config-needs, might be
considered intuitive by many.

-Jared



URIBL plugin 'action' defaults

2010-07-29 Thread Jared Johnson
Would anyone object to setting the default action to 'deny' on certain
reliable low-fp URIBL lists?  Probably Spamhaus SBL-XBL and DBL and URIBL
Black.  It seems like a new user turning on uribl checks would expect them
to do something more than adding headers, as long as the services it
rejects for are reliable.  But this also means that anyone who was already
using the plugin without setting any actions would see a change in
behavior on upgrade.

-Jared



Re: Rewritten URIBL plugin

2010-07-28 Thread Jared Johnson


 Jared Johnson wrote:

  I think we should probably consider putting support for parsed
 messages
  into core, with the parsing done lazily if requested by the API.

 I forgot, we did kinda think of a couple of reasons not to want an API.
 depending on where you put it, you may find QP in general depending on
 MIME::Parser even if it never uses it.  The benefit of the plugin is
 that
 /etc/qpsmtpd/plugins controls whether you wind up having to install, and
 take up your memory with, MIME::Parser

 One thing we've discussed in the past (at least in my imagination)
 although not quite figured out how to implement, is making plugins act
 a little more like normal modules, so that one plugin can use
 another.

 So if you're interested in the parsed mime functionality, your
 plugin can plugin_use util/parsed-mime and the right magic happens.

Oh yeah that's right, someone *did* implement what you're talking about. 
You can do it with 'plugin inheritance' (which ironically i knew nothing
about until looking at the async uribl plugin the other night):

# /usr/share/qpsmtpd/plugins/mime_parser
use MIME::Parser;
sub parse_mime {
my ( $self, $txn ) = @_;
return $txn->notes('mime_body') if $txn->notes('mime_body');
# ... a bunch of code to create a $msg ...
$txn->notes( mime_body => $msg );
return $txn->notes('mime_body');
}

# /usr/share/qpsmtpd/plugins/some_plugin_that_might_want_mime_parsing
sub init {
my ( $self, $qp, %args ) = @_;
return unless $qp->config('parse_mime') or $args{parse_mime};
$self->isa_plugin('mime_parser');
$self->{can_parse_mime} = 1;
}

sub hook_data_post {
my ( $self, $txn ) = @_;
if ( $self->{can_parse_mime} ) {
$self->some_recursive_mime_function( $self->parse_mime($txn) );
} else {
$self->regular_body_getline_function( $txn );
}
}


Voila!  A lazy 'plugin model' that is *also* an API, which doesn't 'use
MIME::Parser' until you *want* to use it; furthermore, you don't have to
just put the official mime_parser plugin in there, with a little
modification (or alternatively some care to use the same filename within
an alternate directory) you could use your home-rolled
messier-but-more-efficient version that doesn't use MIME::Parser at all.

I could easily modify my own stuff to do this and test it.  I'd even be
interested in pulling a couple of our own home-grown MIME recursion
methods into it.  If this is indeed the will of the council ;)

p.s. I'm kind of stoked about plugin inheritance these days.  I'm becoming
convinced that after some code churn on our side it will allow us to
finally switch to async without having to just rewrite all our plugins,
switch to the async daemon, and see what happens.  And with a little churn
on the upstream side, if that manages to happen, I also think it could
allow us to un-fork 90% of the QP plugin code we currently have re-written
and instead submit patches to QP that have some guarantee to be tested and
aren't surrounded by completely useless context :)

-Jared



Re: [Fwd: DoubleCheck DataFeed access]

2010-07-28 Thread Jared Johnson
 Also:  we recently contributed a new URIBL plugin to the Qpsmtpd
 project,
 which makes use of our pruned TLD lists that use your datafeed data.
 They had some question of whether this was something you would be
 comfortable with having distributed publicly.  Note no actual URIBL data
 is distributed, just a list of TLDs that happens to *not* include
 top-level TLDs that would be extremely unlikely to generate hits against
 your service.  This is used to limit the number of extraneous queries to
 your public mirrors, which I'm guessing you would consider beneficial.
 Could you verify whether we have your permission/blessing to distribute
 such a list gleaned from your data?


 Ya, fine.  It doesnt sound like it would have significant impact on
 volume to me, as the top 25 tlds (including ipv4 volume) that are
 queried represent 91% of the total query volume, and the top 100 tld
 represent 99% of the volume.

 If you tell me which tlds are suppressed, I can give you an idea of
 query volume savings according to mirror traffic.

 For example, suppression of .mil and .int would save 4/100ths of a
 percent (0.00039).   Now, if there was a hacked webserver in .mil and
 spammers used it as a drop page or redirector, our temporary listing of
 it would never hit for you if you suppress them.  I guess you have to
 weigh the savings versus the potential for abuse.   You wouldn't want to
 suppress a TLD that becomes the next spammer haven and have to scramble
 to release an update.   Thinking about recent history such as .tk
 (Tokelau),  .st (Sao Tome), .im (Isle of Man), and others.

There are 135 excluded tlds:

ac ad af ag ai al an ao aq arpa asia aw ax bb bf bi bj bm bn bo bt bw cd
cf cg ci ck coop cr cu cv dj dm do dz edu er et fj fk fm fo ga gf gh gi gl
gm gn gov gp gq gt gu gw gy ht int iq jm jo jobs kh ki km kn kp kw ky lb
lc lk lr ls lu ly mc mg mh mil ml mm mo mp mq mr mt museum mv mw mz na nc
nf ng ni nr om pa pf pg pn ps pw qa re rw sb sd sh sl sm sn sr sv sy sz td
tel tf tg tj tm tn travel ug va vg vi vu wf ye yu zm zw

... I also used the feed data to prune two and three level TLDs from the
SURBL list, which is pretty obviously not based on any data:

coop.br coop.tt gov.ae gov.am gov.ar gov.as gov.au gov.az gov.ba gov.bd
gov.bh gov.bs gov.by gov.bz gov.ch gov.cl gov.cm gov.co gov.cx gov.cy
gov.ec gov.ee gov.eg gov.ge gov.gg gov.gr gov.hk gov.hu gov.ie gov.il
gov.im gov.in gov.io gov.ir gov.is gov.it gov.je gov.jp gov.kg gov.kz
gov.la gov.li gov.lt gov.lv gov.ma gov.me gov.mk gov.mn gov.mu gov.my
gov.np gov.ph gov.pk gov.pl gov.pr gov.pt gov.py gov.rs gov.ru gov.sa
gov.sc gov.sg gov.sk gov.st gov.tl gov.to gov.tp gov.tt gov.tv gov.tw
gov.ua gov.uk gov.vc gov.ve gov.vn gov.ws gov.za jobs.tt tel.no tel.tr
act.gov.au nsw.gov.au nt.gov.au pa.gov.pl po.gov.pl qld.gov.au sa.gov.au
so.gov.pl sr.gov.pl starostwo.gov.pl tas.gov.au ug.gov.pl um.gov.pl
upow.gov.pl uw.gov.pl vic.gov.au wa.gov.au

There were also about _1400_ reserved .us TLDs pruned from the list, IIRC

You make some pretty good points.  It may well not be worth the trouble,
at least for one-level TLDs.
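For illustration, the suppression being discussed boils down to a pre-query filter in front of the URIBL lookup.  A rough Python sketch (the suppressed subset shown and the function names are my own assumptions, not URIBL's actual interface):

```python
# Sketch: skip URIBL queries for hosts whose TLD has never produced a hit.
# SUPPRESSED_TLDS is an illustrative subset of the 135 excluded TLDs above.
SUPPRESSED_TLDS = {"mil", "int", "gov", "museum"}

def extract_tld(host):
    """Return the last dot-separated label of a hostname, lowercased."""
    return host.rstrip(".").rsplit(".", 1)[-1].lower()

def should_query(host):
    """Query the URIBL only when the host's TLD is not suppressed."""
    return extract_tld(host) not in SUPPRESSED_TLDS
```

The trade-off described above lives entirely in how `SUPPRESSED_TLDS` is maintained: a stale list silently skips lookups for a TLD that has since become listable.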

Thanks,

Jared Johnson
Software Developer
DoubleCheck Email Manager



Re: per-recipient configuration

2010-07-28 Thread Jared Johnson
 so inside QP, the qpsmtpd::address object would have a known parameter
 that brings up the per-address persistence hash, which would be a flat
 hash. Something like
..

See in my mind, per-recipient config and persistent data storage are more
separate.  Maybe part of the reason I look at it this way is that in my
own implementation, I never really write config for a recipient, I only
read it (from my persistent storage, the db, of course).  I don't see it
necessary to be able to say $rcpt->config( spam_threshold => 10 ) from QP
(I'd do it from the UI).  Stored things are always written from QP
(logging) and sometimes read (auto whitelist).  A hook_user_config plugin
would even be likely to make use of the persistent storage itself, but I
still see the concepts as split, and you could implement and benefit from
either one without the other.  When I went to write a reference plugin for
hook_user_config, I just thought of one where an admin can just drop
config for users into directories on the fs in the event that he wants to
override what's already set on the global level; I think the hook
structure even fell through to $qp->config() so it could really just begin
with an extension of /etc/qpsmtpd and go from there, just like
$qp->config() does.

That said, for persistent storage I would like to see a more
straightforward API and skeleton implementation; something like:  $txn->get,
$txn->store, $rcpt->get, $rcpt->store, and a corresponding hook_get and
hook_store that are passed ( $self, $txn, [ $rcpt ] ) so it knows what to
key on, though it can even ignore it if it wants.  Extra points if QP
provides a
safe-but-reusing-connections-appropriately-depending-on-the-forking-model
DBI handle via $self->qp->dbh, and maybe a $self->qp->cache() for a
'persistent' Cache::FastMmap or Memcached cache, but the plugin could be
required to establish the actual data store for itself.  Then the plugin
goes to town.  It can store everything in a generic way, key => value or
whatever, or it could map the key names to your own business logic.  The
reference persistent storage plugin could still implement the type of
store you're talking about, and that would work out of the box for people
who just want to have greylisting out of the box or etc; but if I'm
reading it correctly, even if it's really awesome it's likely a lot of
developers would just have to scrap it in favor of what they're already
doing, at which point the more flexibility hook_store plugins have the
better.

# in stock upstream plugins/greylist
$rcpt->store( greylist => $self->qp->connection->remote_ip );

# in our internal storage plugin that overrides the generic one
# I don't think anyone would actually want this in particular :)
sub init { shift->{dbh} = $dbi->connect(...) }
sub dbh { shift->{dbh} }
sub hook_store {
    my ( $self, $txn, $rcpt, $conf, $value ) = @_;
    return DECLINED unless $conf eq 'greylist';
    return OK unless $rcpt->notes('user_uid');
    my $sth = $self->dbh->prepare_cached(
        "INSERT INTO bobsgreylist (ip,helo,rcpt) VALUES (?,?,?)" );
    $sth->execute( $value,
                   $self->qp->connection->hello_host,
                   $rcpt->notes('user_uid') );
    return OK;
}

Unlike per-recip config, though, we don't have any API written in-house
to support generic persistent storage; for now we just stick our DBI
calls directly in our plugins, making them even more forked.  So this is
all purely theoretical, and David has the advantage of speaking from some
experience :)

-Jared



Re: per-recipient configuration

2010-07-27 Thread Jared Johnson
 My experience porting the Advenge SMTPD to be a qpsmtpd plug-in
 indicated that the current interface is entirely adequate for
 per-recipient things, as long as the plug-in manages its own
 persistence. I don't think any new hooks are required.

On principle I'd agree that it's possible to do it all in the plugins, but
if we're talking about making per-recip generally supported officially, it
seems best to make that as easy as possible.  I really like having the
user_config and a $rcpt->config() method to go along with the hook_config
and corresponding $self->qp->config().  And the more API support we have,
the less extra code is added to plugins to support per-recipient stuff,
e.g. the less forked they are from the non-per-recip counterparts, IMO.

Could you show some code examples?  What do you do with the queue
plugin(s) if recipients ended up having different headers or bodies due to
tagging etc.?

 Per-recipient
 features can be provided as a layer consisting of coordinating
 handlers for existing hooks, the most that needs to be done in my
 opinion is perhaps to establish a registry of name spaces in the
 existing per-message notes data, to prevent different plugins from
 clobbering e/o's data.

Is this still necessary if Qpsmtpd::Address::notes() exists?

 One of the problems is that SMTP doesn't support per-recipient results
 after DATA.

 Is QPSMTPD used widely enough that we could introduce new ESMTP
 extensions with a chance that they could become widely adopted? I
 believe I have seen a proposed per-recipient results after DATA ESMTP
 specification, but ... okay I'll search for it ... it was in a thread
 started by me, in 2006.

 http://www.ietf.org/mail-archive/web/asrg/current/msg12816.html

This would be flippin awesome.  Although even if QP has more pull than I
think it does, it would take a bazillion years for all of the interwebs to
adopt any protocol change.  Nevertheless the sooner it starts to change,
the better :)

-Jared



Re: per-recipient configuration

2010-07-27 Thread Jared Johnson
I assume you meant to CC the list originally, I've added it :)

 Could you show some code examples?

 not immediately, but sure

 What do you do with the queue
 plugin(s) if recipients ended up having different headers or bodies due
 to
 tagging etc.?

 Advenge only deals with deliver/defer with some fanciness on the defer
 side because the bounce message asks for some grease.

That makes the problem simpler but may be an incomplete solution for some.
 If we wanted to punt on the problem in mainline QP for a while, I suppose
plugins could just only ever modify the global body and avoid or
generalize any modifications that wanted to be different per-recipient. 
But again this is extra burden on the plugins

 Per-recipient
 features can be provided as a layer consisting of coordinating
 handlers for existing hooks, the most that needs to be done in my
 opinion is perhaps to establish a registry of name spaces in the
 existing per-message notes data, to prevent different plugins from
 clobbering e/o's data.

 Is this still necessary if Qpsmtpd::Address::notes() exists?

 Is that a persistent data structure keyed on address?

Well it's persistent data stored in the Qpsmtpd::Address object itself:

sub notes { # in Qpsmtpd::Address.pm
  my ($self,$key) = (shift,shift);
  # Check for any additional arguments passed by the caller -- including undef
  return $self->{_notes}->{$key} unless @_;
  return $self->{_notes}->{$key} = shift;
}

I submitted that patch a while ago and I think it was accepted.
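The same get-or-set accessor idea, where passing an explicit undef still counts as a write, can be sketched outside Perl with a sentinel value (illustrative Python, not Qpsmtpd code; class and method names are mine):

```python
_UNSET = object()  # sentinel, so an explicit None still counts as a write

class Address:
    """Minimal sketch of the notes() get-or-set accessor pattern."""
    def __init__(self):
        self._notes = {}

    def notes(self, key, value=_UNSET):
        if value is _UNSET:           # no second argument given: read
            return self._notes.get(key)
        self._notes[key] = value      # second argument given (even None): write
        return value
```

The sentinel plays the role of Perl's `unless @_` check: it distinguishes "no argument passed" from "undef passed", which a plain `value=None` default could not.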

 What's efolder.net?

My new employer (it used to be @nmgi.com).  They bought DoubleCheck and
all its IP, infrastructure, and employees from Network Management Group. 
Their own product is a cloud backup solution; at some point we'll probably
be working on email archiving integrated with that.  It was really a nice
deal; since the switch I get to work at home and I have a new awesome boss
:)

-Jared



Re: Rewritten URIBL plugin

2010-07-27 Thread Jared Johnson
 I think we should probably consider putting support for parsed messages
 into core, with the parsing done lazily if requested by the API.

It's a good idea, it's only that I haven't yet had the time and
justification to write and to test a move from the plugin model (see the
attached plugin in my newer post) to an internal API model.  The plugin I
attached represents a trivially-edited version of a plugin we've been
testing in production for quite some time.  It probably wouldn't be *that*
difficult or risky, though, to take the very same code and put it into the
lazy API of your choice.  And I think it'd be great, even though we
wouldn't actually benefit from the laziness ourselves ;)

If you don't hate storing the resulting parsed mime in
$txn->notes('mime_body'), though, that would rock, since then it wouldn't
break our existing plugin-based forked code ;)

-Jared



Re: Rewritten URIBL plugin

2010-07-27 Thread Jared Johnson
 I think we should probably consider putting support for parsed messages
 into core, with the parsing done lazily if requested by the API.

I forgot, we did kinda think of a couple of reasons not to want an API. 
Depending on where you put it, you may find QP in general depending on
MIME::Parser even if it never uses it.  The benefit of the plugin is that
/etc/qpsmtpd/plugins controls whether you wind up having to install, and
take up your memory with, MIME::Parser

You could conditionally 'require', although that does different things
to memory management that I don't fully understand.

Another thought, there are different reasons people might want mime to be
parsed that benefit from different implementations.  If you only want the
text data, maybe you want to optimize for that somehow and throw away the
binary attachment data, or vice versa (unfortunately there doesn't seem to
be a way to do this with MIME::Parser that i've found though...).  If you
only want the text data *and* you don't mind having it not perfectly
reconstructed (like the URIBL plugin), then perhaps you could come up with
a way to parse without MIME::Parser that is more efficient and more lazy
(not in the sense you were suggesting earlier :P)

-Jared



Re: Rewritten URIBL plugin

2010-07-26 Thread Jared Johnson
 - Introduces support for URIBL services that may not have worked right,
 at
 least out of the box, before.  Defines the subtle differences between
 various known URIBL services in order to maximize compatibility.

 Is it worth pulling some of this config out of the code and putting it
 into some sort of config file?

Probably would be worth it, and not that difficult to do.  Any new config
scheme should probably be both backward compatible and non-spaghetti.  The
current scheme looks like:

multi.uribl.com 2,4 deny

the new plugin does add one additional feature, you can now do:

multi.uribl.com 2 deny
multi.uribl.com 4 log

but $zones defines a bunch of other things - labels for the zone in
general and specific masks, check_ip, check_host,
host_wildcards_supported, lookup_a_record, lookup_ns_record.  Ignoring the
label issue, all the other directives are related only to the zone;
perhaps we could add a new scheme on top of the mask definition scheme
(which we would continue to read in the old way), getting even further
away from the one-line-per-zone style:

multi.uribl.com action deny
multi.uribl.com check_ip 1
multi.uribl.com check_host 1
multi.uribl.com lookup_a_record 0
multi.uribl.com lookup_ns_record 0
multi.uribl.com host_wildcards_support 0

If nobody hates this scheme it does seem like something I could add pretty
easily and send an updated copy
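For what it's worth, parsing such a scheme is only a few lines.  A rough Python sketch (how the legacy "zone masks action" lines merge with the new per-directive lines is my assumption about back-compat, not settled behavior):

```python
# Sketch: parse "zone directive value" config lines into per-zone dicts,
# treating anything that is not a known directive as a legacy mask line.
KNOWN_DIRECTIVES = {"action", "check_ip", "check_host",
                    "lookup_a_record", "lookup_ns_record",
                    "host_wildcards_support"}

def parse_uribl_config(lines):
    zones = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        zone, directive, value = line.split(None, 2)
        cfg = zones.setdefault(zone, {"masks": {}})
        if directive in KNOWN_DIRECTIVES:
            cfg[directive] = value
        else:
            # legacy form, e.g. "multi.uribl.com 2,4 deny": masks -> action
            for mask in directive.split(","):
                cfg["masks"][int(mask)] = value
    return zones
```

Since unknown directives fall through to the mask branch, old one-line-per-zone configs keep working unchanged alongside the new style.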

 - Uses Net::DNS::Async to simplify the code, and also to ensure the
 afore-mentioned A and NS lookups will prompt new URIBL lookups in an
 efficient and simple manner via callbacks

 Does the code still work with the async qpsmtpd cores?

Well... I didn't change the existing plugin a whole lot; I did change from
data_handler() registered via register(), to plain old hook_data_post();
this may need to be changed, considering that the comments above each sub (which
I also removed) suggested that hook_data_post() and register() were 'not
used' for async.  These things could easily be changed if needed.  However
we don't run async anywhere, not even in test environments, so I'm not
very qualified to know whether they need changing, or to give any guarantees.  And
I'm confused by that bit.  From what I can tell, the async/uribl plugin
ignores plugins/uribl entirely and uses
Qpsmtpd::Plugins::Async::DNSBLBase, which in turn uses ParaDNS.

Now I noticed the async rhsbl plugin used Q::P::A::D and that this used
ParaDNS before and looked at it briefly to see if I could use it for my
purposes, but at the top of the docs it's mentioned that only A lookups
are supported, so I skipped it and went to Net::DNS::Async.  I don't
really know the strongpoints of N::D::A vs. ParaDNS but it did note that
the latter more expressly claims to be interested in performance and
scalability.  So in conclusion:

(1) I think I'll look into using ParaDNS and maybe even
Q::P::Async::DNSBLBase if appropriate, even though my plugin is not in
async; it might still have advantages, and the interesting logic is in the
callbacks and not the module use anyway
(2) I don't know if it matters whether this particular plugin works with
async, I guess it depends on whether the old one really ever did.  At
least Net::DNS::Async should not make it any more blocking than the old
plugin's collect_results().  At any rate somebody who plays in async
territory should probably comment before any action is taken on the new
plugin :)

 Attached also is tld_lists.pl, a companion file that needs to be dropped
 in lib/Qpsmtpd/ which provides the list of first, second, and third
 level
 TLDs that we care about.  It's derived from our URIBL datafeed as well
 as

 Do the owners of that data care about it being used this way?  You may
 be violating any agreement with them.  Would they be ok if this was
  released as an independent CPAN module?  Either way, can we
 structure this as an API instead of just an include file?

(1) I suspect that anyone who runs a URIBL service would be thrilled with
this information being available for this particular use, because it
minimizes unnecessary queries to their services.  Consider the old method
that sent 2 or 3 queries instead of one because it didn't know which one
was right.  The fact that SURBL and URIBL both publish the two and three
level lists and encourage their use is evidence of this :)

(2) First-level TLDs are public information, of course, and with the
two-level and three-level lists being public, the interesting business is
that the first-level list and the SURBL second and third level lists are
pruned based on URIBL's data.  So it comes down to whether anyone can be
considered to have a claim on the privilege of knowing how to *exclude*
results.  I guess maybe they could if it came down to it...

(3) I'm pretty sure I could get URIBL's explicit permission, but it would
take a long time.  The founder and sole owner of URIBL was also the
original author of DoubleCheck, so I feel we would have a bit of an in :) 
However even so he always takes forever to get back to us in

Re: Rewritten URIBL plugin

2010-07-26 Thread Jared Johnson
 Do the owners of that data care about it being used this way?  You may
 be violating any agreement with them.  Would they be ok if this was
  released as an independent CPAN module?  Either way, can we
 structure this as an API instead of just an include file?

Forgot to mention, yes, I'd be happy to add the few lines it would take to
make this an API, even if it's just in lib/Qpsmtpd.  It might be nice for
us to get datafeeds with other services (we're looking to at least get one
from DBL, possibly SURBL and imvURI) before publishing to CPAN, mainly so
that we know for sure that we have not over-pruned the first level or
SURBL second & third level lists.  As they are though, I consider the
lists good enough to warrant production use of this method over the
current method in the existing QP uribl plugin.  After all any TLD that
manages to make it onto other services but has garnered *zero* hits on
URIBL, probably has very very few entries on those other services. 
Anyway, if it's a bit of a licensing/copyright problem to use the data
from URIBL, it'll be more of one to use the data from other services as
well, but we can probably work that question in from the get-go

-Jared



Re: Rewritten URIBL plugin

2010-07-26 Thread Jared Johnson
 On Sun, Jul 25, 2010 at 04:43:36AM -0500, Jared Johnson wrote:
 The plugin has the following advantages over the original:

 Based on a single read-through of the code, I like most of it.  Some
 assorted observations, though:

 - Parsing out the MIME sections is worthwhile, though it's really
 something
   that ought to be done in the core libs when plugins indicate they need
 it,
   or in a utility plugin if that's practical.  This was something I meant
   to bring up a while back -- we have quite a few different
 content-scanning
   plugins, and if they're all obliged to invoke MIME::Parser repeatedly
 it'll
   be even more costly than that module already is.  Not that I have time
 to
   do it myself, though.  :P

Our internal plugin actually just uses $txn->notes('mime_body') instead of
creating a new $msg, because we do indeed have such a utility plugin.  A
bit short of being transparently handled by internals, but I could at
least include code in the plugin to use $txn->notes('mime_body') instead
of parsing when that exists.  This would also allow our internal plugin to
be slightly less forked from what winds up in upstream :)  Your idea of
doing this internally sounds cool to me, but we're unlikely to get around
to implementing it ourselves any time soon given the system that already
exists is working OK.  And unfortunately our utility plugin is really just
one aspect of a bigger plugin that wouldn't really be useful to QP, and we
have no method of 'asking' for MIME parsing from said plugin since for our
purposes it's *always* necessary to parse mime.

 - For messages below a few hundred kb or so, consider
   MIME::Parser->output_to_core(), so we can avoid the fs churn.
 Annoyingly,
   MIME::Parser doesn't provide a mechanism to only consider particular
   mimetypes, or we could skip decoding non-text bodies.

The MIME::Parser docs are pretty insistent that using output_to_core is
detrimental to performance in general, and should only be bothered with
when there are actual disk constraints.  I've never tested this theory for
really small messages, but it's kept me away from trying it :)

If it's not very difficult, perhaps it could be optional via a plugin
argument, and later via the handy dandy general
mime-parsing-needed-feature if it's ever written

 - It looks like you're scanning inside all parts of multipart MIME, where
   doing so only on text parts might be preferable.

No, in scan_body() we do:

return unless $part-effective_type ~~ [qw( text/plain text/html )];

I think that's what we want anyway...

 - The config really needs to live in a config file -- changing or
 expanding
   the format is fine if needed, but config that lives in code can't really
   be edited.  I'm not bothered by having per-list defaults in the plugin,
   but one should be able to fully add a new list or alter an existing one
   without changing Perl code.

I agree.  If you don't object to the spec I proposed in my reply to
Robert, I can get that changed soon.

 - In the unwinding of HTML entity escaping -- I suggest doing &amp; last.
   &amp;gt; is &gt;, not >.

Good idea, thanks :)
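The ordering pitfall is easy to demonstrate with a naive decoder (Python sketch; the entity table is abbreviated for illustration):

```python
# Decoding &amp; before the other entities corrupts double-escaped input:
# "&amp;gt;" means the literal text "&gt;", not ">".
def decode_wrong(s):
    for ent, ch in (("&amp;", "&"), ("&gt;", ">"), ("&lt;", "<")):
        s = s.replace(ent, ch)
    return s

def decode_right(s):
    for ent, ch in (("&gt;", ">"), ("&lt;", "<"), ("&amp;", "&")):
        s = s.replace(ent, ch)
    return s
```

With `&amp;` decoded first, the intermediate `&gt;` gets decoded a second time; with `&amp;` last, each entity is decoded exactly once.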

 the SURBL two-level and three-level lists -- but the SURBL lists are
 pruned to exclude stuff that we're pretty sure is never going to
 actually
 get listed -- when was the last time you saw a spammer use .mil?
 uribl.com never has, according to our datafeed :)

 We'll probably see compromised .gov and .mil sites hosting malware, much
 as we
 see in other domains.  They're not that useful for classical spammer
 purposes
 since they can't be registered in the usual fashion.

Ideally, we should only bother querying services for URIs that have some
chance of being listed on those services, regardless of their chance of
actually being used in spam.  Thus, if the datafeeds don't have any hosts
with .mil, there's no point checking them.  However, (1) we're only
deriving our list from URIBL right now, and even as we expand we can never
say that we derived our list from every service that someone might use;
(2) even if I ran my tld_list generating script every day against every
service and sent in an updated list as soon as something got listed on
.mil, there would still be the lag time to get a release out with the
updated list and then have people upgrade, during which time there would
be some misses.

I think that using 'pruned' lists is eventually going to cause some misses
somewhere.  For our own use, since we have control of update timelines,
it's worth the risk, and we also don't have to worry about legal issues
with using the data.  If either of these is a big enough problem for QP in
general, though, you could work around it by using an un-pruned
tld_lists.pl.  This would give %firstlevel_tlds 263 entries instead of
128; %surbl_secondlevel_tlds would have 3,801 instead of 2,426; and
%surbl_thirdlevel_tlds would have 155 instead of 135.  If you're worried
about the legal issue but *not* the timeliness issue, you could still
exclude a few

Re: Rewritten URIBL plugin

2010-07-26 Thread Jared Johnson
 And
 I'm confused by that bit.  From what I can tell, the async/uribl plugin
 ignores plugins/uribl entirely and uses
 Qpsmtpd::Plugins::Async::DNSBLBase, which in turn uses ParaDNS.

 In that case, it's possible (and likely) that it may never have
 worked with async.  When time is short, it's easy to just ask questions :)

OK, I looked closer and I guess the async plugin actually 'inherits' from
the non-async one, so it uses its methods.  That's a new topic for me but
I think I can come up with the necessary changes to the non-async plugin
and maybe a patch to the async one to use it correctly.  But I won't be
able to test the theory :)

-Jared



Re: per-recipient configuration

2010-07-26 Thread Jared Johnson
 Having a common API for per-recipient things have long been on the todo
 list.

 We've talked about it a couple times before; but the
 requirements/needs/wishes never sunk in deep enough in my head to do
 anything about it.  So:  How do you use this?

 My thoughts all along have been to just have some way to make a find user
 data plugin that'll fetch a per-recipient blob of information that other
 plugins can, uh, plug into.

 The big stumbling block is what the common infrastructure for dealing with
 diverting needs of the recipients should be.   What would we need in terms
 of per-recipient header rewriting, post-DATA processing etc?

It is kind of a big scary can of worms, in large part because, AFAICT, a
number of shops including my own have grown their own solutions, often
leaving things just as forked as we are.  It seems that everybody's doing
it in different, incompatible ways which each shop is fully pleased with
and would find it difficult to move away from, and really each method
seems valid enough.  It may be workable to try to support multiple
different treatments, or it may be necessary to choose one and go with it.
 The latter would mean that some existing users might just remain forked,
but if you made the right choices then probably new users would never
bother to move away.  There was a time when this was more at the forefront
of my mind and I was interested enough to 'put together a proposal', I
sent some ideas to the list at the time but now I'm not so sure we even
follow those ideas in our own fork; we've moved beyond, and yet what we
have now is still a work in progress.  So to 'do it right the first time'
some new thinking would have to take place.  But FWIW, here's some about
what we do now:

- Qpsmtpd::Address has a config() method which passes through to plugins
that call 'hook_user_config'.  Note that I used to call this
'hook_rcpt_config' but really it's quite valid to call
$txn->sender->config(...) when you've determined that this is an outbound
message and you want to know something about the sender's preferences
instead.  The way I implemented this is a bit of a mess.  At any rate I
think I submitted a patch to do this sort of thing, but I might not have
managed to stick with the review process long enough to answer everyone's
concerns/questions.  Or maybe it was accepted!  I can't really remember ;)

- Qpsmtpd::DSN is heavily modified so that it can be used to store
information in Qpsmtpd::Address (which itself has methods to manipulate
such things) about what we have decided we want to do with each recipient
and why we have chosen to do it; e.g. ( Rejected => 'Spam' ).

- Qpsmtpd::Transaction has been enhanced so that recipients() returns only
the recipients that we have (thus far) decided we're going to be
delivering to, and all_recipients() and rejected_recipients() have been
added to give information about recipients that we have decided not to
accept.

Keeping these lists around is useful for logging but may also be necessary
depending on how one decides to deal with the issue of "we've reached the
end of DATA and want to reject this recipient and accept this one based on
their different preferences, but we can only give one response."
Obviously we respond with '250' in this case; as for the 'rejected'
recipients, some choose to drop them silently; some have a 'quarantine'
method in place and choose to quarantine for these recipients; some choose
to bounce (this last method is the most reviled, but what can I say?  I
don't want to do it, but my lead developer does.  And the situation is
pretty rare anyhow).

The content of these methods in our forked code might shed a little light
on how they work with the DSN data stored in each recipient:


sub recipients {
  my $self = shift;
  @_ and $self->{_recipients} = [@_];
  return () unless $self->{_recipients};
  return grep { ! $_->dsn or $_->dsn->action ~~ [qw( Accepted Delivered Queued Quarantined )] }
       @{$self->{_recipients}};
}

sub rejected_recipients {
  my $self = shift;
  @_ and $self->{_rejected_recipients} = [@_];
  return () unless $self->{_recipients};
  return grep { $_->dsn and ! $_->dsn->action ~~ [qw( Delivered Queued Quarantined )] }
       @{$self->{_recipients}};
}

sub all_recipients {
  my $self = shift;
  return () unless $self->{_recipients};
  return @{$self->{_recipients}};
}

- All of our per-recip-pref-aware post-data scanning plugins loop through
each to-be-accepted recipient and determine what we want to do for each
recipient.  Then a single plugin afterward handles the results and
responding to the client.  So basically, an empty $txn->recipients()
becomes a short-circuit for whatever post-data plugins are left in the
mix.  We actually don't do this with the DSN objects; we use a separate
'class' note to denote that something is 'spam', 'clean', 'whitelisted',
etc.  Then the last plugin sets the DSNs for each recipient.  Maybe we
should have figured out a way to do that, I dunno.

- Each recipient object can 

Re: per-recipient configuration

2010-07-26 Thread Jared Johnson
 sub rejected_recipients {
   my $self = shift;
   @_ and $self->{_rejected_recipients} = [@_];

I noticed after posting this that the line above is a no-op.  Ha :)

   return () unless $self->{_recipients};
   return grep { $_->dsn and ! $_->dsn->action ~~ [qw(Delivered Queued
 Quarantined)] }
 @{$self->{_recipients}}
 }

-Jared



Re: Rewritten URIBL plugin

2010-07-26 Thread Jared Johnson
 Attached also is tld_lists.pl, a companion file that needs to be dropped
 in lib/Qpsmtpd/ which provides the list of first, second, and third
 level
 TLDs that we care about.  It's derived from our URIBL datafeed as well
 as

 Would they be ok if this was
  released as an independent CPAN module?  Either way, can we
 structure this as an API instead of just an include file?

I've already pretty much done the work to access tld_lists as an API, but
I'm now doubting the use unless it is released to CPAN:  it really makes
the most sense to have Qpsmtpd::TLDLists.pm itself still pull in the
contents of tld_lists.pl as I posted them, in a 'require', because those
contents are created by an automated script.  It seems silly to require
the modification of said automated script if we want it to generate
different logic.  But when Qpsmtpd::TLDLists requires tld_lists.pl, then
comes the question: 'if Qpsmtpd::TLDLists is the only thing that accesses
this data, and plugins/uribl is the only thing that uses
Qpsmtpd::TLDLists, then why doesn't plugins/uribl just read the data
itself?'

You guys can have it whichever way you want it but I say, don't be afraid
to treat a .pl file as if it were basically a static, shipped config file
in a format easily read by perl plugins :)

-Jared



Re: URIBL plugin and 'semicolon insertion munging'

2010-07-24 Thread Jared Johnson
 This seems to be a big improvement at least on the 3 million lines of
 random traffic i tested with, and it's a smaller patch:
[snip]

Well, it may have been an improvement over my own data, but a colleague
pointed out the following case:

check out spamsite.com;it's awesome!

And this didn't deal with Devin's example of http://spamsite.com;.net either

At this point it's a bit beyond testing on the original plugin for me, but
here's what I came up with given my own modified hostname re loop (the
minimal entity decoding + stripping is still needed)

while ( m{ ( (?: [a-zA-Z0-9:./-]+ @ )?
 [a-zA-Z0-9][a-zA-Z0-9.;-]+\.$tld )
   (?! \.?\w ) }gxo ) {
my $host = lc $1;
# Deal with inserted-semicolon munging, e.g. 'http://foo;.com'
if ( my @split = $host =~ /(.*?);(.*)/ ) {
my @h = split /\./, $split[0];
if ( $h[-1] =~ /^$tld$/ ) {
# 'foo.com;.net', 'foo.com;.net;.foo', 'foo.com;it's great'
$host = $split[0];
} else {
$split[1] =~ /([^.;]+)/;
$split[1] = $1 if $1; # 'foo;.com;.net', 'foo;.com'
$host = $1 ? $split[0].$1 : $split[0];
}
}

This works at least for the test cases i've come up with, and doesn't
otherwise change processing from what _I_ had before for the other few
million lines i'm checking against, but is not yet in production and the
change alone is not tested against the vanilla code.  If anyone's
interested, the loop continues, dealing with excluding email addresses
while including user:p...@site:


if ( $host =~ s/.*\.\.// ) {
next unless $host =~ /\./;
}
if ( $host =~ s/^\w{3,16}:\/+// ) {
# 'http://realsite.com/cgi/em...@email.host.com'
# 'http://user:p...@realsite.com'
$host =~ s/\/.*//;
$host =~ s/.*@//;
} elsif ( $host =~ /(.*):/ ) {
# 'user:p...@realsite.com'
# 'mailto:em...@emailhost.com'
next if $1 =~ /mailto/;
$host =~ s/.*@//;
} else {
# 'realsite.com/cgi/em...@email.host.com'
$host =~ s/\/.*//;
next if $host =~ /@/;
}

This is the main reason for the different RE above.  Also it obsoletes the
need for two RE loops.
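To make the fragments above concrete, here is a self-contained paraphrase of the first loop (my assembly, not the production code: the $tld alternation is shortened to three TLDs, and the else branch for the 'foo;.com' style is omitted):

```perl
use strict;
use warnings;

# Shortened stand-in for the real ~128-entry TLD alternation
my $tld = qr/(?:com|net|org)/;

sub find_hosts {
    my ($line) = @_;
    my @hosts;
    while ( $line =~ m{ ( (?: [a-zA-Z0-9:./-]+ @ )?
                          [a-zA-Z0-9][a-zA-Z0-9.;-]+\.$tld )
                        (?! \.?\w ) }gx ) {
        my $host = lc $1;
        # Deal with inserted-semicolon munging, e.g. 'foo.com;.net'
        if ( my @split = $host =~ /(.*?);(.*)/ ) {
            my @h = split /\./, $split[0];
            # If the part before the ';' already ends in a TLD, keep it
            $host = $split[0] if $h[-1] =~ /^$tld$/;
        }
        push @hosts, $host;
    }
    return @hosts;
}

print join( ',', find_hosts(q{check out spamsite.com;it's awesome!}) ), "\n";
# spamsite.com
print join( ',', find_hosts(q{see foo.com;.net here}) ), "\n";
# foo.com
```

The semicolon is allowed into the hostname character class and only split out afterwards, which is what lets both the trailing-punctuation case and the 'foo.com;.net' case resolve to the real domain.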

-Jared



Re: URIBL plugin and 'semicolon insertion munging'

2010-07-24 Thread Jared Johnson
 while ( m{ ( (?: [a-zA-Z0-9:./-]+ @ )?
  [a-zA-Z0-9][a-zA-Z0-9.;-]+\.$tld )
(?! \.?\w ) }gxo ) {
 my $host = lc $1;
 # Deal with inserted-semicolon munging, e.g. 'http://foo;.com'
 if ( my @split = $host =~ /(.*?);(.*)/ ) {
 my @h = split /\./, $split[0];
 if ( $h[-1] =~ /^$tld$/ ) {

and I forgot to mention how $tld is defined.

my %firstlevel_tlds = map { $_ => undef } # 128 entries
qw( ae aero am ar as at au az ba bd be bg bh biz br bs by bz ca cat
cc ch cl cm cn co com cx cy cz de dk ec ee eg es eu fi fr gd ge gg
gr gs hk hm hn hr hu id ie il im in info io ir is it je jp ke kg kr
kz la li lt lv ma md me mk mn mobi ms mu mx my name ne net nl no np
nu nz org pe ph pk pl pr pro pt py ro rs ru sa sc se sg si sk st su
tc th tk tl to tp tr tt tv tw tz ua uk us uy uz vc ve vn ws za );

my $tld = my $tld_lc = join '|', keys %firstlevel_tlds;
$tld =~ s/(\w)/[$1\u$1]/g;
$tld = "(?:$tld)";

the list is derived from our URIBL datafeed, e.g. we're only interested in
tld's that actually hit some lists in the first place, therefore skipping
things like .mil which spammers don't seem to have the guts to use.  This
admittedly makes it not a *perfect* solution for munging, but close enough
IMO.  We don't (yet) have DBL or SURBL datafeeds, unfortunately, but the
SURBL two-level and three-level lists are derived from the interwebs and
not from spam traffic AFAICT, and DBL doesn't have a list because they
support wildcards.  If anyone has an SURBL and/or DBL datafeed and is
interested, I could provide a script to generate a list from those to see
if anything should be added.

The substitution business is to make the RE into a big fat ugly mess with
things like '[cC][oO][mM]', so that we can take out the /i, which makes
things about twice as fast, but it's not strictly necessary :)
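As a quick illustration of that substitution (three invented TLDs stand in for the real list):

```perl
use strict;
use warnings;

# Expand each letter into a [cC]-style character class so the /i
# modifier can be dropped from the hot regex.
my @tlds = qw( com net org );
my $tld  = join '|', @tlds;
$tld =~ s/(\w)/[$1\u$1]/g;    # 'com' -> '[cC][oO][mM]'
$tld = "(?:$tld)";

print "$tld\n";
# (?:[cC][oO][mM]|[nN][eE][tT]|[oO][rR][gG])
print "hit\n" if 'SpamSite.COM' =~ /\.$tld$/;    # matches without /i
```

The `\u` in the replacement upper-cases the captured letter, so each letter becomes a tiny two-character class and the pattern stays case-insensitive in effect while the regex engine runs it case-sensitively.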

-Jared



Re: URIBL plugin and 'semicolon insertion munging'

2010-07-23 Thread Jared Johnson
Ah, I tried it in my own FF and the trick still works :)  It seems like
all you have to do to get around the &nbsp; etc. problem is to wait a
little longer before applying the fixup -- allow the semicolon to match in
the hostname search and then strip it out.  Attached a patch, untested
(though the same change is tested on my now-greatly-forked uribl
plugin)... I'm sad to say I haven't been on github for a while so it may
be an age before i get around to submitting a patch according to the
standards, but if anybody else is interested in picking it up here it is
:)

Also, was it intended that the URI finding sub should find RHS hostnames
in email addresses as well?  I'm pretty sure it does... although again,
the testing I'm doing is no longer on the official code.  It seems
undesirable to me, I'm currently working on avoiding such hits while still
getting things like http://citibank.com:f...@phishingsite.com

I got approval from my boss yesterday to submit my updates to the plugin
to the ML, btw, which I'll do when it's a bit closer to finished, probably
the beginning of next week.  It's a huge change though, and notably adds
dependencies on MIME::Parser and Net::DNS::Async and completely changes
the config file format.  I don't really have approval to spend hours
creating digestible individual patches... but posting the whole plugin is
better than not posting any code, right? :)

-Jared

 On Wed, Jul 21, 2010 at 07:54:30AM -0500, Jared Johnson wrote:
 Unlike the other bits of "dodge this sort of munging" operations,
 examining my test results and asking uncle google has not made it clear
 to
 me what "inserted-semicolon munging" really is.  Can anyone shed light
 on

 My memory of this is fuzzy, but my SVN log indicates it was an attempt to
 deal
 with a style of munging that exploited different browser behavior on
 encountering semicolons in the hostname component of URLs.  At the time,
 the
 form I was considering was "http://domain;.com/".  Under firefox, the
 semicolon is taken to mean the end of the hostname and the start of the
 path,
 thus triggering the implied-.com behavior and ending up at
 "http://domain.com/;.com/".  I forget what IE does/did, but given the
 IE/FF
 market share balance in 2005 when I added that feature, it was probably
 similar.

 In fact the problem is a lot more complex, because the five major browsers
 out
 there all deal differently with deviations from the norm in URLs, and
 spammers
 exploit those deviations to mislead parsers.  For the case I had in mind,
 stripping semicolons out would be the right thing, but will be misled by
 the
 munge pattern "http://domain.com;.com/foo".  The uribl plugin deals with a
 lot
 of the munging tricks that were common back when it was written, but it's
 probably not comprehensive today and it's definitely susceptible to picking
 up
 bogus hostnames (e.g. the &nbsp;.net behavior you note) based on what's
 left
 after the known munging tricks are unwound.

 http://code.google.com/p/browsersec/wiki/Part1#Uniform_Resource_Locators
 has a
 summary of some of the deviations, although it doesn't address this
 specific
 one.

 --
 Devin  \ aqua(at)devin.com, IRC:Requiem; http://www.devin.com
 Carraway \ 1024D/E9ABFCD2: 13E7 199E DD1E 65F0 8905 2E43 5395 CA0D E9AB
 FCD2

--- uribl.orig	2010-07-23 17:06:10.894320796 -0500
+++ uribl	2010-07-23 17:08:34.314345909 -0500
@@ -290,8 +290,6 @@
 $l =~ s/[=%]([0-9A-Fa-f]{2,2})/chr(hex($1))/ge;
 # Undo HTML entity munging (e.g. in parameterized redirects)
 $l =~ s/&#(\d{2,3});?/chr($1)/ge;
-# Dodge inserted-semicolon munging
-$l =~ tr/;//d;
 
 while ($l =~ m{
 \w{3,16}:/+# protocol
@@ -339,7 +337,7 @@
 }
 while ($l =~ m{
 ((?:www\.)? # www?
- [a-zA-Z0-9][a-zA-Z0-9\-.]+\.   # hostname
+ [a-zA-Z0-9][a-zA-Z0-9;\-.]+\.   # hostname
  (?:aero|arpa|asia|biz|cat|com|coop|# tld
 edu|gov|info|int|jobs|mil|mobi|
 museum|name|net|org|pro|tel|travel|
@@ -347,6 +345,8 @@
 )(?!\w)
 }gix) {
 my $host = lc $1;
+# Dodge inserted-semicolon munging
+$host =~ tr/;//d;
 my @host_domains = split /\./, $host;
 $self->log(LOGDEBUG, "uribl: matched 'www.' hostname $host");
 
@@ -372,7 +372,7 @@
 \w{3,16}:/+ # protocol
 (?:\S+@)?   # user/pass
 (
-	 [a-zA-Z0-9][a-zA-Z0-9\-.]+\.   # hostname
+	 [a-zA-Z0-9][a-zA-Z0-9;\-.]+\.   # hostname
 	 (?:aero|arpa|asia|biz|cat|com|coop|# tld
 edu|gov|info|int|jobs|mil|mobi|
 museum|name|net|org|pro|tel|travel|
@@ -380,6 +380,8 @@
 )
 }gix) {
 my $host = lc $1;
+# Dodge inserted-semicolon munging
+$host =~ tr/;//d;

Re: URIBL plugin and 'semicolon insertion munging'

2010-07-23 Thread Jared Johnson
 It seems like
 all you have to do to get around the &nbsp; etc. problem is to wait a
 little longer before applying the fixup -- allow the semicolon to match in
 the hostname search and then strip it out.

My bad.. I guess the plugin currently only fixes up '&#\d\d\d' encoding,
not &nbsp; etc.  maybe i'll work on that...

-Jared



Re: URIBL plugin and 'semicolon insertion munging'

2010-07-23 Thread Jared Johnson
 It seems like
 all you have to do to get around the &nbsp; etc. problem is to wait a
 little longer before applying the fixup -- allow the semicolon to match
 in
 the hostname search and then strip it out.

 My bad.. I guess the plugin currently only fixes up '&#\d\d\d' encoding,
 not &nbsp; etc.  maybe i'll work on that...

This seems to be a big improvement at least on the 3 million lines of
random traffic i tested with, and it's a smaller patch:

--- uribl.orig  2010-07-23 17:06:10.894320796 -0500
+++ uribl   2010-07-23 19:57:39.304321519 -0500
@@ -289,7 +289,13 @@
 # Undo URI escape munging
 $l =~ s/[=%]([0-9A-Fa-f]{2,2})/chr(hex($1))/ge;
 # Undo HTML entity munging (e.g. in parameterized redirects)
-$l =~ s/&#(\d{2,3});?/chr($1)/ge;
+$l =~ s/&#(\d{2,4});?/chr($1)/ge;
+# Un-encode a few common important named entities and discard the rest
+$l =~ s/&nbsp;/ /go;
+$l =~ s/&amp;/&/go;
+$l =~ s/&gt;/>/go;
+$l =~ s/&lt;/</go;
+$l =~ s/&\w{2,6};//go;
 # Dodge inserted-semicolon munging
 $l =~ tr/;//d;
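Run in sequence on a made-up sample line (the input text here is invented), the decode steps behave like this:

```perl
use strict;
use warnings;

my $l = 'Buy now&#33; &nbsp;&nbsp;www.example.com&nbsp; &lt;great&gt; deals';
$l =~ s/[=%]([0-9A-Fa-f]{2,2})/chr(hex($1))/ge;   # URI escape munging
$l =~ s/&#(\d{2,4});?/chr($1)/ge;                 # numeric entities
$l =~ s/&nbsp;/ /go;                              # important named entities
$l =~ s/&amp;/&/go;
$l =~ s/&gt;/>/go;
$l =~ s/&lt;/</go;
$l =~ s/&\w{2,6};//go;                            # discard the rest
$l =~ tr/;//d;                                    # inserted-semicolon munging
print "$l\n";
# gives: Buy now!   www.example.com  <great> deals
```

Because every entity's own semicolon is consumed during decoding, the final tr/;//d only ever removes semicolons that were inserted as munging, which is what keeps `&nbsp;` from gluing itself onto the adjacent hostname.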





URIBL plugin and 'semicolon insertion munging'

2010-07-21 Thread Jared Johnson
Hi,

I'm working with Devon Carraway's URIBL plugin and have been testing its
effectiveness in finding URI's using 6 million lines or so of email
traffic from a day in the life of our mail servers.  My testing has shown
that the following line in the while ( $l = $transaction-body_getline )
loop within lookup_start is problematic:

# Dodge inserted-semicolon munging
$l =~ tr/;//d;

Unlike the other bits of "dodge this sort of munging" operations,
examining my test results and asking uncle google has not made it clear to
me what "inserted-semicolon munging" really is.  Can anyone shed light on
how semicolons could be used to obfuscate URIs so the URIBL plugin can't
detect them?  If I have an understanding of this, perhaps I can come up
with a safer alternative.  I'll paste some of the output of my test script
to demonstrate the effect of tr/;//d.  The 'Original result' is what we
find if we're using tr/;//d, the 'New result' is what we find without it.

<TD>&nbsp;<B>Required:</B><BR><BR><FONT size=2
face=Tahoma>&nbsp;.NET</FONT><BR><FONT size=2
face=Tahoma>&nbsp;Blah</FONT><BR><FONT size=2
face=Tahoma>&nbsp;Blah</FONT><BR></TD>
Results differ!
Original result: nbsp.net
New result:  no match

Wichita, KS 67204, USA &nbsp;&nbsp;www.somesite.com&nbsp;&nbsp; =
Results differ!
Original result: nbspwww.somesite.com
New result:  www.somesite.com

... you get the picture.

-Jared






[PATCH (maybe)] Don't reset_transaction between queueing and responding

2009-09-26 Thread Jared Johnson
So, our fork has a few logging and bayes training mechanisms that run in 
hook_reset_transaction.  They are all very fast, most of the time; but 
recently one of them broke and started running very slowly.  The result: 
 QP would inject an incoming messages into the queue, enter 
hook_reset_transaction, and 20 minutes later try to respond '250 
Queued!'.  Of course, by then clients had given up and assumed their 
messages were not delivered, resulting in duplicate messages and bounces 
and such.


Here's the patch that I applied to guard against this.  On principle, we 
should do as little as possible between queueing a message and telling 
the client we queued; if a plugin really needs to do something at this 
juncture, it should probably be using hook_queue_post.  Am I right?  Or 
is this going to break something and bite me later :)


I haven't pushed this to my git yet, but I will if nobody points out any 
fatal flaws or other reasons to change the patch first


-Jared
diff --git a/lib/Qpsmtpd/SMTP.pm b/lib/Qpsmtpd/SMTP.pm
index 258282c..7d75d4e 100644
--- a/lib/Qpsmtpd/SMTP.pm
+++ b/lib/Qpsmtpd/SMTP.pm
@@ -818,10 +818,9 @@ sub queue_pre_respond {
 sub queue_respond {
   my ($self, $rc, $msg, $args) = @_;
   
-  # reset transaction if we queued the mail
-  $self->reset_transaction;
-  
   if ($rc == DONE) {
+# reset transaction if we queued the mail
+$self->reset_transaction;
 return 1;
   }
   elsif ($rc == OK) {
@@ -840,6 +839,7 @@ sub queue_respond {
 $msg->[0] ||= 'Queuing declined or disabled; try again later';
 $self->respond(451, @$msg);
   }
+  $self->reset_transaction;
   
   # And finally run any queue_post hooks
  $self->run_hooks("queue_post");


RE: Released 0.83

2009-09-16 Thread Jared Johnson


From: Ask Bjørn Hansen [...@develooper.com]
Sent: Tuesday, September 15, 2009 5:22 PM
To: qpsmtpd@perl.org ML
Subject: Released 0.83

   prefork: More robust child spawning (Jared Johnson)

You might as well credit this one to Peter Samuelson while you're at it.  
That's my co-worker who gave me the patch.  He wanted it submitted upstream but 
didn't feel like getting on github :)

-Jared

Re: Regarding config plugins...

2009-09-09 Thread Jared Johnson

Patches submitted where?

-Jared

On 09/09/2009 04:43 PM, Jason Mills wrote:

Okay patches submitted.
Sorry for the time delay, but I had a bit of a hard drive scare last night
after I finished the code but before I could spin the patch.

Any input, suggestions and or scorn is welcome :)

- J

On Tue, 2009-09-08 at 21:22 -0700, Jason Mills wrote:

Working on it right now :)

On Tue, 2009-09-08 at 21:17 -0700, Ask Bjørn Hansen wrote:

On Sep 8, 2009, at 21:05, Jason Mills wrote:


So to the point of the question: Would anyone mind if I patched the
configuration stuff within Qpsmtpd to allow configuration minded
plugins
to alter/hook/stack into how Qpsmtpd discovers and utilizes cached
configuration values?


Show us a patch!   Changes to make it easier or more powerful to write
plugins are generally welcome.

(patches with unit tests are even better...)


   - ask


Re: [PATCH] Log even when we aren't in a transaction

2009-08-24 Thread Jared Johnson

On 08/23/2009 04:03 PM, Matt Sergeant wrote:

On Thu, 20 Aug 2009 10:18:39 -0500, Jared Johnson wrote:

It looks like logging/file doesn't like the empty hashref returned by
Qpsmtpd::transaction().


I never understood why it did that. Any reason it can't return either
undef or (preferably) a new Transaction object?


I don't really understand it either, and I fear that which I don't 
understand, so I worry about taking it out and breaking some hackage 
that depends on it :)


BTW, a bit of a cleaner work-around to this problem that we've tested 
successfully is to add 'sub transaction { return shift }' to Transaction.pm


-Jared


Re: [PATCH] Log even when we aren't in a transaction

2009-08-20 Thread Jared Johnson

This should allow the logging/file plugin to log even if it isn't
called from
within a transaction


Thanks, applied and pushed.

The conditional around where you made the first change are a bit of a
mess (indicating that there's a better way to do whatever it does),
but I didn't look closer ...  :-)


Well, this is in production now and hasn't actually fixed the issue.. 
now I realize that we're getting "Can't call method 'notes' on unblessed 
reference".. not "Can't call method 'notes' on undef".. I guess I'll have to 
look closer :)


-Jared


Re: [PATCH] Log even when we aren't in a transaction

2009-08-20 Thread Jared Johnson

On 08/14/2009 04:31 PM, Ask Bjørn Hansen wrote:


On Aug 14, 2009, at 14:14, Jared Johnson wrote:


This should allow the logging/file plugin to log even if it isn't
called from
within a transaction


Thanks, applied and pushed.

The conditional around where you made the first change are a bit of a
mess (indicating that there's a better way to do whatever it does),
but I didn't look closer ...  :-)


Well, I didn't have a closer look at logging/file itself, but I found 
that it was Qpsmtpd::Transaction::DESTROY() that was trying to log with 
$self->log() and failing.  It looks like logging/file doesn't like the 
empty hashref returned by Qpsmtpd::transaction().  The following patch 
does clear up the issue for us, although a more general solution might 
be in order.  Thoughts?


index 125c9a5..6053259 100644
--- a/lib/Qpsmtpd.pm
+++ b/lib/Qpsmtpd.pm
@@ -408,8 +408,10 @@ sub run_hooks_no_respond {
 my ($self, $hook) = (shift, shift);
 if ($hooks->{$hook}) {
 my @r;
+my $transaction = $self if ( ref $self ) =~ /Transaction$/;
+$transaction  ||= $self->transaction;
 for my $code (@{$hooks->{$hook}}) {
-eval { (@r) = $code->{code}->($self, $self->transaction, @_); };
+eval { (@r) = $code->{code}->($self, $transaction, @_); };
 $@ and warn("FATAL PLUGIN ERROR [" . $code->{name} . "]: ", $@) and next;

 if ($r[0] == YIELD) {
 die YIELD not valid from $hook hook;



-Jared




   - ask





patch for minor logging plugin crash

2009-08-14 Thread Jared Johnson
We noticed that logging/file sometimes emitted this error:

XX: Can't call method "notes" on unblessed
reference at /usr/share/qpsmtpd/plugins/logging/file line 272.

It appears this is because QP internals sometimes legitimately call ->log before
a transaction object exists.  The patch I'm sending should prevent logging/file
from crashing in that event.

I'm not really sure how to get git send-email to attach this description to my
actual patch email, so stay tuned for the patch itself :)

-Jared




[PATCH] Log even when we aren't in a transaction

2009-08-14 Thread Jared Johnson
From: Jared Johnson jar...@jello.nmgi.com

This should allow the logging/file plugin to log even if it isn't called from
within a transaction
---
 plugins/logging/file |4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/plugins/logging/file b/plugins/logging/file
index 31292ad..5717fe5 100644
--- a/plugins/logging/file
+++ b/plugins/logging/file
@@ -263,13 +263,15 @@ sub hook_logging {
 # - It's not already open
 # - We're allowed to split sessions across logfiles
 # - We haven't logged anything yet this session
+# - We aren't in a session
 if (!$self->{_f} ||
 !$self->{_nosplit} ||
+!$transaction ||
 !$transaction->notes('file-logged-this-session')) {
 unless (defined $self->maybe_reopen($transaction)) {
 return DECLINED;
 }
-$transaction->notes('file-logged-this-session', 1);
+$transaction->notes('file-logged-this-session', 1) if $transaction;
 }
 
 my $f = $self->{_f};
-- 
1.6.0.4



Re: invalid sender creates havoc

2009-07-30 Thread Jared Johnson

On 07/30/2009 07:11 AM, Peter J. Holzer wrote:

On 2009-07-29 14:07:53 -0500, Jared Johnson wrote:

We recently noticed a message in our postfix queue that thought it was
addressed to  foo.com.  After examining it, it turned out that Qpsmtpd
accepted a MAIL FROM command formatted like so:

MAIL FROM:u...@d.com,foo.com


Weird.

Ah, I see the problem. In line 187 of Qpsmtpd::Address the regexp for
the domain is initialized like this:

my $domain = "(?:$address_literal|$subdomain(?:\.$subdomain)*)";

(looks like I didn't know about qr// when I wrote that code)

That's a double quoted string, so \. needs to be written as \\..


Ah, that fixes it.  Incidentally, I think that might have been 
introduced by a patch that I submitted.  I think I recall there was a 
bit of discussion over using qr// but there was some good reason not to, 
but the reason itself completely escapes me :)
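The bug is easy to isolate; here is a two-line demonstration with a simplified stand-in for $subdomain (the real pattern is more involved):

```perl
use strict;
use warnings;

# In a double-quoted string, "\." collapses to ".", which as a regex
# matches ANY character -- including the comma that slipped through.
my $subdomain = '[a-zA-Z0-9]+';                    # simplified stand-in
my $broken = "(?:$subdomain(?:\.$subdomain)*)";    # "\." becomes "."
my $fixed  = "(?:$subdomain(?:\\.$subdomain)*)";   # survives as \.

print "broken accepts it\n" if 'foo,com' =~ /^$broken$/;
print "fixed rejects it\n"  unless 'foo,com' =~ /^$fixed$/;
# broken accepts it
# fixed rejects it
```

Building the pieces with qr// instead of plain strings sidesteps the double-escaping entirely, which is presumably why the qr// discussion came up at the time.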


-Jared

--

Inbound and outbound email scanned for spam and viruses by the
DoubleCheck Email Manager v5: http://www.doublecheckemail.com


Re: invalid sender creates havoc

2009-07-30 Thread Jared Johnson

On 07/29/2009 09:39 PM, Ask Bjørn Hansen wrote:


On Jul 29, 2009, at 12:07, Jared Johnson wrote:


We recently noticed a message in our postfix queue that thought it
was addressed to  foo.com.  After examining it, it turned out that
Qpsmtpd accepted a MAIL FROM command formatted like so:

MAIL FROM:u...@d.com,foo.com



Weird that postfix would munge it.  I thought "don't make stuff up" was
one of the reasons why everyone upgraded from sendmail.  Anyway... :-)


To be fair, I realized later that postfix doesn't munge it, our postfix 
queue parsing stuff munged it :)


-Jared



invalid sender creates havoc

2009-07-29 Thread Jared Johnson
We recently noticed a message in our postfix queue that thought it was 
addressed to  foo.com.  After examining it, it turned out that Qpsmtpd 
accepted a MAIL FROM command formatted like so:


MAIL FROM:u...@d.com,foo.com

When it passed it to postfix, it transmogrified it first into the 
sender 'u...@domain.com, something.com' and then into the sender 
'u...@domain.com' and the recipient ' something.com'.  That's a bit 
silly, but it seems that QP never should have accepted the message in 
the first place.  I tested using vanilla git Qpsmptd::Address as follows:


perl -MQpsmtpd::Address -le \
  "print Qpsmtpd::Address->new('u...@d.com,foo.com')"
u...@d.com,foo.com

Taking a look at Qpsmtpd::Address, I noticed some comments and code that 
seemed to be looking for similar syntax:


line
#   A-d-l = At-domain *( "," A-d-l )
#   ; Note that this form, the so-called source route,
#   ; MUST BE accepted, SHOULD NOT be generated, and SHOULD be
#   ; ignored.

...

# strip source route
$path =~ s/^\@$domain(?:,\@$domain)*://;

It looks to me like it's trying to find syntax like:

MAIL FROM:u...@d.com,@foo.com

and make it equivalent to just MAIL FROM:u...@d.com

That doesn't seem to be what it actually does, though:

perl -MQpsmtpd::Address -le \
  "print Qpsmtpd::Address->new('u...@d.com,@foo.com')"
u...@d.com\,@foo.com

Does anyone have a good enough grasp on rfc2821 and 
Qpsmtpd::Address::canonify() to know just what canonify() is actually 
supposed to be doing with source routes and why it's managing to allow 
this obviously invalid syntax through instead?


-Jared



Re: problems with authentication and qpsmtpd-prefork

2009-07-09 Thread Jared Johnson

This thread may or may not be applicable:

http://www.nntp.perl.org/group/perl.qpsmtpd/2008/07/msg8134.html

-Jared

On 07/09/2009 07:15 AM, Stefan Priebe wrote:

Hi!

I have massive problems with authentication and qpsmtpd. After some time
I always get a "you're already authorized". This is caused by not deleting
the key:
$qpsmtpd->{_auth} after a client closes his connection.

Then later when the next client connects to the same child it is already
authorized by $qpsmtpd->{_auth}.

I could solve this by adding a
delete($qpsmtpd->{_auth});

to qpsmtpd-prefork after
 # continue to accept connections until old age is reached
 for (my $i = 0 ; $i < $child_lifetime ; $i++) {


But i don't think this is the correct way. Or do i miss something?

Regards,
Stefan

--





Re: PATCH [1/5] Disconnect badhelo

2009-06-22 Thread Jared Johnson

 One could however offer an alternate standard set of plugins
implementing the discussed ideas, I think.


One could.  I've been considering it, but mine's not as configurable (in
terms of handling mailer behaviour on a hit) as I'd like.  Probably
wouldn't be hard to rework that bit.


The project already maintains multiple different daemon architectures, 
and a separate set of plugins for -async... IMO it would be an 
unnecessary burden to maintain an alternate set of plugins.  I agree 
with Chris that methodology changes done right would make plugins 
simpler, though, eliminating the need for this.


Of course the semi-related issue of per-recip configuration directives 
does seem like something that could complicate plugins, unless that 
complexity were moved to core somehow (e.g. the plugin says DENY and 
core possibly decides to do something else)


-Jared


Re: PATCH [1/5] Disconnect badhelo

2009-06-19 Thread Jared Johnson
I like this approach, but it's had its complications for us because we 
don't want to make the compromise of doing the rest of the checks at a 
given stage needlessly; e.g. if our RBL plugin hits, we don't want to 
bother doing RDNS and SPF lookups only to say 'sorry, you were on an 
RBL' afterward.  Have you come up with an intuitive way to deal with 
this sort of thing?  We have come up with some ways, but I don't know 
that I'd call it intuitive, yet :)


One thing we've done that dealt with the problem somewhat was to expand 
the idea of the DSN object to be more customizable and to specify all 
the information we need for logging, and for deciding what to do when a 
particular filter hits.  It hasn't really addressed the issue 
completely, we've got three or four different methods in place dealing 
with variations on the same problem right now...


-Jared

On 06/19/2009 12:12 PM, Chris Lewis wrote:

Charlie Brady wrote:


It's not as though there are any SMTP clients out there which will try
again with a different helo host. Unless there is a spambot tuned for this
qpsmtpd plugin.


There are plenty of bots that do that.


I don't understand your statement here. In what way would any client be
better off with a connection held open following the DENY, than having
DENY_DISCONNECT?


It would be better for qpsmtpd in the long run for ALL of the filter
decision plugins to note what they found in -notes(), and always issue
DECLINED.  Leave it up to a single meta-plugin run last in every SMTP
phase to decide how to change the SMTP state (deny, deny_disconnect,
score, etc).

The reaction to a plugin filter hit should be orthogonal to setting a
filter hit.  Meaning: a filter hit doesn't change SMTP protocol state,
that's another plugin's job.

In this way, you retain the flexibility to set the engine's behaviour -
whether it be deny immediately, deny at RCPT TO, deny after DATA,
score, whitelist, etc _independently_ of the plugin rule hit.

By doing this you don't have to add what to do when you get a hit
configuration to every plugin - they don't have to know about
whitelisting mechanisms or what state to go to.  Simplifying the
filtering plugins greatly.

I had to rewrite (usually from scratch) every one of the qpsmtpd plugins
to make that happen in our implementation.
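As a sketch of the pattern Chris describes (hook bodies, note names, and the rbl_lookup() helper are invented for illustration; this is not a drop-in qpsmtpd plugin and is not runnable outside a qpsmtpd install):

```perl
use Qpsmtpd::Constants;    # provides DECLINED, DENY, etc.

# A filter plugin only records what it found and always declines...
sub hook_connect {
    my ($self, $transaction) = @_;
    $self->connection->notes( rbl_hit => 'listed on dnsbl.example.net' )
        if rbl_lookup( $self->connection->remote_ip );    # hypothetical helper
    return DECLINED;
}

# ...and a single policy plugin, run last in the phase, turns the
# accumulated notes into the actual SMTP fate.
sub hook_rcpt {
    my ($self, $transaction, $recipient) = @_;
    if ( my $why = $self->connection->notes('rbl_hit') ) {
        return ( DENY, "rejected: $why" );
    }
    return DECLINED;
}
```

The ordering guarantee comes from the plugins config file, where the policy plugin is simply listed after every filter plugin for the phase.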




Spaces in MAIL FROM/RCPT TO commands

2009-06-11 Thread Jared Johnson
We've been considering blocking messages that break RFC compliance by 
including a space before or after the colon in MAIL FROM: and RCPT TO: 
commands.  From RFC 5321 Section 3.3:


   Since it has been a common source of errors, it is worth noting that
   spaces are not permitted on either side of the colon following FROM
   in the MAIL command or TO in the RCPT command.  The syntax is exactly
   as given above.

I wrote up the following statistics in an attempt to see just what would 
happen if we started enforcing this syntax.  As it turns out, we would 
be able to skip a whole lot of RBL, SPF, and address lookups as well as 
content scans, and also be able to block a number of spams that would 
have otherwise gotten through our filter... but we would also block 
*some* legit mail.  Most of the mail that would have been blocked was 
sent by email marketers, though, and wouldn't have necessarily been 
missed by our customers; and if the senders knew what they were doing, 
they could always do something about it.


Anyway, I thought I'd post it here and see if anyone has any opinions on 
whether it's acceptable to enforce this.  It's certainly very tempting.




Within a period of about 24 hours, the node that I examined processed 
22,523 transactions that would have violated strict syntax checks WRT 
spaces.

---
Rejections:
invalid recip  15,881
rbl 4,856
spamassassin  683
spf   387
syntax168 (missing angle brackets, etc.)
relay denied   38
blacklisted sender 23
no rdns14
rdns mismatch   2
phishing2
---
Deferrals:
spf defer  24
---
Quarantined:  211
Delivered:192
---


Total Blocked:   22,265 (99.1%)
Total Delivered:192 (0.9%)

This node processed a total of 33,553 transactions overall, 3,796 of 
which resulted in delivery.  66.9% of these transactions would have been 
blocked outright by blocking on improper spaces in mail from/rcpt 
commands.  Doing so would also have decreased the number of delivered 
messages by 5%, and decreased the number of quarantined messages by 32% 
-- but a few of these new blocks would have been false positives.


I examined 71 of the delivered messages that would have been blocked due 
to RFC-ignorance.  There were some definite false positives.  Most of 
these false positives were addressed with 'bounce@' and similar 
addresses and were obviously sent by direct marketing companies.  Three 
messages, however, were sent by normal end-users who would have received 
bounces if we were blocking based on RFC ignorance.


Of the remaining delivered messages with invalid syntax, nearly all 
*appeared* to be sent by direct marketers.  Upon investigation, however, 
most of these appeared very likely to be illegitimate.  I suspect that a 
small handful of popular direct marketing email software applications 
use this flawed syntax, and are used by some legitimate marketers, as 
well as by a large number of spammers, and perhaps a few small marketing 
companies that don't realize (and probably don't care) that their 
customers are sending spam through their servers.  StrongMail, the 
apparent MTA for multiview.com, is one candidate:


http://www.strongmail.com/what-we-offer/email-deliverability-solutions/

At any rate, some obviously legitimate mass email campaigns would have 
been blocked due to RFC ignorance from the following organizations:


bounce.rd.com (Reader's Digest): 3
cmpgnr.com (Campaigner)  5
email.joann.com (Joann Fabric):  2
service.govdelivery.com: 2
medtech.cfmvmail.com:1
mktg.artfulhome.com: 1
strongmail.multiview.com:1

Marketing Senders:  15
Human Senders:   3
Total:  18



So, the samples that I examined had a 23.4% false positive rate, 
although FWIW if you don't count mass-marketers, it would be a 3.9% FP 
rate.  The overall FP rate, without taking collisions with other 
rejection methods into account, would be around 0.2%.  Considering that 
this syntax is clearly prohibited by RFC, and the majority of senders 
are professional organizations who can do something about the issue if 
they want their messages to get through our systems, it may be 
appropriate to reject outright in spite of the possibility of blocking 
some legitimate mail.
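A minimal sketch of the check itself (the function name is invented, and a real plugin would see the raw command line via the appropriate hook rather than a plain string):

```perl
use strict;
use warnings;

# RFC 5321 section 3.3: no spaces are permitted on either side of the
# colon following FROM in MAIL or TO in RCPT.
sub rfc_space_violation {
    my ($line) = @_;
    return $line =~ /^(?:MAIL FROM|RCPT TO)(?:\s+:|:\s)/i ? 1 : 0;
}

print rfc_space_violation('MAIL FROM: <user@example.com>'), "\n";   # 1
print rfc_space_violation('MAIL FROM :<user@example.com>'), "\n";   # 1
print rfc_space_violation('MAIL FROM:<user@example.com>'), "\n";    # 0
```

Rejecting on a hit is then a one-line DENY, which is what makes the check so cheap relative to the RBL, SPF, and content scans it short-circuits.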


-Jared


Re: prefork 'orphaned child' messages

2009-06-05 Thread Jared Johnson
One of my associates suggested this patch.  The idea is to reduce 
loop_sleep if we had to spawn a lot of children last time through the 
loop.  This way on restart (since $created_children is initialized to 
$idle_children), and on sudden concurrency spikes, we loop quickly 
enough to spawn the children we need very quickly; whereas under normal 
conditions, we don't bother.  I still decreased loop_sleep by half in 
our own version, so that the longest wait clients will experience during 
a drastic concurrency increase would be just slightly above 15 seconds.


 Would there be any interest in accepting this if I tested it and put 
it in my git?


-Jared

On 06/02/2009 08:09 AM, Jared Johnson wrote:

Why not set --idle-children at start-up to something higher (or just 0
to disable)?


Setting it higher is a bit bothersome because our QP children use too
much memory right now (mainly from using DBIx::Class), so it would be a
bit unfortunate to have more memory-consuming kids around that aren't
needed.  As for disabling idle-children.. what exactly would the
behavior be with disabled idle children?  In a situation where
concurrency is saturated, it seems that QP spawns $idle_children
processes every $loop_wait seconds.  Would QP instead spawn a new child
_immediately_ when a new connection attempt came in?  Or if it still
waits $loop_wait seconds, how many children does it spawn when it sees
some are needed?

-Jared


--- a/qpsmtpd-prefork
+++ b/qpsmtpd-prefork
@@ -75,7 +75,7 @@ my $max_children= 15;   # max number of child processes to spawn
 my $idle_children   = 5;# number of idle child processes to spawn
 my $maxconnip   = 10;
 my $child_lifetime  = 100;  # number of times a child may be reused
-my $loop_sleep  = 30;   # seconds main_loop sleeps before checking children
+my $loop_sleep  = 15;   # seconds main_loop sleeps before checking children
 my $re_nice = 5;# substracted from parents current nice level
 my $d_start = 0;
 my $quiet   = 0;
@@ -339,10 +339,11 @@ sub reaper {
 #arg0: void
 #ret0: void
 sub main_loop {
+my $created_children = $idle_children;
 while (1) {
 # if there is no child death to process, then sleep EXPR seconds
 # or until signal (i.e. child death) is received
-sleep $loop_sleep unless @children_term;
+sleep $loop_sleep / ($created_children * 2 + 1) unless @children_term;
 
 # block CHLD signals to avoid race
 my $sigset = block_signal(SIGCHLD);
@@ -379,9 +380,9 @@ sub main_loop {
 }
 
 # spawn children
-for (my $i = scalar(keys %children) ; $i < $chld_pool ; $i++) {
-new_child();# add to the child pool
-}
+$created_children = $chld_pool - keys %children;
+$created_children = 0 if $created_children < 0;
+new_child() for 1..$created_children;
 
 # unblock signals
 unblock_signal($sigset);


Re: prefork 'orphaned child' messages

2009-06-02 Thread Jared Johnson

Why not set --idle-children at start-up to something higher (or just 0
to disable)?


Setting it higher is a bit bothersome because our QP children use too 
much memory right now (mainly from using DBIx::Class), so it would be a 
bit unfortunate to have more memory-consuming kids around that aren't 
needed.  As for disabling idle-children.. what exactly would the 
behavior be with disabled idle children?  In a situation where 
concurrency is saturated, it seems that QP spawns $idle_children 
processes every $loop_wait seconds.  Would QP instead spawn a new child 
_immediately_ when a new connection attempt came in?  Or if it still 
waits $loop_wait seconds, how many children does it spawn when it sees 
some are needed?


-Jared


Re: prefork 'orphaned child' messages

2009-06-01 Thread Jared Johnson

Even if you're not near max children the parent will only spawn max idle
children, then it sleeps until an event, wakes up and sees if more
children are needed. Debug log should give some clue, if this is the
reason.


Bingo.  Looking further into things, it was apparent that a freshly 
restarted node was grossly underprovisioned WRT child processes.  On 
intial restart, it had 5... 30 seconds later, it had 10.  I manually set 
$loop_sleep=5 (originally hard-coded to 30) and this addressed the 
issue.  Is this drastically smaller loop likely to have any 
side-effects?  If not, it might be a good idea to set this upstream.


-Jared


Re: prefork 'orphaned child' messages

2009-05-29 Thread Jared Johnson

No I don't think it's normal. What are you doing in your plugins? Wasn't
there some issue we uncovered a while ago to do with MySQL?


I don't recall hearing of issues with mysql.. we use postgres here and 
do, well, lots of stuff:  lookups for global and ip-based max 
concurrency rules in hook_pre_connect, lookups for various preferences 
in hook_rcpt.. if it's a plugin that's breaking things then it's almost 
certainly our own code, as we've re-written most of qp's plugins for our 
uses :)


Any clue what's going on in general?  Do children wind up 'orphaned' 
when they go completely unresponsive, e.g. they're hanging out in some 
infinite loop somewhere or something similarly evil?


The basic problem we've been encountering is that very rarely, all of 
our dozen QP nodes inexplicably introduce long delays before answering 
with a banner (no banner delay involved); watching the logs, it doesn't 
look like any child is spawned during this wait, and we aren't anywhere 
near max-children.  Eventually, a child is spawned and then handles the 
connection normally, assuming the client hasn't timed out.  And 
eventually, every slave recovers and starts behaving normally.  It seems 
like this issue would have to be network related in some way, as every 
node is affected around the same time; DNS was the first obvious choice 
but we don't seem to be having any problems with that.  One we find said 
network problem this issue might go away, but it seems that our smtpd 
should not be so fragile to whatever issue is hanging it up..


-Jared
Inbound and outbound email scanned for spam and viruses by the
DoubleCheck Email Manager v5: http://www.doublecheckemail.com


Re: prefork 'orphaned child' messages

2009-05-29 Thread Jared Johnson

What's orphaned is not a child process, but a shared mem hash record for a
process which no longer exists. I suspect that code is racy.


Hrm, then if we're getting a whole lot of these, does this mean child 
processes are going away at a high rate?  I wouldn't expect such a 
condition using prefork with relatively low concurrency, maybe this 
reflects a problem during the end of the connection..


-Jared



Re: prefork 'orphaned child' messages

2009-05-29 Thread Jared Johnson

Inbound and outbound email scanned for spam and viruses by the
DoubleCheck Email Manager v5: http://www.doublecheckemail.com


Do we have to be exposed to this spam?


*blush*

Since I administer our qp installation that adds those, I suppose I 
could exempt myself without anybody noticing :)


-Jared



Re: [qpsmtpd] 0.81 Can prefork listen on more than one port?

2009-05-21 Thread Jared Johnson

On 05/21/2009 02:56 PM, Ask Bjørn Hansen wrote:


On May 21, 2009, at 12:30, J wrote:


The prefork in 0.81 can only listen on one.  The next release will
support
listening on multiple addresses & ports in prefork, just as
forkserver does.


Any idea when that might be?


Probably in about 4-6 weeks.  My plan is to do a release at least
every 3 months; or when there are significant changes/bugfixes -
whichever comes first.


You might want to consider the multiple-addresses/ports thing to be 
significant enough to warrant an early release; I noticed a bit ago that 
the stock debian package had switched to prefork, but had to revert back 
to forkserver because of the regression on that issue.  Food for thought.


Of course, someone could also just advise the debian maintainer to apply 
that patch to the debian package :)


-Jared


Re: [qpsmtpd] 0.81 Can prefork listen on more than one port?

2009-05-21 Thread Jared Johnson

I'm the Debian maintainer.  :)


Fancy meeting you here :)  OT, you might want to patch the debian init 
script now that QP supports SIGHUP.  Here's my local patch, but I 
haven't bothered scrutinizing and un-preforkifying and therefore 
officially sending it in:


--- dc-smtpd/debian/dc-smtpd.init   (revision 8819)
+++ dc-smtpd/debian/dc-smtpd.init   (revision 8820)
@@ -112,6 +112,12 @@
fi
 }

+qpsmtpd_reload()
+{
+start-stop-daemon --stop --signal 1 --quiet --pidfile $PIDFILE
+return 0
+}
+
 case $1 in
start)
echo -n "Starting qpsmtpd: "
@@ -123,7 +129,12 @@
qpsmtpd_stop
echo "qpsmtpd-prefork."
;;
-   restart|reload|force-reload)
+reload|force-reload)
+echo -n "Reloading qpsmtpd: "
+qpsmtpd_reload
+echo "qpsmtpd-prefork."
+;;
+   restart)
qpsmtpd_stop
qpsmtpd_start
;;

-Jared


Re: [qpsmtpd] from 0.32 to 0.81, forkserver to pre-fork

2009-05-13 Thread Jared Johnson

First issue: If the plugins (Most are highly customized) run under
forkserver, will they be expected to run under pre-fork without
modification? I'm trying to recall some of the discussions, and I think
the issues were with pollserver - something about async code(?).


If I recall our own similar migration correctly, things went off nearly 
without a hitch from forkserver to prefork.  You might want to think 
about whether your customized stuff makes any assumptions that every new 
connection gets a new process, e.g. if you have some sort of ugly 'our' 
stuff that's supposed to be per-transaction; if so, your code is kinda 
broken anyway :P


IIRC, our hook_pre_connect stuff did assume it was running before the 
fork, and (IIRC again) on prefork it runs after the fork; at any rate, 
we moved some things to register() that belonged there in the first 
place to alleviate the problems we had with hook_pre_connect


-Jared


Re: [qpsmtpd] port while in pre_connection hook?

2009-05-05 Thread Jared Johnson

Hi,

In current qp it would be about like so:

sub hook_pre_connection {
my ($self,$transaction,%arg) = @_;
my $port = $arg{local_port};
}

-Jared

b-sub-bdf0...@rope.net wrote:

On Tue, 5 May 2009, Hanno Hecker wrote:


On Tue, 5 May 2009 17:20:13 + (UTC)
b-sub-bdf0...@rope.net wrote:


qpsmtpd 0.31

We are using the following to determine the local port when in the
'connect' hook:

my $localport = $self->qp->connection->local_port;

We would like to do the same thing in the 'pre_connection' hook. The above
line doesn't work, and shows 'uninitialized value' in the logs.

Is there some other way to get the local port at this stage, or do we have
to move what we want to do to the connect hook, instead?

The pre-connection hook was not working until 0.32... (or SVN rev
599 / 601). This is more than 3 years now... and lots of bugs were
fixed .)


My mistake - version 0.32. We will be upgrading shortly.

However, I still need to get the local port number in the pre_connection
hook, or I have to move the functionality to later than what would be
ideal :-/




Re: Qpsmtpd::Transaction::body_fh

2009-04-07 Thread Jared Johnson
So, after some more extensive testing, we've found that dup'ing body 
fh's all over the place and then writing to different ones can cause 
some problems.  So I think I will submit my dup() enhancement as a 
separate function, dup_body_fh().


When I updated the documentation I noticed that body_fh() is actually 
documented to return undef in the event that the body is not yet spooled 
to disk.  I assumed that it was an oversight that body_fh didn't call 
body_spool() similarly to body_filename, but it seems there was some 
thought given to it.  Is this the preferred behavior?


-Jared

Jared Johnson wrote:
Based on my testing, it looks like body_resetpos() behavior would be 
unaffected by the dup().  I didn't realize this was the case, so my 
mention of 'if we happen to seek() to the wrong place bad things won't 
happen' is not exactly accurate.  But this is probably for the best; 
although I would normally expect someone to do '$txn->body_resetpos; my 
$fh = $txn->body_fh;' -- it is probably best to not break when they do 
'my $fh = $txn->body_fh; $txn->body_resetpos;'.  So the main 
accomplishment of calling dup() is that we can pass $fh to a module that 
calls $fh->close() without clobbering things :)  I'll send in a patch 
soonish.


-Jared

Robert Spier wrote:

Most plugins call body_resetpos before using the FH.

-R

Jared Johnson wrote:

I'm considering submitting a patch that enhances
Qpsmtpd::Transaction::body_fh() to (1) call body_spool() if the
message was not already spooled to disk; and (2) dup the filehandle
before returning it, so that if a plugin that calls body_fh, or an
external module that the plugin passes the fh to, happens to seek() to
some weird place or close() the fh, bad things won't happen to plugins
that use the fh later (body_fh, body_resetpos, body_getline).

My question is, are there times when a plugin might be expecting such
modifications to body_fh, justifying splitting the second
functionality into a separate function, say, body_fh_dup() or so?  I'd
lean toward making the changes directly to body_fh() but wanted to see
if there were any objections to this.

-Jared





--
Jared Johnson
Software Developer and Support Engineer
Network Management Group, Inc.
620-664-6000 x118
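A minimal sketch of the dup() semantics discussed in this thread (plain Perl,
not qpsmtpd code): duplicating a handle with open's '<&' mode means a consumer
can close() its copy without invalidating the original handle.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Spool some data to disk, roughly what body_spool() would do
my ($fh, $filename) = tempfile();
print $fh "fake message body\n";
seek $fh, 0, 0;

# Hand out a duplicate; the consumer is free to close() it
open my $dup, '<&', $fh or die "dup failed: $!";
close $dup;

# The original handle is unaffected by the close() above
print scalar <$fh>;    # still reads the spooled line
```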


perltidy

2009-04-03 Thread Jared Johnson

first of all, kudos on the frequent releasing :)

I've attached a suggested patch to .perltidyrc.  I've been playing 
around with perltidy'ing all QP code and some results I don't like. 
This doesn't fix all the things that I don't like, but it's been sitting 
on my laptop for a month so I thought I'd submit it to the ML for 
discussion.  I haven't committed it to my git since I'd just revert it 
if it wasn't accepted :)  Here's the relevant documentation to go along 
with the additions:


   -ce,   --cuddled-else
   Enable the cuddled else style, in which else and elsif 
follow immediately after the curly brace closing the previous block. 
 The default is not to use cuddled elses, and is indicated
   with the flag -nce or --nocuddled-else.  Here is a 
comparison of the alternatives:


 if ($task) {
 yyy();
 } else {# -ce
 zzz();
 }

 if ($task) {
   yyy();
 }
 else {# -nce  (default)
   zzz();
 }



   [-nbcc], -bbc,  --blanks-before-comments
   A blank line will be introduced before a full-line comment. 
 This is the default.  Use -nbbc or  --noblanks-before-comments to 
prevent such blank lines from being introduced.




   -msc=n,  --minimum-space-to-comment=n
   Side comments look best when lined up several spaces to the 
right of code.  Perltidy will try to keep comments at least n spaces to 
the right.  The default is n=4 spaces.




-Jared
diff --git a/.perltidyrc b/.perltidyrc
index 65b29f2..9ffed84 100644
--- a/.perltidyrc
+++ b/.perltidyrc
@@ -10,6 +10,10 @@
 -lp # line up with parentheses
 -cti=1  # align closing parens with opening parens (closing token placement)
 
+-ce # cuddled else
+-nbbc   # don't insert blank lines before comments
+-msc=1  # minimum one space before side comments (default 4)
+
 # -nolq # don't outdent long quotes (not sure if we should enable this)
 
 


Re: perltidy

2009-04-03 Thread Jared Johnson

[-nbcc], -bbc,  --blanks-before-comments


typo'd my interposition there, it's -nbbc

-Jared


[PATCH] Close spamd socket when we're finished with it

2009-04-02 Thread Jared Johnson
From: jaredj jar...@nmgi.com

QP's connection to spamd unnecessarily persists beyond the run of the
spamassassin plugin itself.  This closes the socket as soon as we're
finished using it.
---
 plugins/spamassassin |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/plugins/spamassassin b/plugins/spamassassin
index 1c4aa85..6865efe 100644
--- a/plugins/spamassassin
+++ b/plugins/spamassassin
@@ -185,6 +185,7 @@ sub hook_data_post { # check_spam
 
   }
  my $tests = <SPAMD>;
+  close SPAMD;
   $tests =~ s/\015//;  # hack for outlook
   $flag = $flag eq 'True' ? 'Yes' : 'No';
  $self->log(LOGDEBUG, "check_spam: finished reading from spamd");
-- 
1.5.6.3



Re: Qpsmtpd::Transaction::body_fh

2009-04-01 Thread Jared Johnson
Based on my testing, it looks like body_resetpos() behavior would be 
unaffected by the dup().  I didn't realize this was the case, so my 
mention of 'if we happen to seek() to the wrong place bad things won't 
happen' is not exactly accurate.  But this is probably for the best; 
although I would normally expect someone to do '$txn->body_resetpos; my 
$fh = $txn->body_fh;' -- it is probably best to not break when they do 
'my $fh = $txn->body_fh; $txn->body_resetpos;'.  So the main 
accomplishment of calling dup() is that we can pass $fh to a module that 
calls $fh->close() without clobbering things :)  I'll send in a patch 
soonish.


-Jared

Robert Spier wrote:


Most plugins call body_resetpos before using the FH.

-R

Jared Johnson wrote:

I'm considering submitting a patch that enhances
Qpsmtpd::Transaction::body_fh() to (1) call body_spool() if the
message was not already spooled to disk; and (2) dup the filehandle
before returning it, so that if a plugin that calls body_fh, or an
external module that the plugin passes the fh to, happens to seek() to
some weird place or close() the fh, bad things won't happen to plugins
that use the fh later (body_fh, body_resetpos, body_getline).

My question is, are there times when a plugin might be expecting such
modifications to body_fh, justifying splitting the second
functionality into a separate function, say, body_fh_dup() or so?  I'd
lean toward making the changes directly to body_fh() but wanted to see
if there were any objections to this.

-Jared




git/svn sync

2009-03-05 Thread Jared Johnson
Hi,

This isn't really a git ML, but perhaps some of you have had similar struggles 
in dealing with the switch to git and can give me some suggestions.  I used to 
keep my organization's SVN version of QP up-to-date with `svn merge` pretty 
easily.  Now that QP is in Git, I'd like to see if there's a way to continue to 
effortlessly synchronize with Git.  I've looked into git-svn, which looks like 
it makes it very easy to create an SVN branch with every single Git commit 
represented as an SVN commit.  However, my organization's SVN Czar is not 
thrilled with the idea of ballooning our history by a thousand or so commits.  
What I would really like to do is 'merge everything from commit XYZ to HEAD' in 
a single commit to SVN, with either git or svn presenting any conflicts that 
might arise.  Can anyone suggest how I can accomplish this?

Thanks in advance

-Jared
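One way to get the 'many git commits, one SVN commit' effect is git merge
--squash, followed by a single git svn dcommit of the result; conflicts
surface at the merge step. A throwaway demo of the squash step (repo names,
branch names, and messages here are made up):

```shell
set -e
tmp=$(mktemp -d)
cd "$tmp"
git init -q repo
cd repo
git config user.email demo@example.invalid
git config user.name demo
main=$(git symbolic-ref --short HEAD)   # branch name varies by git version

echo base > f.txt
git add f.txt
git commit -qm 'base'

git checkout -qb topic                  # stands in for upstream git history
echo one >> f.txt; git commit -qam 'one'
echo two >> f.txt; git commit -qam 'two'

git checkout -q "$main"
git merge --squash topic > /dev/null    # stages the combined diff, no commit
git commit -qm 'sync: fold topic history into one commit'
git rev-list --count HEAD               # base plus the single squash commit
```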

Re: Noise on the Commit List

2009-03-04 Thread Jared Johnson
However, in general the mails github creates seem  
pretty useless. 


I find them very useful.  I can see exactly when you merge someone
else's work and when you make a release.  It was just the large flood of
things with 5 year old dates that was not useful.


I agree that these updates are generally useful, as an easy passive way 
to keep tabs on master and as quick links to view commits on github


-Jared


Re: persistence?

2009-02-23 Thread Jared Johnson

right now notes is per-connection (is that correct?) so there may be
things that would break if the scope of notes were to change.


Well, $self->qp->connection->notes() is per-connection, 
$transaction->notes() is per-transaction, and (if/when my address notes 
patch is accepted), $addr->notes() is per address.  But what if we give 
plugin authors access to hook_connection_notes, hook_transaction_notes, 
hook_address_notes, which pass the object in question, allowing the 
plugin to horse around with the note handling however it wants to, 
including widening the scope:


sub hook_transaction_notes_get {
    # store tarpit and some other thing in Cache::FastMmap
    my ($self, $txn, $key) = @_;
    my %cfm_keys;
    @cfm_keys{qw(tarpit otherthing)} = ();
    return DECLINED unless exists $cfm_keys{$key};
    my $cache = Cache::FastMmap->new(...) or return DECLINED;
    return OK, $cache->get($key);
}

Possibly a hook_notes() could be provided to let you override all scopes 
of notes, either for convenience or for other strange purposes:


sub hook_notes_set {
    # I'm picky, so I want notes to be stored in {_my_notes}
    # rather than {_notes}, except for $connection objects
    my ($self, $obj, $key, $value) = @_;
    return DECLINED if ref $obj eq 'Qpsmtpd::Connection';
    $obj->{_my_notes}->{$key} = $value;
}


What kind of locking and/or transactioning do we want in persistent notes?


I suggest letting the backend handle both.  Typically we'll be using 
something that was designed to let not just multiple QP processes but 
also multiple non-QP processes interact with the store, and so it 
already has to implement its own locking and atomicity.  If we 
absolutely need to implement our own, we can do so in the plugin.


-Jared


address notes patches

2009-02-20 Thread Jared Johnson
I've finally got around to attempting to learn to navigate Git. 
Attempting to follow the suggested workflow in development.pod, I've 
created a QP fork on Github:


http://github.com/jaredj/qpsmtpd/tree


... and created three topic branches, each with a single commit, 
representing my new effort to add address notes and clean up the code 
just a bit while I'm at it, but without breaking notes('arg',undef):


Shorten/clarify Connection and Transaction notes()
http://github.com/jaredj/qpsmtpd/commit/46a045e3d22a1ef2160835edb4c3c7c1f51f4658

Consistent Connection and Transaction notes() POD
http://github.com/jaredj/qpsmtpd/commit/f15c7d687fc5d8921edaf7d6d0f0add70ec0e553

Add Qpsmtpd::Address::notes() method
http://github.com/jaredj/qpsmtpd/commit/40f6d68b0274891abb7f224b08b2f9bdc9eed031

... and merged them to my fork.  Any suggestions making this process 
handier for you guys to review suggested changes would be appreciated. 
Three topic branches may be a bit overkill, but I figured better 
overkill than underkill :)  I was a bit impressed with Git today; 
normally when I work without connectivity, I work right up to the point 
that I would be ready to commit and then either stop or begin working on 
manually created copies of files, since I can't begin editing SVN 
history and making SVN branches until I can talk to the central repo. 
With Git, I was able to put together a decent set of branches and 
commits in the car on the way to work this morning with no internet 
connection; Nice :)


-Jared


reset_transaction between AUTH and MAIL_FROM

2009-02-20 Thread Jared Johnson
I've begun working on a custom auth plugin and tried setting some 
transaction notes after successful authentication based on the 
credentials that were used.  Of course this didn't work, as the 
transaction is reset as soon as MAIL FROM is issued.  I can probably get 
around this with connection notes, of course, but I was just wondering 
whether it was intentional for transactions to be reset between MAIL 
FROM and AUTH, rather than at the first issue of *either* MAIL FROM *or* 
AUTH?


-Jared
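A sketch of the connection-notes workaround (hook names follow qpsmtpd's auth
plugin convention; _check_password is a hypothetical helper): record the
credentials in connection notes at AUTH time, since those survive the reset,
then copy them into the new transaction once MAIL FROM has created it.

```perl
sub hook_auth_plain {
    my ($self, $transaction, $method, $user, $passClear) = @_;
    return DECLINED unless _check_password($user, $passClear);  # hypothetical
    # Connection notes outlive the transaction reset at MAIL FROM
    $self->qp->connection->notes(auth_user => $user);
    return OK;
}

sub hook_mail {
    my ($self, $transaction, $sender) = @_;
    # The new transaction now exists, so transaction notes will stick
    my $user = $self->qp->connection->notes('auth_user');
    $transaction->notes(auth_user => $user) if defined $user;
    return DECLINED;
}
```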


Re: Error: 503 but you already said AUTH

2009-02-20 Thread Jared Johnson

The attached diff should fix it for you. I'll commit later if nobody
complains.


It appears this still hasn't been applied.  I just applied it myself and 
it works for me; is there anything stopping this from being committed?


Also note that you might want to wipe out the tab that's in the midst of 
all those non-tab indents while you're at it :)


-Jared


Re: Error: 503 but you already said AUTH

2009-02-20 Thread Jared Johnson
Does git support 'hooks' like SVN does?  Here at work we have some very 
spiffy SVN hooks that don't let us commit code if it fails perltidy, 
contains tabs, trailing whitespace, long lines, etc.  It can be 
overridden by the comitter, but often it catches things we didn't 
actually mean to commit.  Especially whitespace problems are easy to miss.


-Jared

Hanno Hecker wrote:

On Fri, 20 Feb 2009 10:21:22 -0600
Jared Johnson jar...@nmgi.com wrote:


The attached diff should fix it for you. I'll commit later if nobody
complains.
It appears this still hasn't been applied.  I just applied it myself and 
it works for me; is there anything stopping this from being committed?

No, I'm running this on one machine myself without problems.

Also note that you might want to wipe out the tab that's in the midst of 
all those non-tab indents while you're at it :)

We should run perltidy now and write a .pod what our preferred coding
styles are. (e.g. running perltidy / podtidy before any commit)

Hanno
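Git does have hooks (executable scripts under .git/hooks/, e.g. pre-commit),
and its built-in 'git diff --cached --check' already detects trailing
whitespace in staged changes, so a whitespace gate needs no custom parsing.
A throwaway demo of the check such a hook could run (file contents made up):

```shell
set -e
tmp=$(mktemp -d)
cd "$tmp"
git init -q
git config user.email demo@example.invalid
git config user.name demo

echo 'clean line' > file.txt
git add file.txt
git commit -qm 'initial commit'

printf 'trailing blank \n' >> file.txt   # note the trailing space
git add file.txt

# A pre-commit hook would run this check and exit non-zero to block the commit
if git diff --cached --check > /dev/null; then
    echo 'clean: commit allowed'
else
    echo 'whitespace problem: commit would be aborted'
fi
```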




Re: persistence?

2009-02-20 Thread Jared Johnson

David Nicol wrote:

Okay then.  Anyone for collaborating on setting one up, with a unified
accessor syntax similar to what we have for notes ?

http://advenge.com/2009/products/downloads/qpsmtpd/plugins/verp

is a working plugin that uses DirDB persistence library to rewrite the
envelope return addresses of all traffic except traffic that is exempt
due to relaying-okay or the verp_no note getting set.

a persistent notes -- maybe pnotes like notes, but persistent --
would be more general and could be replaced with a backend as needed.
As for lasting between two and three weeks, well, what all is needed?


It sounds like a good idea in general.  We currently keep a few 
persistent things in our database; in the past we've kept some in 
Cache::FastMmap (we later decided this was premature optimization and 
did away with it); we are looking at keeping some things in memcached. 
A universal accessor for all of these types of persistent store would 
make it easier to generalize out plugins like, say, greylisting, and 
still allow people to set up whatever sort of persistent store makes sense.


I would envision the implementation in Qpsmtpd::store(), hook_store, 
hook_store_get, and a convenience function in 
Qpsmtpd::Address::store()... how about a fake example implementing 
imaginary adjustable tarpit at connect time and RCPT time based on, say, 
the frequency of spam for a given connecting IP, and another based on 
frequency of spam for a given recipient (not that this is a particularly 
great idea..)


sub hook_pre_connection {
  my ($self,$txn) = @_;
  # This would be %arg or something if I read the docs
  my $ip = $self->qp->connection->remote_ip;
  _delay_connection($self->store("ip_tarpit_seconds_$ip"));
}

sub hook_rcpt {
  my ($self,$txn,$rcpt) = @_;
  # Demonstration of a convenience method for a store keyed on address
  _delay_connection($rcpt->store('tarpit_seconds'));
}

sub hook_data_post {
  my ($self,$txn) = @_;
  my $hit;
  for my $rcpt ($txn->recipients) {
    if (_scan_for_spam($txn)) {
      my $old = $rcpt->store('tarpit_seconds') // 0;
      $rcpt->store('tarpit_seconds', $old + 5);
      $hit = 1;
    }
  }
  return DECLINED unless $hit;
  my $ip = $self->qp->connection->remote_ip;
  my $old = $self->store("ip_tarpit_seconds_$ip");
  $self->store("ip_tarpit_seconds_$ip", $old + 5);
}

sub hook_store {
  my ($self,$txn,$addr,$key,$value) = @_;
  if ($addr) {
    # A Qpsmtpd::Address or compatible object was passed,
    # probably because we were called with $rcpt->store()
    # I want to do funky stuff with it to build my real key
    $key = $addr->notes('uid') . '_' . $addr->format . "_$key";
  }
  eval {
    my $cache = Cache::FastMmap->new(...);
    $cache->set($key,$value);
  };
  return $@ ? DECLINED : OK;
}

sub hook_store_get {
  my ($self,$txn,$addr,$key) = @_;
  if ($addr) { ... } # juju like above
  my $cache = Cache::FastMmap->new(...);
  return OK, $cache->get($key);
}

Some things I'm not so sure about:

- Two hooks, one for retrieval and one for storage?  I think that's 
probably the way to go
- Do we always pass a possible address object to the hooks?  Or do we 
leave it out and let people manually muck with their keys?  It seems to 
me that it would be very useful to give people means to construct keys 
in interesting ways without having to expose that on the caller end; but 
is this the best way to do so?  You could go as far as to allow *any* 
number of arguments, which get passed to the hook, which takes the last 
argument as the value and uses the rest after $self and $transaction to 
construct the key.  That would probably be a bit overboard, but maybe there 
are other ideas.


Thoughts?

And, just out of curiosity, where would DirDB be more appropriate to use 
than memcached or Cache::FastMmap?


-Jared


Re: [PATCH] Add 'address notes'

2009-02-12 Thread Jared Johnson

-  my $self = shift;
-  my $key  = shift;
-  @_ and $self->{_notes}->{$key} = shift;
-  #warn Data::Dumper->Dump([\$self->{_notes}], [qw(notes)]);
-  $self->{_notes}->{$key};
+  my ($self,$key,$value) = @_;
+  $self->{_notes}->{$key} = $value if defined $value;
+  return $self->{_notes}->{$key};

I originally considered these functionally identical, but they are not. 
 The new code, called with, say, $txn->notes('discard',undef), would 
result in evaluation as if it were a 'get' method rather than setting 
the 'discard' note to undef.  That seems quite dangerous.  I suggest 
either reverting the language back to the '@_ and' model, or else doing 
something like:


my ($self,$key,$value) = @_;
$self->{_notes}->{$key} = $value if scalar @_ > 2;
return $self->{_notes}->{$key};


I'd be happy to submit a patch for either sort of language.



Incidentally, I came across this when looking into a bug that was caused 
when I did $txn->recipients(grep {$_ ne $rcpt} $txn->recipients) -- 
expecting it to clear the recipient list if grep returned an empty list. 
 Instead, it predictably behaved as a get method, e.g. a noop.  I 
solved this by adding a clear_recipients() method to 
Qpsmtpd::Transactions, then thought better of it and added 
remove_recipients() instead.  I'll try to get around to submitting these 
as well if they seem useful :)


-Jared


Re: End of headers hook

2009-02-09 Thread Jared Johnson

And voila. Hook renamed to data_headers_end, documentation in
README.plugins with a BIG warning.


Just a thought -- perhaps the methodology for throwing away further 
input as cleverly described by Jose Martinez should be thrown around and 
tested and then documented in README.plugins together with suggestions 
for adding an appropriate transaction note and doing the real denying in 
hook_data_post?


-Jared


RE: End of headers hook

2009-02-07 Thread Jared Johnson
That sounds intriguing... it may not be very RFC-compliant to actually, say, 
interrupt a sender halfway through DATA transmission to tell them that you're 
rejecting based on headers, though.

I can't say that has always stopped me...

-Jared

From: Karl Y. Pradene [k...@ailleurs.org]
Sent: Saturday, February 07, 2009 8:25 AM
To: qpsmtpd@perl.org
Subject: End of headers hook

Hi,
is it possible add a hook at the end of headers?

--
Karl Y. Pradene


Re: [PATCH] config() method for Qpsmtpd::Address objects

2009-01-11 Thread Jared Johnson

Robert Spier wrote:

What's the goal of this?  It also needs some documentation or examples.


The idea is to allow plugins to do something like:

for my $rcpt ($transaction->recipients) {
do_stuff() if $spam_score > $rcpt->config('spam_threshold');
}

IOW, per-user prefs.  This is the first, obvious requisite to this sort 
of handling.


I documented config() in the POD for Qpsmtpd::Address itself; I can add 
some documentation and examples to README.plugins as well, is there any 
other location that could use documentation?


Also, I noticed some more deficiencies in this patch today, namely that 
Qpsmtpd::config() needs to pass the Qpsmtpd::Address object to 
run_hooks('rcpt_config').  Although I did some basic testing to make 
sure it was functional, I suggest holding off on application until the 
code has seen more real-world use from me and I submit a follow-up.


Thanks!

-Jared


[PATCH] Make remote_host available in hook_pre_connection

2009-01-10 Thread Jared Johnson

Just one more...

I've found it useful to have remote_host available in my pre-connection 
plugin (specifically for enforcing concurrency limits specified in a 
rule based on host patterns).  Attached is the patch I have applied to 
make this possible.


Now I'm off to bed!

-Jared
Index: qpsmtpd-forkserver
===
--- qpsmtpd-forkserver	(revision 967)
+++ qpsmtpd-forkserver	(working copy)
@@ -246,8 +246,13 @@
 # get local/remote hostname, port and ip address
 my ($port, $iaddr, $lport, $laddr, $nto_iaddr, $nto_laddr) = Qpsmtpd::TcpServer::lrpip($server, $client, $hisaddr);
 
+# set enviroment variables
+($ENV{TCPLOCALIP}, $ENV{TCPREMOTEIP}, $ENV{TCPREMOTEHOST})
+  = Qpsmtpd::TcpServer::tcpenv($nto_laddr, $nto_iaddr);
+
 my ($rc, @msg) = $qpsmtpd->run_hooks("pre-connection",
  remote_ip    => $nto_iaddr,
+ remote_host  => $ENV{TCPREMOTEHOST},
  remote_port  => $port,
  local_ip     => $nto_laddr,
  local_port   => $lport,
@@ -296,9 +301,6 @@
::log(LOGINFO, "Connection Timed Out"); 
exit; };
   
-# set enviroment variables
-($ENV{TCPLOCALIP}, $ENV{TCPREMOTEIP}, $ENV{TCPREMOTEHOST}) = Qpsmtpd::TcpServer::tcpenv($nto_laddr, $nto_iaddr);
-
 # don't do this!
#$0 = "qpsmtpd-forkserver: $ENV{TCPREMOTEIP} / $ENV{TCPREMOTEHOST}";
   
Index: qpsmtpd-prefork
===
--- qpsmtpd-prefork	(revision 967)
+++ qpsmtpd-prefork	(working copy)
@@ -647,10 +647,15 @@
 my %children;
 shmem_opt(\%children, undef, $$, $iaddr);
 
+# set enviroment variables
+($ENV{TCPLOCALIP}, $ENV{TCPREMOTEIP}, $ENV{TCPREMOTEHOST})
+  = Qpsmtpd::TcpServer::tcpenv($nto_laddr, $nto_iaddr);
+
 my ($rc, @msg) =
   $qpsmtpd->run_hooks(
   "pre-connection",
   remote_ip   => $nto_iaddr,
+  remote_host => $ENV{TCPREMOTEHOST},
   remote_port => $port,
   local_ip    => $nto_laddr,
   local_port  => $lport,
@@ -694,9 +699,6 @@
 exit;
 };
 
-# set enviroment variables
-($ENV{TCPLOCALIP}, $ENV{TCPREMOTEIP}, $ENV{TCPREMOTEHOST}) = Qpsmtpd::TcpServer::tcpenv($nto_laddr, $nto_iaddr);
-
 # run qpmsptd functions
 $SIG{__DIE__} = 'DEFAULT';
 eval {


Re: [PATCH] config() method for Qpsmtpd::Address objects

2009-01-10 Thread Jared Johnson
I noticed the previous iteration of this patch would allow cached global 
config values to override per-recipient config values.  The attached 
patch adds the following change to avoid this:


--- lib/Qpsmtpd.pm  (revision 967)
+++ lib/Qpsmtpd.pm  (working copy)
@@ -142,7 +142,7 @@
   my ($self, $c, $type) = @_;

   #my $timer = $SAMPLER->("config", undef, 1);
-  if ($_config_cache->{$c}) {
+  if ($_config_cache->{$c} and ref $type ne 'Qpsmtpd::Address') {
   return wantarray ? @{$_config_cache->{$c}} : $_config_cache->{$c}->[0];
   }



I don't know whether it would be useful or advisable to try to cache 
per-recipient config results keyed to the address -- I suspect that 
would be unsafe.  rcpt_config plugins can always do their own caching 
if they want.
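For illustration, a rcpt_config plugin doing its own caching might look like the sketch below. This is hypothetical: it assumes the follow-up change that passes the Qpsmtpd::Address object into the hook, and the 'max_size' directive name and lookup_max_size() helper are invented.

```perl
# Hypothetical rcpt_config plugin that caches its own per-recipient lookups
my %per_rcpt_cache;

sub hook_rcpt_config {
    my ($self, $transaction, $key, $rcpt) = @_;
    # Only answer for the directive this plugin knows about (invented name)
    return DECLINED unless $key eq 'max_size';
    my $addr = lc $rcpt->address;
    # lookup_max_size() is an invented stand-in for e.g. a database query
    $per_rcpt_cache{$addr} ||= [ $self->lookup_max_size($addr) ];
    return OK, @{ $per_rcpt_cache{$addr} };
}
```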


-Jared


Jared Johnson wrote:
Adding a config method for Qpsmtpd::Address objects turned out to be 
relatively straightforward.  I was reluctant to subclass Qpsmtpd, but 
this actually seems like it could be useful in the future; at any rate, 
it gets the job done pretty nicely.  This patch will only be useful for 
people who define 'rcpt_config' plugins, but if anyone shares my 
sentiments then there will be folks chomping at the bit like I am to 
write these.  Feedback is welcome...


-Jared
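To make the intended usage concrete, with this patch a rcpt plugin could do a per-recipient lookup along these lines. Again a hypothetical sketch: 'spam_threshold' is an invented directive name, not anything in the qpsmtpd distribution.

```perl
sub hook_rcpt {
    my ($self, $transaction, $recipient) = @_;
    # Because $recipient is a Qpsmtpd::Address, this dispatches to any
    # rcpt_config plugins rather than the ordinary config hook
    my ($threshold) = $recipient->config('spam_threshold');
    $self->log(LOGDEBUG, 'per-recipient threshold: '
               . (defined $threshold ? $threshold : 'none'));
    return DECLINED;
}
```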



Index: lib/Qpsmtpd.pm
===
--- lib/Qpsmtpd.pm	(revision 967)
+++ lib/Qpsmtpd.pm	(working copy)
@@ -142,7 +142,7 @@
   my ($self, $c, $type) = @_;
 
   #my $timer = $SAMPLER->("config", undef, 1);
-  if ($_config_cache->{$c}) {
+  if ($_config_cache->{$c} and ref $type ne 'Qpsmtpd::Address') {
   return wantarray ? @{$_config_cache->{$c}} : $_config_cache->{$c}->[0];
   }
 
@@ -150,7 +150,8 @@
 
   #warn "SELF->config($c) ", ref $self;
 
-  my ($rc, @config) = $self->run_hooks_no_respond("config", $c);
+  my $hook = ref $type eq 'Qpsmtpd::Address' ? 'rcpt_config' : 'config';
+  my ($rc, @config) = $self->run_hooks_no_respond($hook, $c);
   @config = () unless $rc == OK;
 
   if (wantarray) {
Index: lib/Qpsmtpd/Address.pm
===
--- lib/Qpsmtpd/Address.pm	(revision 967)
+++ lib/Qpsmtpd/Address.pm	(working copy)
@@ -1,6 +1,8 @@
-#!/usr/bin/perl -w
 package Qpsmtpd::Address;
 use strict;
+use warnings;
+use Qpsmtpd;
+our @ISA = qw(Qpsmtpd);
 
 =head1 NAME
 
@@ -317,6 +319,18 @@
 return $self-{_host};
 }
 
+=head2 config($value)
+
+Looks up a configuration directive based on this recipient, using any plugins that utilize
+hook_rcpt_config.
+
+=cut
+
+sub config {
+my ($self,$key) = @_;
+$self->SUPER::config($key, $self);
+}
+
 sub _addr_cmp {
 require UNIVERSAL;
 my ($left, $right, $swap) = @_;
Index: lib/Qpsmtpd/Plugin.pm
===
--- lib/Qpsmtpd/Plugin.pm	(revision 967)
+++ lib/Qpsmtpd/Plugin.pm	(working copy)
@@ -4,7 +4,7 @@
 
 # more or less in the order they will fire
 our @hooks = qw(
-logging config post-fork pre-connection connect ehlo_parse ehlo
+logging config rcpt_config post-fork pre-connection connect ehlo_parse ehlo
 helo_parse helo auth_parse auth auth-plain auth-login auth-cram-md5
 rcpt_parse rcpt_pre rcpt mail_parse mail mail_pre 
 data data_post queue_pre queue queue_post vrfy noop

