"alan r" <[EMAIL PROTECTED]> writes:
> I am trying to create a filter which matches From: headers that don't
> have a proper email address in them
This is difficult, perhaps even impossible. Jeffrey Friedl, in
/Mastering Regular Expressions/, gives a regular expression over 5000
characters long to match an email address and it still does not match
all valid addresses.
> (or at least, are missing an '@').
This is much easier. <wink>
> So far, I have this:
>
> headers "^From\:( |[^@\s])*$" [EMAIL PROTECTED]
What you really want is for the RE match to fail if you see an '@'
after the 'From:'. You might find this RE does the same thing and is
a little simpler:
"\AFrom:[^@]*\Z"
First, anything that's not an '@' includes spaces, so you don't really
need to deal with them separately. Second, you should be aware that
TMDA uses the re.MULTILINE flag when doing RE searches. This means
that '^' and '$' also match at the start and end of each line (just
after and just before a newline), not only at the beginning and end of
the whole string. A From field that looked like this:
From:\n [EMAIL PROTECTED]
would be (incorrectly) matched by your RE, since the search stops at
the end of the first line ('From:') because of the '$' in your RE.
This field, albeit weird, is still a valid From field. In the actual
message it would look like this, with the address on an indented
continuation line:
From:
 [EMAIL PROTECTED]
The '\A' and '\Z' escape codes I suggest above match only at the
beginning and end of the whole string, even in multiline mode. Thus
the '@' will be found, the search will fail and the mail will not be
dropped.
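The difference can be checked directly in Python. This is only a
sketch: it assumes the pattern is searched with re.MULTILINE against
the string 'From:' plus the field value, as described above, and
'user@example.com' is a placeholder address, not one taken from a real
message. The matches() helper is made up for the illustration.

```python
import re

# Assumption: both patterns are searched with re.MULTILINE set.
ORIGINAL = re.compile(r"^From\:( |[^@\s])*$", re.MULTILINE)
ANCHORED = re.compile(r"\AFrom:[^@]*\Z", re.MULTILINE)

# A folded (but valid) From field; the address is a placeholder.
folded = "From:\n user@example.com"
# A From field that really has no '@' in it.
no_at = "From: joe"

def matches(pattern, header):
    """Return True if the pattern is found anywhere in the header text."""
    return pattern.search(header) is not None

for name, pattern in (("original", ORIGINAL), ("anchored", ANCHORED)):
    print(name, "folded:", matches(pattern, folded),
          "no_at:", matches(pattern, no_at))
```

The original pattern matches the folded field because '$' is satisfied
at the end of the first line; the anchored one does not, while both
still match the field with no '@' at all.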
> This would be simpler if one could exclude end-of-line inside
> brackets which I don't think is possible. Instead this expression
> just excludes all whitespace characters except for space.
I'm not sure why you want to (not) match a newline -- it seems to
make the RE more complex without any real gain, but you *can* match a
newline inside brackets by specifying '\n'.
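For illustration, here is a small sketch of a bracket expression that
excludes both '@' and newline. The pattern name and the sample strings
are made up for the example, and 'user@example.com' is a placeholder.

```python
import re

# [^@\n] means: any single character except '@' or a newline.
SINGLE_LINE = re.compile(r"\AFrom:[^@\n]*\Z")

samples = [
    "From: joe",               # no '@', one line
    "From:\n joe",             # no '@', but folded onto a second line
    "From: user@example.com",  # has an '@'
]

for text in samples:
    found = SINGLE_LINE.search(text) is not None
    print(repr(text), "->", found)
```

Note that excluding the newline also stops the pattern from matching a
folded field that has no '@' in it, which is probably not what you
want here.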
> When this works I plan to change the action to drop. I haven't
> had any hits on this rule yet so I'm wondering if it has a bug in
> it.
One reason you might not see many matches (other than '<>') is that
some MTAs assume that an address without a domain is local and they
will append the local domain onto unqualified addresses. That is, if
they see a From field with just 'joe' in it, they'll actually rewrite
the From field as '[EMAIL PROTECTED]' before delivering it.
qmail doesn't do this and I believe the behavior is configurable with
Postfix (although I could be mis-remembering). I don't know about
Sendmail or Exim.
Tim