Performance of these filters is indeed something to pay attention to. I have added what I call "inline content filtering" to my own version of xmail, and it is very successful as long as it runs quickly.

When loading the filter expressions at runtime, I separate them into those that contain embedded wildcards and those that are simple string searches (e.g., *enlarge your* is simple, *v[1i]gra* is not). xmail's wildcard matching routine is particularly suboptimal where there are one or more *'s inside the expression (e.g., abc*def doesn't degrade into beginning-of-string and end-of-string tests). Making this distinction (and trying to use expressions of the "simple" variety) improves speed greatly.

Things to note:

- The filter must decode base64. Happily, this does not have to be done precisely - it can be attempted for all lines after 'base64' is seen in a header line, for example. My code sometimes attempts base64 decoding when it is not really necessary, but that is cheaper than analyzing the message structure in detail.
- Many undesirable messages require comparison of multiple data lines due to line breaks. This means that as each line is presented to the "inline filter", it goes through some gyrating:
-- whitespace is standardized, and leading and trailing whitespace is removed
-- the new line is appended to a "sliding-window"-type buffer. If the data was text, a space is inserted before the new line. If it was base64-decoded data, the new data is appended without the space. Once the buffer contains a configurable number of lines, excess lines are removed from the front of the buffer. This means you need to keep track of the lines in a way that permits retrieval of a pointer to the entire multiline buffer AND keeps track of where each line ends so that data can drop off the front of the buffer.
   Actually, I keep two of these buffers going: one contains the message as received, and the other contains the message lines after base64 decoding, if any data have been successfully decoded.
- For maximum benefit, comparisons must be performed against the multiline data multiple times:
-- downshifted with whitespace standardized, then
-- with html comments removed (leaving html tags, so that hrefs etc. can be caught), then
-- with all html tags removed
- If you debug to a file, keep it open until all data have been presented; using xmail's built-in logging routines - which open logfiles as needed - is a big performance hit.
- Extremely large messages can take an inordinate amount of time to filter this way. I have a configurable upper bound, and once that many lines of data have been seen, filter comparisons are no longer performed. This is a hack, but so far anyway, if a message hasn't violated anything in the first, say, 10K lines, it probably isn't going to.

The "hook" for all of this is in SMTPHandleCmd_DATA(), which:

- creates the content filter object, passing it a cached copy of the filter expression list. I also pass in the IP of the sender so that local senders can be exempted
- performs its normal data input loop, passing each line to the content filter object as it is received. If the filter object indicates that the message is going to be rejected, incoming data are no longer written to the message file, but the loop continues until the end of data is received. (This can mitigate attempts to fill up server disk space.)
- retrieves the final result from the object (permitted/denied) after all data have been received
- destroys the content filter object

This was all a fair amount of work, but I use this instead of the built-in filter capability because I don't like handling potentially large messages over and over again, and I don't like launching an external program to do the filtering.
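
For anyone who wants to experiment with the idea outside xmail, here is a rough Python sketch of the scheme described above. It is not my xmail code. Every name in it (split_expressions, SlidingWindow, InlineFilter, receive) is made up for illustration, Python's fnmatch stands in for xmail's wildcard routine, the sender-IP exemption is left out, and the base64 detection is as crude as described: once 'base64' has been seen, every later line is tentatively decoded.

```python
import base64
import fnmatch
import re
from collections import deque

def split_expressions(expressions):
    """Split into plain '*literal*' substring tests and true wildcard patterns."""
    simple, wild = [], []
    for expr in expressions:
        expr = expr.lower()
        core = expr.strip("*")
        if core and expr[0] == "*" and expr[-1] == "*" and not set(core) & set("*?[]"):
            simple.append(core)      # cheap substring search
        else:
            wild.append(expr)        # needs real wildcard matching
    return simple, wild

class SlidingWindow:
    """Keeps the last max_lines pieces; older pieces drop off the front."""
    def __init__(self, max_lines):
        self.pieces = deque(maxlen=max_lines)

    def add(self, joiner, text):
        self.pieces.append((joiner, text))

    def text(self):
        return "".join(joiner + text for joiner, text in self.pieces)

def views(text):
    """Downshifted text, then without html comments, then without any tags."""
    lowered = " ".join(text.lower().split())
    no_comments = re.sub(r"<!--.*?-->", "", lowered)
    no_tags = re.sub(r"<[^>]*>", "", no_comments)
    return lowered, no_comments, no_tags

def violates(text, simple, wild):
    for view in views(text):
        if any(s in view for s in simple):
            return True
        if any(fnmatch.fnmatchcase(view, w) for w in wild):
            return True
    return False

class InlineFilter:
    def __init__(self, expressions, window_lines=5, max_lines=10000):
        self.simple, self.wild = split_expressions(expressions)
        self.raw = SlidingWindow(window_lines)      # message as received
        self.decoded = SlidingWindow(window_lines)  # base64-decoded data
        self.max_lines = max_lines
        self.seen = 0
        self.in_base64 = False
        self.rejected = False

    def feed(self, line):
        """Present one received line; returns True once the message is rejected."""
        self.seen += 1
        if self.rejected or self.seen > self.max_lines:
            return self.rejected                    # past the cap: stop comparing
        line = " ".join(line.split())               # standardize whitespace
        self.raw.add(" ", line)                     # text lines are joined by a space
        windows = [self.raw]
        if self.in_base64 and line:
            try:
                piece = base64.b64decode(line, validate=True).decode("latin-1")
            except ValueError:                      # not base64 after all
                piece = None
            if piece:
                self.decoded.add("", piece)         # decoded data: no space
                windows.append(self.decoded)
        elif "base64" in line.lower():
            self.in_base64 = True                   # crude: never reset
        self.rejected = any(violates(w.text(), self.simple, self.wild) for w in windows)
        return self.rejected

def receive(lines, flt):
    """Stand-in for the DATA loop: stop storing once rejected, keep reading."""
    stored = []
    for line in lines:
        if not flt.feed(line):
            stored.append(line)
    return stored, not flt.rejected
```

receive() returns the lines that would have been written to the message file plus the permitted/denied result. The real loop in SMTPHandleCmd_DATA() also has to deal with the end-of-data marker and the SMTP reply, which this sketch ignores.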

---
[EMAIL PROTECTED]

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Harald Schneider
Sent: Wednesday 27 August 2003 8:38 AM
To: [EMAIL PROTECTED]
Subject: [xmail] AW: Re: How to reject message in SMTP transaction

I also would appreciate a scripting hook on SMTP level. This would open up a lot of exciting possibilities. E.g. limiting the number of connections by sending 'Too busy, try later', eventually message size checking etc.

--Harald

> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Michal Altair Valasek
> Sent: Wednesday, 27 August 2003 15:10
> To: [EMAIL PROTECTED]
> Subject: [xmail] Re: How to reject message in SMTP transaction
>
> |Yes would be great function, if you think of Sobig and that
> |'virus detected' messages.
>
> Yes, I am. Being spammed by them, I am writing new version of my FLAVS
> filter. I cannot silently discard messages (because I am not sure if its
> virus or not, I am checking only content-type and file names), but I
> want to cause as less problems as possible.
>
> | And i think the SMTP protocol allows it that the
> |server says 'no'
> |to the mail after the data command and the '.' - am i right?
>
> Yes, you are - this is the place where SMTP server should handle things
> like "message too big" or "mailbox full". The main problem is that you
> must keep open TCP connection, while filter(s) are executing, which
> may - for long-time running filters and systems under heavy load -
> cause problems. But I think that it should be a option.
>
> Now we can see, that antivirus and such systems can cause the same,
> maybe even bigger problems as the virus itself: it's easy to handle
> viruses, not thousands of technically legitimate NDR's they cause. So I
> am trying to write and configure everything in my jurisdiction to behave
> well and not cause unnecessary problems and load.
>
> -
> To unsubscribe from this list: send the line "unsubscribe xmail" in
> the body of a message to [EMAIL PROTECTED]
> For general help: send the line "help" in the body of a message to
> [EMAIL PROTECTED]