Group, I tried a few additions to the rule I posted last week. I was pleased to see that the performance hit was nearly undetectable compared to the days just before the rule went in. The comment block near the bottom of the code explains what was tried and how it differs from last week's version of the function.
In addition, I've added another space-compression line and re-ordered some of the steps performed on $lc_subject at the top of the function, to better prepare $lc_subject for comparison against the database records. This is surprisingly fast for my database of 2200 records: MD reports a total "Filter time is" of about 1 to 4 seconds per email, depending upon the email and what needs to be done to it (including launching an external virus scanner). Compared to last week's times, prior to the rule, I can't really see a clear time penalty.

The database contains records that look like this...

    free.stuff                    REJECT
    home.loans                    REJECT
    best.business.you.can.find    REJECT

And so on. The function is called from filter_begin like this...

    if (lookup_subject()) {
        action_bounce("Access denied. Subject \"$Subject\" suggests MSG may contain SPAM/WORM/VIRUS/HOAX.", "553", "5.7.1");
        return action_discard();
    }

And here is the completed function in its entirety. See the comments within. It's working quite well on my two servers.

    #############################
    # Search the subject-line database for subject lines/keywords to block
    #############################
    $DBFilenameSUBS = "/etc/mail/subjects.db";

    sub lookup_subject() {
        # convert incoming subject to lower-case
        my $lc_subject = lc($Subject);
        my $subject_result = 0;
        my %GDB;

        if (tie(%GDB, 'DB_File', $DBFilenameSUBS, O_RDONLY)) {
            # collapse runs of white space in the middle so that
            # "free   s t u f f   here" becomes "free s t u f f here"
            $lc_subject =~ s/(\s)\s+/$1/g;

            # next 2 lines collapse "free s t u f f here" into "free stuff here"
            $lc_subject =~ s!((^|\s)\S\s(\S(\s|$)){2,})!
                my $lc_subject_x=$1;$lc_subject_x=~s/\s//g;sprintf "%s"," $lc_subject_x ";!ego;

            $lc_subject =~ s/^\s+//;        # Trim leading whitespace
            $lc_subject =~ s/\s+$//;        # Trim trailing whitespace
            # The \s* keeps "re: foo" from turning into ".foo" further down
            $lc_subject =~ s/^re:\s*//;     # Trim leading "re:"
            $lc_subject =~ s/^fw:\s*//;     # Trim leading "fw:"
            $lc_subject =~ s/^fwd:\s*//;    # Trim leading "fwd:"
            $lc_subject =~ s/\s+/./g;       # Collapse whitespace into periods

            # Scan database for a complete match (only)
            if ($GDB{$lc_subject}) {
                $subject_result = 1;
                md_graphdefang_log("Subject_Line", "Subject-line found in subjects.db");
            } else {
                # See if any one word in the subject appears as a record
                my @subject_array = split (/\./, $lc_subject);
                foreach my $subject_word (@subject_array) {
                    if ($GDB{$subject_word}) {
                        $subject_result = 1;
                        md_graphdefang_log("Subject_Word", "Subject-word \"$subject_word\" found in subjects.db");
                        last;
                    }
                }
            }

            if (!$subject_result) {
                # here we reverse the logic... see if any record in the database
                # is found as a substring in the subject. if a record contains
                # "free.stuff" and the subject says "get your free stuff here",
                # then flag it as a hit.
                my $subject_record;
                foreach $subject_record (keys %GDB) {
                    if ($lc_subject =~ m/(^|\.)\Q$subject_record\E($|\.)/) {
                        $subject_result = 1;
                        md_graphdefang_log("Subject_Substring", "Subject-substring \"$subject_record\" found in subject line");
                        last;
                    }
                }
            }
            untie %GDB;
        } else {
            md_syslog('warning', "subject: Cannot open file $DBFilenameSUBS");
        }
        return $subject_result;
    }
    #############################

_______________________________________________
Visit http://www.mimedefang.org and http://www.roaringpenguin.com
MIMEDefang mailing list
MIMEDefang@lists.roaringpenguin.com
http://lists.roaringpenguin.com/mailman/listinfo/mimedefang
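
For anyone who wants to experiment with the matching logic outside MIMEDefang, the following is a rough standalone sketch, not the filter itself. It assumes an ordinary in-memory hash in place of subjects.db. The helper names normalize_subject() and subject_matches() are hypothetical, the "lottery" record is made up for illustration, and the three prefix trims are folded into one regex for brevity.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: the same normalization steps as lookup_subject(),
# pulled out so they can run without MIMEDefang or a DB file.
sub normalize_subject {
    my ($subject) = @_;
    my $s = lc $subject;
    $s =~ s/(\s)\s+/$1/g;               # collapse runs of whitespace
    # join spaced-out letters: " s t u f f " becomes " stuff "
    $s =~ s!((^|\s)\S\s(\S(\s|$)){2,})!
        my $x = $1; $x =~ s/\s//g; " $x ";!eg;
    $s =~ s/^\s+//;                     # trim leading whitespace
    $s =~ s/\s+$//;                     # trim trailing whitespace
    $s =~ s/^(?:re|fwd?):\s*//;         # drop one leading re:/fw:/fwd:
    $s =~ s/\s+/./g;                    # whitespace becomes periods
    return $s;
}

# Hypothetical helper: the three lookups in the same order as the function
# (whole subject, single word, record as a substring). Returns a label
# describing the hit, or an empty string when nothing matches.
sub subject_matches {
    my ($key, $db) = @_;
    return "exact" if $db->{$key};
    for my $word (split /\./, $key) {
        return "word:$word" if $db->{$word};
    }
    for my $record (sort keys %$db) {
        return "substring:$record"
            if $key =~ m/(?:^|\.)\Q$record\E(?:\.|\z)/;
    }
    return '';
}

# Stand-in for subjects.db; "lottery" is an invented sample record.
my %db = (
    'free.stuff'                 => 'REJECT',
    'home.loans'                 => 'REJECT',
    'best.business.you.can.find' => 'REJECT',
    'lottery'                    => 'REJECT',
);

for my $subject ('Re: Home Loans', 'Get your FREE   s t u f f here') {
    my $key = normalize_subject($subject);
    my $hit = subject_matches($key, \%db) || 'no match';
    print "$subject => $key ($hit)\n";
}
```

Running subjects through this before adding a record to the real database is one way to check which of the three lookups a given subject would trip.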