On 7/1/20 3:52 PM, John Hardin wrote:
On Wed, 1 Jul 2020, Aner Perez wrote:
I opened a bug (7832) about this but was told to report on the SA users mailing list
instead.
The attached email is an example which triggers the HK_SCAM rule. Looks like
__HK_SCAM_S7 is the culprit here since it matches the words "business" and "enterprise"
when they are found one after the other (even on different lines).
In the real world this was triggered by a business email that had the following in the
signature:
FirstName LastName
Altice Business
Enterprise Account Executive
What was the *overall* score of that message? Was this rule enough to push the message
over the spam threshold (5 points)? Or was the message still scored as ham?
In our case it was marked as spam but only because we have the spam threshold set very low
(2.4). The message scored a 3.357 when the BAYES_50 was added in.
It looks to me like the logic in __HK_SCAM_S7 is a little off...
/(?:(?:investment|proposed|lucrative) (?:business|venture)|(?:business|venture) (?:enterprise|propos(?:al|ition)))/i
seems like it should be:
/(?:(?:investment|proposed|lucrative) (?:business|venture)|(?:business|venture|enterprise) propos(?:al|ition))/i
That makes more sense but the rule still seems like it would be easily triggered by
standard business talk (e.g. business proposal). I guess that's the nature of business
emails... they're naturally spammy.
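To make the difference between the two expressions concrete, here is a minimal sketch in Python. It is not how SpamAssassin evaluates rules (those are Perl regexes run by the rule engine). The names CURRENT, PROPOSED, signature, ordinary and hits are made up for illustration, and the signature is written on one line on the assumption that body rules see line breaks as spaces.

```python
import re

# The expression as currently quoted for __HK_SCAM_S7.
CURRENT = re.compile(
    r"(?:(?:investment|proposed|lucrative) (?:business|venture)"
    r"|(?:business|venture) (?:enterprise|propos(?:al|ition)))",
    re.IGNORECASE,
)

# The alternative suggested in this thread.
PROPOSED = re.compile(
    r"(?:(?:investment|proposed|lucrative) (?:business|venture)"
    r"|(?:business|venture|enterprise) propos(?:al|ition))",
    re.IGNORECASE,
)

# Signature from the report, with line breaks replaced by spaces (assumption).
signature = "FirstName LastName Altice Business Enterprise Account Executive"
# An ordinary, non-spam sentence (hypothetical example text).
ordinary = "Please find our business proposal attached."


def hits(pattern, text):
    """Return True if the pattern matches anywhere in the text."""
    return pattern.search(text) is not None


for name, pattern in (("current", CURRENT), ("proposed", PROPOSED)):
    print(name, "signature:", hits(pattern, signature),
          "ordinary:", hits(pattern, ordinary))
```

In this sketch the proposed expression no longer fires on the signature, but both versions still fire on plain "business proposal", which is the concern raised above.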
...but I'll let Henrik comment.
Potentially, making it a rawbody rule might avoid this FP without affecting its
performance against the targeted spams...
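A rough way to see why rawbody might help, as a Python sketch under a simplifying assumption: body rules are modelled as seeing the text with line breaks collapsed to single spaces, and rawbody rules as seeing the text with its line breaks intact. The real SpamAssassin rendering does more than this, and the names S7, raw_text and as_body are hypothetical.

```python
import re

# The expression as currently quoted for __HK_SCAM_S7.
S7 = re.compile(
    r"(?:(?:investment|proposed|lucrative) (?:business|venture)"
    r"|(?:business|venture) (?:enterprise|propos(?:al|ition)))",
    re.IGNORECASE,
)

# The signature from the report, with its original line breaks.
raw_text = "FirstName LastName\nAltice Business\nEnterprise Account Executive\n"


def as_body(text):
    """Simplified stand-in for body rendering: collapse all whitespace."""
    return " ".join(text.split())


# Body-style view: "Business Enterprise" ends up separated by a space.
body_hit = S7.search(as_body(raw_text)) is not None
# Rawbody-style view: the newline is not a literal space, so no match.
rawbody_hit = S7.search(raw_text) is not None

print("body-style match:", body_hit)
print("rawbody-style match:", rawbody_hit)
```

The same property cuts both ways: under this model a spam that happens to wrap the phrase across two lines would also be missed, so the change would need checking against real spam samples.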
For future reference: sending a sample email to the list as a bare attachment is
problematic, as it may be altered en-route and thus invalidate any meaningful analysis.
It's better to attach it as a zip/gzip, or to upload it to someplace like Pastebin and
just post the URL to it here. (In this case, your description should probably be enough to
figure it out without the sample so you shouldn't need to do that unless someone
explicitly asks you to do so.)
Thanks, I'll keep that in mind.
- Aner