[Dspam-user] retrain via dspam-xxx@

Cyril' Tue, 11 May 2010 12:16:00 -0700

Hi all,

I'm afraid I found what looks like a bug in the code that detects
"spam-...@domain" addresses, which are used to trigger retraining on
false positives/negatives.


Remember I had been struggling with this for days? Dspam never acted on
my forwards to spam-cyri...@xxx; it just did nothing, so it never
"learned" and never found any spam...

I found this in agent_shared.c, in the function
int process_parseto(AGENT_CTX *ATX, const char *buf):
-------------------------
  (...)
  if (!buf)
    return EINVAL;
  h = strstr(buf, "\r\n\r\n");
  if (!h) h = strstr(buf, "\n\n");        [1]
  x = strstr(buf, "<spam-");
  (...)
  if (x > h) x = NULL;                 [2]
---------------------------

As I understand it, the goal of line [2] is to ignore any "<spam-" (or
similar) match that comes after a double CRLF (an empty line), which
presumably marks the headers/body boundary.
I am not sure this check is useful, since the function is called with a
single-line buffer (the caller fetches the message line by line). But it
is just a safety check, no big deal.

But because of line [1], if the buffer contains neither a double CRLF
nor a double LF, h is still NULL when line [2] evaluates x > h, and
nothing guards against that. Whenever x is non-NULL, x > NULL will in
practice be true, so x is reset to NULL and we never conclude that a
spam-xxx address was found.

I changed line [2] to:
  if (h && x > h) x = NULL;

And that works.
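
For anyone who wants to reproduce this outside of DSPAM, here is a
minimal standalone sketch of the two variants. It is not the DSPAM code:
find_tag_unguarded() and find_tag_guarded() are hypothetical helpers that
only mimic lines [1] and [2], and the unguarded one casts the pointers to
uintptr_t so that the comparison against a NULL h is well defined in the
demo (the snippet quoted above compares the raw pointers).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not DSPAM code): mimics the original check [2].
 * Returns a pointer to "<spam-" in buf, or NULL if it was discarded. */
static const char *find_tag_unguarded(const char *buf)
{
    const char *h = strstr(buf, "\r\n\r\n");
    if (!h) h = strstr(buf, "\n\n");      /* [1]: h may still be NULL */
    const char *x = strstr(buf, "<spam-");
    /* [2]: with h == NULL, any non-NULL x compares greater and is lost */
    if ((uintptr_t)x > (uintptr_t)h) x = NULL;
    return x;
}

/* Hypothetical helper (not DSPAM code): same logic with the proposed
 * guard, so the boundary test only runs when a boundary was found. */
static const char *find_tag_guarded(const char *buf)
{
    const char *h = strstr(buf, "\r\n\r\n");
    if (!h) h = strstr(buf, "\n\n");
    const char *x = strstr(buf, "<spam-");
    if (h && x > h) x = NULL;             /* h and x both point into buf */
    return x;
}
```

Called on a single header line such as "To: <spam-user@example.org>\r\n"
(a made-up address), the unguarded helper returns NULL while the guarded
one returns a pointer to the "<spam-" match. If the match only appears
after an empty line, the guarded helper still discards it.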

I am not sure in which context this could have worked before, but I
guess this patch cannot hurt, even if buf one day contains empty lines.

Hope that helps, it did for me ;-) Tell me if I got something wrong...

Regards,
Cyril'
