Hi all,
I'm afraid I found a bug in the code that detects "spam-...@domain"
addresses in order to trigger retraining on false positives/negatives.
Remember I had been struggling with this for days? Dspam was never
considering my forwards to spam-cyri...@xxx; it just did nothing, thus
never "learning", thus never finding spam...
I found that, in agent_shared.c :
function : int process_parseto(AGENT_CTX *ATX, const char *buf)
-------------------------
(...)
if (!buf)
return EINVAL;
h = strstr(buf, "\r\n\r\n");
if (!h) h = strstr(buf, "\n\n"); [1]
x = strstr(buf, "<spam-");
(...)
if (x > h) x = NULL; [2]
---------------------------
For this last line [2], I understand that the goal is to ignore any
"spam-" or similar, if it is "after" a double CRLF ("empty line"), which
is probably intended to encode the headers/body boundary.
I am not sure whether this is useful, as this function is called with a
1-line text buffer (fetched line by line in the caller). But that's just
for safety, no big deal.
But, due to line [1] above, if the buffer does not contain any
double-CRLF (or double-LF), h is still NULL when x > h is evaluated on
line [2], as that comparison is not guarded. For any non-NULL x,
x > NULL will in practice be true, so x is reset to NULL and we never
conclude that a spam-xxx address was found.
I changed line [2] to
if (h && x > h) x = NULL;
And that works.
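
To make the difference concrete, here is a minimal standalone sketch of
the guarded check. It is not the actual DSPAM code: find_spam_tag() is a
hypothetical helper I made up for illustration, it only looks for
"<spam-" (the real function does more in the elided parts), and the
extra "x &&" test is my own addition so that a NULL pointer is never
used in a relational comparison.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (NOT part of DSPAM): returns a pointer to the
 * first "<spam-" in buf, or NULL if there is none or if it only
 * appears after the header/body boundary (an empty line).
 */
static const char *find_spam_tag(const char *buf)
{
    const char *h, *x;

    if (!buf)
        return NULL;

    /* Locate the header/body boundary, if the buffer contains one. */
    h = strstr(buf, "\r\n\r\n");
    if (!h)
        h = strstr(buf, "\n\n");

    x = strstr(buf, "<spam-");

    /*
     * Discard the match only when a boundary was actually found and
     * the match lies after it. Without the "h &&" guard, a buffer
     * with no empty line has h == NULL, and any non-NULL x would be
     * thrown away.
     */
    if (h && x && x > h)
        x = NULL;

    return x;
}
```

With a single header line such as "To: <spam-user@example.org>" (an
invented address), there is no empty line, so h stays NULL and the match
is kept. If the same text appears only after "\n\n", it is ignored.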
I am not sure in which context this could have worked before, but I
guess this patch cannot hurt, even if one day buf contains empty lines.
Hope that helps, it did for me ;-) Tell me if I am wrong about something...
Regards,
Cyril'