https://issues.apache.org/SpamAssassin/show_bug.cgi?id=5185
--- Comment #26 from Richard van der Hoff <bugzi...@rvanderhoff.org.uk> 2012-02-26 21:46:18 UTC ---

A few thoughts on this from me:

(In reply to comment #10)
> By the way, comment 3 and comment 4 both suggest this will only affect
> messages with no Received header. I'm pretty sure that's not the case.

This, on further inspection, is a lie. We use the earliest Received header, so the MTA's frobbing of the time on the last Received header only matters if there were no other Received headers. I probably looked at a message with no Received headers when I reported this, so I missed the CRLF vs LF issue. I still think both issues need addressing, however.

Anyway:

(In reply to comment #20)
> There needs to be some part of msgid that isn't under the control of spammers,
> otherwise it's trivial for them to prevent their spam ever being learned. They
> can generate as many spams with the same msgid as they like, and they can prime
> the database with an initial dummy high-scoring spam that has no usable tokens
> in common with the rest.

Given that the earliest Received header is most certainly under the control of the spammers, I certainly don't think we've made anything worse in this regard, and whilst what we have now might not be perfect, I think calls to put it back as it was are overstating matters.

Perhaps I'm being dense, but I don't really see how the spammers can use this to their advantage. Is preventing your spams being learnt really that useful?

(In reply to comment #25)
> I feel we need to aim for a solution that works for everyone as the goal before
> we add yet another configuration option.

Agreed. Flexibility is all well and good, but having millions of configuration options makes it really hard for people to get a piece of software working as it should.

(In reply to comment #23)
> I think if we can get a msg_id that is more unique to the message sans the
> transport path, it could IMPROVE bayes use.

Whilst that's true, I have another suggestion.
At the end of the day, we're just trying to uniquely identify a particular message on our server, right? Even if I get two copies of a spam, I can learn them as spam separately, I just want to prevent re-learning each one on subsequent folder scans etc. So how about trying to extract the local message-id from the most recent Received header, rather than all this messing about with checksums etc?
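To make the suggestion concrete, here is a rough sketch in Python (SpamAssassin itself is Perl, and this is not its code). It assumes the local MTA adds the topmost Received header and that the header carries an "id <token>" clause, which is common but not guaranteed. The function name local_queue_id and the regular expression are hypothetical, chosen only for illustration.

```python
import email
import re

# Assumption: the local MTA writes "... with ESMTP id <QUEUE-ID> ..." in the
# Received header it adds. Real MTAs vary, so this pattern is illustrative.
_ID_CLAUSE = re.compile(r"\bid\s+([A-Za-z0-9][A-Za-z0-9._-]*)")


def local_queue_id(raw_message):
    """Return the id token from the most recent Received header, or None."""
    msg = email.message_from_string(raw_message)
    received = msg.get_all("Received") or []
    if not received:
        return None
    # Headers are prepended in transit, so the first one is the most recent.
    newest = " ".join(received[0].split())  # unfold continuation lines
    # Drop the trailing "; <date>" part so only the clauses are searched.
    clauses = newest.rsplit(";", 1)[0]
    match = _ID_CLAUSE.search(clauses)
    return match.group(1) if match else None
```

A message with no Received header, or whose local MTA records no id clause, yields None, so a fallback identifier would still be needed in those cases.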