https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6152
--- Comment #8 from Steve Freegard <[email protected]> 2009-07-10 15:17:23 PST ---
(In reply to comment #7)
> Unless negative look-ahead assertions do have a significant performance
> impact, we could even do it the other way round and actually define what we
> consider to be a sane offset. Like this.
>
>   /[-+](?!(?:0\d|1[0-4])(?:[03]0|[14]5))\d{4}$/
>
> This looks out for a four digit offset, that does not match the sane offsets
> defined in the leading (?! ) part. Probably better comprehensible, apart from
> the reversed logic. ;)
>
> This one is my proposal.

Thanks for all the feedback. The version above is much easier to read.

My only comment would be to remove the $ anchor, as the offset isn't always at
the end of the Date header. Consider these examples that I just pulled out of
my spamtrap mailbox:

  Date: Tue, 12 May 2009 07:30:19 -0700 (PDT)
  Date: Thu, 14 May 2009 18:48:45 +0000 (GMT+00:00)

Although I guess you could instead handle these cases by changing the end of
the regexp to:

  \d{4}(?:\s\(\S+\))?$/

> Being slightly more anal, cutting off at +1400, not allowing 14-odd fractions
> either, would just bloat the RE and isn't worth it IMHO. Same for
> differentiating further between positive and negative possible offsets.

Yeah; definitely agree - your proposed regexp is good enough and considerably
better than the current rule without adding unnecessary bloat.
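
For anyone who wants to try the two variants from this comment against sample headers, here is a minimal sketch. It uses Python's re module rather than Perl purely for illustration; the constructs involved (negative look-ahead, non-capturing groups, \d, \s, \S) behave the same way in both for these inputs. The names ORIGINAL, EXTENDED and has_bad_offset are made up for this example and are not part of SpamAssassin.

```python
import re

# Proposal from comment #7: flag a four digit offset at the very end of the
# header value unless it looks sane (hours 00-14, minutes 00/15/30/45).
ORIGINAL = re.compile(r"[-+](?!(?:0\d|1[0-4])(?:[03]0|[14]5))\d{4}$")

# Same pattern with the suggested ending, which tolerates one trailing
# parenthesised comment such as " (PDT)".
EXTENDED = re.compile(
    r"[-+](?!(?:0\d|1[0-4])(?:[03]0|[14]5))\d{4}(?:\s\(\S+\))?$"
)


def has_bad_offset(pattern, date_value):
    """Return True if the pattern flags the Date header value."""
    return pattern.search(date_value) is not None


if __name__ == "__main__":
    samples = [
        "Tue, 12 May 2009 07:30:19 -0700 (PDT)",        # sane, with comment
        "Thu, 14 May 2009 18:48:45 +0000 (GMT+00:00)",  # sane, with comment
        "Tue, 12 May 2009 07:30:19 +0530",              # sane, no comment
        "Tue, 12 May 2009 07:30:19 +2500",              # hour out of range
        "Tue, 12 May 2009 07:30:19 -1900 (XYZ)",        # bad, with comment
    ]
    for value in samples:
        print(has_bad_offset(ORIGINAL, value),
              has_bad_offset(EXTENDED, value),
              value)
```

The last sample is the interesting one: with the bare $ anchor the bogus -1900 goes unnoticed because the header ends in a comment, while the extended ending catches it. Both variants leave the two spamtrap examples alone.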
