https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6549
Adam Katz <[email protected]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |[email protected]

--- Comment #2 from Adam Katz <[email protected]> 2011-06-28 02:03:18 UTC ---
Let's look at the reason we "fixed" bug 3236: If the Squirrelmail server is in the trust path, the webmail client's IP becomes the last-external "server" and can therefore hit DNSBLs and dynamic detectors (although dynamic detectors should be safe due to their dependence on __LAST_EXTERNAL_RELAY_NO_AUTH, assuming our Squirrelmail parser correctly recognizes the authentication note). We fixed that with a workaround which is now coming back to bite us.

This goes right to the issue of whether or not an authenticated last-external relay should live in the trust path. The relevant scenario is:

    trusted relay -> trusted webmail server -> client

where the IP of the client (the end-user) can be trusted, clean, or dirty (DNSBL, dynamic detection, etc). When trusted, ALL_TRUSTED fires and there is no problem. When clean, there are similarly no problems. When dirty, we either have a spammer or a legitimate user at e.g. a hotel. (The other scenario, in which the trust path ends earlier, affects rules that examine beyond the last-external/last-untrusted relay, but those rules are written with this in mind and therefore are not problems.)

Without parsing Squirrelmail's inserted Received header, we can't trace the sender. This gives dirty IPs a free pass, enabling email from the hotel. The fix to bug 3236 took this final header out of the equation so that HAM sent from a dirty client IP could still go out. However, it opened the door for SPAM to benefit in the same way, enabling Nigerian scams.

There are four approaches that I see.

1. Do not parse the header (keep things as they are)
2. Parse the header (put things back the way they were)
3. Parse the header and extend trust to a so-called authenticated client
4. Require that most last-external rules match __LAST_EXTERNAL_RELAY_NO_AUTH

In order to understand what to do, we need to look at more than just Squirrelmail and Horde (which appears to do this too, given bug 3236 comment 3 as exemplified in attachment 2511). Let's look at the larger webmail providers:

Hotmail uses the X-Originating-IP header rather than forging a Received header. I've also seen X-WebmailclientIP used for this purpose.

GMail does not log the webmail client's IP anywhere in the email.

Yahoo works like Squirrelmail, adding a Received header containing "via HTTP" like this:

    Received: from [192.0.2.169] by web33701.mail.mud.yahoo.com via HTTP; Thu, 12 May 2011 10:38:16 PDT

We currently parse this normally, so I would argue that we should similarly parse the Squirrelmail and Horde headers. The difference is that nobody is going to add Yahoo's webmail servers to their trust path (or at least, nobody should!), so the issue never comes up there, whereas it isn't uncommon for a company to add its webmail servers to its own internal networks list, which is where the third and fourth approaches may be of use.

Note that the fourth approach is already partially in effect; a large number of rules (e.g. all dynamic host detection rules) already depend on __LAST_EXTERNAL_RELAY_NO_AUTH.

What do the X-Spam-Relays-External pseudoheaders of the FPs from before the bug 3236 fix look like? Do they recognize the authentication? What are the actual FP'ing rules?

To further the __LAST_EXTERNAL_RELAY_NO_AUTH workaround (rather than altering the trust path), we could merely add it to meta rules that wrap around the RCVD_IN_* rules, mimicking things like RDNS_DYNAMIC. This has the side effect of nuking some of the helpful text provided by DNSBL hits...
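
To make the meta-rule wrapper idea concrete, here is a rough sketch of what such a rule could look like. Everything named EXAMPLE is hypothetical: the zone dnsbl.example.org, the set name, the rule names, and the score are placeholders and not existing rules. It assumes __LAST_EXTERNAL_RELAY_NO_AUTH is already defined by the stock ruleset, as it is for the dynamic host rules.

```
# Hypothetical DNSBL lookup against the last external relay only.
# The double-underscore prefix makes it an unscored sub-rule.
header   __RCVD_IN_EXAMPLE       eval:check_rbl('example-lastexternal', 'dnsbl.example.org.')
tflags   __RCVD_IN_EXAMPLE       net

# Scored wrapper: fires only when the last external relay is listed
# AND the relay did not authenticate.
meta     RCVD_IN_EXAMPLE_NOAUTH  __RCVD_IN_EXAMPLE && __LAST_EXTERNAL_RELAY_NO_AUTH
tflags   RCVD_IN_EXAMPLE_NOAUTH  net
describe RCVD_IN_EXAMPLE_NOAUTH  Unauthenticated last-external relay is in example DNSBL
score    RCVD_IN_EXAMPLE_NOAUTH  1.0   # placeholder score
```

With a wrapper like this, an authenticated webmail user on a listed IP would still trigger the sub-rule but would not pick up the score. The cost is the one mentioned above: the report shows the static describe text of the meta rule rather than whatever detail the DNSBL hit itself would have provided.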
