Sidney,

> News of an ICANN decision to allow international character
> sets in domain names was reported last week, for example

IDN and punycode have been around for a while below the TLD level, but so
far the few IDN TLDs existed only for testing. We came across it in:

  http://marc.info/?t=123928717600002

> I'm concerned that it might have a big impact on SpamAssassin's parsing
> of headers and URLs.

It is quite possible that some too-strict regexps are still lying around.
I know I fixed some in a DKIM plugin.

> However, what does this mean for detecting URLs in plain text messages
> in which a URL string can be in a non-ASCII charset and MUAs might
> (eventually) parse them as URLs?

Slippery road ahead...

Can't hurt to open a PR as a placeholder for concerns and ideas.

  Mark
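
To make the "too-strict regexp" concern concrete, here is a minimal Python
sketch. It is not SpamAssassin code: the pattern STRICT_HOST and the helpers
to_ascii() and strict_match() are hypothetical names made up for illustration,
and the conversion relies on Python's built-in "idna" codec (IDNA 2003). It
shows how an ASCII-only hostname pattern that expects a purely alphabetic TLD
might reject both a raw Unicode hostname and a punycode-encoded TLD.

```python
import re

# Hypothetical ASCII-only hostname pattern (an assumption for illustration,
# not taken from SpamAssassin). It requires a purely alphabetic TLD.
STRICT_HOST = re.compile(
    r"^(?:[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?\.)+[A-Za-z]{2,}$"
)


def to_ascii(host: str) -> str:
    """Convert a Unicode hostname to its ASCII (punycode) form.

    Uses Python's built-in "idna" codec, which implements IDNA 2003.
    """
    return host.encode("idna").decode("ascii")


def strict_match(host: str) -> bool:
    """Return True if the hostname passes the ASCII-only pattern."""
    return STRICT_HOST.match(host) is not None


if __name__ == "__main__":
    for host in (
        "example.com",            # plain ASCII hostname
        "bücher.example",         # raw Unicode label below the TLD
        to_ascii("bücher.example"),  # the same name in punycode form
        "example.xn--p1ai",       # punycode TLD (ACE form of Cyrillic .рф)
    ):
        print(f"{host!r}: strict_match={strict_match(host)}")
```

A pattern like this copes with punycode labels below the TLD, since they are
plain letters, digits and hyphens, but the alphabetic-only TLD part fails once
the TLD itself is an "xn--" label. Raw Unicode hostnames in message bodies
fail earlier still, unless they are converted to ASCII form first.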
