Warren Togami wrote, On 4/11/09 3:27 PM:
It seems clear that we will need to flatten/encode any URI domain to punycode for URIBL lookups.

I agree with that -- if something has non-ASCII characters then punycode is the canonical form to use to look it up.
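
As a rough illustration of what that canonical form looks like, here is a minimal Python sketch. The function name to_lookup_form is made up for this example, and Python's built-in "idna" codec implements the older IDNA 2003 rules, so treat it as an approximation and not as what SpamAssassin does or should do.

```python
def to_lookup_form(domain: str) -> str:
    """Return an ASCII (punycode) form of a domain name.

    Hypothetical helper for illustration only. It relies on Python's
    built-in "idna" codec, which follows IDNA 2003.
    """
    # Lowercase first so that pure-ASCII labels are normalized as well;
    # the codec passes ASCII-only labels through unchanged.
    return domain.lower().encode("idna").decode("ascii")


if __name__ == "__main__":
    print(to_lookup_form("bücher.example"))
    print(to_lookup_form("Example.COM"))
```

A domain that is already pure ASCII comes back unchanged apart from lowercasing, so the same call could be applied to every extracted domain before the lookup.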

The unclear part is whether we will need to decode URIs prior to punycode encoding. I suspect we will be forced to decode.

I'm not sure exactly what you mean, but the big issue that I see is how to determine that a string is a URL (where it starts and where it stops) that needs to be encoded to punycode. Is that what you are talking about? The rule of thumb that I used when working on code to extract URLs from plain text is that if some common MUA hot-links it, then we want to treat it as a URL. Perhaps the answer is to wait until MUAs support these URLs and then follow that rule of thumb.
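
If "decode" in the quoted question means undoing percent-encoding in the host part before converting to punycode (only one possible reading, and an assumption on my part), the order of operations might look like the sketch below. The name host_for_lookup is hypothetical, and the input is assumed to be a host that has already been extracted from the message, which sidesteps the harder question of finding where the URL starts and stops.

```python
from urllib.parse import unquote


def host_for_lookup(raw_host: str) -> str:
    """Percent-decode a host, then convert it to its punycode form.

    Hypothetical helper: assumes raw_host is already isolated from the
    rest of the URL and that any percent-escapes encode UTF-8 bytes.
    """
    # Step 1: undo percent-encoding (unquote decodes as UTF-8 by default).
    decoded = unquote(raw_host)
    # Step 2: lowercase and encode with the built-in IDNA 2003 codec.
    return decoded.lower().encode("idna").decode("ascii")


if __name__ == "__main__":
    print(host_for_lookup("b%C3%BCcher.example"))
```

Decoding first means that an escaped host and its unescaped spelling end up as the same lookup key.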

 -- sidney
