Re: urllib.unquote and unicode

Walter Dörwald Thu, 21 Dec 2006 04:13:38 -0800

Martin v. Löwis wrote:
> Duncan Booth wrote:
>> The way that uri encoding is supposed to work is that first the input
>> string in unicode is encoded to UTF-8 and then each byte which is not in
>> the permitted range for characters is encoded as % followed by two hex
>> characters. 
> 
> Can you back up this claim ("is supposed to work") by reference to
> a specification (ideally, chapter and verse)?
> 
> In URIs, it is entirely unspecified what the encoding is of non-ASCII
> characters, and whether % escapes denote characters in the first place.


http://www.w3.org/TR/html4/appendix/notes.html#h-B.2.1
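
For illustration, here is a minimal sketch of the two steps Duncan describes
and that the HTML 4 note recommends: encode the text as UTF-8, then escape
each byte outside the permitted set as %HH. It assumes Python 3's
urllib.parse rather than the urllib.unquote this thread is about, and the
helper name percent_encode_utf8 is hypothetical. The last line touches on
Martin's point: nothing in the escaped form itself says which encoding the
bytes are in, so the decoder has to be told.

```python
from urllib.parse import quote, unquote

# Bytes that may appear unescaped (letters, digits and "-._~").
UNRESERVED = set(
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    b"abcdefghijklmnopqrstuvwxyz"
    b"0123456789-._~"
)

def percent_encode_utf8(text, safe="/"):
    """Hypothetical helper: UTF-8 encode, then %HH-escape other bytes."""
    allowed = UNRESERVED | set(safe.encode("ascii"))
    return "".join(
        chr(b) if b in allowed else "%%%02X" % b
        for b in text.encode("utf-8")
    )

# The hand-written version and the library call agree on this input.
encoded = percent_encode_utf8("Dörwald")
library_encoded = quote("Dörwald")          # defaults to UTF-8

# Decoding assumes UTF-8 unless told otherwise.
decoded = unquote(encoded)

# The same name escaped from Latin-1 bytes needs an explicit encoding.
latin1_decoded = unquote("D%F6rwald", encoding="latin-1")
```

Here "ö" becomes the two UTF-8 bytes C3 B6, so the encoded form is
"D%C3%B6rwald", while the Latin-1 form of the same name is "D%F6rwald".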

Servus,
   Walter
-- 
http://mail.python.org/mailman/listinfo/python-list
