Konstantin,
On 11/18/16 2:10 PM, Konstantin Kolinko wrote:
> One more authority, that I forgot to mention in my mail: IANA
> registry of mime types
>
> Registry:
> https://www.iana.org/assignments/media-types/media-types.xhtml
>
> Registration entry for "application/x-www-form-urlencoded"
> https://www.iana.org/assignments/media-types/application/x-www-form-urlencoded
>
> -> Encoding considerations : 7bit
>
> According to RFC defining this registry, it means that the data is
> 7-bit ASCII only. https://tools.ietf.org/html/rfc6838#section-4.8

Oh, that's the nail in the coffin.

The W3C definition of application/x-www-form-urlencoded says "if the
character doesn't fit into the encoding of the message, it must be
%-encoded" but it never says what "the encoding of the message"
actually is. My worry was that it was mutable, and that UTF-8 was a
valid encoding, meaning that 0xc2 0xae on the wire would have been
acceptable (rather than %C2%AE).

If application/x-www-form-urlencoded is *absolutely* supposed to be
7-bit ASCII, then nothing above 0x7f can ever be legally transferred
across the wire when using that content-type.

This solves André's problem with this content-type where he wanted to
specify the charset to be used. It seems the standard defines the
character set of the message itself: US-ASCII.

The only problem now is that it's not clear how to turn %C2%AE into a
character, because you have to know that UTF-8, and not Shift-JIS or
whatever, was used to produce those bytes.

> -> Required parameters : No parameters
> -> Optional parameters : No parameters
>
> OK. So no charset= parameter is allowed. My advice to specify the
> charset parameter was wrong.

Agreed: the registration defines no parameters, so specifying a
charset for this MIME type is against the spec.

> Though historically ~10 years ago I saw
> "application/x-www-form-urlencoded;charset=UTF-8" Content-Type in
> the wild.

Oh, I'm sure you saw it. I even tossed that into my client to see if
it would make a difference. Not surprisingly, it did not.
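To make the %C2%AE ambiguity described above concrete, here is a minimal Java sketch (Java 10+ for the Charset overloads of URLEncoder/URLDecoder). The class and method names are made up for illustration, and ISO-8859-1 merely stands in for "some other charset" because every JVM is guaranteed to ship it.

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class FormEncodingDemo {

    // Percent-encode a value, using UTF-8 to turn characters into bytes.
    static String encode(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    // True if every character of the wire form is in the 7-bit ASCII range.
    static boolean isSevenBit(String wire) {
        for (int i = 0; i < wire.length(); i++) {
            if (wire.charAt(i) >= 0x80) {
                return false;
            }
        }
        return true;
    }

    // Percent-decode, assembling the resulting bytes with the given charset.
    static String decodeAs(String wire, Charset charset) {
        return URLDecoder.decode(wire, charset);
    }

    public static void main(String[] args) {
        String wire = encode("\u00AE");          // U+00AE is the (R) sign
        System.out.println(wire);                // %C2%AE
        System.out.println(isSevenBit(wire));    // true: legal as 7bit data

        // Same wire bytes, two different results depending on the charset
        // the receiver assumes:
        String asUtf8 = decodeAs(wire, StandardCharsets.UTF_8);        // 1 char: U+00AE
        String asLatin1 = decodeAs(wire, StandardCharsets.ISO_8859_1); // 2 chars: U+00C2 U+00AE
        System.out.println(asUtf8.length() + " vs " + asLatin1.length());
    }
}
```

The wire form is pure ASCII either way; the charset only matters at the last step, when the decoded bytes are assembled into a string.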
> It was a web site authored in WML (Wireless Markup Language) and
> accessed via WAP protocol by mobile phones.
>
> (Specification reference for this WML/WAP usage:
> http://technical.openmobilealliance.org/Technical/release_program/docs/Browsing/V2_3-20070227-C/WAP-191-WML-20000219-a.pdf
>
> Document title: WAP WML WAP-191-WML 19 February 2000
>
> Wireless Application Protocol Wireless Markup Language
> Specification Version 1.3
>
> -> Page 30 of 110 (in Section "9.5.1 The Go Element"): There is a
> table, where the following line is relevant:
>
> Method: post Enctype: application/x-www-form-urlencoded Process:
> [...] The Content-Type header must include the charset parameter to
> indicate the character encoding.
>
> I suspect that the above URL is not the official location of the
> document. I found it through Googling. Official location should be
> http://www.wapforum.org/what/technical.htm )
>
> Apache Tomcat supports the use of charset parameter with
> Content-Type application/x-www-form-urlencoded in POST requests.

Interesting. I suspect that's because there are practical situations
where "being liberal with what you accept" is more appropriate than
angrily demanding that all clients be 100% spec-compliant :)

The (illegal) charset parameter can only mean one thing: the character
encoding to use to assemble url-decoded bytes into an actual string
value (e.g. %C2%AE -> 0xc2 0xae -> "®" when using UTF-8).

Thanks for that final reference; it really does close the case on this
whole thing.
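As a rough illustration of what "liberal" handling of that parameter could look like on the receiving side, here is a small Java sketch (Java 10+). This is not Tomcat's actual code: the class, the method names, and the choice of ISO-8859-1 as the fallback when no usable charset is given are all assumptions made for the example.

```java
import java.net.URLDecoder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

public class LenientFormParser {

    // Return the charset named in a Content-Type header value,
    // or the fallback if the parameter is absent or names an unknown charset.
    static Charset charsetOf(String contentType, Charset fallback) {
        if (contentType == null) {
            return fallback;
        }
        String[] parts = contentType.split(";");
        for (int i = 1; i < parts.length; i++) {   // parts[0] is the media type
            String param = parts[i].trim();
            if (param.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String name = param.substring("charset=".length())
                                   .replace("\"", "").trim();
                try {
                    return Charset.forName(name);
                } catch (IllegalArgumentException e) {
                    return fallback;               // illegal or unsupported name
                }
            }
        }
        return fallback;
    }

    // Split a form body into name/value pairs and percent-decode each part,
    // assembling the decoded bytes with the given charset.
    // Note: URLDecoder throws IllegalArgumentException on malformed %-escapes.
    static Map<String, String> parse(String body, Charset charset) {
        Map<String, String> params = new LinkedHashMap<>();
        for (String pair : body.split("&")) {
            if (pair.isEmpty()) {
                continue;
            }
            int eq = pair.indexOf('=');
            String name = eq < 0 ? pair : pair.substring(0, eq);
            String value = eq < 0 ? "" : pair.substring(eq + 1);
            params.put(URLDecoder.decode(name, charset),
                       URLDecoder.decode(value, charset));
        }
        return params;
    }

    public static void main(String[] args) {
        String header = "application/x-www-form-urlencoded;charset=UTF-8";
        Charset cs = charsetOf(header, StandardCharsets.ISO_8859_1);
        Map<String, String> form = parse("mark=%C2%AE&q=a+b", cs);
        System.out.println(cs + " " + form.get("mark").length()); // UTF-8 1
    }
}
```

The body on the wire stays 7-bit ASCII in both cases; the parameter only tells the server which charset to use for the final bytes-to-string step.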
-chris