https://issues.apache.org/bugzilla/show_bug.cgi?id=48899
Summary: Guessing the URI charset should solve a lot of problems
Product: Tomcat 6
Version: unspecified
Platform: All
OS/Version: All
Status: NEW
Severity: enhancement
Priority: P2
Component: Connectors
AssignedTo: [email protected]
ReportedBy: [email protected]
Hi Tomcat connector developers,
Tomcat's connectors have options to either decode the URI with a fixed
charset or to decode it with the body's encoding (set by
request.setCharacterEncoding). If neither is set, ISO-8859-1 is used.
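To illustrate why the choice of charset matters, here is a minimal sketch that
decodes the same raw bytes under both charsets. The class and method names are
made up for this example and are not part of Tomcat; the byte values stand for
a percent-decoded "ü" (%C3%BC when sent as UTF-8, %FC when sent as ISO-8859-1).

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not Tomcat code.
final class UriCharsetMismatch {

    // Interpret the percent-decoded bytes as ISO-8859-1 (one byte per character).
    static String asLatin1(byte[] raw) {
        return new String(raw, StandardCharsets.ISO_8859_1);
    }

    // Interpret the percent-decoded bytes as UTF-8; malformed sequences
    // are replaced with U+FFFD by the String constructor.
    static String asUtf8(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] sentAsUtf8 = {(byte) 0xC3, (byte) 0xBC}; // "%C3%BC"
        byte[] sentAsLatin1 = {(byte) 0xFC};            // "%FC"

        // Matching charset: both give the single character U+00FC.
        System.out.println(asUtf8(sentAsUtf8).equals("\u00FC"));
        System.out.println(asLatin1(sentAsLatin1).equals("\u00FC"));

        // Mismatched charset: two wrong characters, or a replacement character.
        System.out.println(asLatin1(sentAsUtf8).equals("\u00C3\u00BC"));
        System.out.println(asUtf8(sentAsLatin1).equals("\uFFFD"));
    }
}
```

With a fixed charset on the server side, one of the two client behaviours
always ends up in a mismatched case.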
I found an article (with code) that demonstrates that the charset of data can
easily be guessed: http://glaforge.free.fr/wiki/index.php?wiki=GuessEncoding .
Since the most common charsets for URI encoding are ISO-8859-1 (because it is
the default for URI encoding) and UTF-8 (because it is used by most
multi-language websites), the possible choices are very clear.
So I'd suggest adding an option to the connectors to guess the charset used
when decoding URI parts. Many of the existing issues with URI decoding would
be solved that way (e.g. when the URI is decoded as UTF-8 and a user types an
umlaut into the address bar, the browser might encode it as ISO-8859-1 and the
app has no way to fix this).
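A rough sketch of what such a guess could look like, assuming only the two
candidates named above: try a strict UTF-8 decode first and fall back to
ISO-8859-1 if the bytes are not well-formed UTF-8. This is not the code from
the linked article and not an existing Tomcat option; the class and method
names are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Tomcat.
final class UriCharsetGuesser {

    // Turn %XX escapes into raw bytes; all other characters are assumed ASCII.
    static byte[] percentDecode(String s) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c != '%') {
                out.write(c); // writes the low-order byte of the character
                continue;
            }
            if (i + 2 >= s.length()) {
                throw new IllegalArgumentException("Truncated escape at " + i);
            }
            int hi = Character.digit(s.charAt(i + 1), 16);
            int lo = Character.digit(s.charAt(i + 2), 16);
            if (hi < 0 || lo < 0) {
                throw new IllegalArgumentException("Bad escape at " + i);
            }
            out.write((hi << 4) | lo);
            i += 2; // skip the two hex digits
        }
        return out.toByteArray();
    }

    // Strict UTF-8 first; ISO-8859-1 if the bytes are not valid UTF-8.
    static String guessAndDecode(String encoded) {
        byte[] raw = percentDecode(encoded);
        CharsetDecoder utf8 = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            return utf8.decode(ByteBuffer.wrap(raw)).toString();
        } catch (CharacterCodingException e) {
            // ISO-8859-1 maps every byte to a character, so this cannot fail.
            return new String(raw, StandardCharsets.ISO_8859_1);
        }
    }

    public static void main(String[] args) {
        // Both spellings of "/müller" decode to the same string.
        String fromUtf8 = guessAndDecode("/m%C3%BCller");
        String fromLatin1 = guessAndDecode("/m%FCller");
        System.out.println(fromUtf8.equals(fromLatin1));
    }
}
```

The guess is a heuristic: a byte sequence that is valid UTF-8 is always read
as UTF-8, even if the client meant it as ISO-8859-1 (for example the two
characters "Ã¼"), so it would have to stay an opt-in setting.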
Regards, Michael.