Thanks for all the replies. Unfortunately, the method URI.setDefaultProtocolCharset(CHARSET) that Manuel proposed doesn't help in my case. It may work for PostMethod, but it doesn't for GetMethod when HttpMethodBase.setQueryString(NameValuePair[]) is used. setQueryString(NameValuePair[]) itself fixes the charset for the encoding (US-ASCII) and passes it through to URI.encode(), which finally encodes the strings. There, String.getBytes(charset), when called with US-ASCII, converts every character that US-ASCII cannot represent, such as the German umlauts, to ASCII code 63 (question mark).
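To illustrate the effect, here is a minimal sketch. It is not HttpClient code: the class QueryCharsetDemo and its two helper methods are made up for this example, and java.net.URLEncoder from the JDK is used only as an assumed stand-in for HttpClient's own encoding path. The only call taken from the discussion above is String.getBytes(charsetName).

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.Arrays;

// Hypothetical demo class, not part of HttpClient.
public class QueryCharsetDemo {

    // The conversion step discussed above: characters -> bytes in the given charset.
    public static byte[] toBytes(String value, String charsetName) {
        try {
            return value.getBytes(charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charsetName, e);
        }
    }

    // Percent-encodes with the JDK's URLEncoder (assumption: a stand-in for
    // HttpClient's encoder, which may differ in detail).
    public static String percentEncode(String value, String charsetName) {
        try {
            return URLEncoder.encode(value, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charsetName, e);
        }
    }

    public static void main(String[] args) {
        // a-umlaut, o-umlaut, u-umlaut, written as escapes so the result
        // does not depend on the encoding of the source file.
        String umlauts = "\u00e4\u00f6\u00fc";

        // US-ASCII cannot represent the umlauts, so each becomes '?' (63).
        System.out.println(Arrays.toString(toBytes(umlauts, "US-ASCII"))); // prints [63, 63, 63]

        System.out.println(percentEncode(umlauts, "US-ASCII"));   // prints %3F%3F%3F
        System.out.println(percentEncode(umlauts, "ISO-8859-1")); // prints %E4%F6%FC
        System.out.println(percentEncode(umlauts, "UTF-8"));      // prints %C3%A4%C3%B6%C3%BC
    }
}
```

With US-ASCII the information is already lost before any percent-encoding happens, so the server has no chance to recover the original characters. With ISO-8859-1 or UTF-8 the umlauts survive the byte conversion.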
There is currently no way to specify a charset other than US-ASCII for the encoding done by HttpMethodBase.setQueryString(NameValuePair[]). I think it would be good if URI.getDefaultProtocolCharset() were used instead of the US-ASCII constant (then UTF-8 would be used as the default), or if there were another way to specify a different charset.

Martin

> -----Original Message-----
> From: Oleg Kalnichevski [mailto:[EMAIL PROTECTED]
> Sent: Thursday, 10 July 2003 19:12
> To: Commons HttpClient Project
> Subject: Re: Encoding of special characters in request URI
>
>
> This is one of many 'shady' areas of the HTTP spec. Basically there is
> no standard way for the client to communicate to the server what coding
> has been used to encode query parameters. I believe some browsers use
> 'Accept-Charset' or 'Accept-Language' headers to negotiate the locale
> settings to be used by the server. But I am not sure if these headers
> can be used to determine what character coding can be used to decode
> URL-encoded data.
>
> I think we definitely should not be using US-ASCII by default. The
> whole point of URL encoding is to escape non-ASCII characters. I suggest
> UTF-8 be used by default.
>
> Oleg
>
>
>
> On Thu, 2003-07-10 at 17:48, Michael Becke wrote:
> > Hello Martin,
> >
> > This is a good question, one that I am not positive I know the answer
> > to. The HTTP request line (containing the query params) must be
> > US-ASCII. That I am sure of. The catch is that form urlencoding
> > strings makes them ASCII, regardless of the original charset. So
> > HttpMethod.setQueryString(NameValuePair[]) is assuming that the
> > inputs (query params) are ASCII when really only the output (encoded
> > params) should be ASCII.
> >
> > The question is how does one determine, on the client and the server,
> > what the charset of the query params is?
> > The request charset can be specified with the Content-Type header,
> > but this is meant to apply to the request entity, not the headers.
> > I have a feeling that we should probably be using the content charset
> > anyway. My reasoning here is that an HTML form can be sent via a GET
> > (query params) or POST (post content). In both cases the content must
> > be form urlencoded and my feeling is that it should be done the same
> > for both.
> >
> > What does everyone else think?
> >
> > Mike
> >
> > Martin Schnyder wrote:
> > > When I use the GetMethod class to send text with special characters
> > > (German Umlaute "äöü") in the request parameters, the special
> > > characters are not encoded correctly. This happens when I use method
> > > HttpMethodBase.setQueryString(NameValuePair[] params)
> > > to set the query parameters.
> > >
> > > I saw that Release 2.0 Beta 2 fixed that with bug fix 20481. Special
> > > characters are now encoded differently but still wrong, as far as I
> > > can see.
> > >
> > > Method HttpMethodBase.setQueryString(NameValuePair[]) calls
> > > formUrlEncode(params, HttpConstants.HTTP_ELEMENT_CHARSET) to encode
> > > the parameters. The value of HTTP_ELEMENT_CHARSET is US-ASCII. When I
> > > change the charset to HttpConstants.DEFAULT_CONTENT_CHARSET (which is
> > > ISO-8859-1), the German "Umlaute" are encoded correctly. I checked
> > > that with the code in CVS HEAD. Is this a bug or should really only
> > > the US-ASCII characters be supported in a request URI?
> > >
> > > Regards,
> > > Martin Schnyder
> > >
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: [EMAIL PROTECTED]
> > > For additional commands, e-mail: [EMAIL PROTECTED]