Re: form urlencoding, was Re: URI query escapes

2003-06-22 Thread Sung-Gu
> If not pre-encoded, the URI would look something like
> "http://host/path?param1=value&1&param2=value=2".
>
> Once joined together it is too late to encode.  This is why the
> HttpMethod(String) constructor assumes that all URIs are already
> encoded, as it is not possible to correctly encode them after the fact.
>
> Please let me know if you will be writing some code for this as I will
> take care of it tomorrow otherwise.

If you treat the URI class as the single place where URIs are manipulated
(that is, you never manipulate URI components yourself anywhere in your
code, including inside commons-httpclient), it doesn't matter, because my
URI class manipulates both escaped and unescaped components correctly.
That's the point of actually using the URI class: it covers a URI on the
user side (escaped and unescaped) and also on the communication side
(where probably only the escaped form is preferred).
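
A rough sketch of the escaped/unescaped distinction, for illustration
only. It uses java.net.URI from the JDK as a stand-in, which is an
assumption: it is not the commons-httpclient URI class discussed here,
and the class and method names below are made up.

```java
import java.net.URI;

final class EscapedViewSketch {

    // Returns the query exactly as it appears on the wire (still escaped).
    static String escapedQuery(String uri) {
        return URI.create(uri).getRawQuery();
    }

    // Returns the query with percent-escapes decoded (the "user side" view).
    static String unescapedQuery(String uri) {
        return URI.create(uri).getQuery();
    }

    public static void main(String[] args) {
        String uri = "http://host/path?q=a%20b";
        System.out.println(escapedQuery(uri));   // q=a%20b
        System.out.println(unescapedQuery(uri)); // q=a b
    }
}
```

The same URI object answers both questions, so calling code never has to
escape or unescape components by hand.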

Sung-Gu

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Re: form urlencoding, was Re: URI query escapes

2003-06-22 Thread Oleg Kalnichevski
Mike, Laura, Adrian

In their pre-Java 1.4 form, the URLEncoder/URLDecoder classes are pretty
much unusable, as they always use the default system charset, which
sometimes is not good enough. For instance, there's no way to properly
encode strings that simultaneously contain Cyrillic letters and Latin
accents, as both KOI8-R (the default Russian encoding on Unix platforms)
and Win1251 (the default Russian encoding on Windows platforms) are
8-bit charsets. One would need to use UTF-8; however, the standard
pre-Java 1.4 URLEncoder does not provide a means of specifying an
alternative charset.
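
To make the charset problem concrete, here is a small sketch using the
two-argument URLEncoder.encode(String, String) that exists from Java 1.4
on. The class name is made up, and ISO-8859-1 is used as the 8-bit
example only because every JVM is required to support it; KOI8-R or
Win1251 would lose the accented Latin letter instead of the Cyrillic one.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

final class CharsetEncodingSketch {

    // Form-urlencodes text using the named charset for non-ASCII characters.
    static String encode(String text, String charset) {
        try {
            return URLEncoder.encode(text, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
    }

    public static void main(String[] args) {
        // U+0416 is Cyrillic "Zhe"; U+00E9 is Latin "e" with an acute accent.
        String mixed = "\u0416\u00e9";
        // UTF-8 can represent both characters.
        System.out.println(encode(mixed, "UTF-8"));      // %D0%96%C3%A9
        // An 8-bit Latin charset cannot represent the Cyrillic letter,
        // so it is silently replaced by '?' (%3F).
        System.out.println(encode(mixed, "ISO-8859-1")); // %3F%E9
    }
}
```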

We have to live with URIUtil for the 2.0 release. In the future I would
suggest moving the URL encoding logic into Commons-Codec.

Oleg


> My only guess is that URLEncoder may not handle character encodings 
> correctly.  I agree that we might as well stick with the code we 
> already have (once fixed).
> 
> Mike



Re: form urlencoding, was Re: URI query escapes

2003-06-21 Thread Michael Becke
On Saturday, June 21, 2003, at 11:47 PM, Laura Werner wrote:

> I agree, as long as URLEncoder seems to work.
>
> Do you think we need to modify URI so that it uses URLEncoder to
> encode the query part of URIs?  In cases where a client has a URL
> string that may or may not contain query parameters, it would lead to
> a slightly more natural API usage:
>  HttpMethod meth = new GetMethod(new URI(urlString));
> as opposed to
>  String query = null;
>  int index = urlString.indexOf('?');
>  if (index != -1) {
>    query = urlString.substring(index+1);
>    urlString = urlString.substring(0, index);
>  }
>  HttpMethod meth = new GetMethod(new URI(urlString));
>  meth.setQueryString(java.net.URLEncoder.encode(query));
> or something like that, with error checking of course.

I don't think URI should be doing any form urlencoding.  The URI spec
does not use the concept of query params.  It just treats the entire
query as a single entity.

Also, when creating a URI containing query params, the params must be 
encoded before the URI is generated otherwise it will not be parsable.  
For example, when creating a URI with the following query params:

Name    Value
param1  value&1
param2  value=2

If not pre-encoded, the URI would look something like
"http://host/path?param1=value&1&param2=value=2".

Once joined together it is too late to encode.  This is why the
HttpMethod(String) constructor assumes that all URIs are already
encoded, as it is not possible to correctly encode them after the fact.
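
As a minimal sketch of "encode first, then join" (not the actual
HttpClient code; the class and method names are made up, and UTF-8 is an
assumed charset), each name and value is encoded on its own before the
'=' and '&' separators are added:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

final class FormQuerySketch {

    // Encodes every name and value separately, then joins them.
    // Each element of pairs is a two-element array: { name, value }.
    static String buildQuery(String[][] pairs, String charset) {
        StringBuilder query = new StringBuilder();
        try {
            for (int i = 0; i < pairs.length; i++) {
                if (i > 0) {
                    query.append('&'); // separator between params, never encoded
                }
                query.append(URLEncoder.encode(pairs[i][0], charset));
                query.append('=');     // separator between name and value
                query.append(URLEncoder.encode(pairs[i][1], charset));
            }
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
        return query.toString();
    }

    public static void main(String[] args) {
        String[][] pairs = { { "param1", "value&1" }, { "param2", "value=2" } };
        System.out.println("http://host/path?" + buildQuery(pairs, "UTF-8"));
        // prints http://host/path?param1=value%261&param2=value%3D2
    }
}
```

With the '&' and '=' inside the values escaped as %26 and %3D, the
resulting query can be split back into the original two params.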

Please let me know if you will be writing some code for this as I will 
take care of it tomorrow otherwise.

Mike



Re: form urlencoding, was Re: URI query escapes

2003-06-21 Thread Michael Becke
> I know in the product we develop, we switched away from using
> java.net.URLEncoder because it didn't work properly.  Unfortunately
> the decision was made before my time so I'm not entirely sure of the
> details and it could well be that the bugs were only present back in
> JRE 1.1.  I'd say that if we have our own code already I'd continue to
> use it, but if not just test java.net.URLEncoder and I'll see if I can
> find out from some of the old timers at work exactly why we don't use
> it.
>
> I've known us to not use something for legacy reasons or because it
> showed up bugs in our code as well, so don't take this as an
> objection, just a word of caution.

My only guess is that URLEncoder may not handle character encodings
correctly.  I agree that we might as well stick with the code we
already have (once fixed).

Mike



Re: form urlencoding, was Re: URI query escapes

2003-06-21 Thread Laura Werner
Michael Becke wrote:

> I propose that we:
>  - form urlencode values passed to
>    HttpMethodBase.setQueryString(NameValuePair[])
>  - use java.net.URLEncoder for form urlencoding

I agree, as long as URLEncoder seems to work.

Do you think we need to modify URI so that it uses URLEncoder to encode 
the query part of URIs?  In cases where a client has a URL string that 
may or may not contain query parameters, it would lead to a slightly 
more natural API usage:
 HttpMethod meth = new GetMethod(new URI(urlString));
as opposed to
 String query = null;
 int index = urlString.indexOf('?');
 if (index != -1) {
   query = urlString.substring(index+1);
   urlString = urlString.substring(0, index);
 }
 HttpMethod meth = new GetMethod(new URI(urlString));
 meth.setQueryString(java.net.URLEncoder.encode(query));
or something like that, with error checking of course.

I'm not sure how much I care, though.  If my fetching code had been 
constructed using the HttpClient code from scratch, I wouldn't even have 
the query parameters in the string in the first place; I'd just add them 
with setQueryString.

I'll see if I can work up a preliminary patch for this stuff later 
tonight or tomorrow morning.   

Adrian Sutton wrote:

> I know in the product we develop, we switched away from using
> java.net.URLEncoder because it didn't work properly

FWIW, we're using it and haven't seen any problems.  But we've been on
1.2 or higher since I started at BeVocal.  (We're moving to 1.4 now 
because the server VM performance is *much* better.)

--Laura





Re: form urlencoding, was Re: URI query escapes

2003-06-21 Thread Adrian Sutton
> I am also wondering why we are not using the java.net.URLEncoder for
> this (it also does not encode *-_.).

I know in the product we develop, we switched away from using
java.net.URLEncoder because it didn't work properly.  Unfortunately the 
decision was made before my time so I'm not entirely sure of the 
details and it could well be that the bugs were only present back in 
JRE 1.1.  I'd say that if we have our own code already I'd continue to 
use it, but if not just test java.net.URLEncoder and I'll see if I can 
find out from some of the old timers at work exactly why we don't use 
it.

I've known us to not use something for legacy reasons or because it 
showed up bugs in our code as well, so don't take this as an objection, 
just a word of caution.

Regards,

Adrian Sutton
