Frey -

Let URI::query_form() escape and unescape your strings, including the
spaces. The function expects plain, unencoded strings; that is why the
regex you modified is there in the first place.

The problem with your solution is that now you can't include '+' (as in
plus) in your key/value pairs. If you use '+' and you mean plus, you'll get
a space when you decode the string. If you try to escape '+' with '%2B',
that same regex will escape the '%', so when you decode the string, you'll
be left with '%2B'.
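
Here is a rough sketch of both failure cases, in case it helps. It does not
use URI at all: encode_modified() is a hypothetical stand-in for the regex
with '+' removed, and form_decode() is my assumption of what a typical
"application/x-www-form-urlencoded" decoder does.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the modified regex: '+' is no longer escaped.
sub encode_modified {
    my ($s) = @_;
    $s =~ s/([;\/?:\@&=,\$%])/sprintf("%%%02X", ord($1))/ge;
    return $s;
}

# Assumed form decoder: '+' becomes a space, then %HH sequences are decoded.
sub form_decode {
    my ($s) = @_;
    $s =~ tr/+/ /;
    $s =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    return $s;
}

# A literal plus is sent as-is and comes back as a space.
print form_decode(encode_modified("1+1")), "\n";

# A pre-escaped plus has its '%' escaped again and comes back as '%2B'.
print form_decode(encode_modified("1%2B1")), "\n";
```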

To make URI::query_form() follow the specifications for user agents found in
section 8.2.1 of RFC 1866, you could add

$key =~ s/ /+/g;

and

$val =~ s/ /+/g;

after the respective regexes, instead of modifying those regexes. Then just
be sure to pass plain, unescaped strings to URI::query_form().
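
The order of the two steps is what matters, so here is a minimal sketch of
it outside of _query.pm. encode_suggested() and form_decode() are
hypothetical helpers I made up for illustration, not part of URI.

```perl
use strict;
use warnings;

# Hypothetical helper: escape reserved characters first (including '+'),
# and only then turn spaces into '+'.
sub encode_suggested {
    my ($s) = @_;
    $s =~ s/([;\/?:\@&=+,\$%])/sprintf("%%%02X", ord($1))/ge;
    $s =~ s/ /+/g;
    return $s;
}

# Assumed form decoder: '+' becomes a space, then %HH sequences are decoded.
sub form_decode {
    my ($s) = @_;
    $s =~ tr/+/ /;
    $s =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    return $s;
}

my $plain   = "a b+c";
my $encoded = encode_suggested($plain);

print "$encoded\n";                 # the space is '+', the real plus is '%2B'
print form_decode($encoded), "\n";  # round-trips to the original string
```

Because the literal '+' is already '%2B' by the time spaces are converted,
a decoder can tell the two apart again.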

You can probably get away with using _query.pm as distributed if you use
URI::query_form() to manipulate the query string but URI::query() when you
want to actually use the query string. Spaces will be converted to '%20'
instead of '+', but applications that decode
"application/x-www-form-urlencoded" data will convert both back to a space.

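To illustrate that last point, here is a small sketch of a decoder that
accepts either spelling of a space. parse_query() and form_decode() are
hypothetical names, and the query string is made up for the example.

```perl
use strict;
use warnings;

# Assumed form decoder: '+' becomes a space, then %HH sequences are decoded.
sub form_decode {
    my ($s) = @_;
    $s =~ tr/+/ /;
    $s =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    return $s;
}

# Hypothetical parser: split pairs on '&', then key from value on the first '='.
sub parse_query {
    my ($query) = @_;
    my @pairs;
    for my $piece (split /&/, $query) {
        my ($key, $val) = split /=/, $piece, 2;
        $val = '' unless defined $val;
        push @pairs, [ form_decode($key), form_decode($val) ];
    }
    return @pairs;
}

# One value uses '%20' and the other uses '+'; both decode to 'two words'.
for my $pair (parse_query("q=two%20words&r=two+words")) {
    print "$pair->[0] => '$pair->[1]'\n";
}
```
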
- Mike Sheldrake


> -----Original Message-----
> From: la mouton [mailto:[EMAIL PROTECTED]]
> Sent: Tuesday, February 22, 2000 7:01 PM
> To: [EMAIL PROTECTED]
> Subject: URI::URL->query_form() non compliant with specification
>
>
> Gisle,
>
> query_form() is non-compliant with RFC 1866 section 8.2.1.
>
> This bug relates to the URL escaping of a space in key,value pairs for a
> "application/x-www-form-urlencoded" type POST form.
>
> section 8.2.1 [RFC 1866] states:
>
> [------------------------------------------------------------------]
>     1. The form field names and values are escaped: space characters are
>        replaced by `+', and then reserved characters are escaped as per
>        [URL]; that is, non-alphanumeric characters are replaced by `%HH', a
>        percent sign and two hexadecimal digits representing the ASCII code
>        of the character. Line breaks, as in multi-line text field values,
>        are represented as CR LF pairs, i.e. `%0D%0A'.
> [------------------------------------------------------------------]
>
> However, in URI::_query, the default policy is to escape the '+' character
> into '%HH' format which produces a '%2B' as a result.  The offending lines
> of code in _query.pm are:
>
> 33            $key =~ s/([;\/?:@&=+,\$%])/$URI::Escape::escapes{$1}/g;
> 37            $val =~ s/([;\/?:@&=+,\$%])/$URI::Escape::escapes{$1}/g;
>
> removing the '+' from the regex does the trick for me.
>
> regards,
> Frey Kuo
>
>
