Hi Kyle,
> I'm the author of extract_url.pl, so perhaps I can shed some light 
> here.
Thanks.
> The *correct* place to "fix" the issue of escaping (or otherwise 
> sanitizing) ampersands is in the sanitizeuri function (line 208). The 
> current version of extract_url.pl uses this:
> 
>      sub sanitizeuri {
>          my($uri) = @_;
>          $uri =~ s/([^a-zA-Z0-9_.!*()\@&:=\?\/%~+-])/sprintf("%%%X",ord($1))/egs;
>          return $uri;
>      }

I have now tried your fix, and it didn't work for me: my browser doesn't
find the resulting pages when the URL has ampersands that are
converted to %26 (probably because the % itself is further encoded as
%25 before being sent to the server by the browser (?)).
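
To make that guess concrete, here is a minimal sketch of what a second
encoding pass would do to an already encoded URL. The function name
reencode_percent is hypothetical and it only mimics the effect I suspect;
I have not confirmed that the browser actually does this.

```shell
#!/bin/bash
# Hypothetical helper: percent-encode only the '%' character, mimicking
# a second encoding pass applied to a URL that was already encoded once.
reencode_percent() {
    local s=$1
    # Replace every literal '%' with '%25'.
    printf '%s\n' "${s//\%/%25}"
}

# Example URL from this thread, with '&' already encoded as %26.
url='http://url.with/an%26ampersand'
reencode_percent "$url"
```

If this is what happens, the server would see %2526 where the original
URL had a plain ampersand, which would explain the missing pages.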

> ...
> I've personally never had a problem with ampersands, and I'm not sure 
> why some people do. Extract_url.pl constructs system commands like so:
> 
>    /path/to/handler 'http://url.with/an&ampersand'

I changed my handler to '/bin/echo %s >>tmp.txt' and it wrote the
correct result, so I guess you're right here. 

> ... which should be perfectly safe and work just fine (and does for 
> me). I suspect the problem stems from using another wrapper script (e.g. 
> /etc/urlhandler/urlhandler.sh). I bet that the wrapper script is not 
> properly quoting its first argument.

I don't know much about shell programming, but I found that
/etc/urlhandler/url_handler.sh is a shell script that obtains its URL
with 'url=$1'. I replaced the whole handler with the following
program:
    #! /bin/bash 
    url=$1; shift
    echo $url >>tmp.txt; 
and found out that the url is cut short at the first ampersand. 

I don't understand why echo by itself yields the correct result (above)
while echo through a bash script yields the truncated result.
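
One way to narrow this down might be a debugging handler that records
exactly what it receives. The sketch below is only an illustration: the
function name show_args is hypothetical, and the second case is a guess at
one place where a URL can get cut, not something I have confirmed in my
setup.

```shell
#!/bin/bash
# Hypothetical debugging helper (not part of extract_url.pl or urlhandler):
# print how many arguments were received and each one in brackets.
show_args() {
    printf 'argc=%d\n' "$#"
    local arg
    for arg in "$@"; do
        printf 'arg=[%s]\n' "$arg"
    done
}

url='http://url.with/an&ampersand'   # example URL from this thread

# Case 1: the URL is passed as a single quoted argument and stays whole.
show_args "$url"

# Case 2: the URL is pasted unquoted into a command string that another
# shell parses again. There '&' ends the command, so only the part before
# it is passed on. The leftover word fails as a command and that error is
# discarded.
reparsed=$(sh -c "printf 'arg=[%s]\n' $url" 2>/dev/null) || true
printf '%s\n' "$reparsed"
```

If a handler built around show_args "$@" already logs a truncated argument,
the URL was cut before the script started, since assigning url=$1 does not
by itself split on an ampersand.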

Thanks and best regards,
Luis

-- 

                                                                  o
W. Luis Mochán,                      | tel:(52)(777)329-1734     /<(*)
Instituto de Ciencias Físicas, UNAM  | fax:(52)(777)317-5388     `>/   /\
Apdo. Postal 48-3, 62251             |                           (*)/\/  \
Cuernavaca, Morelos, México          | moc...@fis.unam.mx   /\_/\__/
GPG: DD344B85,  2ADC B65A 5499 C2D3 4A3B  93F3 AE20 0F5E DD34 4B85

