On 07/02/13 15:06, bes wrote:
> Hi,
>
> I found a bug in how wget interprets and saves URLs that contain
> percent-encoded 3-byte UTF-8 sequences.
>
> Example:
> 1. Create a URL containing "—". This is U+2014 (EM DASH). Its
> percent-encoded UTF-8 form is "%E2%80%94".
> 2. Fetch it with wget "http://example.com/abc—d" or, directly, with
> wget "http://example.com/abc%E2%80%94d".
> 3. Wget saves this URL to the file "abc\342%80%94d". The expected name is
> "abc%E2%80%94d". This is a bug.

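The problem is that wget checks whether each byte is a printable character
in Latin-1 rather than treating the name as UTF-8. The lead byte 0xE2 (shown
as \342) is printable in Latin-1 and is kept as is, while 0x80 and 0x94 are
control characters there and get percent-encoded.
This is tracked as a bug at https://savannah.gnu.org/bugs/index.php?37564
One option is to use --restrict-file-names=nocontrol to get the em
dash in the filename, instead of the percent-encoded version.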
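
To illustrate the per-byte Latin-1 check, here is a rough Python sketch. It
is only a simplified model of the behaviour reported in this thread, not
wget's actual code: the function name escape_like_latin1 is made up for this
example, and the assumption is that only C0/C1 control bytes and DEL get
percent-encoded while every other byte is copied through unchanged.

```python
def escape_like_latin1(name: bytes) -> bytes:
    """Percent-encode bytes that are control characters in Latin-1.

    Hypothetical model of the reported behaviour: each byte is judged
    on its own, so a multi-byte UTF-8 sequence can be split into a raw
    lead byte followed by escaped continuation bytes.
    """
    out = bytearray()
    for b in name:
        # Assumed rule: C0 controls (0x00-0x1F), DEL (0x7F) and
        # C1 controls (0x80-0x9F) count as non-printable in Latin-1.
        is_control = b < 0x20 or 0x7F <= b <= 0x9F
        if is_control:
            out += b"%%%02X" % b
        else:
            out.append(b)
    return bytes(out)


if __name__ == "__main__":
    # U+2014 (EM DASH) is E2 80 94 in UTF-8.
    utf8_name = "abc\u2014d".encode("utf-8")
    print(escape_like_latin1(utf8_name))
```

With the em dash from the report, the lead byte 0xE2 (octal \342) passes the
check and stays raw, while 0x80 and 0x94 fall in the C1 control range and are
escaped, which matches the "abc\342%80%94d" filename seen above.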


