On Monday, 14 December 2015, 18:33:38, Eli Zaretskii wrote:
> > Date: Sun, 13 Dec 2015 20:04:31 +0100
> > From: "Andries E. Brouwer" <andries.brou...@cwi.nl>
> > Cc: "Andries E. Brouwer" <andries.brou...@cwi.nl>, bug-wget@gnu.org
> > 
> > On Sun, Dec 13, 2015 at 08:01:27PM +0200, Eli Zaretskii wrote:
> > > If no one is going to pick up the gauntlet, I will sit down and do it
> > > myself, although I'm terribly busy with Emacs 25.1 release.
> > 
> > Good!
> 
> While working on this, I bumped into 2 related issues:
> 
>  1. The functions that call 'iconv' (in iri.c) don't make a point of
>     flushing the last portion of the converted URL after 'iconv'
>     returns successfully having converted the input string in its
>     entirety.  IME, you need then to call 'iconv' one last time with
>     either the 2nd or the 3rd argument set to NULL, otherwise
>     sometimes the last converted character doesn't get output.  In my
>     case, some URLs converted from CP1255 to UTF-8 lost their last
>     character.  It sounds like no one has actually used this
>     conversion in iri.c, except for trivially converting UTF-8 to
>     itself.  Is that possible/reasonable?

Possibly.
Could you please give an example string? I would like to test it on
GNU/Linux, BSD and Solaris to see whether the output is always the same.
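
For reference, the extra flushing call described in point 1 could look
roughly like the sketch below. This is not the code in iri.c: the helper
name convert_and_flush and its parameters are made up for illustration,
and it assumes a glibc-style iconv() prototype (char ** for the input
pointer) and a caller-supplied output buffer that is large enough.

```c
#include <errno.h>
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of wget): convert the NUL-terminated
   string IN from charset FROM to charset TO, writing the result into
   OUT, which has room for OUTSIZE bytes.  Returns the number of bytes
   written, not counting the terminating NUL, or (size_t) -1 on error.  */
size_t
convert_and_flush (const char *from, const char *to,
                   const char *in, char *out, size_t outsize)
{
  iconv_t cd;
  char *inp = (char *) in;   /* iconv() does not modify the input bytes */
  char *outp = out;
  size_t inleft = strlen (in);
  size_t outleft;

  if (outsize == 0)
    return (size_t) -1;
  outleft = outsize - 1;     /* keep one byte for the terminating NUL */

  cd = iconv_open (to, from);
  if (cd == (iconv_t) -1)
    return (size_t) -1;

  /* Step 1: convert the whole input string.  */
  if (iconv (cd, &inp, &inleft, &outp, &outleft) == (size_t) -1)
    {
      iconv_close (cd);
      return (size_t) -1;
    }

  /* Step 2: call iconv() once more with a NULL input so that anything
     the converter is still holding back gets written to the output.  */
  if (iconv (cd, NULL, NULL, &outp, &outleft) == (size_t) -1)
    {
      iconv_close (cd);
      return (size_t) -1;
    }

  iconv_close (cd);
  *outp = '\0';
  return (size_t) (outp - out);
}
```

Calling it as convert_and_flush ("CP1255", "UTF-8", url, buf, sizeof buf)
would correspond to the conversion mentioned above; whether the second
call actually emits anything depends on the iconv implementation and on
the charsets involved.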


>  2. Wget assumes that the URL given on its command line is encoded in
>     the locale's encoding.  This is a good assumption when the user
>     herself types the URL at the shell prompt, but not when the URL is
>     copy-pasted from a browser's address bar.  In the latter case, the
>     URL tends to be in UTF-8 (sometimes hex-encoded).  At least that's
>     what I get from Firefox.  We don't seem to have in wget any
>     facilities to specify a separate (3rd) encoding for the URLs on
>     the command line, do we?

I stumbled upon this a while ago when thinking about the design of wget2, and
wget2 already has a working --input-encoding option for such cases.
AFAIK, nobody has asked for such an option in recent years, so I assume
it is a somewhat 'expert' or 'fancy' option, or at least a low-priority one.
It is an optional goodie.

Tim

