Christian Biere wrote:
> Is there any simple way to strip the control characters and still preserve
> the valid encoding?

Replacing iscntrl() with is_ascii_cntrl() seems to do the trick. Since
I use ISO-8859-1 as the locale charset, iscntrl() is true for some
characters above \x7f, so it also strips bytes that belong to multi-byte
UTF-8 characters. I've just checked that no valid multi-byte UTF-8
character contains an ASCII byte: every byte of a multi-byte sequence is
\x80 or above, so removing only ASCII control characters cannot break
the encoding.
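
For illustration, here is a minimal sketch of the idea, not the actual
GTKG code. The body of is_ascii_cntrl() is my assumption of what such a
helper would look like (true only for \x00-\x1f and \x7f, independent of
the locale), and strip_ascii_cntrl() is a hypothetical name.

```c
#include <stddef.h>

/*
 * Assumed definition: true only for the ASCII control codes
 * 0x00-0x1f and 0x7f. Unlike iscntrl(), the result does not
 * depend on the current locale.
 */
static int
is_ascii_cntrl(int c)
{
	return (c >= 0 && c < 0x20) || c == 0x7f;
}

/*
 * Hypothetical helper: removes ASCII control characters from the
 * NUL-terminated string in place and returns the new length.
 * Bytes >= 0x80 are never touched, so multi-byte UTF-8 sequences
 * stay intact.
 */
static size_t
strip_ascii_cntrl(char *s)
{
	const char *src;
	char *dst = s;

	for (src = s; *src != '\0'; src++) {
		unsigned char c = (unsigned char) *src;

		if (!is_ascii_cntrl(c))
			*dst++ = *src;
	}
	*dst = '\0';
	return (size_t) (dst - s);
}
```

Casting to unsigned char first matters where plain char is signed;
otherwise bytes above \x7f would be passed around as negative values.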
 
> What's worse is that GTKG seems to attempt downloading such files by name
> instead of urn:sha1 which fails (of course?).

Maybe this doesn't really have anything to do with the encoding.

-- 
Christian

