> If "foo" is a US-ASCII string, "grep foo file" will work fine with any
> US-ASCII-superset charset for which non-ASCII characters do not use
> bytes < 0x80, including the hypothetical one I described, with no
> possibility of a false match. However "grep fóó file" will work only
> if the current shell charset (i.e. of argv[1]) matches the encoding of
> "file".

Not necessarily. It will work as long as the byte sequence you pass for fóó
(3 bytes in Latin-1, 5 in UTF-8) is the representation of the string you are
looking for in the file, in that file's encoding, whatever the shell's charset
happens to be. grep does not validate anything, nor should it IMHO. If you
want to guarantee the encoding, use a converter like ICU's uconv(1) or
iconv(1).
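
To make the byte-sequence point concrete, here is a rough Python sketch. It
only models a plain byte-for-byte substring search, which is my simplification
and not grep's actual implementation; the sample line and the names
(byte_match, file_latin1, and so on) are made up for illustration.

```python
# Simplified model: a match is just "these bytes occur in those bytes".
# byte_match and the sample data are hypothetical, for illustration only.

def byte_match(pattern: bytes, data: bytes) -> bool:
    """Return True if the pattern bytes occur anywhere in data."""
    return pattern in data

# The same string has different byte representations per encoding.
pattern_latin1 = "fóó".encode("latin-1")  # 3 bytes: 66 f3 f3
pattern_utf8 = "fóó".encode("utf-8")      # 5 bytes: 66 c3 b3 c3 b3

# A "file" whose contents are stored as Latin-1.
file_latin1 = "a line with fóó in it\n".encode("latin-1")

# Pattern bytes agree with the file's encoding: match.
same_encoding = byte_match(pattern_latin1, file_latin1)

# Pattern bytes come from a different encoding: no match.
other_encoding = byte_match(pattern_utf8, file_latin1)

# Converting the file first (the job iconv or uconv would do on the
# command line) makes the UTF-8 pattern match again.
file_utf8 = file_latin1.decode("latin-1").encode("utf-8")
after_conversion = byte_match(pattern_utf8, file_utf8)

print(same_encoding, other_encoding, after_conversion)
```

Nothing in the search itself knows or checks which charset is in play; only
the bytes matter, which is why converting beforehand is the reliable fix.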

YA

