Enrico Forestieri wrote:

> Currently, a converter path cannot contain non-ascii characters on systems
> not using utf8 as the local encoding. Moreover, arguments are also passed
> in utf8 encoding. This is clearly wrong and produces assertions, e.g. on
> Windows but also on Solaris when using a locale different from utf8.

This is indeed wrong. I thought I had converted all filenames given as
command-line arguments, but obviously I missed some.

> I had a look at the code and came to the conclusion that the FileName
> class with its toFilesystemEncoding method is not of help here. This
> is because we have to also deal with non-absolute paths and some
> arguments may also be in utf8.

Why non-absolute paths? In the past we were moving to absolute paths
wherever possible in order to get rid of the Path class, which is simply
too confusing to use.
And why would some arguments be in utf8? AFAIK command-line arguments should
always be in the encoding of the current locale.

> IMO we should have functions for directly converting from utf8 to the
> file system encoding and vice versa. I would like to hear opinions about
> this issue and, for discussion, attach here a patch solving these
> problems.

As I already wrote to Abdel last week, the fact that all our filename-related
code still uses std::string rather than docstring is only temporary. I have
not converted it yet because that would mean a lot of conversions, e.g. in
some bibtex-related code that cannot easily be converted to docstring.
Therefore I don't think that we should convert from/to utf8 directly. If a
function

std::string const to_filesystem8bit(docstring const & s);

is needed, then add it, but instead of converting directly from/to utf8 I
would rather change all filename-related code to docstring. I can do that
if you want, but not too soon (next week will be busy).
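
To make the idea concrete, here is a rough sketch of what such a function
could look like. It is only an illustration under assumptions: std::wstring
stands in for docstring (the real type differs), the names
docstring_sketch and to_filesystem8bit_sketch are hypothetical, and the
"file system encoding" is assumed to be the multibyte encoding of the
current C locale (LC_CTYPE), which is not necessarily true on every
platform.

```cpp
#include <cstddef>
#include <cwchar>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for docstring; the real type is not std::wstring.
using docstring_sketch = std::wstring;

// Convert s to the multibyte encoding of the current C locale (LC_CTYPE).
// Returns std::nullopt if some character cannot be represented in that
// encoding. Conversion stops at the first embedded NUL, which is fine for
// file names.
std::optional<std::string> to_filesystem8bit_sketch(docstring_sketch const & s)
{
    std::mbstate_t state = std::mbstate_t();
    wchar_t const * src = s.c_str();

    // First pass: ask only for the required length (terminator excluded).
    std::size_t const len = std::wcsrtombs(nullptr, &src, 0, &state);
    if (len == static_cast<std::size_t>(-1))
        return std::nullopt;

    // Second pass: do the actual conversion into a buffer of that size.
    std::vector<char> buf(len + 1);
    state = std::mbstate_t();
    src = s.c_str();
    std::wcsrtombs(buf.data(), &src, buf.size(), &state);
    return std::string(buf.data(), len);
}
```

A caller would have to select the locale first (for example with
std::setlocale(LC_CTYPE, "")) and would need to handle the empty result
explicitly instead of passing a broken name to the converter.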


Georg
