Enrico Forestieri wrote:

> Currently, a converter path cannot contain non-ascii characters on systems
> not using utf8 as the local encoding. Moreover, arguments are also passed
> in utf8 encoding. This is clearly wrong and produces assertions, eg. on
> Windows but also on Solaris when using a locale different from utf8.

This is wrong indeed. I thought that I had converted all filenames given as
commandline arguments, but obviously I missed some.

> I had a look at the code and come to the conclusion that the FileName
> class with its toFilesystemEncoding method is not of help here. This
> is because we have to also deal with non-absolute paths and some
> arguments may also be in utf8.

Why non-absolute paths? In the past we were moving to absolute paths
wherever possible in order to get rid of the Path class, which is simply too
confusing to use. And why would some arguments be in utf8? AFAIK commandline
arguments should always be in the current locale.

> IMO we should have functions for directly converting from utf8 to the
> file system encoding and vice versa. I would like to hear opinions about
> this issue and, for discussion, attach here a patch solving these
> problems.

As I already wrote to Abdel last week, the fact that all our filename-related
code still uses std::string and not docstring is only temporary. I did not
convert it yet because that would mean a lot of conversions, e.g. in some
bibtex-related code that cannot be converted to docstring easily. Therefore
I don't think that we should convert from/to utf8 directly. If a function

    std::string const to_filesystem8bit(docstring const & s);

is needed, then add it, but instead of direct conversion from/to utf8 I
would rather change all filename-related code to docstring. I can do that
if you want, but not too soon (the next week will be busy).

Georg
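
For illustration only, here is a minimal sketch of what a to_filesystem8bit
helper with the signature above might look like. It is not taken from the
patch or from the LyX sources. It assumes that docstring can be stood in for
by std::wstring (a hypothetical simplification; wchar_t is only 16 bits wide
on Windows) and that the file system encoding is the encoding of the current
C locale, so the conversion can go through std::wcrtomb.

```cpp
#include <climits>
#include <cwchar>
#include <stdexcept>
#include <string>

// Assumption for this sketch: a wide string stands in for the real docstring.
typedef std::wstring docstring;

// Convert s to the multibyte encoding of the current C locale.
// Throws if a character cannot be represented in that encoding.
std::string const to_filesystem8bit(docstring const & s)
{
	std::mbstate_t state = std::mbstate_t();
	std::string result;
	char buf[MB_LEN_MAX];
	for (docstring::const_iterator it = s.begin(); it != s.end(); ++it) {
		std::size_t const n = std::wcrtomb(buf, *it, &state);
		if (n == static_cast<std::size_t>(-1))
			throw std::runtime_error(
				"character not representable in the current locale");
		result.append(buf, n);
	}
	return result;
}
```

A caller would first select the user's locale (for example with
std::setlocale(LC_ALL, "")) and then pass the result of
to_filesystem8bit(name) to the converter command line. Throwing on
unrepresentable characters is one possible choice; a real implementation
might prefer to report the error differently.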