Jean-Marc Lasgouttes wrote:

>>>>>> "Georg" == Georg Baum <[EMAIL PROTECTED]> writes:
> 
> Georg> Currently we have several UNICODE FIXMEs in the code where we
> Georg> pass an utf8 encoded filename to an fstream and hope that it
> Georg> will work. This is wrong, because the encoding of filenames
> Georg> depends on the locale on linux. I don't know what it is on
> Georg> windows, I never use non-ascii filenames.
> 
> Does encoding of file depend on locale? Isn't it stored on the
> filesystem?

I am not sure. Under Linux the encoding is a mount parameter of some
filesystems (e.g. fat or ntfs). I believe that other filesystems like ext3
or reiserfs do not know anything about encodings: they store file names as
raw bytes, and the applications interpret those bytes according to the
locale. Further research is needed here, also on OS X and Windows. In the
worst case we need a preferences option that specifies how filenames are
encoded.
Anyway, it is clear that we need the notion of a (probably platform
specific) "filesystem encoding" in LyX, since it can be something other than
utf8.
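
To make the locale dependence concrete, here is a minimal sketch of how an
application on a POSIX system could ask which character set the current
locale uses. The helper name locale_codeset() is made up for illustration
and is not part of LyX; nl_langinfo() is POSIX only, so this does not answer
the Windows question, and OS X may need different handling.

```cpp
#include <clocale>
#include <string>
#include <langinfo.h>  // POSIX only, not available on Windows

// Hypothetical helper (not LyX code): returns the name of the character
// set of the user's locale, e.g. "UTF-8" or "ISO-8859-1". Under the
// assumption discussed above, this is the encoding in which applications
// would interpret raw file name bytes.
std::string locale_codeset()
{
    // Adopt the character type settings from the environment
    // (LANG, LC_ALL, LC_CTYPE). If this fails, the "C" locale stays active.
    std::setlocale(LC_CTYPE, "");
    char const * codeset = nl_langinfo(CODESET);
    return codeset ? std::string(codeset) : std::string();
}
```

Note that the call to setlocale() changes global state, so a real
implementation would probably query this once at startup.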

> Georg> - Split the existing FileName class into a general part that
> Georg> can be used for any filename, and a part that is specialized
> Georg> for files that appear in documents. This is done by the
> Georg> attached patch.
> 
> Why do you need to separate them?

Because the existing FileName class knows something of mangled file names
and relative paths related to some buffer directory. This is only useful
for file names that have something to do with documents.
IMO it is safer to separate this, because it then becomes impossible to ask
for the mangled name of a file that resides in the temp dir itself.
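
Roughly, the split could look like the sketch below. This is not the
attached patch: the class name DocFileName, the member names and the
simplistic mangling and relative path logic are assumptions made up for
illustration only.

```cpp
#include <algorithm>
#include <string>
#include <utility>

// General part: usable for any file, e.g. one inside the temp dir.
// It knows nothing about buffers, relative paths or mangling.
class FileName {
public:
    explicit FileName(std::string abs_filename)
        : name_(std::move(abs_filename)) {}
    std::string const & absFilename() const { return name_; }
    bool empty() const { return name_.empty(); }
private:
    std::string name_;
};

// Specialized part (hypothetical name): files that appear in documents.
class DocFileName : public FileName {
public:
    explicit DocFileName(std::string abs_filename)
        : FileName(std::move(abs_filename)) {}

    // Name relative to the buffer directory. Simplified stand-in: only
    // strips the prefix if the file lives below buffer_path.
    std::string relFilename(std::string const & buffer_path) const
    {
        std::string const prefix = buffer_path + '/';
        if (absFilename().compare(0, prefix.size(), prefix) == 0)
            return absFilename().substr(prefix.size());
        return absFilename();
    }

    // Name suitable for a copy in the temp dir. Simplified stand-in:
    // the real mangling rules are more involved.
    std::string mangledFilename() const
    {
        std::string mangled = absFilename();
        std::replace(mangled.begin(), mangled.end(), '/', '_');
        return mangled;
    }
};
```

With this layout a plain FileName simply has no mangledFilename() member, so
asking for the mangled name of a temp file is a compile error instead of a
runtime surprise.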

> Georg> - Convert both file name classes to docstring
> 
> OK.
> 
> Georg> - Store all filenames not as string or docstring, but in the
> Georg> FileName class and convert all functions that take a filename
> Georg> argument to the FileName class. That will eliminate a lot of
> Georg> assertions on absolute paths.
> 
> Do we _have_ to do that now?

No, we don't _have_ to do it now, but I think it is a good idea. We could of
course also leave everything as it is, introduce a
std::string const to_filesystem_encoding(docstring const &)
method and use that where needed. The problem with that approach is that we
will either miss some places that need to be converted, or have to look at
all places where filenames are used. If we need to do the latter, we might
as well do the type change, which has the added benefit that we do not
introduce a third encoding for std::string. Currently every std::string is
either ascii or utf8. If we now add a potentially different filesystem
encoding, we are asking for trouble IMHO.
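
To illustrate the type change, here is a rough sketch. It is not LyX code
and rests on several assumptions: std::u32string stands in for docstring,
the FsEncoding enum and the extra encoding parameter are invented for the
example, and only UTF-8 and Latin-1 are handled (with '?' as a crude
replacement for characters Latin-1 cannot represent).

```cpp
#include <string>
#include <utility>

// Stand-in for LyX's docstring (assumption: one code point per element).
typedef std::u32string docstring;

// Hypothetical set of filesystem encodings, for illustration only.
enum class FsEncoding { Utf8, Latin1 };

// Convert a file name to the bytes that are handed to the OS.
std::string to_filesystem_encoding(docstring const & s, FsEncoding enc)
{
    std::string out;
    for (char32_t c : s) {
        if (enc == FsEncoding::Latin1) {
            // Latin-1 covers U+0000..U+00FF only.
            out += c < 0x100 ? static_cast<char>(c) : '?';
        } else if (c < 0x80) {
            out += static_cast<char>(c);
        } else if (c < 0x800) {
            out += static_cast<char>(0xC0 | (c >> 6));
            out += static_cast<char>(0x80 | (c & 0x3F));
        } else if (c < 0x10000) {
            out += static_cast<char>(0xE0 | (c >> 12));
            out += static_cast<char>(0x80 | ((c >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (c & 0x3F));
        } else {
            out += static_cast<char>(0xF0 | (c >> 18));
            out += static_cast<char>(0x80 | ((c >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((c >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (c & 0x3F));
        }
    }
    return out;
}

// File names travel through the program as this type, never as a bare
// std::string, so the filesystem-encoded bytes only exist at the point
// where a file is actually opened.
class FileName {
public:
    explicit FileName(docstring abs_filename)
        : name_(std::move(abs_filename)) {}
    docstring const & absFilename() const { return name_; }
    std::string toFilesystemEncoding(FsEncoding enc) const
    {
        return to_filesystem_encoding(name_, enc);
    }
private:
    docstring name_;
};
```

A caller would then write something like
std::ifstream is(name.toFilesystemEncoding(enc).c_str());
and the compiler flags every place that still passes a raw string where a
FileName is expected.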


Georg
