Jean-Marc Lasgouttes wrote:
>>>>>> "Georg" == Georg Baum <[EMAIL PROTECTED]> writes:
>
> Georg> Currently we have several UNICODE FIXMEs in the code where we
> Georg> pass an utf8 encoded filename to an fstream and hope that it
> Georg> will work. This is wrong, because the encoding of filenames
> Georg> depends on the locale on linux. I don't know what it is on
> Georg> windows, I never use non-ascii filenames.
>
> Does encoding of file depend on locale? Isn't it stored on the
> filesystem?

I am not sure. Under Linux the encoding is a mount parameter of some
filesystems (e.g. fat or ntfs). I believe that other file systems like
ext3 or reiserfs do not know anything about encodings, and that
applications interpret the raw bytes according to the locale. Further
research is needed here, also on OS X and Windows. In the worst case we
need a preferences option that specifies how filenames are encoded.
Anyway, it is clear that we need the notion of a (probably platform
specific) "filesystem encoding" in LyX, since it can be something other
than utf8.

> Georg> - Split the existing FileName class into a general part that
> Georg> can be used for any filename, and a part that is specialized
> Georg> for files that appear in documents. This is done by the
> Georg> attached patch.
>
> Why do you need to separate them?

Because the existing FileName class knows about mangled file names and
relative paths related to some buffer directory. This is only useful for
file names that have something to do with documents. IMO it is safer to
separate this, because it is then impossible to get the mangled name of
a file that resides in the temp dir itself.

> Georg> - Convert both file name classes to docstring
>
> OK.
>
> Georg> - Store all filenames not as string or docstring, but in the
> Georg> FileName class and convert all functions that take a filename
> Georg> argument to the FileName class. That will eliminate a lot of
> Georg> assertions on absolute paths.
>
> Do we _have_ to do that now?

No, we don't _have_ to do it now, but I think it is a good idea. We
could of course also leave everything as it is, introduce a
std::string const to_filesystem_encoding(docstring const &) function,
and use that where needed. The problem with that approach is that we
will either miss some places that need to be converted, or have to look
at all places where filenames are used.
If we need to do the latter, we might as well do the type change, which
has the added benefit that we do not introduce a third encoding for
std::string. Currently every std::string is either ascii or utf8. If we
now add a potentially different filesystem encoding, we are asking for
trouble IMHO.


Georg
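
To make the type-change argument concrete, here is a minimal sketch and
not the attached patch. Everything in it is an assumption made for
illustration: the FsEncoding enum, the member names, the POSIX-only
absolute-path check, and the use of a utf8 std::string as internal
storage instead of docstring. ISO-8859-1 stands in for "some filesystem
encoding that is not utf8"; a real implementation would determine the
encoding per platform and would probably use a conversion library.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical set of filesystem encodings. A real implementation would
// query the platform or a preference instead of hard-coding two values.
enum class FsEncoding { Utf8, Latin1 };

// Convert a utf8 string to the given filesystem encoding.
// Only pass-through utf8 and ISO-8859-1 are handled in this sketch.
std::string to_filesystem_encoding(std::string const & utf8, FsEncoding enc)
{
    if (enc == FsEncoding::Utf8)
        return utf8;

    std::string out;
    for (std::size_t i = 0; i < utf8.size(); ++i) {
        unsigned char const c = static_cast<unsigned char>(utf8[i]);
        if (c < 0x80) {
            // Plain ascii is identical in both encodings.
            out += static_cast<char>(c);
        } else if ((c == 0xC2 || c == 0xC3) && i + 1 < utf8.size()
                   && (static_cast<unsigned char>(utf8[i + 1]) & 0xC0) == 0x80) {
            // Two-byte sequence for U+0080..U+00FF: fits in one Latin-1 byte.
            unsigned char const c2 = static_cast<unsigned char>(utf8[++i]);
            out += static_cast<char>(((c & 0x03) << 6) | (c2 & 0x3F));
        } else {
            throw std::runtime_error("not representable in ISO-8859-1");
        }
    }
    return out;
}

// General file name class: always absolute, always utf8 internally.
class FileName {
public:
    explicit FileName(std::string const & abs_utf8) : name_(abs_utf8)
    {
        // Checked once here, so functions that take a FileName do not
        // need their own assertion. POSIX-style paths only in this sketch.
        assert(!name_.empty() && name_[0] == '/');
    }

    // The utf8 name, for display and for storing in documents.
    std::string const & absFilename() const { return name_; }

    // The only way to get bytes suitable for fstream and friends.
    std::string toFilesystemEncoding(FsEncoding enc) const
    {
        return to_filesystem_encoding(name_, enc);
    }

private:
    std::string name_;
};
```

With this shape, code that opens a file has to go through
toFilesystemEncoding(), so a forgotten conversion becomes a missing call
on a distinct type rather than a std::string in the wrong encoding that
nobody notices.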
