On Mon, 2009-08-17 at 01:23 +0300, Abdulaziz Ghuloum wrote:
> BTW, I just remembered that on some (all?) operating systems, file
> paths need not map directly to unicode, so, some procedures may die
> unexpectedly (e.g., if directory-list receives a name that cannot be
> utf-8-decoded) and some files may not be accessible (since we use
> unicode strings to represent path names).

If the UTF-8 decoding error-handling mode is not 'raise, then bytes that
cannot be decoded will be silently replaced or ignored.

Maybe everything which takes paths should also accept bytevectors? If a
bytevector is given, it's passed to C directly. And everything which
returns paths should return a bytevector when the name cannot be
UTF-8-decoded. But this complicates things for everyone, because you'd
need to check whether you've got a string or a bytevector.

So, instead... Maybe there should be a pathname-transcoder parameter? If
the Latin-1 codec is used, every byte is valid. The default for the
parameter could be UTF-8, and users who know they might need to can
parameterize it to Latin-1 and take responsibility for interpreting the
resulting strings that way.

Maybe there's some way to know what codec the OS uses for path names, so
that Ikarus can initialize the default to that?

-- 
: Derick
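
To illustrate the Latin-1 point above, here is a minimal sketch in Python
(not Ikarus code; the file name and the decode_path helper are
hypothetical, chosen only for the example). It shows a name that strict
UTF-8 rejects, that 'replace'-style handling decodes lossily, and that
Latin-1 round-trips byte for byte.

```python
# Hypothetical file name as raw bytes. A lone 0xE9 byte is not valid UTF-8.
raw_name = b"caf\xe9.txt"


def decode_path(raw: bytes, codec: str = "utf-8") -> str:
    """Decode OS path bytes with the given codec (hypothetical helper)."""
    return raw.decode(codec, errors="strict")


# Strict UTF-8 decoding fails, which is the "procedure dies" case.
try:
    decode_path(raw_name)
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# With a "replace" error mode the original byte is silently lost,
# so the resulting string no longer names the file on disk.
lossy = raw_name.decode("utf-8", errors="replace")

# Latin-1 maps every byte to the code point with the same value,
# so decoding never fails and encoding restores the exact bytes.
as_latin1 = decode_path(raw_name, "latin-1")
round_trip = as_latin1.encode("latin-1")

# The same holds for all 256 possible byte values.
all_bytes = bytes(range(256))
all_round_trip = all_bytes.decode("latin-1").encode("latin-1")
```

The trade-off is the one described above: a Latin-1 string always gets
back to the original bytes, but the caller has to remember that its
characters are really bytes and may not be the text a user would expect
to see.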
