Re: [ikarus-users] path name encoding

Derick Eddington Mon, 17 Aug 2009 12:37:04 -0700

On Mon, 2009-08-17 at 01:23 +0300, Abdulaziz Ghuloum wrote:
> BTW, I just remembered that on some (all?) operating systems, file  
> paths need not map directly to unicode, so, some procedures may die  
> unexpectedly (e.g., if directory-list receives a name that cannot be  
> utf-8-decoded) and some files may not be accessible (since we use  
> unicode strings to represent path names).


If the error-handling mode used for UTF-8 decoding is not 'raise, then
undecodable bytes will be silently replaced or ignored instead of
raising an error.
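
To make the three modes concrete, here is a rough sketch in Python
rather than Scheme. It assumes Python's "strict", "replace", and
"ignore" decoding handlers behave roughly like the R6RS 'raise,
'replace, and 'ignore modes. The file name and the helper decode_name
are made up for illustration:

```python
# Hypothetical file name whose bytes are Latin-1, not valid UTF-8.
raw_name = b"caf\xe9.txt"

def decode_name(raw, errors):
    """Decode raw path bytes as UTF-8 with the given error handler."""
    try:
        return raw.decode("utf-8", errors=errors)
    except UnicodeDecodeError:
        return None  # stands in for the procedure dying on a bad name

strict_result = decode_name(raw_name, "strict")     # like 'raise: decoding fails
replaced_result = decode_name(raw_name, "replace")  # like 'replace: U+FFFD substituted
ignored_result = decode_name(raw_name, "ignore")    # like 'ignore: bad byte dropped
```

In the last two cases the resulting string no longer maps back to the
original bytes, so the file still could not be opened through it.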

Maybe everything that takes paths should also accept bytevectors?  A
bytevector would be passed to C directly, and everything that returns
paths would return a bytevector whenever the name cannot be
UTF-8-decoded.  But this complicates things for everyone, because you'd
need to check whether you've got a string or a bytevector.  So,
instead...
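
A small sketch of what that mixed interface would mean for callers,
again in Python as a stand-in (bytes playing the role of bytevectors).
Both helper names are hypothetical, not part of Ikarus or any library:

```python
def decode_or_keep(raw):
    """Return a str when raw is valid UTF-8, else return the bytes unchanged."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return raw

def to_os_bytes(path):
    """Accept either representation and produce the bytes to hand to the OS."""
    if isinstance(path, bytes):
        return path  # already raw: pass through untouched
    return path.encode("utf-8")

# One decodable name and one that is not valid UTF-8.
names = [decode_or_keep(b"notes.txt"), decode_or_keep(b"caf\xe9.txt")]
kinds = [type(n).__name__ for n in names]
```

Every consumer of names now has to handle two types, which is the
complication mentioned above.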

Maybe there should be a pathname-transcoder parameter?  With the Latin-1
codec every byte is valid, so decoding can never fail.  The default for
the parameter could be UTF-8, and users who know they might need to can
parameterize it to Latin-1 and take responsibility for interpreting the
resulting strings themselves.  Maybe there's some way to find out which
codec the OS uses for path names, so Ikarus can initialize the default
to that?
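
The Latin-1 property can be checked with a short sketch, once more in
Python as an analogy and not Ikarus code. The sample file name is made
up:

```python
# Every possible byte value, as a stand-in for an arbitrary path name.
all_bytes = bytes(range(256))

as_string = all_bytes.decode("latin-1")   # never fails: byte n maps to code point n
round_trip = as_string.encode("latin-1")  # recovers the exact original bytes

# A UTF-8 name read through Latin-1 is lossless, but it looks wrong
# until the caller reinterprets it.
utf8_name = "na\u00efve.txt".encode("utf-8")
seen_as_latin1 = utf8_name.decode("latin-1")
reinterpreted = seen_as_latin1.encode("latin-1").decode("utf-8")
```

The string is only a carrier for the bytes here, which is what "be
responsible for interpreting the strings that way" amounts to.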

-- 
: Derick