On Tue, 2004-01-20 at 09:06, Bjoern JACKE wrote:
> Hi,
> 
> there has been discussion about charset conversion in JFS once or 
> twice before, for example on 
> http://www-124.ibm.com/developerworks/bugs/?func=detailbug&bug_id=3387&group_id=35
> I want to bring this topic up once more. JFS is the only POSIX 
> filesystem which enforces a certain encoding (like Redmond's 
> filesystems do), which makes the iocharset mount option necessary. 
> This mount option, however, brings up many problems. For example, it's 
> not possible to use multiple encodings on one filesystem, which is a 
> shame for a multiuser OS. 

This is a shame.  JFS was designed on OS/2, where it was possible
for the kernel to convert to the user's charset, since the process's
locale was available to the kernel.  On Linux, JFS has no way of
knowing the process's charset.

> Even if all people use the same encoding, 
> let's say UTF-8, problems come up when there are RPMs or any kind of 
> archives which bring files whose names are not validly UTF-8 encoded, 
> maybe in latin1 or EUC_JP.
> These files simply cannot be created on JFS, which leads to 
> unforeseeable problems. Even if one used an encoding like 
> iocharset=iso8859-1, which is 8-bit, UTF-8 names cannot be created 
> because not all 255 possible characters are valid in iso8859-1. 

Actually, I believe that the iso8859-1 charset was fixed in the 2.6
kernel so that all 8-bit characters are represented.

> A nasty 
> workaround to get a sane Unix-like filesystem, where all kinds of 
> filenames can be created (except for slash and null), is to use an 
> 8-bit encoding like cp850 with iocharset, where all 255 characters are 
> defined, and thus UTF-8 and all other kinds of byte sequences are 
> valid. This will not make the filenames appear correct inside (in 
> UTF-16), but it will lead to sane behaviour.
> Having to use such a trick to get a sanely behaving JFS is not very 
> nice. Having a correct representation of the filenames inside in 
> UTF-16 would, by the way, only make sense if the filesystem is, for 
> example, shared with an OS/2 system.
> Wouldn't it make sense to do charset conversion only if this is 
> explicitly wished by the user, for example only if a convert mount 
> option was given? Otherwise JFS should just behave like any other 
> POSIX filesystem and accept any sequence of bytes, except for slash 
> and null. It should just map byte by byte to UTF-16 by default and 
> not care about encodings. IMHO for a Unix filesystem this is a major 
> design bug.

I believe the new-and-improved iso8859-1 charset provides a good
solution.  Is the fix in the 2.6 kernel sufficient, or does someone need
to push the iso8859-1 fix back to the 2.4 kernel?
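
To illustrate why a complete 8-bit table should be enough, here is a
rough user-space sketch in Python.  It is not JFS or kernel code, and
the names store_name, load_name and is_valid_utf8 are made up for this
example.  It assumes the charset maps byte value N to Unicode code
point N for every byte, which is what Python's latin-1 codec does.

```python
def store_name(raw: bytes) -> bytes:
    """Map each byte to the UTF-16 code unit with the same value.

    Hypothetical helper: models a byte-by-byte mapping with a full
    8-bit table, not the actual JFS on-disk conversion.
    """
    if b"/" in raw or b"\x00" in raw:
        raise ValueError("filenames may not contain slash or null")
    return raw.decode("latin-1").encode("utf-16-le")


def load_name(stored: bytes) -> bytes:
    """Inverse of store_name: recover the original byte sequence."""
    return stored.decode("utf-16-le").encode("latin-1")


def is_valid_utf8(raw: bytes) -> bool:
    """Report whether a strict UTF-8 decoder would accept the name."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


names = [
    b"Bj\xc3\xb6rn",                          # UTF-8 encoded name
    b"Bj\xf6rn",                              # same name in latin1
    b"\xc6\xfc\xcb\xdc",                      # high bytes, as in an EUC_JP name
    bytes(range(1, 256)).replace(b"/", b""),  # every byte except null and slash
]

for raw in names:
    # Every name survives the round trip unchanged.
    assert load_name(store_name(raw)) == raw

# A strict UTF-8 conversion, by contrast, rejects the latin1 name.
assert not is_valid_utf8(b"Bj\xf6rn")
```

With such a mapping the names stored in UTF-16 are only meaningful if
the bytes really were latin1, but any byte sequence can be created and
read back unchanged.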

> 
> Björn
-- 
David Kleikamp
IBM Linux Technology Center

