Re: Multi-byte locales

2005-10-19 Thread Stephen Crawley

Tom Tromey wrote:

> Florian == Florian Weimer [EMAIL PROTECTED] writes:
>
> Florian> On a related note, is it possible to access the command line
> Florian> as an array of byte arrays?
>
> Nope.


To elaborate: as far as I know, none of Sun's Java implementations since
(at least) JDK 1.1 has offered a way to get at the command-line arguments
as bytes.  I would imagine that Sun's view is that 1) this is unnecessary
and 2) it may not be implementable on some platforms.


IMO, it would be a bad idea to add such functionality to Classpath (as GNU
extensions to Java), or to individual Java VMs (as VM-specific extensions
to Java).  The reasons are 1) and 2) above, and the fact that any
significant unofficial extension to the Java platform tends to cause
problems when porting Java applications across Java platforms, and hence
leads to platform lock-in.


-- Steve






___
Classpath mailing list
Classpath@gnu.org
http://lists.gnu.org/mailman/listinfo/classpath


Re: Multi-byte locales

2005-10-18 Thread Michael Koch
On Tue, Oct 18, 2005 at 12:11:30AM +0200, Florian Weimer wrote:
> It seems that with Sun's JDK, some files are inaccessible if you run
> in a multi-byte locale (something which uses UTF-8, for example)
> because it's not possible to specify a UTF-16 string which is encoded
> to the name of the file you are interested in, provided that the file
> has a name which is not a valid character sequence in the current
> locale.  (This is not a big deal on UCS-2/UTF-16 platforms, but I'm not
> really sure what the designers thought when they tried to fit this
> model on UNIX.)
>
> Is there some GNU extension which can work around this issue?  On a
> related note, is it possible to access the command line as an array of
> byte arrays?

There is no GNU extension (yet) that can work around this that I'm aware
of. For the other problem, try the following untested pseudo-code.

You get the arguments as String[] args:

byte[][] data = new byte[args.length][];
for (int i = 0; i < args.length; i++) {
  // getBytes() encodes with the platform's default charset
  data[i] = args[i].getBytes();
}


Cheers,
Michael
-- 
Escape the Java Trap with GNU Classpath!
http://www.gnu.org/philosophy/java-trap.html

Join the community at http://planet.classpath.org/




Re: Multi-byte locales

2005-10-18 Thread Florian Weimer
* Michael Koch:

> There is no GNU extension (yet) that can work around this that I'm aware
> of.

Do you think this (i.e. non-accessible files) is a problem at all?

> You get the arguments as String[] args:
>
> byte[][] data = new byte[args.length][];
> for (int i = 0; i < args.length; i++) {
>   data[i] = args[i].getBytes();
> }

That doesn't really work: those strange bytes have already been dropped
by the startup code. 8-(
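
To illustrate why the bytes cannot be recovered from a String, here is a
minimal sketch. It assumes a modern JDK (java.nio.charset.StandardCharsets,
Java 7 or later) and uses a hypothetical class name, LossyRoundTrip. It only
shows what String decoding does with malformed UTF-8; what a particular VM
launcher does with such command-line bytes may differ (it may drop them
rather than replace them).

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper, not part of Classpath or the JDK.
final class LossyRoundTrip {
    // Decode raw bytes as UTF-8, then encode the resulting String back to UTF-8.
    // Malformed input is replaced with U+FFFD during decoding, so the
    // original bytes cannot be reconstructed afterwards.
    static byte[] roundTrip(byte[] raw) {
        String decoded = new String(raw, StandardCharsets.UTF_8);
        return decoded.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "cafe" with an ISO-8859-1 e-acute (0xE9): not a valid UTF-8 sequence.
        byte[] raw = {'c', 'a', 'f', (byte) 0xE9};
        byte[] back = roundTrip(raw);
        System.out.println("identical after round trip: " + Arrays.equals(raw, back));
    }
}
```

Once the invalid byte has become U+FFFD, calling getBytes() on the argument
can only return the encoding of the replacement character, not the byte
the user typed.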




Re: Multi-byte locales

2005-10-18 Thread Michael Koch
On Tue, Oct 18, 2005 at 09:29:53AM +0200, Florian Weimer wrote:
> * Michael Koch:
>
>> There is no GNU extension (yet) that can work around this that I'm aware
>> of.
>
> Do you think this (i.e. non-accessible files) is a problem at all?

Do filenames with such weird characters really exist out there in the
wild? We should at least document clearly that this is not supported
yet. I don't think this is a real problem for now. We can just tell them
to rename the files if needed, and it will work.


Cheers,
Michael




Re: Multi-byte locales

2005-10-18 Thread Florian Weimer
* Michael Koch:

> On Tue, Oct 18, 2005 at 09:29:53AM +0200, Florian Weimer wrote:
>> * Michael Koch:
>>
>>> There is no GNU extension (yet) that can work around this that I'm aware
>>> of.
>>
>> Do you think this (i.e. non-accessible files) is a problem at all?
>
> Do filenames with such weird characters really exist out there in the
> wild?

It can easily happen if you have two users with different LC_CTYPE
settings.
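
As a rough illustration of the LC_CTYPE mismatch, the following sketch
checks whether a raw file name decodes cleanly in a given charset. The
class and method names (NameValidity, isValidIn) are made up for this
example, and it assumes a JDK with java.nio.charset.StandardCharsets.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;

// Hypothetical helper, not part of Classpath or the JDK.
final class NameValidity {
    // Returns true if the raw name bytes are a valid character sequence
    // in the given charset. REPORT makes the decoder throw on bad input
    // instead of silently substituting a replacement character.
    static boolean isValidIn(Charset charset, byte[] rawName) {
        try {
            charset.newDecoder()
                   .onMalformedInput(CodingErrorAction.REPORT)
                   .onUnmappableCharacter(CodingErrorAction.REPORT)
                   .decode(ByteBuffer.wrap(rawName));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }
}
```

A name created by a user in an ISO-8859-1 locale that contains, say, an
e-acute (byte 0xE9) passes this check for ISO-8859-1 but fails it for
UTF-8, which is the situation a second user in a UTF-8 locale runs into.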

Most users I know avoid non-ASCII characters in file names like hell,
but this is by no means a universal standard.

> I don't think this is a real problem for now. We can just tell them
> to rename the files if needed, and it will work.

This depends on what your application does. 8-/




Re: Multi-byte locales

2005-10-18 Thread Tom Tromey
 Florian == Florian Weimer [EMAIL PROTECTED] writes:

Florian> Is there some GNU extension which can work around this issue?

Nope.  You could do it some hacky way, e.g. exec a second VM with
different locale settings.  Eww...
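
A rough sketch of that hack, for what it is worth. Everything here is an
assumption rather than an existing Classpath facility: the class name
RelaunchInLocale, the child main class, and the locale name (a single-byte
locale such as "en_US.ISO-8859-1" has to be installed on the system for
this to have any effect). The idea is that in a single-byte locale every
byte maps to some character, so the child VM can name any file. Note that
the parent still cannot pass the problematic names along as Strings; the
child has to do the file work itself.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Classpath or the JDK.
final class RelaunchInLocale {
    // Builds (but does not start) a child VM launch with LC_ALL overridden.
    static ProcessBuilder childVm(String locale, String mainClass, List<String> args) {
        String javaBin = System.getProperty("java.home")
                + File.separator + "bin" + File.separator + "java";
        List<String> cmd = new ArrayList<>();
        cmd.add(javaBin);
        cmd.add("-cp");
        cmd.add(System.getProperty("java.class.path"));
        cmd.add(mainClass);
        cmd.addAll(args);

        ProcessBuilder pb = new ProcessBuilder(cmd);
        // Only the child's environment changes; the parent keeps its locale.
        pb.environment().put("LC_ALL", locale);
        pb.inheritIO();
        return pb;
    }
}
```

Calling start() on the returned ProcessBuilder launches the second VM.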

Florian> On a related note, is it possible to access the command line
Florian> as an array of byte arrays?

Nope.

Tom




Multi-byte locales

2005-10-17 Thread Florian Weimer
It seems that with Sun's JDK, some files are inaccessible if you run
in a multi-byte locale (something which uses UTF-8, for example)
because it's not possible to specify a UTF-16 string which is encoded
to the name of the file you are interested in, provided that the file
has a name which is not a valid character sequence in the current
locale.  (This is not a big deal on UCS-2/UTF-16 platforms, but I'm not
really sure what the designers thought when they tried to fit this
model on UNIX.)

Is there some GNU extension which can work around this issue?  On a
related note, is it possible to access the command line as an array of
byte arrays?

