Everyone,

There was a very interesting question about character encoding in the user group. Jorn and I appear to have narrowed the issue down to the console code page being unable to display the characters; however, the same problem may also affect input when < and > redirection is used to feed the parsers through stdin or to capture their stdout. The default encoding on Windows is an ANSI code page. I'm not sure what it is on the Mac or Linux platforms, and in any case today's default may not hold tomorrow. So we may need to propose a way of wrapping the System.in and System.out streams so that they decode and encode with the proper character encoding.
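
To make the wrapping idea concrete, here is a minimal sketch, for illustration only. The class name EncodedStdio, its helper methods, and the way a -encoding argument is picked out of the command line are all assumptions made up for this example; only the java.io and java.nio.charset calls are standard library. It decodes standard input and encodes standard output with one explicitly named charset instead of relying on the platform default.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.PrintStream;
import java.io.UncheckedIOException;
import java.io.UnsupportedEncodingException;
import java.nio.charset.Charset;

// Hypothetical helper class; the name and methods are illustrative only.
final class EncodedStdio {
    private EncodedStdio() {}

    // Wrap a raw byte stream in a reader that decodes with the named charset.
    static BufferedReader reader(InputStream in, String encoding) {
        return new BufferedReader(new InputStreamReader(in, Charset.forName(encoding)));
    }

    // Wrap a raw byte stream in an auto-flushing PrintStream that encodes
    // with the named charset.
    static PrintStream printer(OutputStream out, String encoding) {
        try {
            return new PrintStream(out, true, Charset.forName(encoding).name());
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unsupported encoding: " + encoding, e);
        }
    }

    // Copy text line by line, decoding and re-encoding with the same charset.
    // A real tool would run its parser or tokenizer on each line instead.
    static void copyLines(InputStream rawIn, OutputStream rawOut, String encoding) {
        BufferedReader in = reader(rawIn, encoding);
        PrintStream out = printer(rawOut, encoding);
        try {
            String line;
            while ((line = in.readLine()) != null) {
                out.println(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        out.flush();
    }

    public static void main(String[] args) {
        // Fall back to the platform default when no -encoding argument is given.
        String encoding = Charset.defaultCharset().name();
        for (int i = 0; i + 1 < args.length; i++) {
            if ("-encoding".equals(args[i])) {
                encoding = args[i + 1];
            }
        }
        copyLines(System.in, System.out, encoding);
    }
}
```

Run as, say, java EncodedStdio -encoding UTF-8 < in.txt > out.txt, the redirected bytes would be treated as UTF-8 on both sides, whatever the platform default happens to be.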
What I'm proposing is a general input/output class that wraps System.in and System.out to handle the proper character encoding. Unfortunately, this means we may want to add a -encoding parameter to the parsers, tokenizers, etc. so that the encoding used for I/O can be specified.

There is a simple way to handle the output, using the method below:

---
http://www.velocityreviews.com/forums/t137667-changing-system-out-encoding.html
<quote>
PrintStream out = new PrintStream(System.out, true, "ISO-8859-1");
out.println("\u00E0\u00E1\u00E2\u00E9\u00EA\u00EB");
</quote>
---

Of course, we would have to use the proper encoding; ISO-8859-1 is just the example from that post. The other link is background reading:

---
http://illegalargumentexception.blogspot.com/2009/05/java-rough-guide-to-character-encoding.html
---

I don't see this being a major problem right now, but we may need to either look for other methods or risk losing added support for, say, Arabic, Chinese, or Japanese.

Any comments, suggestions, or rants welcome.

James
