Problem in Process Command Line Arguments
-----------------------------------------
Key: JRUBY-2677
URL: http://jira.codehaus.org/browse/JRUBY-2677
Project: JRuby
Issue Type: Bug
Components: Interpreter
Affects Versions: JRuby 1.1.2
Environment: WinXP SP2
Reporter: Tsing
I'm a Chinese user, and the language of my system is Simplified Chinese.
For example, here is a test.rb:
#------------------------------
ARGV.each do |arg|
  puts "#{arg}\n"
end
#-----------------------------
I will use @ in place of each Simplified Chinese character (in case your
system does not support Chinese).
Now test.rb is under D:\@@\
The command is: jruby.bat D:\@@\test.rb arg1 @@@ arg3
JRuby then told me that it can't find D:\??\test.rb. As you can see, JRuby
replaced each Chinese character with a ?, which results in an error.
Next I moved test.rb to D:\test\ and executed: jruby.bat D:\test\test.rb
arg1 @@@ arg3
Now the output is:
arg1
???
arg3
It seems that this bug exists in every case involving command line arguments.
Yesterday I downloaded the source code and found something that may be useful
(I'm using another machine now, so I can only write this down from memory):
In org\jruby\utils\RubyFile.java there is a createfile method. After
"filepath=new String(filepath.getBytes("ISO-8859-1"),"UTF-8")", the Chinese
characters in the String "filepath" are replaced with '?'. I don't understand
the purpose of this line of code, but I know that "ISO-8859-1" is an encoding
used mainly for English and other Western European languages. When there is a
Chinese character in "filepath", "getBytes()" can't represent it in
ISO-8859-1, so it replaces it with '?' in the returned bytes.
Then I commented out this line, and it worked! JRuby can now process .rb files
whose path includes Chinese characters. But maybe this line of code is needed
somewhere else, which I hope you can tell me.
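The lossy conversion described above can be reproduced outside JRuby. This is
a minimal sketch that uses only standard java.nio.charset classes. The class
and method names (Latin1RoundTrip, reinterpret) are made up for illustration
and are not taken from the JRuby source. The second case is only a guess at
why such a line might exist: it assumes the string holds raw UTF-8 bytes
stored one byte per char, in which case the same conversion does recover the
text.

```java
import java.nio.charset.StandardCharsets;

public class Latin1RoundTrip {
    // Same conversion as the line quoted in the report (hypothetical wrapper).
    static String reinterpret(String s) {
        return new String(s.getBytes(StandardCharsets.ISO_8859_1),
                          StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // A path containing two Chinese characters, written as Unicode
        // escapes so the source file encoding does not matter.
        String path = "D:\\\u4e2d\u6587\\test.rb";

        // ISO-8859-1 cannot represent the Chinese characters, so getBytes
        // substitutes '?' for each of them before the UTF-8 decode runs.
        System.out.println(reinterpret(path)); // prints D:\??\test.rb

        // Assumed alternative input: UTF-8 bytes held one byte per char.
        String bytesAsChars = new String(
                "\u4e2d\u6587".getBytes(StandardCharsets.UTF_8),
                StandardCharsets.ISO_8859_1);

        // Here every char fits in ISO-8859-1, so the conversion is lossless.
        System.out.println(reinterpret(bytesAsChars).equals("\u4e2d\u6587")); // prints true
    }
}
```

If that guess is right, the conversion is only safe for strings that already
hold bytes-as-chars, and it destroys any string that holds real Unicode
characters above U+00FF.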
Now the only remaining problem is the arguments.
I traced it from org\jruby\RubyGlobal.java to build_lib\ByteList.jar, for
which I can't find the source code (why?). I'm now sure the problem lies in
ByteList.jar, and it should be something like "getBytes("ISO-8859-1")".
In org\jruby\RubyGlobal.java there is a RubyArray called argvArray. The
program adds every argument to it, then defines it as "ARGV". Right after
"argvArray.add(runtime.newString(argv[i]))", the Chinese characters in
argv[i] have been replaced with '?'. I tried to follow runtime.newString
further but found nothing, because I could only trace it as far as
ByteList.jar.
Then I replaced that line with simply "argvArray.add(argv[i])". On my WinXP
SP2 it still didn't work (using the same example, the output text is arg1\n
<TextOfAMess>\n arg3\n), but things are getting better because there are no
longer any '?'s. Later I restarted my computer into Linux, whose locale is
UTF-8, and to my surprise everything was OK. Since my WinXP locale is GB2312
(Simplified Chinese), maybe JRuby converted the arguments in ARGV to UTF-8,
and the problem on XP is just that XP can't display them correctly.
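The display problem guessed at above (text written in one encoding and read
in another) can be sketched as follows. This is an illustration under
assumptions, not a trace of what JRuby does: the class and method names are
hypothetical, and GBK stands in for the GB2312 console locale only if the
running JVM supports it, with ISO-8859-1 as a fallback.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {
    // Encode text with one charset, then decode the bytes with another,
    // as happens when a program and its console disagree on the encoding.
    static String decodeAs(String text, Charset written, Charset read) {
        return new String(text.getBytes(written), read);
    }

    public static void main(String[] args) {
        String arg = "\u4e2d\u6587"; // two Chinese characters

        // Assumption: GBK approximates a GB2312 console; not every JVM
        // ships it, so fall back to a charset that is always present.
        Charset console = Charset.isSupported("GBK")
                ? Charset.forName("GBK")
                : StandardCharsets.ISO_8859_1;

        // UTF-8 bytes read by a non-UTF-8 console: no '?', but garbled text.
        String garbled = decodeAs(arg, StandardCharsets.UTF_8, console);
        System.out.println(garbled.equals(arg)); // prints false

        // UTF-8 bytes read by a UTF-8 console: the text survives intact.
        String intact = decodeAs(arg, StandardCharsets.UTF_8, StandardCharsets.UTF_8);
        System.out.println(intact.equals(arg)); // prints true
    }
}
```

This would be consistent with the observation that the output is garbled
rather than '?' on the GB2312 system and correct on the UTF-8 one, but it
does not confirm where in JRuby the conversion takes place.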
But if I use an argument as a path, a more complex problem occurs.
For example: jruby.bat test2.rb "D:\test\@@\readme.txt"
When I tried to read the contents of "readme.txt" in test2.rb, even on
Linux, JRuby told me it can't find D:\test\<TextOfAMess>\readme.txt.
Then I had to stop, because things were worse than I had expected.
My English is poor, and I hope you can understand what I'm writing about.
Thanks for reading, and thanks for your help.