Problem in Process Command Line Arguments
-----------------------------------------
Key: JRUBY-2677
URL: http://jira.codehaus.org/browse/JRUBY-2677
Project: JRuby
Issue Type: Bug
Components: Interpreter
Affects Versions: JRuby 1.1.2
Environment: WinXP SP2
Reporter: Tsing

I'm a Chinese user, and the language of my system is Simplified Chinese.

For example, here is a test.rb:

#------------------------------
ARGV.each do |arg|
  puts "#{arg}\n"
end
#-----------------------------

I will use @ in place of every Simplified Chinese character (in case your system has no Chinese support). Suppose test.rb is under D:\@@\. The command is:

jruby.bat D:\@@\test.rb arg1 @@@ arg3

JRuby then told me that it can't find D:\??\test.rb. As you can see, JRuby replaced the Chinese characters with the same number of ?s, which causes the error.

I then moved test.rb to D:\test\ and executed:

jruby.bat D:\test\test.rb arg1 @@@ arg3

Now the output is:

arg1
???
arg3

It seems that this bug exists in every case involving command line arguments.

Yesterday I downloaded the source code and found something that may be useful (I'm using another machine now, so I can only write it down from memory). In org\jruby\utils\RubyFile.java there is a createfile method. After the line

filepath=new String(filepath.getBytes("ISO-8859-1"),"UTF-8")

the Chinese characters in the String "filepath" are replaced with '?'. I don't understand the purpose of this line, but I know that ISO-8859-1 is the encoding mainly used in English-speaking countries. When there is a Chinese character in "filepath", getBytes() can't represent it in ISO-8859-1 and replaces it with '?' in the returned bytes. I commented this line out, and it worked! JRuby can now process .rb files whose path includes Chinese characters. But maybe this line is needed somewhere else, which I hope you can tell me.

Now the only remaining problem is the arguments.
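The '?' replacement described above can be reproduced outside JRuby. What follows is only a minimal sketch, not JRuby's code: the class and method names (Latin1RoundTrip, reinterpret) are hypothetical and the two Chinese characters are sample text. It also shows one situation in which the same line is harmless, namely when the String already holds UTF-8 bytes stored one byte per char. That may be why the line exists, but this is a guess.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of JRuby.
class Latin1RoundTrip {

    // Same transformation as the line quoted in the report:
    // encode as ISO-8859-1, then decode those bytes as UTF-8.
    static String reinterpret(String s) {
        byte[] latin1 = s.getBytes(StandardCharsets.ISO_8859_1);
        return new String(latin1, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String chinese = "\u4e2d\u6587"; // two sample Chinese characters
        String path = "D:\\" + chinese + "\\test.rb";

        // ISO-8859-1 cannot represent the Chinese characters, so each becomes '?'.
        System.out.println(reinterpret(path)); // prints D:\??\test.rb

        // If the String instead holds UTF-8 bytes stored one byte per char,
        // the same transformation recovers the original text.
        String packed = new String(chinese.getBytes(StandardCharsets.UTF_8),
                                   StandardCharsets.ISO_8859_1);
        System.out.println(reinterpret(packed).equals(chinese)); // prints true
    }
}
```

So the transformation is only safe when the input is known to be in that "packed" form. Applied to a String that already contains real Chinese characters, it loses them.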
I traced it from org\jruby\RubyGlobal.java into build_lib\ByteList.jar, for which I can't find the source code (why?). Now I'm sure the problem lies in ByteList.jar, and it should be something like getBytes("ISO-8859-1").

In org\jruby\RubyGlobal.java there's a RubyArray called argvArray[]. The program adds every argument to it and then defines it as "ARGV". Just after

argvArray.add(runtime.newString(argv[i]))

the Chinese characters in argv[i] are replaced with '?'. I tried to follow runtime.newString but got nowhere, because I could only trace it as far as ByteList.jar. So I replaced that line with simply

argvArray.add(argv[i])

On my WinXP SP2 it still didn't work (using the same example, the output text is arg1\n <TextOfAMess>\n arg3\n), but things are getting better because there are no longer any '?'s. Later I restarted my computer into Linux, whose locale is UTF-8, and to my surprise everything worked. Since my WinXP locale is GB2312 (Simplified Chinese), maybe JRuby had converted the arguments in ARGV to UTF-8, and the problem on XP is just that XP can't display them correctly.

But if I use an argument as a path, a more complex problem occurs. For example:

jruby.bat test2.rb "D:\test\@@\readme.txt"

I tried to read the contents of readme.txt in test2.rb. Even on Linux, JRuby told me it can't find D:\test\<TextOfAMess>\readme.txt. Then I had to stop, because things were worse than I had expected.

My English is poor, and I hope you can understand what I'm writing about. Thanks for reading, and thanks for your help.