I don't think the Windows logic will be quite that simple - I think we'll have to recreate the mapping defined by the Windows API [1]. In the case of 936, we'd convert to gb2312, per [1].
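A rough sketch of what such a mapping could look like in C. The helper name `codepage_to_charset` and the table layout are made up for illustration, the table is only a small subset of the identifiers listed in [1], and the "windows-1252" fallback for unknown IDs is an assumption. On Windows the caller would pass in the result of GetACP(); the function takes the ID as a plain parameter so it can be tried out anywhere.

```c
#include <stddef.h>

/* One row of a hand-maintained table: Windows code page ID -> charset name. */
typedef struct {
    unsigned int codepage;
    const char *charset;
} codepage_entry;

/* Small subset of the identifiers documented in [1]; a real table would be longer. */
static const codepage_entry CODEPAGE_TABLE[] = {
    { 932,   "shift_jis" },
    { 936,   "gb2312" },
    { 950,   "big5" },
    { 1251,  "windows-1251" },
    { 1252,  "windows-1252" },
    { 65001, "utf-8" },
};

/*
 * Hypothetical helper (not an existing Harmony or APR function).
 * On Windows the argument would be GetACP().
 * Unknown IDs fall back to "windows-1252" (assumed default).
 */
const char *codepage_to_charset(unsigned int codepage)
{
    size_t count = sizeof(CODEPAGE_TABLE) / sizeof(CODEPAGE_TABLE[0]);
    size_t i;

    for (i = 0; i < count; i++) {
        if (CODEPAGE_TABLE[i].codepage == codepage) {
            return CODEPAGE_TABLE[i].charset;
        }
    }
    return "windows-1252";
}
```

A linear scan is plenty here since the table is tiny and the lookup would run once at startup.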
The default value is going to vary on each platform. On Windows, if we can't determine locale information, then we'll default to "en" and an encoding of "Windows-1252".

-Nathan

[1] http://msdn.microsoft.com/en-us/library/dd317756%28VS.85%29.aspx

On Mon, Jul 20, 2009 at 5:05 AM, Charles Lee<littlee1...@gmail.com> wrote:
> Hi guys,
>
> A new patch is attached, but it still fails on Windows. It *seems* the VM
> does not support CP936.
>
> 1. I have tried to hard code "CP936" in luniglob.c, making
> file.encoding always be CP936. The vm failed to launch with the message
> "HMYEXEL054E vm inner fault: can not create java/lang/String, FAILED to
> invoke JVM" (The original msg is Chinese, I am translating it)
> 2. I have tried to hard code "UTF-8" in luniglob.c, making
> file.encoding always be UTF-8. The vm successfully launched and the tests
> passed.
>
> Does somebody know where the vm loads the String? And what does "HMYEXEL054E"
> mean?
>
> On Sat, Jul 18, 2009 at 11:10 AM, Nathan Beyer <ndbe...@apache.org> wrote:
>>
>> On Fri, Jul 17, 2009 at 6:03 AM, Alexey
>> Varlamov<alexey.v.varla...@gmail.com> wrote:
>> > 2009/7/17, Nathan Beyer <ndbe...@apache.org>:
>> >> On Thu, Jul 16, 2009 at 8:50 PM, Nathan Beyer<ndbe...@apache.org>
>> >> wrote:
>> >> > On Thu, Jul 16, 2009 at 8:35 PM, Nathan Beyer<ndbe...@apache.org>
>> >> > wrote:
>> >> >> On Thu, Jul 16, 2009 at 8:26 PM, Nathan Beyer<ndbe...@apache.org>
>> >> >> wrote:
>> >> >>> On Thu, Jul 16, 2009 at 8:18 PM, Charles Lee<littlee1...@gmail.com>
>> >> >>> wrote:
>> >> >>>> Hi Nathan,
>> >> >>>>
>> >> >>>> What I got is 936, the code page identifier. Is there an API for us
>> >> >>>> to map 936 to gb2312?
>> >> >>>
>> >> >>> Oh, the 'identifier' bit was missing - yeah, we'll need to translate
>> >> >>> that into a name of some sort. I'll poke around a bit and see what I
>> >> >>> can find.
>> >> >>
>> >> >> We'll probably just have to put in a mapping ourselves based on the
>> >> >> documentation. We'd call GetACP [1] and map that to a known alias in
>> >> >> java.nio.charset that matches the definitions [2] of the identifiers.
>> >> >>
>> >> >> [1] http://msdn.microsoft.com/en-us/library/dd318070%28VS.85%29.aspx
>> >> >> [2] http://msdn.microsoft.com/en-us/library/dd317756%28VS.85%29.aspx
>> >> >
>> >> > This may be better - APR has a function for getting the OS default
>> >> > encoding. This would work across all platforms that APR supports and
>> >> > I believe we already use APR.
>> >> >
>> >> > http://apr.apache.org/docs/apr/1.3/group__apr__portabile.html#g6e21845a4a5f3b7dd107b2beea50c91e
>> >>
>> >> However, the Windows version of this is simply:
>> >> return apr_psprintf(pool, "CP%u", (unsigned) GetACP());
>> >> which is essentially "CP" + codePageId.
>> >>
>> >> And the Unix version of this method doesn't look very good for our
>> >> purposes.
>> >> >
>> >> > -Nathan
>> >
>> > Yep - that's why APR was not used here initially. I guess your idea of
>> > GetACP() + hardcoded mapping is the most suitable approach. We already
>> > have a similar solution for timezone detection, see
>> > working_vm\vm\port\src\misc\win\timezone.c (which also should be moved
>> > to classlib eventually, HARMONY-2053).
>>
>> I'd be inclined to combine these all together into the portlib
>> (luni?). Perhaps in some sort of OS environment portion, which can be
>> used by the rest of the class library.
>>
>> -Nathan
>>
>> >
>> > --
>> > Alexey
>> >
>
> --
> Yours sincerely,
> Charles Lee