https://issues.apache.org/bugzilla/show_bug.cgi?id=50562

lu ye <[email protected]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|VERIFIED                    |REOPENED
         Resolution|REMIND                      |

--- Comment #7 from lu ye <[email protected]> 2011-01-13 08:25:22 EST ---
After testing, I am fairly sure that Apache performs a wrong UTF-8 to Unicode
conversion for non-ASCII characters. It simply pads each UTF-8 byte with
\x00 to produce the Unicode (UTF-16) encoding. Here is the test:
1. Request the URL http://127.0.0.1/你好
2. Read "PATH_INFO" with the Windows API GetEnvironmentVariableW. It returns
"\x00\x2f \x00\xe4 \x00\xbd \x00\xa0 \x00\xe5 \x00\xa5 \x00\xbd" when it should
return "\x00\x2f \x4f\x60 \x59\x7d" (U+002F U+4F60 U+597D).
3. My cmd code page is cp936 (GBK), so any programming language that relies on
the C library to read this environment variable gets a wrong result whenever a
non-ASCII URI is requested.
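The byte patterns in step 2 can be reproduced outside Apache. The sketch below
is only an illustration of the suspected behaviour, not Apache's actual code:
it assumes the request path arrives as UTF-8 bytes, and it prints 16-bit units
high byte first to match the notation above (Windows itself stores UTF-16
little-endian in memory). The names utf8_path, widened and expected are
hypothetical.

```python
# Illustration only: mimics the suspected conversion, not Apache's real code.
# "\u4f60\u597d" is the path segment from step 1 (U+4F60 U+597D).
utf8_path = "/\u4f60\u597d".encode("utf-8")  # b'/\xe4\xbd\xa0\xe5\xa5\xbd'

# Suspected behaviour: every UTF-8 byte is widened to one 16-bit unit by
# padding it with \x00. That is the same as decoding the bytes as Latin-1.
widened = utf8_path.decode("latin-1").encode("utf-16-be")

# Expected behaviour: decode the bytes as UTF-8, then encode as UTF-16.
expected = utf8_path.decode("utf-8").encode("utf-16-be")


def units(data: bytes) -> str:
    """Format a UTF-16 byte string as space-separated 16-bit units."""
    return " ".join(
        "\\x{:02x}\\x{:02x}".format(data[i], data[i + 1])
        for i in range(0, len(data), 2)
    )


print("observed:", units(widened))
print("expected:", units(expected))
```

The first line printed matches what GetEnvironmentVariableW returned in step 2,
and the second is the value a correct UTF-8 decode would produce.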
Please check.
(In reply to comment #5)
> What does CHCP return at the command line?

