Re: Building libcurl on MS-Windows with UNICODE defined

Tom Bishop, Wenlin Institute Tue, 01 Nov 2011 11:56:21 -0700

On Oct 15, 2011, at 2:57 PM, Daniel Stenberg wrote:

> On Mon, 10 Oct 2011, Tom Bishop, Wenlin Institute wrote:
> 
> Thanks for your work and research!
> 
>> I don't know whether this significantly affects the operation of libcurl as 
>> it is actually used. If libcurl needs any of these functions to handle 
>> non-Latin strings, it will presumably fail.
> 
> ...
> 
>> Is there any documentation of this issue for libcurl? I don't find any 
>> mention of it in the source code itself. Excuse me if the issue has already 
>> been addressed on this mailing list.
> 
> No, I don't believe we've discussed this to any particular degree in the 
> past. At least I can't recall it.
> 
>> And, is there any interest in adding support for UNICODE? MS-Windows has 
>> supported Unicode since 1995, sixteen years ago. Half the world's population 
>> uses non-Latin scripts, and even for languages such as English, Unicode 
>> provides useful characters that aren't in the ANSI/Windows code pages.
> 
> Well, does the current code cause some kind of problem? The way I read your 
> mail is that you think there _might_ be problems, but I don't think anyone 
> has reported/mentioned any up until now and you're not being very specific 
> either so in my view this is not a critical issue.


Please excuse the lateness of this reply. I agree this does not appear to be a 
critical issue and I understand that a new version is coming soon, so I don't 
suggest making any immediate changes. I'm not aware of any problem with the 
current code, provided that it's built with the makefile (with UNICODE not 
defined), and assuming that user-names, etc., passed to these MS-Windows 
functions, are always ASCII in actual current usage (probably true since nobody 
has complained).

> But if we can fix problems by altering the code, and not cause backwards 
> compatibility problems, then I'm all for it!
> 
> The only unicode related issue on windows that I can recall is people trying 
> to use curl_formadd() and pass in a unicode file name path.
> 
> I'll certainly appreciate a patch and I hope I can get more Windows savvy 
> people than me to help me review it for correctness.

Thank you! When I have more time to spare I'd like to study it further and 
possibly make some suggestions. The only change I'd suggest testing after the 
next version is to replace FormatMessage(), etc., with FormatMessageA(), etc., 
so that the code will compile and run correctly (for code points < U+0080) 
regardless of whether UNICODE is defined, for the benefit of people who compile 
CURL with different makefiles. (I've done that already in my own copy, and it 
compiles without warnings, but so far I'm using only a tiny part of CURL's 
functionality so I can't say it's well-tested. Still, the logic is simple: if 
UNICODE is not defined, then FormatMessage is defined as FormatMessageA anyway, 
so the replacement has no effect. If UNICODE is defined, then FormatMessage is 
defined as FormatMessageW, and it triggers a compiler warning and run-time 
failure if called with "char *". Therefore it's better to use the name 
FormatMessageA explicitly, as long as you're calling it with "char *".)
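
To illustrate the replacement described above, here is a minimal sketch. It 
is not libcurl's actual code: the helper name describe_error() and its 
fallback text are made up for this example. On MS-Windows it calls 
FormatMessageA() by its explicit name, so the "char *" buffer matches the 
function whether or not UNICODE is defined. The non-Windows branch exists only 
so that the sketch compiles elsewhere.

```c
#include <stdio.h>
#include <string.h>

#ifdef _WIN32
#include <windows.h>
#endif

/* Hypothetical helper (not part of libcurl): write a description of the
 * system error code `err` into the char buffer `buf` and return `buf`. */
static const char *describe_error(unsigned long err, char *buf, size_t len)
{
  if(len == 0)
    return buf;
  buf[0] = '\0';

#ifdef _WIN32
  /* The explicit "A" suffix selects the char-based entry point regardless
   * of UNICODE. Plain FormatMessage would expand to FormatMessageW when
   * UNICODE is defined and would then expect a wchar_t buffer. */
  FormatMessageA(FORMAT_MESSAGE_FROM_SYSTEM | FORMAT_MESSAGE_IGNORE_INSERTS,
                 NULL, (DWORD)err, 0, buf, (DWORD)len, NULL);
#endif

  /* Assumed fallback for this sketch: used when no system text is
   * available, and always on non-Windows builds. */
  if(!buf[0])
    snprintf(buf, len, "Unknown error %lu", err);

  return buf;
}
```

A caller would pass a "char" buffer, e.g. describe_error(5, buf, sizeof(buf)), 
and get back text in the ANSI code page, which is reliable only for code 
points < U+0080 as noted above.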

Probably there are also potential improvements that would use the 
Unicode-capable versions of the MS-Windows functions, possibly supporting UTF-8 
strings that get converted to UTF-16 so that CURL's API can still use "char *" 
but non-ASCII characters will get passed correctly to the MS-Windows functions.
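
As a rough sketch of the UTF-8 to UTF-16 step, under my own assumptions: the 
helper name utf8_to_utf16() is hypothetical and nothing like it is claimed to 
exist in CURL. A real patch for MS-Windows would more likely call 
MultiByteToWideChar() with CP_UTF8 and pass the result to the "W" functions; 
the hand-written decoder below is used only so the example is portable and 
shows what the conversion does.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of libcurl): convert a NUL-terminated UTF-8
 * string to a newly allocated, NUL-terminated UTF-16 string.
 * Returns NULL on malformed input or allocation failure; the caller frees.
 * Simplification: overlong encodings are not rejected. */
static uint16_t *utf8_to_utf16(const char *src)
{
  const unsigned char *p = (const unsigned char *)src;
  /* A UTF-8 sequence never yields more UTF-16 units than it has bytes,
   * so strlen(src) units plus the terminator is always enough. */
  uint16_t *out = malloc((strlen(src) + 1) * sizeof(uint16_t));
  size_t n = 0;

  if(!out)
    return NULL;

  while(*p) {
    uint32_t cp;
    int extra; /* number of continuation bytes expected */

    if(*p < 0x80) { cp = *p; extra = 0; }
    else if((*p & 0xE0) == 0xC0) { cp = *p & 0x1F; extra = 1; }
    else if((*p & 0xF0) == 0xE0) { cp = *p & 0x0F; extra = 2; }
    else if((*p & 0xF8) == 0xF0) { cp = *p & 0x07; extra = 3; }
    else { free(out); return NULL; }
    p++;

    while(extra-- > 0) {
      /* Also catches a string that ends in the middle of a sequence. */
      if((*p & 0xC0) != 0x80) { free(out); return NULL; }
      cp = (cp << 6) | (uint32_t)(*p & 0x3F);
      p++;
    }

    /* Reject values outside Unicode and encoded surrogates. */
    if(cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) {
      free(out);
      return NULL;
    }

    if(cp >= 0x10000) {
      /* Code points above the BMP become a surrogate pair. */
      cp -= 0x10000;
      out[n++] = (uint16_t)(0xD800 | (cp >> 10));
      out[n++] = (uint16_t)(0xDC00 | (cp & 0x3FF));
    }
    else
      out[n++] = (uint16_t)cp;
  }

  out[n] = 0;
  return out;
}
```

With something like this, the API could keep taking "char *" while a 
non-ASCII name such as "文" (bytes E6 96 87 in UTF-8) would reach the 
MS-Windows function as the single UTF-16 unit 0x6587.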

Best wishes,

Tom

文林 Wenlin Institute, Inc.        Software for Learning Chinese
E-mail: [email protected]     Web: http://www.wenlin.com
Telephone: 1-877-4-WENLIN (1-877-493-6546)
☯




-------------------------------------------------------------------
List admin: http://cool.haxx.se/list/listinfo/curl-library
Etiquette:  http://curl.haxx.se/mail/etiquette.html