> On 11 Feb 2016, at 23:13, Clemens Ladisch <clemens at ladisch.de> wrote:
>
> As far as I can see, there are five problems:
> - stdin from the console
Convert from the codepage returned by GetConsoleCP() to UTF-8.

> - stdin redirected from a file

Personal opinion: I'd like it to keep treating input as implicitly UTF-8, as it does today.

> - stdout to the console

Convert from UTF-8 to the codepage returned by GetConsoleOutputCP().

> - stdout redirected to a file

Personal opinion: I'd like it to output UTF-8.

> - command-line arguments

They're presented to the application code, through the argv[] pointers, in the system default ANSI code page. Converting from CP_ACP to UTF-8 is appropriate.

I'm adding a 6th point:
- make sure that if sqlite3 needs to pass a filename to any ...A Windows API, the conversion continues to choose between CP_ACP and CP_OEMCP depending on the AreFileApisANSI() function. This is the case right now, and nothing related to shell.c should change that.

And a 7th point:
- check that when sqlite gets a text string from Windows through a ...A API (an error message string, for instance), it is treated as CP_ACP and converted from CP_ACP to whatever is needed (AreFileApisANSI() should not be used here).

> This would be too much for 3.11.0.

Of course.

About the UINT GetConsoleCP() and UINT GetConsoleOutputCP() functions: they have been present since Windows 2000. I don't know about the various WinCE editions. I wasn't sure how far back they were available, so I coded the quick and dirty change for testing purposes with a hardcoded CP_OEMCP, but it is better to use the GetConsole(Output)CP() APIs. Indeed, among the codepages to which the console can be switched (or to which it defaults on various localized editions of Windows), some are considered 'OEM' and others 'ANSI'. Using CP_OEMCP when the console has been set to an ANSI codepage gives wrong results, and vice versa.

I'm advocating using GetConsoleCP() and GetConsoleOutputCP() to convert the input or the output as needed, instead of being tempted to use their Set counterparts (SetConsoleCP(65001) and SetConsoleOutputCP(65001)).
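
To make the five cases above concrete, here is a rough sketch of the codepage choice, not actual shell.c code. The names io_path, external_codepage() and external_codepage_now() are made up for illustration, and using _isatty() to detect the console is my assumption (it also reports other character devices as terminals, so a real implementation might test the handle differently). The decision itself is plain C; only the wrapper at the end calls Windows APIs.

```c
#include <stddef.h>

#ifdef _WIN32
#include <windows.h>  /* GetConsoleCP, GetConsoleOutputCP, GetACP */
#include <io.h>       /* _isatty */
#include <stdio.h>    /* _fileno, stdin, stdout */
#endif

#define SKETCH_CP_UTF8 65001u  /* same numeric value as Windows' CP_UTF8 */

/* Hypothetical enum: which data path the text travels on. */
typedef enum {
    PATH_STDIN,
    PATH_STDOUT,
    PATH_ARGV
} io_path;

/*
 * Return the codepage used on the "outside" of the given path, i.e. the
 * codepage to convert from (input, argv) or to (output), with UTF-8 on
 * the inside. The three codepage arguments are passed in so the decision
 * can be exercised without a console.
 */
unsigned external_codepage(io_path path, int is_console,
                           unsigned console_in_cp,
                           unsigned console_out_cp,
                           unsigned ansi_cp)
{
    switch (path) {
    case PATH_ARGV:
        /* argv[] arrives in the system default ANSI codepage. */
        return ansi_cp;
    case PATH_STDIN:
        /* Console input: console codepage. Redirected: assume UTF-8. */
        return is_console ? console_in_cp : SKETCH_CP_UTF8;
    case PATH_STDOUT:
        /* Console output: console codepage. Redirected: emit UTF-8. */
        return is_console ? console_out_cp : SKETCH_CP_UTF8;
    }
    return SKETCH_CP_UTF8;
}

#ifdef _WIN32
/* Windows-only wrapper: query the live values instead of hardcoding. */
unsigned external_codepage_now(io_path path)
{
    int is_console = 0;
    if (path == PATH_STDIN)  is_console = _isatty(_fileno(stdin));
    if (path == PATH_STDOUT) is_console = _isatty(_fileno(stdout));
    return external_codepage(path, is_console,
                             GetConsoleCP(), GetConsoleOutputCP(), GetACP());
}
#endif
```

The returned value would then feed the usual conversion through UTF-16. The sketch only reads the console codepages; it never calls SetConsoleCP() or SetConsoleOutputCP().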
Using the Set functions to switch the console I/O to UTF-8 would look simpler, but it's a bumpy road: unless the display font actually supports Unicode, display issues can appear. Also, support for codepage 65001 does not go back as far on the Windows timeline. With the Get... approach, users can knowingly change the codepage themselves with the chcp ... command.

--
Meilleures salutations, Met vriendelijke groeten, Best Regards,
Olivier Mascia, integral.be/om