2009/9/15 Itagaki Takahiro <itagaki.takah...@oss.ntt.co.jp>: > > Heikki Linnakangas <heikki.linnakan...@enterprisedb.com> wrote: > >> Can't we use MultiByteToWideChar() to convert directly to the required >> encoding, avoiding the double conversion? > > Here is an updated version of the patch. > I use direct conversion in pgwin32_toUTF16() if a corresponding codepage > is available. If not available, I still use double conversion. > > Now pgwin32_toUTF16() is exported from mbutil.c. I used the function > in following parts, although the main target of the patch is eventlog. > > * WriteConsoleW() - write unredirected stderr log. > * ReportEventW() - write evenlog. > * CreateFileW() - open non-ascii filename (ex. COPY TO/FROM 'mb-path'). > > This approach is only available for Windows because any other platform > don't support locale-independent and wide-character-based system calls. > Other platforms require a different approach, but even then we'd still > better have win32-specific routines because UTF16 is the native encoding > in Windows.
I did a quick check of this, and here are the things I would like to have changed: First of all, the change to port/open.c seems to be unrelated to the rest, and should be a separate patch, correct? I'm sure there's a usecase for it, but it's not actually included in the patches description, so I assume this was a mistake? Per your own comments earlier, and in the code, what will happen if pg_do_encoding_conversion() calls ereport()? Didn't you say we need a non-throwing version of it? pgwin32_toUTF16() needs error checking on the API calls, and needs to do something reasonable if it fails. For example, it can fail because of out of memory error. I suggest just returning the error code in some way in that case, and have the callers fall back to logging in the incorrect encoding - in a lot of cases that will produce an at least partially readable message. A second message should also be logged saying that the conversion failed - this needs to be done directly with the eventlog API functions and not ereport, so we don't end up in infinite recursion. The encoding_to_codepage array needs to go in encnames.c, where other such tables are. Perhaps it can even be integrated in pg_enc2name_tbl as a separate field? I don't have the time to clean this up right now, so if you have, please do so and resubmit. If not, I can clean it up later and apply. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers