2009/9/15 Itagaki Takahiro <itagaki.takah...@oss.ntt.co.jp>:
>
> Heikki Linnakangas <heikki.linnakan...@enterprisedb.com> wrote:
>
>> Can't we use MultiByteToWideChar() to convert directly to the required
>> encoding, avoiding the double conversion?
>
> Here is an updated version of the patch.
> I use direct conversion in pgwin32_toUTF16() if a corresponding codepage
> is available. If not available, I still use double conversion.
>
> Now pgwin32_toUTF16() is exported from mbutil.c. I used the function
> in following parts, although the main target of the patch is eventlog.
>
>  * WriteConsoleW() - write unredirected stderr log.
>  * ReportEventW()  - write evenlog.
>  * CreateFileW()   - open non-ascii filename (ex. COPY TO/FROM 'mb-path').
>
> This approach is only available for Windows because any other platform
> don't support locale-independent and wide-character-based system calls.
> Other platforms require a different approach, but even then we'd still
> better have win32-specific routines because UTF16 is the native encoding
> in Windows.

I did a quick check of this, and here are the things I would like to
have changed:

First of all, the change to port/open.c seems to be unrelated to the
rest, and should be a separate patch, correct? I'm sure there's a
usecase for it, but it's not actually  included in the patches
description, so I assume this was a mistake?

Per your own comments earlier, and in the code, what will happen if
pg_do_encoding_conversion() calls ereport()? Didn't you say we need a
non-throwing version of it?

pgwin32_toUTF16() needs error checking on the API calls, and needs to
do something reasonable if it fails. For example, it can fail because
of out of memory error. I suggest just returning the error code in
some way in that case, and have the callers fall back to logging in
the incorrect encoding - in a lot of cases that will produce an at
least partially readable message. A second message should also be
logged saying that the conversion failed - this needs to be done
directly with the eventlog API functions and not ereport, so we don't
end up in infinite recursion.

The encoding_to_codepage array needs to go in encnames.c, where other
such tables are. Perhaps it can even be integrated in pg_enc2name_tbl
as a separate field?

I don't have the time to clean this up right now, so if you have,
please do so and resubmit. If not, I can clean it up later and apply.


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to