David Wheeler wrote:
Bother, that would have been easy. :-)

On Wednesday, December 18, 2002, at 01:27 AM, Dominic Mitchell wrote:

> % psql -l
>         List of databases
>    Name    |  Owner   | Encoding
> -----------+----------+-----------
>  dom       | dom      | UNICODE
>  template0 | postgres | SQL_ASCII
>  template1 | postgres | SQL_ASCII
> (3 rows)
>
> I'm using the -PGDG rpm, and looking at the spec file, it seems to indicate that --enable-multibyte is not specified, but it should be the default anyway. Is there a way that I can verify this from the installed software?

I think the above does. I don't think you could have the encoding set to UNICODE if it hadn't been compiled with --enable-multibyte.
Attached is a patch to the driver which makes it work as expected for me. I don't think it's the right patch, however... It should probably only set the UTF8 flag when there is a high bit set in the data. Attached is another patch which attempts to do that, although my C skills are pretty rusty these days, so it could probably be written a lot better!
Thanks,
-Dom
--
| Semantico: creators of major online resources |
| URL: http://www.semantico.com/ |
| Tel: +44 (1273) 722222 |
| Address: 33 Bond St., Brighton, Sussex, BN1 1RD, UK. |
Index: dbdimp.c
===================================================================
RCS file: /usr/local/cvsroot/dbdpg/dbdpg/dbdimp.c,v
retrieving revision 1.8
diff -u -r1.8 dbdimp.c
--- dbdimp.c	27 Nov 2002 02:02:39 -0000	1.8
+++ dbdimp.c	18 Dec 2002 18:03:36 -0000
@@ -1391,6 +1391,7 @@
                     val[val_len] = '\0';
                 }
                 sv_setpvn(sv, val, val_len);
+                SvUTF8_on(sv);
             }
         }
Index: dbdimp.c
===================================================================
RCS file: /usr/local/cvsroot/dbdpg/dbdpg/dbdimp.c,v
retrieving revision 1.8
diff -u -r1.8 dbdimp.c
--- dbdimp.c	27 Nov 2002 02:02:39 -0000	1.8
+++ dbdimp.c	18 Dec 2002 18:07:37 -0000
@@ -1391,6 +1391,12 @@
                     val[val_len] = '\0';
                 }
                 sv_setpvn(sv, val, val_len);
+                {
+                    char *s = val;  /* scan from the start of the fetched value */
+                    while (*s) {
+                        if (*s++ & 0x80) SvUTF8_on(sv);  /* high bit set: not plain ASCII */
+                    }
+                }
             }
         }
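
For reference, the high-bit check that the second patch is aiming for can be pulled out into a small standalone function. This is only a rough sketch: `has_high_bit` is a hypothetical name, not something that exists in DBD::Pg, and it takes an explicit length rather than relying on the terminating NUL.

```c
#include <stddef.h>

/* Hypothetical helper, not part of DBD::Pg.
 * Returns 1 if any of the first len bytes of buf has its high bit set
 * (i.e. the data is not plain 7-bit ASCII), and 0 otherwise. */
int has_high_bit(const char *buf, size_t len)
{
    size_t i;

    for (i = 0; i < len; i++) {
        /* Cast to unsigned char so the test is well defined even when
         * plain char is signed on this platform. */
        if ((unsigned char)buf[i] & 0x80)
            return 1;   /* one high-bit byte is enough, stop scanning */
    }
    return 0;
}
```

In the driver this would presumably be called as `if (has_high_bit(val, val_len)) SvUTF8_on(sv);` right after the `sv_setpvn()` call. Note that a set high bit only shows the data is not pure ASCII; it does not prove the bytes are valid UTF-8, so this still assumes the database encoding really is UNICODE.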