On Mon, Jul 18, 2011 at 7:36 PM, Florian Pflug <f...@phlo.org> wrote:
> On Jul19, 2011, at 00:17 , Joey Adams wrote:
>> I suppose a simple solution would be to convert all escapes and
>> outright ban escapes of characters not in the database encoding.
>
> +1. Making JSON work like TEXT when it comes to encoding issues
> makes this all much simpler conceptually. It also avoids all kinds
> of weird issues if you extract textual values from a JSON document
> server-side.

Thanks for the input.  I'm leaning in this direction too.  However, it
will be a tad tricky to implement the conversions efficiently, since
the wchar API doesn't provide a fast path for individual codepoint
conversion (that I'm aware of), and pg_do_encoding_conversion doesn't
look like a good thing to call lots of times.

My plan is to scan for escapes of non-ASCII characters, convert them
to UTF-8, and put them in a comma-delimited string like this:

    a,b,c,d,

then convert the resulting string to the server encoding in a single
call (which may fail, indicating that some codepoint(s) have no
representation in the database encoding).  After that, walk the
converted string and substitute each character back where its escape
appeared.

It's "clever", but I can't think of a better way to do it with the existing API.


- Joey

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers