Craig Ringer <cr...@2ndquadrant.com> writes:
> Interesting issue. Mainly because the "ť" char it complains about
> (utf-8 0xc5 0xa5) is accepted in the SELECT that generates the record.

Uh, no, actually it's the SELECT that's failing.

> The regress script in question sets:
> SET client_encoding = 'utf8';
> but we're apparently round-tripping the data through the database encoding
> at some point, then converting back to client_encoding for output.

The conversion to DB encoding will happen the instant the query string
reaches the database.  You can set client_encoding to whatever you want,
but the only characters that can appear in queries are those that exist
in both the client encoding and the database encoding.
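
As a rough illustration of that rule (a sketch only, not PostgreSQL code):
the helper `representable` below is hypothetical, and the two database
encodings tried are assumptions picked for the example, since the thread
does not say which database encoding the failing run used. It checks
whether the character from the report, U+0165 (utf-8 0xc5 0xa5), survives
conversion from the client encoding into a given database encoding.

```python
# Hypothetical stand-in for the conversion that happens when a query
# string reaches the server; this is NOT how PostgreSQL implements it.
def representable(text, client_encoding, database_encoding):
    """Return True if text can be carried from client to database encoding."""
    try:
        raw = text.encode(client_encoding)          # bytes as sent by the client
        raw.decode(client_encoding).encode(database_encoding)
    except UnicodeError:
        return False
    return True

ch = "\u0165"  # LATIN SMALL LETTER T WITH CARON, the character in the report
utf8_bytes = ch.encode("utf-8")

# Assumed database encodings, chosen only for illustration.
in_latin1 = representable(ch, "utf-8", "latin-1")
in_latin2 = representable(ch, "utf-8", "iso8859_2")

print(utf8_bytes.hex(), in_latin1, in_latin2)  # prints: c5a5 False True
```

Under those assumptions, the same query text would be rejected by a
Latin-1 database and accepted by a Latin-2 one, regardless of the
client_encoding setting.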

> In some ways it seems like the argument to pg_logical_emit_message(...) should
> be 'bytea'. That'd be much more convenient for application use. But then
> it's a pain when using it via the text-format SQL interface calls, where
> we've got no sensible way to output it.

Well, that's something worth thinking about.  I assume that
pg_logical_slot_get_changes could be executed in a database different from
the one where the change originated?  What's going to happen if a string
in WAL contains characters unrepresentable in that database?  Do we even
have logic in there that will attempt to perform the necessary conversion?
And it is *necessary*, not optional, if you are going to claim that the
output of pg_logical_slot_get_changes is of type text.

                        regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
