Re: DBD::Pg && handling of UTF-8 char columns

Maurice Aubrey Wed, 16 Oct 2019 21:35:26 -0700

On Mon, Oct 14, 2019 at 10:46 PM Matthias Apitz <[email protected]> wrote:


> i.e. from the database PG server is coming the code point correctly as
> (octal) \303\244 which is the same as \xc3\xa4. And Perl mangles this to
>
> [UTF8 "P\x{e4}dagogische Hochschule Weingarten"]
>
> which is IMHO not correct and causing all this confusion.
>
> We have to deal with this in our perl code. It's not a PostgreSQL
> problem.
>

We use UTF8 extensively with Pg and Perl and have no issues,
so I suspect there's a configuration issue somewhere.

And yes, I don't think you should be worrying about how Perl encodes things
internally.

From perlunifaq:

> *I lost track; what encoding is the internal format really?*
>
> It's good that you lost track, because you shouldn't depend on the internal
> format being any specific encoding. But since you asked: by default, the
> internal format is either ISO-8859-1 (latin-1), or utf8, depending on the
> history of the string. On EBCDIC platforms, this may be different even.
> Perl knows how it stored the string internally, and will use that
> knowledge when you encode. In other words: don't try to find out what the
> internal encoding for a certain string is, but instead just encode it into
> the encoding that you want.


https://perldoc.perl.org/perlunifaq.html#INTERNALS
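
To make that concrete, here is a minimal sketch using only the core Encode
module. The variable names are mine, and the byte string is hard-coded to
stand in for what the server sends, so this illustrates the principle
rather than what DBD::Pg does internally. It shows that the bytes
\xc3\xa4 and the character \x{e4} are the same "ä" seen at two different
layers, and that encoding on output gives you the original bytes back.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# The two bytes sent for "ä": octal \303\244, i.e. hex \xc3\xa4.
# (Hard-coded here as an assumption; no database is involved.)
my $bytes_from_server = "P\xc3\xa4dagogische Hochschule Weingarten";

# Decoding turns the UTF-8 bytes into a Perl character string.
# "ä" is now the single code point U+00E4, which debug output
# typically renders as \x{e4}.
my $text = decode('UTF-8', $bytes_from_server);

# Encoding at the output boundary yields the original bytes again,
# whatever internal representation Perl chose for $text.
my $bytes_out = encode('UTF-8', $text);

printf "characters: %d, bytes: %d\n", length($text), length($bytes_out);
printf "second character: U+%04X\n", ord(substr($text, 1, 1));
print  "round trip ok\n" if $bytes_out eq $bytes_from_server;
```

So seeing \x{e4} in a dump of a decoded string is expected and not a sign
of corruption. Trouble usually starts only when a string is decoded or
encoded twice, or never encoded on the way out.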

Maurice
