The issue is not specific to E'\\x..' literals. A normal COPY FROM data file with ENCODING 'EUC_CN' can create text rows that later cannot be retrieved with SELECT.
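
The difference between a byte-structure check and a mapping-table check can be shown outside PostgreSQL with a minimal Python sketch. This is an illustration under stated assumptions, not PostgreSQL's code: `structurally_valid` is a hypothetical, simplified range check (ASCII bytes, or two-byte pairs in 0xA1..0xFE), and Python's `gb2312` codec stands in for the EUC_CN-to-UTF8 conversion table, which may differ from PostgreSQL's in detail.

```python
def structurally_valid(data: bytes) -> bool:
    """Hypothetical byte-structure check (an assumption, not PostgreSQL's verifier).

    Accepts ASCII bytes and two-byte sequences whose bytes both fall
    in the range 0xA1..0xFE. It does not consult any mapping table.
    """
    i = 0
    while i < len(data):
        lead = data[i]
        if lead < 0x80:
            i += 1
            continue
        if i + 1 >= len(data):
            return False  # truncated two-byte sequence
        trail = data[i + 1]
        if not (0xA1 <= lead <= 0xFE and 0xA1 <= trail <= 0xFE):
            return False
        i += 2
    return True


def has_unicode_mapping(data: bytes) -> bool:
    """Mapping-table check, using Python's gb2312 codec as a stand-in."""
    try:
        data.decode("gb2312")
    except UnicodeDecodeError:
        return False
    return True


sample = bytes([0xA2, 0xA3])  # the byte sequence discussed in this thread
print("structurally valid:", structurally_valid(sample))
print("has Unicode mapping:", has_unicode_mapping(sample))
```

A sequence that passes the first check but not the second is the kind of value that can be stored and read back under the same encoding, yet fails as soon as a conversion to UTF8 is required.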
This suggests that input validation for EUC_CN is only structural, while the EUC_CN-to-UTF8 conversion table is stricter.

On Sat, May 2, 2026 at 10:31 AM Zhongpu Chen <[email protected]> wrote:
> See the related bug report
> https://www.postgresql.org/message-id/CA%2B1gyqL7uiQhfLcYWpHNUKQgHjQc7sOPthSTiaxLDZzcrGFYSg%40mail.gmail.com
>
> Currently PostgreSQL accepts structurally well-formed EUC_CN byte
> sequences such as 0xA2A3 into text columns. The value round-trips when
> client_encoding is EUC_CN, but fails when client_encoding is UTF8 because
> euc_cn_to_utf8 has no mapping.
>
> If this behavior is intentional for compatibility, the documentation
> should explicitly say that validation for some legacy encodings is
> byte-structure validation, not mapping-table validation.
> If it is not intentional, stricter validation could reject unassigned byte
> positions at input time.
>
> --
> Zhongpu Chen

--
Zhongpu Chen
