[ 
https://issues.apache.org/jira/browse/FLINK-38110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18021801#comment-18021801
 ] 

haiqingchen commented on FLINK-38110:
-------------------------------------

[~ouyangwuli] could you help review the pull request? 
https://github.com/apache/flink-cdc/pull/4128

> PostgreSQL connector reads Chinese columns with garbled characters
> ------------------------------------------------------------------
>
>                 Key: FLINK-38110
>                 URL: https://issues.apache.org/jira/browse/FLINK-38110
>             Project: Flink
>          Issue Type: Improvement
>          Components: Flink CDC
>    Affects Versions: cdc-3.4.0
>            Reporter: haiqingchen
>            Priority: Minor
>              Labels: pull-request-available
>         Attachments: image-2025-07-17-14-53-02-657.png
>
>
> When a PG table has column names in Chinese, the PostgreSQL connector with 
> the pgoutput plugin decodes them as garbled characters, especially during 
> incremental capture.
> The reason is that when handling column names and table names,
> io.debezium.connector.postgresql.connection.pgoutput.PgOutputMessageDecoder
> doesn't decode the bytes as UTF-8; it casts each byte to a char:
> {code:java}
> private static String readString(ByteBuffer buffer) {
>     StringBuilder sb = new StringBuilder();
>     boolean var2 = false;
>     byte b;
>     while((b = buffer.get()) != 0) {
>         sb.append((char)b);
>     }
>     return sb.toString();
> } {code}
> whereas when it handles column values, it decodes the bytes using the UTF-8 
> charset:
> {code:java}
> private static String readColumnValueAsString(ByteBuffer buffer) {
>     int length = buffer.getInt();
>     byte[] value = new byte[length];
>     buffer.get(value, 0, length);
>     return new String(value, Charset.forName("UTF-8"));
> } {code}
> My solution is to copy PgOutputMessageDecoder from Debezium and fix readString 
> so that it reads the string as UTF-8.
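
For illustration, a minimal sketch of what a UTF-8-aware readString could look like. This is not the code from the pull request: the class name Utf8ReadStringSketch is hypothetical, and it assumes that names arrive as NUL-terminated UTF-8 bytes, the same encoding readColumnValueAsString already assumes for values. The idea is to collect the raw bytes first and decode them in one step instead of casting each byte to a char.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical class name, used only for this sketch.
class Utf8ReadStringSketch {

    // Reads bytes up to (and consuming) the NUL terminator, then decodes
    // the collected bytes as UTF-8 in a single step.
    // Assumption: the server sends names UTF-8 encoded.
    static String readString(ByteBuffer buffer) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        byte b;
        while ((b = buffer.get()) != 0) {
            bytes.write(b);
        }
        return new String(bytes.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "\u540d\u79f0" is a two-character Chinese column name,
        // written with escapes so the source file stays ASCII.
        String original = "\u540d\u79f0";
        byte[] name = original.getBytes(StandardCharsets.UTF_8);

        // Build a NUL-terminated buffer like the one the decoder reads from.
        ByteBuffer buffer = ByteBuffer.allocate(name.length + 1);
        buffer.put(name).put((byte) 0);
        buffer.flip();

        // Check that the name survives the round trip.
        System.out.println(original.equals(readString(buffer)));
    }
}
```

Casting each byte to a char, as the original readString does, only works for ASCII: every byte of a multi-byte UTF-8 sequence is negative in Java and becomes an unrelated char, which is why non-ASCII names come out garbled.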



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
