haruki-830 opened a new pull request, #4447:
URL: https://github.com/apache/flink-cdc/pull/4447
### Summary
This PR fixes CHAR / VARCHAR length mapping in the Flink CDC StarRocks
connector for sources containing utf8mb4 characters, which can take up to
4 bytes each, by introducing a configurable unicode-char.max-bytes option.
### Key Changes
**Configurable Character Length Mapping**
- Added a new optional StarRocks sink option: unicode-char.max-bytes
- Default value is 3, preserving existing behavior
- Users can set it to 4 for utf8mb4 sources to avoid underestimating target
column lengths
**Schema Mapping and Evolution Support**
- Updated StarRocksUtils.CdcDataTypeTransformer to use a configurable byte
multiplier instead of the hard-coded 3
- Integrated the option into TableCreateConfig
- Updated create table, add column, and alter column type paths to honor the
configured value
- Preserved existing primary key handling and length capping behavior
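
The length calculation described above can be sketched roughly as follows. This is a minimal illustration, not the connector's actual code: the class and method names are hypothetical, and the 1048576-byte VARCHAR cap is assumed from StarRocks' documented limit.

```java
// Hypothetical sketch of the byte-multiplier logic; names are illustrative only.
final class UnicodeCharLengthSketch {
    // Assumed StarRocks VARCHAR upper bound, in bytes.
    static final int MAX_VARCHAR_SIZE = 1048576;

    // Converts a source length in characters into a target length in bytes,
    // capped at the given maximum. Uses long arithmetic to avoid int overflow.
    static int scaledLength(int charLength, int maxBytesPerChar, int cap) {
        long bytes = (long) charLength * maxBytesPerChar;
        return (int) Math.min(bytes, (long) cap);
    }

    public static void main(String[] args) {
        // Default multiplier (3): VARCHAR(100) maps to 300 bytes.
        System.out.println(scaledLength(100, 3, MAX_VARCHAR_SIZE)); // 300
        // utf8mb4 multiplier (4): VARCHAR(100) maps to 400 bytes.
        System.out.println(scaledLength(100, 4, MAX_VARCHAR_SIZE)); // 400
        // Very long columns are still capped at the assumed maximum.
        System.out.println(scaledLength(500000, 4, MAX_VARCHAR_SIZE)); // 1048576
    }
}
```

With the default of 3, a 100-character value made entirely of 4-byte characters needs 400 bytes but the target column would only allow 300, which is the underestimation this change addresses.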
**Validation, Tests, and Docs**
- Added validation to ensure unicode-char.max-bytes is positive
- Added unit and integration test coverage for both default behavior and
unicode-char.max-bytes = 4
- Updated English and Chinese StarRocks connector docs to document the new
option
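
As a rough illustration of the validation described above, the option could be resolved and checked along these lines. This is a sketch under stated assumptions: the class name, the method name, and the use of a plain string map are hypothetical, and the real connector presumably reads the value through Flink CDC's configuration classes instead.

```java
import java.util.Map;

// Hypothetical sketch; the connector's real option handling may differ.
final class MaxBytesOptionSketch {
    static final String KEY = "unicode-char.max-bytes";
    static final int DEFAULT_MAX_BYTES = 3; // default stated in this PR

    // Returns the configured multiplier, or the default when the key is absent.
    // Rejects zero and negative values.
    static int resolveMaxBytes(Map<String, String> options) {
        String raw = options.get(KEY);
        if (raw == null) {
            return DEFAULT_MAX_BYTES;
        }
        int value = Integer.parseInt(raw.trim());
        if (value <= 0) {
            throw new IllegalArgumentException(
                    KEY + " must be positive, but was " + value);
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println(resolveMaxBytes(Map.of()));         // 3
        System.out.println(resolveMaxBytes(Map.of(KEY, "4"))); // 4
    }
}
```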
### Configuration Example
Enable 4-byte character length mapping for utf8mb4 sources:
```
sink:
  type: starrocks
  jdbc-url: jdbc:mysql://fe_host1:9030
  load-url: fe_host1:8030
  username: root
  password: password
  unicode-char.max-bytes: 4
```
Default behavior remains unchanged when the option is omitted:
```
sink:
  type: starrocks
  jdbc-url: jdbc:mysql://fe_host1:9030
  load-url: fe_host1:8030
  username: root
  password: password
```
### JIRA Reference
[FLINK-39759](https://issues.apache.org/jira/browse/FLINK-39759)