Hi everyone,
Currently, the datagen connector can generate data that does not match the
schema definition for fixed-length and variable-length fields. It defaults to
a uniform length of 100, and the user has to configure the correct length
manually. This violates the schema's constraints and hurts ease of use.
Jane Chan and I discussed this offline, and I summarize our discussion below.
To enhance the datagen connector to automatically generate data that conforms
to the schema definition without additional manual configuration, we propose
handling the following data types appropriately [1]:
1. For fixed-length data types (char, binary), the length should be defined
   by the schema definition and should not be user-defined.
2. For variable-length data types (varchar, varbinary), the length should be
   defined by the schema definition, but users may still configure a length
   that is smaller than the one in the schema definition.
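As a rough illustration of the two rules above, here is a minimal Python
sketch of how the effective length could be resolved per field. It is not the
connector's code: resolve_length and its parameters are hypothetical names,
and two behaviors are my assumptions rather than part of the proposal, namely
that a user-defined length on a fixed-length type is silently ignored and
that a user-defined length larger than the schema length is rejected.

```python
FIXED_LENGTH_TYPES = {"CHAR", "BINARY"}
VARIABLE_LENGTH_TYPES = {"VARCHAR", "VARBINARY"}


def resolve_length(type_name, schema_length, user_length=None):
    """Hypothetical helper: pick the length to generate for one field."""
    kind = type_name.upper()
    if kind in FIXED_LENGTH_TYPES:
        # Rule 1: the schema alone decides the length.
        # Assumption: a user-defined length is ignored rather than rejected.
        return schema_length
    if kind in VARIABLE_LENGTH_TYPES:
        if user_length is None:
            # Rule 2: without user configuration, follow the schema.
            return schema_length
        if user_length > schema_length:
            # Assumption: a longer length is an error, since the generated
            # data would no longer fit the declared type.
            raise ValueError(
                f"length {user_length} exceeds {kind}({schema_length})"
            )
        # Rule 2: a user-defined length within the schema length is allowed.
        return user_length
    raise ValueError(f"not a length-constrained type: {type_name}")
```

For example, resolve_length("CHAR", 10, 5) keeps the schema length of 10,
while resolve_length("VARCHAR", 20, 8) uses the user-defined length of 8.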
Looking forward to your feedback :)
[1] https://issues.apache.org/jira/browse/FLINK-32993
Best,
Yubin