[
https://issues.apache.org/jira/browse/FLINK-37528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17937228#comment-17937228
]
Sai Sharath Dandi commented on FLINK-37528:
-------------------------------------------
[~libenchao] Could you please review this Jira and assign it to me?
> Protobuf Format (proto3): Handle default values for optional primitive types
> and primitive types in one of fields
> ------------------------------------------------------------------------------------------------------------------
>
> Key: FLINK-37528
> URL: https://issues.apache.org/jira/browse/FLINK-37528
> Project: Flink
> Issue Type: Improvement
> Affects Versions: 2.0-preview
> Reporter: Sai Sharath Dandi
> Priority: Minor
>
> The read-default-values setting is
> [forced|https://github.com/apache/flink/blob/master/flink-formats/flink-protobuf/src/main/java/org/apache/flink/formats/protobuf/deserialize/ProtoToRowConverter.java#L74]
> to true for primitive types in proto3. This can cause bugs for messages
> that contain a oneof such as the one below:
> {code:java}
> oneof test {
>   string aa = 1;
>   int32 bb = 2;
>   bool cc = 3;
>   Corpus dd = 4;
> }
> {code}
> Even if only the first field of the oneof is set, reading default values
> makes all of its fields non-null after decoding. When such data
> is encoded back to protobuf, it produces a different protobuf message
> than the original, which causes data correctness issues.
> Solution:
> {code:java}
> if (PbFormatUtils.isSimpleType(subType)
>         && !(elementFd.getContainingOneof() != null || elementFd.hasOptionalKeyword())) {
>     readDefaultValues = formatContext.isReadDefaultValuesForPrimitiveTypes();
> }
> {code}
> For primitive types in proto3, we can still do field presence checks when the
> field is declared optional or is part of a oneof.
>
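To illustrate the difference, here is a minimal sketch in plain Java. It does not use the real protobuf or Flink classes: `Field`, `legacyReadDefault`, `proposedReadDefault`, and `decode` are hypothetical stand-ins, and the "legacy" rule simply models the forced-true behaviour described above. The sketch also assumes that an unset field with explicit presence should decode to null.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Predicate;

/** Toy model of the presence decision; not Flink's actual converter. */
final class OneofPresenceSketch {

    /** Hypothetical stand-in for the parts of a field descriptor used here. */
    static final class Field {
        final String name;
        final boolean simpleType;       // primitive or string
        final boolean inOneof;          // models getContainingOneof() != null
        final boolean optionalKeyword;  // models hasOptionalKeyword()
        final Object defaultValue;

        Field(String name, boolean simpleType, boolean inOneof,
              boolean optionalKeyword, Object defaultValue) {
            this.name = name;
            this.simpleType = simpleType;
            this.inOneof = inOneof;
            this.optionalKeyword = optionalKeyword;
            this.defaultValue = defaultValue;
        }
    }

    /** Models the current behaviour: defaults are always read for simple types. */
    static boolean legacyReadDefault(Field f) {
        return f.simpleType;
    }

    /**
     * Models the proposal: only fields without explicit presence follow the
     * configured setting. Returning false otherwise is an assumption of this sketch.
     */
    static boolean proposedReadDefault(Field f, boolean configured) {
        boolean tracksPresence = f.inOneof || f.optionalKeyword;
        return f.simpleType && !tracksPresence && configured;
    }

    /** Decodes a message in which only one field (setName) was actually set. */
    static Map<String, Object> decode(Field[] fields, String setName, Object setValue,
                                      Predicate<Field> readDefault) {
        Map<String, Object> row = new LinkedHashMap<>();
        for (Field f : fields) {
            if (f.name.equals(setName)) {
                row.put(f.name, setValue);
            } else {
                // Unset field: either substitute the type default or keep null.
                row.put(f.name, readDefault.test(f) ? f.defaultValue : null);
            }
        }
        return row;
    }

    /** The simple-typed members of the oneof from the issue (dd is left out). */
    static Field[] oneofFields() {
        return new Field[] {
            new Field("aa", true, true, false, ""),
            new Field("bb", true, true, false, 0),
            new Field("cc", true, true, false, false),
        };
    }

    public static void main(String[] args) {
        Field[] fields = oneofFields();
        System.out.println("legacy:   "
                + decode(fields, "aa", "hello", OneofPresenceSketch::legacyReadDefault));
        System.out.println("proposed: "
                + decode(fields, "aa", "hello", f -> proposedReadDefault(f, true)));
    }
}
```

With only `aa` set, the legacy rule fills `bb` and `cc` with their type defaults, so a later encode cannot tell which oneof member was originally set. The proposed rule leaves them null.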