[ https://issues.apache.org/jira/browse/SPARK-43051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hyukjin Kwon resolved SPARK-43051.
----------------------------------
    Fix Version/s: 3.5.0
       Resolution: Fixed

Issue resolved by pull request 40686
[https://github.com/apache/spark/pull/40686]

> Allow emitting zero values when deserializing protobuf messages
> ---------------------------------------------------------------
>
>                 Key: SPARK-43051
>                 URL: https://issues.apache.org/jira/browse/SPARK-43051
>             Project: Spark
>          Issue Type: Improvement
>          Components: Protobuf
>    Affects Versions: 3.4.0
>            Reporter: Parth Upadhyay
>            Assignee: Parth Upadhyay
>            Priority: Major
>             Fix For: 3.5.0
>
>
> Currently, when deserializing protobufs using from_protobuf, fields that are
> not explicitly present in the serialized message are deserialized as null in
> the resulting struct. However, this includes singular proto3 scalars that were
> explicitly set to their default values, because such fields do [not appear in|https://protobuf.dev/programming-guides/field_presence/#presence-in-tag-value-stream-wire-format-serialization]
> the serialized protobuf.
>
> For example, given a message format like
>
> {code:java}
> syntax = "proto3";
> message Person {
>   string name = 1;
>   int64 age = 2;
>   optional string middle_name = 3;
>   optional int64 salary = 4;
> }
> {code}
> and an example message like
>
> {code:java}
> Person(age = 0, middle_name = ""){code}
> the result from calling from_protobuf on the serialized form of the above
> message would be
>
> {code:java}
> {"name": null, "age": null, "middle_name": "", "salary": null}{code}
>
> It can be useful to deserialize these fields as their defaults, e.g.:
>
> {code:java}
> {"name": "", "age": 0, "middle_name": "", "salary": null}{code}
>
> This behavior also exists in other major libraries, e.g.
> * Java's JsonFormat `includingDefaultValueFields`
> [https://protobuf.dev/reference/java/api-docs/com/google/protobuf/util/JsonFormat.Printer.html#includingDefaultValueFields--]
> * Go's jsonpb `EmitDefaults`
> [https://pkg.go.dev/github.com/golang/protobuf/jsonpb#Marshaler]
>
> I propose extending the spark-protobuf library to support this behavior.
>
> PR: [https://github.com/apache/spark/pull/40686]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
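
The presence rules described in the issue can be sketched in a few lines of plain Python. This is only a toy model, not Spark's or protobuf's implementation: the `Field` tuple, `PERSON_FIELDS`, `serialize`, `deserialize`, and the `emit_defaults` flag are all hypothetical names introduced for illustration, and a dict stands in for the wire format. It assumes that proto3 fields without `optional` are omitted from the wire when they hold their zero value, while `optional` fields are written whenever they are set.

```python
from typing import Any, Dict, NamedTuple


class Field(NamedTuple):
    name: str
    default: Any             # proto3 zero value for the field's type
    explicit_presence: bool  # True for fields declared `optional`


# Hypothetical stand-in for the Person message from the issue.
PERSON_FIELDS = [
    Field("name", "", False),
    Field("age", 0, False),
    Field("middle_name", "", True),
    Field("salary", 0, True),
]


def serialize(values: Dict[str, Any]) -> Dict[str, Any]:
    """Toy wire format: keep only the fields that would be written."""
    wire = {}
    for f in PERSON_FIELDS:
        if f.name not in values:
            continue  # field was never set
        v = values[f.name]
        # `optional` fields are written whenever set, even at the zero value;
        # other proto3 scalars are dropped when they equal the zero value.
        if f.explicit_presence or v != f.default:
            wire[f.name] = v
    return wire


def deserialize(wire: Dict[str, Any], emit_defaults: bool = False) -> Dict[str, Any]:
    """Build a row; absent fields become None, or the zero value on request."""
    row = {}
    for f in PERSON_FIELDS:
        if f.name in wire:
            row[f.name] = wire[f.name]
        elif emit_defaults and not f.explicit_presence:
            # Absence is indistinguishable from "set to zero", so emit zero.
            row[f.name] = f.default
        else:
            # Unset `optional` fields stay None in both modes.
            row[f.name] = None
    return row


wire = serialize({"age": 0, "middle_name": ""})
print(deserialize(wire))
print(deserialize(wire, emit_defaults=True))
```

In this model only `middle_name` survives serialization, which is why `age` comes back as null by default. With the flag enabled, `name` and `age` come back as their zero values, while the unset `optional` field `salary` remains null because its absence is meaningful.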