[
https://issues.apache.org/jira/browse/SPARK-46275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Dongjoon Hyun closed SPARK-46275.
---------------------------------
> Protobuf: Permissive mode should return null rather than struct with null
> fields
> --------------------------------------------------------------------------------
>
> Key: SPARK-46275
> URL: https://issues.apache.org/jira/browse/SPARK-46275
> Project: Spark
> Issue Type: Bug
> Components: Protobuf, Structured Streaming
> Affects Versions: 3.4.2, 3.5.0, 4.0.0
> Reporter: Raghu Angadi
> Assignee: Raghu Angadi
> Priority: Major
> Labels: pull-request-available
> Fix For: 3.5.1, 3.4.3, 4.0.0
>
>
> Consider a protobuf with two fields: {{message Person { string name = 1; int32
> id = 2; }}}
> * The struct returned by {{from_protobuf("Person")}} looks like this:
> ** STRUCT<name STRING, id INT>
> * If the underlying binary record fails to deserialize, it results in an
> exception and the query fails.
> * But if the option {{mode}} is set to {{PERMISSIVE}}, malformed records
> are tolerated and {{null}} is returned.
> ** {*}BUT{*}: The returned struct looks like this: \{"name": null, "id": null}
> *** This is not convenient for the user.
> *** *Ideally,* {{from_protobuf()}} *should return* {{null}} *.*
> ** {{from_protobuf()}} borrowed the current behavior from the {{from_avro()}}
> implementation. The motivation for that behavior is not clear.
> I think we should update the implementation to return {{null}} rather than a
> struct with null fields inside.
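
The difference between the two permissive-mode results described above can be
illustrated with a minimal sketch in plain Python. This is not Spark's
implementation: parse_person is a hypothetical stand-in for protobuf
deserialization (it expects the made-up byte format "name,id"), and the two
permissive_* helpers are illustrative names only.

```python
from typing import Optional

FIELDS = ("name", "id")  # mirrors STRUCT<name STRING, id INT>


def parse_person(raw: bytes) -> dict:
    """Hypothetical parser standing in for protobuf deserialization."""
    name, id_text = raw.decode("utf-8").split(",")
    return {"name": name, "id": int(id_text)}


def permissive_struct_of_nulls(raw: bytes) -> dict:
    """Reported behavior: a malformed record becomes a struct of null fields."""
    try:
        return parse_person(raw)
    except ValueError:  # UnicodeDecodeError is a subclass of ValueError
        return {field: None for field in FIELDS}


def permissive_null(raw: bytes) -> Optional[dict]:
    """Proposed behavior: a malformed record becomes a single null."""
    try:
        return parse_person(raw)
    except ValueError:
        return None


records = [b"alice,1", b"\xff\xfe", b"bob,2"]  # the middle record is malformed
old = [permissive_struct_of_nulls(r) for r in records]
new = [permissive_null(r) for r in records]

# Finding malformed rows: every field must be inspected in the old shape,
# while the proposed shape needs only one null check.
malformed_old = [row for row in old if all(v is None for v in row.values())]
malformed_new = [row for row in new if row is None]
```

In Spark terms, the second shape would let a user filter malformed rows with a
single null check on the column returned by from_protobuf(), instead of
checking each field of the struct.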
--
This message was sent by Atlassian Jira
(v8.20.10#820010)