rangadi opened a new pull request, #44265:
URL: https://github.com/apache/spark/pull/44265

   This is a cherry-pick of #44214 into the 3.4 branch.
   
   From the original PR:
   
   ### What changes were proposed in this pull request?
   This updates the behavior of the `from_protobuf()` built-in function when the underlying record fails to deserialize.
   
     * **Current behavior**:
       * By default, this throws an error and the query fails. [This part is not changed in this PR]
       * When `mode` is set to 'PERMISSIVE', it returns a non-null struct with each of the inner fields set to null, e.g. `{ "field_a": null, "field_b": null }`.
          * This is inconvenient for users. They cannot tell whether the nulls come from a malformed record or from an input whose fields are themselves null. Checking every field for null in a SQL query is also tedious (imagine a query over a struct with 10 fields).
   
     * **New behavior**:
       * When `mode` is set to 'PERMISSIVE', it simply returns `null`.
   
   ### Why are the changes needed?
   This makes it easier for users to detect and handle malformed records.
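
   The difference between the two behaviors can be illustrated with a small plain-Python model. This is only a sketch, not Spark code: `old_permissive`, `new_permissive`, and the `None`-for-failure convention are hypothetical stand-ins, and the field names are taken from the example above.

```python
from typing import Dict, Optional

# Field names borrowed from the example struct in this description.
FIELDS = ("field_a", "field_b")
Struct = Dict[str, Optional[str]]


def old_permissive(parsed: Optional[Struct]) -> Struct:
    """Model of the old behavior; `parsed` is None when deserialization failed."""
    if parsed is None:
        # Malformed record: a non-null struct with every field set to null.
        return {name: None for name in FIELDS}
    return parsed


def new_permissive(parsed: Optional[Struct]) -> Optional[Struct]:
    """Model of the new behavior: a malformed record maps to null itself."""
    return parsed


# A well-formed record whose fields happen to be null, and a malformed one.
valid_but_empty: Struct = {"field_a": None, "field_b": None}
malformed = None

old_rows = [old_permissive(valid_but_empty), old_permissive(malformed)]
new_rows = [new_permissive(valid_but_empty), new_permissive(malformed)]

# Old behavior: the two results are indistinguishable.
old_ambiguous = old_rows[0] == old_rows[1]

# New behavior: a single null check on the struct identifies the malformed row.
new_malformed_flags = [row is None for row in new_rows]

print(old_ambiguous, new_malformed_flags)  # True [False, True]
```

   In a SQL query, the same idea would amount to one null check on the struct column instead of one check per field.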
   
   ### Does this PR introduce _any_ user-facing change?
   Yes, but this does not change the contract. In fact, it clarifies it.
   
   ### How was this patch tested?
    - Unit tests are updated.
   
   ### Was this patch authored or co-authored using generative AI tooling?
   No.
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

