bozhang2820 opened a new pull request #30224:
URL: https://github.com/apache/spark/pull/30224


   ### What changes were proposed in this pull request?
   This change adds support for a user-provided nullable Avro schema when 
writing data with a non-nullable Catalyst schema to Avro. 
   
   Without this change, writing data with a non-nullable Catalyst schema using 
a nullable Avro schema throws an `IncompatibleSchemaException` with a message 
like `Cannot convert Catalyst type StringType to Avro type ["null","string"]`. 
With this change, the writer assumes that the data is non-nullable, logs a 
warning about the nullability difference, and serializes the data to Avro 
format with the provided nullable Avro schema.
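   
   To illustrate the behavior described above, here is a minimal, standalone 
sketch in plain Python. It is not the Spark implementation: the function name 
`resolve_avro_type` and the logger name are hypothetical, and a nullable Avro 
type is assumed to be a two-branch union containing `"null"`.

```python
import json
import logging

# Hypothetical logger name, used only for this illustration.
logger = logging.getLogger("avro_nullability_sketch")


def resolve_avro_type(avro_type, catalyst_nullable):
    """Pick the Avro type that non-null values should be written as.

    avro_type: an Avro type in its parsed JSON form, e.g. "string" or
               ["null", "string"] (Avro unions are JSON arrays).
    catalyst_nullable: whether the Catalyst field is declared nullable.
    """
    if not isinstance(avro_type, list):
        # Not a union: nothing to resolve.
        return avro_type

    non_null = [t for t in avro_type if t != "null"]
    if "null" in avro_type and len(non_null) == 1:
        if not catalyst_nullable:
            # Nullability differs: warn instead of failing, then write the
            # value using the non-null branch of the union.
            logger.warning(
                "Writing non-nullable data with nullable Avro type %s",
                json.dumps(avro_type),
            )
        return non_null[0]

    # Other union shapes are outside the scope of this sketch.
    raise ValueError(f"Unsupported union in this sketch: {json.dumps(avro_type)}")


if __name__ == "__main__":
    logging.basicConfig(level=logging.WARNING)
    print(resolve_avro_type(["null", "string"], catalyst_nullable=False))
```

   The sketch warns rather than raising when the Avro side is nullable and the 
data is not, which mirrors the warn-and-continue behavior this PR describes.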
   
   ### Why are the changes needed?
   Users do not always have full control over the nullability of the Avro 
schemas they use. This change gives them the flexibility to write with such 
schemas.
   
   ### Does this PR introduce _any_ user-facing change?
   Yes. After this change, users can write data with non-nullable Catalyst 
schemas using nullable Avro schemas.
   
   ### How was this patch tested?
   Added unit tests.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


