afilipchik commented on pull request #1514: URL: https://github.com/apache/incubator-hudi/pull/1514#issuecomment-632839321
@umehrot2 Yep, it is an attempt to fix the schema generated by spark-avro. Moving schema generation in-house makes sense, but if I recall correctly the issue does not come from Spark itself but from the underlying library it uses, so rewriting it could be a fair amount of work.

On the test case: the incoming dataset is transformed using Spark SQL, with the schema derived from the query result (NullTargetConverter). We then add a new field to the output, write a batch, and run a compaction. At that point the new schema cannot be used to read the old data, because reading fails on the new fields that have no defaults.
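To illustrate why the old data becomes unreadable: Avro schema resolution requires that any reader field absent from the writer schema carry a default value. A minimal Python sketch of that rule (this is a simplified illustration, not the real Avro library, and the field names are made up):

```python
def check_resolvable(reader_schema, writer_schema):
    """Simplified Avro schema-resolution check: every reader field
    that the writer schema lacks must declare a default, otherwise
    old records cannot be decoded with the new schema."""
    writer_fields = {f["name"] for f in writer_schema["fields"]}
    for f in reader_schema["fields"]:
        if f["name"] not in writer_fields and "default" not in f:
            raise ValueError(
                f"reader field '{f['name']}' is missing from writer "
                "schema and has no default"
            )


# Schema the old files were written with (hypothetical example).
old_schema = {"fields": [{"name": "id", "type": "long"}]}

# New schema after the transform added a field, but without a default:
# compaction tries to read old files with this schema and fails.
new_schema_no_default = {
    "fields": [
        {"name": "id", "type": "long"},
        {"name": "new_col", "type": ["null", "string"]},
    ]
}

# Declaring a (null) default makes the old data readable again.
new_schema_with_default = {
    "fields": [
        {"name": "id", "type": "long"},
        {"name": "new_col", "type": ["null", "string"], "default": None},
    ]
}
```

Calling `check_resolvable(new_schema_no_default, old_schema)` raises, while the schema with the default resolves fine, which mirrors the compaction failure described above.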