cdmikechen edited a comment on issue #2705: URL: https://github.com/apache/hudi/issues/2705#issuecomment-804636641
I've found the problem. There is a new configuration, `hoodie.deltastreamer.schemaprovider.spark_avro_post_processor.enable`, which defaults to `true`. If I use my custom transformer and leave the target schema null, Hudi fails because of the null schema. For testing I set the target schema to the same value as the source schema, which caused Spark to fail with the errors above. If I set `hoodie.deltastreamer.schemaprovider.spark_avro_post_processor.enable` to `false`, Hudi successfully processes the Kafka messages and writes them to HDFS.

However, when synchronizing to Hive, I ran into the same problem as https://github.com/apache/hudi/issues/1751#issuecomment-648460431 (with `hoodie.datasource.hive_sync.use_jdbc` set to `false`). When I set `hoodie.datasource.hive_sync.use_jdbc` to `true`, Hive sync works. I suspect Hudi is still missing the related packages for Hive 3.
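For reference, the two settings discussed above can be expressed in a DeltaStreamer properties file. This is only a minimal sketch: the two keys and their values come from the comment, while the file itself and any surrounding job configuration are assumptions.

```properties
# Disable the Spark-Avro post processor so a null target schema
# does not break the custom transformer (workaround described above).
hoodie.deltastreamer.schemaprovider.spark_avro_post_processor.enable=false

# Use JDBC for Hive sync; with use_jdbc=false the sync failed
# (see hudi issue #1751), presumably due to missing Hive 3 packages.
hoodie.datasource.hive_sync.use_jdbc=true
```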