cdmikechen edited a comment on issue #2705:
URL: https://github.com/apache/hudi/issues/2705#issuecomment-804636641


   I've found the problem:
   There is a new configuration named 
`hoodie.deltastreamer.schemaprovider.spark_avro_post_processor.enable`, and it 
is `true` by default. If I use my custom transformer and leave the `target schema` 
null, hudi will not work because of the null schema.
   For testing, I set the `target schema` to the same value as the `source schema`; 
Spark then fails and reports the errors above. If I set 
`hoodie.deltastreamer.schemaprovider.spark_avro_post_processor.enable` to 
`false`, hudi successfully processes the Kafka messages and writes them to HDFS. 
A minimal sketch of this as DeltaStreamer properties is shown below.
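
   ```properties
   # Sketch of the workaround described above.
   # With the post processor enabled (the default) and the target schema set
   # equal to the source schema, Spark fails with the errors reported above.
   # Disabling it lets the custom transformer run without a target schema:
   hoodie.deltastreamer.schemaprovider.spark_avro_post_processor.enable=false
   ```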
   
   However, when synchronizing to hive, I encountered the same problem as 
https://github.com/apache/hudi/issues/1751#issuecomment-648460431 (with 
`hoodie.datasource.hive_sync.use_jdbc` set to `false`). When I set 
`hoodie.datasource.hive_sync.use_jdbc` to `true`, hive-sync works. I think 
hudi is still missing some hive3-related packages. A sketch of the hive-sync 
properties that worked for me follows.
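
   ```properties
   # Hive sync settings that worked in this environment; the JDBC URL,
   # database, and table names below are placeholders, not my real setup:
   hoodie.datasource.hive_sync.enable=true
   hoodie.datasource.hive_sync.use_jdbc=true
   hoodie.datasource.hive_sync.jdbcurl=jdbc:hive2://hiveserver2-host:10000
   hoodie.datasource.hive_sync.database=default
   hoodie.datasource.hive_sync.table=my_table
   ```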

