Sam-Serpoosh commented on issue #8519: URL: https://github.com/apache/hudi/issues/8519#issuecomment-1542839452
> for the first issue regarding the schema, this is because we are fetching that schema as a string. If that class is not defined in the string, we won't know how it is defined. Maybe there is some arg to pass to the api to get the schemas that this schema relies on as well?

@the-other-tim-brown That makes perfect sense. I ended up resolving that issue by switching from the `Apicurio` registry I was previously using to the **Confluent Schema Registry**. Confluent includes everything correctly in **one place**, so Hudi/DeltaStreamer can fetch the full schema in one sweep.

> For the second, it is hard to tell without looking at your data. If you pull the data locally and step through, you may have a better shot of understanding. The main thing I have seen trip people up is the requirements for the delete records in the topic. You can also try out the same patch Sydney posted above for filtering out the tombstones in kafka.

I **highly** doubt my case is caused by a tombstone record or the like, because I'm testing this data flow against a dummy/test Postgres table to which I've applied **only** `INSERT` operations so far. And BTW, I could **successfully** run a **vanilla Kafka ingestion** end-to-end and populate a partitioned Hudi table as expected, so the issue is definitely specific to switching to `PostgresDebeziumSource` and `PostgresDebeziumAvroPayload`.

Thank you very much for your input. I'll try to work out the best way to debug this and figure out what's causing the exception I shared above when it comes to DeltaStreamer <> Debezium ...
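
For reference, a DeltaStreamer invocation for the Postgres Debezium path generally follows the shape below. This is a hedged sketch based on the Hudi Debezium CDC documentation, not the exact command from this thread: the topic name, table names, paths, and registry/broker URLs are placeholders, and your jar locations and Spark options will differ.

```shell
# Hypothetical sketch: DeltaStreamer consuming a Debezium Postgres topic,
# fetching the full Avro schema from a Confluent Schema Registry.
spark-submit \
  --class org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer \
  /path/to/hudi-utilities-bundle.jar \
  --table-type COPY_ON_WRITE \
  --source-class org.apache.hudi.utilities.sources.debezium.PostgresDebeziumSource \
  --payload-class org.apache.hudi.common.model.debezium.PostgresDebeziumAvroPayload \
  --schemaprovider-class org.apache.hudi.utilities.schema.SchemaRegistryProvider \
  --source-ordering-field _event_lsn \
  --target-base-path s3://my-bucket/hudi/test_table \
  --target-table test_table \
  --hoodie-conf bootstrap.servers=kafka:9092 \
  --hoodie-conf hoodie.deltastreamer.source.kafka.topic=pg.public.test_table \
  --hoodie-conf hoodie.deltastreamer.schemaprovider.registry.url=http://schema-registry:8081/subjects/pg.public.test_table-value/versions/latest
```

Pointing `SchemaRegistryProvider` at a Confluent-style `subjects/<topic>-value/versions/latest` endpoint is what lets DeltaStreamer resolve the complete schema in one fetch, which matches the fix described above.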