You can use Kafka to subscribe to the MySQL binlog, then consume the historical data directly. For details, please refer to [1].
[1] http://hudi.apache.org/docs/writing_data.html

From: Gurudatt Kulkarni
Date: 2020-08-21 15:18
To: dev
Subject: [Question] How to use Hudi for migrating a historical mysql table?

Hi All,

I have a use case where there is historical data available in a MySQL table, which is being populated by a Kafka topic. My plan is to create a Spark job that will migrate the data from MySQL using the Hudi DataSource. Once the migration of the historical data from MySQL is done, use DeltaStreamer to tap the Kafka topic for real-time data and write to the same location. Is this possible? How should I approach this problem, as it may corrupt the Hudi metadata?

Regards,
Gurudatt
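For reference, the two-phase approach discussed above (a one-time Spark DataSource load of the historical MySQL data, with DeltaStreamer later writing to the same base path) could be sketched roughly as below. This is a minimal sketch, not a tested job: the JDBC URL, database, table name, record key, precombine field, and base path are all hypothetical placeholders. The important detail is that the table name, record key, and precombine field used here must match the later DeltaStreamer configuration, so both writers maintain the same Hudi metadata consistently.

```scala
// Sketch only: assumes Spark with the Hudi Spark bundle on the classpath.
// All connection details, field names, and paths below are hypothetical.
import org.apache.spark.sql.{SaveMode, SparkSession}

object HistoricalMigration {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("mysql-historical-to-hudi")
      .getOrCreate()

    // Phase 1: one-time read of the historical table from MySQL over JDBC.
    val historical = spark.read
      .format("jdbc")
      .option("url", "jdbc:mysql://mysql-host:3306/mydb") // hypothetical host/db
      .option("dbtable", "events")                        // hypothetical table
      .option("user", "hudi_user")
      .option("password", sys.env.getOrElse("MYSQL_PASSWORD", ""))
      .load()

    // Bulk-insert into the Hudi table. DeltaStreamer can then tail the Kafka
    // topic and write to the same base path, provided the table name, record
    // key, and precombine field below match its configuration.
    historical.write
      .format("hudi")
      .option("hoodie.table.name", "events")
      .option("hoodie.datasource.write.recordkey.field", "id")          // hypothetical key
      .option("hoodie.datasource.write.precombine.field", "updated_at") // hypothetical field
      .option("hoodie.datasource.write.operation", "bulk_insert")
      .mode(SaveMode.Overwrite)
      .save("hdfs:///data/hudi/events") // hypothetical base path

    spark.stop()
  }
}
```

Running this once completes the historical load; DeltaStreamer pointed at the same target base path and table then takes over the incremental Kafka data.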
