Hi Gurudatt,

You can use Debezium to migrate the historical data as well. With Debezium, both the existing rows and any new changes are published to Kafka, so DeltaStreamer can ingest the historical and the real-time data through the same pipeline. I used this setup at my previous org for the same use case.
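To make the idea concrete, here is a rough sketch of how the Debezium MySQL connector could be registered so that it snapshots the existing rows first and then keeps streaming binlog changes. This is only an illustration under assumptions, not the exact config I ran: the helper function, host, credentials, server name, and table names are all placeholders, and the property names follow the Debezium 1.x MySQL connector (other releases name some of them differently, e.g. `table.whitelist` in older versions or `topic.prefix` in newer ones), so please check them against the docs for your version.

```python
import json


def debezium_mysql_connector_config(host, database, table, kafka_bootstrap,
                                    server_name="mysql_src"):
    """Build a Kafka Connect registration payload for a Debezium MySQL connector.

    Hypothetical helper: every value here is a placeholder to adapt.
    """
    return {
        "name": f"{server_name}-connector",
        "config": {
            "connector.class": "io.debezium.connector.mysql.MySqlConnector",
            "database.hostname": host,
            "database.port": "3306",
            "database.user": "debezium",        # placeholder credentials
            "database.password": "change-me",   # placeholder credentials
            "database.server.id": "184054",     # any id unique among MySQL replicas
            "database.server.name": server_name,
            "table.include.list": f"{database}.{table}",
            # "initial" takes a snapshot of existing rows, then reads the binlog
            "snapshot.mode": "initial",
            "database.history.kafka.bootstrap.servers": kafka_bootstrap,
            "database.history.kafka.topic": f"schema-changes.{server_name}",
        },
    }


if __name__ == "__main__":
    payload = debezium_mysql_connector_config(
        host="mysql.example.internal",   # assumed host
        database="shop",                 # assumed database
        table="orders",                  # assumed table
        kafka_bootstrap="kafka:9092",    # assumed broker address
    )
    # This JSON would be POSTed to the Kafka Connect REST endpoint (/connectors).
    print(json.dumps(payload, indent=2))
```

With Debezium 1.x naming, the change events for that table would land in a topic called `<server name>.<database>.<table>`, which is the topic DeltaStreamer would then be pointed at.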
On Fri, Aug 21, 2020 at 12:30 PM [email protected] <[email protected]> wrote:

> You can use kafka to subscribe mysql binlog ,then consume historical data
> directly.
> For details, please refer to [1]
>
> [1] http://hudi.apache.org/docs/writing_data.html
>
> [email protected]
>
> From: Gurudatt Kulkarni
> Date: 2020-08-21 15:18
> To: dev
> Subject: [Question] How to use Hudi for migrating a historical mysql table?
> Hi All,
>
> I have a use case where there is historical data available in MySQL table
> which is being populated by a Kafka topic.
>
> My plan is to create a spark job that will migrate data from MySQL using
> Hudi Datasource. Once the migration of historical data is done from MySQL,
> use Deltastreamer to tap the Kafka topic for real-time data and write to
> the same location. Is this possible? How to approach this problem, as it
> may corrupt Hudi metadata.
>
> Regards,
> Gurudatt
