Hi Gurudatt,

You can use Debezium to migrate the historical data as well. With Debezium
in place, DeltaStreamer can ingest both the existing data and the new
changes. I used it at my previous org for the same use case.
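
A minimal sketch of what such a setup might look like, assuming the Debezium
MySQL connector is run on Kafka Connect with an initial snapshot, so that
existing rows and later binlog changes land in the same Kafka topic for
DeltaStreamer to read. The helper function, host, credentials, table name and
connector name are all hypothetical, and several property names differ
between Debezium versions, so check them against the docs for your version.

```python
import json


def build_debezium_mysql_config(host, user, password, table, kafka_bootstrap):
    """Build a Kafka Connect payload for the Debezium MySQL connector.

    All values are illustrative. Property names follow Debezium 1.x;
    older releases use table.whitelist, and 2.x renames
    database.server.name and the database.history.* keys.
    """
    return {
        "name": "mysql-historical-migration",  # hypothetical connector name
        "config": {
            "connector.class": "io.debezium.connector.mysql.MySqlConnector",
            "database.hostname": host,
            "database.port": "3306",
            "database.user": user,
            "database.password": password,
            # Must be unique among the MySQL server's replication clients.
            "database.server.id": "184054",
            # Logical name, used as the prefix of the Kafka topic names.
            "database.server.name": "mysqlsrc",
            "table.include.list": table,
            # Snapshot the existing rows first, then keep reading the binlog.
            "snapshot.mode": "initial",
            "database.history.kafka.bootstrap.servers": kafka_bootstrap,
            "database.history.kafka.topic": "schema-changes.mysqlsrc",
        },
    }


config = build_debezium_mysql_config(
    host="mysql.example.internal",   # placeholder host
    user="debezium",                 # placeholder credentials
    password="change-me",
    table="inventory.orders",        # placeholder database.table
    kafka_bootstrap="kafka:9092",    # placeholder broker address
)
print(json.dumps(config, indent=2))
```

The printed JSON is the kind of payload one would POST to the Kafka Connect
REST endpoint (/connectors) to register the connector. DeltaStreamer would
then be pointed at the resulting topic.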

On Fri, Aug 21, 2020 at 12:30 PM [email protected] <[email protected]>
wrote:

>
> You can use Kafka to subscribe to the MySQL binlog, then consume the
> historical data directly.
> For details, please refer to [1]
>
>
> [1] http://hudi.apache.org/docs/writing_data.html
>
>
> [email protected]
>
> From: Gurudatt Kulkarni
> Date: 2020-08-21 15:18
> To: dev
> Subject: [Question] How to use Hudi for migrating a historical mysql table?
> Hi All,
>
> I have a use case where historical data is available in a MySQL table
> that is being populated from a Kafka topic.
>
> My plan is to create a Spark job that will migrate the data from MySQL
> using the Hudi Datasource. Once the migration of historical data from MySQL
> is done, I would use DeltaStreamer to tap the Kafka topic for real-time data
> and write to the same location. Is this possible? How should I approach
> this, given that it may corrupt the Hudi metadata?
>
> Regards,
> Gurudatt
>
