Another take:

* Debezium <https://debezium.io/documentation/reference/stable/connectors/mysql.html> to read the MySQL binary log (binlog) and send the changes to Kafka
* Kafka Connect to write to cloud storage -> Hive
* OR
* Spark Streaming to parse the binlog change events -> Storage -> Hive

Regards
________________________________
From: Gibson <[email protected]>
Sent: 17 August 2022 16:53
To: Akash Vellukai <[email protected]>
Cc: [email protected] <[email protected]>
Subject: [EXTERNAL] Re: Spark streaming - Data Ingestion

If you have space for a message log like Kafka, then you should try:

MySQL -> Kafka (via CDC) -> Spark (Structured Streaming) -> HDFS/S3/ADLS -> Hive

On Wed, Aug 17, 2022 at 5:40 PM Akash Vellukai <[email protected]> wrote:

Dear sir,

I have tried a lot on this. Could you help me with it? How do I ingest data from MySQL into Hive with Spark Streaming? Could you give me an overview?

Thanks and regards,
Akash P
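
For reference, the MySQL -> Kafka (via CDC) -> Spark (Structured Streaming) -> HDFS/S3/ADLS path suggested in this thread could look roughly like the PySpark sketch below. It is only a sketch under assumptions: it assumes Debezium publishes JSON change events with the usual `before` / `after` / `op` / `ts_ms` envelope (wrapped in `payload` when schemas are enabled), and that the job is started with the `spark-sql-kafka-0-10` package available. The function names `flatten_change_event` and `start_ingest`, and all arguments (servers, topic, paths), are hypothetical and not from the thread.

```python
import json
from typing import Optional


def flatten_change_event(raw: Optional[str]) -> Optional[dict]:
    """Flatten one Debezium change event (a JSON string) into a single row dict.

    Plain Python, so the envelope handling can be checked without a cluster.
    """
    if raw is None:
        # Kafka tombstone records have a null value; nothing to write.
        return None
    event = json.loads(raw)
    # Assumption: with schemas enabled the envelope sits under "payload",
    # otherwise the envelope is the top-level object.
    payload = event.get("payload", event)
    if not isinstance(payload, dict):
        return None
    op = payload.get("op")  # c = create, u = update, d = delete, r = snapshot read
    # Deletes carry the old row in "before"; other ops carry the new row in "after".
    row = payload.get("before") if op == "d" else payload.get("after")
    if row is None:
        return None
    return {**row, "_op": op, "_ts_ms": payload.get("ts_ms")}


def start_ingest(spark, bootstrap_servers, topic, output_path, checkpoint_path):
    """Start a Structured Streaming query: Kafka topic -> Parquet files."""
    # Imported here so the helper above can be used without Spark installed.
    from pyspark.sql.functions import col, get_json_object

    value = col("value").cast("string")
    changes = (
        spark.readStream.format("kafka")  # needs the spark-sql-kafka-0-10 package
        .option("kafka.bootstrap.servers", bootstrap_servers)
        .option("subscribe", topic)
        .option("startingOffsets", "earliest")
        .load()
        .select(
            get_json_object(value, "$.payload.op").alias("op"),
            get_json_object(value, "$.payload.ts_ms").cast("long").alias("ts_ms"),
            get_json_object(value, "$.payload.before").alias("before_json"),
            get_json_object(value, "$.payload.after").alias("after_json"),
        )
    )
    return (
        changes.writeStream.format("parquet")
        .option("path", output_path)
        .option("checkpointLocation", checkpoint_path)
        .outputMode("append")
        .start()
    )
```

An external Hive table can then be defined over `output_path`. The sketch only appends raw change records; applying updates and deletes to produce a current-state Hive table is a separate step and is left out here.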
