Re: [DISCUSSION]Carbondata Streamer tool and Schema change capture in CDC merge

2021-09-13 Thread Mahesh Raju Somalaraju
+1 for the streamer tool. Thanks & Regards - Mahesh Raju S, GitHub ID: maheshrajus. On Tue, Aug 31, 2021 at 11:18 PM Akash Nilugal wrote: > Hi Community, > > OLTP systems like MySQL are used heavily for storing transactional data in > real-time and the same data is later used for doing fraud

Re: [DISCUSSION]Carbondata Streamer tool and Schema change capture in CDC merge

2021-09-11 Thread Pratyaksh Sharma
Hi Indhu, Apologies for the late reply. Please find the answers inline below. 1. For the multi-table merge scenario, does it support concurrent CDC or sequential CDC to the target table? > In phase 1, we are supporting a scenario where multiple tables (all with the same schema) are being pushed to

Re: [DISCUSSION]Carbondata Streamer tool and Schema change capture in CDC merge

2021-09-06 Thread Nihal ojha
+1, good idea to implement the streamer tool. Regards, Nihal. On 2021/08/31 17:47:35, Akash Nilugal wrote: > Hi Community, > > OLTP systems like MySQL are used heavily for storing transactional data in > real-time and the same data is later used for doing fraud detection and > taking various

Re: [DISCUSSION]Carbondata Streamer tool and Schema change capture in CDC merge

2021-09-06 Thread Shreelekhya Gampa
+1 for the feature. On 2021/08/31 17:47:35, Akash Nilugal wrote: > Hi Community, > > OLTP systems like MySQL are used heavily for storing transactional data in > real-time and the same data is later used for doing fraud detection and > taking various data-driven business decisions. Since OLTP

Re: [DISCUSSION]Carbondata Streamer tool and Schema change capture in CDC merge

2021-09-06 Thread Indhumathi M
+1 for the streamer tool. I have some questions listed below. 1. For the multi-table merge scenario, does it support concurrent CDC or sequential CDC to the target table? 2. In failure scenarios (for example, the streamer tool is killed or crashes), how can we ensure data is not duplicated on restarting the Kafka

Re: [DISCUSSION]Carbondata Streamer tool and Schema change capture in CDC merge

2021-09-01 Thread Akash r
Hi Ravi, Thanks for the approval and the questions. Please find my comments below. 1. Generally CDC includes IUD operations, so how are you planning to handle them? Are you planning to use the merge command? If yes, how frequently do you want to merge? You are right, it mainly includes the IUD

Re: [DISCUSSION]Carbondata Streamer tool and Schema change capture in CDC merge

2021-09-01 Thread Akash r
Hi Likun, Thanks for the approval. As you can see in the last section of the design doc, I have divided the work by scope and listed the points; we will proceed in that manner. Regards, Akash On Wed, Sep 1, 2021 at 11:54 AM Jacky Li wrote: > +1 > It is a really good feature, looking forward to it. > >

Re: [DISCUSSION]Carbondata Streamer tool and Schema change capture in CDC merge

2021-09-01 Thread Ravindra Pesala
+1. I would like a few clarifications regarding the design. 1. Generally CDC includes IUD operations, so how are you planning to handle them? Are you planning to use the merge command? If yes, how frequently do you want to merge? 2. How can you ensure Kafka exactly-once semantics (how can you

Re: [DISCUSSION]Carbondata Streamer tool and Schema change capture in CDC merge

2021-09-01 Thread Jacky Li
+1. It is a really good feature, looking forward to it. I suggest breaking it down into small tasks so that it is easy to review. Regards, Jacky. On 2021/08/31 17:47:35, Akash Nilugal wrote: > Hi Community, > > OLTP systems like MySQL are used heavily for storing transactional data in > real-time

[DISCUSSION]Carbondata Streamer tool and Schema change capture in CDC merge

2021-08-31 Thread Akash Nilugal
Hi Community, OLTP systems like MySQL are used heavily for storing transactional data in real-time, and the same data is later used for fraud detection and for making various data-driven business decisions. Since OLTP systems are not suited for analytical queries due to their row-based storage,