Hello guys, I would like to understand the different approaches for distributed incremental load from HBase. Is there any *tool / incubator tool* that satisfies this requirement?
*Approach 1:* Write a Kafka producer, manually maintain a flag column for events, and ingest the events into HDFS / S3 with LinkedIn Gobblin.

*Approach 2:* Run a scheduled Spark job that reads from HBase, applies transformations, and maintains the flag column at the HBase level.

In both approaches above, I need to maintain column-level flags: 0 (default), 1 (sent), 2 (sent and acknowledged). On the next run, the producer takes another batch of 1000 rows where the flag is 0 or 1.

I am looking for a best-practice approach with any distributed tool.

Thanks.
- Chetan Khatri
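
To make the flag scheme concrete, here is a minimal in-memory sketch in Python. It is only an illustration of the 0 / 1 / 2 state transitions and the "next 1000 rows where flag is 0 or 1" selection. The dictionary stands in for the HBase flag column, and the names `next_batch`, `mark_sent`, and `mark_acked` are hypothetical helpers, not part of HBase, Kafka, Gobblin, or Spark.

```python
from typing import Dict, Iterable, List

# Flag values as described in the post.
PENDING = 0  # default: row has not been sent yet
SENT = 1     # row was sent, but no acknowledgement received
ACKED = 2    # row was sent and acknowledged

def next_batch(flags: Dict[str, int], batch_size: int = 1000) -> List[str]:
    """Return up to batch_size row keys whose flag is 0 or 1, in sorted key order.

    Assumption: `flags` maps row key -> flag value and stands in for a
    scan over the flag column of the real table.
    """
    batch: List[str] = []
    for key in sorted(flags):
        if flags[key] in (PENDING, SENT):
            batch.append(key)
            if len(batch) == batch_size:
                break
    return batch

def mark_sent(flags: Dict[str, int], keys: Iterable[str]) -> None:
    """Move rows from 0 (default) to 1 (sent); other states are left alone."""
    for key in keys:
        if flags[key] == PENDING:
            flags[key] = SENT

def mark_acked(flags: Dict[str, int], keys: Iterable[str]) -> None:
    """Move rows from 1 (sent) to 2 (sent and acknowledged)."""
    for key in keys:
        if flags[key] == SENT:
            flags[key] = ACKED
```

With this sketch, rows that were sent but never acknowledged keep flag 1, so they are picked up again by the next call to `next_batch` together with rows that are still 0. That gives at-least-once delivery, so the downstream consumer would have to tolerate duplicates.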