I haven't used Gobblin.
You could ask on the Gobblin mailing list about the first option.

The second option would work.
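
For illustration, here is a minimal sketch of the flag bookkeeping that the
second option implies. It is plain Python against an in-memory dict, not real
HBase or Spark code. The names next_batch, mark_sent and mark_acked are
hypothetical, and the flag values (0, 1, 2) are taken from the original
message below. In a real job the dict would be replaced by an HBase scan and
puts.

```python
from typing import Dict, Iterable, List

# Flag values as described in the original message.
PENDING, SENT, ACKED = 0, 1, 2


def next_batch(rows: Dict[str, int], batch_size: int = 1000) -> List[str]:
    """Return up to batch_size row keys whose flag is 0 or 1.

    `rows` is a stand-in for the HBase table (row key -> flag value).
    Keys are visited in sorted order to mimic an ordered table scan.
    """
    picked: List[str] = []
    for key in sorted(rows):
        if rows[key] in (PENDING, SENT):
            picked.append(key)
            if len(picked) == batch_size:
                break
    return picked


def mark_sent(rows: Dict[str, int], keys: Iterable[str]) -> None:
    """Move rows from 0 (default) to 1 (sent); other flags are left alone."""
    for key in keys:
        if rows[key] == PENDING:
            rows[key] = SENT


def mark_acked(rows: Dict[str, int], keys: Iterable[str]) -> None:
    """Move rows to 2 (sent and acknowledged) once delivery is confirmed."""
    for key in keys:
        rows[key] = ACKED
```

Because rows with flag 1 are picked up again, a batch that was sent but never
acknowledged is retried on the next run, so the downstream writer should
tolerate duplicates.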


On Wed, Dec 21, 2016 at 2:28 AM, Chetan Khatri <chetan.opensou...@gmail.com>
wrote:

> Hello Guys,
>
> I would like to understand the different approaches for distributed
> incremental loads from HBase. Is there any *tool / incubator tool* that
> satisfies this requirement?
>
> *Approach 1:*
>
> Write a Kafka producer, manually maintain a flag column for events, and
> ingest them with LinkedIn Gobblin to HDFS / S3.
>
> *Approach 2:*
>
> Run a scheduled Spark job: read from HBase, do the transformations, and
> maintain the flag column at the HBase level.
>
> In both approaches above, I need to maintain column-level flags, such as
> 0 - default, 1 - sent, 2 - sent and acknowledged, so that next time the
> producer takes another batch of 1000 rows where the flag is 0 or 1.
>
> I am looking for a best-practice approach with any distributed tool.
>
> Thanks.
>
> - Chetan Khatri
>
