Re: Approach: Incremental data load from HBASE

Chetan Khatri Wed, 21 Dec 2016 08:02:11 -0800

OK, sure, I will ask.

But what would be a generic best-practice solution for incremental loads from
HBase?


On Wed, Dec 21, 2016 at 8:42 PM, Ted Yu <yuzhih...@gmail.com> wrote:

> I haven't used Gobblin.
> You can consider asking the Gobblin mailing list about the first option.
>
> The second option would work.
>
>
> On Wed, Dec 21, 2016 at 2:28 AM, Chetan Khatri <
> chetan.opensou...@gmail.com> wrote:
>
>> Hello Guys,
>>
>> I would like to understand the different approaches for distributed
>> incremental loads from HBase. Is there any *tool / incubator tool* that
>> satisfies this requirement?
>>
>> *Approach 1:*
>>
>> Write a Kafka producer, manually maintain a flag column for events, and
>> ingest them with LinkedIn Gobblin to HDFS / S3.
>>
>> *Approach 2:*
>>
>> Run a scheduled Spark job: read from HBase, do the transformations, and
>> maintain the flag column at the HBase level.
>>
>> In both approaches above, I need to maintain column-level flags, such as
>> 0 - default, 1 - sent, 2 - sent and acknowledged. So next time the producer
>> will take another batch of 1000 rows where the flag is 0 or 1.
>>
>> I am looking for a best-practice approach with any distributed tool.
>>
>> Thanks.
>>
>> - Chetan Khatri
>>
>
>
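
The flag scheme described in the original message (0 = default, 1 = sent, 2 = sent and acknowledged, batches of 1000 rows where the flag is 0 or 1) can be pictured with a small sketch. This is only an in-memory illustration, not an HBase, Kafka, or Gobblin client: the `table` dict, the row-key format, and the helper names `next_batch` and `mark` are assumptions made up for the example, and a real job would read and update the flag column through whatever HBase access layer it uses.

```python
from typing import Dict, List

# Flag values as described in the thread.
NEW, SENT, ACKED = 0, 1, 2
BATCH_SIZE = 1000  # batch size mentioned in the thread


def next_batch(flags: Dict[str, int], limit: int = BATCH_SIZE) -> List[str]:
    """Return up to `limit` row keys whose flag is 0 or 1, in row-key order."""
    pending = [key for key in sorted(flags) if flags[key] in (NEW, SENT)]
    return pending[:limit]


def mark(flags: Dict[str, int], keys: List[str], value: int) -> None:
    """Set the flag of every listed row key to `value`."""
    for key in keys:
        flags[key] = value


# Hypothetical stand-in for the flag column: row key -> flag.
table = {f"row-{i:05d}": NEW for i in range(2500)}

# First run: pick a batch and mark it as sent.
batch = next_batch(table)
mark(table, batch, SENT)

# Assume only the first 900 rows were acknowledged downstream.
mark(table, batch[:900], ACKED)

# Next run: the 100 unacknowledged rows are picked up again,
# followed by rows that have never been sent.
retry = next_batch(table)
```

Because rows with flag 1 are selected again, a row that was sent but never acknowledged is re-sent on the next run, so the downstream consumer has to tolerate duplicates.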
