RE: Kafka or Flume

JP gupta Thu, 29 Jun 2017 21:35:19 -0700

The ideal sequence should be:

1.      Ingest using Kafka -> validate and process using Spark -> write 
into any NoSQL DB or Hive.

From my recent experience, writing directly to HDFS can be slow depending on 
the data format.
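
To make the validation step in the sequence above concrete, here is a minimal sketch in plain Python. It is not Spark or Kafka code: the incoming batch stands in for records read from Kafka, and the HISTORY dictionary stands in for historical data loaded from Hadoop. The field names (account, amount) and the "10x the historical maximum" rule are assumptions made up for illustration only.

```python
from typing import Dict, Iterable, Iterator

# Hypothetical historical data: account id -> largest amount seen so far.
# In a real pipeline this would be loaded from Hadoop (for example a Hive table).
HISTORY: Dict[str, float] = {"acc-1": 500.0, "acc-2": 120.0}


def is_valid(txn: dict, history: Dict[str, float]) -> bool:
    """Return True if the transaction passes the illustrative checks."""
    account = txn.get("account")
    amount = txn.get("amount")
    # Unknown accounts have no history to validate against, so reject them.
    if account not in history:
        return False
    # The amount must be a positive number.
    if not isinstance(amount, (int, float)) or amount <= 0:
        return False
    # Made-up rule: reject amounts more than 10x the historical maximum.
    return amount <= 10 * history[account]


def validate_stream(txns: Iterable[dict], history: Dict[str, float]) -> Iterator[dict]:
    """Yield only the transactions that pass validation; drop the rest."""
    for txn in txns:
        if is_valid(txn, history):
            yield txn


# Stand-in for a batch of records consumed from Kafka.
incoming = [
    {"account": "acc-1", "amount": 100.0},   # passes
    {"account": "acc-2", "amount": 5000.0},  # too large compared with history
    {"account": "acc-9", "amount": 10.0},    # unknown account
    {"account": "acc-1", "amount": -5.0},    # non-positive amount
]

# Only the records in `stored` would be written to the NoSQL DB or Hive.
stored = list(validate_stream(incoming, HISTORY))
```

In Spark the same idea would typically be expressed as a filter or a join against the historical table, so that failed records are never written to storage.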

Thanks

JP 

From: Sudeep Singh Thakur [mailto:[email protected]] 
Sent: 30 June 2017 09:26
To: Sidharth Kumar
Cc: Maggy; [email protected]
Subject: Re: Kafka or Flume

In your use case, Kafka would be better because you want some transformations 
and validations.

Kind regards,
Sudeep Singh Thakur

On Jun 30, 2017 8:57 AM, "Sidharth Kumar" <[email protected]> wrote:

Hi,

I have a requirement to ingest all transactional data into Hadoop in real 
time and, before storing the data in Hadoop, process it to validate the data. 
If the data fails validation, it will not be stored in Hadoop. The validation 
process also makes use of historical data that is stored in Hadoop. So my 
question is: which ingestion tool would be best for this, Kafka or Flume?

Any suggestions will be a great help for me.

Warm Regards

Sidharth Kumar | Mob: +91 8197 555 599/7892 192 367 |  
LinkedIn:www.linkedin.com/in/sidharthkumar2792
