Hey Jay, 
  It's awesome to get a reply from one of the key Kafka contributors :) . Thanks 
for suggesting Kafka Connect.

How does Kafka Connect deal with small files in HDFS? (I assume setting a 
large flush.size lets me maintain a minimum HDFS file size; see the config 
sketch below.)
Does Kafka Connect keep file handles open until a file is committed? (Flume 
keeps file handles open, resulting in too many open files.)
Can I write a custom serializer for Kafka Connect? (See the converter sketch 
below.)
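
For reference, here is the kind of sink config I have in mind, assuming the 
Confluent HDFS connector (io.confluent.connect.hdfs.HdfsSinkConnector); the 
topic name and HDFS URL are placeholders:

  name=hdfs-sink
  connector.class=io.confluent.connect.hdfs.HdfsSinkConnector
  tasks.max=1
  # placeholder topic and cluster URL
  topics=my_topic
  hdfs.url=hdfs://namenode:8020
  # commit an HDFS file only after this many records, avoiding small files
  flush.size=100000
  # also rotate on a time interval, for partitions with sporadic data
  rotate.interval.ms=600000

My understanding is that flush.size counts records per topic partition before 
a file is committed, so the resulting file size still depends on record size.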
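
On the serializer question, my understanding is that Connect's pluggable 
serialization point is the Converter interface rather than a plain Kafka 
serializer. A minimal sketch of what I mean (the class name 
Utf8StringConverter is mine, not from any real codebase):

  import java.nio.charset.StandardCharsets;
  import java.util.Map;

  import org.apache.kafka.connect.data.Schema;
  import org.apache.kafka.connect.data.SchemaAndValue;
  import org.apache.kafka.connect.storage.Converter;

  // Hypothetical converter that treats every message as a UTF-8 string.
  public class Utf8StringConverter implements Converter {

      @Override
      public void configure(Map<String, ?> configs, boolean isKey) {
          // nothing to configure in this sketch
      }

      // Connect data -> raw bytes stored in Kafka
      @Override
      public byte[] fromConnectData(String topic, Schema schema, Object value) {
          return value == null
                  ? null
                  : value.toString().getBytes(StandardCharsets.UTF_8);
      }

      // raw bytes from Kafka -> Connect data handed to the sink (e.g. HDFS)
      @Override
      public SchemaAndValue toConnectData(String topic, byte[] value) {
          if (value == null) {
              return SchemaAndValue.NULL;
          }
          return new SchemaAndValue(Schema.OPTIONAL_STRING_SCHEMA,
                  new String(value, StandardCharsets.UTF_8));
      }
  }

If I understand correctly, it would be wired in through key.converter and 
value.converter in the worker properties.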

Thanks,
R P

________________________________________
From: Jay Kreps <j...@confluent.io>
Sent: Thursday, February 11, 2016 11:45 AM
To: users@kafka.apache.org
Subject: Re: What is the best way to write Kafka data into HDFS?

Check out Kafka Connect:

http://www.confluent.io/blog/how-to-build-a-scalable-etl-pipeline-with-kafka-connect

-Jay


On Wed, Feb 10, 2016 at 5:09 PM, R P <hadoo...@outlook.com> wrote:

> Hello All,
>   New Kafka user here. What is the best way to write Kafka data into HDFS?
> I have looked into the following options and found that Flume is the
> quickest and easiest to set up.
>
> 1. Flume
> 2. KaBoom
> 3. Kafka Hadoop Loader
> 4. Camus -> Gobblin
>
> However, Flume can result in small-file problems when your data is
> partitioned and some partitions generate data sporadically.
>
> What are some best practices and options to write data from Kafka to HDFS?
>
> Thanks,
> R P
>
