Re: Using spark streaming to load data from Kafka to HDFS

2015-08-22 Thread Xu (Simon) Chen
Last time I checked, Camus doesn't support storing data as Parquet, which is a deal breaker for me. Otherwise it works well for my Kafka topics with low data volume. I am currently using Spark Streaming to ingest data, generate semi-realtime stats and publish to a dashboard, and dump full dataset

Re: Using spark streaming to load data from Kafka to HDFS

2015-05-06 Thread Saisai Shao
Also, Kafka has a Hadoop consumer API for doing such things; please refer to http://kafka.apache.org/081/documentation.html#kafkahadoopconsumerapi 2015-05-06 12:22 GMT+08:00 MrAsanjar . afsan...@gmail.com: why not try https://github.com/linkedin/camus - Camus is a Kafka-to-HDFS pipeline. On

Re: Using spark streaming to load data from Kafka to HDFS

2015-05-06 Thread Rendy Bambang Junior
Because using Spark Streaming looks a lot simpler. What's the difference between Camus and Kafka Streaming for this case? Why does Camus excel? Rendy On Wed, May 6, 2015 at 2:15 PM, Saisai Shao sai.sai.s...@gmail.com wrote: Also, Kafka has a Hadoop consumer API for doing such things; please

Using spark streaming to load data from Kafka to HDFS

2015-05-05 Thread Rendy Bambang Junior
Hi all, I am planning to load data from Kafka to HDFS. Is it normal to use Spark Streaming to load data from Kafka to HDFS? What are the concerns with doing this? There is no processing to be done by Spark; it only needs to store data from Kafka in HDFS for storage and further Spark processing. Rendy

Re: Using spark streaming to load data from Kafka to HDFS

2015-05-05 Thread MrAsanjar .
Why not try https://github.com/linkedin/camus - Camus is a Kafka-to-HDFS pipeline. On Tue, May 5, 2015 at 11:13 PM, Rendy Bambang Junior rendy.b.jun...@gmail.com wrote: Hi all, I am planning to load data from Kafka to HDFS. Is it normal to use Spark Streaming to load data from Kafka to HDFS?