Latest Release of Receiver-based Kafka Consumer for Spark Streaming

2016-08-25 Thread Dibyendu Bhattacharya
Hi, I have released the latest version of the Receiver-based Kafka Consumer for Spark Streaming. The receiver is compatible with Kafka versions 0.8.x, 0.9.x, and 0.10.x, and with all Spark versions. Available at Spark Packages: https://spark-packages.org/package/dibbhatt/kafka-spark-consumer and also on GitHub:…
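Below is a minimal sketch of how a receiver-based consumer from this package is typically wired into a streaming job. The ReceiverLauncher entry point and the property names are recalled from the package's README and may differ between versions, so treat them as assumptions.

    import java.util.Properties

    import org.apache.spark.SparkConf
    import org.apache.spark.storage.StorageLevel
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    import consumer.kafka.ReceiverLauncher

    object ReceiverBasedConsumerSketch {
      def main(args: Array[String]): Unit = {
        val conf = new SparkConf().setAppName("kafka-spark-consumer-sketch")
        val ssc = new StreamingContext(conf, Seconds(10))

        // ZooKeeper / topic coordinates for the receiver (placeholder values).
        val props = new Properties()
        props.put("zookeeper.hosts", "zk-host")
        props.put("zookeeper.port", "2181")
        props.put("kafka.topic", "my-topic")
        props.put("kafka.consumer.id", "my-consumer-group")

        // Launch N parallel receivers; offsets are managed by the package itself.
        val numberOfReceivers = 3
        val stream = ReceiverLauncher.launch(ssc, props, numberOfReceivers,
          StorageLevel.MEMORY_ONLY_SER)

        stream.foreachRDD(rdd => println(s"Records in batch: ${rdd.count()}"))
        ssc.start()
        ssc.awaitTermination()
      }
    }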

Contributing Receiver based Low Level Kafka Consumer from Spark-Packages to Apache Spark Project

2015-10-14 Thread Dibyendu Bhattacharya
Hi, I have raised a JIRA (https://issues.apache.org/jira/browse/SPARK-11045) to track the discussion, but I am also mailing the dev group for your opinions. Some discussion has already happened in the JIRA, and I would love to hear what others think. You can comment directly on the JIRA if you wish. This…

Re: Spark Streaming with Tachyon : Data Loss on Receiver Failure due to WAL error

2015-09-26 Thread Dibyendu Bhattacharya
…byendu, > > How does one go about configuring Spark Streaming to use Tachyon as its > place for storing checkpoints? Also, can one do this with Tachyon running > on a completely different node than where the Spark processes are running? > > Thanks, > Nikunj > > > On Thu…
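Because Tachyon exposes the Hadoop FileSystem interface (as the follow-up below notes), pointing the checkpoint directory at a tachyon:// URI is enough, and the Tachyon master can indeed live on a different node than the Spark processes. A minimal sketch, with host, port, and path as placeholders:

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    val conf = new SparkConf().setAppName("tachyon-checkpoint-sketch")
    val ssc = new StreamingContext(conf, Seconds(10))

    // Tachyon implements the Hadoop FileSystem API, so a tachyon:// URI works
    // anywhere an HDFS path would; 19998 was the usual master port of that era.
    ssc.checkpoint("tachyon://tachyon-master:19998/spark/checkpoints")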

Re: Spark Streaming with Tachyon : Data Loss on Receiver Failure due to WAL error

2015-05-21 Thread Dibyendu Bhattacharya
…FileSystem interface, is returning zero. On Mon, May 11, 2015 at 4:38 AM, Dibyendu Bhattacharya dibyendu.bhattach...@gmail.com wrote: Just to follow up on this thread further: I was doing some fault-tolerance testing of Spark Streaming with Tachyon as the OFF_HEAP block store. As I said in an earlier…
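For context, wiring Tachyon in as the OFF_HEAP block store in that Spark 1.x era was done through Spark configuration. A sketch under the assumption of the 1.3/1.4-vintage property names (later releases renamed these to spark.externalBlockStore.*), so verify against the docs for your version:

    import org.apache.spark.SparkConf

    // Point the OFF_HEAP store at a Tachyon cluster; URL and base directory
    // are placeholders.
    val conf = new SparkConf()
      .setAppName("tachyon-offheap-sketch")
      .set("spark.tachyonStore.url", "tachyon://tachyon-master:19998")
      .set("spark.tachyonStore.baseDir", "/spark_offheap")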

Re: Spark Streaming with Tachyon : Some findings

2015-05-08 Thread Dibyendu Bhattacharya
…-Dtachyon.worker.hierarchystore.level1.dirs.path=/mnt/tachyon -Dtachyon.worker.hierarchystore.level1.dirs.quota=50GB -Dtachyon.worker.allocate.strategy=MAX_FREE -Dtachyon.worker.evict.strategy=LRU Regards, Dibyendu On Thu, May 7, 2015 at 1:46 PM, Dibyendu Bhattacharya dibyendu.bhattach…
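The snippet truncates the first tier of the hierarchical-store configuration. A hypothetical reconstruction of the full set of worker options for a two-level (memory over disk) setup, using Tachyon 0.6-era property names that should be checked against the release actually deployed:

    # Hypothetical completion; the level0 values are assumptions, while level1
    # and the allocate/evict strategies come from the message above.
    -Dtachyon.worker.hierarchystore.level.max=2
    -Dtachyon.worker.hierarchystore.level0.alias=MEM
    -Dtachyon.worker.hierarchystore.level0.dirs.path=/mnt/ramdisk
    -Dtachyon.worker.hierarchystore.level0.dirs.quota=10GB
    -Dtachyon.worker.hierarchystore.level1.alias=HDD
    -Dtachyon.worker.hierarchystore.level1.dirs.path=/mnt/tachyon
    -Dtachyon.worker.hierarchystore.level1.dirs.quota=50GB
    -Dtachyon.worker.allocate.strategy=MAX_FREE
    -Dtachyon.worker.evict.strategy=LRU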

Spark Streaming with Tachyon : Some findings

2015-05-07 Thread Dibyendu Bhattacharya
Dear All, I have been playing with Spark Streaming on Tachyon as the OFF_HEAP block store. The primary reason for evaluating Tachyon is to find out whether it can solve the Spark BlockNotFoundException. With the traditional MEMORY_ONLY StorageLevel, when blocks are evicted, jobs fail due to block not…
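The switch being evaluated is a one-line change on the input stream. A minimal sketch (the socket source is just a stand-in for any receiver-based DStream):

    import org.apache.spark.SparkConf
    import org.apache.spark.storage.StorageLevel
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    val conf = new SparkConf().setAppName("offheap-persist-sketch")
    val ssc = new StreamingContext(conf, Seconds(10))
    val lines = ssc.socketTextStream("localhost", 9999)

    // With MEMORY_ONLY, an evicted block is simply gone and downstream jobs
    // fail with BlockNotFoundException; OFF_HEAP hands blocks to the external
    // store (Tachyon here) instead of dropping them.
    lines.persist(StorageLevel.OFF_HEAP)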

Re: Which committers care about Kafka?

2014-12-19 Thread Dibyendu Bhattacharya
Hi, thanks to Jerry for mentioning the Kafka Spout for Trident. Storm Trident achieves its exactly-once guarantee by processing tuples in batches and assigning the same transaction ID to a given batch. A replay of a given batch with a transaction ID will contain the exact same set of tuples, and…
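The core of that guarantee is that a replayed batch carries the same transaction ID as the original attempt, so state updates can be made idempotent. A toy sketch of the idea (not Trident's actual API):

    // State remembers the last committed transaction ID alongside the value.
    case class CountState(lastTxId: Long, count: Long)

    def applyBatch(state: CountState, txId: Long, batch: Seq[String]): CountState =
      if (txId == state.lastTxId) state // replay of an already-committed batch: skip
      else CountState(txId, state.count + batch.size) // new batch: apply and record the txid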

Re: Some Serious Issue with Spark Streaming? Blocks Getting Removed and Jobs Have Failed

2014-09-12 Thread Dibyendu Bhattacharya
…ContextCleaner? I met a very similar issue before… but haven't gotten it resolved. Best, -- Nan Zhu On Thursday, September 11, 2014 at 10:13 AM, Dibyendu Bhattacharya wrote: Dear All, not sure if this is a false alarm, but I wanted to raise it to understand what is happening. I am testing…
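One knob people experimented with in this era, when blocks seemed to vanish before the jobs that needed them ran, was the streaming unpersist behaviour. Whether it applies here depends on the root cause, so the following is a diagnostic sketch rather than a fix:

    import org.apache.spark.SparkConf

    // Keep generated streaming blocks around instead of eagerly unpersisting
    // them; useful for ruling the cleaner in or out as the culprit.
    val conf = new SparkConf()
      .set("spark.streaming.unpersist", "false")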

Re: Low Level Kafka Consumer for Spark

2014-08-24 Thread Dibyendu Bhattacharya
…to understand the details, but I want to do it really soon. In particular, I want to understand the improvements over the existing Kafka receiver. And it's fantastic to see such contributions from the community. :) TD On Tue, Aug 5, 2014 at 8:38 AM, Dibyendu Bhattacharya dibyendu.bhattach…

Re: Low Level Kafka Consumer for Spark

2014-08-05 Thread Dibyendu Bhattacharya
…It's great to see community effort on adding new streams/receivers; adding a Java API for receivers was something we did specifically to allow this. :) - Patrick On Sat, Aug 2, 2014 at 10:09 AM, Dibyendu Bhattacharya dibyendu.bhattach...@gmail.com wrote: Hi, I have implemented a Low Level…
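The extension point Patrick refers to is the public Receiver class, which any custom source can subclass. A minimal sketch with a stand-in data source:

    import org.apache.spark.storage.StorageLevel
    import org.apache.spark.streaming.receiver.Receiver

    class DummyReceiver extends Receiver[String](StorageLevel.MEMORY_AND_DISK_2) {

      override def onStart(): Unit = {
        // Receive on a separate thread so onStart() returns immediately,
        // as the Receiver contract requires.
        new Thread("dummy-receiver") {
          override def run(): Unit = {
            while (!isStopped()) {
              store("record at " + System.currentTimeMillis()) // hand data to Spark
              Thread.sleep(100)
            }
          }
        }.start()
      }

      override def onStop(): Unit = () // the receiving thread polls isStopped()
    }

    // Wired in with: ssc.receiverStream(new DummyReceiver)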

Low Level Kafka Consumer for Spark

2014-08-02 Thread Dibyendu Bhattacharya
Hi, I have implemented a Low Level Kafka Consumer for Spark Streaming using the Kafka SimpleConsumer API. This API gives better control over Kafka offset management and recovery from failures. As the present Spark KafkaUtils uses the high-level Kafka consumer API, I wanted to have a better…
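For readers unfamiliar with it, the Kafka 0.8-era SimpleConsumer API works at the level of individual brokers, partitions, and offsets, which is what makes the finer-grained offset control possible. A bare-bones fetch sketch (broker, topic, and offsets are placeholders, and real code must also handle leader discovery and error codes):

    import kafka.api.FetchRequestBuilder
    import kafka.javaapi.consumer.SimpleConsumer

    // Connect straight to a broker and fetch from an explicit offset.
    val consumer = new SimpleConsumer("broker-host", 9092,
      /* soTimeout = */ 100000, /* bufferSize = */ 64 * 1024, "low-level-sketch")
    val request = new FetchRequestBuilder()
      .clientId("low-level-sketch")
      .addFetch("my-topic", /* partition = */ 0, /* offset = */ 0L, /* fetchSize = */ 100000)
      .build()
    val response = consumer.fetch(request)
    val messages = response.messageSet("my-topic", 0) // iterate for MessageAndOffset entries
    consumer.close()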