setup and cleanup function in spark

2017-08-29 Thread Mohammad Kargar
To implement setup/cleanup function in Spark we follow the pattern below as discussed here . rdd.mapPartitions { partition => if (!partition.isEmpty) { // Some setup code here

OffsetOutOfRangeException

2017-03-14 Thread Mohammad Kargar
To work around an out of space issue in a Direct Kafka Streaming application we create topics with a low retention policy (retention.ms=30) which works fine from Kafka perspective. However this results into OffsetOutOfRangeException in Spark job (red line below). Is there any configuration in

Re: streaming-kafka-0-8-integration (direct approach) and monitoring

2017-02-14 Thread Mohammad Kargar
I will. Thanks anyway Mohammad On Feb 14, 2017 7:24 PM, "Cody Koeninger" <c...@koeninger.org> wrote: > Not sure what to tell you at that point - maybe compare what is > present in ZK to a known working group. > > On Tue, Feb 14, 2017 at 9:06 PM, Mohammad Kargar

Re: streaming-kafka-0-8-integration (direct approach) and monitoring

2017-02-14 Thread Mohammad Kargar
ct to ZK and do e.g. > > get /consumers/mygroup/offsets/test/0 > > If those don't exist, those are the ZK nodes you would need to make > sure get created / updated from your spark job. > > > > On Tue, Feb 14, 2017 at 8:40 PM, Mohammad Kargar <mkar...@phemi.com> > wrot

streaming-kafka-0-8-integration (direct approach) and monitoring

2017-02-14 Thread Mohammad Kargar
As explained here , direct approach of integration between spark streaming and kafka does not update offsets in Zookeeper, hence Zookeeper-based Kafka monitoring tools will not show progress (details). We followed the