Spark UI Streaming batch time interval does not match batch interval

2018-03-12 Thread Jordan Pilat
Hello, I am running a streaming app on Spark 2.1.2. The batch interval is set to 5000 ms, and when I go to the "Streaming" tab in the Spark UI, it correctly reports a 5-second batch interval, but the list of batches below only shows one batch every two minutes (i.e., the batch time for each batch

Re: Spark or Storm

2015-06-17 Thread Jordan Pilat
> not being able to read from Kafka using multiple nodes

Kafka is plenty capable of doing this by clustering multiple consumer instances together into a consumer group. If your topic is sufficiently partitioned, the consumer group can consume the topic in parallel. If it isn't, you