- Do you happen to see how busy the nodes are in terms of CPU, and how
much heap each executor is allocated?
- If there is enough capacity, you may want to increase the number of cores
per executor to 2 and do the needed heap tweaking.
- How much time did it take to process the 4M+ messages?
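A minimal sketch of how the executor tuning suggested above could be expressed in a SparkConf; the app name and the memory value are illustrative assumptions, not values from this thread:

```scala
import org.apache.spark.SparkConf

// Illustrative executor tuning: 2 cores per executor, as suggested above.
// The 8g heap is an example only; size it to fit the 128 GB nodes after
// accounting for OS and other daemons.
val conf = new SparkConf()
  .setAppName("StreamingJob")          // hypothetical app name
  .set("spark.executor.cores", "2")    // cores per executor
  .set("spark.executor.memory", "8g")  // heap per executor (example value)
```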
I have 2 machines in my cluster, each with 128 GB RAM and 8 cores.
Regards,
~Vinti
On Sun, Mar 6, 2016 at 7:54 AM, Vinti Maheshwari wrote:
Thanks Supreeth and Shahbaz. I will try adding
spark.streaming.kafka.maxRatePerPartition.
Hi Shahbaz,
Please see comments, inline:
- Which version of Spark are you using? ==> *1.5.2*
- How big is the Kafka cluster? ==> *2 brokers*
- What is the message size and type? ==> *String, 9,550*
Try setting spark.streaming.kafka.maxRatePerPartition; this can help control
the number of messages read from Kafka per partition by the Spark Streaming
consumer.
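A minimal sketch of setting this property; the app name and the rate value of 1000 are illustrative assumptions, not recommendations from this thread:

```scala
import org.apache.spark.SparkConf

// Illustrative: cap how many messages per second are read from each
// Kafka partition by the streaming consumer.
val conf = new SparkConf()
  .setAppName("StreamingJob")  // hypothetical app name
  // 1000 msgs/sec/partition is an example value; with 2 brokers and the
  // batch interval in use, tune this so one batch stays within its interval.
  .set("spark.streaming.kafka.maxRatePerPartition", "1000")
```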
-S
> On Mar 5, 2016, at 10:02 PM, Vinti Maheshwari wrote:
Hello,
I am trying to figure out why my Kafka + Spark job is running slow. I found
that Spark is consuming all the messages from Kafka into a single batch
and not sending any messages to the other batches.
2016/03/05 21:57:05