I have a Spark Streaming job whose batches take 2 seconds. When I check
where the time is spent, I see about 0.8s-1s of processing time, although the
total time is 2s. The remaining second is spent in the driver.
I reviewed the code that is executed by the driver and commented out some of
it, with the same result, so I have no idea where the time is being spent.

Right now I'm running in client mode from one of the nodes inside the cluster,
so I can't set the number of cores for the driver (although I don't think
that would make a difference).

How can I find out where the driver is spending the time? I'm not sure whether
it's possible to improve the performance at this point, or whether that second
is mostly spent scheduling the graph of each micro-batch.
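
Another thing I could try is sampling the driver's threads from inside the
application to see what they are doing during that extra second. A rough
sketch (the object name and the 200 ms interval are only illustrative; it
simply prints short stack samples to the driver log):

import java.util.concurrent.{Executors, TimeUnit}
import scala.collection.JavaConverters._

object DriverStackSampler {
  // Periodically prints a short stack sample of every driver thread so that
  // whatever the driver is busy with between batches shows up in its log.
  def start(intervalMs: Long): Unit = {
    val exec = Executors.newSingleThreadScheduledExecutor()
    exec.scheduleAtFixedRate(new Runnable {
      override def run(): Unit = {
        val sb = new StringBuilder(s"--- driver stack sample @ ${System.currentTimeMillis()} ---\n")
        Thread.getAllStackTraces.asScala.foreach { case (t, frames) =>
          sb.append(s"  ${t.getName} (${t.getState})\n")
          frames.take(5).foreach(f => sb.append(s"    at $f\n"))
        }
        println(sb.toString)
      }
    }, 0L, intervalMs, TimeUnit.MILLISECONDS)
  }
}

// Started from main() before ssc.start(), e.g. DriverStackSampler.start(200)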
