Hi all,
I am currently running a Spark Streaming program that consumes data from
Kafka and performs a group-by operation on it. I am trying to reduce its
running time because it looks slow to me. It seems that the stage
named:
* combineByKey at ShuffledDStream.scala:42 *
always takes the longest to run. If I open this stage, I see only
two executors on it. Does