Hi all,

Spark's 100 ms+ stage-launching overhead limits its applicability to
low-latency stream processing and deep learning. The Drizzle paper
published at SOSP '17 seems to address this problem well by submitting a
group of stages together to amortize the stage-launching overhead, and
the technique is also used by the deep learning framework BigDL.
Unfortunately, the open-source Drizzle repository
(https://github.com/amplab/drizzle-spark) is based on an old version of
Spark (2.1.1).
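
For context, here is a minimal sketch of the kind of low-latency
micro-batch job I have in mind (the rate source, trigger interval, and
grouping below are only illustrative); with a 100 ms trigger, the
per-batch stage-scheduling cost becomes a large fraction of each batch:

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.streaming.Trigger

object LowLatencyMicroBatch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("low-latency-microbatch")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Built-in "rate" test source emitting (timestamp, value) rows.
    val counts = spark.readStream
      .format("rate")
      .option("rowsPerSecond", 1000)
      .load()
      .groupBy(($"value" % 10).as("bucket"))
      .count()

    // With a 100 ms trigger, every micro-batch pays the full
    // stage-launching cost, which dominates the batch runtime.
    val query = counts.writeStream
      .outputMode("complete")
      .format("console")
      .trigger(Trigger.ProcessingTime("100 milliseconds"))
      .start()

    query.awaitTermination()
  }
}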

My question is: does Spark support a group-scheduling technique like
Drizzle? If not, are there plans to add such a feature in the future?

Best,
Bowen Yu
