Driver zombie process (standalone cluster)

Tomer Benyamini Wed, 29 Jun 2016 00:05:52 -0700

Hi,

I'm trying to run Spark applications on a standalone cluster running on
top of AWS. Since my slaves are spot instances, they are sometimes
terminated when the spot price exceeds our bid. When apps are running during
such an event, the Spark application sometimes dies, and the driver process
just hangs and stays up forever (zombie process), holding memory / CPU
resources on the master machine. We then have to kill it manually with
kill -9 to free these resources.


Has anyone seen this kind of problem before? Is there a suggested
workaround?

Thanks,
Tomer
