Hi, I have a Spark project running on a single 4-core, 16 GB instance that acts as both master and worker. Can anyone tell me what I should be monitoring so that my cluster and jobs stay up?
I have put together a short list; please extend it if you know of more:

1. Monitor the Spark master and workers for failures
2. Monitor HDFS for filling up or going down
3. Monitor network connectivity between master and worker
4. Monitor Spark jobs for getting killed

For items 1 and 2, rough sketches of the kind of checks I have in mind follow below.
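For item 1, here is a minimal liveness sketch, assuming a standalone master whose web UI runs on the default port 8080 and also serves cluster state as JSON at /json; the hostname, exit codes, and script name are placeholders of mine, not anything standard:

```python
# check_spark_master.py -- minimal sketch: poll the standalone master's
# JSON endpoint and flag any worker that is not in the ALIVE state.
import json
import sys
import urllib.request

MASTER_URL = "http://spark-master:8080/json"  # placeholder hostname

def check_master():
    try:
        with urllib.request.urlopen(MASTER_URL, timeout=5) as resp:
            state = json.load(resp)
    except Exception as exc:
        # Master web UI unreachable: treat as a hard failure.
        print(f"CRITICAL: master unreachable: {exc}")
        sys.exit(2)

    # Each worker entry reports a "state" field; anything != ALIVE is suspect.
    dead = [w["id"] for w in state.get("workers", [])
            if w.get("state") != "ALIVE"]
    if dead:
        print(f"WARNING: workers not ALIVE: {dead}")
        sys.exit(1)
    print(f"OK: {len(state.get('workers', []))} worker(s) ALIVE")

if __name__ == "__main__":
    check_master()
```

Something like this could run from cron or be wrapped as a Nagios-style check; the non-zero exit codes are just an example convention for alerting.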
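For item 2, a rough capacity check, assuming the `hdfs` CLI is on the PATH and that the "DFS Used%" line of `hdfs dfsadmin -report` has its usual format; the 80% threshold is an arbitrary example value:

```python
# check_hdfs_usage.py -- rough sketch: parse overall DFS usage from
# `hdfs dfsadmin -report` and warn when it crosses a threshold.
import re
import subprocess
import sys

THRESHOLD_PCT = 80.0  # example alert threshold, tune to taste

def check_hdfs():
    # Run the report and capture its text output.
    out = subprocess.run(
        ["hdfs", "dfsadmin", "-report"],
        capture_output=True, text=True, check=True,
    ).stdout
    # Pull the cluster-wide "DFS Used%" figure out of the report.
    m = re.search(r"DFS Used%:\s*([\d.]+)%", out)
    if not m:
        print("UNKNOWN: could not parse dfsadmin report")
        sys.exit(3)
    used = float(m.group(1))
    if used >= THRESHOLD_PCT:
        print(f"WARNING: HDFS {used:.1f}% full")
        sys.exit(1)
    print(f"OK: HDFS {used:.1f}% used")

if __name__ == "__main__":
    check_hdfs()
```

Thanks,
Best Regards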