Hi,

I have a Spark project running on a single 4-core, 16 GB instance that
hosts both the master and a worker. Can anyone tell me what I should be
monitoring so that my cluster and jobs don't go down?

I have put together a short list of items; please extend it if you know
of more:

1. Monitor the Spark Master and Worker processes for failures (for
   example, with a probe like the sketch below)
2. Monitor HDFS for disk usage (getting full) and availability
3. Monitor network connectivity between the master and workers
4. Monitor Spark jobs for unexpected failures or kills
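
For item 1, I was thinking of a small probe along these lines. This is
just a rough sketch: it assumes the standalone master's web UI on the
default port 8080, which also serves the cluster state as JSON at
/json; the hostname "spark-master" and the crude string check are
placeholders, not a finished monitor:

import scala.io.Source
import scala.util.{Failure, Success, Try}

// Minimal liveness probe for a standalone Spark master (sketch only;
// assumes the master web UI on the default port 8080, and that the
// hostname below is replaced with the real master address).
object SparkClusterProbe {

  val masterJsonUrl = "http://spark-master:8080/json"

  def checkMaster(): Unit =
    Try(Source.fromURL(masterJsonUrl).mkString) match {
      case Failure(e) =>
        // An unreachable UI usually means the master process is down.
        println(s"ALERT: master unreachable: ${e.getMessage}")
      case Success(body) =>
        // Crude string check on the cluster-state JSON; a real monitor
        // would parse the JSON and inspect each worker's "state" field.
        if (body.contains("DEAD"))
          println("ALERT: at least one worker reports state DEAD")
        else
          println("OK: master up and no dead workers reported")
    }

  def main(args: Array[String]): Unit =
    while (true) {
      checkMaster()
      Thread.sleep(60000L) // poll once a minute
    }
}

Something similar could hit the HDFS NameNode web UI for item 2, and
running the probe from a separate machine would also exercise network
connectivity (item 3).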


-- 
Thanks
Best Regards
