Someone on the list is looking at monitoring Hadoop features with Nagios. Nagios can be configured with an event_handler, a command it runs when a service changes state. In the past I have written event handlers for exactly this kind of operation: if the service is down, use an SSH key to log in and restart it.
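A rough sketch of such an event handler, in Python, might look like the following. It is only an illustration, not a tested handler: the SSH user, key path, and restart command are assumptions, and the rule "restart on a HARD CRITICAL or on the third SOFT attempt" is just one common policy.

```python
#!/usr/bin/env python3
"""Sketch of a Nagios event handler that restarts a Hadoop daemon over SSH."""
import subprocess
import sys

# Hypothetical values: adjust user, key path, and restart command for your cluster.
SSH_USER = "hadoop"
SSH_KEY = "/home/hadoop/.ssh/id_rsa"
RESTART_CMD = "hadoop-daemon.sh start datanode"


def should_restart(state, state_type, attempt, soft_attempt=3):
    """Restart on a HARD CRITICAL, or on the Nth SOFT CRITICAL attempt."""
    if state != "CRITICAL":
        return False
    if state_type == "HARD":
        return True
    return state_type == "SOFT" and int(attempt) == soft_attempt


def build_ssh_command(host):
    """Build the ssh argument list; BatchMode prevents password prompts."""
    return ["ssh", "-i", SSH_KEY, "-o", "BatchMode=yes",
            f"{SSH_USER}@{host}", RESTART_CMD]


def main(argv):
    """Expect: STATE STATETYPE ATTEMPT HOSTADDRESS, as passed by Nagios."""
    if len(argv) != 5:
        print("usage: restart_hadoop.py STATE STATETYPE ATTEMPT HOSTADDRESS")
        return 3
    state, state_type, attempt, host = argv[1:5]
    if should_restart(state, state_type, attempt):
        return subprocess.call(build_ssh_command(host))
    return 0


# Only act when arguments are supplied, as they are when Nagios runs the handler.
if __name__ == "__main__" and len(sys.argv) > 1:
    sys.exit(main(sys.argv))
```

The Nagios command definition would pass the macros $SERVICESTATE$ $SERVICESTATETYPE$ $SERVICEATTEMPT$ $HOSTADDRESS$ as the four arguments. The script name restart_hadoop.py is made up for this example.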
However, since you already have an SSH key on your master node, you should be able to run a centralized node restarter from cron on the master. That may be an interesting argument for running a separate Nagios instance as your hadoop user! In any case, you can also run a cron job on each slave, as suggested above. The catch with all systems like this is that you have to remember to disable them when you actually want the service down, for maintenance and so on. We run Nagios and Cacti, so I would like to develop check scripts for these services. I am going to put an SVN repo together; if anyone is interested in contributing, let me know.
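On the point about remembering to switch restarters off: one way to make that harder to forget is to have the cron job check for a maintenance flag file before doing anything. Here is a minimal sketch in Python, assuming a per-slave cron job. The flag path, the process pattern, and the start command are all placeholders, not anything from an existing setup.

```python
#!/usr/bin/env python3
"""Sketch of a cron-driven restarter that respects a maintenance flag."""
import os
import subprocess
import sys

# Hypothetical names: pick your own flag path, process pattern, and start command.
MAINTENANCE_FLAG = "/var/run/hadoop/maintenance"
PROCESS_PATTERN = "org.apache.hadoop.hdfs.server.datanode.DataNode"
START_CMD = ["hadoop-daemon.sh", "start", "datanode"]


def is_running(pattern):
    """Return True if pgrep finds a process whose command line matches."""
    result = subprocess.run(["pgrep", "-f", pattern],
                            stdout=subprocess.DEVNULL)
    return result.returncode == 0


def decide(running, in_maintenance):
    """Return 'skip', 'ok', or 'restart' for the current state."""
    if in_maintenance:
        return "skip"  # operator wants the service down; leave it alone
    return "ok" if running else "restart"


def main():
    action = decide(is_running(PROCESS_PATTERN),
                    os.path.exists(MAINTENANCE_FLAG))
    if action == "restart":
        return subprocess.call(START_CMD)
    return 0


# Require an explicit --run flag so importing or testing never restarts anything.
if __name__ == "__main__" and "--run" in sys.argv:
    sys.exit(main())
```

To take a node down for maintenance you would touch the flag file first and remove it afterwards. From cron the script would be invoked with --run, for example every five minutes.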