Having done something similar, one issue I ran into was that the scripts made it too easy to NOT get a clean shutdown. A clean shutdown goes through the master so that it can change its state, tell all the tservers to flush their WALs, and, I think, do some other cleanup, such as no longer reassigning tablets or creating splits.
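As a rough illustration of "going through the master", the sketch below asks Accumulo for a cluster-wide stop via `accumulo admin stopAll` before anything kills the individual processes. The function names, the timeout value, and the assumption that `accumulo` is on the PATH are hypothetical and are not taken from fluo-muchos or the proposed unit files.

```python
import subprocess
from typing import Callable, List


def clean_shutdown_command(accumulo_bin: str = "accumulo") -> List[str]:
    """Build the admin command that requests a cluster-wide clean stop."""
    # `accumulo admin stopAll` asks the master to coordinate the shutdown.
    return [accumulo_bin, "admin", "stopAll"]


def request_clean_shutdown(
    run: Callable = subprocess.run,
    accumulo_bin: str = "accumulo",  # assumption: binary is on the PATH
    timeout_s: int = 300,            # assumption: arbitrary upper bound
) -> bool:
    """Return True only if the shutdown request exited successfully.

    A False result means the caller should NOT treat the shutdown as
    clean (the command was missing, timed out, or returned non-zero).
    """
    cmd = clean_shutdown_command(accumulo_bin)
    try:
        result = run(cmd, timeout=timeout_s, check=False)
    except (subprocess.TimeoutExpired, FileNotFoundError):
        return False
    return result.returncode == 0
```

A wrapper script or service stop hook could call `request_clean_shutdown()` first and fall back to stopping the processes directly only when it returns False. Whether that is enough to get a reliably clean shutdown under systemd is a separate question.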
A clean shutdown is a best practice IMHO. I was unable to achieve this with systemd.

Cheers
Mike

On Thu, Dec 12, 2019 at 2:52 PM Aishwarya Thangappa <aishwarya.thanga...@microsoft.com.invalid> wrote:

> Hi everyone,
>
> While using fluo-muchos to deploy an Accumulo cluster, we recognized the
> need for various Accumulo and Hadoop services to be run under a service
> manager like systemd which will ensure that all these services are brought
> up correctly in the event of VM / OS reboots / cold starts. We have made
> the required changes for this and would like to contribute them back to the
> community if there is any interest around it.
>
> Summarizing what we have done:
>
> * Crafted separate systemd unit files for Accumulo (master, monitor,
>   gc, tracer, tserver), Hadoop (journalnode, namenode, datanode,
>   resourcemanager, nodemanager, zkfc) and Zookeeper services.
> * All of these unit files will be copied to the respective nodes'
>   /etc/systemd/system directory; the services will then be started and
>   enabled by the ansible systemd module.
> * In case of num_tservers > 1, multiple tserver systemd units will be
>   copied to the node and each will be independently managed.
> * Also made necessary changes to the existing cluster-wide scripts
>   including accumulo_cluster, accumulo_service, start_dfs, start_yarn etc.,
>   to have them work seamlessly with systemd.
>
> Is there an appetite to look at the details? If so, we can post a PR, or if
> there is any feedback or other considerations, please let us know and we
> can discuss them.
>
> Thanks,
> Aishwarya