Thanks, Christopher. I see your point. Setting aside the changes to the accumulo-cluster scripts:

1. Is there value in landing the systemd changes in the muchos repo? If so, we can put up a PR with the systemd units as template files, plus Ansible tasks to copy them to the cluster nodes and enable/start them. This will be easy for us to upstream, as we already have the work done.
2. Alternatively, would you find value in a reworked set of shell scripts that do the equivalent of the above changes, with a PR opened against the Accumulo repo?
   2.1. In this case, would reference scripts that do the start/stop operations using systemd, similar to the accumulo-cluster scripts, be of value?
   2.2. We found that minor changes to the accumulo-service script were necessary to support the multiple-tserver case. Are there any concerns about modifying it?

Also, I am not sure why you are getting a 404 on the gist files. I am able to access them from a private browser window without issues.

On 2019/12/18 01:54:00, Christopher <ctubb...@apache.org> wrote:
> On Tue, Dec 17, 2019 at 8:07 PM Aishwarya Thangappa
> <aishwarya.thanga...@gmail.com> wrote:
> >
> > Sorry, I wasn't aware that attachments are not allowed in ASF Mailing
> > lists. I have now created them as gists. Please have a look.
> >
> > master systemd unit:
> > https://gist.github.com/ata18/e8f7577c99cd08ba46544aacef26969f
> > accumulo-service:
> > https://gist.github.com/ata18/48014ea78b09e4febb88480ea48ed62c
>
> These first two links don't work for me. I get a 404 error message.
>
> For reference, here's the basic unit files I wrote for Accumulo from
> Fedora 29: https://src.fedoraproject.org/rpms/accumulo/tree/f29
> They used a /usr/bin/accumulo script generated using the
> %jpackage_script macro (see accumulo.spec file for that) which worked
> a lot like Accumulo 2.0's bin/accumulo file works (not a coincidence,
> since the 2.0 script was written with insight gained from the attempt
> to package in Fedora).
>
> > accumulo-cluster:
> > https://gist.github.com/ata18/234c2e63d2718aec65bd2037ec3125cd
>
> This appears to be based on an older version of our accumulo-cluster
> script (from 2.0?) rather than the current one in the master branch,
> but I think I got the sense of what was changed by glancing at the
> diff. Once you have systemd, I'm not convinced it's beneficial to use
> something like accumulo-cluster anymore, as it doesn't really serve
> any added value beyond what you would get with using systemctl via
> pssh or pdsh and a hostsfile. The accumulo-cluster script's purpose is
> for when you don't have an existing service management tool for the
> cluster, and its intent is to be very basic, to support the "deploy
> out of tarball" use case, with no other vendor or downstream
> packaging. Modifying it to wrap systemd seems a bit unnecessarily
> complex to me, since I don't think you need it when using systemd.
>
> It might be better to create a simpler script that makes it easy to
> run specific tasks using pdsh or pssh and a hostsfile, to be used when
> using systemd, rather than trying to put those features into the
> accumulo-cluster script.
>
> >
> > Thanks,
> > Aishwarya
> >
> > On 2019/12/15 16:16:56, Michael Wall <mjw...@gmail.com> wrote:
> > > Hi Aishwarya,
> > >
> > > I didn't get any attachments on this.
> > >
> > > Thanks
> > >
> > > Mike
> > >
> > > On Fri, Dec 13, 2019 at 5:46 PM Aishwarya Thangappa
> > > <aishwarya.thanga...@microsoft.com.invalid> wrote:
> > >
> > > > Hello everyone,
> > > >
> > > > I had not subscribed to the dev mailing list earlier and missed some
> > > > of your questions. I will address them here.
> > > >
> > > > @Christopher
> > > > Most of the changes except the actual installation of the systemd units
> > > > could possibly go into Accumulo.
> > > > These would be the systemd units for various Accumulo services and
> > > > the modifications to the cluster-wide scripts in Accumulo to use
> > > > systemd instead of directly starting/stopping the processes. We
> > > > would be happy to answer any follow-up questions and accommodate
> > > > any suggestions you may have.
> > > >
> > > > Attached are the accumulo_cluster and accumulo_service scripts with
> > > > the systemd changes.
> > > >
> > > >
> > > > @Keith Turner
> > > > Once we determine where the different pieces should land, I can post PRs
> > > > accordingly. In our current setup, I have added a `use_systemd` flag to
> > > > the muchos.properties file which, when set to true, will overwrite the
> > > > Accumulo cluster-wide scripts on the nodes with the attached ones. These
> > > > files currently reside at ansible/roles/accumulo/files. If we determine
> > > > that these scripts and the systemd unit files will instead go to the
> > > > Accumulo project, I will have to make changes accordingly.
> > > >
> > > > @Michael Wall
> > > > The systemd units internally call the same scripts that the
> > > > accumulo_cluster commands currently use. The change is that the
> > > > accumulo_cluster commands would call systemd start/stop, which in turn
> > > > would call the accumulo_service commands.
> > > > Attached is a sample systemd_unit template. Can you please elaborate if
> > > > this is still an issue?
> > > >
> > > > ------------------------------
> > > > *From:* Aishwarya Thangappa
> > > > *Sent:* Thursday, December 12, 2019 11:25 AM
> > > > *To:* dev@fluo.apache.org <dev@fluo.apache.org>
> > > > *Cc:* Arvind Shyamsundar <arvin...@microsoft.com>; Billie Rinaldi <billie.rina...@microsoft.com>
> > > > *Subject:* Run Accumulo and Hadoop services under systemd
> > > >
> > > > Hi everyone,
> > > >
> > > > While using fluo-muchos to deploy an Accumulo cluster, we recognized the
> > > > need for various Accumulo and Hadoop services to be run under a service
> > > > manager like systemd, which will ensure that all these services are
> > > > brought up correctly in the event of VM / OS reboots / cold starts. We
> > > > have made the required changes for this and would like to contribute
> > > > them back to the community if there is any interest.
> > > >
> > > > Summarizing what we have done:
> > > >
> > > > - Crafted separate systemd unit files for Accumulo (master, monitor,
> > > > gc, tracer, tserver), Hadoop (journalnode, namenode, datanode,
> > > > resourcemanager, nodemanager, zkfc) and Zookeeper services.
> > > > - All of these unit files will be copied to the respective nodes'
> > > > /etc/systemd/system directory; the services will then be started and
> > > > enabled by the Ansible systemd module.
> > > > - In case of num_tservers > 1, multiple tserver systemd units will be
> > > > copied to the node and each will be independently managed.
> > > > - Also made the necessary changes to the existing cluster-wide scripts,
> > > > including accumulo_cluster, accumulo_service, start_dfs, start_yarn,
> > > > etc., to have them work seamlessly with systemd.
> > > >
> > > > Is there an appetite to look at the details? If so, we can post a PR. If
> > > > there is any feedback or there are other considerations, please let us
> > > > know and we can discuss them.
> > > >
> > > > Thanks,
> > > > Aishwarya
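
For readers following the thread without access to the gists: the "one unit file per tserver when num_tservers > 1" idea from the original message could look roughly like the sketch below. This is not the content of the unit files discussed above. The install path /opt/accumulo, the `accumulo` user, the unit file names, and the `INSTANCE_ID` environment variable are all assumptions made for illustration; only the systemd directives themselves and the `accumulo tserver` command are standard.

```python
from string import Template

# Hypothetical unit layout. INSTANCE_ID is an illustrative variable name,
# not something Accumulo or muchos is known to read.
UNIT_TEMPLATE = Template("""\
[Unit]
Description=Accumulo tserver instance $instance
After=network.target

[Service]
Type=simple
User=$user
Environment=INSTANCE_ID=$instance
ExecStart=$exec_start
Restart=on-failure

[Install]
WantedBy=multi-user.target
""")


def render_tserver_units(num_tservers, user="accumulo",
                         exec_start="/opt/accumulo/bin/accumulo tserver"):
    """Return {unit file name: unit file text}, one entry per tserver instance.

    The default user and exec_start values are assumptions, not taken
    from the thread.
    """
    if num_tservers < 1:
        raise ValueError("num_tservers must be at least 1")
    units = {}
    for instance in range(1, num_tservers + 1):
        # Assumed naming scheme: accumulo-tserver-<n>.service
        name = f"accumulo-tserver-{instance}.service"
        units[name] = UNIT_TEMPLATE.substitute(
            instance=instance, user=user, exec_start=exec_start)
    return units


if __name__ == "__main__":
    # Print each rendered unit with the path it would be copied to.
    for name, text in render_tserver_units(2).items():
        print(f"# /etc/systemd/system/{name}")
        print(text)
```

Each rendered file would be copied to /etc/systemd/system on the node and then enabled and started independently, for example with `systemctl enable --now accumulo-tserver-1.service`, which matches the "each will be independently managed" point in the summary. A systemd template unit (a single `accumulo-tserver@.service` file using `%i`) would be an alternative to generating separate files.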