Hello Christopher and everyone else in this thread,

First of all, I hope everyone is doing well and staying safe with
everything that is going on in the world. I am sorry for the long delay
in responding to this email, and thank you all for your valuable input.
I have now submitted a PR to fluo-muchos incorporating your suggestions.
Please take a look when you have time.
https://github.com/apache/fluo-muchos/pull/334

Thanks,
Aishwarya

On 2019/12/20 05:43:38, Christopher <ctubb...@apache.org> wrote:
> On Thu, Dec 19, 2019 at 9:57 PM Aishwarya Thangappa
> <aishwarya.thanga...@gmail.com> wrote:
> >
> > Thanks, Christopher. I see your point. The changes to the
> > accumulo-cluster scripts aside,
> >
> > 1. Is there value in landing the systemd changes in the muchos repo?
> > If it is deemed valuable, we can put up a PR with the systemd units as
> > template files and ansible tasks to copy these to the cluster nodes
> > and enable/start them. This will be easy for us to upstream as we
> > already have the work done.
>
> There is probably some value in that, assuming the use cases Keith
> mentioned aren't made more difficult. But, the details of the changes
> might matter.
>
> >
> > 2. Alternatively, would you find value if we reworked a set of shell
> > scripts that do the equivalent of the above changes and opened a PR
> > against the Accumulo repo?
>
> That would very much depend on the details, but I am wary of adding
> downstream integration tooling directly into Accumulo's main
> repository, even if it had significant added value, rather than have
> such tooling live alongside it separately in its own repo (possibly
> as another repo maintained by the Accumulo PMC, or by a community
> member). This is because the Accumulo PMC cannot possibly maintain
> everything of value that is marginally related to Accumulo under its
> own umbrella. I've seen projects try to do that, and it doesn't go
> well.
>
> > 2.1. In this case, would reference scripts to do the start/stop
> > operations using systemd, similar to the accumulo-cluster scripts, be
> > of value?
>
> Perhaps yes, but probably not maintained in Accumulo's main repo.
> However, I think it would make a good blog post on Accumulo's website,
> either way.
>
> > 2.2.
> > We found that it was necessary to make minor changes to the
> > accumulo-service script to support the multiple tserver case. Are
> > there any concerns about modifying it?
>
> There's a lot to say about accumulo-service, so I'll try to be brief.
> In short, I don't think accumulo-service (and accumulo-cluster) should
> be used for systemd integration. Work was done in bin/accumulo in
> 2.0 to more easily support downstream integration by dramatically
> simplifying its implementation. This allowed
> accumulo-cluster/accumulo-service to be easily created as one such set
> of "downstream" tools that built off of the simplicity of the new
> bin/accumulo, and which was provided within the main repo as
> convenient out-of-the-box cluster management / service management
> tools for when we build the binary tarball. However, they were not
> intended as integration points for downstream tools... bin/accumulo
> was.
>
> As for accumulo-service:
>
> 1. accumulo-service uses old SysV init patterns for managing services,
> none of which are needed under systemd
> 2. it does PIDfile stuff that is unnecessary to do at all with systemd
> (assuming Type=simple, which is what you should probably use, since
> you don't need to background it, not Type=forking; and even if you did
> use forking, systemd has its own way of managing PIDfiles)
> 3. it does custom, manual log file rotation stuff, which we probably
> should never have had in there at all, but definitely isn't needed
> with systemd/journald
> 4. supporting multiple tservers is so much simpler with unit files
> using systemd instances (parameter injection in unit file templates)
> 5. accumulo-service should really only be used by accumulo-cluster, or
> perhaps as part of a suite of legacy SysV init scripts
>
> accumulo-cluster and accumulo-service go together, and were written
> with a specific use case in mind.
> Systemd integration is an altogether
> different use case, and I think a much simpler set of tooling could be
> built using systemd and bin/accumulo than it could by trying to use
> accumulo-service in a way it wasn't intended to be used (but
> bin/accumulo was).
>
> >
> > And, not sure why you are getting a 404 on the gist files. I am able
> > to access them from a private browser window without issues.
>
> Sorry, I figured this out. The href got mangled in the HTML version of
> the email.
>
> >
> > On 2019/12/18 01:54:00, Christopher <ctubb...@apache.org> wrote:
> > > On Tue, Dec 17, 2019 at 8:07 PM Aishwarya Thangappa
> > > <aishwarya.thanga...@gmail.com> wrote:
> > > >
> > > > Sorry, I wasn't aware that attachments are not allowed on ASF
> > > > mailing lists. I have now created them as gists. Please have a
> > > > look.
> > > >
> > > > master systemd unit:
> > > > https://gist.github.com/ata18/e8f7577c99cd08ba46544aacef26969f
> > > > accumulo-service:
> > > > https://gist.github.com/ata18/48014ea78b09e4febb88480ea48ed62c
> > >
> > > These first two links don't work for me. I get a 404 error message.
> > >
> > > For reference, here are the basic unit files I wrote for Accumulo
> > > from Fedora 29: https://src.fedoraproject.org/rpms/accumulo/tree/f29
> > > They used a /usr/bin/accumulo script generated using the
> > > %jpackage_script macro (see accumulo.spec file for that) which worked
> > > a lot like Accumulo 2.0's bin/accumulo file works (not a coincidence,
> > > since the 2.0 script was written with insight gained from the attempt
> > > to package in Fedora).
> > >
> > > > accumulo-cluster:
> > > > https://gist.github.com/ata18/234c2e63d2718aec65bd2037ec3125cd
> > >
> > > This appears to be based on an older version of our accumulo-cluster
> > > script (from 2.0?) rather than the current one in the master branch,
> > > but I think I got the sense of what was changed by glancing at the
> > > diff.
> > > Once you have systemd, I'm not convinced it's beneficial to use
> > > something like accumulo-cluster anymore, as it doesn't really add
> > > any value beyond what you would get with using systemctl via
> > > pssh or pdsh and a hostsfile. The accumulo-cluster script's purpose is
> > > for when you don't have an existing service management tool for the
> > > cluster, and its intent is to be very basic, to support the "deploy
> > > out of tarball" use case, with no other vendor or downstream
> > > packaging. Modifying it to wrap systemd seems a bit unnecessarily
> > > complex to me, since I don't think you need it when using systemd.
> > >
> > > It might be better to create a simpler script that makes it easy to
> > > run specific tasks using pdsh or pssh and a hostsfile, for use with
> > > systemd, rather than trying to put those features into the
> > > accumulo-cluster script.
> > >
> > > >
> > > > Thanks,
> > > > Aishwarya
> > > >
> > > > On 2019/12/15 16:16:56, Michael Wall <mjw...@gmail.com> wrote:
> > > > > Hi Aishwarya,
> > > > >
> > > > > I didn't get any attachments on this.
> > > > >
> > > > > Thanks
> > > > >
> > > > > Mike
> > > > >
> > > > > On Fri, Dec 13, 2019 at 5:46 PM Aishwarya Thangappa
> > > > > <aishwarya.thanga...@microsoft.com.invalid> wrote:
> > > > >
> > > > > > Hello everyone,
> > > > > >
> > > > > > I had not subscribed to the dev mailing list earlier and missed
> > > > > > some of your questions. I will address them here.
> > > > > >
> > > > > > @Christopher
> > > > > > Most of the changes except the actual installation of the
> > > > > > systemd units could possibly go into Accumulo. These would be
> > > > > > the systemd units for the various accumulo services, and
> > > > > > modifications to the cluster-wide scripts in accumulo to use
> > > > > > systemd instead of directly starting/stopping the processes.
> > > > > > We would be happy to accommodate any suggestions or answer
> > > > > > any follow-up questions you may have.
> > > > > >
> > > > > > Attached are the accumulo_cluster and accumulo_service scripts
> > > > > > with systemd changes.
> > > > > >
> > > > > >
> > > > > > @Keith Turner
> > > > > > Once we determine where the different pieces should land, I can
> > > > > > post PRs accordingly. In our current setup, I have added a
> > > > > > `use_systemd` flag to the muchos.properties file which, when set
> > > > > > to true, will overwrite the accumulo cluster-wide scripts on the
> > > > > > nodes with the attached ones. These files currently reside at
> > > > > > ansible/roles/accumulo/files. If we determine that these scripts
> > > > > > and the systemd unit files will instead go to the Accumulo
> > > > > > project, I will have to make changes accordingly.
> > > > > >
> > > > > > @Michael Wall
> > > > > > Systemd units internally call the same scripts that
> > > > > > accumulo_cluster commands currently use. The change is that
> > > > > > accumulo_cluster commands would call systemd start/stop, which
> > > > > > in turn would call accumulo_service commands. Attached is a
> > > > > > sample systemd_unit template. Can you please elaborate if this
> > > > > > is still an issue?
> > > > > >
> > > > > > ------------------------------
> > > > > > *From:* Aishwarya Thangappa
> > > > > > *Sent:* Thursday, December 12, 2019 11:25 AM
> > > > > > *To:* dev@fluo.apache.org <dev@fluo.apache.org>
> > > > > > *Cc:* Arvind Shyamsundar <arvin...@microsoft.com>; Billie Rinaldi <
> > > > > > billie.rina...@microsoft.com>
> > > > > > *Subject:* Run Accumulo and Hadoop services under systemd
> > > > > >
> > > > > > Hi everyone,
> > > > > >
> > > > > > While using fluo-muchos to deploy an Accumulo cluster, we
> > > > > > recognized the need for various Accumulo and Hadoop services to
> > > > > > be run under a service manager like systemd, which will ensure
> > > > > > that all these services are brought up correctly in the event of
> > > > > > VM / OS reboots / cold starts. We have made the required changes
> > > > > > for this and would like to contribute them back to the community
> > > > > > if there is any interest in them.
> > > > > >
> > > > > > Summarizing what we have done:
> > > > > >
> > > > > > - Crafted separate systemd unit files for Accumulo (master,
> > > > > > monitor, gc, tracer, tserver), Hadoop (journalnode, namenode,
> > > > > > datanode, resourcemanager, nodemanager, zkfc) and Zookeeper
> > > > > > services.
> > > > > > - All of these unit files will be copied to the respective
> > > > > > nodes' /etc/systemd/system directory; the services will then be
> > > > > > started and enabled by the ansible systemd module.
> > > > > > - In case of num_tservers > 1, multiple tserver systemd units
> > > > > > will be copied to the node and each will be independently
> > > > > > managed.
> > > > > > - Also made the necessary changes to the existing cluster-wide
> > > > > > scripts, including accumulo_cluster, accumulo_service,
> > > > > > start_dfs, start_yarn etc., to have them work seamlessly with
> > > > > > systemd.
> > > > > >
> > > > > > Is there an appetite to look at the details?
> > > > > > If so, we can post a PR, or if there is any feedback or other
> > > > > > considerations, please let us know and we can discuss them.
> > > > > >
> > > > > > Thanks,
> > > > > > Aishwarya
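
For readers following the thread, here is a rough sketch of the "systemctl
via pdsh and a hostsfile" idea together with systemd instance names for
multiple tservers. It only builds a command line and does not run anything.
The unit name accumulo-tserver, the host names, and the helper function
names are hypothetical and are not taken from the fluo-muchos PR or from
Accumulo itself; the sketch assumes a template unit named
accumulo-tserver@.service is already installed on each node.

```python
from typing import List

# Actions this sketch allows; all are standard systemctl verbs.
ALLOWED_ACTIONS = {"start", "stop", "restart", "status"}


def read_hosts(text: str) -> List[str]:
    """Parse hostsfile contents: one host per line, '#' starts a comment."""
    hosts = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if line:
            hosts.append(line)
    return hosts


def instance_units(service: str, count: int) -> List[str]:
    """Name instances of a systemd template unit, e.g. name@1.service.

    'service' is a hypothetical template unit name (service@.service).
    """
    if count < 1:
        raise ValueError("count must be at least 1")
    return [f"{service}@{i}.service" for i in range(1, count + 1)]


def pdsh_command(hosts: List[str], action: str, units: List[str]) -> List[str]:
    """Build (not run) a pdsh invocation that calls systemctl on every host."""
    if action not in ALLOWED_ACTIONS:
        raise ValueError(f"unsupported action: {action}")
    if not hosts or not units:
        raise ValueError("need at least one host and one unit")
    # pdsh takes a comma-separated target list via -w, then the remote command.
    return ["pdsh", "-w", ",".join(hosts), "systemctl", action, *units]


if __name__ == "__main__":
    import shlex

    hosts = read_hosts("worker1\nworker2  # second worker\n")
    units = instance_units("accumulo-tserver", 2)  # hypothetical unit name
    print(shlex.join(pdsh_command(hosts, "start", units)))
```

Keeping the command as a list of arguments, rather than one shell string,
avoids quoting problems if it is later passed to subprocess.run.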