#29684: setup a grafana server somewhere
-------------------------------------------------+-------------------------
 Reporter:  anarcat                              |          Owner:  anarcat
     Type:  defect                               |         Status:  assigned
 Priority:  Medium                               |      Milestone:
Component:  Internal Services/Tor Sysadmin Team  |        Version:
 Severity:  Normal                               |     Resolution:
 Keywords:                                       |  Actual Points:
Parent ID:  #29681                               |         Points:
 Reviewer:                                       |        Sponsor:
-------------------------------------------------+-------------------------

Comment (by anarcat):

 the first step here, to be clear, is a choice between the following
 options:

  1. Grafana installed with the upstream Debian package, no isolation
     (current situation)
  2. Grafana installed with the upstream Debian package, in its own VM
  3. Grafana installed with the upstream Docker image
  4. Something other than Grafana, but still using Prometheus
  5. Going back to Munin

 TL;DR: I'm for '''option 1''' for now and eventually '''option 3''' if
 upstream can't figure out Debian packaging.

 I need a decision on this to move forward with the munin-node cleanup and
 Grafana configuration, but I'll continue the deployment of Prometheus
 exporters everywhere in any case (unless people feel strongly about
 '''option 5''').

 Taking those in reverse order: I don't think anyone is seriously
 considering '''option 5''' here; I just added it to make things clear.

 I am somewhat opposed to '''option 4''': I don't know of any good
 replacement for Grafana that is better packaged in Debian and will allow
 us to graph metrics from Prometheus the way we need. We *can* build custom
 graphs and dashboards using the
 [https://prometheus.io/docs/visualization/consoles/ console templates] but
 my experience with Prometheus graphs so far has been painful at best. They
 are hard to make and hard to share, while there is already a library of
 Grafana dashboards we can draw from (even if it is a little small).

 Regarding '''option 3''', I don't care that much about Debian vs Docker. I
 originally wanted to try Docker images because I didn't feel comfortable
 installing arbitrary upstream code as root in our infrastructure. I also
 liked the idea of getting the little extra isolation Docker provides from
 that non-vetted upstream code, even if it means a few extra layers of
 abstraction and weird tradeoffs. But (understandably) ln5 wasn't
 comfortable using containers at all and I figured it might be simpler to
 just use a Debian package for now, since it's something we're all familiar
 with.

 ('''Option 2''') So that's why we're running the upstream Debian package
 now, without isolation - that is, in the same VM as the Prometheus server.
 As discussed with ln5 over IRC, the catastrophic scenario that we would
 avoid by setting up Grafana in a separate VM is that someone takes over
 the Grafana server and uses it to start attacking other nodes in the
 network running the Prometheus exporters. They would need to hack
 ''those'' '''and''' also escape ''their'' sandboxes to do any more
 significant damage to other nodes.

 Another attack vector is getting to the Prometheus data itself, but that
 is currently protected by an "invite" password so it's not really that
 much of a concern. That said, if an attacker could escalate privileges and
 gain access to the Prometheus accounts, they might be able to silence
 alarms and inject arbitrary data into the Prometheus database.

 Setting up a separate VM for Grafana would mean that the Grafana server
 wouldn't talk to Prometheus locally anymore, which could slow down graph
 generation. We *could* host the two VMs on the same physical box, but that
 would require rebuilding the Prometheus server as well.

 So I don't think the tradeoffs of running Grafana in a separate VM are
 worth it. I would continue with the current Debian-based setup ('''option
 1''') or, if we're worried about trusting those packages, switch to the
 Docker image ('''option 3''').

 In any case, I would prefer to continue the implementation until it is on
 par with what we get with Munin out of the box, which involves adding a
 few more exporters to get stats about databases and webservers. This is
 all Prometheus stuff and so far I haven't seen resistance to that
 technology, so from now on I'll work under the assumption that I can
 continue deploying those exporters, which are well packaged in Debian and
 easier to deploy anyway, with minimal dependencies.

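
 To put a rough number on the performance concern above (Grafana talking to
 Prometheus locally versus across VMs), something like the following could
 time the same query against both setups. This is only a sketch under
 stated assumptions: `/api/v1/query` is the standard Prometheus HTTP API
 endpoint for instant queries, but the helper names are made up for
 illustration, the base URLs in the usage note are placeholders, and it
 assumes the endpoint is reachable without authentication, which would need
 adapting for a password-protected server.

```python
import time
import urllib.parse
import urllib.request
from statistics import median


def query_url(base_url, promql):
    """Build an instant-query URL for the Prometheus HTTP API."""
    params = urllib.parse.urlencode({"query": promql})
    return base_url.rstrip("/") + "/api/v1/query?" + params


def fetch(url, timeout=10):
    """Fetch a URL and return the raw response body.

    Assumption: no authentication is required on the endpoint.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def median_latency(url, runs=5, fetcher=fetch, clock=time.perf_counter):
    """Return the median wall-clock time, in seconds, of `runs` requests.

    `fetcher` and `clock` are injectable so the function can be exercised
    without a live Prometheus server.
    """
    samples = []
    for _ in range(runs):
        start = clock()
        fetcher(url)
        samples.append(clock() - start)
    return median(samples)
```

 For example, `median_latency(query_url("http://localhost:9090", "up"))`
 run on the Prometheus host, compared with the same call from another VM
 using that VM's view of the server (a hypothetical URL), would give a
 first idea of the overhead. A single cheap query like `up` mostly measures
 network round-trip time; a heavier dashboard query would be more
 representative of graph generation.
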
 The open question for me is whether I should tear out the traces of Munin
 configuration on the hosts. There are still munin-node daemons running
 everywhere and failing cron jobs making noise. Removing that stuff would
 also show me what is there that is missing from our Prometheus setup,
 which would be useful in itself.

 The other question is whether we go with Grafana at all or find "something
 else" ('''option 4'''). I'd like to keep going with Grafana and finish its
 configuration, naturally, but I'm open to alternative suggestions of
 course.

 Alright, sorry for the long email, but I figured it was worth documenting
 all the options carefully.

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/29684#comment:5>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online
_______________________________________________
tor-bugs mailing list
tor-bugs@lists.torproject.org
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-bugs