#30006: Monitor "aliveness" of default bridges in Tor Browser
-------------------------------------------------+-------------------------
 Reporter:  phw                                  |          Owner:  phw
     Type:  defect                               |         Status:  assigned
 Priority:  Medium                               |      Milestone:
Component:  Applications/Quality Assurance and   |        Version:
  Testing                                        |
 Severity:  Normal                               |     Resolution:
 Keywords:  default bridge                       |  Actual Points:
Parent ID:                                       |         Points:
 Reviewer:                                       |        Sponsor:
-------------------------------------------------+-------------------------

Comment (by anarcat):

 Some more details on how this works. Prometheus is just a
 scraping/alerting system and relies on "exporters" to do the work. For
 example, we have "node exporters" installed on every TPA machine, which
 provide stats like disk, CPU, and memory usage, and "apache exporters",
 which provide internal stats on the webservers. Details of that deployment
 are in #29681.

 The exporter that seems to fit the bill of "probe a TCP port for liveness"
 is the [https://github.com/prometheus/blackbox_exporter blackbox exporter].
 It could be deployed on the Prometheus server and check each public Tor
 bridge for reachability.

 The blackbox exporter is not very well documented (not surprising
 considering its name), so I found more documentation on how it works
 [https://utcc.utoronto.ca/~cks/space/blog/sysadmin/PrometheusBlackboxNotes here]
 and
 [https://michael.stapelberg.ch/posts/2016-01-01-prometheus-blackbox-exporter/ here].

 The example you pasted was run on my home workstation, and was simply a
 matter of running:

 {{{
 apt install prometheus-blackbox-exporter
 }}}

 The exporter supports probing arbitrary hosts on the fly like this. The
 final targets would need to be added to the
 [https://github.com/prometheus/blackbox_exporter/blob/master/CONFIGURATION.md configuration file]
 (see also
 [https://github.com/prometheus/blackbox_exporter/blob/master/example.yml this example]).
 This could all be done somewhat automatically as well, with a cron job
 polling the list of bridges from some canonical location.

 The blackbox exporter is pretty powerful: in theory, we could make it do a
 simple send/expect dialog to verify that the other end is really a Tor
 server, if that would be useful.

 Once the exporter is set up, the Prometheus server would be configured to
 scrape those metrics, which would be collected every "scrape interval"
 (currently 15 seconds).
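
 To illustrate the "probe arbitrary hosts on the fly" part, here is a
 minimal Python sketch of how a script could ask a locally running blackbox
 exporter to probe one bridge and read back the result. It assumes the
 exporter listens on `localhost:9115` and has a `tcp_connect` module in its
 configuration; both are assumptions about the deployment, not something
 decided in this ticket. The function names and the `192.0.2.1:443` target
 (a documentation-only address) are hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def probe_url(target: str, module: str = "tcp_connect",
              exporter: str = "http://localhost:9115") -> str:
    """Build the /probe URL for one target (exporter address is assumed)."""
    return f"{exporter}/probe?" + urlencode({"target": target, "module": module})


def parse_metrics(text: str) -> dict:
    """Parse un-labelled samples from Prometheus text exposition output."""
    metrics = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, HELP/TYPE comments, and labelled series.
        if not line or line.startswith("#") or "{" in line:
            continue
        parts = line.split()
        if len(parts) < 2:
            continue
        try:
            metrics[parts[0]] = float(parts[1])
        except ValueError:
            continue
    return metrics


def bridge_is_up(target: str, timeout: float = 10.0) -> bool:
    """Ask the exporter to probe `target` and report probe_success."""
    with urlopen(probe_url(target), timeout=timeout) as resp:
        metrics = parse_metrics(resp.read().decode("utf-8"))
    return metrics.get("probe_success") == 1.0
```

 Calling `bridge_is_up("192.0.2.1:443")` would only work with an exporter
 actually running; in the real deployment Prometheus itself would do the
 scraping, so this is only meant to show what travels over the wire.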

 Note that we do not have alerting capabilities yet: this is still handled
 by Icinga (previously known as Nagios) (see #29864 and #29863 for that
 discussion). Instead, we could make a Grafana dashboard that displays those
 metrics. A few existing dashboards already process those metrics out of the
 box, but they would probably require at least some tweaking:

  * https://grafana.com/dashboards/5990
  * https://grafana.com/dashboards/5345
  * https://grafana.com/dashboards/7587
  * full list:
    https://grafana.com/dashboards?dataSource=prometheus&search=blackbox

 I'm not sure alerting is really a necessity. It might be sufficient to
 check that dashboard as part of the release process, for example.

 The open questions for me are:

  1. Is this the metrics team's responsibility? Or TPA's?
  2. What is the canonical reference for the list of public bridges?
     [https://gitweb.torproject.org/builders/tor-browser-build.git/plain/projects/tor-browser/Bundle-Data/PTConfigs/bridge_prefs.js this javascript file]?
     How stable is that file format? Do I need to parse it as JavaScript or
     can I get away with a regex?
  3. What is the threshold for failure? Say we probe the bridge every 15
     seconds: how many failed probes over what time period count as a
     failure? One example would be fewer than 50% of probes succeeding in
     the last day. We could also check latency.
  4. Are latency metrics sensitive? Currently, the Prometheus metrics are
     more or less publicly accessible, so if this is implemented, it would
     expose the latency of those hosts, which could be leveraged for
     correlation attacks (although arguably *anyone* could run a similar
     setup and do a similar attack). If we are worried about this, a
     separate Prometheus server could be deployed with stronger security
     (see also the discussion in #29863).

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/30006#comment:2>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online
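
 To make the failure threshold from question 3 concrete, here is a small
 Python sketch of the "fewer than 50% of probes succeeding in the last day"
 example. The function name, the sample format (timestamp, success), and
 the choice to treat "no data in the window" as a failure are all
 assumptions made for illustration; the actual threshold is still an open
 question.

```python
from typing import Iterable, Tuple


def bridge_failed(samples: Iterable[Tuple[float, bool]], now: float,
                  window: float = 86400.0, min_ratio: float = 0.5) -> bool:
    """Return True if the success ratio within the window is below min_ratio.

    samples: (unix_timestamp, probe_succeeded) pairs, in any order.
    window:  look-back period in seconds (default: one day).
    """
    recent = [ok for ts, ok in samples if now - window <= ts <= now]
    if not recent:
        # Assumption: a bridge with no probe results at all is treated as failed.
        return True
    return sum(recent) / len(recent) < min_ratio
```

 With a 15-second scrape interval, a full day holds 5760 samples, so a
 single dropped probe barely moves the ratio; a shorter window or a higher
 `min_ratio` would make the check more sensitive.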
_______________________________________________
tor-bugs mailing list
tor-bugs@lists.torproject.org
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-bugs