On Mon, Apr 18, 2022 at 03:45:29PM -0600, David Fifield wrote:
> I was initially interested in this for the purpose of better estimating
> the number of Snowflake users. But now I've decided "frac" is not useful
> for that purpose: since there is only one bridge we care about, it does
> not make sense to adjust the numbers to account for other bridges that
> may not report the same set of statistics. I don't plan to take this
> investigation any further for the time being, but here is source code to
> reproduce the above tables. You will need:
> https://collector.torproject.org/archive/relay-descriptors/consensuses/consensuses-2022-04.tar.xz
> https://collector.torproject.org/archive/relay-descriptors/extra-infos/extra-infos-2022-04.tar.xz
>
> ./relay_uptime.py consensuses-2022-04.tar.xz > relay_uptime.csv
> ./relay_dir.py extra-infos-2022-04.tar.xz > relay_dir.csv
> ./frac.py relay_uptime.csv relay_dir.csv
Missed one of the source files.
import datetime

NUM_PROCESSES = 4

# "If the contained statistics end time is more than 1 week older than the
# descriptor publication time in the "published" line, skip this line..."
END_THRESHOLD = datetime.timedelta(days = 7)

# "Also skip statistics with an interval length other than 1 day."
# We set the threshold higher, because some descriptors have an interval a few
# seconds larger than 86400.
INTERVAL_THRESHOLD = datetime.timedelta(seconds = 90000)

# Truncate a datetime to 00:00:00 on the same day.
def datetime_floor(d):
    return d.replace(hour = 0, minute = 0, second = 0, microsecond = 0)

TIMEDELTA_1DAY = datetime.timedelta(seconds = 86400)

# Split the interval [begin, end) at day boundaries. For each day covered,
# yield the date, the fraction of the whole interval that falls on that day,
# and the fraction of a full 24-hour day it represents.
def segment_datetime_interval(begin, end):
    cur = begin
    while cur < end:
        next = min(datetime_floor(cur + TIMEDELTA_1DAY), end)
        delta = next - cur
        yield (cur.date(), delta / (end - begin), delta / TIMEDELTA_1DAY)
        cur = next
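To illustrate segment_datetime_interval (the dates below are made-up example
values, not taken from the scripts): a 24-hour statistics interval that starts
at 18:00 is split across two calendar days, with 6 hours attributed to the
first day and 18 hours to the second.

begin = datetime.datetime(2022, 4, 1, 18, 0, 0)
end = begin + datetime.timedelta(seconds = 86400)
for date, frac_of_interval, frac_of_day in segment_datetime_interval(begin, end):
    print(date, frac_of_interval, frac_of_day)
# 2022-04-01 0.25 0.25
# 2022-04-02 0.75 0.75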