On Mon, Apr 18, 2022 at 03:45:29PM -0600, David Fifield wrote:
> I was initially interested in this for the purpose of better estimating
> the number of Snowflake users. But now I've decided "frac" is not useful
> for that purpose: since there is only one bridge we care about, it does
> not make sense to adjust the numbers to account for other bridges that
> may not report the same set of statistics. I don't plan to take this
> investigation any further for the time being, but here is source code to
> reproduce the above tables. You will need:
> https://collector.torproject.org/archive/relay-descriptors/consensuses/consensuses-2022-04.tar.xz
> https://collector.torproject.org/archive/relay-descriptors/extra-infos/extra-infos-2022-04.tar.xz
> 
> ./relay_uptime.py consensuses-2022-04.tar.xz > relay_uptime.csv
> ./relay_dir.py extra-infos-2022-04.tar.xz > relay_dir.csv
> ./frac.py relay_uptime.csv relay_dir.csv

I missed one of the source files in my previous message; here it is:
import datetime

NUM_PROCESSES = 4

# "If the contained statistics end time is more than 1 week older than the
# descriptor publication time in the "published" line, skip this line..."
END_THRESHOLD = datetime.timedelta(days = 7)

# "Also skip statistics with an interval length other than 1 day."
# We set the threshold higher, because some descriptors have an interval a few
# seconds larger than 86400.
INTERVAL_THRESHOLD = datetime.timedelta(seconds = 90000)

def datetime_floor(d):
    """Truncate a datetime to midnight of the same day."""
    return d.replace(hour = 0, minute = 0, second = 0, microsecond = 0)

TIMEDELTA_1DAY = datetime.timedelta(seconds = 86400)
def segment_datetime_interval(begin, end):
    """Split the interval [begin, end) at day boundaries, yielding for each
    segment a tuple of (date, fraction of the whole interval, fraction of a
    full 24-hour day)."""
    cur = begin
    while cur < end:
        # This segment ends at the next midnight, or at end if that comes first.
        seg_end = min(datetime_floor(cur + TIMEDELTA_1DAY), end)
        delta = seg_end - cur
        yield (cur.date(), delta / (end - begin), delta / TIMEDELTA_1DAY)
        cur = seg_end
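In case it helps in reading the fragment, here is a minimal usage sketch (not
part of the script, with made-up values) of how the two thresholds and
segment_datetime_interval might fit together for a single statistics interval,
assuming the definitions above are in scope:

# Usage sketch: hypothetical values for one statistics interval.
# Assumes END_THRESHOLD, INTERVAL_THRESHOLD, datetime_floor, and
# segment_datetime_interval from the script above are in scope.
published = datetime.datetime(2022, 4, 16, 1, 2, 3)    # descriptor publication time ("published" line)
stats_end = datetime.datetime(2022, 4, 15, 12, 34, 56) # statistics end time
interval = datetime.timedelta(seconds = 86400)         # reported interval length
stats_begin = stats_end - interval

if published - stats_end > END_THRESHOLD:
    pass # statistics end more than a week before publication: skip
elif interval > INTERVAL_THRESHOLD:
    pass # interval is not (approximately) one day: skip
else:
    # Apportion the interval to the calendar days it covers.
    for date, frac_of_interval, frac_of_day in segment_datetime_interval(stats_begin, stats_end):
        print(date, frac_of_interval, frac_of_day)
    # 2022-04-14 0.4757... 0.4757...
    # 2022-04-15 0.5242... 0.5242...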