#19544: Add graph on bridge users by country and transport
     Reporter:  karsten                  |      Owner:
         Type:  enhancement              |     Status:  new
     Priority:  Medium                   |  Milestone:
    Component:  Metrics/Metrics website  |    Version:
     Severity:  Normal                   |   Keywords:
Actual Points:                           |  Parent ID:
       Points:                           |   Reviewer:
      Sponsor:                           |
 The following idea came up in the
 discussion to provide "users-per-transport-per-country" statistics for
 obfsbridges.  This ticket is about graphing existing data, whereas the
 discussion of reporting new data will continue on #10218.  Quoting a bit
 from that ticket to have enough context here:

 > It turns out that most large bridges (4 out of 5 on February 1, 2016)
 only see noteworthy usage via a single transport or have requests via one
 transport dominating the others in numbers (74% on the 5th large bridge on
 February 1, 2016).
 > We could assume that the distribution by country is the same for all
 transports, that is, if a fraction `CC` (in `[0..1]`) of requests came
 from a given country and a fraction `PT` (also in `[0..1]`) of requests
 came in via a given transport, a fraction `CC * PT` of requests can be
 attributed to that country and transport.  But that assumption may be wrong.
 > What we could also do as first approximation is find a lower and upper
 bound of users by country and transport.  The lower bound would probably
 be defined as something like `max(0, PT + CC - 1)` (not just `0` to
 account for cases where `CC > 1 - PT`) and the upper bound as `min(PT,
 CC)`, even though I could be convinced that other formulas are even more

 dcf kindly graphed responses by country and transport on #10218
 [https://trac.torproject.org/projects/tor/ticket/10218#comment:22 here]
 and [https://trac.torproject.org/projects/tor/ticket/10218#comment:24
 here], indicating that this approach may produce actually useful results.
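
 The bounds quoted above can be sketched in a few lines of Python.  This is
 only an illustration of the formulas from the quote, not the code behind the
 graph; the function names `attribution_bounds` and `independence_estimate`
 are made up for this example, and `cc` and `pt` are assumed to be fractions
 of a single bridge's requests in `[0..1]`.

```python
from typing import Tuple


def attribution_bounds(cc: float, pt: float) -> Tuple[float, float]:
    """Return (lower, upper) bounds on the fraction of requests that came
    from a given country AND via a given transport.

    cc: fraction of requests from the country, in [0, 1] (assumed input)
    pt: fraction of requests via the transport, in [0, 1] (assumed input)
    """
    if not (0.0 <= cc <= 1.0 and 0.0 <= pt <= 1.0):
        raise ValueError("cc and pt must be fractions in [0, 1]")
    # The two groups must overlap whenever cc + pt exceeds 1.
    lower = max(0.0, pt + cc - 1.0)
    # The overlap cannot be larger than the smaller of the two groups.
    upper = min(pt, cc)
    return lower, upper


def independence_estimate(cc: float, pt: float) -> float:
    """Point estimate under the (possibly wrong) assumption that the
    distribution by country is the same for all transports."""
    return cc * pt
```

 For example, with `cc = 0.5` and `pt = 0.75` the bounds come out as `0.25`
 and `0.5`, and the estimate under the same-distribution assumption, `0.375`,
 lies between them.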

 The next step was to perform these calculations in the database and
 transform number of responses to estimated user numbers.  I finally found
 time to work on that step.  Here's a graph on Tor Metrics which is still
 "hidden" under "Advanced" until I'm more confident that it's doing the
 right thing.


 Example (image link to that graph, may look different over time):


 Next steps:

  - Become more confident in the particular math and code behind this
 graph.  Once that's done, move the graph to "Basic" so that people will
 find it.  I'm attaching a branch in a minute.

  - Make the user interface better.  For example, we could also graph top
 countries by transport, not just top transports by all countries or top
 transports in a given country.  Maybe we can graph other things using this
 data as well.

  - Make the raw data available.  There's a .csv file behind this graph,
 but I didn't put that on Tor Metrics yet, because we might have to change
 the data format and lack a versioning system to do that.  I'm putting up a
 combined-2016-07-01.csv snapshot of that file (36.4M) for review.

Ticket URL: <https://trac.torproject.org/projects/tor/ticket/19544>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online