[tor-bugs] #29734 [Obfuscation/Snowflake]: Broker should receive country stats information from Proxy and Client

2019-03-11 Thread Tor Bug Tracker & Wiki
#29734: Broker should receive country stats information from Proxy and Client
---+---
 Reporter:  cohosh |  Owner:  (none)
 Type:  enhancement| Status:  new
 Priority:  Medium |  Milestone:
Component:  Obfuscation/Snowflake  |Version:
 Severity:  Normal |   Keywords:  snowflake, geoip,
   |  stats
Actual Points: |  Parent ID:  #29207
   Points:  1  |   Reviewer:
  Sponsor:  Sponsor19  |
---+---
 We can use existing geoip data to collect statistics about where clients
 are connecting from in order to detect possible blocking events. These
 should be gathered both from the initial domain-fronted client connection
 and from the proxies (to be passed to the broker) in order to detect the
 blocking of individual proxies or the blocking of the WebRTC connections.

--
Ticket URL: 
Tor Bug Tracker & Wiki 
The Tor Project: anonymity online
___
tor-bugs mailing list
tor-bugs@lists.torproject.org
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-bugs

Re: [tor-bugs] #29734 [Obfuscation/Snowflake]: Broker should receive country stats information from Proxy and Client

2019-03-11 Thread Tor Bug Tracker & Wiki
#29734: Broker should receive country stats information from Proxy and Client
-+---
 Reporter:  cohosh   |  Owner:  (none)
 Type:  enhancement  | Status:  new
 Priority:  Medium   |  Milestone:
Component:  Obfuscation/Snowflake|Version:
 Severity:  Normal   | Resolution:
 Keywords:  snowflake, geoip, stats  |  Actual Points:
Parent ID:  #29207   | Points:  1
 Reviewer:   |Sponsor:  Sponsor19
-+---

Comment (by ahf):

 We have a couple of options here for the implementation:

 - The broker strictly doesn't depend on anything Tor, but we could re-use
 the tor-geoipdb databases that is bundled in Debian/Ubuntu to get updates.
 These databases have a slightly different format than the official MaxMind
 GeoIP databases.

 - We can use MaxMind's own releases with the Go API found in
 https://github.com/oschwald/maxminddb-golang - this would require us to
 maintain the DB's ourselves.

 Once the broker is able to update per-country stats for the domain-fronted
 client connection it should also be able to relay information about which
 database it is using to the Snowflake proxies, such that they can keep
 stats about incoming proxy connections from clients and where these are
 coming from. This would (maybe?) allow us to notice if WebRTC filtering is
 happening in a country in that the Broker will see multiple connections
 from the given country, but the proxies reports no incoming clients from
 the given country.

 The proxies MUST NOT have to forward the client IP to the broker, which is
 why it is better for the proxies to fetch the GeoIP DB from the broker and
 cache it locally.

 The format used by Tor itself is very simple (IP-encoded as an integer
 followed by the country) that you keep in an ordered vector  where you do
 a binary search in whenever you need to look up a  country from a given
 IP. The simplicity of this data-structure might make it more interesting
 than MaxMind's binary format since we need to do the same implementation
 in both Go and JavaScript.

 The Tor implementation can be found in
 
https://github.com/torproject/tor/blob/2f683465d4b666c5d8f84fb3b234ad539d8511cd/src/lib/geoip/geoip.c

 The Tor GeoIP database format can be seen here:
 https://github.com/torproject/tor/tree/master/src/config (see geoip,
 geoip6 and the mmdb-convert.py conversion script)

--
Ticket URL: 
Tor Bug Tracker & Wiki 
The Tor Project: anonymity online
___
tor-bugs mailing list
tor-bugs@lists.torproject.org
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-bugs

Re: [tor-bugs] #29734 [Obfuscation/Snowflake]: Broker should receive country stats information from Proxy and Client

2019-03-11 Thread Tor Bug Tracker & Wiki
#29734: Broker should receive country stats information from Proxy and Client
-+---
 Reporter:  cohosh   |  Owner:  cohosh
 Type:  enhancement  | Status:  assigned
 Priority:  Medium   |  Milestone:
Component:  Obfuscation/Snowflake|Version:
 Severity:  Normal   | Resolution:
 Keywords:  snowflake, geoip, stats  |  Actual Points:
Parent ID:  #29207   | Points:  1
 Reviewer:   |Sponsor:  Sponsor19
-+---
Changes (by cohosh):

 * status:  new => assigned
 * owner:  (none) => cohosh


--
Ticket URL: 
Tor Bug Tracker & Wiki 
The Tor Project: anonymity online
___
tor-bugs mailing list
tor-bugs@lists.torproject.org
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-bugs

Re: [tor-bugs] #29734 [Obfuscation/Snowflake]: Broker should receive country stats information from Proxy and Client

2019-03-12 Thread Tor Bug Tracker & Wiki
#29734: Broker should receive country stats information from Proxy and Client
-+---
 Reporter:  cohosh   |  Owner:  cohosh
 Type:  enhancement  | Status:  assigned
 Priority:  Medium   |  Milestone:
Component:  Obfuscation/Snowflake|Version:
 Severity:  Normal   | Resolution:
 Keywords:  snowflake, geoip, stats  |  Actual Points:
Parent ID:  #29207   | Points:  1
 Reviewer:   |Sponsor:  Sponsor19
-+---

Comment (by cohosh):

 Replying to [comment:1 ahf]:

 > Once the broker is able to update per-country stats for the domain-
 fronted client connection it should also be able to relay information
 about which database it is using to the Snowflake proxies, such that they
 can keep stats about incoming proxy connections from clients and where
 these are coming from. This would (maybe?) allow us to notice if WebRTC
 filtering is happening in a country in that the Broker will see multiple
 connections from the given country, but the proxies reports no incoming
 clients from the given country.
 >
 Are we already collecting per-country usage stats for snowflake bridges
 (as we do for other types of bridges)? If so, this might give us what we
 need automatically for noticing WebRTC filtering. Especially at the moment
 where there is one broker and one bridge, if clients are able to connect
 to snowflake proxies, there shouldn't be any '''censorship''' related
 reason that they cannot connect to bridges.

 I think per-country usage stats at the broker side are still useful of
 course, it gives us extra information if clients are able to connect to
 the broker but not able to connect to the snowflake bridge eventually.

 On a different note, it might also be useful to us to collect per-country
 stats on where the proxies are being run from.

--
Ticket URL: 
Tor Bug Tracker & Wiki 
The Tor Project: anonymity online
___
tor-bugs mailing list
tor-bugs@lists.torproject.org
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-bugs

Re: [tor-bugs] #29734 [Obfuscation/Snowflake]: Broker should receive country stats information from Proxy and Client

2019-03-12 Thread Tor Bug Tracker & Wiki
#29734: Broker should receive country stats information from Proxy and Client
-+---
 Reporter:  cohosh   |  Owner:  cohosh
 Type:  enhancement  | Status:  assigned
 Priority:  Medium   |  Milestone:
Component:  Obfuscation/Snowflake|Version:
 Severity:  Normal   | Resolution:
 Keywords:  snowflake, geoip, stats  |  Actual Points:
Parent ID:  #29207   | Points:  1
 Reviewer:   |Sponsor:  Sponsor19
-+---

Comment (by dcf):

 Replying to [comment:3 cohosh]:
 > Are we already collecting per-country usage stats for snowflake bridges
 (as we do for other types of bridges)?

 Yes, this was #18628. How it works is, the snowflake proxy forwards
 ([https://gitweb.torproject.org/pluggable-
 
transports/snowflake.git/tree/proxy/proxypair.coffee?id=88f282c7334f9ee76dccbf9b84dc7bcf0b39cd5b#n98
 proxy], [https://gitweb.torproject.org/pluggable-
 transports/snowflake.git/tree/proxy-
 go/snowflake.go?id=88f282c7334f9ee76dccbf9b84dc7bcf0b39cd5b#n235 proxy-
 go]) the client's IP address to the bridge in a `client_ip=` URL query
 parameter. Then the server [https://gitweb.torproject.org/pluggable-
 
transports/snowflake.git/tree/server/server.go?id=88f282c7334f9ee76dccbf9b84dc7bcf0b39cd5b#n160
 parses it] and passes it to tor in the `pt.DialOr` call. It's similar to
 what we worked out for meek, which was #13171.

 I don't think that Snowflake has enough users to show up on any of the by-
 country graphs at Tor Metrics, but you can see the stats in the uploaded
 descriptor files. Example: https://collector.torproject.org/archive
 /bridge-descriptors/extra-infos/bridge-extra-infos-2019-02.tar.xz
 {{{
 $ tar -O -xf bridge-extra-infos-2019-02.tar.xz | grep -A 24 '^extra-info
 flakey 5481936581E23D2D178105D44DB6915AB06BFB7F$' | grep -E
 '^dirreq-v3-reqs '
 dirreq-v3-reqs ru=16,tr=16,ae=8,cn=8,gb=8,je=8,us=8
 dirreq-v3-reqs tr=24,cn=16,ae=8,je=8,nl=8,ru=8,us=8
 dirreq-v3-reqs tr=16,cn=8,gb=8,ru=8,us=8
 ...
 }}}

 > If so, this might give us what we need automatically for noticing WebRTC
 filtering. Especially at the moment where there is one broker and one
 bridge, if clients are able to connect to snowflake proxies, there
 shouldn't be any '''censorship''' related reason that they cannot connect
 to bridges.

 This logic makes sense to me.

--
Ticket URL: 
Tor Bug Tracker & Wiki 
The Tor Project: anonymity online
___
tor-bugs mailing list
tor-bugs@lists.torproject.org
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-bugs

Re: [tor-bugs] #29734 [Obfuscation/Snowflake]: Broker should receive country stats information from Proxy and Client

2019-03-13 Thread Tor Bug Tracker & Wiki
#29734: Broker should receive country stats information from Proxy and Client
-+---
 Reporter:  cohosh   |  Owner:  cohosh
 Type:  enhancement  | Status:  assigned
 Priority:  Medium   |  Milestone:
Component:  Obfuscation/Snowflake|Version:
 Severity:  Normal   | Resolution:
 Keywords:  snowflake, geoip, stats  |  Actual Points:
Parent ID:  #29207   | Points:  1
 Reviewer:   |Sponsor:  Sponsor19
-+---

Comment (by cohosh):

 Here's a first commit that does something similar to little-t-tor for
 mapping IP addresses to country codes. The functions parse and load a
 database file into memory and then binary search that on a provided
 address to efficiently find the country code.

 
https://github.com/cohosh/snowflake/commit/eedca1cbe49ff84468806fd630a9f104d9ca230a

 For now I've just included the geoip geoipv6 database files in the
 repository... is there any easier way to get these?

--
Ticket URL: 
Tor Bug Tracker & Wiki 
The Tor Project: anonymity online
___
tor-bugs mailing list
tor-bugs@lists.torproject.org
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-bugs

Re: [tor-bugs] #29734 [Obfuscation/Snowflake]: Broker should receive country stats information from Proxy and Client

2019-03-13 Thread Tor Bug Tracker & Wiki
#29734: Broker should receive country stats information from Proxy and Client
-+--
 Reporter:  cohosh   |  Owner:  cohosh
 Type:  enhancement  | Status:  needs_review
 Priority:  Medium   |  Milestone:
Component:  Obfuscation/Snowflake|Version:
 Severity:  Normal   | Resolution:
 Keywords:  snowflake, geoip, stats  |  Actual Points:
Parent ID:  #29207   | Points:  1
 Reviewer:   |Sponsor:  Sponsor19
-+--
Changes (by cohosh):

 * status:  assigned => needs_review


Comment:

 Here's a merge candidate for geoip in the broker:
 https://github.com/cohosh/snowflake/compare/geoip

 I added some very simple count-based usage statistics for clients (mostly
 just to show how it works). We can do something a lot nicer here. We can
 add the same statics for proxies as well.

--
Ticket URL: 
Tor Bug Tracker & Wiki 
The Tor Project: anonymity online
___
tor-bugs mailing list
tor-bugs@lists.torproject.org
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-bugs

Re: [tor-bugs] #29734 [Obfuscation/Snowflake]: Broker should receive country stats information from Proxy and Client

2019-03-13 Thread Tor Bug Tracker & Wiki
#29734: Broker should receive country stats information from Proxy and Client
-+--
 Reporter:  cohosh   |  Owner:  cohosh
 Type:  enhancement  | Status:  needs_review
 Priority:  Medium   |  Milestone:
Component:  Obfuscation/Snowflake|Version:
 Severity:  Normal   | Resolution:
 Keywords:  snowflake, geoip, stats  |  Actual Points:
Parent ID:  #29207   | Points:  1
 Reviewer:   |Sponsor:  Sponsor19
-+--

Comment (by dcf):

 Replying to [comment:5 cohosh]:
 > For now I've just included the geoip geoipv6 database files in the
 repository... is there any easier way to get these?

 One way, if we're comfortable relying on Debian dependencies, is to ask
 the operator to install [https://packages.debian.org/stretch/tor-geoipdb
 tor-geoipdb] or [https://packages.debian.org/stretch/geoip-database geoip-
 database] package.

 In the tests, I would also test an address that maps to `""` and perhaps
 special cases like 127.0.0.1, 0.0.0.0, 255.255.255.255.

--
Ticket URL: 
Tor Bug Tracker & Wiki 
The Tor Project: anonymity online
___
tor-bugs mailing list
tor-bugs@lists.torproject.org
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-bugs

Re: [tor-bugs] #29734 [Obfuscation/Snowflake]: Broker should receive country stats information from Proxy and Client

2019-03-14 Thread Tor Bug Tracker & Wiki
#29734: Broker should receive country stats information from Proxy and Client
-+--
 Reporter:  cohosh   |  Owner:  cohosh
 Type:  enhancement  | Status:  needs_review
 Priority:  Medium   |  Milestone:
Component:  Obfuscation/Snowflake|Version:
 Severity:  Normal   | Resolution:
 Keywords:  snowflake, geoip, stats  |  Actual Points:
Parent ID:  #29207   | Points:  1
 Reviewer:   |Sponsor:  Sponsor19
-+--

Comment (by ahf):

 I think this looks good, with a few comments/questions:

 - I don't think we should include the two geoip databases in the
 repository by default?
 - We should make the path to the two GeoIP databases configurable (either
 via a command line parameter and/or a small config file?)
 - I don't know if this is a common thing in Go code to do, but in many
 functional languages where you have type aliases people tend to do type
 aliases for `string` types to make them "more specific". In this case the
 country-string type could be called `Country` so the metrics table would
 be a mapping of a Country to a monotonically increasing counter.
 - What should we do with these values when they are here? Should we have
 an API end-point that can dump them? Should we save them to a log file
 with some heartbeat interval? Chelsea Komlo showed me a neat library for
 collecting internal metrics in Go applications, but it might be too early
 to introduce additional dependencies just for this. It was this library:
 https://github.com/armon/go-metrics

--
Ticket URL: 
Tor Bug Tracker & Wiki 
The Tor Project: anonymity online
___
tor-bugs mailing list
tor-bugs@lists.torproject.org
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-bugs

Re: [tor-bugs] #29734 [Obfuscation/Snowflake]: Broker should receive country stats information from Proxy and Client

2019-03-14 Thread Tor Bug Tracker & Wiki
#29734: Broker should receive country stats information from Proxy and Client
-+--
 Reporter:  cohosh   |  Owner:  cohosh
 Type:  enhancement  | Status:  needs_review
 Priority:  Medium   |  Milestone:
Component:  Obfuscation/Snowflake|Version:
 Severity:  Normal   | Resolution:
 Keywords:  snowflake, geoip, stats  |  Actual Points:
Parent ID:  #29207   | Points:  1
 Reviewer:   |Sponsor:  Sponsor19
-+--

Comment (by ahf):

 Oh, and more thing I forgot. Should we have a SIGHUP handler that reloads
 the tables?

--
Ticket URL: 
Tor Bug Tracker & Wiki 
The Tor Project: anonymity online
___
tor-bugs mailing list
tor-bugs@lists.torproject.org
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-bugs

Re: [tor-bugs] #29734 [Obfuscation/Snowflake]: Broker should receive country stats information from Proxy and Client

2019-03-15 Thread Tor Bug Tracker & Wiki
#29734: Broker should receive country stats information from Proxy and Client
-+
 Reporter:  cohosh   |  Owner:  cohosh
 Type:  enhancement  | Status:  needs_revision
 Priority:  Medium   |  Milestone:
Component:  Obfuscation/Snowflake|Version:
 Severity:  Normal   | Resolution:
 Keywords:  snowflake, geoip, stats  |  Actual Points:
Parent ID:  #29207   | Points:  1
 Reviewer:  ahf  |Sponsor:  Sponsor19
-+
Changes (by cohosh):

 * status:  needs_review => needs_revision
 * reviewer:   => ahf


--
Ticket URL: 
Tor Bug Tracker & Wiki 
The Tor Project: anonymity online
___
tor-bugs mailing list
tor-bugs@lists.torproject.org
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-bugs

Re: [tor-bugs] #29734 [Obfuscation/Snowflake]: Broker should receive country stats information from Proxy and Client

2019-03-15 Thread Tor Bug Tracker & Wiki
#29734: Broker should receive country stats information from Proxy and Client
-+
 Reporter:  cohosh   |  Owner:  cohosh
 Type:  enhancement  | Status:  needs_revision
 Priority:  Medium   |  Milestone:
Component:  Obfuscation/Snowflake|Version:
 Severity:  Normal   | Resolution:
 Keywords:  snowflake, geoip, stats  |  Actual Points:
Parent ID:  #29207   | Points:  1
 Reviewer:  ahf  |Sponsor:  Sponsor19
-+

Comment (by cohosh):

 > One way, if we're comfortable relying on Debian dependencies, is to ask
 the operator to install ​tor-geoipdb or ​geoip-database package.

 > We should make the path to the two GeoIP databases configurable (either
 via a command line parameter and/or a small config file?)

 I think this is the best of both worlds:
 
https://github.com/cohosh/snowflake/commit/fbb87b508641bbbcfd3163d1f2a43b9aff4e0085

 The broker now allows the operator to pass in a path to geop files (for
 IPv4 and IPv6) as command-line arguments. The default is the install
 location of the debian tor-geoip package. If an invalid filename is
 provided (or none are provided and the package is not installed), the
 table will fail to load but not cause any crashes. There's a test for that
 here:
 
https://github.com/cohosh/snowflake/commit/09dd27f9408b1ff3ff916e374bcd5f659ad5b26b

 > In the tests, I would also test an address that maps to "" and perhaps
 special cases like 127.0.0.1, 0.0.0.0, 255.255.255.255.

 Thanks! Got some bugs :) Here's the tests and fixes:
 
https://github.com/cohosh/snowflake/commit/be4d245375722d958dd85f1a53849cdc37b3382b

--
Ticket URL: 
Tor Bug Tracker & Wiki 
The Tor Project: anonymity online
___
tor-bugs mailing list
tor-bugs@lists.torproject.org
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-bugs

Re: [tor-bugs] #29734 [Obfuscation/Snowflake]: Broker should receive country stats information from Proxy and Client

2019-03-15 Thread Tor Bug Tracker & Wiki
#29734: Broker should receive country stats information from Proxy and Client
-+
 Reporter:  cohosh   |  Owner:  cohosh
 Type:  enhancement  | Status:  needs_revision
 Priority:  Medium   |  Milestone:
Component:  Obfuscation/Snowflake|Version:
 Severity:  Normal   | Resolution:
 Keywords:  snowflake, geoip, stats  |  Actual Points:
Parent ID:  #29207   | Points:  1
 Reviewer:  ahf  |Sponsor:  Sponsor19
-+

Comment (by cohosh):

 Here's a new candidate: https://github.com/cohosh/snowflake/compare/geoip

 In addition to the changes above, here are the other changes I made:

 Replying to [comment:8 ahf]:
 > - I don't know if this is a common thing in Go code to do, but in many
 functional languages where you have type aliases people tend to do type
 aliases for `string` types to make them "more specific". In this case the
 country-string type could be called `Country` so the metrics table would
 be a mapping of a Country to a monotonically increasing counter.
 I did that for CountryStats (which is the map from country codes to
 counts) is doing this for the country strings too noisy?
 > - What should we do with these values when they are here? Should we have
 an API end-point that can dump them? Should we save them to a log file
 with some heartbeat interval? Chelsea Komlo showed me a neat library for
 collecting internal metrics in Go applications, but it might be too early
 to introduce additional dependencies just for this. It was this library:
 https://github.com/armon/go-metrics
 I think it's still very early... my suggestion is to do something simple,
 close this ticket, and then think about what we want a bit more before
 adding new dependencies. Right now it just logs the country counts to a
 log file every hour

 >Oh, and more thing I forgot. Should we have a SIGHUP handler that reloads
 the tables?
 Added.

--
Ticket URL: 
Tor Bug Tracker & Wiki 
The Tor Project: anonymity online
___
tor-bugs mailing list
tor-bugs@lists.torproject.org
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-bugs

Re: [tor-bugs] #29734 [Obfuscation/Snowflake]: Broker should receive country stats information from Proxy and Client

2019-03-15 Thread Tor Bug Tracker & Wiki
#29734: Broker should receive country stats information from Proxy and Client
-+--
 Reporter:  cohosh   |  Owner:  cohosh
 Type:  enhancement  | Status:  needs_review
 Priority:  Medium   |  Milestone:
Component:  Obfuscation/Snowflake|Version:
 Severity:  Normal   | Resolution:
 Keywords:  snowflake, geoip, stats  |  Actual Points:
Parent ID:  #29207   | Points:  1
 Reviewer:  ahf  |Sponsor:  Sponsor19
-+--
Changes (by cohosh):

 * status:  needs_revision => needs_review


--
Ticket URL: 
Tor Bug Tracker & Wiki 
The Tor Project: anonymity online
___
tor-bugs mailing list
tor-bugs@lists.torproject.org
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-bugs

Re: [tor-bugs] #29734 [Obfuscation/Snowflake]: Broker should receive country stats information from Proxy and Client

2019-03-19 Thread Tor Bug Tracker & Wiki
#29734: Broker should receive country stats information from Proxy and Client
-+-
 Reporter:  cohosh   |  Owner:  cohosh
 Type:  enhancement  | Status:  merge_ready
 Priority:  Medium   |  Milestone:
Component:  Obfuscation/Snowflake|Version:
 Severity:  Normal   | Resolution:
 Keywords:  snowflake, geoip, stats  |  Actual Points:
Parent ID:  #29207   | Points:  1
 Reviewer:  ahf  |Sponsor:  Sponsor19
-+-
Changes (by ahf):

 * status:  needs_review => merge_ready


Comment:

 I think this looks good. Maybe dcf has some comments?

--
Ticket URL: 
Tor Bug Tracker & Wiki 
The Tor Project: anonymity online
___
tor-bugs mailing list
tor-bugs@lists.torproject.org
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-bugs

Re: [tor-bugs] #29734 [Obfuscation/Snowflake]: Broker should receive country stats information from Proxy and Client

2019-03-21 Thread Tor Bug Tracker & Wiki
#29734: Broker should receive country stats information from Proxy and Client
-+-
 Reporter:  cohosh   |  Owner:  cohosh
 Type:  enhancement  | Status:  merge_ready
 Priority:  Medium   |  Milestone:
Component:  Obfuscation/Snowflake|Version:
 Severity:  Normal   | Resolution:
 Keywords:  snowflake, geoip, stats  |  Actual Points:
Parent ID:  #29207   | Points:  1
 Reviewer:  ahf  |Sponsor:  Sponsor19
-+-

Comment (by cohosh):

 Oops, just fixed the test to look for the default geoip files:
 
https://github.com/cohosh/snowflake/commit/fbcb980dd5e5b42dd80c2e0a8b3501cd17fa34a2

--
Ticket URL: 
Tor Bug Tracker & Wiki 
The Tor Project: anonymity online
___
tor-bugs mailing list
tor-bugs@lists.torproject.org
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-bugs

Re: [tor-bugs] #29734 [Obfuscation/Snowflake]: Broker should receive country stats information from Proxy and Client

2019-03-21 Thread Tor Bug Tracker & Wiki
#29734: Broker should receive country stats information from Proxy and Client
-+-
 Reporter:  cohosh   |  Owner:  cohosh
 Type:  enhancement  | Status:  merge_ready
 Priority:  Medium   |  Milestone:
Component:  Obfuscation/Snowflake|Version:
 Severity:  Normal   | Resolution:
 Keywords:  snowflake, geoip, stats  |  Actual Points:  2
Parent ID:  #29207   | Points:  1
 Reviewer:  ahf  |Sponsor:  Sponsor19
-+-
Changes (by cohosh):

 * actualpoints:   => 2


Comment:

 Updating actual points.

--
Ticket URL: 
Tor Bug Tracker & Wiki 
The Tor Project: anonymity online
___
tor-bugs mailing list
tor-bugs@lists.torproject.org
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-bugs

Re: [tor-bugs] #29734 [Obfuscation/Snowflake]: Broker should receive country stats information from Proxy and Client

2019-03-25 Thread Tor Bug Tracker & Wiki
#29734: Broker should receive country stats information from Proxy and Client
-+-
 Reporter:  cohosh   |  Owner:  cohosh
 Type:  enhancement  | Status:  merge_ready
 Priority:  Medium   |  Milestone:
Component:  Obfuscation/Snowflake|Version:
 Severity:  Normal   | Resolution:
 Keywords:  snowflake, geoip, stats  |  Actual Points:  2
Parent ID:  #29207   | Points:  1
 Reviewer:  ahf  |Sponsor:  Sponsor19
-+-

Comment (by cohosh):

 Updated PR (fixed paths in the tests):
 https://github.com/cohosh/snowflake/compare/geoip

--
Ticket URL: 
Tor Bug Tracker & Wiki 
The Tor Project: anonymity online
___
tor-bugs mailing list
tor-bugs@lists.torproject.org
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-bugs

Re: [tor-bugs] #29734 [Obfuscation/Snowflake]: Broker should receive country stats information from Proxy and Client

2019-03-28 Thread Tor Bug Tracker & Wiki
#29734: Broker should receive country stats information from Proxy and Client
-+-
 Reporter:  cohosh   |  Owner:  cohosh
 Type:  enhancement  | Status:  merge_ready
 Priority:  Medium   |  Milestone:
Component:  Obfuscation/Snowflake|Version:
 Severity:  Normal   | Resolution:
 Keywords:  snowflake, geoip, stats  |  Actual Points:  2
Parent ID:  #29207   | Points:  1
 Reviewer:  ahf  |Sponsor:  Sponsor19
-+-

Comment (by dcf):

 I have many comments, but overall my impression is good and I think you
 can move ahead with this.

 My big-picture question is: what do we plan to do with this data? If it's
 to detect blocking events by comparing the broker's statistics against the
 bridge's, I think we should at least sketch out those analysis scripts, in
 order to see whether the client geoip data we will be collecting is suited
 to the requirements. My main point is that we shouldn't collect data just
 because it may be useful; instead we should design safe data collection
 around some question we want to answer. As it stands, the branch will
 collect more precise client data than Tor Metrics does (Tor Metrics
 doesn't publish raw numbers but applies some fuzzing and binning). Having
 /debug display precise counts is a danger in the following scenario: an
 observer wants to determine whether a particular client is accessing the
 Snowflake broker. Whenever the observer sees a suspected connection, it
 checks the /debug output to see whether the count has incremented.

 Perhaps we could do a test deployment for a few days, to get an idea of
 what the data looks like. In fact, I think it's a good idea to try that,
 before merging. If there's a research question that we think this data
 could help us answer, we could ask the
 [https://research.torproject.org/safetyboard.html Safety Board] to
 evaluate it.

 

 * This branch only tracks client accesses, not proxy accesses. What will
 happen with proxy accesses, are they planned to go in the same metrics.log
 file?
 *
 
[https://github.com/cohosh/snowflake/blob/4de81ed039ce28264a98a040a0fde2237844e779/broker/broker.go#L271
 broker.go]: The SIGHUP handler introduces a race condition because other
 goroutines may be using the tables while they are being replaced. Even
 though the table is replaced
 
[https://github.com/cohosh/snowflake/blob/4de81ed039ce28264a98a040a0fde2237844e779/broker/metrics.go#L64
 all at once with an assignment], I don't think that even that assignment
 is guaranteed to be atomic. The table really needs a lock—not only while
 being reloaded from a file but on every access (because SIGHUP means the
 data structure can change at any time). Alternately, consider removing the
 SIGHUP feature and only loading the database once at startup.
 *
 
[https://github.com/cohosh/snowflake/blob/4de81ed039ce28264a98a040a0fde2237844e779/broker/broker.go#L243
 broker.go]
   {{{
 flag.StringVar(&geoipDatabase, "geoipdb", "/usr/share/tor/geoip", "path to
 correctly formatted geoip database mapping IPv4 address ranges to country
 codes")
 flag.StringVar(&geoip6Database, "geoip6db", "/usr/share/tor/geoip6", "path
 to correctly formatted geoip database mapping IPv6 address ranges to
 country codes")
   }}}
   Is there any way to disable the geoip feature, if I don't have the
 necessary files, other than by providing an invalid path? Maybe we don't
 care to provide a way to disable the feature and it's the operator's
 responsibility to provide some kind of geoip files; but in that case, it
 may be better to exit with an error if the files cannot be read, rather
 than logging the error and continuing on. It is perhaps surprising that
 you can explicitly ask for a certain file on the command line with
 `-geoipdb path`, and the broker will run even if the path cannot be read.
 *
 
[https://github.com/cohosh/snowflake/blob/4de81ed039ce28264a98a040a0fde2237844e779/broker/metrics.go#L47
 UpdateCountryStats]: It looks to me like if a geoip lookup fails, it is
 silently ignored—this could be misleading. It would be good to distinguish
 three cases: geoip file present and lookup found a country code; geoip
 file present but lookup found nothing; geoip file absent so no lookup is
 possible.
 * Adding to that, tor records the hashes of its geoip files:
 [https://gitweb.torproject.org/torspec.git/tree/dir-
 spec.txt?id=6e58fa857a95e18c5fda0ecf17955da7919b260b#n855 geoip-db-digest]
 and [https://gitweb.torproject.org/torspec.git/tree/dir-
 spec.txt?id=6e58fa857a95e18c5fda0ecf17955da7919b260b#n861 geoip6-db-
 digest]. It would be good to record that information so we can
 retroactively determine if geoip files were out of date, or diagnose geoip
 failures.
 * I think that some of the logging is happening

Re: [tor-bugs] #29734 [Obfuscation/Snowflake]: Broker should receive country stats information from Proxy and Client

2019-04-01 Thread Tor Bug Tracker & Wiki
#29734: Broker should receive country stats information from Proxy and Client
-+-
 Reporter:  cohosh   |  Owner:  cohosh
 Type:  enhancement  | Status:  merge_ready
 Priority:  Medium   |  Milestone:
Component:  Obfuscation/Snowflake|Version:
 Severity:  Normal   | Resolution:
 Keywords:  snowflake, geoip, stats  |  Actual Points:  2
Parent ID:  #29207   | Points:  1
 Reviewer:  ahf  |Sponsor:  Sponsor19
-+-

Comment (by cohosh):

 Replying to [comment:18 dcf]:
 > I have many comments, but overall my impression is good and I think you
 can move ahead with this.
 >
 > My big-picture question is: what do we plan to do with this data? If
 it's to detect blocking events by comparing the broker's statistics
 against the bridge's, I think we should at least sketch out those analysis
 scripts, in order to see whether the client geoip data we will be
 collecting is suited to the requirements. My main point is that we
 shouldn't collect data just because it may be useful; instead we should
 design safe data collection around some question we want to answer. As it
 stands, the branch will collect more precise client data than Tor Metrics
 does (Tor Metrics doesn't publish raw numbers but applies some fuzzing and
 binning). Having /debug display precise counts is a danger in the
 following scenario: an observer wants to determine whether a particular
 client is accessing the Snowflake broker. Whenever the observer sees a
 suspected connection, it checks the /debug output to see whether the count
 has incremented.
 >
 > Perhaps we could do a test deployment for a few days, to get an idea of
 what the data looks like. In fact, I think it's a good idea to try that,
 before merging. If there's a research question that we think this data
 could help us answer, we could ask the
 [https://research.torproject.org/safetyboard.html Safety Board] to
 evaluate it.
 >
 Thanks for this, I agree we should think about it some more.

 Whatever we decide, we should eventually not be displaying this data in
 /debug in the end, but rather logging it and using that log file to
 display metrics somewhere else. I also think that we should not be
 revealing '''more''' information about clients than the bridge is.

 I'm also willing to believe that collecting client country stats at the
 broker, even though it would tell us more information about censorship
 events, may not be *that* useful to us at the moment and is undesirable
 due to privacy concerns. We could always take a deeper dive into our
 investigations if we notice a drop in clients from a specific region at
 the bridge to figure out exactly what is going on.

 On the other hand, perhaps we want to collect country stats of the
 snowflake proxies? This is discussed to some extent in #21315. Do we have
 privacy concerns about proxies that are similar to those concerning
 clients?

--
Ticket URL: 
Tor Bug Tracker & Wiki 
The Tor Project: anonymity online
___
tor-bugs mailing list
tor-bugs@lists.torproject.org
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-bugs

Re: [tor-bugs] #29734 [Obfuscation/Snowflake]: Broker should receive country stats information from Proxy and Client

2019-04-02 Thread Tor Bug Tracker & Wiki
#29734: Broker should receive country stats information from Proxy and Client
-+-
 Reporter:  cohosh   |  Owner:  cohosh
 Type:  enhancement  | Status:  merge_ready
 Priority:  Medium   |  Milestone:
Component:  Obfuscation/Snowflake|Version:
 Severity:  Normal   | Resolution:
 Keywords:  snowflake, geoip, stats  |  Actual Points:  2
Parent ID:  #29207   | Points:  1
 Reviewer:  ahf  |Sponsor:  Sponsor19
-+-

Comment (by dcf):

 Replying to [comment:19 cohosh]:
 > On the other hand, perhaps we want to collect country stats of the
 snowflake proxies? This is discussed to some extent in #21315. Do we have
 privacy concerns about proxies that are similar to those concerning
 clients?

 I think that doing it for proxies is less concerning that doing it for
 clients. I would be fine with merging code to collect stats on proxies
 right away, as I think the risk is low. We can ask the safety board if
 they can think of any dangers we missed, but I don't think we have to wait
 for that before starting.

--
Ticket URL: 
Tor Bug Tracker & Wiki 
The Tor Project: anonymity online
___
tor-bugs mailing list
tor-bugs@lists.torproject.org
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-bugs

Re: [tor-bugs] #29734 [Obfuscation/Snowflake]: Broker should receive country stats information from Proxy and Client

2019-04-05 Thread Tor Bug Tracker & Wiki
#29734: Broker should receive country stats information from Proxy and Client
-+
 Reporter:  cohosh   |  Owner:  cohosh
 Type:  enhancement  | Status:  needs_revision
 Priority:  Medium   |  Milestone:
Component:  Obfuscation/Snowflake|Version:
 Severity:  Normal   | Resolution:
 Keywords:  snowflake, geoip, stats  |  Actual Points:  2
Parent ID:  #29207   | Points:  1
 Reviewer:  ahf  |Sponsor:  Sponsor19
-+
Changes (by cohosh):

 * status:  merge_ready => needs_revision


Comment:

 Putting this in needs_revision before we've actually modified it to
 collect proxy not client data.

--
Ticket URL: 
Tor Bug Tracker & Wiki 
The Tor Project: anonymity online
___
tor-bugs mailing list
tor-bugs@lists.torproject.org
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-bugs

Re: [tor-bugs] #29734 [Obfuscation/Snowflake]: Broker should receive country stats information from Proxy and Client

2019-04-08 Thread Tor Bug Tracker & Wiki
#29734: Broker should receive country stats information from Proxy and Client
-+
 Reporter:  cohosh   |  Owner:  cohosh
 Type:  enhancement  | Status:  needs_revision
 Priority:  Medium   |  Milestone:
Component:  Obfuscation/Snowflake|Version:
 Severity:  Normal   | Resolution:
 Keywords:  snowflake, geoip, stats  |  Actual Points:  2
Parent ID:  #29207   | Points:  1
 Reviewer:  ahf  |Sponsor:  Sponsor19
-+

Comment (by cohosh):

 Replying to [comment:18 dcf]:
 Thanks, I went with most of these suggestions. Here's a new PR:
 [https://github.com/cohosh/snowflake/compare/geoip geoip] There are a few
 items that require additional notes.
 > * This branch only tracks client accesses, not proxy accesses. What will
 happen with proxy accesses, are they planned to go in the same metrics.log
 file?
 We're now just tracking proxy accesses.

 > * Is there any way to disable the geoip feature, if I don't have the
 necessary files, other than by providing an invalid path? Maybe we don't
 care to provide a way to disable the feature and it's the operator's
 responsibility to provide some kind of geoip files; but in that case, it
 may be better to exit with an error if the files cannot be read, rather
 than logging the error and continuing on. It is perhaps surprising that
 you can explicitly ask for a certain file on the command line with
 `-geoipdb path`, and the broker will run even if the path cannot be read.
 > *  It looks to me like if a geoip lookup fails, it is silently
 ignored—this could be misleading. It would be good to distinguish three
 cases: geoip file present and lookup found a country code; geoip file
 present but lookup found nothing; geoip file absent so no lookup is
 possible.
 Added a -disablegeoip option that will not load geoip tables and will not
 return an error if UpdateCountryStats fails. Otherwise UpdateCountryStats
 will return an error if it fails and loading the table will panic if the
 file does not exist or if it is incorrectly formatted.

 > * The `parseEntry` functions need error checking on `geoipStringToIP`
 and `net.ParseIP`, or else they will store a `nil` that will result in a
 panic later on.
 
[https://github.com/cohosh/snowflake/blob/4de81ed039ce28264a98a040a0fde2237844e779/broker/geoip.go#L93
 This error] will be more useful if it includes a line number. Let me
 suggest a different factoring of the geoip parsing code. Have `parseEntry`
 be a plain function, not a method on `GeoIPv4Table`, and have it return
 `(GeoIPEntry, error)` rather than inserting into the table itself.
 GeoIPLoadFile can do the entry insertion, count lines and annotate the
 error result from `parseEntry`.
 The difficulty in refactoring it this way is that ipv4 and ipv6 tables
 have differently formated entries that need to be parse differently.
 I added error checking to parseEntry and have it return `(GeoIPEntry,
 error)` as suggested, but leave it as a method on GeoIPv4Table and
 GeoIPv6Table.

 > * If `NewMetrics` is called more than once, there will be multiple open
 file handles to metrics.log (and only the first one will actually work, I
 think). Is it possible that `NewMetrics` could be called more than once?
 Making a singleton *Metrics variable causes problems with how Convey does
 tests. It shouldn't be called more than once, but for now I'm using
 sync.Once on the logging at least so it's explicit


 > *
 
[https://github.com/cohosh/snowflake/blob/4de81ed039ce28264a98a040a0fde2237844e779/broker
 /snowflake-broker_test.go#L274-L279 snowflake-broker_test.go]: I'm not
 crazy about having the tests depend on external files, especially ones
 that may change. Ideally the test code would include its own small geoip
 files, either as separate files or as literals in the source code. But at
 least, the test should [https://golang.org/pkg/testing/#hdr-Skipping
 t.Skip] if a file is missing, rather than report spurious failure. I
 suggest adding a layer of abstraction: a `GeoIPLoad(io.Reader)` that
 parses any `io.Reader` (for examples a `bytes.Reader`), and write
 `GeoIPLoadFile` in terms of it.
 > * It would be nice to also have tests that exercise the error paths in
 geoip parsing.

 I haven't made this abstraction at the moment and went for the small geoip
 files option.

 > * I would like to see unit tests for basic functions like
 `GeoIPRangeClosure` and `GeoIPCheckRange`. They don't have to be long,
 just test the edge conditions of an instance.
 I replaced these functions with much simpler one line statements, but I
 added edge cases to the existing geoip test to make sure we correctly
 categorize IPs at the edge of a range

--
Ticket URL: 
Tor Bu

Re: [tor-bugs] #29734 [Obfuscation/Snowflake]: Broker should receive country stats information from Proxy and Client

2019-04-08 Thread Tor Bug Tracker & Wiki
#29734: Broker should receive country stats information from Proxy and Client
-+--
 Reporter:  cohosh   |  Owner:  cohosh
 Type:  enhancement  | Status:  needs_review
 Priority:  Medium   |  Milestone:
Component:  Obfuscation/Snowflake|Version:
 Severity:  Normal   | Resolution:
 Keywords:  snowflake, geoip, stats  |  Actual Points:  2
Parent ID:  #29207   | Points:  1
 Reviewer:  ahf  |Sponsor:  Sponsor19
-+--
Changes (by cohosh):

 * status:  needs_revision => needs_review


--
Ticket URL: 
Tor Bug Tracker & Wiki 
The Tor Project: anonymity online
___
tor-bugs mailing list
tor-bugs@lists.torproject.org
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-bugs

Re: [tor-bugs] #29734 [Obfuscation/Snowflake]: Broker should receive country stats information from Proxy and Client

2019-04-18 Thread Tor Bug Tracker & Wiki
#29734: Broker should receive country stats information from Proxy and Client
-+
 Reporter:  cohosh   |  Owner:  cohosh
 Type:  enhancement  | Status:  needs_revision
 Priority:  Medium   |  Milestone:
Component:  Obfuscation/Snowflake|Version:
 Severity:  Normal   | Resolution:
 Keywords:  snowflake, geoip, stats  |  Actual Points:  2
Parent ID:  #29207   | Points:  1
 Reviewer:  ahf  |Sponsor:  Sponsor19
-+
Changes (by dcf):

 * status:  needs_review => needs_revision


Comment:

 I have just a few further changes to recommend.

 *
 
https://github.com/cohosh/snowflake/blob/fb01c2d1c9d402cc4d96f01df2e6fb6cb37bc9a6/broker/broker.go#L221
   {{{
 strings.Split(r.RemoteAddr, ":")[0]
   }}}
   This will fail for IPv6 addresses. Better to use net.SplitHostPort and
 check the error return. The host–port splitting could also happen inside
 of `Metrics.UpdateCountryStats`.
 *
 
https://github.com/cohosh/snowflake/blob/fb01c2d1c9d402cc4d96f01df2e6fb6cb37bc9a6/broker/geoip.go#L196
   {{{
 _, err = io.WriteString(hash, scanner.Text())
   }}}
   A better way to do this may be to do `hashedFile :=
 io.TeeReader(geoipFile, hash)` and then `scanner :=
 bufio.NewScanner(hashedFile)` so that the hash is calculated as a side
 effect of reading the file. There's no need to handle errors when writing
 to a `hash.Hash` because it is documented to never error.
 *
 
https://github.com/cohosh/snowflake/blob/fb01c2d1c9d402cc4d96f01df2e6fb6cb37bc9a6/broker/geoip.go#L14
   {{{
 Recognized line formats for IPv4 are:
 INTIPLOW,INTIPHIGH,CC
 and
 "INTIPLOW","INTIPHIGH","CC","CC3","COUNTRY NAME"
   }}}
   It looks like the code doesn't recognize the 5-element syntax, so it
 should be omitted here, or, if it's a common format, documented as not
 being supported.

 Replying to [comment:22 cohosh]:
 > Replying to [comment:18 dcf]:
 > > * Is there any way to disable the geoip feature, if I don't have the
 necessary files, other than by providing an invalid path? Maybe we don't
 care to provide a way to disable the feature and it's the operator's
 responsibility to provide some kind of geoip files; but in that case, it
 may be better to exit with an error if the files cannot be read, rather
 than logging the error and continuing on. It is perhaps surprising that
 you can explicitly ask for a certain file on the command line with
 `-geoipdb path`, and the broker will run even if the path cannot be read.
 > > *  It looks to me like if a geoip lookup fails, it is silently
 ignored—this could be misleading. It would be good to distinguish three
 cases: geoip file present and lookup found a country code; geoip file
 present but lookup found nothing; geoip file absent so no lookup is
 possible.
 > Added a -disablegeoip option that will not load geoip tables and will
 not return an error if UpdateCountryStats fails. Otherwise
 UpdateCountryStats will return an error if it fails and loading the table
 will panic if the file does not exist or if it is incorrectly formatted.

 Okay. Let's replace the panic with a `log.Fatal` so the failure gets
 logged.

 > > * The `parseEntry` functions need error checking on `geoipStringToIP`
 and `net.ParseIP`, or else they will store a `nil` that will result in a
 panic later on.
 
[https://github.com/cohosh/snowflake/blob/4de81ed039ce28264a98a040a0fde2237844e779/broker/geoip.go#L93
 This error] will be more useful if it includes a line number. Let me
 suggest a different factoring of the geoip parsing code. Have `parseEntry`
 be a plain function, not a method on `GeoIPv4Table`, and have it return
 `(GeoIPEntry, error)` rather than inserting into the table itself.
 GeoIPLoadFile can do the entry insertion, count lines and annotate the
 error result from `parseEntry`.
 > The difficulty in refactoring it this way is that ipv4 and ipv6 tables
 have differently formated entries that need to be parse differently.
 > I added error checking to parseEntry and have it return `(GeoIPEntry,
 error)` as suggested, but leave it as a method on GeoIPv4Table and
 GeoIPv6Table.

 I didn't explain this well. I meant that there should be separate
 functions for the two formats e.g. `parseEntryIPv4` and `parseEntryIPv6`.
 The existing `parseEntry` methods never refer to `table`, so they don't
 need to be methods. But leaving them as methods is fine too.

 IMO annotating the error message with the problematic line should be done
 in `GeoIPLoadFile`, not in `parseEntry`. This will eliminate the
 duplication of the common `"Provided geoip file is incorrectly formatted"`
 string and ensure that all the error paths get annotated. What I mean is,
 in the `scanner.Scan()` loop, you can replace `return err` with `return
 fmt.Errorf("Provided 

Re: [tor-bugs] #29734 [Obfuscation/Snowflake]: Broker should receive country stats information from Proxy and Client

2019-04-18 Thread Tor Bug Tracker & Wiki
#29734: Broker should receive country stats information from Proxy and Client
-+
 Reporter:  cohosh   |  Owner:  cohosh
 Type:  enhancement  | Status:  needs_revision
 Priority:  Medium   |  Milestone:
Component:  Obfuscation/Snowflake|Version:
 Severity:  Normal   | Resolution:
 Keywords:  snowflake, geoip, stats  |  Actual Points:  2
Parent ID:  #29207   | Points:  1
 Reviewer:  ahf  |Sponsor:  Sponsor19
-+

Comment (by dcf):

 Oh and it looks like country stats don't get incremented whenever
 `GetCountryByAddr` doesn't find a match. I'm afraid this will make
 analysis difficult if a large fraction of proxies aren't covered by the
 geoip database, or the database is improperly installed, or something.
 Could we count them with a country code of `"??"` or something?

 
https://github.com/cohosh/snowflake/blob/fb01c2d1c9d402cc4d96f01df2e6fb6cb37bc9a6/broker/geoip.go#L213
 `GetCountryByAddr` returns `(string, error)`, but failure to find an entry
 in a database is not really an "error". It makes it seem like there are
 three return stats possible (found, not found, error) when really there
 are only two (found, not found). How about changing the signature to
 {{{
 func GetCountryByAddr(table GeoIPTable, ip net.IP) (string, bool)
 }}}
 where the second return value is an `ok` flag, like when indexing a map or
 reading from a channel. The caller can then do the `"??"` substitution
 whenever `!ok`.

--
Ticket URL: 
Tor Bug Tracker & Wiki 
The Tor Project: anonymity online
___
tor-bugs mailing list
tor-bugs@lists.torproject.org
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-bugs

Re: [tor-bugs] #29734 [Obfuscation/Snowflake]: Broker should receive country stats information from Proxy and Client

2019-04-18 Thread Tor Bug Tracker & Wiki
#29734: Broker should receive country stats information from Proxy and Client
-+
 Reporter:  cohosh   |  Owner:  cohosh
 Type:  enhancement  | Status:  needs_revision
 Priority:  Medium   |  Milestone:
Component:  Obfuscation/Snowflake|Version:
 Severity:  Normal   | Resolution:
 Keywords:  snowflake, geoip, stats  |  Actual Points:  2
Parent ID:  #29207   | Points:  1
 Reviewer:  ahf  |Sponsor:  Sponsor19
-+
Changes (by cohosh):

 * cc: irl, karsten (added)


--
Ticket URL: 
Tor Bug Tracker & Wiki 
The Tor Project: anonymity online
___
tor-bugs mailing list
tor-bugs@lists.torproject.org
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-bugs

Re: [tor-bugs] #29734 [Obfuscation/Snowflake]: Broker should receive country stats information from Proxy and Client

2019-04-29 Thread Tor Bug Tracker & Wiki
#29734: Broker should receive country stats information from Proxy and Client
-+--
 Reporter:  cohosh   |  Owner:  cohosh
 Type:  enhancement  | Status:  needs_review
 Priority:  Medium   |  Milestone:
Component:  Obfuscation/Snowflake|Version:
 Severity:  Normal   | Resolution:
 Keywords:  snowflake, geoip, stats  |  Actual Points:  2
Parent ID:  #29207   | Points:  1
 Reviewer:  ahf  |Sponsor:  Sponsor19
-+--
Changes (by cohosh):

 * status:  needs_revision => needs_review


--
Ticket URL: 
Tor Bug Tracker & Wiki 
The Tor Project: anonymity online
___
tor-bugs mailing list
tor-bugs@lists.torproject.org
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-bugs

Re: [tor-bugs] #29734 [Obfuscation/Snowflake]: Broker should receive country stats information from Proxy and Client

2019-04-29 Thread Tor Bug Tracker & Wiki
#29734: Broker should receive country stats information from Proxy and Client
-+
 Reporter:  cohosh   |  Owner:  cohosh
 Type:  enhancement  | Status:  needs_revision
 Priority:  Medium   |  Milestone:
Component:  Obfuscation/Snowflake|Version:
 Severity:  Normal   | Resolution:
 Keywords:  snowflake, geoip, stats  |  Actual Points:  2
Parent ID:  #29207   | Points:  1
 Reviewer:  ahf  |Sponsor:  Sponsor19
-+

Comment (by cohosh):

 Replying to [comment:24 dcf]:
 > I have just a few further changes to recommend.
 Updated branch: https://github.com/cohosh/snowflake/compare/geoip
 >
 > *
 
https://github.com/cohosh/snowflake/blob/fb01c2d1c9d402cc4d96f01df2e6fb6cb37bc9a6/broker/broker.go#L221
 >   {{{
 > strings.Split(r.RemoteAddr, ":")[0]
 >   }}}
 >   This will fail for IPv6 addresses. Better to use net.SplitHostPort and
 check the error return. The host–port splitting could also happen inside
 of `Metrics.UpdateCountryStats`.
 Ack >.< thanks. Done!
 > *
 
https://github.com/cohosh/snowflake/blob/fb01c2d1c9d402cc4d96f01df2e6fb6cb37bc9a6/broker/geoip.go#L196
 >   {{{
 > _, err = io.WriteString(hash, scanner.Text())
 >   }}}
 >   A better way to do this may be to do `hashedFile :=
 io.TeeReader(geoipFile, hash)` and then `scanner :=
 bufio.NewScanner(hashedFile)` so that the hash is calculated as a side
 effect of reading the file. There's no need to handle errors when writing
 to a `hash.Hash` because it is documented to never error.
 Done
 > *
 
https://github.com/cohosh/snowflake/blob/fb01c2d1c9d402cc4d96f01df2e6fb6cb37bc9a6/broker/geoip.go#L14
 >   {{{
 > Recognized line formats for IPv4 are:
 > INTIPLOW,INTIPHIGH,CC
 > and
 > "INTIPLOW","INTIPHIGH","CC","CC3","COUNTRY NAME"
 >   }}}
 >   It looks like the code doesn't recognize the 5-element syntax, so it
 should be omitted here, or, if it's a common format, documented as not
 being supported.
 Documented as unsupported so we know what our line of thought was in case
 we want to support it later
 >
 > Okay. Let's replace the panic with a `log.Fatal` so the failure gets
 logged.
 >
 Done
 > > > * The `parseEntry` functions need error checking on
 `geoipStringToIP` and `net.ParseIP`, or else they will store a `nil` that
 will result in a panic later on.
 
[https://github.com/cohosh/snowflake/blob/4de81ed039ce28264a98a040a0fde2237844e779/broker/geoip.go#L93
 This error] will be more useful if it includes a line number. Let me
 suggest a different factoring of the geoip parsing code. Have `parseEntry`
 be a plain function, not a method on `GeoIPv4Table`, and have it return
 `(GeoIPEntry, error)` rather than inserting into the table itself.
 GeoIPLoadFile can do the entry insertion, count lines and annotate the
 error result from `parseEntry`.
 > > The difficulty in refactoring it this way is that ipv4 and ipv6 tables
 have differently formated entries that need to be parse differently.
 > > I added error checking to parseEntry and have it return `(GeoIPEntry,
 error)` as suggested, but leave it as a method on GeoIPv4Table and
 GeoIPv6Table.
 >
 > I didn't explain this well. I meant that there should be separate
 functions for the two formats e.g. `parseEntryIPv4` and `parseEntryIPv6`.
 The existing `parseEntry` methods never refer to `table`, so they don't
 need to be methods. But leaving them as methods is fine too.
 Okay, I'm going to leave them as methods to avoid having two different
 LoadFile functions for the different types of table.
 >
 > IMO annotating the error message with the problematic line should be
 done in `GeoIPLoadFile`, not in `parseEntry`. This will eliminate the
 duplication of the common `"Provided geoip file is incorrectly formatted"`
 string and ensure that all the error paths get annotated. What I mean is,
 in the `scanner.Scan()` loop, you can replace `return err` with `return
 fmt.Errorf("Provided geoip file is incorrectly formatted: %s. Line is:
 %+q")`.
 Done, is there a better way of handling the error stubs in `parseEntry`
 other than returning `fmt.Errorf("")`?

 Replying to [comment:25 dcf]:
 > Oh and it looks like country stats don't get incremented whenever
 `GetCountryByAddr` doesn't find a match. I'm afraid this will make
 analysis difficult if a large fraction of proxies aren't covered by the
 geoip database, or the database is improperly installed, or something.
 Could we count them with a country code of `"??"` or something?
 >
 >
 
https://github.com/cohosh/snowflake/blob/fb01c2d1c9d402cc4d96f01df2e6fb6cb37bc9a6/broker/geoip.go#L213
 > `GetCountryByAddr` returns `(string, error)`, but failure to find an
 entry in a database is not really an "error". It makes it seem like there
 are three return stats possible (found, n

Re: [tor-bugs] #29734 [Obfuscation/Snowflake]: Broker should receive country stats information from Proxy and Client

2019-04-30 Thread Tor Bug Tracker & Wiki
#29734: Broker should receive country stats information from Proxy and Client
-+-
 Reporter:  cohosh   |  Owner:  cohosh
 Type:  enhancement  | Status:  merge_ready
 Priority:  Medium   |  Milestone:
Component:  Obfuscation/Snowflake|Version:
 Severity:  Normal   | Resolution:
 Keywords:  snowflake, geoip, stats  |  Actual Points:  2
Parent ID:  #29207   | Points:  1
 Reviewer:  ahf  |Sponsor:  Sponsor19
-+-
Changes (by dcf):

 * status:  needs_review => merge_ready


Comment:

 Looks good! Nice work on this.

 Replying to [comment:27 cohosh]:
 > Replying to [comment:24 dcf]:
 > > IMO annotating the error message with the problematic line should be
 done in `GeoIPLoadFile`, not in `parseEntry`. This will eliminate the
 duplication of the common `"Provided geoip file is incorrectly formatted"`
 string and ensure that all the error paths get annotated. What I mean is,
 in the `scanner.Scan()` loop, you can replace `return err` with `return
 fmt.Errorf("Provided geoip file is incorrectly formatted: %s. Line is:
 %+q")`.
 > Done, is there a better way of handling the error stubs in `parseEntry`
 other than returning `fmt.Errorf("")`?

 There's nothing wrong with returning a specific "couldn't parse IP: "
 error message IMO, as long as the message gets printed at the top level in
 `GeoIPLoadFile`, not in `parseEntry`. Or, if you just want s generic error
 sentinel without a meaningful message, you can create an error object at
 the top-level scope and just return that:
 {{{
 var errParseError error = errors.New("parse error")
 }}}

 > Replying to [comment:25 dcf]:
 > > Oh and it looks like country stats don't get incremented whenever
 `GetCountryByAddr` doesn't find a match. I'm afraid this will make
 analysis difficult if a large fraction of proxies aren't covered by the
 geoip database, or the database is improperly installed, or something.
 Could we count them with a country code of `"??"` or something?
 >
 > Okay, so I've changed the GetCountryByAddr signature to what you've
 suggested and then just removed the error return value from
 UpdateCountryStats. I've also added setting the country to `"??"` in
 UpdateCountryStats.

 The documentation for `GetCountryByAddr` should explain the second return
 value.

 I think you can remove this condition now:
 {{{
 if country != "" {
 }}}

--
Ticket URL: 
Tor Bug Tracker & Wiki 
The Tor Project: anonymity online
___
tor-bugs mailing list
tor-bugs@lists.torproject.org
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-bugs