[tor-bugs] #22203 [Core Tor/Tor]: Should a hup reload the geoip files?

2017-05-08 Thread Tor Bug Tracker & Wiki
#22203: Should a hup reload the geoip files?
--+-
 Reporter:  arma  |  Owner:
 Type:  enhancement   | Status:  new
 Priority:  Medium|  Milestone:
Component:  Core Tor/Tor  |Version:
 Severity:  Normal|   Keywords:
Actual Points:|  Parent ID:
   Points:|   Reviewer:
  Sponsor:|
--+-
 In config.c in config_maybe_load_geoip_files_() we have this line:
 {{{
   /*  Reload GeoIPFile on SIGHUP. -NM */
 }}}

 and sure enough, it looks like there's nothing in main.c's do_hup() that
 would make us reload the geoip files.

 It would be relatively easy to do I think:
 * In do_hup(), right around the call to routerlist_reset_warnings(), we
 call something in geoip.c that tells it to no longer consider itself
 initialized. Maybe that call is something like clear_geoip_db().
 * Then in config_maybe_load_geoip_files_(), since geoip_is_loaded()
 returns 0, it loads them.

 Things get tricky though: catalyst asked if reloading the geoip files
 messes up the statistics gathering.

 If reloading the geoip files *does* mess up statistics gathering, we have
 ourselves a minor bug, since config_maybe_load_geoip_files_() does reload
 them if the GeoIPFile location changes.

 But it turns out to be more complicated than that, since geoip_load_file()
 only clears selective things: it clears geoip_ipv4_entries and
 geoip_ipv6_entries, but it leaves geoip_countries alone! And I see
 elsewhere, at the bottom of geoip_note_client_seen(), that we are keeping
 statistics directly in the geoip_countries smartlist:
 {{{
   geoip_country_t *country = smartlist_get(geoip_countries,
 country_idx);
   ++country->n_v3_ns_requests;
 }}}

 So we would not want to call clear_geoip_db() on hup, or we'd lose some
 stats.

 I guess that means if we want to make do_hup reload the geoip stats file,
 the better (non-invasive) plan is to have a boolean
 want_to_reload_geoip{4,6} in geoip.c that we turn on in do_hup, and then
 we check the right boolean in geoip_is_loaded().

 I think there's a big argument for growing some good unit tests here,
 around "what happens when we reload this, collect those stats, reload
 that, examine those other stats, etc", so things are either subtly broken
 right now, or awfully fragile.

--
Ticket URL: 
Tor Bug Tracker & Wiki 
The Tor Project: anonymity online
___
tor-bugs mailing list
tor-bugs@lists.torproject.org
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-bugs

Re: [tor-bugs] #22203 [Core Tor/Tor]: Should a hup reload the geoip files?

2018-07-01 Thread Tor Bug Tracker & Wiki
#22203: Should a hup reload the geoip files?
-+-
 Reporter:  arma |  Owner:  (none)
 Type:  enhancement  | Status:  new
 Priority:  Medium   |  Milestone:  Tor:
 |  unspecified
Component:  Core Tor/Tor |Version:
 Severity:  Normal   | Resolution:
 Keywords:  intro tor-client tor-relay geoip |  Actual Points:
  hup configuration reload metrics-geoip |
Parent ID:   | Points:  2
 Reviewer:   |Sponsor:
-+-
Changes (by irl):

 * keywords:  intro tor-client tor-relay geoip  hup configuration reload =>
 intro tor-client tor-relay geoip  hup configuration reload metrics-
 geoip


--
Ticket URL: 
Tor Bug Tracker & Wiki 
The Tor Project: anonymity online
___
tor-bugs mailing list
tor-bugs@lists.torproject.org
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-bugs

Re: [tor-bugs] #22203 [Core Tor/Tor]: Should a hup reload the geoip files?

2017-05-08 Thread Tor Bug Tracker & Wiki
#22203: Should a hup reload the geoip files?
--+-
 Reporter:  arma  |  Owner:
 Type:  enhancement   | Status:  new
 Priority:  Medium|  Milestone:
Component:  Core Tor/Tor  |Version:
 Severity:  Normal| Resolution:
 Keywords:|  Actual Points:
Parent ID:| Points:
 Reviewer:|Sponsor:
--+-
Description changed by arma:

Old description:

> In config.c in config_maybe_load_geoip_files_() we have this line:
> {{{
>   /*  Reload GeoIPFile on SIGHUP. -NM */
> }}}
>
> and sure enough, it looks like there's nothing in main.c's do_hup() that
> would make us reload the geoip files.
>
> It would be relatively easy to do I think:
> * In do_hup(), right around the call to routerlist_reset_warnings(), we
> call something in geoip.c that tells it to no longer consider itself
> initialized. Maybe that call is something like clear_geoip_db().
> * Then in config_maybe_load_geoip_files_(), since geoip_is_loaded()
> returns 0, it loads them.
>
> Things get tricky though: catalyst asked if reloading the geoip files
> messes up the statistics gathering.
>
> If reloading the geoip files *does* mess up statistics gathering, we have
> ourselves a minor bug, since config_maybe_load_geoip_files_() does reload
> them if the GeoIPFile location changes.
>
> But it turns out to be more complicated than that, since
> geoip_load_file() only clears selective things: it clears
> geoip_ipv4_entries and geoip_ipv6_entries, but it leaves geoip_countries
> alone! And I see elsewhere, at the bottom of geoip_note_client_seen(),
> that we are keeping statistics directly in the geoip_countries smartlist:
> {{{
>   geoip_country_t *country = smartlist_get(geoip_countries,
> country_idx);
>   ++country->n_v3_ns_requests;
> }}}
>
> So we would not want to call clear_geoip_db() on hup, or we'd lose some
> stats.
>
> I guess that means if we want to make do_hup reload the geoip stats file,
> the better (non-invasive) plan is to have a boolean
> want_to_reload_geoip{4,6} in geoip.c that we turn on in do_hup, and then
> we check the right boolean in geoip_is_loaded().
>
> I think there's a big argument for growing some good unit tests here,
> around "what happens when we reload this, collect those stats, reload
> that, examine those other stats, etc", so things are either subtly broken
> right now, or awfully fragile.

New description:

 In config.c in config_maybe_load_geoip_files_() we have this line:
 {{{
   /*  Reload GeoIPFile on SIGHUP. -NM */
 }}}

 and sure enough, it looks like there's nothing in main.c's do_hup() that
 would make us reload the geoip files.

 It would be relatively easy to do I think:
 * In do_hup(), right around the call to routerlist_reset_warnings(), we
 call something in geoip.c that tells it to no longer consider itself
 initialized. Maybe that call is something like clear_geoip_db().
 * Then in config_maybe_load_geoip_files_(), since geoip_is_loaded()
 returns 0, it loads them.

 Things get tricky though: catalyst asked if reloading the geoip files
 messes up the statistics gathering.

 If reloading the geoip files *does* mess up statistics gathering, we have
 ourselves a minor bug, since config_maybe_load_geoip_files_() does reload
 them if the GeoIPFile location changes.

 But it turns out to be more complicated than that, since geoip_load_file()
 only clears selective things: it clears geoip_ipv4_entries and
 geoip_ipv6_entries, but it leaves geoip_countries alone! And I see
 elsewhere, at the bottom of geoip_note_client_seen(), that we are keeping
 statistics directly in the geoip_countries smartlist:
 {{{
   geoip_country_t *country = smartlist_get(geoip_countries,
 country_idx);
   ++country->n_v3_ns_requests;
 }}}

 So we would not want to call clear_geoip_db() on hup, or we'd lose some
 stats.

 I guess that means if we want to make do_hup reload the geoip stats file,
 the better (non-invasive) plan is to have a boolean
 want_to_reload_geoip{4,6} in geoip.c that we turn on in do_hup, and then
 we check the right boolean in geoip_is_loaded().

 I think there's a big argument for growing some good unit tests here,
 around "what happens when we reload this, collect those stats, reload
 that, examine those other stats, etc", since things are either subtly
 broken right now, or awfully fragile.

--

--
Ticket URL: 
Tor Bug Tracker & Wiki 
The Tor Project: anonymity online
___
tor-bugs mailing list
tor-bugs@lists.torproject.org
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-bugs

Re: [tor-bugs] #22203 [Core Tor/Tor]: Should a hup reload the geoip files?

2017-05-09 Thread Tor Bug Tracker & Wiki
#22203: Should a hup reload the geoip files?
--+-
 Reporter:  arma  |  Owner:
 Type:  enhancement   | Status:  new
 Priority:  Medium|  Milestone:
Component:  Core Tor/Tor  |Version:
 Severity:  Normal| Resolution:
 Keywords:|  Actual Points:
Parent ID:| Points:
 Reviewer:|Sponsor:
--+-
Changes (by karsten):

 * cc: karsten (added)


Comment:

 I'm not arguing against fixing bugs or writing better tests.

 But what's the practical value of being able to reload the geoip files?
 We only ship new geoip files in new releases, and people will need to
 restart their tor process in order to upgrade to those anyway.  I'd guess
 that only a tiny fraction of operators would want to put in newer geoip
 files manually and reload their tor process.  And those statistics are
 hardly statistically relevant.

 So, I was curious how old the geoip files used by relays and bridges are,
 which is very likely related to how old the tor versions running in the
 network are.  Relays and bridges contain `geoip-db-digest` lines with
 SHA-1 digests of their geoip file, and we can extract those digests from
 tor's Git repo.  Here are some statistics on the age of geoip files
 mentioned in descriptors published between January 1, 2017 and May 4,
 2017:

 {{{
 # Relays
 Min.  1st Qu.   Median Mean  3rd Qu.
 Max.
 "2009-06-03" "2015-03-03" "2015-12-01" "2016-02-15" "2017-01-04"
 "2017-04-04"

 # Bridges
 Min.  1st Qu.   Median Mean  3rd Qu.
 Max.
 "2009-06-03" "2016-11-03" "2016-12-07" "2016-11-04" "2017-02-08"
 "2017-04-04"
 }}}

 These statistics say that the geoip files used by relays are more than 1
 year old on average and those used by bridges a couple of months.  (No
 idea why the difference between the two is so big.)

 But anyway, it seems like improving support for manually reloading geoip
 files won't make much of a difference.

 What we should consider instead is support for downloading signed geoip
 files from the directory authorities.  That will be much more effective.
 But it's also a lot of work and requires looking into licenses.

--
Ticket URL: 
Tor Bug Tracker & Wiki 
The Tor Project: anonymity online
___
tor-bugs mailing list
tor-bugs@lists.torproject.org
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-bugs

Re: [tor-bugs] #22203 [Core Tor/Tor]: Should a hup reload the geoip files?

2017-05-09 Thread Tor Bug Tracker & Wiki
#22203: Should a hup reload the geoip files?
--+--
 Reporter:  arma  |  Owner:
 Type:  enhancement   | Status:  new
 Priority:  Medium|  Milestone:  Tor: unspecified
Component:  Core Tor/Tor  |Version:
 Severity:  Normal| Resolution:
 Keywords:|  Actual Points:
Parent ID:| Points:
 Reviewer:|Sponsor:
--+--
Changes (by dgoulet):

 * milestone:   => Tor: unspecified


--
Ticket URL: 
Tor Bug Tracker & Wiki 
The Tor Project: anonymity online
___
tor-bugs mailing list
tor-bugs@lists.torproject.org
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-bugs

Re: [tor-bugs] #22203 [Core Tor/Tor]: Should a hup reload the geoip files?

2017-07-11 Thread Tor Bug Tracker & Wiki
#22203: Should a hup reload the geoip files?
-+-
 Reporter:  arma |  Owner:
 Type:  enhancement  | Status:  new
 Priority:  Medium   |  Milestone:  Tor:
 |  unspecified
Component:  Core Tor/Tor |Version:
 Severity:  Normal   | Resolution:
 Keywords:  intro tor-client tor-relay geoip |  Actual Points:
  hup configuration reload   |
Parent ID:   | Points:  2
 Reviewer:   |Sponsor:
-+-
Changes (by nickm):

 * keywords:   => intro tor-client tor-relay geoip  hup configuration reload
 * points:   => 2


--
Ticket URL: 
Tor Bug Tracker & Wiki 
The Tor Project: anonymity online
___
tor-bugs mailing list
tor-bugs@lists.torproject.org
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-bugs