Re: [tor-dev] Tor and IP2Location LITE

2017-08-23 Thread Zack Weinberg
On Wed, Aug 23, 2017 at 9:36 PM, KL Liew  wrote:
>
>> It is possible that this address is used by North Korea, they don't have
>> a massive IP allocation and I would expect that perhaps there are some
>> tunnels, but I can't figure out where MaxMind have got this idea from.
>
> We are aware of a small number of IP ranges that tunnel to North Korea
> through a specific ISP. However, this IP address is registered to a VPN
> provider that has also registered ranges in many other countries. We have
> no evidence that this VPN provider has servers in the countries it reports
> for its VPN service.

Allow me to jump in here and mention that I have done some work on
auditing the locations of VPN servers via active probes (very briefly:
pingtimes to hosts in known locations give upper bounds on the
distances to those hosts), and I suspect I know which VPN provider you
are referring to and their claims are indeed ... let's say
questionable. I'm not yet at liberty to share any more details of my
results, but you may find the software at
https://github.com/zackw/active-geolocator/ of interest.
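
A rough sketch of the ping-time bound mentioned above, for illustration only. It is not taken from active-geolocator; the two-thirds-of-c propagation factor and all function names are assumptions. A round trip of t milliseconds means the host can be no farther away than a signal could travel in t/2, so a claimed location that violates this bound for any landmark can be ruled out.

```python
# Illustrative sketch only; not the active-geolocator implementation.
SPEED_OF_LIGHT_KM_PER_MS = 299.792458  # km travelled per millisecond in vacuum
FIBER_FACTOR = 2.0 / 3.0               # assumed signal speed in fibre, relative to c


def max_distance_km(rtt_ms, factor=FIBER_FACTOR):
    """Upper bound on the distance to a host, given a round-trip time in ms."""
    if rtt_ms < 0:
        raise ValueError("round-trip time must be non-negative")
    one_way_ms = rtt_ms / 2.0
    return one_way_ms * SPEED_OF_LIGHT_KM_PER_MS * factor


def feasible(landmark_rtts, distance_to_candidate_km):
    """Check a claimed location against every landmark measurement.

    landmark_rtts maps a landmark name to its measured RTT (ms).
    distance_to_candidate_km maps the same names to the great-circle
    distance (km) from that landmark to the claimed location.
    """
    return all(
        distance_to_candidate_km[name] <= max_distance_km(rtt)
        for name, rtt in landmark_rtts.items()
    )
```

The bound is deliberately one-sided: queueing and indirect routes only make a host look farther away, so a short ping time can exclude a distant location, but a long one proves nothing.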

Applying the same techniques to Tor is something I would be interested
in helping with, though not a personal priority.

zw
___
tor-dev mailing list
tor-dev@lists.torproject.org
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-dev


Re: [tor-dev] Tor and IP2Location LITE

2017-08-23 Thread KL Liew
Please find my comments below.

> It is possible that this address is used by North Korea, they don't have
> a massive IP allocation and I would expect that perhaps there are some
> tunnels, but I can't figure out where MaxMind have got this idea from.

We are aware of a small number of IP ranges that tunnel to North Korea
through a specific ISP. However, this IP address is registered to a VPN
provider that has also registered ranges in many other countries. We have
no evidence that this VPN provider has servers in the countries it reports
for its VPN service.

> I think GeoIP is actually a far more difficult problem when it's not
> typical residential customers. Satellite customers, for instance, may
> use IP blocks that are spread across multiple countries.
>
> I would expect that cloud providers and larger datacenter providers are
> using tunnels of sorts between their datacenters. Tunnels kill any
> visibility into the real routing path.

Large cloud providers such as AWS and Azure publish their data center
locations and IP address ranges. Data centers usually avoid tunnels for
performance and cost reasons. We do see rare cases that require tunnels,
such as DDoS protection.

> When it comes to measuring the accuracy of databases for datacenters, I
> wonder if there could be some means for relay operators to self-report a
> location and then we can compare this with different databases.

If this is possible, then it is a good way to perform benchmarking.
However, we need to make sure the relay operator is giving the right
information.


- Kim




Re: [tor-dev] Tor and IP2Location LITE

2017-08-23 Thread Iain R. Learmonth
Hi,

On 23/08/17 03:45, KL Liew wrote:
>> How is your accuracy for data centres?
> 
> I am not aware of any research papers that target data centers only.
> IP2Location should be highly accurate because we are using network
> routing information to determine physical location instead of registrant
> address.
> 
> For example, IP2Location is reporting 185.56.163.144 as in France after
> reviewing the network routing information as below. However, if you
> search the same IP address in other geolocation providers, you might see
> it as reported as North Korea, a country with limited Internet access.

It is possible that this address is used by North Korea, they don't have
a massive IP allocation and I would expect that perhaps there are some
tunnels, but I can't figure out where MaxMind have got this idea from.

I think GeoIP is actually a far more difficult problem when it's not
typical residential customers. Satellite customers, for instance, may
use IP blocks that are spread across multiple countries.

I would expect that cloud providers and larger datacenter providers are
using tunnels of sorts between their datacenters. Tunnels kill any
visibility into the real routing path.

When attempting to perform GeoIP for routers, the problem is compounded
as you don't know who really owns the router based on IP addresses
alone, routers having multiple IP addresses, etc.

With the influx of new TLDs and TLDs being chosen for vanity purposes,
they are also not a useful indicator.

I fear it's the smaller providers, the more Tor-friendly providers, that
are missing or inaccurately represented in the databases.

When it comes to measuring the accuracy of databases for datacenters, I
wonder if there could be some means for relay operators to self-report a
location and then we can compare this with different databases.

Is there a safe way for relay operators to prove that they control a
relay and self-report the location of the relay without us having to
have an extra field in relay descriptors or overload the contact field?
Some sort of out-of-band means?
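
To make the comparison idea concrete, here is a minimal sketch that assumes operators report latitude/longitude and that each database returns coordinates for the relay's address. The function names and data layout are hypothetical, and the sketch does not address the open question of proving that the reporter controls the relay.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    # Clamp to 1.0 to guard against floating-point overshoot near antipodes.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def database_errors(self_reported, lookups):
    """Distance (km) from the self-reported location to each database's answer.

    self_reported is a (lat, lon) pair; lookups maps a database name
    (hypothetical labels chosen by the caller) to the (lat, lon) it returned.
    """
    lat, lon = self_reported
    return {
        name: haversine_km(lat, lon, db_lat, db_lon)
        for name, (db_lat, db_lon) in lookups.items()
    }
```

Aggregating these per-relay distances (for example, the median per database) would give a data-centre-specific accuracy figure, provided the self-reports can be trusted.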

Thanks,
Iain.







Re: [tor-dev] Tor and IP2Location LITE

2017-08-22 Thread KL Liew
Hi Tim,

> Are there any accuracy comparisons between MaxMind and IP2Location?
>
> We have noticed that GeoIP providers often focus on location accuracy for
> residential customers. But we use our GeoIP databases to locate both Tor
> clients (mainly residential) and Tor relays (mainly data centre).
>
> How is your accuracy for data centres?

I am not aware of any research papers that target data centers only.
IP2Location should be highly accurate because we are using network routing
information to determine physical location instead of registrant address.

For example, IP2Location is reporting 185.56.163.144 as in France after
reviewing the network routing information as below. However, if you search
the same IP address in other geolocation providers, you might see it as
reported as North Korea, a country with limited Internet access.

Tracing route to 185.56.163.144 over a maximum of 30 hops

  1     1 ms   <10 ms     1 ms  192.168.1.1
  2     *        *        *     Request timed out.
  3     5 ms     4 ms     3 ms  10.233.65.32
  4   178 ms   179 ms   179 ms  10.55.200.67
  5   270 ms   267 ms   273 ms  ams-ix2.eu.iptransit.com [80.249.211.47]
  6   283 ms   285 ms   283 ms  xe-4-3-1.r2.ams.iptransit.com [204.26.60.115]
  7   283 ms   284 ms   283 ms  te2-4.r3.ams.sara.nl.iptransit.com [204.26.60.6]
  8   282 ms   279 ms   281 ms  te1-2.r2.lux.iptransit.com [204.26.60.9]
  9   285 ms   284 ms   284 ms  204.26.60.123
 10   288 ms   287 ms   287 ms  185.56.163.144
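
One way routing information like this can be turned into location hints is to look for place codes in router hostnames. The sketch below is an illustration under assumptions, not IP2Location's method: the hint table is hypothetical and assumes the operator names routers with airport-style codes, which has to be verified for each network.

```python
import re

# Hypothetical hint table: short codes as they might appear in router names.
HINTS = {"ams": "Amsterdam, NL", "lux": "Luxembourg, LU"}


def hostname_hints(hostname, hints=HINTS):
    """Return the locations whose code appears as a whole token in hostname."""
    tokens = re.split(r"[.\-_]", hostname.lower())
    return [hints[token] for token in tokens if token in hints]


def hints_along_path(hostnames, hints=HINTS):
    """Pair each hop's hostname with the hints found in it, in path order."""
    return [(name, hostname_hints(name, hints)) for name in hostnames]
```

A hint on an intermediate hop only suggests where that router is, not where the destination is, so hints like these are best cross-checked against latency measurements.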


 - Kim



Re: [tor-dev] Tor and IP2Location LITE

2017-08-22 Thread teor
Hi Kim,


> On 16 Aug 2017, at 13:38, KL Liew  wrote:
> 
> In terms of accuracy, you can find the latest research paper published by TUM.
> IP2Location has good accuracy as reported in Table V.
> 
> Title   : HLOC: Hints-Based Geolocation Leveraging Multiple Measurement 
> Frameworks
> Authors : Quirin Scheitle, Oliver Gasser, Patrick Sattler, Georg Carle 
> from Technical University of Munich (TUM)
> PDF Access  : https://arxiv.org/pdf/1706.09331.pdf

Are there any accuracy comparisons between MaxMind and IP2Location?

We have noticed that GeoIP providers often focus on location accuracy for
residential customers. But we use our GeoIP databases to locate both Tor
clients (mainly residential) and Tor relays (mainly data centre).

How is your accuracy for data centres?

T

--
Tim Wilson-Brown (teor)

teor2345 at gmail dot com
PGP C855 6CED 5D90 A0C5 29F6 4D43 450C BA7F 968F 094B
ricochet:ekmygaiu4rzgsk6n
xmpp: teor at torproject dot org








Re: [tor-dev] Tor and IP2Location LITE

2017-08-20 Thread KL Liew
>> All in all, we (Tor's metrics team) are considering it! But it'll be on
>> the order of weeks or maybe months before we can move this forward.

No problem. Just let me know if any help is needed.



Re: [tor-dev] Tor and IP2Location LITE

2017-08-20 Thread David Fifield
On Sun, Aug 20, 2017 at 10:02:20PM +0200, Karsten Loesing wrote:
> Okay. Maybe we could do something with archive.org in that case. It's
> not as if we have a complete history of MaxMind's files either, although
> we could probably create our own history from Tor's Git repository, which
> contains files based on MaxMind's files.

I have a script that walks through the history of tor's git geoip files.

#!/usr/bin/env python

import datetime
import getopt
import os.path
import socket
import subprocess
import sys

# Counts the size of per-country geoip allocations in the tor source code.
#
# Usage: ./scrape-geoip.py ~/src/tor > tor-geoip.csv
#
# ~/src/tor (or whatever the path is) must be a tor source repo; i.e. a clone of
# https://git.torproject.org/tor.git.

def usage(f=sys.stdout):
    print >> f, """\
Usage: %s /path/to/tor
""" % sys.argv[0]

def history(dirname, filename):
    proc = subprocess.Popen(["git", "log", "--reverse", "--date=short", "--pretty=%H %ad", filename],
        cwd=dirname, stdout=subprocess.PIPE)
    return proc.stdout

def git_show(dirname, filename, commithash):
    proc = subprocess.Popen(["git", "show", commithash+":"+filename],
        cwd=dirname, stdout=subprocess.PIPE)
    return proc.stdout

def parse_geoip(f):
    ccs = {}
    for line in f:
        if line.startswith("#"):
            continue
        parts = line.strip().split(",")
        start = int(parts[0])
        end = int(parts[1])
        cc = parts[2].lower()
        ccs.setdefault(cc, 0)
        ccs[cc] += end - start + 1
    return ccs

def ipv6_to_int(ipstr):
    return long("0x" + socket.inet_pton(socket.AF_INET6, ipstr).encode("hex"), 16)

def parse_geoip6(f):
    ccs = {}
    for line in f:
        if line.startswith("#"):
            continue
        parts = line.strip().split(",")
        start = ipv6_to_int(parts[0])
        end = ipv6_to_int(parts[1])
        cc = parts[2].lower()
        ccs.setdefault(cc, 0)
        ccs[cc] += end - start + 1
    return ccs


opts, args = getopt.gnu_getopt(sys.argv[1:], "h", ["help"])
for o, a in opts:
    if o == "-h" or o == "--help":
        usage()
        sys.exit()

try:
    TOR_PATH, = args
except ValueError:
    usage(sys.stderr)
    sys.exit(1)

print "date,ipv,country,count"

for line in history(TOR_PATH, "src/config/geoip"):
    parts = line.strip().split()
    commithash = parts[0]
    date = datetime.datetime.strptime(parts[1], "%Y-%m-%d")

    try:
        ccs = parse_geoip(git_show(TOR_PATH, "src/config/geoip", commithash))
    except Exception, e:
        print >> sys.stderr, "Skipping %s %s: %s" % ("src/config/geoip", commithash, e)
        continue
    for cc, count in sorted(ccs.items()):
        print ",".join([date.strftime("%Y-%m-%d"), "4", cc, str(count)])

for line in history(TOR_PATH, "src/config/geoip6"):
    parts = line.strip().split()
    commithash = parts[0]
    date = datetime.datetime.strptime(parts[1], "%Y-%m-%d")

    try:
        ccs = parse_geoip6(git_show(TOR_PATH, "src/config/geoip6", commithash))
    except Exception, e:
        print >> sys.stderr, "Skipping %s %s: %s" % ("src/config/geoip6", commithash, e)
        continue
    for cc, count in sorted(ccs.items()):
        print ",".join([date.strftime("%Y-%m-%d"), "6", cc, str(count)])
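
The script above uses Python 2 syntax. For readers on Python 3, the counting step might be sketched as follows. This is an adaptation, not the original author's code: it assumes the same line format the script parses (comma-separated start, end, country code, with integers for IPv4 and address strings for IPv6), and the function name and sample lines are invented for illustration.

```python
import ipaddress
from collections import Counter


def count_addresses(lines, ipv6=False):
    """Count addresses per lower-cased country code in geoip-style lines.

    Each non-comment line is expected to look like "start,end,cc".
    For IPv4 the range ends are integers; for IPv6 they are address strings.
    """
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        start, end, cc = line.split(",")[:3]
        if ipv6:
            lo = int(ipaddress.IPv6Address(start))
            hi = int(ipaddress.IPv6Address(end))
        else:
            lo, hi = int(start), int(end)
        counts[cc.lower()] += hi - lo + 1  # ranges are inclusive
    return counts


if __name__ == "__main__":
    # Invented sample lines, not real geoip data.
    sample = ["# comment", "16777216,16777471,AU", "16777472,16777727,CN"]
    for cc, count in sorted(count_addresses(sample).items()):
        print(cc, count)
```

Feeding it the text of each historical revision (for example, the decoded output of `git show <commit>:src/config/geoip`) would reproduce the per-country totals the original script prints.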


Re: [tor-dev] Tor and IP2Location LITE

2017-08-20 Thread Karsten Loesing
On 2017-08-16 21:19, Karsten Loesing wrote:
> On 2017-08-16 05:38, KL Liew wrote:
>> All,
> 
> Hi Kim,
> 
>> My name is Kim, the founder of IP2Location, a geolocation service
>> provider since 2002.
>>
>> While reading the notes from a meeting back in March 2017, I saw that
>> Tor is looking to review other providers for its GeoIP service.
>>
>> https://trac.torproject.org/projects/tor/wiki/org/meetings/2017Amsterdam/Notes/Metricsin5Years
>>
>> We are very interested in contributing to Tor and working on this matter.
>> Tor can host and integrate IP2Location LITE
>> (http://lite.ip2location.com) into their application. IP2Location has
>> programming libraries in most languages. We can also work with
>> developers if there are any technical issues.
>>
>> In terms of accuracy, you can find the latest research paper published by
>> TUM. IP2Location has good accuracy as reported in Table V.
>>
>> Title   : HLOC: Hints-Based Geolocation Leveraging Multiple
>> Measurement Frameworks
>> Authors : Quirin Scheitle, Oliver Gasser, Patrick Sattler, Georg
>> Carle from Technical University of Munich (TUM)
>> PDF Access  : https://arxiv.org/pdf/1706.09331.pdf
>>
>> Let me know if there are any questions.
> 
> Thanks for reaching out to us!
> 
> It's indeed on our list to evaluate other geolocation databases and
> possibly switch over. I'll bring this topic up at tomorrow's metrics
> team meeting to discuss possible next steps for such an evaluation. I'll
> get back to you here to share the results.

So, we discussed this at our team meeting on Thursday and decided to
further evaluate switching to IP2Location.

That would be a non-trivial project, because we're using geolocation
data in at least two places: 1. shipped with the core Tor program and 2.
deployed on Tor Metrics services like Onionoo. And at least the former
requires close coordination with Tor's network team.

In any case we'll want to be sure whether this switch is the right move
before starting such a project. The paper is a good start, but we might
want to run more evaluations ourselves. For example, we could involve
relay operators by asking them which resolved location is closer to
reality. But even this evaluation requires writing some code, which puts
it on a long list of things we'd like to do.

All in all, we (Tor's metrics team) are considering it! But it'll be on
the order of weeks or maybe months before we can move this forward.

> One question, though, that just came to mind: Are there archives
> available for past IP2Location LITE databases, or do you provide just
> the latest version? Having archives, possibly even back to 2002, would
> be pretty useful for Tor Metrics. (I didn't look around much on your
> homepage, so my apologies if this question is already answered there.)

You replied off-list:

> We do not have an archive for IP2Location LITE. We only started this free
> database a few years ago.

Okay. Maybe we could do something with archive.org in that case. It's
not as if we have a complete history of MaxMind's files either, although
we could probably create our own history from Tor's Git repository, which
contains files based on MaxMind's files.

All the best,
Karsten


> 
>> - Kim
> 
> All the best,
> Karsten
> 






Re: [tor-dev] Tor and IP2Location LITE

2017-08-16 Thread Karsten Loesing
On 2017-08-16 05:38, KL Liew wrote:
> All,

Hi Kim,

> My name is Kim, the founder of IP2Location, a geolocation service
> provider since 2002.
> 
> While reading the notes from a meeting back in March 2017, I saw that
> Tor is looking to review other providers for its GeoIP service.
> 
> https://trac.torproject.org/projects/tor/wiki/org/meetings/2017Amsterdam/Notes/Metricsin5Years
> 
> We are very interested in contributing to Tor and working on this matter.
> Tor can host and integrate IP2Location LITE
> (http://lite.ip2location.com) into their application. IP2Location has
> programming libraries in most languages. We can also work with
> developers if there are any technical issues.
> 
> In terms of accuracy, you can find the latest research paper published by
> TUM. IP2Location has good accuracy as reported in Table V.
> 
> Title   : HLOC: Hints-Based Geolocation Leveraging Multiple
> Measurement Frameworks
> Authors : Quirin Scheitle, Oliver Gasser, Patrick Sattler, Georg
> Carle from Technical University of Munich (TUM)
> PDF Access  : https://arxiv.org/pdf/1706.09331.pdf
> 
> Let me know if there are any questions.

Thanks for reaching out to us!

It's indeed on our list to evaluate other geolocation databases and
possibly switch over. I'll bring this topic up at tomorrow's metrics
team meeting to discuss possible next steps for such an evaluation. I'll
get back to you here to share the results.

One question, though, that just came to mind: Are there archives
available for past IP2Location LITE databases, or do you provide just
the latest version? Having archives, possibly even back to 2002, would
be pretty useful for Tor Metrics. (I didn't look around much on your
homepage, so my apologies if this question is already answered there.)

> - Kim

All the best,
Karsten


