On Wed, Jan 29, 2020 at 9:04 AM teor <t...@riseup.net> wrote:

Hello again!  This looks like another fine proposal.  I'm leaving
comments inline, and clipping sections that I'm not commenting on.


>
> Filename: 312-relay-auto-ipv6-addr.txt
> Title: Tor Relays Automatically Find Their IPv6 Address
> Author: teor
> Created: 28-January-2020
> Status: Draft
> Ticket: #33073
>
> 0. Abstract
>
>    We propose that Tor relays (and bridges) should automatically find their
>    IPv6 address, and use it to publish an IPv6 ORPort. For some relays to find
>    their IPv6 address, they may need to fetch some directory documents from
>    directory authorities over IPv6. (For anonymity reasons, bridges are unable
>    to fetch directory documents over IPv6, until clients start to do so.)
>
> 1. Introduction
>
>    Tor relays (and bridges) currently find their IPv4 address, and use it as
>    their ORPort and DirPort address when publishing their descriptor. But
>    relays and bridges do not automatically find their IPv6 address.

At the beginning of this document, we should be a bit more clear about
which address specifically we're trying to find.  If we wanted _some_
address, or if NAT and firewalls didn't exist, we could just open a
socket, call getsockname(), and be done with it.  What we are looking
for specifically is an address that we can advertise to the rest of
the world in our server descriptor.  [I know you know this, but we
should say so.]


 [...]
> 3. Finding Relay IPv6 Addresses
>
>    We propose that tor relays (and bridges) automatically find their IPv6
>    address, and use it to publish an IPv6 ORPort.
>
>    For some relays to find their IPv6 address, they may need to fetch some
>    directory documents from directory authorities over IPv6. (For anonymity
>    reasons, bridges are unable to fetch directory documents over IPv6, until
>    clients start to do so.)
>
> 3.1. Current Relay IPv4 Address Implementation
>
>    Currently, all relays (and bridges) must have an IPv4 address. IPv6
>    addresses are optional for relays.
>
>    Tor currently tries to find relay IPv4 addresses in this order:
>      1. the Address torrc option
>      2. the address of the hostname (resolved using DNS, if needed)
>      3. a local interface address
>         (by making a self-connected socket, if needed)
>      4. an address reported by a directory server (using X-Your-Address-Is)

Any server, or only an authority?  Over any connection, or only an
authenticated one?

 [...]
> 3.2. Finding Relay IPv6 Addresses
>
>    We propose that relays (and bridges) try to find their IPv6 address. For
>    consistency, we also propose to change the address resolution order for
>    IPv4 addresses.
>
>    We use the following general principles to choose the order of IP address
>    methods:
>      * Explicit is better than Implicit,
>      * Local Information is better than a Remote Dependency,
>      * Trusted is better than Untrusted, and
>      * Reliable is better than Unreliable.
>    Within these constraints, we try to find the simplest working design.

We should make sure to be clear about the impact of using an untrusted
source.  Anybody who can fool a relay about its IP can effectively
MITM that relay's incoming connections (traffic patterns only), so
using a non-trusted source can be risky for anonymity.

 [...]
>    (Each of these address resolution steps is described in more detail, in its
>    own subsection.)
>
>    While making these changes, we want to preserve tor's existing behaviour:
>      * resolve Address using the local resolver, if needed,
>      * ignore private addresses on public tor networks, and
>      * when there are multiple valid addresses, choose the first or latest
>        address, as appropriate.

Instead of "first or latest" I suggest "first-listed or most recently
received" here, to help non-native speakers.

> 3.2.1. Make the Address torrc Option Support IPv6
 [...]
>    It is an error to configure an Address option with a private IPv4 or IPv6
>    address, or with a hostname that does not resolve to any publicly routable
>    IPv4 or IPv6 addresses.

We should say "on a public network" here -- private addresses are fine
on private networks.

Also, this seems to mean that if the relay's DNS resolver goes down,
the relay should give an error and exit, even if it was already
running.  That seems undesirable.
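
To make the "on a public network" distinction concrete, here is a
minimal sketch in Python (not tor's C code). The helper name
is_publishable() and its public_network flag are hypothetical; the
check relies only on Python's standard ipaddress module.

```python
import ipaddress


def is_publishable(addr: str, public_network: bool = True) -> bool:
    """Return True if addr could be advertised in a descriptor.

    Hypothetical helper: on a public network only globally routable
    addresses qualify; on a private (test) network any ordinary
    unicast address is accepted.
    """
    ip = ipaddress.ip_address(addr)
    # Unspecified (0.0.0.0, ::) and multicast addresses are never usable.
    if ip.is_unspecified or ip.is_multicast:
        return False
    if public_network:
        return ip.is_global
    return True
```

With this shape, a private address such as 10.0.0.1 is rejected when
public_network is true and accepted when it is false, so the same
check serves both kinds of network.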

 [...]
> 3.2.2. Use the Advertised ORPort IPv4 and IPv6 Addresses
>
>    Next, we propose that relays (and bridges) use the first advertised ORPort
>    IPv4 and IPv6 addresses, as configured in their torrc.
>
>    The ORPort address may be a hostname. If it is, tor should try to use it to
>    resolve an IPv4 and IPv6 address, and open ORPorts on the first available
>    IPv4 and IPv6 address. Tor should respect the IPv4Only and IPv6Only port
>    flags, if specified. (Tor currently resolves IPv4 addresses in ORPort
>    lines. It may not look for an IPv6 address.)
>
>    Relays (and bridges) currently use the first advertised ORPort IPv6 address
>    as their IPv6 address. We propose to use the first advertised IPv4 ORPort
>    address in a similar way, for consistency.
>
>    Therefore, this change may affect existing relay IPv4 addresses. We expect
>    that a small number of relays may change IPv4 address, from a guessed IPv4
>    address, to their first advertised IPv4 ORPort address.
>
>    In rare cases, relays may have been using non-advertised ORPorts for their
>    addresses. This change may also change their addresses.
>
>    We propose ignoring private configured ORPort addresses on public tor
>    networks. (Binding to private ORPort addresses is supported, even on public
>    tor networks, for relays that use NAT to reach the Internet.) If an ORPort
>    address is private, address resolution should go to the next step.
>
> 3.2.3. Use the Advertised DirPort IPv4 Address
>
>    Next, we propose that relays use the first advertised DirPort IPv4 address,
>    as configured in their torrc.

I think that we could omit this method; it seems unlikely to me that
anybody is going to configure an advertised DirPort address but not an
advertised ORPort address.  In the long run, I think we want DirPorts
to disappear entirely as part of our official protocol.


> 3.2.4. Use Local Interface IPv6 Address
>
>    Next, we propose that relays (and bridges) use publicly routable addresses
>    from the OS interface addresses or routing table, as their IPv4 and IPv6
>    addresses.
>
>    Tor has local interface address resolution functions, which support most
>    major OSes. Tor uses these functions to guess its IPv4 address. We propose
>    using them to also guess tor's IPv6 address.
>
>    We also propose modifying the address resolution order, so interface
>    addresses are used before the local hostname. This decision is based
>    on our principles: interface addresses are local, trusted, and reliable;
>    hostname lookups may be remote, untrusted, and unreliable.
>
>    Some developer documentation also recommends using interface addresses,
>    rather than resolving the host's own hostname. For example, on recent
>    versions of macOS, the man pages tell developers to use interface addresses
>    (getifaddrs) rather than look up the host's own hostname (gethostname and
>    getaddrinfo). Unfortunately, these man pages don't seem to be available
>    online, except for short quotes (see [getaddrinfo man page] for the
>    relevant quote).
>
>    If the local interface addresses are unavailable, tor opens a
>    self-connected UDP socket to a publicly routable address, but doesn't
>    actually send any packets. Instead, it uses the socket APIs to discover
>    the interface address for the socket.

I don't understand in which sense this socket is "self-connected" --
maybe "unused" or something?  Also I'd suggest that Tor should use an
authority's IP address for this purpose.  Currently, we use 18.0.0.1,
which tends to confuse people who are looking at their firewall's
warnings.
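
For readers unfamiliar with the technique, here is a minimal sketch of
it in Python (not tor's implementation). The function name
local_address_for() and the default port are assumptions made for
illustration; the destination is a parameter because which address
tor should use is exactly the open question here.

```python
import socket


def local_address_for(dest: str, port: int = 9):
    """Return the local source address the OS would use to reach dest.

    connect() on a UDP socket only selects a route and binds a local
    address; no packet is sent. Returns None if there is no route to
    dest or the address family is unsupported.
    """
    family = socket.AF_INET6 if ":" in dest else socket.AF_INET
    try:
        with socket.socket(family, socket.SOCK_DGRAM) as sock:
            sock.connect((dest, port))
            # getsockname() reports the address chosen for this route.
            return sock.getsockname()[0]
    except OSError:
        return None
```

Behind NAT the result is typically a private address, so this is a
guess about the local interface rather than something that can be
advertised as-is.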

>    Tor already ignores private IPv4 interface addresses on public relays.
>    (Binding to private DirPort addresses is supported, for networks that use
>    NAT.) We propose to also ignore private IPv6 interface addresses. If all
>    IPv4 or IPv6 interface addresses are private, address resolution should go
>    to the next step.
>
> 3.2.5. Use Own Hostname IPv6 Addresses
>
>    Next, we propose that relays (and bridges) get their local hostname, look
>    up its addresses, and use them as its IPv4 and IPv6 addresses.
>
>    We propose to use the same underlying lookup functions to look up the IPv4
>    and IPv6 addresses for:
>      * the Address torrc option (see section 3.2.1), and
>      * the local hostname.
>    However, OS APIs typically only return a single hostname.
>
>    Even though the hostname lookup may use remote DNS, we propose to use it on
>    directory authorities, to maintain compatibility with current
>    configurations. Even if it is remote, we expect the configured DNS to be
>    somewhat trusted by the operator.

Do you mean to say "directory authorities" here? I don't understand that part.

>    The hostname lookup should ignore private addresses on public relays. If
>    multiple IPv4 or IPv6 addresses are returned, the first public address from
>    each family should be used. If all IPv4 or IPv6 hostname addresses are
>    private, address resolution should go to the next step.


 [...]
> 3.2.6. Use Directory Header IPv6 Addresses
>
>    Finally, we propose that relays get their IPv4 and IPv6 addresses from the
>    X-Your-Address-Is HTTP header in tor directory documents. To support this
>    change, we propose that relays start fetching directory documents over IPv4
>    and IPv6.

Can we specify use of NETINFO cells additionally or instead?  Unlike
DirPort connections, ORPort connections are authenticated, so we know
who is telling us what our address is.

>    We propose that bridges continue to only fetch directory documents over
>    IPv4, because they try to imitate clients. (Most clients only fetch
>    directory documents over IPv4, a few clients are configured to only fetch
>    over IPv6.) When client behaviour changes to use both IPv4 and IPv6 for
>    directory fetches, bridge behaviour can also change to match. (See
>    section 3.4.1 and [Proposal 306: Client Auto IPv6 Connections].)
>
>    We propose that directory authorities should ignore addresses in directory
>    headers. Allowing other authorities (or relays?) to change a directory
>    authority's published IP address may lead to security issues. Instead,
>    if interface and hostname lookups fail, tor should stop address resolution,
>    and return a permanent error. (And issue a log to the operator, see below.)

I suggest that we simplify the whole directory authority logic and say
that authorities must have configured Address lines, or nothing.

[...]
> 3.3. Consequential Tor Client Changes
>
>    We do not propose any required client address resolution changes at this
>    time.
>
>    However, clients will use the updated address resolution functions to
>    detect when they are on a new connection, and therefore need to rotate
>    their TLS keys.

Do clients have meaningful TLS keys any more, now that they have
dropped client support for the v1 link protocol?

(This is just a side question -- clients should still have a working
ip_address_changed() function.)

 [...]
> 3.5. Optional Efficiency and Reliability Changes
>
>    We propose some optional changes for efficiency and reliability, and
>    describe their impact.
>
>    Some of these changes may be more appropriate in future releases, or
>    along with other proposed features.
>
> 3.5.1. Only Use Authenticated Directory Header IPv4 and IPv6 Addresses
>
>    We propose this optional change, to improve relay address accuracy and
>    reliability.

I am +1 here, with a proviso that we should be able to use NETINFO cells.

 [...]
> 3.5.5. Add IPv6 Support to AuthDirMaxServersPerAddr
>
>    We propose this optional change, to improve the health of the network, by
>    rejecting too many relays on the same IPv6 address.
>
>    Modify get_possible_sybil_list() so it takes an address family argument,
>    and returns a list of IPv4 or IPv6 sybils.
>
>    Use the modified get_possible_sybil_list() to exclude relays from the
>    authority's vote, if there are more than AuthDirMaxServersPerAddr on the
>    same IPv4 or IPv6 address.
>
>    Since these relay exclusions happen at voting time, they do not require a
>    new consensus method.

Since it's trivial for one host to have a staggering number of IPv6
addresses, should this specify a /80 or /96 or something as being
sybil-like?
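
As a rough illustration of prefix-based grouping, here is a sketch in
Python. The function name possible_ipv6_sybils(), the /96 default, and
the limit of 2 are all placeholders chosen for the example; they are
not values from the proposal or from tor's code.

```python
import ipaddress
from collections import defaultdict


def possible_ipv6_sybils(addresses, prefix_len=96, max_per_prefix=2):
    """Group IPv6 addresses by prefix and return the crowded prefixes.

    prefix_len and max_per_prefix are placeholder values.
    Returns {prefix: [addresses]} for every prefix that holds more
    than max_per_prefix addresses.
    """
    groups = defaultdict(list)
    for addr in addresses:
        # strict=False masks off the host bits to get the covering prefix.
        net = ipaddress.ip_network(f"{addr}/{prefix_len}", strict=False)
        groups[net].append(addr)
    return {
        str(net): addrs
        for net, addrs in groups.items()
        if len(addrs) > max_per_prefix
    }
```

Changing prefix_len is the only thing needed to compare candidate
prefix sizes against real relay data before picking one.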

 [...]
> 3.5.7. Add IPv6 Support Using gethostbyname2()

I agree that this change should be unnecessary; I'd suggest that we
not do it and just require getaddrinfo() for meaningful IPv6
resolution.

Alternatively, we could use libevent's DNS.
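
For the getaddrinfo() route, per-family resolution could look roughly
like the following Python sketch, which wraps the same underlying
call. The helper name resolve_first() is hypothetical.

```python
import socket


def resolve_first(host: str, family: int):
    """Return the first address of the given family for host, or None.

    family is socket.AF_INET or socket.AF_INET6. A lookup failure is
    reported as None rather than raised, so a caller can move on to
    the next resolution method.
    """
    try:
        infos = socket.getaddrinfo(
            host, None, family=family, type=socket.SOCK_STREAM
        )
    except socket.gaierror:
        return None
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the address string.
    return infos[0][4][0] if infos else None
```

Calling it once with AF_INET and once with AF_INET6 gives one
candidate per family from a single hostname.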

> 3.5.8. Change Relay OutboundBindAddress Defaults
>
>    We propose this optional change, to improve the reliability of
>    IP address-based filters in tor.
>
>    For example, the tor network treats relay IP addresses differently when:
>      * resisting denial of service, and
>      * selecting canonical, long-term connections.
>    (See [Ticket 33018: Dir auths using an unsustainable 400+ mbit/s] for the
>    initial motivation for this change: resisting significant bandwidth load
>    on directory authorities.)
>
>    Now that tor knows its own addresses, we propose that relays (and bridges)
>    set their IPv4 and IPv6 OutboundBindAddress to these discovered addresses,
>    by default. If binding fails, tor should fall back to an unbound socket.

I think this change might be unnecessary, but it shouldn't hurt.  I'd
suggest not prioritizing it very high.

 [...]

In general, this plan above looks solid.

I have a suggestion before we get into the implementation, though: I
think we should, for each check, make sure that we write down _when_
it happens, what makes it happen, and where we store the result.  That
is, some of these checks are things we need to launch (like looking up
our own hostname), whereas others will happen passively pretty often
(like connecting to a directory authority).  Of the ones that we need
to launch, some will happen only when other methods have failed,
whereas some will happen on startup.  Some are things that can time
out, whereas others aren't.  Writing this all down will make sure that
we aren't making our state machine more complex than it needs to be.

IMO, we should record the status of all possible IP lookup methods,
with "not yet tried" being a possible status: it will help us keep our
implementation and our logging simple -- or at least, as simple as
can be.
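
One possible shape for that record, sketched in Python rather than
tor's C. Every name below (LookupStatus, LookupResult, METHODS, and
the method labels) is hypothetical; the order of METHODS follows the
resolution order in the proposal, leaving out the DirPort step.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class LookupStatus(Enum):
    NOT_YET_TRIED = "not yet tried"
    IN_PROGRESS = "in progress"
    SUCCEEDED = "succeeded"
    FAILED = "failed"


@dataclass
class LookupResult:
    status: LookupStatus = LookupStatus.NOT_YET_TRIED
    address: Optional[str] = None


# Hypothetical method labels, highest priority first.
METHODS = ["address_option", "orport", "interface", "hostname", "dir_header"]


def new_state():
    """Every method starts out as 'not yet tried'."""
    return {method: LookupResult() for method in METHODS}


def best_address(state):
    """Return (method, address) for the highest-priority success, or None."""
    for method in METHODS:
        result = state[method]
        if result.status is LookupStatus.SUCCEEDED:
            return method, result.address
    return None
```

A relay would presumably keep one such table per address family, and
logging can then report the status of every method from one place.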

cheers,
-- 
Nick