Source: unbound
Version: 1.4.17-3+deb7u2
Severity: normal
Tags: upstream

Dear Maintainer,

Unbound's current implementation of interface-automatic always forces an 
exit interface for the reply datagrams, causing reply packets to be 
dropped on multi-homed systems in the presence of asymmetric routing. We were
affected by this running unbound on a router with multiple interfaces in 
a meshed network.

Long explanation follows:

To provide single-socket UDP multihoming, unbound uses the ancillary
IP_PKTINFO data received during recvmsg(2) and passes it as-is
(including ipi_ifindex) to sendmsg(2). The in_pktinfo structure contains
the following members:

    struct in_pktinfo {
        unsigned int   ipi_ifindex;  /* Interface index */
        struct in_addr ipi_spec_dst; /* Local address */
        struct in_addr ipi_addr;     /* Header Destination
                                        address */
    };

At recvmsg(2) time ipi_ifindex contains the ifindex of the interface the packet
arrived on, while ipi_spec_dst contains the local interface address the packet
matched. Regarding sendmsg(2), man 7 ip states:

    If  IP_PKTINFO  is  passed  to  sendmsg(2)  and ipi_spec_dst  is  not
    zero, then it is used as the local source address for the routing table
    lookup and for setting up IP source route options.  When ipi_ifindex is
    not zero, the primary local address of the interface  specified  by
    the  index overwrites ipi_spec_dst for the routing table lookup.
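
For readers less familiar with this API, the receive side can be sketched
as below. This is only an illustration, not unbound's code: find_pktinfo()
is a hypothetical helper name, and the socket is assumed to have had the
IP_PKTINFO option enabled with setsockopt(2) so that recvmsg(2) delivers
the ancillary data.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* makes glibc expose struct in_pktinfo */
#endif
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/*
 * Hypothetical helper (not part of unbound): copy the IP_PKTINFO
 * ancillary data of a message filled in by recvmsg(2) into *out.
 * Returns 1 if an IP_PKTINFO control message was found, 0 otherwise.
 */
static int find_pktinfo(struct msghdr *msg, struct in_pktinfo *out)
{
    struct cmsghdr *cmsg;

    assert(msg != NULL && out != NULL);

    /* Walk all control messages attached to the datagram. */
    for (cmsg = CMSG_FIRSTHDR(msg); cmsg != NULL;
         cmsg = CMSG_NXTHDR(msg, cmsg)) {
        if (cmsg->cmsg_level == IPPROTO_IP &&
            cmsg->cmsg_type == IP_PKTINFO) {
            /* CMSG_DATA may be unaligned for the struct, so copy it. */
            memcpy(out, CMSG_DATA(cmsg), sizeof(*out));
            return 1;
        }
    }
    return 0;
}
```

After a successful call, out->ipi_ifindex holds the index of the interface
the query arrived on and out->ipi_spec_dst the local address it matched.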

So it appears as if passing ipi_ifindex should only affect cases where source
routing is performed. However, the actual Linux kernel implementation
(as of 4.0-rc6) does the following (__ip_route_output_key() in
net/ipv4/route.c):

    if (fib_lookup(net, fl4, &res)) {
            res.fi = NULL;
            res.table = NULL;
            if (fl4->flowi4_oif) {
                    /* Apparently, routing tables are wrong. Assume,
                       that the destination is on link.

                       WHY? DW.
                       Because we are allowed to send to iface
                       even if it has NO routes and NO assigned
                       addresses. When oif is specified, routing
                       tables are looked up with only one purpose:
                       to catch if destination is gatewayed, rather than
                       direct. Moreover, if MSG_DONTROUTE is set,
                       we send packet, ignoring both routing tables
                       and ifaddr state. --ANK

                       We could make it even if oif is unknown,
                       likely IPv6, but we do not.
                     */

                    if (fl4->saddr == 0)
                            fl4->saddr = inet_select_addr(dev_out, 0,
                                                          RT_SCOPE_LINK);
                    res.type = RTN_UNICAST;
                    goto make_route;
            }

In effect, a non-zero ipi_ifindex overrides any routing table decision
and forces the packet out of that very interface. Furthermore, if there
is no routing table entry for the destination via that interface, the
destination is assumed to be on-link and the packet is not routed via a
gateway. Note that no error is returned to userspace: if the destination
does not respond to an ARP request on that very link, the packet is
silently dropped.

For unbound this means that reply packets on interface-automatic
sockets will always attempt to leave the system through the same
(physical/logical) interface the query came in on. This is fine for a
single-homed server. On multi-homed systems, however (e.g. a router
with multiple interfaces in a meshed network), there will be cases of
asymmetric routing where the return route to the client goes through a
different interface than the one the query arrived on, and unbound's
replies to clients that are not directly connected will be silently
dropped.

Since using the correct source address is all that interface-automatic
is about, we should really pass an ipi_ifindex set to 0 to sendmsg(2)
and let the system's routing tables decide the actual interface that
should be used to send the reply, while still retaining the correct
source address in ipi_spec_dst.

The attached patch fixes this.
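
Outside of unbound, the idea can be sketched roughly as follows. This is
an illustration under stated assumptions, not the patch itself:
set_reply_pktinfo() is a hypothetical helper name, and the caller is
assumed to supply a control buffer that is suitably aligned and at least
CMSG_SPACE(sizeof(struct in_pktinfo)) bytes long.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* makes glibc expose struct in_pktinfo */
#endif
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/*
 * Hypothetical helper (not unbound's actual code): attach an IP_PKTINFO
 * control message to msg that keeps the local address of the query as
 * the source address, but leaves the choice of output interface to the
 * routing tables by clearing ipi_ifindex.
 */
static void set_reply_pktinfo(struct msghdr *msg, void *control,
                              size_t controllen,
                              const struct in_pktinfo *received)
{
    struct in_pktinfo reply = *received;
    struct cmsghdr *cmsg;

    assert(controllen >= CMSG_SPACE(sizeof(reply)));

    /* Do not force the exit interface; keep ipi_spec_dst untouched. */
    reply.ipi_ifindex = 0;

    memset(control, 0, CMSG_SPACE(sizeof(reply)));
    msg->msg_control = control;
    msg->msg_controllen = CMSG_SPACE(sizeof(reply));

    cmsg = CMSG_FIRSTHDR(msg);
    cmsg->cmsg_level = IPPROTO_IP;
    cmsg->cmsg_type = IP_PKTINFO;
    cmsg->cmsg_len = CMSG_LEN(sizeof(reply));
    memcpy(CMSG_DATA(cmsg), &reply, sizeof(reply));
}
```

A msghdr prepared this way is then handed to sendmsg(2) as usual; the
received in_pktinfo is left unmodified, so the caller can still log or
inspect the original ifindex.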

Regards,
Apollon

-- System Information:
Debian Release: 8.0
  APT prefers testing
  APT policy: (500, 'testing'), (90, 'unstable'), (1, 'experimental')
Architecture: amd64 (x86_64)
Foreign Architectures: i386

Kernel: Linux 3.16.0-4-amd64 (SMP w/4 CPU cores)
Locale: LANG=el_GR.UTF-8, LC_CTYPE=el_GR.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)
From 588448143170de218fa68fdd01cbbad4d7a2179c Mon Sep 17 00:00:00 2001
From: Apollon Oikonomopoulos <apoi...@debian.org>
Date: Wed, 1 Apr 2015 16:36:50 +0300
Subject: [PATCH] Fix interface-automatic on multihomed systems

To provide single-socket UDP multihoming, unbound uses the ancillary
IP_PKTINFO data received during recvmsg(2) and passes it as-is
(including ipi_ifindex) to sendmsg(2). The in_pktinfo structure contains
the following members:

    struct in_pktinfo {
        unsigned int   ipi_ifindex;  /* Interface index */
        struct in_addr ipi_spec_dst; /* Local address */
        struct in_addr ipi_addr;     /* Header Destination
                                        address */
    };

At recvmsg(2) time ipi_ifindex contains the ifindex of the interface the packet
arrived on, while ipi_spec_dst contains the local interface address the packet
matched. Regarding sendmsg(2), man 7 ip states:

    If  IP_PKTINFO  is  passed  to  sendmsg(2)  and ipi_spec_dst  is  not
    zero, then it is used as the local source address for the routing table
    lookup and for setting up IP source route options.  When ipi_ifindex is
    not zero, the primary local address of the interface  specified  by
    the  index overwrites ipi_spec_dst for the routing table lookup.

So it appears as if passing ipi_ifindex should only affect cases where source
routing is performed. However, the actual Linux kernel implementation
(as of 4.0-rc6) does the following (__ip_route_output_key() in
net/ipv4/route.c):

    if (fib_lookup(net, fl4, &res)) {
            res.fi = NULL;
            res.table = NULL;
            if (fl4->flowi4_oif) {
                    /* Apparently, routing tables are wrong. Assume,
                       that the destination is on link.

                       WHY? DW.
                       Because we are allowed to send to iface
                       even if it has NO routes and NO assigned
                       addresses. When oif is specified, routing
                       tables are looked up with only one purpose:
                       to catch if destination is gatewayed, rather than
                       direct. Moreover, if MSG_DONTROUTE is set,
                       we send packet, ignoring both routing tables
                       and ifaddr state. --ANK

                       We could make it even if oif is unknown,
                       likely IPv6, but we do not.
                     */

                    if (fl4->saddr == 0)
                            fl4->saddr = inet_select_addr(dev_out, 0,
                                                          RT_SCOPE_LINK);
                    res.type = RTN_UNICAST;
                    goto make_route;
            }

In effect, a non-zero ipi_ifindex overrides any routing table decision
and forces the packet out of that very interface. Furthermore, if there
is no routing table entry for the destination via that interface, the
destination is assumed to be on-link and the packet is not routed via a
gateway. Note that no error is returned to userspace: if the destination
does not respond to an ARP request on that very link, the packet is
silently dropped.

For unbound this means that reply packets on interface-automatic
sockets will always attempt to leave the system through the same
(physical/logical) interface the query came in on. This is fine for a
single-homed server. On multi-homed systems, however (e.g. a router
with multiple interfaces in a meshed network), there will be cases of
asymmetric routing where the return route to the client goes through a
different interface than the one the query arrived on, and unbound's
replies to clients that are not directly connected will be silently
dropped.

Since using the correct source address is all that interface-automatic
is about, we should really pass an ipi_ifindex set to 0 to sendmsg(2)
and let the system's routing tables decide the actual interface that
should be used to send the reply, while still retaining the correct
source address in ipi_spec_dst.
---
 util/netevent.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/util/netevent.c b/util/netevent.c
index c7ed30e..b884d4c 100644
--- a/util/netevent.c
+++ b/util/netevent.c
@@ -504,6 +504,8 @@ comm_point_send_udp_msg_if(struct comm_point *c, sldns_buffer* packet,
 		cmsg->cmsg_type = IP_PKTINFO;
 		memmove(CMSG_DATA(cmsg), &r->pktinfo.v4info,
 			sizeof(struct in_pktinfo));
+		/* unset the ifindex to not bypass the routing tables */
+		((struct in_pktinfo *) CMSG_DATA(cmsg))->ipi_ifindex = 0;
 		cmsg->cmsg_len = CMSG_LEN(sizeof(struct in_pktinfo));
 #elif defined(IP_SENDSRCADDR)
 		msg.msg_controllen = CMSG_SPACE(sizeof(struct in_addr));
@@ -524,6 +526,8 @@ comm_point_send_udp_msg_if(struct comm_point *c, sldns_buffer* packet,
 		cmsg->cmsg_type = IPV6_PKTINFO;
 		memmove(CMSG_DATA(cmsg), &r->pktinfo.v6info,
 			sizeof(struct in6_pktinfo));
+		/* unset the ifindex to not bypass the routing tables */
+		((struct in6_pktinfo *) CMSG_DATA(cmsg))->ipi6_ifindex = 0;
 		cmsg->cmsg_len = CMSG_LEN(sizeof(struct in6_pktinfo));
 	} else {
 		/* try to pass all 0 to use default route */
-- 
2.1.4
