It is possible to bind a socket to a particular network device using
SO_BINDTODEVICE. If a socket is bound to a device, then
sk->bound_dev_if is set to dev->ifindex, otherwise sk->bound_dev_if is
set to 0.
bound_dev_if is used in ip_output_slow as part of the hash code
calculation. Here, oif is set to sk->bound_dev_if:

    hash = rt_hash_code(daddr, saddr^(oif<<5), tos);
    err = rt_intern_hash(hash, rth, rp);

So under most circumstances, hash=rt_hash_code(daddr, saddr, tos), but
if the socket is explicitly bound to a device, rt_hash_code(daddr,
saddr^(oif<<5), tos) is used instead.
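
To make the difference between the two keys concrete, here is a small sketch. toy_hash_code is a made-up stand-in for rt_hash_code (its mixing is my assumption, not the kernel's), and output_path_hash is a hypothetical helper name. Only the saddr^(oif<<5) step is taken from the code quoted above.

```c
#include <stdint.h>

#define TOY_HASH_MASK 0xFFu

/* Stand-in for rt_hash_code(); the folding here is an assumption made
 * for illustration and is not the kernel's actual function. */
static unsigned toy_hash_code(uint32_t daddr, uint32_t saddr, uint8_t tos)
{
    uint32_t hash = daddr ^ saddr ^ tos;

    hash ^= hash >> 16;
    hash ^= hash >> 8;
    return hash & TOY_HASH_MASK;
}

/* Hash as computed on the output path: oif is sk->bound_dev_if,
 * which is 0 for a socket that is not bound to a device. */
static unsigned output_path_hash(uint32_t daddr, uint32_t saddr,
                                 uint8_t tos, int oif)
{
    return toy_hash_code(daddr, saddr ^ ((uint32_t)oif << 5), tos);
}
```

With oif == 0 the XOR is a no-op and the result equals toy_hash_code(daddr, saddr, tos). With a small non-zero oif this toy hash changes, so the entry is filed under a different bucket.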
If, when data is sent out on a socket, an ICMP "fragmentation needed"
message is returned, the MTU for that connection needs to be updated.
ip_rt_frag_needed computes the hash as rt_hash_code(daddr, skeys[i], tos),
where skeys[2] contains {iph->saddr, 0}.
The network device is not mentioned in ip_rt_frag_needed at all, so
for sockets bound to devices, the wrong entry in the hash table is
looked up and the MTU is not updated. PMTU black-hole detection will
cover up most of the effects of this error, so it isn't highly visible.
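
The suspected miss can be reproduced in a toy model. Everything named toy_* below is my own invention for illustration: the hash, the one-entry-per-bucket table, and the match test are simplifying assumptions and not the kernel's code. Only the two key computations follow the description above.

```c
#include <stdint.h>

#define TOY_BUCKETS 256u

/* One cached route per bucket: a deliberately simplified stand-in
 * for the kernel's routing cache. */
struct toy_route {
    int in_use;
    uint32_t daddr, saddr;
    unsigned pmtu;
};

static struct toy_route toy_table[TOY_BUCKETS];

/* Stand-in for rt_hash_code(); the mixing is an assumption. */
static unsigned toy_hash_code(uint32_t daddr, uint32_t saddr, uint8_t tos)
{
    uint32_t hash = daddr ^ saddr ^ tos;

    hash ^= hash >> 16;
    hash ^= hash >> 8;
    return hash & (TOY_BUCKETS - 1);
}

/* Output path: the bucket is chosen with oif mixed into the source key. */
static void toy_route_output(uint32_t daddr, uint32_t saddr, uint8_t tos,
                             int oif, unsigned mtu)
{
    struct toy_route *rt =
        &toy_table[toy_hash_code(daddr, saddr ^ ((uint32_t)oif << 5), tos)];

    rt->in_use = 1;
    rt->daddr = daddr;
    rt->saddr = saddr;
    rt->pmtu = mtu;
}

/* ICMP path as described: tries skeys = {saddr, 0} and never mixes in
 * a device index. Returns the number of entries updated. */
static int toy_frag_needed(uint32_t daddr, uint32_t saddr, uint8_t tos,
                           unsigned new_mtu)
{
    uint32_t skeys[2];
    int i, updated = 0;

    skeys[0] = saddr;
    skeys[1] = 0;
    for (i = 0; i < 2; i++) {
        struct toy_route *rt =
            &toy_table[toy_hash_code(daddr, skeys[i], tos)];

        if (rt->in_use && rt->daddr == daddr && rt->saddr == skeys[i]) {
            rt->pmtu = new_mtu;
            updated++;
        }
    }
    return updated;
}

/* Read back the cached PMTU from the bucket the output path would use;
 * returns 0 if that bucket is empty. */
static unsigned toy_cached_pmtu(uint32_t daddr, uint32_t saddr, uint8_t tos,
                                int oif)
{
    struct toy_route *rt =
        &toy_table[toy_hash_code(daddr, saddr ^ ((uint32_t)oif << 5), tos)];

    return rt->in_use ? rt->pmtu : 0;
}
```

In this model, an entry created with oif == 0 is found and updated by toy_frag_needed, while an entry created with oif == 2 is left at its old PMTU because neither of the two keys tried leads to its bucket.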
I haven't done any experiments to confirm that the bug exists, so if
I'm wrong about how the code works, I'd appreciate any feedback.
Peter