Re: [Babel-users] Babel MAC auth fails due to packet reordering

2022-10-18 Thread Juliusz Chroboczek
> A quick skimming of RFC 7298 suggests the PC is indeed intended to be
> per-interface without taking the {mult,un}icast bit into account. Is this
> an oversight in the spec?

Hi Daniel,

I've just given a talk about this issue, and while reviewing our
correspondence in preparation, I realised that your initial mail already
contained the solution that we finally implemented and are now in the
process of standardising.

You are currently acknowledged in the draft¹, but only as the person who
noticed the issue.  My apologies for the oversight.  Please be assured
that I'll change this before publication so as to properly credit you as
suggesting the solution.

¹ https://www.ietf.org/archive/id/draft-ietf-babel-mac-relaxed-02.html#name-acknowledgments

Sorry for the oversight,

-- Juliusz

___
Babel-users mailing list
Babel-users@alioth-lists.debian.net
https://alioth-lists.debian.net/cgi-bin/mailman/listinfo/babel-users


Re: [Babel-users] Babel MAC auth fails due to packet reordering

2022-05-09 Thread Daniel Gröber
Hi,

On Wed, May 04, 2022 at 02:16:08PM +0200, Juliusz Chroboczek wrote:
> We still need to understand why you're getting systematic packet
> reordering. 

See the two attached pcaps, babel-reorder-sender is on the AP wlanX device
and babel-reorder-receiver is on the other side of the wireless link.

I'm also including babel packets in both directions as Toke asked
(wireshark filter: babel || icmpv6).

You can see the reordering at sender packet no 12 vs. receiver packet no
161.

Toke, I'm running without the DSCP iptables rule but I'm seeing CS0 anyway
so it's probably not that, right?

--Daniel



babel-reorder-sender.pcapng
Description: Binary data


babel-reorder-receiver.pcapng
Description: Binary data


Re: [Babel-users] Babel MAC auth fails due to packet reordering

2022-05-06 Thread Toke Høiland-Jørgensen
Juliusz Chroboczek writes:

> CC-ing babel@ietf.  List, Daniel has reported that multicast packets are
> delayed in his network by up to 200ms, which breaks Babel-MAC's PC check.
> Toke has determined that the issue is with WiFi powersave, which is
> unfortunately not something we can control.  Toke has proposed a patch
> against his implementation of Babel that implements window-based PC
> validation, in the style of RFC 4303 Section 3.4.3.
>
>> I took a shot at implementing window-based PC verification in Bird,
>> patch below (compile-tested only);
>
> That should work, although I fear that a window size of 64 is not enough,
> especially since RFC 8967 Section 4.2 allows increasing the PC by more
> than one.  So we'd either need to remove that latitude from the spec, or
> require the use of a more complicated data structure.

My PoC uses a window size of 32, actually, but could be trivially
extended to 64 (just by switching from a u32 to a u64 for storing the
bitmap), and a bit less trivially to a larger window by using a bitmap
spanning several u64s (or some other data structure).
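
A minimal sketch of the kind of bitmap window described above, written in
Python rather than taken from the Bird patch (which is not reproduced in
this thread). The class name `ReplayWindow` and its methods are
hypothetical; the sketch ignores PC wrap-around and the index, and it
assumes the MAC has already been verified before `accept` is called.

```python
class ReplayWindow:
    """Sliding-window replay check in the style of RFC 4303 Section 3.4.3."""

    def __init__(self, size=32):
        self.size = size      # window width in packets (32 or 64 in the thread)
        self.top = None       # highest PC accepted so far
        self.bitmap = 0       # bit i set => PC (top - i) was already accepted

    def accept(self, pc):
        """Return True if pc is fresh; call only after the MAC has verified."""
        if self.top is None:
            # First packet seen (in Babel-MAC this follows a challenge reply).
            self.top, self.bitmap = pc, 1
            return True
        if pc > self.top:
            # New highest PC: slide the window forward and mark bit 0.
            shift = pc - self.top
            mask = (1 << self.size) - 1
            self.bitmap = ((self.bitmap << shift) | 1) & mask
            self.top = pc
            return True
        offset = self.top - pc
        if offset >= self.size:
            return False      # older than the window: reject
        if self.bitmap & (1 << offset):
            return False      # already seen: replay
        self.bitmap |= 1 << offset
        return True


window = ReplayWindow(size=32)
# Receiver-side order from the trace at the start of the thread, then replays.
results = [window.accept(pc) for pc in (1452, 1454, 1453, 1453, 1454)]
print(results)
```

With a 32-entry window, the reordered Hello (PC 1453 arriving after 1454)
is accepted once, and the repeated copies of 1453 and 1454 are rejected.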

> But I've been thinking the issue is that we require a single strictly
> monotonic sequence of PCs that mixes up unicast and multicast packets.
> What about relaxing the requirement so that the sequence of unicast
> packets is monotonic, the sequence of multicast packets is monotonic, but
> the two sequences can grow independently?  This will still prevent replay:
> a unicast packet cannot be replayed as unicast, due to the
> monotonicity condition, and it cannot be replayed as multicast, since the
> MAC covers the pseudo-header.
>
> More precisely, I propose that we maintain two distinct "last PC" fields
> in the neighbour table, called PCu and PCm.  These behave as follows:
>
>   - when we receive a challenge reply, we set both PCu and PCm to the
>     value received in the challenge reply;
>   - when we receive a normal packet, we compare its PC against *either*
>     PCu or PCm, depending on whether it's unicast or multicast;
>   - if the packet is accepted, we update *either* PCu or PCm, leaving the
>     other value unchanged.
>
> (We could generalise that to having one PC value per destination address,
> but I'm not sure it's useful.)

Hmm, I certainly see where you're coming from; having separate sequence
numbers for unicast/multicast would neatly sidestep this particular
problem. However, one drawback is that it's not straightforwardly
backward compatible. I.e., if a sender starts using separate sequence
number spaces for unicast and multicast, they would become incompatible
with receivers implementing RFC8967 as it is written today, even on
networks that exhibit no reordering. Whereas simply having a reorder
window is more of a "be lenient in what you accept" thing on top of the
existing spec (i.e., a babel speaker implementing a replay window can
interoperate fine with one that doesn't, except for reordering). Also,
packets could get reordered for other reasons, not necessarily related
to whether they are unicast or multicast; a window-based approach would
deal with that as well.

As for the size of the window (setting aside the case where an
implementation increases the PC by more than one for every packet), I
guess we'd need it to be large enough to contain a full routing table
dump. A window of 64 packets can fit several thousand routes even in the
worst case with no compression; so I'm wondering if this isn't enough
for most deployments?
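
A rough back-of-the-envelope version of that estimate, as a sketch only.
The byte counts are assumptions and not figures from this thread: a
1500-byte MTU, 40 + 8 bytes of IPv6 and UDP headers, a 4-byte Babel packet
header, 64 bytes reserved for the PC and MAC TLVs, and 28 bytes per
uncompressed IPv6 Update TLV. The function name `routes_per_window` is
hypothetical.

```python
def routes_per_window(window=64, mtu=1500, headers=40 + 8 + 4,
                      auth_overhead=64, update_tlv=28):
    """Estimate how many uncompressed routes fit in one replay window."""
    payload = mtu - headers - auth_overhead   # bytes left for Update TLVs
    per_packet = payload // update_tlv        # whole Update TLVs per packet
    return window * per_packet                # one PC is consumed per packet


estimate = routes_per_window()
print(estimate)  # 3136 under the assumptions above
```

Changing `update_tlv` to account for a Router-Id TLV in front of every
update lowers the figure, but it stays in the low thousands for a 64-packet
window.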

-Toke



Re: [Babel-users] Babel MAC auth fails due to packet reordering

2022-05-06 Thread Juliusz Chroboczek
CC-ing babel@ietf.  List, Daniel has reported that multicast packets are
delayed in his network by up to 200ms, which breaks Babel-MAC's PC check.
Toke has determined that the issue is with WiFi powersave, which is
unfortunately not something we can control.  Toke has proposed a patch
against his implementation of Babel that implements window-based PC
validation, in the style of RFC 4303 Section 3.4.3.

> I took a shot at implementing window-based PC verification in Bird,
> patch below (compile-tested only);

That should work, although I fear that a window size of 64 is not enough,
especially since RFC 8967 Section 4.2 allows increasing the PC by more
than one.  So we'd either need to remove that latitude from the spec, or
require the use of a more complicated data structure.

But I've been thinking the issue is that we require a single strictly
monotonic sequence of PCs that mixes up unicast and multicast packets.
What about relaxing the requirement so that the sequence of unicast
packets is monotonic, the sequence of multicast packets is monotonic, but
the two sequences can grow independently?  This will still prevent replay:
a unicast packet cannot be replayed as unicast, due to the
monotonicity condition, and it cannot be replayed as multicast, since the
MAC covers the pseudo-header.

More precisely, I propose that we maintain two distinct "last PC" fields
in the neighbour table, called PCu and PCm.  These behave as follows:

  - when we receive a challenge reply, we set both PCu and PCm to the
    value received in the challenge reply;
  - when we receive a normal packet, we compare its PC against *either*
    PCu or PCm, depending on whether it's unicast or multicast;
  - if the packet is accepted, we update *either* PCu or PCm, leaving the
    other value unchanged.

(We could generalise that to having one PC value per destination address,
but I'm not sure it's useful.)
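
The three rules above can be sketched as follows. This is a minimal
illustration and not code from babeld or Bird: the `Neighbour` class and
its method names are hypothetical, the index and the challenge exchange
itself are left out, and packets are simply refused until a challenge reply
has been seen.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Neighbour:
    pcu: Optional[int] = None   # last PC accepted on a unicast packet
    pcm: Optional[int] = None   # last PC accepted on a multicast packet

    def on_challenge_reply(self, pc: int) -> None:
        # Rule 1: a challenge reply resets both counters.
        self.pcu = self.pcm = pc

    def accept(self, pc: int, multicast: bool) -> bool:
        # Rule 2: compare against PCm or PCu depending on the destination.
        last = self.pcm if multicast else self.pcu
        if last is None or pc <= last:
            return False
        # Rule 3: update only the counter that was compared.
        if multicast:
            self.pcm = pc
        else:
            self.pcu = pc
        return True


n = Neighbour()
n.on_challenge_reply(1451)
# Receiver-side order from Daniel's trace: unicast, unicast, late multicast.
trace = [(1452, False), (1454, False), (1453, True)]
outcome = [n.accept(pc, mc) for pc, mc in trace]
print(outcome, n.pcu, n.pcm)
```

The late multicast Hello is accepted because it is only compared against
PCm, while a replay of any of the three packets towards the same kind of
destination would still fail the monotonicity check.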

-- Juliusz



Re: [Babel-users] Babel MAC auth fails due to packet reordering

2022-05-05 Thread Daniel Gröber
Hi Toke,

On Thu, May 05, 2022 at 07:14:24PM +0200, Toke Høiland-Jørgensen wrote:
> Could you please try turning off power saving on the client device and
> see if that makes the problem go away? 'iw dev wlan0 set power_save off'
> should do the trick. You'll also need to make sure no other devices are
> connected to the AP while you're testing (or that they also all have
> power save disabled).

Yup, that did the trick. If I kick all the smartphones off my network (and
disable power-save on the babel peer) the problem goes away but comes back
as soon as I reconnect one.

I was wondering why you wanted me to remove all clients but I think I get
it: ordinarily the power save packet buffering would only affect one client
but since we're talking about multicast here the AP (apparently) buffers
those packets if _any_ client is requesting power save.

--Daniel



Re: [Babel-users] Babel MAC auth fails due to packet reordering

2022-05-04 Thread Dave Taht
Multicast is pinned to the beacon in a conventional wifi setup,
usually on a 100ms interval.

Sorry I have no further insights to offer at the moment (5AM here).

On Wed, May 4, 2022 at 5:16 AM Juliusz Chroboczek wrote:
>
> > > 1. Are you running with the "unicast" option set in your config file?
> >
> > Aaah turns out I was, because I had it set in `default` for my wireguard
> > links.
> >
> > Adding `interface enpxx unicast false` magically fixes this. According to
> > the docs this skips sending a duplicate hello to neighbours which would
> > explain why it works.
>
> Not quite.  Babel has three kinds of TLVs:
>
>   - Discovery Hellos, which are always sent over multicast;
>   - requests, which are always sent over unicast;
>   - the bulk of the protocol, which may be sent either over multicast or
>     unicast.
>
> The unicast option controls whether the bulk of the protocol is sent over
> multicast (unicast off) or sent to each peer over unicast (unicast on).
> In your case, Babel was sending
>
>   Hello multicast
>   IHU unicast
>
> The Hello and the IHU were getting reordered, so the Hello was getting
> dropped due to an incorrect packet counter value.  With "unicast false",
> Babel is sending a single aggregated Hello+IHU over multicast, so no
> reordering can happen.
>
> We still need to understand why you're getting systematic packet
> reordering.  If it's something that cannot be avoided, then we will need
> to update the HMAC implementation (and spec!) to maintain a replay window,
> in the style of
>
>   https://datatracker.ietf.org/doc/html/rfc4303#section-3.4.3
>
> > Where are you getting the 200ms number from exactly?
>
> Here:
>
>   10:24:31.056310 IP6 fe80::1.6696 > fe80::c23c:59ff:fe4a:ce46.6696
>   Router Id 02:0d:b9:ff:fe:4e:90:54
>   Update/prefix...
>   PC value 57567 index len 8
>
>   10:24:31.257774 IP6 fe80::1.6696 > ff02::1:6.6696
>   Hello seqno 36696 interval 4.00s
>   PC value 57566 index len 87
>
> It's suspiciously close to 200ms (201.5ms exactly).  Toke, you're the
> world specialist of the Linux WiFi stack -- do you see a hardwired 200ms
> delay somewhere?
>
> -- Juliusz



-- 
FQ World Domination pending: https://blog.cerowrt.org/post/state_of_fq_codel/
Dave Täht CEO, TekLibre, LLC



Re: [Babel-users] Babel MAC auth fails due to packet reordering

2022-05-04 Thread Juliusz Chroboczek
> > 1. Are you running with the "unicast" option set in your config file?
> 
> Aaah turns out I was, because I had it set in `default` for my wireguard
> links.
> 
> Adding `interface enpxx unicast false` magically fixes this. According to
> the docs this skips sending a duplicate hello to neighbours which would
> explain why it works.

Not quite.  Babel has three kinds of TLVs:

  - Discovery Hellos, which are always sent over multicast;
  - requests, which are always sent over unicast;
  - the bulk of the protocol, which may be sent either over multicast or
    unicast.

The unicast option controls whether the bulk of the protocol is sent over
multicast (unicast off) or sent to each peer over unicast (unicast on).
In your case, Babel was sending

  Hello multicast
  IHU unicast

The Hello and the IHU were getting reordered, so the Hello was getting
dropped due to an incorrect packet counter value.  With "unicast false",
Babel is sending a single aggregated Hello+IHU over multicast, so no
reordering can happen.

We still need to understand why you're getting systematic packet
reordering.  If it's something that cannot be avoided, then we will need
to update the HMAC implementation (and spec!) to maintain a replay window,
in the style of

  https://datatracker.ietf.org/doc/html/rfc4303#section-3.4.3

> Where are you getting the 200ms number from exactly?

Here:

  10:24:31.056310 IP6 fe80::1.6696 > fe80::c23c:59ff:fe4a:ce46.6696
  Router Id 02:0d:b9:ff:fe:4e:90:54
  Update/prefix...
  PC value 57567 index len 8

  10:24:31.257774 IP6 fe80::1.6696 > ff02::1:6.6696
  Hello seqno 36696 interval 4.00s
  PC value 57566 index len 87

It's suspiciously close to 200ms (201.5ms exactly).  Toke, you're the
world specialist of the Linux WiFi stack -- do you see a hardwired 200ms
delay somewhere?
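
For reference, the 201.5ms figure is just the difference between the two
timestamps in the dump above. A small sketch of the arithmetic, using only
the Python standard library (the variable names are illustrative):

```python
from datetime import datetime

FMT = "%H:%M:%S.%f"

# Timestamps copied from the two packets in the dump.
unicast_update = datetime.strptime("10:24:31.056310", FMT)   # PC 57567
multicast_hello = datetime.strptime("10:24:31.257774", FMT)  # PC 57566

# The Hello carries the lower PC but arrives this much later.
delay_ms = (multicast_hello - unicast_update).total_seconds() * 1000
print(round(delay_ms, 1))
```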

-- Juliusz



Re: [Babel-users] Babel MAC auth fails due to packet reordering

2022-05-04 Thread Daniel Gröber
Hi Juliusz,

On Wed, May 04, 2022 at 12:20:57PM +0200, Juliusz Chroboczek wrote:
> 1. Are you running with the "unicast" option set in your config file?

Aaah turns out I was, because I had it set in `default` for my wireguard
links.

Adding `interface enpxx unicast false` magically fixes this. According to
the docs this skips sending a duplicate hello to neighbours which would
explain why it works.

--Daniel



Re: [Babel-users] Babel MAC auth fails due to packet reordering

2022-05-04 Thread Daniel Gröber
Hi Toke,

On Wed, May 04, 2022 at 01:02:02PM +0200, Toke Høiland-Jørgensen wrote:
> Daniel, you say you noticed this when you "turned on fq_codel", but also
> that this is only happening over wireless links. Do you mean that you
> explicitly enabled fq_codel on the WiFi links (as opposed to using the
> built-in FQ-CoDel implementation in the WiFi stack), or is there an
> Ethernet hop involved? And how are the interface(s) in question
> configured (station/AP/mesh?) and which WiFi driver is used?

Right, so my router is a separate box, attached to an OpenWrt AP
(ubnt,unifiac-pro) via a switch. So there is an ethernet hop involved. The
AP is in normal infrastructure mode.

What I mean by fq_codel enablement is just setting
`sysctl net.core.default_qdisc=fq_codel` on the router. The AP already had
this set by default.

Given that the problem doesn't happen over ethernet (see below), that's
probably a red herring though.

> Also, you mention the other side is running bird; does the reordering
> only happen with babeld as the sender?

babeld doesn't log auth failures AFAICT but the neighbour cost stays at
infinity there too so I assume it's having the same problem. I'm happy to
debug that too but I think we should take it one problem at a time :)

> I agree with Juliusz that 200ms seems a bit on the high side for such a
> delay, but if the channel is suffering a lot of congestion (not
> necessarily from the same station), I suppose it *could* take that long
> for the scheduling to get around to servicing the multicast queue *and*
> getting an airtime slot (multicast also runs at a very low bit rate, so
> the packet will take up more airtime, which will make it more prone to
> interference and thus retransmissions, causing further delay).

Where are you getting the 200ms number from exactly? I don't think my link
is congested: certainly nothing was going on in my network at the time, and
none of the neighbours around here are using the 5GHz band, so I'd be
surprised if that was it.

I did just try it over ethernet and that isn't showing the same problem,
hmm. So the wireless link is somehow to blame still.

> Of course the diffserv hypothesis is quite easy to test: just disable
> diffserv and see if the problem goes away. I don't think babeld has a
> configuration knob for this, but you could clear it with an iptables
> rule like:
> 
> ip6tables -t mangle -A OUTPUT -o wlan0 -p udp --dport 6696 -j DSCP --set-dscp 0

I tried that with the output device adjusted to my ethernet device; it
didn't change anything.

--Daniel



Re: [Babel-users] Babel MAC auth fails due to packet reordering

2022-05-04 Thread Juliusz Chroboczek
> I'm attaching a filtered down pcap from the receiving side showing the
> problem. Wireshark filter: (babel && ipv6.src == fe80::1) || icmpv6. The
> sending side is running babeld, the receiving side bird2, if that matters.

Interesting, thanks.

1. Are you running with the "unicast" option set in your config file?

2. Is the link badly congested?  Your dump indicates that a packet got
   delayed by no less than 200ms (!).

Dave, Toke, please have a look at packets #10 and #11 in Daniel's dump,
and let me know if you're as puzzled as I am.

-- Juliusz



Re: [Babel-users] Babel MAC auth fails due to packet reordering

2022-05-04 Thread Daniel Gröber
Hi Juliusz,

On Tue, May 03, 2022 at 08:18:07PM +0200, Juliusz Chroboczek wrote:
> So the multicast packet overtook the unicast one.  That's probably due to
> neighbour discovery delaying the unicast packet.  Could you please provide
> the timestamps?

I don't think it is ND. I started noticing this when I enabled fq_codel on
my Debian systems, so I have a hunch it's because of the fancier per-flow
queueing that it does. Was the MAC stuff ever tested on any OpenWrt
systems? Those have had fq_codel by default for a while now. Debian only
recently switched to it in unstable.

I'm attaching a filtered down pcap from the receiving side showing the
problem. Wireshark filter: (babel && ipv6.src == fe80::1) || icmpv6. The
sending side is running babeld, the receiving side bird2, if that matters.

> Daniel, how often does it happen? 

Very frequently. AFAICT the neighbour never gets assigned anything better
than the infinity metric, so it's unusable.

Thanks,
--Daniel


babel-mac-reorder.pcap
Description: application/vnd.tcpdump.pcap


Re: [Babel-users] Babel MAC auth fails due to packet reordering

2022-05-03 Thread Juliusz Chroboczek
> I'm seeing babel mac authentication failures related to the packet counter
> on a wireless link. I've tracked this down to being because of packet
> reordering.

Babel-MAC does not deal with packet reordering: we assume that packets
don't get reordered on the local link.  It would be fairly trivial to
extend it to deal with moderate amounts of reordering (by keeping a window
of reordered packets, like in DTLS), but I'd rather we didn't make the
code more complex: this is a security algorithm, and good security relies
on simplicity.

> Wireshark packet traces on both sides look something like this:

Cool, thanks a lot.

> Sender:
> 
> Src      Dst        PC    Contents
> fe80::1  fe80::2    1452  Babel router-id update update update pc hmac
> fe80::1  ff02::1:6  1453  Babel hello pc hmac
> fe80::1  fe80::2    1454  Babel ihu pc hmac
> 
> Receiver:
> 
> Src      Dst        PC    Contents
> fe80::1  fe80::2    1452  Babel router-id update update update pc hmac
> fe80::1  fe80::2    1454  Babel ihu pc hmac
> fe80::1  ff02::1:6  1453  Babel hello pc hmac
> 
> AFAICT babeld shares the packet counter across unicast and multicast
> hellos; however, since these constitute different flows, it seems reasonable
> for something in the network stack to reorder them.

So the multicast packet overtook the unicast one.  That's probably due to
neighbour discovery delaying the unicast packet.  Could you please provide
the timestamps?

> A quick skimming of RFC 7298 suggests the PC is indeed intended to be
> per-interface without taking the {mult,un}icast bit into account. Is this
> an oversight in the spec?

No, it's by design: the protocol assumes that there is no reordering on
the local link.

Daniel, how often does it happen?  If it's due to neighbour discovery, it
should happen no more often than once per minute, and Babel should be
quite able to compensate for that.  If it happens more often than that,
then we'll need to look into it further, and perhaps implement
a reordering window in HMAC.

-- Juliusz



[Babel-users] Babel MAC auth fails due to packet reordering

2022-05-02 Thread Daniel Gröber
Hi babel-users,

I'm seeing babel mac authentication failures related to the packet counter
on a wireless link. I've tracked this down to being because of packet
reordering. Wireshark packet traces on both sides look something like this:

Sender:

Src      Dst        PC    Contents
fe80::1  fe80::2    1452  Babel router-id update update update pc hmac
fe80::1  ff02::1:6  1453  Babel hello pc hmac
fe80::1  fe80::2    1454  Babel ihu pc hmac

Receiver:

Src      Dst        PC    Contents
fe80::1  fe80::2    1452  Babel router-id update update update pc hmac
fe80::1  fe80::2    1454  Babel ihu pc hmac
fe80::1  ff02::1:6  1453  Babel hello pc hmac

AFAICT babeld shares the packet counter across unicast and multicast
hellos; however, since these constitute different flows, it seems reasonable
for something in the network stack to reorder them.
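
To make the failure concrete, here is a small sketch of a strictly
monotonic PC check applied to the receiver-side order above. It is an
illustration under simplifying assumptions, not the code of babeld or
Bird: the function name `strict_check` is hypothetical, and the index, the
challenge mechanism and MAC verification are all left out.

```python
def strict_check(packets, last_pc):
    """Accept a packet only if its PC is strictly greater than the last one."""
    accepted, dropped = [], []
    for pc, kind in packets:
        if pc > last_pc:
            last_pc = pc
            accepted.append((pc, kind))
        else:
            dropped.append((pc, kind))
    return accepted, dropped


# Order in which the receiver saw the packets in the trace above.
receiver_order = [(1452, "update"), (1454, "ihu"), (1453, "hello")]
accepted, dropped = strict_check(receiver_order, last_pc=1451)
print("accepted:", accepted)
print("dropped: ", dropped)
```

With a single shared counter the multicast Hello (PC 1453) is the packet
that gets dropped, since the unicast IHU (PC 1454) has already advanced the
last accepted PC.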

A quick skimming of RFC 7298 suggests the PC is indeed intended to be
per-interface without taking the {mult,un}icast bit into account. Is this
an oversight in the spec?

--Daniel
