Re: [Babel-users] [babel] Babel MAC auth fails due to packet reordering
Hi Juliusz,

On Sat, Jul 23, 2022 at 09:44:58PM +0200, Juliusz Chroboczek wrote:
> > While this is not fatal for the reordering fix per se, your RTT patch also
> > breaks because of this AFAICT. Since the IHU tstamps only ever arrive via
> > unicast. At least with my `unicast true` babeld config.
>
> More exactly, with "unicast true" babeld will send IHUs over unicast,
> which will in turn force it to include an unscheduled unicast hello. You
> should be able to work around the issue by setting
>
>     rfc6126-compatible true
>
> in babeld. However, this has other consequences, such as breaking
> source-specific routing.

Ah I see. That could be an option I suppose, since I don't need SSR.

> > Nothing bad seems to happen if I just comment this out :) Do you have
> > any pointers as to what needs to be implemented in bird to properly support
> > unicast hellos?
>
> You don't need full support for unicast hellos, you just need to parse the
> sub-TLVs of unicast hellos in order to extract the timestamp.

I'm literally testing a patch that does just that now, sounds like I'm on
the right track then.

Thanks!
--Daniel
___
Babel-users mailing list
Babel-users@alioth-lists.debian.net
https://alioth-lists.debian.net/cgi-bin/mailman/listinfo/babel-users
Re: [Babel-users] [babel] Babel MAC auth fails due to packet reordering
> While this is not fatal for the reordering fix per se, your RTT patch also
> breaks because of this AFAICT. Since the IHU tstamps only ever arrive via
> unicast. At least with my `unicast true` babeld config.

More exactly, with "unicast true" babeld will send IHUs over unicast,
which will in turn force it to include an unscheduled unicast hello. You
should be able to work around the issue by setting

    rfc6126-compatible true

in babeld. However, this has other consequences, such as breaking
source-specific routing.

> Nothing bad seems to happen if I just comment this out :) Do you have
> any pointers as to what needs to be implemented in bird to properly support
> unicast hellos?

You don't need full support for unicast hellos, you just need to parse the
sub-TLVs of unicast hellos in order to extract the timestamp.

-- Juliusz
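To illustrate the suggestion above, here is a minimal sketch of walking the sub-TLV area of a Hello and pulling out a transmit timestamp. It is not code from bird or babeld. The function name `hello_find_timestamp` is hypothetical, and the sketch assumes the usual Babel sub-TLV layout (Pad1 is a single zero byte, every other sub-TLV is type, length, body) and that the timestamp sub-TLV has type 3 and starts with a 32-bit big-endian value; both assumptions should be checked against the RTT extension specification before relying on them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SUBTLV_PAD1      0   /* single byte, no length field */
#define SUBTLV_TIMESTAMP 3   /* assumed type code; verify against the spec */

/* Hypothetical helper: scan the sub-TLVs that follow the fixed Hello
 * fields and extract the 32-bit big-endian transmit timestamp.
 * Returns false if no timestamp is present or the data is truncated. */
static bool hello_find_timestamp(const uint8_t *sub, size_t len, uint32_t *ts)
{
    assert(sub != NULL && ts != NULL);
    size_t i = 0;

    while (i < len) {
        uint8_t type = sub[i];
        if (type == SUBTLV_PAD1) {       /* Pad1 has no length byte */
            i++;
            continue;
        }
        if (i + 2 > len)
            return false;                /* truncated sub-TLV header */
        size_t blen = sub[i + 1];
        if (i + 2 + blen > len)
            return false;                /* truncated sub-TLV body */
        if (type == SUBTLV_TIMESTAMP && blen >= 4) {
            const uint8_t *b = sub + i + 2;
            *ts = ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
                  ((uint32_t)b[2] << 8)  |  (uint32_t)b[3];
            return true;
        }
        i += 2 + blen;                   /* skip unknown sub-TLVs */
    }
    return false;
}
```

The point of the sketch is that the receiver can keep ignoring the Hello semantics of a unicast hello and still harvest the timestamp it carries.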
Re: [Babel-users] [babel] Babel MAC auth fails due to packet reordering
Hi Toke,

I've spent some time today trying to debug the weird behaviour with your
ooo-pc bird patch. I found the bird code ignores unicast hellos entirely,
which I wasn't expecting :)

babel_read_hello:

    /* We currently don't support unicast Hello */
    u16 flags = get_u16(&tlv->flags);
    if (flags & BABEL_HF_UNICAST)
      return PARSE_IGNORE;

Nothing bad seems to happen if I just comment this out :) Do you have
any pointers as to what needs to be implemented in bird to properly support
unicast hellos?

While this is not fatal for the reordering fix per se, your RTT patch also
breaks because of this AFAICT. Since the IHU tstamps only ever arrive via
unicast. At least with my `unicast true` babeld config.

--Daniel
Re: [Babel-users] [babel] Babel MAC auth fails due to packet reordering
On Sat, May 14, 2022 at 12:37:16AM +0200, Toke Høiland-Jørgensen wrote:
> Ah, oops; looks like I got my operator precedence wrong, so the code is
> doing pointer arithmetic instead of adding to the value being pointed
> to...
>
> Pushed a fixed version here:
> https://github.com/tohojo/bird/tree/babel-ooo-pc
>
> Could you try if that works better, please? :)

It's still crashing:

#0 __strlen_avx2 () at ../sysdeps/x86_64/multiarch/strlen-avx2.S:65
#1 0x55592773 in bvsnprintf ( buf=0x7fffd770 "babel1: Authentication PC (\367\377\177", size=997, fmt=0x55638ab9 "s) %u already seen (window start %u, value %x)", fmt@entry=0x55638aa1 "%s: Authentication PC (%s) %u already seen (window start %u, value %x)", args=args@entry=0x7fffdbc0) at lib/printf.c:256
#2 0x555931ed in buffer_vprint (buf=buf@entry=0x7fffdb80, fmt=fmt@entry=0x55638aa1 "%s: Authentication PC (%s) %u already seen (window start %u, value %x)", args=args@entry=0x7fffdbc0) at lib/printf.c:531
#3 0x555f81d8 in vlog (class=7, msg=msg@entry=0x55638aa1 "%s: Authentication PC (%s) %u already seen (window start %u, value %x)", args=args@entry=0x7fffdbc0) at sysdep/unix/log.c:219
#4 0x555f83c2 in log_rl (f=f@entry=0x5567ed50, msg=, msg@entry=0x55638aa0 "\a%s: Authentication PC (%s) %u already seen (window start %u, value %x)") at sysdep/unix/log.c:262
#5 0x555b5df3 in babel_auth_check_pc (ifa=ifa@entry=0x556a2410, msg=msg@entry=0x7fffded8) at proto/babel/babel.c:1568
#6 0x555b8738 in babel_auth_check (ifa=ifa@entry=0x556a2410, saddr=..., sport=, daddr=..., dport=6696, pkt=0x556a5460, trailer=0x556a55cb "\020 \022\247\063\345%\215\210\227\334\001з\363\265\331\025O\325f\230tou\313\036g\020\244\256\220Y\354{.(4\366\310\340y\020 \276\205\016ۭ>\257\320e\\\301\360\350\321\375\245m\023\214q\267\257\327\034\270\275v\327V<Ʈ", trailer_len=34) at proto/babel/packets.c:1907
#7 0x555b8b0a in babel_process_packet (ifa=0x556a2410, pkt=0x556a5460, len=, saddr=..., sport=, daddr=..., dport=6696) at proto/babel/packets.c:1492
#8 0x555b8f6b in babel_rx_hook (sk=, len=) at proto/babel/packets.c:1585
#9 0x555f5483 in sk_read (s=0x556a5310, revents=) at sysdep/unix/io.c:1914
#10 0x555f6181 in io_loop () at sysdep/unix/io.c:2349
#11 0x555660e6 in main (argc=, argv=) at sysdep/unix/main.c:940

--Daniel
Re: [Babel-users] [babel] Babel MAC auth fails due to packet reordering
> I've managed to reproduce the problem locally, and I've confirmed that the
> split-PC approach fixes the issue. I'm seeing failed PC validations, but
> not enough to cause association failure.

Just to be clear -- I'm seeing failed PC validations with stock 1.12. I'm
not seeing any unexpected PC validation failures with the split-PC version.

-- Juliusz
Re: [Babel-users] [babel] Babel MAC auth fails due to packet reordering
Daniel,

I've managed to reproduce the problem locally, and I've confirmed that the
split-PC approach fixes the issue. I'm seeing failed PC validations, but
not enough to cause association failure.

I've merged the fix into master. Right now, I'm not planning to implement
the window-based algorithm, which Toke has implemented in addition to
split-PC, but I'm open to evidence that it is actually needed.

I'm planning to release babeld-1.12.1 soon. Please let me know if for some
reason you need a backport to 1.11.

Thanks again for your help,

-- Juliusz
Re: [Babel-users] [babel] Babel MAC auth fails due to packet reordering
Daniel Gröber writes:

> Hi Toke,
>
> after running with your patch for a short while I'm actually starting to
> see frequent crashes. Here's a backtrace for one:

Ah, oops; looks like I got my operator precedence wrong, so the code is
doing pointer arithmetic instead of adding to the value being pointed
to...

Pushed a fixed version here:
https://github.com/tohojo/bird/tree/babel-ooo-pc

Could you try if that works better, please? :)

-Toke
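The faulty expression itself is not quoted in this thread, so the following is only a generic illustration of the class of bug described above: in C, postfix `++` binds more tightly than unary `*`, so `*p++` advances the pointer rather than the value it points to. The function names are hypothetical and are not taken from the bird patch.

```c
#include <assert.h>
#include <stdint.h>

/* Intended behaviour: increment the counter that p points to. */
static void bump_value(uint32_t *p)
{
    (*p)++;
}

/* Precedence pitfall: this parses as *(p++), so it advances the local
 * copy of the pointer and leaves the counter itself untouched. */
static void bump_pointer(uint32_t *p)
{
    (void)*p++;
}
```

Calling `bump_pointer(&counter)` leaves `counter` unchanged, while `bump_value(&counter)` increments it. When the pointer is stored and later dereferenced, the same mistake can lead to reads of unrelated memory.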
Re: [Babel-users] [babel] Babel MAC auth fails due to packet reordering
Hi Toke,

after running with your patch for a short while I'm actually starting to
see frequent crashes. Here's a backtrace for one:

Program received signal SIGSEGV, Segmentation fault.
__strlen_avx2 () at ../sysdeps/x86_64/multiarch/strlen-avx2.S:65
65 ../sysdeps/x86_64/multiarch/strlen-avx2.S: No such file or directory.
(gdb) bt
#0 __strlen_avx2 () at ../sysdeps/x86_64/multiarch/strlen-avx2.S:65
#1 0x55592773 in bvsnprintf ( buf=0x7fffd770 "babel1: Authentication PC (\367\377\177", size=997, fmt=0x55638ab9 "s) %u already seen (window start %u, value %x)", fmt@entry=0x55638aa1 "%s: Authentication PC (%s) %u already seen (window start %u, value %x)", args=args@entry=0x7fffdbc0) at lib/printf.c:256
#2 0x555931ed in buffer_vprint (buf=buf@entry=0x7fffdb80, fmt=fmt@entry=0x55638aa1 "%s: Authentication PC (%s) %u already seen (window start %u, value %x)", args=args@entry=0x7fffdbc0) at lib/printf.c:531
#3 0x555f81d8 in vlog (class=7, msg=msg@entry=0x55638aa1 "%s: Authentication PC (%s) %u already seen (window start %u, value %x)", args=args@entry=0x7fffdbc0) at sysdep/unix/log.c:219
#4 0x555f83c2 in log_rl (f=f@entry=0x5567ed50, msg=, msg@entry=0x55638aa0 "\a%s: Authentication PC (%s) %u already seen (window start %u, value %x)") at sysdep/unix/log.c:262
#5 0x555b5df3 in babel_auth_check_pc (ifa=ifa@entry=0x556a17f0, msg=msg@entry=0x7fffded8) at proto/babel/babel.c:1568
#6 0x555b8738 in babel_auth_check (ifa=ifa@entry=0x556a17f0, saddr=..., sport=, daddr=..., dport=6696, pkt=0x556a1b80, trailer=0x556a1b9a "\020 z\334\016\"\367\212\304u\320\317\333\022\357\363t\a\277\036\356\234\304\370\236\177\351\232mW\236a\235\255", trailer_len=34) at proto/babel/packets.c:1907
#7 0x555b8b0a in babel_process_packet (ifa=0x556a17f0, pkt=0x556a1b80, len=, saddr=..., sport=, daddr=..., dport=6696) at proto/babel/packets.c:1492
#8 0x555b8f6b in babel_rx_hook (sk=, len=) at proto/babel/packets.c:1585
#9 0x555f5483 in sk_read (s=0x556a1a30, revents=) at sysdep/unix/io.c:1914
#10 0x555f6181 in io_loop () at sysdep/unix/io.c:2349
#11 0x555660e6 in main (argc=, argv=) at sysdep/unix/main.c:940

I can re-test with -O0 tomorrow if that helps.

--Daniel
Re: [Babel-users] [babel] Babel MAC auth fails due to packet reordering
Thanks a lot, Daniel.

> I'm having some trouble establishing a baseline using babeld. Using
> babeld-1.11 as both the sending and receiving side I'm not observing any
> errors

You need to run babeld with the "-d2" flag to see MAC and PC validation
errors.

> and the session seems to come up perfectly

It looks to me like you were lucky. There's some reordering going on in
your trace, but it's never severe enough to cause association failures.
I'll try to reproduce your issue locally, you've given me all the hints I
need.

At any rate, your results seem to indicate that we've successfully solved
the issue, which means we can try to push the Internet-Draft through the
working group. I'm very grateful for your report and for your help with
understanding the issue.

-- Juliusz
Re: [Babel-users] [babel] Babel MAC auth fails due to packet reordering
On Fri, May 13, 2022 at 08:54:19PM +0200, Daniel Gröber wrote:
> 3.a) Update receiving side to patched bird.
> 3.b) Observe neighbour metric still nominal and no auth errors.
>
> For babeld
>
> 4.a) Shut down bird on the receiver and start unpatched babeld instead.
> 4.b) On the receiver: Observe through local-path interface that sender
>      has nominal neighbour metric. (unexpected)

Err, this should be

3.a) Update sending side to patched babeld
3.b) Observe neighbour metric still nominal and no auth errors.

and

4) Revert sending side to unpatched babeld
4.a) ...

--Daniel
Re: [Babel-users] [babel] Babel MAC auth fails due to packet reordering
Hi Toke and Juliusz,

On Sun, May 08, 2022 at 10:01:53PM +0200, Toke Høiland-Jørgensen wrote:
> Right, okay. I updated the Bird patch to implement both the separate
> ucast/mcast values and the window (patch below). Daniel, could you
> please test this in your environment?

I've added the patch on top of the bird2 2.0.9-1 Debian package and can
confirm that using the patched version on the receiving end fixes the issue
with both patched and unpatched babeld. So it seems compatibility is not
broken either :)

On Mon, May 09, 2022 at 04:56:14PM +0200, Juliusz Chroboczek wrote:
> You'll find a patch for babeld in the branch "hmac-unicast-pc"
>
>     git clone -b hmac-unicast-pc https://github.com/jech/babeld
>
> The patch is here:
>
>     https://github.com/jech/babeld/commit/7e5d18791f5b5f2d5ad660fad85769f75f47f705
>
> Daniel, please report whether that fixes the problem, so we can merge and
> start writing up a new Internet-Draft.

I'm having some trouble establishing a baseline using babeld. Using
babeld-1.11 as both the sending and receiving side I'm not observing any
errors, and the session seems to come up perfectly, though I can see
reordering in wireshark and bird was throwing errors during testing just
before. So the link is still behaving the same. I'm attaching a pcap from
that situation: babeld-reordered-but-working.pcapng.

Overall testing methodology:

1) Revert sender babeld config to failing "unicast true" version, use
   unpatched babeld 1.11 sender and unpatched bird 2.0.9 receiver.

For bird:

2.a) On the receiver: Observe neighbour metric for sender is stuck at
     infinity and MAC auth errors are still emitted.
2.b) Update receiving side to 2.0.9 with Toke's patch.
2.c) Observe neighbour metric returning to normal and absence of auth
     errors.

3.a) Update receiving side to patched bird.
3.b) Observe neighbour metric still nominal and no auth errors.

For babeld

4.a) Shut down bird on the receiver and start unpatched babeld instead.
4.b) On the receiver: Observe through local-path interface that sender
     has nominal neighbour metric. (unexpected)

Config files:

    # Sender
    key id 1 type hmac-sha256 value
    local-path /run/babeld.status
    default type tunnel unicast true
    interface enp2s0 type wired key 1
    kernel-priority 200

    # Receiver
    key id 1 type hmac-sha256 value
    local-path /run/babeld.status
    default type tunnel unicast true
    interface wlp3s0 type wireless key 1
    kernel-priority 200

--Daniel

Attachment: babeld-reordered-but-working.pcapng (binary data)
Re: [Babel-users] [babel] Babel MAC auth fails due to packet reordering
On Thu, May 12, 2022 at 04:43:35PM +0200, Juliusz Chroboczek wrote:
> > The patch is here:
> >
> >     https://github.com/jech/babeld/commit/7e5d18791f5b5f2d5ad660fad85769f75f47f705
>
> Daniel, could you please confirm whether this fixes the issue?

I've been a bit busy. I'll get to the testing during the weekend :)

--Daniel
Re: [Babel-users] [babel] Babel MAC auth fails due to packet reordering
> You'll find a patch for babeld in the branch "hmac-unicast-pc"
>
>     git clone -b hmac-unicast-pc https://github.com/jech/babeld
>
> The patch is here:
>
>     https://github.com/jech/babeld/commit/7e5d18791f5b5f2d5ad660fad85769f75f47f705

Daniel, could you please confirm whether this fixes the issue?

-- Juliusz
Re: [Babel-users] [babel] Babel MAC auth fails due to packet reordering
> Right, okay. I updated the Bird patch to implement both the separate
> ucast/mcast values and the window (patch below). Daniel, could you
> please test this in your environment?

You'll find a patch for babeld in the branch "hmac-unicast-pc"

    git clone -b hmac-unicast-pc https://github.com/jech/babeld

The patch is here:

    https://github.com/jech/babeld/commit/7e5d18791f5b5f2d5ad660fad85769f75f47f705

Daniel, please report whether that fixes the problem, so we can merge and
start writing up a new Internet-Draft.

-- Juliusz
Re: [Babel-users] [babel] Babel MAC auth fails due to packet reordering
> Ah, I see! Okay, that makes sense. Also, it occurred to me that the
> window-based approach likely isn't enough when there are multiple
> neighbours and you do unicast updates, as then another neighbour can eat
> up a whole chunk of PC number space that you never see.

Exactly. The sender maintains just one (index, PC) state per interface,
not one state per destination. (In constrained environments, you could in
principle have just one state for all interfaces, although that's not
allowed by the RFC as it is currently written.)

> However, what about other sources of reordering? Should we still do
> window-based verification to deal with this?

We might add it as an option to the document you suggest. I'm not
currently planning to add it to babeld, but I might change my mind if new
evidence that it is needed surfaces. Ok?

> Also, I guess this could all be described in a "relaxed PC verification
> to deal with reordering" document that could be optional to implement
> (i.e., you could still be compliant with RFC 8967 if you don't implement
> it)?

I tend to agree, but I'd rather we did the implementation first, to see
how it goes.

>> Expect on the order of 60 routes per packet. 64 packets gives you on
>> the order of 3800 routes.

> Right. Which is a lot for a local mesh network, but not a lot for the
> internet.

OTOH, you should be spreading the updates over the whole length of the
update interval to avoid sending bursts of packets. It's been on my todo
list for babeld for a long time, but I never got around to implementing it.

> Do you have any insights into typical sizes of real-world babel
> deployments in terms of the number of routes?

Nexedi have around 1000 routers. I don't know how many routes they're
advertising in total.

-- Juliusz
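For readers unfamiliar with window-based verification, the following is a rough sketch of the general technique (a sliding bitmap, in the style of anti-replay windows), not the code from Toke's bird patch. The names `pc_window` and `pc_window_check` and the 64-packet window size are assumptions for illustration, and PC wraparound and the index/challenge handling of RFC 8967 are deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PC_WINDOW 64   /* assumed window size, in packets */

/* Hypothetical per-neighbour state for window-based PC checking. */
struct pc_window {
    bool     valid;    /* false until the first packet is accepted */
    uint32_t highest;  /* highest PC accepted so far */
    uint64_t seen;     /* bit i set: PC (highest - i) already accepted */
};

/* Accept a PC if it is new and no more than PC_WINDOW - 1 behind the
 * highest PC seen; reject replays and packets that are too old. */
static bool pc_window_check(struct pc_window *w, uint32_t pc)
{
    assert(w != NULL);

    if (!w->valid) {
        w->valid = true;
        w->highest = pc;
        w->seen = 1;
        return true;
    }
    if (pc > w->highest) {
        uint32_t shift = pc - w->highest;
        /* Slide the window forward; a large jump empties it. */
        w->seen = (shift >= PC_WINDOW) ? 0 : (w->seen << shift);
        w->seen |= 1;
        w->highest = pc;
        return true;
    }
    uint32_t age = w->highest - pc;
    if (age >= PC_WINDOW)
        return false;                  /* fell out of the window */
    uint64_t bit = (uint64_t)1 << age;
    if (w->seen & bit)
        return false;                  /* duplicate, treat as replay */
    w->seen |= bit;
    return true;
}
```

The sketch also shows the limitation raised above: because the sender keeps a single PC per interface, unicast traffic to another neighbour can advance the PC by more than `PC_WINDOW` between two packets that this receiver sees, so a delayed packet may fall outside the window even though little reordering occurred.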
Re: [Babel-users] [babel] Babel MAC auth fails due to packet reordering
> Hmm, I certainly see where you're coming from; having separate sequence
> numbers for unicast/multicast would neatly sidestep this particular
> problem. However, one problem with this is that it's not straight-forwardly
> backward compatible.

No, no sender changes. Just receiver changes.

The sender still sends packets in a single sequence. The receiver,
however, makes a more relaxed check on the received packet: it merely
checks that the received PC has a larger value than that received in the
last packet *of the same type*.

In other words, the receiver is checking that unicast packets come in
ascending order, and that multicast packets come in ascending order. It
does not verify the relative ordering of unicast vs. multicast.

> As for the size of the window (setting aside the case where an
> implementation increases the PC by more than one for every packet), I
> guess we'd need it to be large enough to contain a full routing table
> dump. A window of 64 packets can fit several thousand routes even in the
> worst case with no compression;

Expect on the order of 60 routes per packet. 64 packets gives you on the
order of 3800 routes.

-- Juliusz
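The relaxed receiver-side check described above can be sketched as follows. This is an illustration of the idea only, not the code from the babeld or bird patches: the names `pc_state` and `pc_check` are hypothetical, the PC is treated as a plain 32-bit counter, and the index and challenge machinery of RFC 8967 is omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-neighbour receiver state: one "last accepted PC"
 * per packet type instead of a single shared value. */
struct pc_state {
    bool     have_ucast, have_mcast;
    uint32_t last_ucast_pc;
    uint32_t last_mcast_pc;
};

/* Accept a packet only if its PC is strictly larger than the last PC
 * accepted for a packet of the same type. The relative order of
 * unicast and multicast packets is not checked. */
static bool pc_check(struct pc_state *st, uint32_t pc, bool is_unicast)
{
    assert(st != NULL);
    bool *have = is_unicast ? &st->have_ucast : &st->have_mcast;
    uint32_t *last = is_unicast ? &st->last_ucast_pc : &st->last_mcast_pc;

    if (*have && pc <= *last)
        return false;      /* replayed or reordered within its type */
    *have = true;
    *last = pc;
    return true;
}
```

With this state, a unicast packet carrying PC 11 that arrives after a multicast packet carrying PC 12 is still accepted, which is the reordering case reported in this thread, while a second copy of either packet is rejected. The sender is unchanged and keeps a single sequence.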