Re: [Babel-users] babels bug with uninitialized data somewhere?
On 27 June 2014 at 20:39, Juliusz Chroboczek wrote:

>> the two captures are at: http://snapon.lab.bufferbloat.net/~cero2/babcap/

> There's nothing obviously wrong. Here's an example of a source-specific TLV (the one at time 06:20.362230 in the wifi capture):
>
>     0d 16 02 20 00 00 06 40 65 8a 00 e0 40 00 20 01 04 70 82 36 02 61 02 00
>
>     Type        = 0d (13)
>     Length      = 16 (22)
>     AE          = 02
>     Flags       = 20
>     Plen        = 00
>     Omitted     = 00
>     Interval    = 0640 (16s)
>     Seqno       = 658a
>     Metric      = 00e0 (224)
>     Prefix      = (empty) (::)
>     Src Plen    = 40 (64)
>     Src Omitted = 00
>     Src Prefix  = 20 01 04 70 82 36 02 61 (2001:0470:8236:0261::)
>     Sub-TLV: 02 00 (diversity empty)
>
> Matthieu, do you have any ideas?

I don't see any problem with that TLV. However, an interesting one is the last of that packet (babel.pcap):

    00:05:20.966837 IP6 (class 0xc2, hlim 1, next-header UDP (17) payload length: 232) fe80::28c6:8eff:febb:9ff0.6696 > ff02::1:6.6696: [udp sum ok] babel 2 (220)
        Next Hop 172.21.2.1
        Router Id ea:94:f6:ff:fe:91:2e:a4
        Update 172.21.3.4/32 metric 143 seqno 53263 interval 160.0s
        Update 172.21.18.1/32 metric 143 seqno 53263 interval 160.0s
        Update 172.21.18.65/32 metric 143 seqno 53263 interval 160.0s
        Update 172.21.18.161/32 metric 143 seqno 53263 interval 160.0s
        Router Id ee:a8:6b:ff:fe:fe:09:a2
        Update/prefix fd20::2/128 metric 256 seqno 37335 interval 160.0s
        Update 172.21.50.2/32 metric 256 seqno 37335 interval 160.0s
        Update 172.21.51.1/32 metric 256 seqno 37335 interval 160.0s
        Router Id a2:21:b7:ff:fe:ac:e4:56
        SS-Update/src-prefix ::/0 from 2001:470:8236:261::/64 metric 224 seqno 25993 interval 160.0s
        SS-Update/prefix ::/0 from ::/128 metric 224 seqno 25993 interval 160.0s

The last line shows that we have sent a 'from ::/128', which is martian.

Who is fe80::28c6:8eff:febb:9ff0? Does it have a config file? (If so, what are its contents?)

Matthieu

___
Babel-users mailing list
Babel-users@lists.alioth.debian.org
http://lists.alioth.debian.org/cgi-bin/mailman/listinfo/babel-users
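For anyone who wants to repeat the hand decoding above on other packets from the captures, here is a minimal Python sketch. It assumes the field order exactly as listed in the message (AE, Flags, Plen, Omitted, Interval, Seqno, Metric, Prefix, Src Plen, Src Omitted, Src Prefix, sub-TLVs), an interval in centiseconds, and no prefix compression (Omitted = 0). The names decode_ss_update and looks_martian are made up for this illustration and are not babeld functions, and the martian test is only a guess at a reasonable rule (unspecified or loopback host addresses, multicast, link-local), not babeld's actual check.

```python
import ipaddress
import struct


def _prefix(raw: bytes, plen: int) -> ipaddress.IPv6Network:
    # Only the significant bytes are on the wire; pad with zeros to 16 bytes.
    return ipaddress.IPv6Network((raw.ljust(16, b"\0"), plen), strict=False)


def decode_ss_update(tlv: bytes) -> dict:
    """Decode one source-specific Update TLV (field order as in the message)."""
    tlv_type, length = tlv[0], tlv[1]
    body = tlv[2:2 + length]
    if len(body) != length:
        raise ValueError("truncated TLV")

    # Fixed part: AE, Flags, Plen, Omitted (1 byte each), then three 16-bit fields.
    ae, flags, plen, omitted, interval, seqno, metric = struct.unpack_from(
        "!BBBBHHH", body
    )
    if ae != 2:
        raise NotImplementedError("this sketch only handles AE 2 (IPv6)")
    if omitted:
        raise NotImplementedError("prefix compression is not handled here")
    pos = 10

    n = (plen + 7) // 8                      # bytes of destination prefix
    prefix = _prefix(body[pos:pos + n], plen)
    pos += n

    src_plen, src_omitted = body[pos], body[pos + 1]
    pos += 2
    if src_omitted:
        raise NotImplementedError("prefix compression is not handled here")
    n = (src_plen + 7) // 8                  # bytes of source prefix
    src_prefix = _prefix(body[pos:pos + n], src_plen)
    pos += n

    return {
        "type": tlv_type,
        "length": length,
        "ae": ae,
        "flags": flags,
        "interval_s": interval / 100,        # assumed to be centiseconds
        "seqno": seqno,
        "metric": metric,
        "prefix": prefix,
        "src_prefix": src_prefix,
        "sub_tlvs": body[pos:],              # left undecoded
    }


def looks_martian(net: ipaddress.IPv6Network) -> bool:
    """Illustrative martian test; an assumption, not babeld's rule."""
    addr = net.network_address
    if net.prefixlen == 128 and (addr.is_unspecified or addr.is_loopback):
        return True
    return (net.prefixlen >= 8 and addr.is_multicast) or (
        net.prefixlen >= 10 and addr.is_link_local
    )
```

Feeding it bytes.fromhex() of the 24 bytes quoted above gives a 22-byte body, a ::/0 destination and a 2001:470:8236:261::/64 source, which matches the hand decoding; looks_martian() flags ::/128 but not ::/0.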
Re: [Babel-users] babels bug with uninitialized data somewhere?
> what about putting the latest version in the public git?

Done. (forced update)

Matthieu
Re: [Babel-users] babels bug with uninitialized data somewhere?
> SS-Update/src-prefix ::/0 from 2001:470:8236:261::/64 metric 224 seqno 25993 interval 160.0s
> SS-Update/prefix ::/0 from ::/128 metric 224 seqno 25993 interval 160.0s

Ah, there's an SS-aware version of tcpdump? Cool. Where do I get a copy?

-- Juliusz
Re: [Babel-users] babels bug with uninitialized data somewhere?
> Ah, there's an SS-aware version of tcpdump? Cool. Where do I get a copy?

On *your* web page!

http://git.wifi.pps.univ-paris-diderot.fr/?p=tcpdump-babels.git;a=summary

Matthieu
Re: [Babel-users] babels bug with uninitialized data somewhere?
On Fri, Jun 27, 2014 at 1:40 PM, Juliusz Chroboczek j...@pps.univ-paris-diderot.fr wrote:

>>> There's nothing obviously wrong. Here's an example of a source-specific TLV (the one at time 06:20.362230 in the wifi capture):

>> So this does imply some sort of memory corruption issue on parsing that martian packet?

> I don't want to make any guesses. Matthieu's code has been through some churn lately, and we've fixed some minor bugs and typos. Matthieu, what about putting the latest version in the public git? I'm sure Dave can deal with our dirty rebasing habit.

>>> (you're probably the only person in the world who thinks /27 is a round number).

>> 32 is a round number!

> Indeed. It's the sum of two primes.

You have an off-by-two error in that statement, unless you were discarding the broadcast address and base address.

>> I would certainly like merely to export the /24 (and ipv6 /61) to the universe from each box but have never figured out how.

> First, install a blackhole or unreachable route for the whole /24. It's a good thing to do in any case, since it will shoot any packets that are destined for an interface that's currently down and might otherwise follow the default route:
>
>     ip route add unreachable 192.168.4.0/24 proto static
>
> (I prefer unreachable, since it makes debugging marginally easier, but Real Men (and Real Women) use blackhole routes. I know, I'm a wimp.)

I like chatty interior networks also. One thing I added recently was bcp38 support, and it helps if that is chatty. bcp38 is a package I've been meaning to push up from ceropackages into openwrt... I pointed out an issue with rogue routers announcing things internally like 75.75.75.75, which this package somewhat helps with also.

> Then export this route as usual (babeld doesn't care that it's unreachable -- as far as it's concerned, it's a perfectly good static route):
>
>     redistribute ip 0.0.0.0/0 le 24 allow
>
> No idea how to do it through UCI, you'll need to ask Gabriel.

Thank you, I'll try.
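To see which routes a rule like "redistribute ip 0.0.0.0/0 le 24 allow" would pick up, here is a small Python sketch of the matching logic. It assumes the usual reading of babeld's filter language ("ip PREFIX" matches routes inside PREFIX, "le N" matches prefix lengths of at most N); the function name redistribute_allows and the /27 used in the usage note are hypothetical.

```python
import ipaddress


def redistribute_allows(route: str, within: str = "0.0.0.0/0", le: int = 24) -> bool:
    """Return True if `route` lies inside `within` and is no longer than /le.

    Models a single "redistribute ip WITHIN le LE allow" rule; real babeld
    evaluates a whole list of rules in order, which is not modelled here.
    """
    net = ipaddress.ip_network(route)
    scope = ipaddress.ip_network(within)
    if net.version != scope.version:
        return False  # an IPv4 rule never matches an IPv6 route
    return net.subnet_of(scope) and net.prefixlen <= le
```

With the defaults, the unreachable 192.168.4.0/24 route is accepted, while a hypothetical 192.168.4.32/27 interface route and the 172.21.3.4/32 host routes seen in the capture are not. Note that 0.0.0.0/0 itself also satisfies "le 24", so under this reading a static default route would be matched by the same rule.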
>> I turn it off (it has a noisy fan) and for sane values of boom, the whole network switches over to going through the wan or adhoc ports.

> You appear to be running with a very high hello interval, so the value of boom is on the order of a minute, right?

It usually seems much faster than that... but I'll go measure. I have a couple of links I'd like to fail over faster than they do. I am using the default hello intervals. Should I tighten that in this case?

Several 802.11ac wifi interfaces in the lab are bridged as well, and I actually want them preferred, so I don't tell babel they are actually wifi. (Part of the reason for the edgerouter upgrade was so I could drive those 802.11ac devices at faster rates, and they are the only thing here not running babel. Yet.)

Multicast throughout the network is set to 9mbits/sec. The 172.20.142 devices are all nanostation M5s with p2p links (this gives me some desire for unicast route updates one day), the APs are all on picostations with a routed /24 each (143 for p2p), and most channels are unique.

I typically configure everything to not get default routes via dhcp. In openwrt that's option 'defaultroute' '0', and in dhclient.conf on things like debian, you just kill the routers portion of the setup:

    request subnet-mask, broadcast-address, time-offset, routers,
        domain-name, domain-name-servers, domain-search, host-name,
        dhcp6.name-servers, dhcp6.domain-search,
        netbios-name-servers, netbios-scope, interface-mtu,
        rfc3442-classless-static-routes, ntp-servers,
        dhcp6.fqdn, dhcp6.sntp-servers;

Default routes are evil.

>> Compared to what would have happened if I'd tried vlans or some other bridging solution, this was marvelous.

> Isn't your use case exactly what STP was designed for? Set the STP root to the edgerouter, and put an alternate root with slightly lower priority on your other router.

I guess I should do a network map, huh?
If I were to bridge the entire network (20+ wireless mesh connections spread over 110 acres, 30+ wired in various locations, mostly in the lab, 100+ users on the APs on the weekends), bad things would happen. As for bridging the lab, that is doable, but I'm trying to prototype the next generation of the deployment and it's just easier to route everything.

> (Unlike Babel, though, STP won't handle the meshy part of your network, and it won't attempt to optimise your traffic -- everything will follow the STP tree, even when shortcuts are possible.)

One of the weird ways I use babel is to be able to test a given device through a given path, which I typically do by disabling babel on the interface, downing a given interface, or doing filtering. I can still shoot myself in the foot, however, as babel tries really hard to find a path no matter what. More times than I can count I have ended up testing a different path than what I thought I was testing.

And: if I'm saturating the network, or using an artificial
Re: [Babel-users] babels bug with uninitialized data somewhere?
On Fri, Jun 27, 2014 at 1:40 PM, Juliusz Chroboczek j...@pps.univ-paris-diderot.fr wrote:

>>> There's nothing obviously wrong. Here's an example of a source-specific TLV (the one at time 06:20.362230 in the wifi capture):

>> So this does imply some sort of memory corruption issue on parsing that martian packet?

> I don't want to make any guesses. Matthieu's code has been through some churn lately, and we've fixed some minor bugs and typos. Matthieu, what about putting the latest version in the public git? I'm sure Dave can deal with our dirty rebasing habit.

There is one other major user of babels over on homewrt I know of that would be surprised if you rebased without warning. Just warn us when you rebase.

An update would be nice! A merge into 1.5, and quagga, better. :) No pressure...
Re: [Babel-users] babels bug with uninitialized data somewhere?
>> I'm sure Dave can deal with our dirty rebasing habit.

> There is one other major user of babels over on homewrt I know of that would be surprised if you rebased without warning.

We're not rebasing the babeld trunk. We have a private branch that we intend to merge, and that branch is being rebased every couple of days. Are you advising we don't make our work-in-progress branch public?

-- Juliusz
Re: [Babel-users] babels bug with uninitialized data somewhere?
On Fri, Jun 27, 2014 at 3:04 PM, Juliusz Chroboczek j...@pps.univ-paris-diderot.fr wrote:

>>> I'm sure Dave can deal with our dirty rebasing habit.

>> There is one other major user of babels over on homewrt I know of that would be surprised if you rebased without warning.

> We're not rebasing the babeld trunk. We have a private branch that we intend to merge, and that branch is being rebased every couple of days. Are you advising we don't make our work-in-progress branch public?

Ghu, no. Do everything in public!

What I have been doing is pulling from:

    PKG_REV:=757af8018a6e51ba64994d4834d41d4da8377e09
    PKG_SOURCE:=$(PKG_NAME)-$(PKG_VERSION).tar.gz
    PKG_SOURCE_URL:=https://github.com/boutier/babeld.git
    PKG_SOURCE_SUBDIR:=babeld-$(PKG_VERSION)
    PKG_SOURCE_VERSION:=$(PKG_REV)
    PKG_SOURCE_PROTO:=git

which hasn't seen an update since December. Should I be pulling from elsewhere?

The openwrt-routing repo is pulling the same commit from:

    PKG_NAME:=babels
    PKG_SOURCE_VERSION:=757af8018a6e51ba64994d4834d41d4da8377e09
    PKG_VERSION:=2013-12-18-$(PKG_SOURCE_VERSION)
    PKG_RELEASE:=1
    PKG_SOURCE_PROTO:=git
    PKG_SOURCE_URL:=git://git.wifi.pps.univ-paris-diderot.fr/babels
    PKG_SOURCE:=$(PKG_NAME)-$(PKG_VERSION).tar.gz
    PKG_SOURCE_SUBDIR:=$(PKG_NAME)-$(PKG_VERSION)

> -- Juliusz

-- 
Dave Täht
NSFW: https://w2.eff.org/Censorship/Internet_censorship_bills/russell_0296_indecent.article