Re: [Babel-users] babels bug with uninitialized data somewhere?

2014-06-28 Thread Matthieu Boutier

On 27 June 2014 at 20:39, Juliusz Chroboczek wrote:

 the two captures are at: http://snapon.lab.bufferbloat.net/~cero2/babcap/
 
 There's nothing obviously wrong.  Here's an example of a source-specific
 TLV (the one at time 06:20.362230 in the wifi capture):
 
  0d 16 02 20 00 00 06 40 65 8a 00 e0 40 00 20 01 04 70 82 36 02 61 02 00 
 
  Type = 0d (13)
  Length = 16 (22)
  AE = 02
  Flags = 20
  Plen = 00
  Omitted = 00
  Interval = 0640 (16s)
  Seqno = 658a
  Metric = 00e0 (224)
  Prefix = (empty) (::)
  Src Plen = 40 (64)
  src Omitted = 00
  src Prefix = 20 01 04 70 82 36 02 61 (2001:0470:8236:0261::)
  Sub-TLV: 02 00 (diversity empty)
 
 Matthieu, do you have any ideas?

I don't see any problem with that TLV.
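
For anyone who wants to check the breakdown quoted above by machine, here is a minimal Python sketch. It follows only the field order given in that breakdown (it is not taken from the babels sources or from any specification), it assumes AE 2 (IPv6) with nothing omitted, and the names decode_ss_update and SAMPLE are made up for illustration.

```python
import ipaddress
import struct

# Bytes of the TLV quoted in the message above.
SAMPLE = bytes.fromhex(
    "0d 16 02 20 00 00 06 40 65 8a 00 e0"
    " 40 00 20 01 04 70 82 36 02 61 02 00"
)


def _ipv6_prefix(raw: bytes, plen: int) -> str:
    """Pad the transmitted prefix bytes to 16 bytes and format as addr/plen."""
    addr = ipaddress.IPv6Address(raw.ljust(16, b"\x00"))
    return f"{addr}/{plen}"


def decode_ss_update(data: bytes) -> dict:
    """Decode one source-specific Update TLV (hypothetical helper).

    Field order is the one listed in the message: Type, Length, AE, Flags,
    Plen, Omitted, Interval, Seqno, Metric, Prefix, Src Plen, Src Omitted,
    Src Prefix, then sub-TLVs.
    """
    tlv_type, length = data[0], data[1]
    body = data[2:2 + length]
    ae, flags, plen, omitted = body[0:4]
    interval, seqno, metric = struct.unpack("!HHH", body[4:10])
    pos = 10

    # Only the bytes needed to hold plen bits are sent (zero bytes here).
    n = (plen + 7) // 8 - omitted
    prefix = body[pos:pos + n]
    pos += n

    src_plen, src_omitted = body[pos], body[pos + 1]
    pos += 2
    n = (src_plen + 7) // 8 - src_omitted
    src_prefix = body[pos:pos + n]
    pos += n

    # Assumption: this sketch cannot rebuild omitted bytes or other AEs.
    if ae != 2 or omitted or src_omitted:
        raise ValueError("sketch only handles AE 2 with nothing omitted")

    return {
        "type": tlv_type,
        "length": length,
        "flags": flags,
        "interval_s": interval / 100.0,  # Babel intervals are in centiseconds
        "seqno": seqno,
        "metric": metric,
        "prefix": _ipv6_prefix(prefix, plen),
        "src_prefix": _ipv6_prefix(src_prefix, src_plen),
        "sub_tlvs": body[pos:],  # left undecoded
    }


if __name__ == "__main__":
    for key, value in decode_ss_update(SAMPLE).items():
        print(f"{key}: {value}")
```

The sub-TLV bytes are returned raw, since the message only identifies them as an empty diversity sub-TLV.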

However, an interesting one is the last of that packet (babel.pcap):

00:05:20.966837 IP6 (class 0xc2, hlim 1, next-header UDP (17) payload length: 232) fe80::28c6:8eff:febb:9ff0.6696 > ff02::1:6.6696: [udp sum ok] babel 2 (220)
Next Hop 172.21.2.1
Router Id ea:94:f6:ff:fe:91:2e:a4
Update 172.21.3.4/32 metric 143 seqno 53263 interval 160.0s
Update 172.21.18.1/32 metric 143 seqno 53263 interval 160.0s
Update 172.21.18.65/32 metric 143 seqno 53263 interval 160.0s
Update 172.21.18.161/32 metric 143 seqno 53263 interval 160.0s
Router Id ee:a8:6b:ff:fe:fe:09:a2
Update/prefix fd20::2/128 metric 256 seqno 37335 interval 160.0s
Update 172.21.50.2/32 metric 256 seqno 37335 interval 160.0s
Update 172.21.51.1/32 metric 256 seqno 37335 interval 160.0s
Router Id a2:21:b7:ff:fe:ac:e4:56
SS-Update/src-prefix ::/0 from 2001:470:8236:261::/64 metric 224 seqno 25993 interval 160.0s
SS-Update/prefix ::/0 from ::/128 metric 224 seqno 25993 interval 160.0s

The last line shows that we have sent a 'from ::/128', which is martian.
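
As a rough illustration of the kind of check that would flag that source prefix, here is a Python sketch. The rule set is an assumption chosen for this example (unspecified, loopback, multicast and link-local sources are treated as martian, and a zero-length prefix is accepted as "any source"); it is not a copy of what babeld actually tests, and is_martian_source is a made-up name.

```python
import ipaddress


def is_martian_source(prefix: str) -> bool:
    """Return True if a source prefix looks martian (assumed rules, see above)."""
    net = ipaddress.ip_network(prefix, strict=True)
    if net.prefixlen == 0:
        # ::/0 or 0.0.0.0/0 means "from anywhere", which is a valid source.
        return False
    addr = net.network_address
    return (
        addr.is_unspecified
        or addr.is_loopback
        or addr.is_multicast
        or addr.is_link_local
    )


if __name__ == "__main__":
    for candidate in ("::/128", "::/0", "2001:470:8236:261::/64"):
        print(candidate, is_martian_source(candidate))
```

Under these assumed rules the 'from ::/128' in the last line is rejected, while the 'from 2001:470:8236:261::/64' on the line before it is accepted.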

Who is fe80::28c6:8eff:febb:9ff0?  Does it have a config file?  (If so, what
does it contain?)

Matthieu


___
Babel-users mailing list
Babel-users@lists.alioth.debian.org
http://lists.alioth.debian.org/cgi-bin/mailman/listinfo/babel-users


Re: [Babel-users] babels bug with uninitialized data somewhere?

2014-06-28 Thread Matthieu Boutier
 what about putting the latest version in the public git?

Done. (forced update)

Matthieu




Re: [Babel-users] babels bug with uninitialized data somewhere?

2014-06-28 Thread Juliusz Chroboczek
   SS-Update/src-prefix ::/0 from 2001:470:8236:261::/64 metric 224 seqno 25993 interval 160.0s
   SS-Update/prefix ::/0 from ::/128 metric 224 seqno 25993 interval 160.0s

Ah, there's an SS-aware version of tcpdump?  Cool.  Where do I get a copy?

-- Juliusz



Re: [Babel-users] babels bug with uninitialized data somewhere?

2014-06-28 Thread Matthieu Boutier
 Ah, there's an SS-aware version of tcpdump?  Cool.  Where do I get a copy?

On *your* web page!

http://git.wifi.pps.univ-paris-diderot.fr/?p=tcpdump-babels.git;a=summary

Matthieu




Re: [Babel-users] babels bug with uninitialized data somewhere?

2014-06-27 Thread Dave Taht
On Fri, Jun 27, 2014 at 1:40 PM, Juliusz Chroboczek
j...@pps.univ-paris-diderot.fr wrote:
 There's nothing obviously wrong.  Here's an example of a source-specific
 TLV (the one at time 06:20.362230 in the wifi capture):

 So this does imply some sort of memory corruption issue on parsing
 that martian packet?

 I don't want to make any guesses.  Matthieu's code has been through some
 churn lately, and we've fixed some minor bugs and typos.  Matthieu, what
 about putting the latest version in the public git?  I'm sure Dave can
 deal with our dirty rebasing habit.

 (you're probably the only person in the world who thinks /27 is a round
 number).

 32 is a round number!

 Indeed.  It's the sum of two primes.

You have an off-by-two error in that statement, unless you were
discarding the broadcast address and the base address.


 I would certainly like merely to export the /24 (and ipv6 /61) to the
 universe from each box but have never figured out how.

 First, install a blackhole or unreachable route for the whole /24.  It's
 a good thing to do in any case, since it will shoot any packets that are
 destined for an interface that's currently down and might otherwise follow
 the default route:

   ip route add unreachable 192.168.4.0/24 proto static

 (I prefer unreachable, since it makes debugging marginally easier, but
 Real Men (and Real Women) use blackhole routes.  I know, I'm a wimp.)

I like chatty interior networks also.

One thing I added recently was bcp38 support, and it helps if that is
chatty. bcp38 is a package I've been meaning to push up from
ceropackages into openwrt...

I pointed out an issue with rogue routers announcing things internally
like 75.75.75.75 which this package somewhat helps with also.

 Then export this route as usual (babeld doesn't care that it's unreachable
 -- as far as it's concerned, it's a perfectly good static route):

   redistribute ip 0.0.0.0/0 le 24 allow

 No idea how to do it through UCI, you'll need to ask Gabriel.

Thank you, I'll try.
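
To make the effect of that filter concrete, here is a small Python sketch of how I understand the rule to match. The assumption is that "ip PREFIX" means the route lies within PREFIX and "le 24" means its prefix length is at most 24; the function name rule_matches is made up, and this is not babeld code.

```python
import ipaddress


def rule_matches(route: str, within: str = "0.0.0.0/0", le: int = 24) -> bool:
    """Assumed semantics of 'redistribute ip WITHIN le LE allow'."""
    r = ipaddress.ip_network(route)
    w = ipaddress.ip_network(within)
    # Check the address family first; subnet_of() rejects mixed families.
    return r.version == w.version and r.subnet_of(w) and r.prefixlen <= le


if __name__ == "__main__":
    for route in ("192.168.4.0/24", "192.168.4.32/27", "0.0.0.0/0"):
        print(route, rule_matches(route))
```

Under that reading the unreachable /24 is exported and the per-interface /27s are not. Note that a default route, if one is installed, would also match a rule written this way.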

 I turn it off (it has a noisy fan) and for sane values of boom, the
 whole network switches over to going through the wan or adhoc ports.

 You appear to be running with a very high hello interval, so the value of
 boom is on the order of a minute, right?

It usually seems much faster than that... but I'll go measure. I have
a couple links I'd like to fail over faster than they do.

I am using the default hello intervals. Should I tighten that in this case?

Several 802.11ac wifi interfaces in the lab are bridged as well, and I
actually want them preferred so I don't tell babel they are actually
wifi. (part of the reason for the edgerouter upgrade was so I could
drive those 802.11ac devices at faster rates, and they are the only
thing here not running babel. yet. )

Multicast throughout the network is set to 9 Mbit/s. The 172.20.142
devices are all NanoStation M5s with p2p links (which makes me want
unicast route updates one day). The APs are all on PicoStations with a
routed /24 each (143 for p2p), and most channels are unique.

I typically configure everything not to get default routes via DHCP. In
OpenWrt that's option 'defaultroute' '0'; in dhclient.conf on things like
Debian, you just kill the routers entry in the request statement (shown
here with routers still listed):

request subnet-mask, broadcast-address, time-offset, routers,
domain-name, domain-name-servers, domain-search, host-name,
dhcp6.name-servers, dhcp6.domain-search,
netbios-name-servers, netbios-scope, interface-mtu,
rfc3442-classless-static-routes, ntp-servers,
dhcp6.fqdn, dhcp6.sntp-servers;

default routes are evil.

 Compared to what would have happened if I'd tried vlans or some other
 bridging solution, this was marvelous.

 Isn't your use case exactly what STP was designed for?  Set the STP root
 to the edgerouter, and put an alternate root with slightly lower priority
 on your other router.

I guess I should do a network map, huh? If I were to bridge the entire
network (20+ wireless mesh connections spread over 110 acres, 30+
wired in various locations, mostly in the lab, 100+ users on the APs
on the weekends) bad things would happen.

As for bridging the lab, that is doable, but I'm trying to prototype
the next generation of the deployment and it's just easier to route
everything.

 (Unlike Babel, though, STP won't handle the meshy
 part of your network, and it won't attempt to optimise your traffic --
 everything will follow the STP tree, even when shortcuts are possible.

One of the weird ways I use babel is to be able to test a given device
through a given path, which I typically do by disabling babel on the
interface, downing a given interface, or doing filtering.

I can still shoot myself in the foot, however, as babel tries really
hard to find a path no matter what. More times than I can count I have
ended up testing a different path than what I thought I was testing.

 And: if I'm saturating the network, or using an artificial 

Re: [Babel-users] babels bug with uninitialized data somewhere?

2014-06-27 Thread Dave Taht
On Fri, Jun 27, 2014 at 1:40 PM, Juliusz Chroboczek
j...@pps.univ-paris-diderot.fr wrote:
 There's nothing obviously wrong.  Here's an example of a source-specific
 TLV (the one at time 06:20.362230 in the wifi capture):

 So this does imply some sort of memory corruption issue on parsing
 that martian packet?

 I don't want to make any guesses.  Matthieu's code has been through some
 churn lately, and we've fixed some minor bugs and typos.  Matthieu, what
 about putting the latest version in the public git?  I'm sure Dave can
 deal with our dirty rebasing habit.

There is one other major user of babels over on homewrt I know of that
would be surprised if you rebased without warning.

Just warn us when you rebase. An update would be nice! A merge into
1.5, and quagga, better. :) No pressure...



Re: [Babel-users] babels bug with uninitialized data somewhere?

2014-06-27 Thread Juliusz Chroboczek
 I'm sure Dave can deal with our dirty rebasing habit.

 There is one other major user of babels over on homewrt I know of that
 would be surprised if you rebased without warning.

We're not rebasing the babeld trunk.  We have a private branch that we
intend to merge, and that branch is being rebased every couple of days.

Are you advising we don't make our work-in-progress branch public?

-- Juliusz



Re: [Babel-users] babels bug with uninitialized data somewhere?

2014-06-27 Thread Dave Taht
On Fri, Jun 27, 2014 at 3:04 PM, Juliusz Chroboczek
j...@pps.univ-paris-diderot.fr wrote:
 I'm sure Dave can deal with our dirty rebasing habit.

 There is one other major user of babels over on homewrt I know of that
 would be surprised if you rebased without warning.

 We're not rebasing the babeld trunk.  We have a private branch that we
 intend to merge, and that branch is being rebased every couple of days.

 Are you advising we don't make our work-in-progress branch public?

Ghu, no. Do everything in public!

What I have been doing is pulling from:

PKG_REV:=757af8018a6e51ba64994d4834d41d4da8377e09
PKG_SOURCE:=$(PKG_NAME)-$(PKG_VERSION).tar.gz
PKG_SOURCE_URL:=https://github.com/boutier/babeld.git
PKG_SOURCE_SUBDIR:=babeld-$(PKG_VERSION)
PKG_SOURCE_VERSION:=$(PKG_REV)
PKG_SOURCE_PROTO:=git

which hasn't seen an update since December. Should I be pulling from elsewhere?

the openwrt-routing repo is pulling the same commit from

PKG_NAME:=babels
PKG_SOURCE_VERSION:=757af8018a6e51ba64994d4834d41d4da8377e09
PKG_VERSION:=2013-12-18-$(PKG_SOURCE_VERSION)
PKG_RELEASE:=1
PKG_SOURCE_PROTO:=git
PKG_SOURCE_URL:=git://git.wifi.pps.univ-paris-diderot.fr/babels
PKG_SOURCE:=$(PKG_NAME)-$(PKG_VERSION).tar.gz
PKG_SOURCE_SUBDIR:=$(PKG_NAME)-$(PKG_VERSION)

 -- Juliusz



-- 
Dave Täht

NSFW: 
https://w2.eff.org/Censorship/Internet_censorship_bills/russell_0296_indecent.article
