In the meantime I dug a little deeper into this. DAT as such works, but it has some side effects on the backbone network in a setup like mine, with several mesh nodes connected to the same backbone network and BLA enabled. I see two main issues:

1. The original broadcast ARP request sent by the PC loops back into the backbone network. As far as I have figured out, this comes from the encapsulation of the original ARP broadcast into a BATADV_UNICAST_4ADDR frame. The BLA code responsible for preventing looping broadcasts does not handle this frame, because for BLA it is a unicast frame.

2. All ARP replies are forwarded into the backbone by all possible gateways. If a gateway gets responses from up to 3 remote DAT candidates, the total number of ARP replies seen becomes up to 3 times the number of gateways used.

I am not sure if this is specific to the old kernel version I used, but I tried to overcome the two points above with the following measures:

1. Drop BATADV_UNICAST_4ADDR DHT_GET frames received from another gateway as long as we cannot answer the forwarded ARP request.
2. Make sure that only a gateway which has claimed the source MAC of an ARP reply forwards this reply to the backbone network.
3. Drop received ARP replies as soon as we have a local DAT entry for the source MAC of the ARP reply. In this case it is most likely that the device has already sent a reply.

With these measures I see a "clean" ARP request/reply behaviour in the backbone network.

As a further improvement I added snooping of all incoming IP traffic on the mesh soft interface. I use the source MAC and source IP to update the local DAT cache. I wanted to keep ARP request/reply traffic, and the broadcast traffic connected with it, as low as possible in the mesh.

If there is interest I could send a patch file to the mailing list with the changes based on the batman-adv git master.
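
To make the three measures and the snooping a bit more concrete, here is a rough sketch of the decision logic as a small Python model. This is not the patch and not batman-adv code: the Gateway class, its method names, the order in which measures 2 and 3 are checked, and the placeholder MAC address are all assumptions made for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class Gateway:
    """Toy model of one backbone gateway (hypothetical, not a batman-adv struct)."""
    name: str
    claimed_macs: set = field(default_factory=set)  # clients claimed via BLA
    dat_cache: dict = field(default_factory=dict)   # local DAT cache: IP -> MAC

    def snoop_ip(self, src_mac, src_ip):
        # Further improvement: learn from any IP frame seen on the mesh
        # soft interface and update the local DAT cache.
        self.dat_cache[src_ip] = src_mac

    def accept_dht_get(self, sender_is_backbone_gw, target_ip):
        # Measure 1: drop a DHT_GET coming from another gateway as long
        # as we cannot answer the forwarded ARP request ourselves.
        if sender_is_backbone_gw and target_ip not in self.dat_cache:
            return False
        return True

    def forward_reply_to_backbone(self, src_mac, src_ip):
        # Measure 3: a local DAT entry for the source MAC means a reply
        # has most likely been delivered already, so drop this one.
        if src_mac in self.dat_cache.values():
            return False
        # Measure 2: only the gateway that claimed the source MAC forwards.
        if src_mac not in self.claimed_macs:
            return False
        # Assumption: the forwarded reply is also learned locally.
        self.dat_cache[src_ip] = src_mac
        return True


# Three gateways, each receiving the same reply from 3 DAT candidates.
# NODE_MAC is a made-up placeholder; only gw1 has claimed the mesh node.
NODE_MAC, NODE_IP = "02:00:00:00:00:65", "192.168.0.101"
gateways = [Gateway("gw1", claimed_macs={NODE_MAC}), Gateway("gw2"), Gateway("gw3")]
forwarded = sum(
    gw.forward_reply_to_backbone(NODE_MAC, NODE_IP)
    for gw in gateways
    for _ in range(3)
)
print(forwarded)
```

In this model only one of the 3 x 3 = 9 received replies reaches the backbone: gw1 forwards the first copy and drops the other two because of measure 3, while gw2 and gw3 drop everything because of measure 2.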

But I warn you in advance: I am not a very skilled kernel programmer, nor do I have any experience in using git ;-)

Regards,
Andreas

"B.A.T.M.A.N" <b.a.t.m.a.n-boun...@lists.open-mesh.org> wrote on 13.03.2015 15:35:53:

> From: Andreas Pape <ap...@phoenixcontact.com>
> To: The list for a Better Approach To Mobile Ad-hoc Networking
> <b.a.t.m.a.n@lists.open-mesh.org>,
> Date: 13.03.2015 15:57
> Subject: Re: [B.A.T.M.A.N.] DAT broken in 2014.4.0?
> Sent by: "B.A.T.M.A.N" <b.a.t.m.a.n-boun...@lists.open-mesh.org>
>
> Hello Antonio,
>
> my mesh nodes use a wlan interface in adhoc mode as the only hard_if in
> bat0. bat0 is bridged to a Linux bridge br0 together with the Ethernet
> interface eth0. The wlan interface ath0 is not part of that bridge. The
> only interface having an ip address assigned is the bridge br0.
>
> As mentioned I use 6 mesh nodes of that described setup of which 3 are
> only accessible via the mesh (eth0 interface not connected to any other
> Ethernet device) and 3 devices are connected with their eth interfaces to
> the same Ethernet switch. The Windows PC is also connected to that same
> switch.
>
> I am using batman-adv 2014.4.0 in combination with a fairly old Linux
> kernel 2.6.32.26 on an embedded device. If I enable BLA and DAT and send a
> ping from the Windows PC to one of the mesh nodes which is not connected
> to the Ethernet backbone, I see a multiplication of the ARP request sent
> by the PC and an even higher number of corresponding ARP replies in the
> backbone network, of which I am not sure how many are really sent
> by the mesh node being the original destination for the ARP request.
> Furthermore I get lots of "bat0: received packet with own address as
> source address" and some "eth0: received ...." kernel log messages in that
> case as soon as the PC sends the first broadcast ARP request (after the
> mentioned arp -d command).
>
> If I disable DAT on all of my 6 devices the ARP telegrams being visible in
> the backbone network look normal to me. There is only one broadcast ARP
> request from the PC and only one ARP reply.
>
> In the meantime I enabled dat debug messages on one of my gateways between
> the ethernet backbone and the mesh. After clearing the ARP cache of the PC
> by the arp -d command, I see the following output of batctl l
>
> Parsing outgoing ARP REQUEST
> ARP MSG : [src: <mac of the PC> - 192.168.0.50 dst: 00:00:00:00:00:00 - 192.168.0.101]
> Entry updated 192.168.0.50 <mac of the PC>
> ARP request replied locally
> ARP Request for 192.168.0.101: fallback prevented
> Parsing incoming ARP REPLY
> ARP MSG: [src: <mac of the mesh node> - 192.168.0.101 dst: <mac of the PC> - 192.168.0.50]
> * encapsulated within a UNICAST packet
> Entry updated: 192.168.0.101 <mac of the mesh node>
> Entry updated: 192.168.0.50 <mac of the PC>
>
> followed by a flood of additional messages of a similar kind. From this
> logging and from what I understood so far about bla and dat from
> open-mesh.org and a short look into the source code I conclude that the
> gateway already knew the mac the PC was looking for ("ARP request replied
> locally") and did not forward it as a broadcast into the mesh.
> Nevertheless the gateway received an ARP reply from the mesh. I guess the
> original ARP request broadcast was forwarded at least by one of the
> remaining two backbone gateways into the mesh and a reply was sent by
> someone else (another mesh node with enabled dat or the mesh node being
> searched for).
>
> This leads me to the question if using dat and a bla setup in combination
> is considered by design and if this should work or if dat is only
> reasonable to be used when a backbone network has a single gateway into
> the mesh (as depicted in the dat wiki on open-mesh.org) only.
>
> Thanks for the support and regards,
> Andreas
>
>
> From: Antonio Quartulli <anto...@meshcoding.com>
> To: The list for a Better Approach To Mobile Ad-hoc Networking
> <b.a.t.m.a.n@lists.open-mesh.org>,
> Date: 13.03.2015 13:22
> Subject: Re: [B.A.T.M.A.N.] DAT broken in 2014.4.0?
> Sent by: "B.A.T.M.A.N" <b.a.t.m.a.n-boun...@lists.open-mesh.org>
>
> Hi Andreas,
>
> so far we don't have any known DAT regression in 2014.4.0.
>
> Could you please provide a more detailed description about your setup
> including how the nodes have their bridges configured and what
> interfaces have been added to batman-adv?
>
> Thanks!
>
> On 13/03/15 08:28, Andreas Pape wrote:
> > Is there a known issue concerning the DAT functionality in batman-adv
> > 2014.4.0?
> >
> > I have got a problem with looping ARP packets / multiplication of ARP
> > packets causing ARP storms in a setup with enabled DAT and BLA. My setup
> > consists of 6 mesh nodes of which 3 are connected to the same backbone
> > network. I connected a PC to the backbone which has an open ssh connection
> > to one of the mesh nodes not connected to the backbone network directly.
> > Using arp -d to delete the ARP cache of the Windows PC forces the PC to
> > send an ARP request to the mesh node used for the ssh session. I can then
> > see multiple copies of that ARP request in the backbone in a wireshark
> > recording and also multiple ARP replies from the mesh node.
> > Sometimes also BLA gratuitous ARP telegrams seem to be looping, but it's
> > easier to force this behaviour with regular ARPs (via arp -d on a PC).
> > Non-ARP telegrams don't seem to be affected and except the waste of
> > bandwidth in the mesh and backbone I don't have problems with normal
> > network communication in the mesh.
> >
> > I could provide the mentioned wireshark recordings made in the backbone
> > network with a switch using port mirroring if someone explains how to
> > provide such a file to the mailing list (I guess simple attachments are
> > not allowed?).
> >
> > If I disable DAT, everything looks fine again, i.e. no duplicated ARP
> > telegrams anymore (except for a few ARP replies from the mesh node which
> > are received twice, which could be a race for claiming the device?).
> >
> > Regards,
> > Andreas
> >
> > ..................................................................
> > PHOENIX CONTACT ELECTRONICS GmbH
> >
> > Sitz der Gesellschaft / registered office of the company: 31812 Bad Pyrmont
> > USt-Id-Nr.: DE811742156
> > Amtsgericht Hannover HRB 100528 / district court Hannover HRB 100528
> > Geschäftsführer / Executive Board: Roland Bent, Dr. Martin Heubeck
> > ___________________________________________________________________
> > Diese E-Mail enthält vertrauliche und/oder rechtlich geschützte
> > Informationen. Wenn Sie nicht der richtige Adressat sind oder diese E-Mail
> > irrtümlich erhalten haben, informieren Sie bitte sofort den Absender und
> > vernichten Sie diese Mail. Das unerlaubte Kopieren, jegliche anderweitige
> > Verwendung sowie die unbefugte Weitergabe dieser Mail ist nicht gestattet.
> > ----------------------------------------------------------------------
> > This e-mail may contain confidential and/or privileged information. If
> > you are not the intended recipient (or have received this e-mail in error)
> > please notify the sender immediately and destroy this e-mail. Any
> > unauthorized copying, disclosure, distribution or other use of the
> > material or parts thereof is strictly forbidden.
> > ___________________________________________________________________
>
> --
> Antonio Quartulli
>
> [Attachment "signature.asc" deleted by Andreas Pape/Phoenix Contact]