Hello Simon,

Simon Wunderlich <s...@simonwunderlich.de> wrote on 19.02.2016 16:06:03:
> > 2. Although having the patch for 1. applied, the backbone gateways send
> > claim frames for the devices of their own backbone in rare cases from time
> > to time. I could send a patch for this, as it is rather easy to check with
> > the help of the local TT table (batadv_is_my_client) whether it is
> > reasonable to send a claim frame for these devices. Again, this patch looks
> > more like a workaround to me, as I also cannot explain what really triggers
> > the generation of these claim frames.
>
> I don't think this is the right way to solve it - if a client has roamed to
> another device in the mesh, a gateway MUST send a claim. However,
> batadv_is_my_client would probably return true, suggesting that the client is
> local although it is not local anymore.
>
> The problem probably needs to be fixed somewhere else.

> > 3. I see again in rare cases looping multicasts for traffic
> > mesh->backbone->mesh. If I look at the bla debug messages in these cases,
> > I see that a backbone gw holding the claim for the source of the multicast
> > frame thinks that the client belonging to the source address has "roamed"
> > from another mesh node into the backbone network although it didn't. From
> > this I conclude that another backbone gw has forwarded the multicast into
> > the backbone although it shouldn't have done this (having found no claim
> > for the client or erroneously also holding a claim). In this case the
> > backbone gateways seem to be out of sync about the actual claim status for
> > that client. This effect only lasts a very short time, as the gateway
> > which found the "roaming" client unclaims it, and within a few milliseconds
> > (depending on the traffic generated by the client) another backbone gw (or
> > the same) claims the client again. Of course the looping of the multicast
> > traffic from the client then stops. In my case the sender of the multicast
> > was the bridge interface br0 of a remote mesh node itself.
> > The bat0 soft interface was added to that bridge. The looping multicast
> > then gave me a "bat0: received packet with own address as source address"
> > message. Furthermore, that bat0 interface sent a claim frame for the MAC
> > of its own bridge (which is obvious, as bat0 received a message from the
> > mesh with a MAC address not claimed yet...). This claim frame then
> > produced another "bat0: received packet ..." message.
> > I currently have no workaround for this 3rd issue, as everything I can
> > imagine to prevent this would break the "roaming client" scenario for
> > bla. I could even live with this problem, as it happens quite seldom and
> > as it is "self-healing", but it tells me that there might be a sync
> > issue. Do you think that my 1st and 2nd point could also relate to the
> > same problem? In the meantime I have looked through the code for hours,
> > but I am not able to find anything that could explain the observed
> > problem.
>
> Hmm ... that sounds strange. I don't know if this is related to your first
> two points, since we are talking about multicast here and the other points
> were about unicast.

In the meantime, after further attempts to debug this, I can say that 2. and
3. are somehow related to each other, as both seem to happen at the same
time. I cannot synchronize the debug messages, as they are recorded by two
different nodes (a backbone gw A and a normal node B), but it looks as if I
first get a packet sent by B which is forwarded into the mesh by gw A again,
leading to an unclaim of B (although B hasn't roamed to the backbone). After
that, A has added B to the local TT. Then A receives a unicast packet from B
via the mesh. After this packet I see various other unicast packets, sent by
devices from the local backbone network of A, erroneously coming in via the
bat0 interface. These packets trigger the generation of claim frames (as
described in 2.).
> I think the main question here is - if the packet came from the mesh, why
> wasn't there a claim frame?
>
> Maybe two questions could help:
> * does this happen in the first minutes after starting/restarting the mesh?
> There is some initial time for bla gateway nodes to detect each other,
> although this should happen quite fast.

I am aware that this might be a little bit unstable after establishing the
mesh. But this normally happens after some minutes of successful operation
of the system (on the order of 10 minutes approximately).

> * Do you have some unusually high amount of broadcast/multicast (e.g.
> streaming, fieldbus protocol, etc)?

No, only normal IPv4 stuff and sometimes some ARPs. I use normal IPv4
communication to the web interfaces of my nodes and some normal pings (a
packet a second). The nodes themselves have IPv6 enabled, although I don't
use IPv6 but only IPv4 addressing. The multicasts in question are for
destination MAC 33:33:00:00:00:01 (IPv6) and are sent every 10 seconds by
the IPv6 Linux stack.

> What might help is to get dumps from the hard interface as well as the bat0
> soft interface and check the corresponding packets when this problem
> happens. Not sure if this helps and how easy it is to capture dumps ...
>
> Cheers,
> Simon
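For the dumps, I would try something like the following on the affected gateway (a sketch - the interface names are my assumptions, with eth0 standing for the hard interface; and if I remember correctly, bla claim frames are sent as special ARP packets, hence the arp filter):

```shell
# On the hard interface: batman-adv frames use ethertype 0x4305.
tcpdump -i eth0 -w hard.pcap ether proto 0x4305

# On the bat0 soft interface: ARP (for the claim frames) plus the IPv6
# all-nodes multicast (33:33:00:00:00:01) seen in the loops.
tcpdump -i bat0 -w soft.pcap 'arp or ether dst 33:33:00:00:00:01'
```

I will try to capture both sides the next time the problem shows up and correlate the timestamps.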