I've run into what sounds like a similar problem, but dove in and found more 
details.  Here's the setup:

-19 nodes running BATMAN-adv 2014.4.0 on OpenWrt Chaos Calmer; various 
hardware (D-Link, BBB, Open-Mesh, WRTnode, TP-Link).
-bat0 using ad-hoc interface on each node
-bat0 bridged (br-lan) with Ethernet
-br-lan on all nodes (except node 1) gets a DHCPv4 address from dnsmasq running 
on node 1
-A few PCs are hard-wired to LAN on node 1, one PC wired to LAN at node 11, all 
other nodes completely standalone
-WAN on node 1 connected to building network - only connection to any outside 
network
-DAT enabled
-BLA disabled

Problem:
After several weeks of uptime, node 1 could no longer SSH to or ping (L3) node 3.  
tcpdump showed the ping arriving at node 3 and node 3 replying, but the reply 
never arrived at node 1.  A Linux PC wired to the LAN at node 1 could still ping 
node 3, and L2 ping (via batctl) worked between nodes 1 and 3.
Further investigation showed two entries for node 1's br-lan MAC in the global 
translation table at node 3.  Secondary entry was correct; primary entry 
pointed to node 4.  Node 4's tables (local and global) were both correct.

root@WifiMesh-03:~# batctl tg | grep c8:d3:a3:70:a9:b0
 * 42:5e:78:f3:50:7e    0   (  2) via c8:d3:a3:70:a9:b0     ( 25)   (0xd7886ba8) [....]
 * c8:d3:a3:70:a9:b0   -1   ( 19) via c8:d3:a3:70:a9:53     ( 19)   (0x10e4856e) [....]
 + c8:d3:a3:70:a9:b0   -1   (  2) via c8:d3:a3:70:a9:b0     ( 25)   (0x352c5b78) [....]
 * 42:5e:78:f3:50:7e   -1   (  2) via c8:d3:a3:70:a9:b0     ( 25)   (0x352c5b78) [....]

(Yes, br-lan and adhoc0 have same MAC on node 1.  Yes, these are D-Link 
routers.)
...50:7e is bat0 at node 1, ...a9:b0 is adhoc0/br-lan at node 1, ...a9:53 is 
adhoc0/br-lan at node 4

This part may be odd: the problem persisted for a few days while I investigated, 
but resolved immediately after I viewed the tables on node 4.  That may be a 
coincidence, though, because viewing the tables did not fix the other affected 
nodes described next.

At same time, same problem existed with two other nodes on the mesh: node 13 
(an OM2P-HS) matched node 3's global table; node 9 (a WRTnode) showed a primary 
entry for node 1's br-lan using yet another originator.  Rebooted nodes to 
resolve.
Problem happened again more recently, but the destination MAC was that of the 
Linux PC mentioned above, attached to the LAN on node 1.  In this case, most 
nodes' global tables showed two entries for that MAC, though the originator in 
the primary entry was not consistent.
 
I plan to test with a more recent version of BATMAN-adv once I standardize on 
one model of hardware for the nodes (should be within the next few months).  In 
the meantime, I plan to watch for secondary entries in the global translation 
tables, since the current configuration should never result in one client being 
accessible through multiple nodes.
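
A minimal sketch of such a watch, assuming the `batctl tg` line layout shown 
earlier in this message: a `*` or `+` marker, the client MAC, a second column 
taken here to be the VLAN ID, the TTVN in parentheses, then `via` and the 
originator MAC.  The function name and the sample constant are hypothetical and 
not part of batctl; other batman-adv versions may print the table differently.

```python
import re
from collections import defaultdict

# One MAC address, e.g. c8:d3:a3:70:a9:b0
MAC = r"(?:[0-9a-fA-F]{2}:){5}[0-9a-fA-F]{2}"

# Assumed layout: "<*|+> <client> <vid> (<ttvn>) via <originator> ..."
# \s+ also matches newlines, so entries wrapped by a mail client still parse.
ENTRY = re.compile(
    rf"([*+])\s+({MAC})\s+(-?\d+)\s+\(\s*\d+\)\s+via\s+({MAC})"
)


def find_multi_originator_clients(tg_output):
    """Return {(client, vid): [(marker, originator), ...]} for every client
    that the global translation table lists under more than one originator."""
    seen = defaultdict(list)
    for marker, client, vid, originator in ENTRY.findall(tg_output):
        seen[(client.lower(), int(vid))].append((marker, originator.lower()))
    return {
        key: entries
        for key, entries in seen.items()
        if len({orig for _, orig in entries}) > 1
    }


# Entries copied from node 3's table as quoted in this message.
SAMPLE = """\
 * 42:5e:78:f3:50:7e    0   (  2) via c8:d3:a3:70:a9:b0     ( 25)   (0xd7886ba8) [....]
 * c8:d3:a3:70:a9:b0   -1   ( 19) via c8:d3:a3:70:a9:53     ( 19)   (0x10e4856e) [....]
 + c8:d3:a3:70:a9:b0   -1   (  2) via c8:d3:a3:70:a9:b0     ( 25)   (0x352c5b78) [....]
 * 42:5e:78:f3:50:7e   -1   (  2) via c8:d3:a3:70:a9:b0     ( 25)   (0x352c5b78) [....]
"""

if __name__ == "__main__":
    for (client, vid), entries in find_multi_originator_clients(SAMPLE).items():
        print(f"{client} (vid {vid}) is listed under multiple originators:")
        for marker, originator in entries:
            print(f"  {marker} via {originator}")
```

On a node, the same function could be fed the text captured from `batctl tg` 
instead of the sample.  Entries are grouped by client MAC and the second column 
together, so a client that appears once per VLAN under the same originator 
(like ...50:7e above) is not reported.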

Thanks,
-Nick

 
-----Original Message-----
From: B.A.T.M.A.N [mailto:b.a.t.m.a.n-boun...@lists.open-mesh.org] On Behalf Of 
Sven Eckelmann
Sent: Monday, May 02, 2016 9:55 AM
To: b.a.t.m.a.n@lists.open-mesh.org
Subject: Re: [B.A.T.M.A.N.] mesh losing internal layer3 connectivity

On Monday 02 May 2016 21:57:49 Karl Auer wrote:
> My apologies up front for a newbie question in this apparently very 
> technical list. If there is a more appropriate list or forum please 
> direct me to it.
> 
> I'm running batman-adv (Chaos Calmer, r47065) on OpenWRT on the GL
> -AR150 platform.

Are you using v2016.1 or some older version of batman-adv? If you use something 
like v2014.4.

What kind of layer 3 are you using? IPv4/IPv6/...?  What is your current 
configuration (for example, have you enabled DAT, BLA, ...)? Did you check 
what exactly goes over the air and what the device (the adhoc one) 
receives/sends? What data is sent/received over the batman-adv devices?

Did you hardcode the MAC address of the batman-adv device or are you letting it 
change to a random value on each device creation? Is the device part of a 
bridge or is the IP configured directly on the batman-adv device?

Are you sure that the conntrack for the masquerade over the mesh isn't broken? 
Why are you masquerading over the mesh anyway?

Kind regards,
        Sven
