Hi,

>  At first everything seemed to work. A node on the one end could ping a
>  node on the other end over the mesh-network. The ping was hopping from
>  node to node as expected.
>
>  But sometimes some paths do not work anymore.
>
>  Some nodes can only reach their direct neighbors via a "normal ping". A
>  ping to a node via one hop does not work. A "batctl ping" does work!
>
>  This only happens to parts of the network and is not permanent. If i
>  wait it will recover, but then the problem appears at another node.

since "batctl ping" works I'd say your mesh works fine - you have a problem in
your higher layers. Maybe a mac address collision or an ARP timeout ?

Can you provide specific examples we can go through ? For instance, provide
the batctl ping output to the neighbor in question, the ping error message
(does it say timeout / host could not be found / etc), a batctl traceroute to
the neighbor in question and the output of the global translation table.

Are you trying to ping a 'fixed' node or a node that is roaming ?

Regards,
Marek

Hello Marek,

thanks for your response. I'll try to give you an example - I'll cut out the parts that are not relevant (I hope).

First i have to correct the version - it seems to be 2011.3 - not 2011.2 as the subject says.

root@fon-58:~# dmesg | grep "batman_adv"
batman_adv: B.A.T.M.A.N. advanced 2011.3.0 (compatibility version 14) loaded

The route from bat49 to bat58 is not working. It should hop via bat59.

root@1043-49:~# batctl o
[B.A.T.M.A.N. adv 2011.3.0, MainIF/MAC: wlan2/9e:0c:6d:ee:7c:ba (bat0)]
  Originator      last-seen (#/255)           Nexthop [outgoingIF]:   Potential nexthops ...
       bat58    3.080s   (168)           bat59 [     wlan2]:       bat59 (168)       bat51 (  0)       bat60 (127)
       bat59    3.130s   (202)           bat59 [     wlan2]:       bat60 (155)       bat51 (134)       bat59 (202)

root@fon-59:~# batctl o
[B.A.T.M.A.N. adv 2011.3.0, MainIF/MAC: wlan2/0a:18:84:80:87:9d (bat0)]
  Originator      last-seen (#/255)           Nexthop [outgoingIF]:   Potential nexthops ...
       bat58    4.740s   (210)           bat58 [     wlan2]:       bat55 (  0)       bat51 (  0)       bat49 (  0)       bat52 (  0)       bat67 (148)       bat53 (120)       bat54 (191)       bat60 (170)       bat58 (210)
       bat49    0.040s   (192)           bat49 [     wlan2]:       bat52 ( 36)       bat55 (  0)       bat67 (106)       bat58 (148)       bat54 (129)       bat53 ( 80)       bat60 (152)       bat49 (192)       bat51 (112)

root@fon-58:~# batctl o
[B.A.T.M.A.N. adv 2011.3.0, MainIF/MAC: wlan2/0a:18:84:81:a1:0d (bat0)]
  Originator      last-seen (#/255)           Nexthop [outgoingIF]:   Potential nexthops ...
       bat49    0.570s   (174)           bat59 [     wlan2]:       bat51 (  4)       bat52 (  9)       bat55 (  5)       bat54 (149)       bat53 (140)       bat67 (156)       bat60 ( 95)       bat59 (174)       bat49 (  0)
       bat59    0.990s   (245)           bat59 [     wlan2]:       bat55 (  8)       bat51 (  3)       bat52 (  8)       bat60 (137)       bat53 (186)       bat67 (217)       bat54 (206)       bat59 (245)

ifconfigs:
root@1043-49:~# ifconfig bat0
bat0      Link encap:Ethernet  HWaddr 9E:90:FC:DC:99:09
          inet addr:192.168.111.49  Bcast:192.168.111.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:9410 errors:0 dropped:0 overruns:0 frame:0
          TX packets:64693 errors:0 dropped:2560 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:1622914 (1.5 MiB)  TX bytes:13322553 (12.7 MiB)
root@1043-49:~# ifconfig wlan2
wlan2     Link encap:Ethernet  HWaddr 9E:0C:6D:EE:7C:BA
          UP BROADCAST RUNNING MULTICAST  MTU:1528  Metric:1
          RX packets:84071 errors:0 dropped:78 overruns:0 frame:0
          TX packets:112446 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:6693056 (6.3 MiB)  TX bytes:20633620 (19.6 MiB)

root@fon-59:~# ifconfig bat0
bat0      Link encap:Ethernet  HWaddr D6:0F:24:F1:43:3C
          inet addr:192.168.111.59  Bcast:192.168.111.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:23493 errors:0 dropped:0 overruns:0 frame:0
          TX packets:5078 errors:0 dropped:8 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:4006301 (3.8 MiB)  TX bytes:726585 (709.5 KiB)
root@fon-59:~# ifconfig wlan2
wlan2     Link encap:Ethernet  HWaddr 0A:18:84:80:87:9D
          UP BROADCAST RUNNING MULTICAST  MTU:1528  Metric:1
          RX packets:298487 errors:0 dropped:748 overruns:0 frame:0
          TX packets:176654 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:18665398 (17.7 MiB)  TX bytes:14335354 (13.6 MiB)

root@fon-58:~# ifconfig bat0
bat0      Link encap:Ethernet  HWaddr C2:90:A3:3B:4E:C9
          inet addr:192.168.111.58  Bcast:192.168.111.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:23159 errors:0 dropped:0 overruns:0 frame:0
          TX packets:7759 errors:0 dropped:2298 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:1115758 (1.0 MiB)  TX bytes:737874 (720.5 KiB)
root@fon-58:~# ifconfig wlan2
wlan2     Link encap:Ethernet  HWaddr 0A:18:84:81:A1:0D
          UP BROADCAST RUNNING MULTICAST  MTU:1528  Metric:1
          RX packets:3475063 errors:0 dropped:1422 overruns:0 frame:0
          TX packets:1601622 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:154462355 (147.3 MiB)  TX bytes:100565468 (95.9 MiB)

this is working:
root@1043-49:~# batctl p bat59
PING bat59 (0a:18:84:80:87:9d) 20(48) bytes of data
20 bytes from bat59 icmp_seq=1 ttl=49 time=6.12 ms

and this:
root@1043-49:~# batctl p bat58
PING bat58 (0a:18:84:81:a1:0d) 20(48) bytes of data
20 bytes from bat58 icmp_seq=1 ttl=48 time=17.61 ms

and this too:
root@1043-49:~# ping 192.168.111.59
PING 192.168.111.59 (192.168.111.59): 56 data bytes
64 bytes from 192.168.111.59: seq=0 ttl=64 time=7.621 ms

this NOT:
root@1043-49:~# ping 192.168.111.58
PING 192.168.111.58 (192.168.111.58): 56 data bytes

the route seems ok:
root@1043-49:~# batctl tr bat58
traceroute to bat58 (0a:18:84:81:a1:0d), 50 hops max, 20 byte packets
 1: bat59 (0a:18:84:80:87:9d)  4.297 ms  31.777 ms  0.938 ms
 2: bat58 (0a:18:84:81:a1:0d)  7.868 ms  4.153 ms  3.352 ms

I see the pings going out on bat49
root@1043-49:~# batctl td wlan2 | grep "ICMP"
13:16:39.026715 BAT bat49 > bat58: UCAST, ttvn 1, ttl 50, IP 192.168.111.49 > 192.168.111.58: ICMP echo request, id 9467, seq 16, length 64

i even see the packet come into bat58:
root@fon-58:~# batctl td wlan2 | grep "ICMP"
13:18:39.715935 BAT bat59 > bat58: UCAST, ttvn 1, ttl 48, IP 192.168.111.49 > 192.168.111.58: ICMP echo request, id 9467, seq 159, length 64

but no reply.

in the bat0-interface i can see the reply:
root@fon-58:~# batctl td bat0 | grep "ICMP"
13:19:15.730081 IP 192.168.111.49 > 192.168.111.58: ICMP echo request, id 9467, seq 195, length 64
13:19:15.732864 IP 192.168.111.58 > 192.168.111.49: ICMP echo reply, id 9467, seq 195, length 64

the arp-table of bat58 looks good:
root@fon-58:~# arp -a
IP address       HW type     Flags       HW address            Mask     Device
192.168.111.49   0x1         0x2         9e:90:fc:dc:99:09     *        bat0
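
To double-check that entry, the hardware address in the ARP table can be compared with the bat0 HWaddr that ifconfig reports on bat49. A minimal Python sketch, assuming both outputs are pasted in as text in the formats shown in this mail; the helper names are made up for illustration:

```python
import re

MAC_RE = r"(?:[0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}"


def arp_mac(arp_output, ip):
    """Return the lower-cased MAC listed for `ip` in the ARP table text, or None."""
    for line in arp_output.splitlines():
        fields = line.split()
        if fields and fields[0] == ip:
            match = re.search(MAC_RE, line)
            return match.group(0).lower() if match else None
    return None


def ifconfig_mac(ifconfig_output):
    """Return the lower-cased HWaddr from `ifconfig <iface>` text, or None."""
    match = re.search(r"HWaddr\s+(" + MAC_RE + ")", ifconfig_output)
    return match.group(1).lower() if match else None


# Text copied from the outputs in this mail.
arp_on_bat58 = "192.168.111.49   0x1         0x2         9e:90:fc:dc:99:09     *        bat0"
bat0_on_bat49 = "bat0      Link encap:Ethernet  HWaddr 9E:90:FC:DC:99:09"

same = arp_mac(arp_on_bat58, "192.168.111.49") == ifconfig_mac(bat0_on_bat49)
print("ARP entry matches bat0 on bat49:", same)
```

Both addresses are lower-cased before the comparison because ifconfig prints upper-case hex while the ARP table prints lower-case.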

the other direction does not work either:
root@fon-58:~# ping 192.168.111.49
PING 192.168.111.49 (192.168.111.49): 56 data bytes

the packets go out on bat58 on the bat0 interface:
root@fon-58:~# batctl td bat0 | grep "ICMP"
13:54:15.727222 IP 192.168.111.58 > 192.168.111.49: ICMP echo request, id 1961, seq 112, length 64

but they are *NOT* visible on the wlan interface:
root@fon-58:~# batctl td wlan2 | grep "ICMP"

A ping from bat58 to bat59 works:
root@fon-58:~# ping 192.168.111.59
PING 192.168.111.59 (192.168.111.59): 56 data bytes
64 bytes from 192.168.111.59: seq=0 ttl=64 time=15.729 ms

and appears in both dumps:
root@fon-58:~# batctl td wlan2 | grep "ICMP"
14:00:50.522992 BAT bat58 > bat59: UCAST, ttvn 1, ttl 50, IP 192.168.111.58 > 192.168.111.59: ICMP echo request, id 1997, seq 3, length 64
14:00:50.530158 BAT bat59 > bat58: UCAST, ttvn 1, ttl 50, IP 192.168.111.59 > 192.168.111.58: ICMP echo reply, id 1997, seq 3, length 64

root@fon-58:~# batctl td bat0 | grep "ICMP"
14:01:05.563243 IP 192.168.111.58 > 192.168.111.59: ICMP echo request, id 1997, seq 18, length 64
14:01:05.567195 IP 192.168.111.59 > 192.168.111.58: ICMP echo reply, id 1997, seq 18, length 64

Why is the ICMP ping from 58 to 49 not sent on the wlan?
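
The comparison above (packet visible on bat0 but missing on wlan2) could be automated by matching the ICMP id and seq across the two dumps. A minimal Python sketch, assuming the "batctl td" line format shown in this mail and that both dumps cover the same time window; the function names are made up for illustration:

```python
import re

ECHO_RE = re.compile(
    r"IP (\S+) > (\S+): ICMP echo (request|reply), id (\d+), seq (\d+)"
)


def icmp_packets(dump):
    """Return a set of (src, dst, kind, id, seq) tuples found in a dump."""
    packets = set()
    for src, dst, kind, ident, seq in ECHO_RE.findall(dump):
        packets.add((src, dst, kind, int(ident), int(seq)))
    return packets


def missing_on_wire(bat0_dump, wlan_dump):
    """Packets seen on bat0 that never show up on the wlan interface."""
    return icmp_packets(bat0_dump) - icmp_packets(wlan_dump)


# The request from bat58 to bat49 quoted above; the wlan2 dump was empty.
bat0_dump = (
    "13:54:15.727222 IP 192.168.111.58 > 192.168.111.49: "
    "ICMP echo request, id 1961, seq 112, length 64"
)
wlan_dump = ""

for packet in sorted(missing_on_wire(bat0_dump, wlan_dump)):
    print("not seen on wlan2:", packet)
```

The same regular expression matches both the plain lines from bat0 and the "BAT ... UCAST" lines from wlan2, because it only looks at the inner IP part of each line.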

Does the "TX-dropped" count in ifconfig mean anything?

I don't understand the output of "batctl tg". If I repeat the command, it gives me different results:

root@fon-58:~# batctl tg |grep "49"
 * 04:11:80:f4:40:c8  (  1) via             bat49     (  1)
root@fon-58:~# batctl tg |grep "49"
 * 0c:6d:ee:7c:ba:01  (  1) via             bat49     (  1)
root@fon-58:~# batctl tg |grep "49"
 * 04:11:80:f4:40:c8  (  1) via             bat49     (  1)
root@fon-58:~# batctl tg |grep "49"
 * 18:84:80:34:51:01  (  1) via             bat49     (  1)
root@fon-58:~# batctl tg |grep "49"
 * 04:30:48:60:6c:dd  (  1) via             bat49     (  1)
root@fon-58:~# batctl tg |grep "49"
 * 04:11:80:f4:40:c8  (  1) via             bat49     (  1)

i have not yet found a device with the mac "04:11:80:f4:40:c8"
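
Since grep "49" matches any line that contains "49", a more targeted check would be to look up one specific client MAC, for example the bat0 address of bat49 (9e:90:fc:dc:99:09). A minimal Python sketch, assuming the "batctl tg" line format shown above (client MAC, then "via" and the originator name); the function name is made up for illustration:

```python
import re

TG_RE = re.compile(
    r"\*\s+((?:[0-9a-fA-F]{2}:){5}[0-9a-fA-F]{2})\s+\(\s*\d+\)\s+via\s+(\S+)"
)


def originators_for(tg_output, client_mac):
    """Return the originators that announce `client_mac` in the table text."""
    wanted = client_mac.lower()
    return [
        orig
        for mac, orig in TG_RE.findall(tg_output)
        if mac.lower() == wanted
    ]


# Two of the lines copied from the outputs above.
tg_output = """\
 * 04:11:80:f4:40:c8  (  1) via             bat49     (  1)
 * 0c:6d:ee:7c:ba:01  (  1) via             bat49     (  1)
"""

print(originators_for(tg_output, "04:11:80:f4:40:c8"))
print(originators_for(tg_output, "9E:90:FC:DC:99:09"))
```

With only the two lines pasted into the sketch, the second lookup returns an empty list; it would have to be run against the full table to say whether the bat0 address of bat49 is announced at all.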

if I look in the logs on bat49, it keeps creating and deleting an entry with this address:
root@1043-49:~# batctl l | grep "40:c8"
[ 9726] Creating new global tt entry: 04:11:80:f4:40:c8 (via 0a:18:84:1e:f6:05)
[ 9726] Deleting global tt entry 04:11:80:f4:40:c8 (via 0a:18:84:1e:f6:05): originator time out

The nodes are fixed and not moving. Do I have to specify them as non-roaming somehow?

The clocks on our devices are not synchronized, so they do not all show the correct / same time. Is that a problem for batman?

Tobias
