[B.A.T.M.A.N.] Drop Packet, single packet to large to fit maximum packet size
Hi,

I very often get the following error in syslog and in the batman log. I currently use batman-experimental rev 1214. The error occurs every second.

[ 79646204] Error - Drop Packet, single packet to large to fit maximum packet size scheduled time 79646157, now 79646202, agg_size 4, next_len 276 !!

Axel: have you read my last email, which I sent directly to you? It was about the hanging batmand problem.

Bye
Stephan
---
Dipl.Informatiker(FH) Stephan Enderlein
Freifunk Dresden
Re: [B.A.T.M.A.N.] duplicate HNAs / certificates
Hi,

I like brainstorming like this.

Me too.

We wanted batmand (and especially its core routing algorithm) to be decentralized and simple. So no central point of control/failure and therefore also no HNA server.

Perhaps there is a different solution. What if everybody broadcasts their HNA, as batman currently works, and batmand gets a list of router IPs from which HNAs are accepted? The bad guy normally has no way to modify the firmware of other routers and cannot tell their batmand to accept his faulty HNA. In this case batman can be updated regularly by a cron job and only needs to check each HNA against its list. A positive and a negative list should be possible. Perhaps the list may contain network ranges (hcl = HNA control list). The firmware of the router may request the list from a server. If a non-accepted HNA is received, batmand may completely ignore the node that is injecting the invalid HNA.

If I understand you correctly, batmand currently completely ignores nodes that are sending the same HNA?

/stephan
---
Dipl.Informatiker(FH) Stephan Enderlein
Freifunk Dresden
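A minimal sketch of the HNA control list idea from this message, assuming IPv4 addresses in host byte order and a first-match-wins list. All names here (hcl_entry, hcl_check, IP4) are hypothetical and are not taken from the batmand sources; denying originators that match no entry is also an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical types for illustration; not part of batmand. */
enum hcl_action { HCL_DENY = 0, HCL_ACCEPT = 1 };

struct hcl_entry {
    uint32_t net;            /* network address, host byte order */
    uint8_t prefix_len;      /* 0..32 */
    enum hcl_action action;  /* positive or negative list entry */
};

/* Build an IPv4 address in host byte order from four octets. */
#define IP4(a, b, c, d) \
    (((uint32_t)(a) << 24) | ((uint32_t)(b) << 16) | \
     ((uint32_t)(c) << 8) | (uint32_t)(d))

/* Netmask for a prefix length; /0 is handled separately to avoid a 32-bit shift. */
static uint32_t prefix_mask(uint8_t prefix_len)
{
    assert(prefix_len <= 32);
    return prefix_len == 0 ? 0 : 0xFFFFFFFFu << (32 - prefix_len);
}

/* Return 1 if addr lies inside net/prefix_len. */
static int in_range(uint32_t addr, uint32_t net, uint8_t prefix_len)
{
    uint32_t mask = prefix_mask(prefix_len);
    return (addr & mask) == (net & mask);
}

/*
 * Decide whether HNAs from the originator orig_addr are accepted.
 * The first matching entry wins; originators matching no entry are denied.
 */
static enum hcl_action hcl_check(const struct hcl_entry *list, size_t n,
                                 uint32_t orig_addr)
{
    for (size_t i = 0; i < n; i++)
        if (in_range(orig_addr, list[i].net, list[i].prefix_len))
            return list[i].action;
    return HCL_DENY;
}
```

With a list that denies 10.12.0.66/32 and then accepts 10.12.0.0/16, the originator 10.12.0.1 would be accepted while 10.12.0.66 and any address outside the range would be rejected. Putting the more specific negative entry first is what lets a single list act as both positive and negative list.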
Re: [B.A.T.M.A.N.] duplicate HNAs / certificates
Hi again,

is there a way to set a TTL value for each HNA that is different from the OGM TTL? If I assume that an HNA internet host is reachable via two nodes (e.g. running icvpn - bgp), batmand currently ignores one of these HNAs and also the node and its traffic (right?). What if we use the TTL value as a metric to decide which HNA is used? In this case both nodes are still present in the network, but you don't have an address conflict.

Batman should not accept an HNA that belongs to the IP ranges of the batman network. So a node with IP 10.12.0.1 cannot send an HNA for 10.12.10.17 and disturb the routing. Perhaps batmand already checks this?

/stephan
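The two suggestions in this message can be sketched as follows. This is only an illustration under stated assumptions: IPv4 addresses in host byte order, a TTL that is decremented once per hop (so a higher remaining TTL means a closer announcer), and hypothetical names (hna_announce, ranges_overlap, prefer_by_ttl) that do not come from batmand. The mesh prefix 10.12.0.0/16 used in the usage note is likewise assumed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical record of one received HNA announcement. */
struct hna_announce {
    uint32_t net;        /* announced network, host byte order */
    uint8_t prefix_len;  /* 0..32 */
    uint32_t orig;       /* originator that announced it */
    uint8_t ttl;         /* remaining TTL when the announcement arrived */
};

/* Build an IPv4 address in host byte order from four octets. */
#define IP4(a, b, c, d) \
    (((uint32_t)(a) << 24) | ((uint32_t)(b) << 16) | \
     ((uint32_t)(c) << 8) | (uint32_t)(d))

/* Netmask for a prefix length; /0 is handled separately to avoid a 32-bit shift. */
static uint32_t mask_of(uint8_t prefix_len)
{
    assert(prefix_len <= 32);
    return prefix_len == 0 ? 0 : 0xFFFFFFFFu << (32 - prefix_len);
}

/* Two CIDR ranges overlap iff they agree on the bits of the shorter prefix. */
static int ranges_overlap(uint32_t net_a, uint8_t len_a,
                          uint32_t net_b, uint8_t len_b)
{
    uint32_t mask = mask_of(len_a < len_b ? len_a : len_b);
    return (net_a & mask) == (net_b & mask);
}

/*
 * Of two announcements for the same HNA, prefer the one with the higher
 * remaining TTL; on a tie the current choice is kept to avoid flapping.
 */
static const struct hna_announce *prefer_by_ttl(const struct hna_announce *cur,
                                                const struct hna_announce *cand)
{
    return cand->ttl > cur->ttl ? cand : cur;
}
```

An HNA for 10.12.10.17/32 overlaps an assumed mesh range of 10.12.0.0/16 and would be rejected, whereas 172.16.10.17/32 would pass. For duplicate announcements, both originators stay in the table and only the route choice depends on the TTL comparison.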
Re: [B.A.T.M.A.N.] policy-routing-script issues
Hi Axel,

sorry for the late response. I missed the email and only found it today. I understand your concerns and agree with them. The reason for all this was to get the information on every gateway change. I need the node IP of the gateway node at the time when the gateway is selected or deselected. At the moment I have modified batmand to call a script in both cases to set up /etc/resolv.conf, which is important to always have a valid DNS server. The main public DNS servers are often not accessible from different provider networks.

/stephan

Hi,

I can understand your need and agree that your idea with the return value makes the whole thing very flexible. But I am not sure if it makes sense to call the policy routing script even more than once, because system calls like that can be quite expensive.

Another problem I see comes with the return values themselves. Currently the policy-routing feature is not aware of the return value of the called script. One reason for this is that the policy-routing script is essentially operating in a loop, waiting to be fed with new commands via a pipe. Therefore the script itself does not terminate after being fed and does not return anything. Another problem is that the C function execv() (which is currently used) does not really support return values except in case of an error. As you suggested, the function system() could be used instead, but the manpage suggests not using this function with suid privileges.

Regarding the cost (in terms of processing time), I did some small experiments on a Netgear WGT634U and a Linksys WRT with OpenWrt, which showed that configuring a route via a bash script (using the ip command) or by using system() to execute the ip command is up to 50 times more expensive than doing it directly (using netlink sockets). In my test, adding and removing 100 route entries using netlink sockets took about 200 ms, while using the ip command it took about 6-10 seconds.
Even just calling a script with system() which does nothing else than return 0 takes about 50 ms per call. Therefore I am not sure if using the policy routing script in a large network and with slow devices is a good idea at all. And if the thread must be blocked during the execution of the script to wait for the return value, it would be even worse. What about introducing the possibility to define that the routing information which is forwarded to the policy routing script is only informative and is still applied by the daemon itself?

ciao,
axel

On Thursday 18 September 2008, Stephan Enderlein (Freifunk Dresden) wrote:

Hi Axel,

thanks for your comments. At the moment I do not have much time to spend on batman development. We had a son two months ago and I am currently enjoying him very much. But I have a good idea concerning the routing script.

The problem is that I would like batman-exp to set up all routes as defined by parameters, but I also want batman-exp to call the script. Batman may call the script twice: once before and once after setting the routes. Everything should depend on the return code of the first call of the script. If the script returns 0, then batman should not set routes and there is also no need to call the script a second time. If the first call of the script returns 1, then batman should set the routes as defined by parameters and afterwards call the script a second time (like pre/post scripts). This allows me to get the routing information without needing to set up all routes by hand. By setting an environment variable you can distinguish whether the script is called as a pre or post script.

This leads to the next solution/patch: you should also add information about the gateway selection/changes/deselection to this script. Together with the modification above you can let batmand set the routes and update your resolv.conf to find the correct router that knows how to resolve DNS requests.
At the moment you have to get the DNS IP from DHCP or enter it as a fixed value. But a fixed value is bad if you build firmware with a simple user interface. Many people don't know how DNS works and what IP they should enter. If you have different ISPs, some DNS servers are not accessible. At the moment I have added my own patch where I call (system(gateway_script)) each time the gateway tunnel is created or deleted. This is working perfectly.

/Stephan

Hi

On Monday 15 September 2008, freif...@ddmesh.de wrote:

Hi,

Just applied your latest patches as well. Thanks for looking over the code. Virgin eyes stumble easier over nasty stuff. :-)

When you find some problems in batman, can you also apply those patches to the batman-experimental branch? At the moment it is running without problems for Freifunk Dresden. But if the network is growing, perhaps some issues may cause problems.

Over time a reasonable part of the code structure of bmx and batman has forked pretty much. Therefore I am
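The pre/post proposal in this thread needs the exit status of the script, which execv() alone does not deliver because it replaces the calling process. A common POSIX pattern is fork(), execv() in the child and waitpid() in the parent. The sketch below assumes a script that terminates on every call (unlike the pipe-fed policy-routing script described in the thread), and the names run_script, daemon_should_set_routes and the environment variable BATMAN_SCRIPT_PHASE are hypothetical. Treating every pre-call result other than 1 as "do not set routes" is also an assumption, since the thread only defines 0 and 1.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Run argv[0] with the given arguments and wait for it to finish.
 * phase ("pre" or "post") is exported to the child as BATMAN_SCRIPT_PHASE,
 * a hypothetical variable name. Returns the exit status (0..255; 127 if
 * execv() failed), or -1 if fork()/waitpid() failed or the child was
 * terminated by a signal.
 */
static int run_script(char *const argv[], const char *phase)
{
    assert(argv != NULL && argv[0] != NULL && phase != NULL);

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: execv() only returns if it failed. */
        setenv("BATMAN_SCRIPT_PHASE", phase, 1);
        execv(argv[0], argv);
        _exit(127);
    }

    /* Parent: block until the child exits, retrying if interrupted. */
    int status;
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR)
            return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

/*
 * Pre-call decision as proposed in the thread: a return code of 1 means
 * the daemon sets the routes itself and then calls the script again as
 * "post"; 0 means the script handled everything. Other values are
 * treated like 0 here, which is an assumption.
 */
static int daemon_should_set_routes(char *const script_argv[])
{
    return run_script(script_argv, "pre") == 1;
}
```

A caller would set the routes and then call run_script(argv, "post") only when daemon_should_set_routes() returns 1. The parent blocks in waitpid() for as long as the script runs, which is exactly the cost raised in the thread, so on slow devices this would have to be weighed against applying the routes directly via netlink.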
Re: [B.A.T.M.A.N.] duplicate HNAs
Hi,

I'm not deeply enough involved in batman routing to find a solution. I hope you can find a way. For now it is not so important. But if one node announces an HNA, a different node that just wants the fun of turning off this node can simply send the same HNA. If you say the first HNA is the right one, then what happens when this node gets the forced disconnection after 24 hours by its internet provider? I think it is difficult to find a solution for this. Perhaps the best is to keep all nodes active but kill the HNA if it is not reachable?

/Stephan
[B.A.T.M.A.N.] batman-exp rev.1154 still using 94% CPU load
Hi,

I still have the problem that batman-exp is hanging at 94% CPU load. Perhaps it has nothing to do with the gateway task. Is it possible that this happens because I run more than one batmand -c at the same time? The current batman revision is 1154. Do you have any idea?

Mem: 15700K used, 14924K free, 0K shrd, 1440K buff, 6600K cached
CPU: 5.8% usr 94.1% sys 0.0% nice 0.0% idle 0.0% io 0.0% irq 0.0% softirq
Load average: 1.62 1.39 0.98

  PID  PPID USER     STAT   VSZ %MEM %CPU COMMAND
24256  1843 root     R     1264  4.1 94.7 /sbin/batmand -s 10.12.0.1 -a 10.12.10.16/28 -r 1 --t 63 --no-unreachable-rule --no-throw-rules --no-prio-rules --one-way-tunnel 1 --two-way-tunnel 0 eth1 tbb /t 1 /i /A
 1156     1 root     S     1264  4.1  0.0 /sbin/batmand -s 10.12.0.1 -a 10.12.10.16/28 -r 1 --t 63 --no-unreachable-rule --no-throw-rules --no-prio-rules --one-way-tunnel 1 --two-way-tunnel 0 eth1 tbb /t 1 /i /A
 1843  1156 root     S     1264  4.1  0.0 /sbin/batmand -s 10.12.0.1 -a 10.12.10.16/28 -r 1 --t 63 --no-unreachable-rule --no-throw-rules --no-prio-rules --one-way-tunnel 1 --two-way-tunnel 0 eth1 tbb /t 1 /i /A
24462 24460 root     S     1216  3.9  0.0 batmand -cb -d2
24857 24815 root     S     1216  3.9  0.0 batmand -c -b -r 1

---
Dipl.Informatiker(FH) Stephan Enderlein
Freifunk Dresden
Re: [B.A.T.M.A.N.] batman-exp rev.1154 still using 94% CPU load
Hi Axel,

Do you feel this problem has arisen with a specific revision (has it been there with rv1069 and before) or has it always been there and your setup has changed?

The first time I saw this was on revision 1105. But a similar problem was already present in revision 972, where I could still call batmand -c -r 3 but not with -d. I cannot say if this is still the same problem. My setup for compiling batmand has not changed. The compile flags I used were:

CFLAGS =-Wall -O1 -DMEMORY_USAGE -DPROFILE_DATA -DDEBUG_MALLOC
LDFLAGS = -lpthread

Today I changed to revision 1171 and use the following flags:

CFLAGS =-Wall -O2 -g -DDEBUG_MALLOC -DMEMORY_USAGE -DPROFILE_DATA
LDFLAGS = -lpthread

I compile batmand within the whiterussian_rc6 openwrt environment. I wanted to create a core file, but it seems that the openwrt kernel does not support it (ulimit -c unlimited, and kill -6 xxx).

Actually not. Unfortunately I'll probably be offline during the next week and can't do much. There is a completely thread-free version waiting to be checked in, then we can see if this helps. But actually I would prefer to nail down the source of the problem...

When do you expect a thread-free version of the batman-experimental branch? I also prefer to solve such problems instead of using new code in the hope that the problem is gone.

Bye
Stephan
---
Dipl.Informatiker(FH) Stephan Enderlein
Freifunk Dresden
[B.A.T.M.A.N.] batmand-exp rev 1146 - hanging again
Hello,

I'm using batman-ex with revision 1146. One thread is running with high CPU load. I cannot make any connection via batmand -c. When I try it, the call also hangs. I had the problem before, but not very often. I cannot say how to reproduce it. The compile/linker flags I use are:

CFLAGS =-Wall -O1 -DMEMORY_USAGE -DPROFILE_DATA -DDEBUG_MALLOC
LDFLAGS = -lpthread

Here is the top output. Most of the time the load is about 90%.

Mem: 14840K used, 15784K free, 0K shrd, 1544K buff, 5936K cached
CPU: 15.8% usr 84.1% sys 0.0% nice 0.0% idle 0.0% io 0.0% irq 0.0% softirq
Load average: 1.04 0.96 0.72

  PID  PPID USER     STAT   VSZ %MEM %CPU COMMAND
20639  1793 root     R     1260  4.1 67.6 /sbin/batmand -s 10.12.0.1 -a 10.12.10.16/28 -r 1 --t 63 --no-unreachable-rule --no-throw-rules --no-prio-rules --one-way-tunnel 1 --two-way-tunnel 0 eth1 tbb /t 1 /i /A
 1106     1 root     S     1260  4.1  0.0 /sbin/batmand -s 10.12.0.1 -a 10.12.10.16/28 -r 1 --t 63 --no-unreachable-rule --no-throw-rules --no-prio-rules --one-way-tunnel 1 --two-way-tunnel 0 eth1 tbb /t 1 /i /A
 1793  1106 root     S     1260  4.1  0.0 /sbin/batmand -s 10.12.0.1 -a 10.12.10.16/28 -r 1 --t 63 --no-unreachable-rule --no-throw-rules --no-prio-rules --one-way-tunnel 1 --two-way-tunnel 0 eth1 tbb /t 1 /i /A
21037 21035 root     S     1212  3.9  0.0 batmand -cb -d2
21283 21241 root     S     1212  3.9  0.0 batmand -c -b -r 1

Regards
Stephan
---
Dipl.Informatiker(FH) Stephan Enderlein
Freifunk Dresden
Re: [B.A.T.M.A.N.] batman-exp still has sometimes around 90% CPU load
Hi,

I will try release 1146. Just for information, I use the following compile/linker switches:

CFLAGS =-Wall -O1 -DMEMORY_USAGE -DPROFILE_DATA -DDEBUG_MALLOC
LDFLAGS = -lpthread

Cheers,
Stephan
[B.A.T.M.A.N.] batman-exp still has sometimes around 90% CPU load
Hi,

I'm using batman-experimental rev1145. I have been testing this release for 2 days. I found that a thread of batmand was consuming around 90% of the CPU resources.

Mem: 15064K used, 15560K free, 0K shrd, 1152K buff, 5176K cached
CPU: 10.0% usr 90.0% sys 0.0% nice 0.0% idle 0.0% io 0.0% irq 0.0% softirq
Load average: 0.98 1.00 0.97

  PID  PPID USER     STAT   VSZ %MEM %CPU COMMAND
 4464  1941 root     R     1216  3.9 84.1 /sbin/batmand -s 10.12.0.1 -a 10.12.10.16/28 -r 1 --t 63 --no-unreachable-rule --no-throw-rules
 1151     1 root     S     1216  3.9  0.0 /sbin/batmand -s 10.12.0.1 -a 10.12.10.16/28 -r 1 --t 63 --no-unreachable-rule --no-throw-rules
 1941  1151 root     S     1216  3.9  0.0 /sbin/batmand -s 10.12.0.1 -a 10.12.10.16/28 -r 1 --t 63 --no-unreachable-rule --no-throw-rules

It was not possible to connect to the running batmand. A call to batmand -c just hangs without printing the current parameters. I had this error in a previous revision, in which you disabled the debug task. Batmand then ran without problems. Have you enabled the debug task again in the current revision? Any idea or solution?

Regards
/Stephan
---
Dipl.Informatiker(FH) Stephan Enderlein
Freifunk Dresden
[B.A.T.M.A.N.] current vis revision (1128) not working with batmand-experimental
Hi,

I had some problems with vis, which was consuming a lot of memory within a short time after running for days. So I decided to use the current revision of vis. But it seems that vis does not process any incoming vis packets from batman-experimental. Do you have any idea how to get it running?

Best regards
Stephan
---
Dipl.Informatiker(FH) Stephan Enderlein
Freifunk Dresden
Re: [B.A.T.M.A.N.] My patches don't reference your mails
Hi,

https://list.open-mesh.net/pipermail/b.a.t.m.a.n/2008-September.txt.gz

If I look at this file, I agree with you that there is no In-Reply-To. But the email header is normally much longer. We can see this if you create a new message for the list (not a reply). The mailing list modifies the emails and removes a lot of header entries. Perhaps the mailing list has a problem with multi-part email. Have you tried to send new emails without attachments/PGP? If that works, then the mailing list has a problem with them and attachments should be avoided. See:

Name: not available
Type: application/pgp-signature
Size: 197 bytes
Desc: This is a digitally signed message part.
Url : http://list.open-mesh.net/pipermail/b.a.t.m.a.n/attachments/20080912/69d789a8/attachment.pgp
From sven.eckelmann at gmx.de Fri Sep 12 16:59:45 2008

/Stephan
---
Dipl.Informatiker(FH) Stephan Enderlein
Freifunk Dresden
Re: [B.A.T.M.A.N.] [PATCH] Send TUNNEL_IP_REQUEST as response of TUNNEL_IP_REQUEST
Hello Sven,

It seems that you are always replying to emails of a different thread. The information you send does not belong to my questions. I have also seen that you reply in other threads with messages that are not related to those threads. Please create a new thread for your PATCH messages instead, as it would keep the threads in the correct order.

Bye
Stephan
[B.A.T.M.A.N.] batman-exp hang on 95% CPU load
Hi,

I currently use batman-experimental rev 1105. Yesterday a node was not reachable. top showed me that batmand was running with almost 100% load. I could not attach to batmand to watch any debug info or states. When calling batmand -cd8 (or others), the call simply hangs without printing anything. The only call I could make was batmand -c, which displayed:

/sbin/batmand [not-all-options-displayed] -r 1 -a 10.12.10.16/28 -a 172.16.10.17/32 eth1 tbb /t

logread did not show any batman message. The memory consumption was also OK. After hard-killing and restarting the daemon, batmand runs normally. Creating a core dump on the WRT was not possible. Have you seen this before?

/Stephan
---
Dipl.Informatiker(FH) Stephan Enderlein
Freifunk Dresden
Re: [B.A.T.M.A.N.] batman-exp hang on 95% CPU load
Hello,

I forgot to add the process list of batmand. The hanging thread was created by 1115. This may give you a hint to find the reason for the hang.

 1114 root     1216 S /sbin/batmand -s 10.12.0.1 -a 10.12.10.16/28 -r 1 --t
 1115 root     1216 S /sbin/batmand -s 10.12.0.1 -a 10.12.10.16/28 -r 1 --t
 1116 root     1216 S /sbin/batmand -s 10.12.0.1 -a 10.12.10.16/28 -r 1 --t
13821 root     1216 R /sbin/batmand -s 10.12.0.1 -a 10.12.10.16/28 -r 1 --t

Mem: 16424K used, 14200K free, 0K shrd, 1588K buff, 7104K cached
CPU: 6.6% usr 93.3% sys 0.0% nice 0.0% idle 0.0% io 0.0% irq 0.0% softirq
Load average: 0.89 1.05 1.00

  PID  PPID USER     STAT   VSZ %MEM %CPU COMMAND
13821  1115 root     R     1216  3.9 94.3 /sbin/batmand -s 10.12.0.1 -a 10.12.10.16/28 -r 1 --t 63 --no-unreachable-rule --no-throw-rul

Regards and thanks
Stephan
---
Dipl.Informatiker(FH) Stephan Enderlein
Freifunk Dresden