Optimizing the protocol settings to match an artificially stable testbed sounds like a bad idea...
Do you expect everything to be perfectly stable and static?

Henning Rogge

On Sat, Dec 20, 2025 at 9:25 AM Benjamin Henrion <[email protected]> wrote:
>
> Hi,
>
> If your network consists of pretty static nodes (fixed routers on the roof),
> you can tune your settings to update the routing less frequently.
>
> Adding and removing a node can take time (30 minutes); it does not need to
> be instant.
>
> Best,
>
> --
> Benjamin Henrion (zoobab)
> Email: zoobab at gmail.com
> Mobile: +32-484-566109
> Web: http://www.zoobab.com
> FFII.org Brussels
> "In July 2005, after several failed attempts to legalise software patents in
> Europe, the patent establishment changed its strategy. Instead of explicitly
> seeking to sanction the patentability of software, they are now seeking to
> create a central European patent court, which would establish and enforce
> patentability rules in their favor, without any possibility of correction by
> competing courts or democratically elected legislators."
>
> On Fri, Dec 19, 2025, 18:18, Valent@MeshPoint <[email protected]> wrote:
>>
>> Hi everyone,
>>
>> I'm working on a fair, reproducible benchmark methodology for comparing
>> mesh routing protocols (Babel, BATMAN-adv, Yggdrasil, and others). Before
>> running the full benchmark, I'd like to get feedback from the Babel
>> community on the methodology.
>>
>> BACKGROUND
>> ----------
>> We're using meshnet-lab (https://github.com/mwarning/meshnet-lab) for
>> testing, which creates virtual mesh networks using Linux network
>> namespaces on a single host. This approach has limitations that we've
>> documented, and I'd appreciate input on whether our methodology properly
>> accounts for them.
>>
>> TEST ENVIRONMENT
>> ----------------
>> Hardware: ThinkPad T14 laptop (12 cores, 16 GB RAM)
>> Software: meshnet-lab with network namespaces
>> Protocols: babeld 1.13.x, batctl/batman-adv, yggdrasil 0.5.x
>>
>> INFRASTRUCTURE LIMITATIONS DISCOVERED
>> -------------------------------------
>> During development, we found significant limitations when testing larger
>> networks:
>>
>> 1. Supernode/Hub Bottleneck
>>
>> When testing real Freifunk topologies (e.g., Bielefeld with 246 nodes),
>> we discovered that star topologies cause test infrastructure failures,
>> not protocol failures.
>>
>> The issue: if a topology has a supernode (hub) connected to 200+ other
>> nodes, the meshnet-lab bridge for that hub receives ~60 hello packets per
>> second from all its neighbors. This causes:
>> - UDP packet loss at the bridge level
>> - Apparent "connectivity failures" that are actually infrastructure
>>   artifacts
>> - False negatives that make protocols look broken when they're not
>>
>> Our solution: cap the maximum node degree at 20 and avoid pure star
>> topologies.
>>
>> 2. Scale Limitations
>>
>> We've validated that 100 nodes is a safe limit where:
>> - CPU stays under 80%
>> - Memory is not a bottleneck
>> - Results are reproducible (variance < 10%)
>>
>> For networks larger than ~250 nodes, single-host simulation becomes
>> unreliable regardless of available RAM. The bottleneck is CPU context
>> switching between namespaces and multicast flooding overhead.
>>
>> 3. 1000+ Node Networks
>>
>> We cannot reliably test 1000+ node networks with this methodology. Any
>> attempt would produce infrastructure artifacts, not protocol
>> measurements. For such scales, distributed testing across multiple
>> physical hosts would be needed.
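
The ~60 packets/second figure in point 1 is easy to sanity-check: with
babeld's default 4-second hello interval, the aggregate hello rate arriving
at a hub's bridge is just the hub's degree divided by the interval. A minimal
sketch (the 4 s interval is babeld's default; the degree values are
illustrative, not measurements from the testbed):

    # Rough estimate of the hello packets/second arriving at a hub's bridge.
    # Assumes periodic hellos at a fixed interval (babeld default: 4 s).
    def hello_rate(degree, hello_interval_s=4.0):
        return degree / hello_interval_s

    print(hello_rate(240))  # 200+-neighbor supernode: 60.0 packets/s
    print(hello_rate(20))   # with the degree cap of 20: 5.0 packets/s

By the same arithmetic, capping the degree at 20 keeps the per-bridge rate an
order of magnitude below the level at which the bridge-level UDP loss was
observed.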
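
On Benjamin's suggestion of updating the routing less frequently on fixed
nodes: in babeld the relevant knobs are the per-interface hello-interval and
update-interval parameters in babeld.conf. A minimal sketch, where the
interface name and the interval values are purely illustrative, not a
recommendation:

    # babeld.conf sketch for mostly-static, fixed routers.
    # Longer intervals mean less protocol traffic but slower reaction
    # to topology changes; eth0 and the values are placeholders.
    interface eth0 hello-interval 12 update-interval 120

How much reaction time such a configuration gives up is exactly what the
failure-recovery and churn scenarios below should quantify.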
>>
>> PROPOSED TEST SUITE
>> -------------------
>> We've documented a methodology with:
>>
>> 6 Topologies:
>> T1: Grid 10x10 (100 nodes, max degree 4)
>> T2: Random mesh (100 nodes, max degree ~10)
>> T3: Clustered/federated (100 nodes, 4 clusters)
>> T4: Linear chain (50 nodes, diameter 49)
>> T5: Small-world Watts-Strogatz (100 nodes)
>> T6: Sampled real Freifunk (80 nodes, degree capped)
>>
>> 5 Validation Tests (run before the benchmarks):
>> V1: 3-node sanity check
>> V2: Scaling ladder (find the breaking point)
>> V3: Consistency check (reproducibility)
>> V4: Resource monitoring
>> V5: Bridge port audit
>>
>> 8 Benchmark Scenarios:
>> S1: Steady-state convergence
>> S2: Node failure recovery
>> S3: Lossy link handling (tc netem)
>> S4: Mobility/roaming simulation
>> S5: Network partition and merge
>> S6: High churn (10% of nodes cycling)
>> S7: Traffic under load (iperf3)
>> S8: Administrative complexity (subjective)
>>
>> QUESTIONS FOR THE COMMUNITY
>> ---------------------------
>> 1. Missing tests?
>>    Are there scenarios important for Babel that we should add?
>>
>> 2. Unrealistic tests?
>>    Should we skip any tests that don't make sense for real-world
>>    evaluation?
>>
>> 3. Babel-specific considerations?
>>    Are there configuration parameters or behaviors we should specifically
>>    measure?
>>
>> 4. Large-scale alternatives?
>>    Does anyone have experience with distributed mesh testing across
>>    multiple physical hosts? How do you handle the coordination and
>>    measurement?
>>
>> 5. Known limitations?
>>    Are there known Babel behaviors at scale that we should document
>>    upfront?
>>
>> INITIAL RESULTS
>> ---------------
>> Our initial tests with babeld show:
>> Grid, 100 nodes: 100% connectivity, ~14 s convergence
>> Chain, 50 nodes: 100% connectivity, ~5 s convergence
>> Small-world, 100 nodes: 100% connectivity, ~12 s convergence
>>
>> These results validate that the test infrastructure works correctly for
>> Babel at this scale.
>>
>> FULL METHODOLOGY DOCUMENT
>> -------------------------
>> The complete methodology document is attached.
>>
>> I'd appreciate any feedback, suggestions, or concerns before we proceed
>> with the full benchmark.
>>
>> Thanks,
>> Valent.
>>
>>
>> ------ Original Message ------
>> From "Juliusz Chroboczek" <[email protected]>
>> To "Linus Lüssing" <[email protected]>
>> Cc "Valent Turkovic" <[email protected]>; [email protected]
>> Date 19.12.2025. 12:45:16
>> Subject Re: [Babel-users] Restarting MeshPoint – seeking advice on
>> routing for crisis/disaster scenarios
>>
>> >> There's also l3roamd, predating sroamd:
>> >>
>> >> https://github.com/freifunk-gluon/l3roamd
>> >
>> > That's right, I should have mentioned it. I'll be sure to give proper
>> > credit if I ever come back to sroamd.
>> >
>> > For the record, sroamd is based on a combination of the ideas in l3roamd
>> > and in the PMIPv6 protocol, plus a fair dose of IS-IS.
>> >
>> > -- Juliusz

_______________________________________________
Babel-users mailing list
[email protected]
https://alioth-lists.debian.net/cgi-bin/mailman/listinfo/babel-users
