In today's session there was a disagreement between Yaron and Tero about how likely it is that messages are missing.

Let's assume our cluster has tunnels with 10,000 peers, and that we do a Liveness Check (DPD) every 20 seconds (the strongSwan default). Also, let's assume that we sync every 0.1 seconds, and detect a failure after 10 missed messages (1 second).

We have 500 liveness checks every second (10,000 / 20), which is 50 in each 0.1-second sync period. At the time of the failure, on average 25 messages have not yet been synced (half of the liveness checks in a 0.1-second period).

So with Tero's way, 25 IKE SAs are mismatched (the other 475 didn't get a response, so the peer is still retransmitting). That's 25 IKE SAs torn down out of 10,000 (0.25%, or 1 in 400).

Whether this is acceptable or not is vendor-specific, and YMMV.

Yoav
_______________________________________________
IPsec mailing list
IPsec@ietf.org
https://www.ietf.org/mailman/listinfo/ipsec
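
P.S. For anyone who wants to plug in their own numbers, the estimate above can be sketched in a few lines of Python. This is only a back-of-the-envelope model under the assumptions stated in this message (liveness checks spread evenly over time, failure equally likely at any point in a sync period). The function and parameter names are made up for illustration and do not come from any IKE implementation.

```python
def expected_mismatched_sas(peers, dpd_interval_s, sync_interval_s):
    """Rough expected number of IKE SAs left mismatched after a failover.

    Hypothetical helper for illustration only. Assumes liveness checks are
    spread evenly over time and that the failure happens, on average,
    halfway through a sync period.
    """
    # Liveness checks handled per second across all peers.
    checks_per_second = peers / dpd_interval_s
    # Liveness checks that fall inside one sync period.
    checks_per_sync = checks_per_second * sync_interval_s
    # On average, half of a sync period's checks are not yet synced.
    return checks_per_sync / 2


# Numbers from this message: 10,000 peers, DPD every 20 s, sync every 0.1 s.
peers = 10_000
mismatched = expected_mismatched_sas(peers, dpd_interval_s=20, sync_interval_s=0.1)
fraction = mismatched / peers

print(f"mismatched IKE SAs: {mismatched:.0f}")
print(f"fraction of all SAs: {fraction:.2%}")
```

With the figures above this gives the same 25 SAs and 0.25%. Changing the sync interval or the DPD interval shows how quickly the estimate moves for a different deployment.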