On Wed, 16 Jun 2021, Miguel Ponce Antolin wrote:
> Some questions came to mind about the upgrade option:
> - Is it still necessary to separate the rightsubnets? And do you create them in different files? I understood that you create them in the same conf file.
I would first try the upgrade and see if your problem remains. If it does, try separating the conns. It can be in 1 file.
> - Is the ikelifetime and salifetime rekeying issue still a problem in version 4.4-1? I think it is recommended anyway.
It is not so much a code problem as an interop or configuration problem. Tweaking the lifetimes changes which side decides to rekey first. That can work around implementation bugs. We are not aware of libreswan having a bug here. I just gave you two methods that ensure either libreswan always rekeys, or libreswan never rekeys. That usually works around any bugs in the other implementation.

Paul
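
For reference, the two methods could be sketched as the fragment below. The lifetime values are the ones suggested in the earlier message quoted further down in this thread; everything else in the conn is omitted. Only one variant would be used at a time, and which side really rekeys first is an assumption that depends on the lifetimes configured on the Cisco, which are not shown in this thread.

    # Variant A: long lifetimes on the libreswan side,
    # so most likely the Cisco will trigger all the rekeys
    conn vpn
        ikelifetime=24h
        salifetime=8h

    # Variant B: short lifetimes on the libreswan side,
    # so libreswan will likely always initiate the rekeys
    conn vpn
        ikelifetime=2h
        salifetime=1h

Because the per-subnet conns pull in conn vpn via also=, changing the lifetimes in that one place should apply to all of them.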
Thanks again,
Best Regards!

On Tue, 15 Jun 2021 at 17:40, Paul Wouters (<[email protected]>) wrote:

On Tue, 15 Jun 2021, Miguel Ponce Antolin wrote:

> I have been suffering a random problem with libreswan v3.25 when connecting an AWS EC2 Instance running Libreswan and a Cisco ASA on the other end.

Is it possible to test v4.4? We have rpms built on download.libreswan.org/binaries/

Specifically, with the many subnets you are likely needing this fix from 4.4:

* IKEv2: Connections would not always switch when needed [Andrew/Paul]

But the changelog between 3.25 and 4.4 is huge. There might be other items you need too.

Alternatively, you can try and split up your subnetS into different conns, eg:

conn vpn
    type=tunnel
    authby=secret
    # use auto=ignore, will be read in via also= statements
    auto=ignore
    left=%defaultroute
    leftid=xxx.xxx.xxx.120
    leftsubnets=xxx.xxx.xxx.80/28
    right=xxx.xxx.xxx.45
    rightid=xxx.xxx.xxx.45
    # no rightsubnet= here
    # don't use this with more than one subnet...
    leftsourceip=xxx.xxx.xxx.92
    ikev2=insist
    ike=aes256-sha2;dh14
    esp=aes256-sha256
    keyexchange=ike
    ikelifetime=28800s
    salifetime=28800s
    dpddelay=30
    dpdtimeout=120
    dpdaction=restart
    encapsulation=no

conn vpn-1
    also=vpn
    auto=start
    rightsubnet=10.subnet.1.0/22

conn vpn-2
    also=vpn
    auto=start
    rightsubnet=10.subnet.2.0/20

[...]

conn vpn-18
    also=vpn
    auto=start
    rightsubnet=10.subnet.18.9/32

This uses a slightly different code path to get all the tunnels loaded and active.

> We tried to "force" a reconnect using the ping command to an IP in various rightsubnets, but when the problem is active we are continuously seeing this
> kind of logs:

That would be hacky and not really solve race conditions.

> Jun 11 11:17:25.795153: "vpn/1x15" #221: message id deadlock? wait sending, add to send next list using parent #165 unacknowledged 1 next message
> id=63 ike exchange window 1

Note that this is a bit of a concern.
You can only have one IKE message outstanding, and this indicates that the Cisco might not be answering that outstanding message, so the only thing libreswan can do is wait longer or restart _everything_ related to that IKE SA, which means all tunnels. We did reduce the chance of message id deadlock at some point in the past with our pending() code, so again testing with an upgraded libreswan would be a useful test.

> Is there any troubleshooting we could do in order to know where the rekey request is lost or why it is not trying to rekey at all when this problem is
> active?

Depending on what the issues are, you can try to ensure either libreswan or Cisco is always the rekey initiator by tweaking the ikelifetime and salifetime. Eg try ikelifetime=24h with salifetime=8h and most likely Cisco will trigger all the rekeys. Or use ikelifetime=2h and salifetime=1h to make libreswan likely always initiate the rekeys.

Paul

--
Miguel Ponce Antolín
Sistemas
_______________________________________________
Swan mailing list
[email protected]
https://lists.libreswan.org/mailman/listinfo/swan
