On Wed, 16 Jun 2021, Miguel Ponce Antolin wrote:

Some questions that came to me with the upgrade option:
- Is it still necessary to separate the rightsubnets? And do you create them in
  different files? I understood that you create them in the same conf file.

I would first try the upgrade and see if your problem remains. If it
does, try separating the conns. It can be in 1 file.

- Are the ikelifetime and salifetime settings for rekeying still a problem in
  version 4.4-1? I think it is recommended anyway.

It is not so much a code problem as a configuration interop problem.
Tweaking the lifetimes changes which side decides to rekey first. That
can work around implementation bugs. We are not aware of libreswan
having a bug here. I just gave you two methods that ensure either
libreswan always rekeys, or libreswan never rekeys. That usually works
around any other implementation bugs.

Paul

Thanks again,

Best Regards!


On Tue, 15 Jun 2021 at 17:40, Paul Wouters (<[email protected]>) wrote:
      On Tue, 15 Jun 2021, Miguel Ponce Antolin wrote:

      > I have been suffering a random problem with libreswan v3.25 when
      > connecting an AWS EC2 Instance running Libreswan and a Cisco ASA on the
      > other end.

      Is it possible to test v4.4? We have rpms built on
      download.libreswan.org/binaries/

      Specifically, with the many subnets you are likely needing this fix
      from 4.4:

      * IKEv2: Connections would not always switch when needed [Andrew/Paul]

      But the changelog between 3.25 and 4.4 is huge. There might be other
      items you need too.

      Alternatively, you can try to split up your subnets into different
      conns, e.g.:


              conn vpn
                  type=tunnel
                  authby=secret
                  # use auto=ignore, will be read in via also= statements
                  auto=ignore
                  left=%defaultroute
                  leftid=xxx.xxx.xxx.120
                  leftsubnets=xxx.xxx.xxx.80/28
                  right=xxx.xxx.xxx.45
                  rightid=xxx.xxx.xxx.45
                  # no rightsubnet= here
                  # don't use this with more than one subnet...
                  # leftsourceip=xxx.xxx.xxx.92
                  ikev2=insist
                  ike=aes256-sha2;dh14
                  esp=aes256-sha256
                  keyexchange=ike
                  ikelifetime=28800s
                  salifetime=28800s
                  dpddelay=30
                  dpdtimeout=120
                  dpdaction=restart
                  encapsulation=no

              conn vpn-1
                  also=vpn
                  auto=start
                  rightsubnet=10.subnet.1.0/22

              conn vpn-2
                  also=vpn
                  auto=start
                  rightsubnet=10.subnet.2.0/20

              [...]

              conn vpn-18
                  also=vpn
                  auto=start
                  rightsubnet=10.subnet.18.9/32


      This uses a slightly different code path to get all the tunnels
      loaded and active.

      > We tried to "force" a reconnect using the ping command to an IP in
      > various rightsubnets, but when the problem is active we are
      > continuously seeing this kind of logs:

      That would be hacky and not really solve race conditions.

      > Jun 11 11:17:25.795153: "vpn/1x15" #221: message id deadlock? wait
      > sending, add to send next list using parent #165 unacknowledged 1 next
      > message id=63 ike exchange window 1

      Note that this is a bit of a concern. You can only have one IKE message
      outstanding, and this indicates that the Cisco might not be answering
      that outstanding message, and so the only thing libreswan can do is
      wait longer or restart _everything_ related to that IKE SA, so that
      means all tunnels. We did reduce the chance of message id deadlock
      at some point in the past with our pending() code, so again testing
      with an upgraded libreswan would be a useful test.

      > Is there any troubleshooting we could do in order to know where the
      > rekey request is lost or why it is not trying to rekey at all when
      > this problem is active?

      Depending on what the issues are, you can try to ensure either libreswan
      or Cisco is always the rekey initiator by tweaking the ikelifetime and
      salifetime. Eg try ikelifetime=24h with salifetime=8h and most likely
      Cisco will trigger all the rekeys. Or use ikelifetime=2h and
      salifetime=1h to make libreswan likely always initiate the rekeys.
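
      As a rough sketch of those two options, using only the values named
      above and the shared "conn vpn" section from the earlier example. It
      assumes the Cisco side keeps its current lifetimes, which are not
      shown in this thread, so which peer actually rekeys first still
      needs to be confirmed in the logs:

              conn vpn
                  # ... other options unchanged ...
                  # Option A: long lifetimes, so Cisco most likely
                  # triggers all the rekeys
                  ikelifetime=24h
                  salifetime=8h
                  # Option B: short lifetimes, so libreswan most likely
                  # initiates the rekeys (use instead of A, not with it)
                  # ikelifetime=2h
                  # salifetime=1h

      Since vpn-1 through vpn-18 pull this section in via also=vpn,
      changing it once should apply to every tunnel.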

      Paul



--

Miguel Ponce Antolín.
Systems    ·    +34 670 360 655



_______________________________________________
Swan mailing list
[email protected]
https://lists.libreswan.org/mailman/listinfo/swan