On Sun, Mar 10, 2019 at 02:41:35PM +0100, Michał Koc wrote:

> W dniu 10.03.2019 o 08:10, Otto Moerbeek pisze:
> > On Sat, Mar 09, 2019 at 11:19:11PM +0100, Michał Koc wrote:
> > 
> > > W dniu 09.03.2019 o 12:43, Otto Moerbeek pisze:
> > > > On Sat, Mar 09, 2019 at 12:10:34PM +0100, Michał Koc wrote:
> > > > 
> > > > > W dniu 09.03.2019 o 08:15, Otto Moerbeek pisze:
> > > > > > On Fri, Mar 08, 2019 at 12:03:25PM +0100, Michał Koc wrote:
> > > > > > 
> > > > > > > Hi all,
> > > > > > > 
> > > > > > > We have a triple redundant vpn gateway setup with sasyncd running 
> > > > > > > and tons
> > > > > > > of tunnels, about 1000 flows.
> > > > > > > 
> > > > > > > Looking at the graph of memory usage, you can clearly see that 
> > > > > > > something is
> > > > > > > sucking up the memory.
> > > > > > > 
> > > > > > > The graph can be viewed here: https://pasteboard.co/I4sjzQ8.jpg
> > > > > > > 
> > > > > > > Looking at the ps, sasyncd shows huge memory consumption:
> > > > > > > 
> > > > > > > USER         PID       %CPU  %MEM   VSZ          RSS        TT 
> > > > > > > STAT
> > > > > > > STARTED       TIME       COMMAND
> > > > > > > _isakmpd 33560  0.0       17.0        699264   708508 ?? S
> > > > > > > 26Feb19        6:58.81  /usr/sbin/sasyncd
> > > > > > > 
> > > > > > > It only happens on the master node. Slaves do not show such a 
> > > > > > > behavior.
> > > > > > > 
> > > > > > > There is nothing about sasyncd in the logs.
> > > > > > > 
> > > > > > > After sasyncd restart memory consumption is minimal, but tends to 
> > > > > > > grow.
> > > > > > > 
> > > > > > > Is it normal ? or am I missing something ?
> > > > > > > 
> > > > > > > Best regards
> > > > > > > M.K.
> > > > > > > 
> > > > > > This is not normal. You could try to run with -vv to see if some 
> > > > > > error
> > > > > > path is taken that triggers a leak.
> > > > > > 
> > > > > >     -Otto
> > > > > > 
> > > > > Should I look for something specific ?
> > > > > 
> > > > > The log grows pretty fast and it looks like it could contain some 
> > > > > security
> > > > > data which I wouldn't like to post online.
> > > > > 
> > > > > The statistics of the log(about 2 hours) looks like this:
> > > > > carp_init:       1
> > > > > config:       7
> > > > > monitor_get_pfkey_snap:       4
> > > > > monitor_loop:       1
> > > > > net:       1
> > > > > net_connect:       3
> > > > > net_ctl:       4
> > > > > net_disconnect_peer:       3
> > > > > net_handle_messages:       2
> > > > > net_queue:   91780
> > > > > net_read:      10
> > > > > net_send_messages:   39192
> > > > > pfkey_send_flush:       4
> > > > > pfkey_snapshot:    6832
> > > > > timer_add:      19
> > > > > timer_run:      18
> > > > > 
> > > > > Best regards
> > > > > M.K.
> > > > > 
> > > > Just the counts does not reveal anything. I did a quick audit of the
> > > > memory allocation logic of sasyncd and did not spot an error. If you
> > > > do not want to post the logs, you'll neeed to analyze them yourself.
> > > > This requires matching the log lines to the code and tracking where
> > > > stuff gets allocated and deallocated. Some digging could reveal the 
> > > > error.
> > > > 
> > > > I used to run sasyncd, but I do no longer. Settig up a test env is
> > > > quite some work I do not have time for.
> > > > 
> > > >         -Otto
> > > > 
> > > First of all, thank You for your time. I know it's one of the most 
> > > valuable
> > > resource.
> > > 
> > > We have done some analysis and we have found the problem.
> > > 
> > > The problem is that the very master machine exists as a peer in it's 
> > > config.
> > > The purpose of this is to make all of the config files to be as similar as
> > > possible for all of the members of the cluster.
> > > 
> > > Removing it from peers fixes the problem.
> > > 
> > > So there are three main roads we can follow:
> > > 1. Fix sasyncd to recognize self and handle it properly (a result)
> > > 2. It should be mentioned in manual, not to set self as a peer (an excuse)
> > > 3. We can change our internal config handling (no one else benefits)
> > > 
> > > What would You recommend as a next step ? we will do much to follow road 
> > > 1,
> > > but we might need help, eg. code review and some guidance to meet OpenBSD
> > > needs
> > > 
> > > Furthermore if we follow road 1, is there any chance to get the reviewed,
> > > correct, accepted fix into the tree ?
> > > 
> > > Best regards
> > > M.K.
> > > 
> > Good you found the problem.
> > 
> > As for number 1, in general it is hard to determine if an IP is your
> > own (or redirects to yourself), especially since network interfaces
> > can be added and changed dynamically. But a starup check against
> > getifaddrs(3) would catch the most obvious I'm my own peer cases.
> > 
> > A dynamic form of loop detection would be nice, it would require
> > adding a hostid or equivalent. That is not normally set these days and
> > also would change the wire format of the messages.
> > 
> > But even if you start work on code, a manual page change warning
> > against this would be a good thing.
> > 
> > If you post diffs to tech@ with a proper explanation they will be
> > picked up and reviewed, either by me or somee other developer.
> > 
> > BTW, I noticed there are order dependencies in sasyncd.cnf that are
> > not mentioned in the man page. Also the example in
> > src/etc/examples/sasyncd.conf does not work if you just uncomment the
> > lines because of those order dependencies. So there's more room for
> > improvement in the man page and example config.
> > 
> >     -Otto

> Correct me please if I am wrong, but in my opinion vpn gateways are usually
> static setups, because of design, security, network, documentation, politics
> and other reasons
> 
> So startup check should at least give a clue when something goes wrong. Of
> course setting up some kind of host id would be best solution, we will get
> back to it in less busy time.
> 
> The patch below makes the following changes:
> 1. adds startup check for peers being listed in local addresses
> 2. adds proper log messages in cases of peer being local address or
> duplicate
> 3. fixes duplicate handling, which was incorrect
> 4. changes name of "duplicate" variable to "valid" to allow it to used in
> more cases than just duplicate
> 
> Best regards
> M.K.

Thanks for working on this,

Some comments:

1. I do not like lots of code in yacc rules. Better make the validation
a separate function.

2. You are comparing string representations. One is from the config
file and one is produced by inet_ntop(), that one is in canonical
form, but the first does not have to be. So you might miss equivalent
addresses.  Probably better convert the ones form the config file with
inet_pton() compare those to the addresses returned by getifaddrs()
using memcmp().

3. Why use malloc for fixed size? A char buf[INET6_ADDRSTRLEN] will do,
(but see 2).

4. I would make it clear in the log message you're skipping the "self" peer.

        -Otto

> 
> Index: conf.y
> ===================================================================
> RCS file: /cvs/src/usr.sbin/sasyncd/conf.y,v
> retrieving revision 1.19
> diff -u -p -r1.19 conf.y
> --- conf.y    9 Apr 2017 02:40:24 -0000       1.19
> +++ conf.y    10 Mar 2019 13:39:04 -0000
> @@ -30,8 +30,10 @@
>  #include <sys/types.h>
>  #include <sys/stat.h>
>  #include <sys/socket.h>
> +#include <arpa/inet.h>
>  #include <ctype.h>
>  #include <fcntl.h>
> +#include <ifaddrs.h>
>  #include <stdio.h>
>  #include <stdlib.h>
>  #include <string.h>
> @@ -172,17 +174,50 @@ setting         : INTERFACE STRING
>               | PEER STRING
>               {
>                       struct syncpeer *peer;
> -                     int              duplicate = 0;
> +                     int              valid = 1;
> +                     struct ifaddrs  *ifap = 0, *ifa;
> +                     char            *ifaddrname = 0;
> +                     
> +                     if (getifaddrs(&ifap) == 0) {
> +                             for (ifa = ifap; ifa && valid; ifa = 
> ifa->ifa_next) {
> +                                     if (!ifa->ifa_addr)
> +                                             continue;
> +
> +                                     switch (ifa->ifa_addr->sa_family) {
> +                                     case AF_INET:
> +                                             ifaddrname = 
> malloc(INET_ADDRSTRLEN);
> +                                             if (ifaddrname)
> +                                                     inet_ntop(AF_INET, 
> &((struct sockaddr_in *)ifa->ifa_addr)->sin_addr, ifaddrname, 
> INET_ADDRSTRLEN);
> +                                             break;
> +                                     case AF_INET6:
> +                                             ifaddrname = 
> malloc(INET6_ADDRSTRLEN);
> +                                             if (ifaddrname)
> +                                                     inet_ntop(AF_INET6, 
> &((struct sockaddr_in6 *)ifa->ifa_addr)->sin6_addr, ifaddrname, 
> INET6_ADDRSTRLEN);
> +                                             break;
> +                                     }
> +
> +                                     if(ifaddrname) {
> +                                             if (strcmp($2, ifaddrname) == 
> 0) {
> +                                                     log_msg(2, "config: 
> local peer %s", $2);
> +                                                     valid = 0;
> +                                             }
> +                                             free(ifaddrname);
> +                                             ifaddrname = 0;
> +                                     }
> +                             }
> +                             freeifaddrs(ifap);
> +                     }
> -                     for (peer = LIST_FIRST(&cfgstate.peerlist); peer;
> -                          peer = LIST_NEXT(peer, link))
> -                             if (strcmp($2, peer->name) == 0) {
> -                                     duplicate++;
> +                     if (valid)
> +                         for (peer = LIST_FIRST(&cfgstate.peerlist); peer;
> +                              peer = LIST_NEXT(peer, link))
> +                                 if (strcmp($2, peer->name) == 0) {
> +                                     log_msg(2, "config: duplicate peer %s", 
> $2);
> +                                     valid = 0;
>                                       break;
> -                             }
> -                     if (duplicate)
> -                             free($2);
> -                     else {
> +                                 }
> +
> +                     if (valid) {
>                               peer = calloc(1, sizeof *peer);
>                               if (!peer) {
>                                       log_err("config: calloc(1, %lu) "
> @@ -191,10 +226,11 @@ setting         : INTERFACE STRING
>                                       YYERROR;
>                               }
>                               peer->name = $2;
> -                     }
> -                     LIST_INSERT_HEAD(&cfgstate.peerlist, peer, link);
> -                     cfgstate.peercnt++;
> -                     log_msg(2, "config: add peer %s", peer->name);
> +                             LIST_INSERT_HEAD(&cfgstate.peerlist, peer, 
> link);
> +                             cfgstate.peercnt++;
> +                             log_msg(2, "config: add peer %s", peer->name);
> +                     } else
> +                             free($2);
>               }
>               | LISTEN ON STRING af port
>               {
> 

Reply via email to