Hi all,

I have recently been running into a problem where some of my l4xnat
farms become unreachable every few days and stay unreachable until I
restart the affected farm(s). In the farmguardian logs I found messages like:

iptables: Resource temporarily unavailable.
iptables: Index of deletion too big.

These messages appear at the moment a backend changes status and
farmguardian restarts the farm.

Sometimes the problem farms were then actually shown as down in the
web interface.

In zenloadbalancer.log I found messages like:
some date -  -  -  - running 'Start write false'
...
some date -  -  -  - running /sbin/iptables ....
some date -  -  -  - last command failed!
...

The more farms, and thus farmguardian instances, are running, the more
often the problem occurs. It is even more likely when multiple farms
target the same backend server.

My conclusion is that when the backends of multiple farms change
status at the same time, multiple farmguardian instances restart
multiple farms at once. Many iptables commands then run concurrently,
and some of them fail.

As a quick workaround I modified the farmguardian script to prevent
multiple farmguardian instances restarting farms at the same time
(using file locking). The problem seems to have disappeared now.

...
use Fcntl qw(:flock);
...
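        if ($type eq "l4xnat"){
          # Serialize farm restarts across all farmguardian instances
          if (open(FGA,">/var/run/farmguardian_lock")){
            my $count = 1;
            # Try up to 10 times, one second apart, to get the exclusive lock;
            # after that the restart proceeds even without the lock
            while (!flock(FGA,LOCK_EX|LOCK_NB) && ($count < 11) ) {
              print("Restart farm $farmname blocked, waiting... $count\n");
              sleep(1);
              $count++;
            }
          }
          # Backend is back: mark it "up" and restart the farm
          &_runFarmStop($farmname,"false");
          &setFarmBackendStatus($farmname,$j,"up");
          &_runFarmStart($farmname,"false");
          # Closing the handle releases the lock
          close(FGA);
        }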
...
        if ($type eq "l4xnat"){
          # Same locking as above, this time for a backend going down
          if (open(FGA,">/var/run/farmguardian_lock")){
            my $count = 1;
            while (!flock(FGA,LOCK_EX|LOCK_NB) && ($count < 11) ) {
              print("Restart farm $farmname blocked, waiting... $count\n");
              sleep(1);
              $count++;
            }
          }
          # Backend failed its check: mark it "fgDOWN" and restart the farm
          &_runFarmStop($farmname,"false");
          &setFarmBackendStatus($farmname,$j,"fgDOWN");
          &_runFarmStart($farmname,"false");
          # Closing the handle releases the lock
          close(FGA);
        }

I guess a better place to do something like this would be in
farms_functions.cgi... Maybe the developers can have a look into this?
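
To illustrate what I mean by doing it in one central place, here is a
minimal sketch of a shared locking helper. The name with_restart_lock,
its arguments and its return convention are my own invention and not
part of Zen Load Balancer. Unlike my quick workaround above, it does not
go ahead without the lock when the wait times out; it returns false and
leaves that decision to the caller. Whether that is the right behaviour
for farmguardian is an open question.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);

# Hypothetical helper (not an existing Zen Load Balancer function):
# run $code while holding an exclusive lock on $lockfile, waiting at
# most $timeout seconds for the lock. Returns 1 if $code ran, and a
# false value if the lock could not be acquired in time.
sub with_restart_lock {
    my ($lockfile, $timeout, $code) = @_;

    # Append mode creates the file if needed and never truncates it.
    open(my $fh, '>>', $lockfile) or die "Cannot open $lockfile: $!";

    my $waited = 0;
    until (flock($fh, LOCK_EX | LOCK_NB)) {
        if ($waited >= $timeout) {
            close($fh);
            return;    # lock still busy, nothing was run
        }
        sleep(1);
        $waited++;
    }

    # Use eval so the lock is released even if $code dies.
    my $ok  = eval { $code->(); 1 };
    my $err = $@;
    flock($fh, LOCK_UN);
    close($fh);
    die $err unless $ok;
    return 1;
}

# Possible use inside farmguardian (left as a comment because the farm
# functions are only defined inside Zen Load Balancer itself):
#
# with_restart_lock("/var/run/farmguardian_lock", 10, sub {
#     &_runFarmStop($farmname, "false");
#     &setFarmBackendStatus($farmname, $j, "up");
#     &_runFarmStart($farmname, "false");
# }) or print("Restart farm $farmname skipped, lock busy\n");
```

With something like this, both the "up" and the "fgDOWN" branch would
call the same code instead of repeating the locking loop.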

Kind Regards,
Stefan

_______________________________________________
Zenloadbalancer-support mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/zenloadbalancer-support
