forgot to add bugs@openbsd.org

Subject: Re: relayd's icmp check only works for a small number of hosts
Date: 2016-09-02 17:50
From: Remi Locherer <remi.loche...@relo.ch>
To: Reyk Floeter <r...@openbsd.org>

On 2016-09-02 16:51, Reyk Floeter wrote:
On Fri, Aug 19, 2016 at 04:31:10PM +0200, Remi Locherer wrote:
>Synopsis:   relayd's icmp check only works for a small number of hosts
>Category:   relayd
>Environment:
        System      : OpenBSD 5.9
        Details     : OpenBSD 5.9 (GENERIC.MP) #10: Wed Aug 3 13:46:07 CEST 2016 r...@stable-59-amd64.mtier.org:/binpatchng/work-binpatch59-amd64/src/sys/arch/amd64/compile/GENERIC.MP

        Architecture: OpenBSD.amd64
        Machine     : amd64

>Description:
relayd says 70 out of 104 hosts are not reachable via icmp. But ping on the same host where relayd runs can reach all hosts with an RTT below 1 ms.

In the logs I see "210ms,icmp read timeout". But in relayd.conf a timeout
of 1000 is set.


All checks have to be completed before the next check interval.  With
that many tests, it can happen that relayd is not finished
sending/receiving all individual checks before the next interval;
missed hosts will be marked down.

You could try the following:

1. Increase the global interval.

With this in relayd.conf:
interval 300
timeout 60000

relayd successfully checked 36 hosts and reported icmp response times between 4 and 6 ms. After 60s relayd reports "icmp read timeout" for the other 68 hosts.

Sep 2 17:29:13 lb2 relayd[31358]: host 192.168.63.48, check icmp (60008ms,icmp read timeout), state unknown -> down, availability 0.00%

While it's true that a few hosts are down, the majority of hosts answer
my manual pings within 0.600 ms.


2. Instead of testing the same hosts multiple times, you can use the
"parent" keyword to inherit the state from a tested host, e.g.

table <foo> {
        10.1.1.1
}

table <bar> {
        10.1.1.1 parent 1
}

table <baz> {
        10.1.1.1 parent 1
}

I'll try this one. It's a bit tricky since I can only reference the
parent table by index and not by name. My relayd.conf is generated
and deployed with Ansible.
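
For a generated relayd.conf, the numeric ids could be computed by the
generator instead of being maintained by hand. The following is only a
sketch, not relayd or Ansible code: render_tables is a hypothetical helper,
and it assumes that host ids are assigned sequentially starting at 1, in
the order the host entries appear in the config, and that entries carrying
"parent" are counted too. The ids it produces should be checked against
what relayctl reports before relying on them.

```python
def render_tables(tables):
    """Render relayd table blocks from a list of (name, [addresses]).

    ASSUMPTION: relayd numbers every host entry sequentially from 1 in
    config order, including entries that use "parent".
    """
    first_id = {}  # address -> id of the first entry with that address
    next_id = 1    # id the next host entry is assumed to receive
    lines = []
    for name, hosts in tables:
        lines.append(f"table <{name}> {{")
        for addr in hosts:
            if addr in first_id:
                # Already checked elsewhere: inherit state from that entry.
                lines.append(f"        {addr} parent {first_id[addr]}")
            else:
                first_id[addr] = next_id
                lines.append(f"        {addr}")
            next_id += 1  # every entry consumes an id (assumed)
        lines.append("}")
        lines.append("")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_tables([
        ("foo", ["10.1.1.1"]),
        ("bar", ["10.1.1.1"]),
        ("baz", ["10.1.1.1"]),
    ]))
```

With the three tables from the example above, this emits the first
10.1.1.1 entry without a parent and the other two with "parent 1". It only
holds if these tables are the first host definitions in the file, since
any hosts defined earlier would shift the numbering.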
