That's a good point, but I don't think there needs to be any non-fatal
error logging. There are three situations during startup.
1) bind() succeeds.
2) bind fails for a reason which won't change - fatal error.
3) bind fails for a reason which may change - startup and wait until it
does change and try again.
The canonical example of 3) is the one I gave before,
--listen-address=1.2.3.4 but not local interface has address 1.2.3.4.
The intention is that when a new interface comes up with address 1.2.3.4
then a new socket will be created and bound. This is long after startup,
so the only option if it fails then is to log the event.
If the only situation where we want to wait is the one above, then the
solution to to make EADDRNOTAVAIL at startup the only one where we keep
waiting, and all the others are fatal. I think when I originally wrote
this I wasn't sure if that was the only non-fatal error which is why the
code is as it is.
This is not a complete solution to your original problem of enforcing
only one dnsmasq daemon process in any case. For example if you
configure a single listen-address which doesn't exist on the machine,
then you can start as many dnsmasq processes as you like and they'll all
start up and be waiting for the interface with that address to be
created. Once it is, all will try and bind it, and all but one will
fail, but they'll all still exist. Managing daemon processes is really
the job of sysvinit or systemd, but the authors of the bug seem to sant
protection from just running the binary from the command line.
TLDR;
We either pick a set of errors which are Ok to continue (EADDRNOTAVAIL,
what others?) and fail fatally at startup for all others, or we pick a
set of errors to fail fatally at startup (EADDRINUSE, EACCESS, what
others?) and continue for all others.
Cheers,
Simon.
On 23/11/2023 11:13, Petr Menšík wrote:
To fix problem with multiple instances correctly refusing running on the
same machine and namespaces, yes, it would be sufficient.
But I think part of the problem is hiding all problems during startup
and not showing them at all, in any source. I think that is okay for
EADDRNOTAVAIL to not be printed. But I think in other cases we want at
least warning somewhere. This way you also get exact error message
printed. For example selinux policy hardening may prevent your process
to listen on port 53, even though it has NET_BIND_SERVICE.
With my modification it will print errors for listeners used. Note
10.1.2.3 is hidden at that phase. You would not know it without strace
analysis. I expect there can be different errors, for example running
out of file descriptors or memory. Hiding something non-standard
happened during startup is quite a bad design. Only some kinds of errors
are okay during startup.
$ sudo -u nobody fedora-like/dnsmasq -d --bind-dynamic
--listen-address=10.1.2.3
dnsmasq: failed to create listening socket for 127.0.0.1: Permission denied
dnsmasq: failed to create listening socket for 127.0.0.1: Permission denied
dnsmasq: failed to create listening socket for ::1: Permission denied
dnsmasq: failed to create listening socket for ::1: Permission denied
dnsmasq: process is missing required capability NET_BIND_SERVICE
# Compare this with:
$ sudo -u nobody fedora-like/dnsmasq -d --bind-interfaces
--listen-address=10.1.2.3
dnsmasq: failed to create listening socket for 127.0.0.1: Permission denied
I think we want any errors printed, even if they are not made fatal.
Except carefully chosen type of errors, which are expected and would
raise just false alarms. Not sure how to trigger other types of errors,
but I am sure I would like to see them, even if they did not cause the
process to die. That is why I have used more complicated approach, which
should print everything unexpected, even when dnsmasq is not stopped. In
order to investigate you first have to know something unusual has happened.
On 23. 11. 23 0:29, Simon Kelley wrote:
Isn't this sufficient to fix the problem?
Not calling die() when bind-dynamic is set is intended to handle the
case that bind returns EADDRNOTAVAIL because you've configured
--listen-address=1.2.3.4 but there's not a local interface with that
address. dnsmasq runs anyway in the expectation that such an interface
will appear in the future and a socket will be bound then.
I don't think there's a die()/syslog() conflict at all.
diff --git a/src/network.c b/src/network.c
index ca9fada..db1d528 100644
--- a/src/network.c
+++ b/src/network.c
@@ -924,7 +924,7 @@ static int make_sock(union mysockaddr *addr, int
type, int dienow)
{
/* failure to bind addresses given by --listen-address at
this point
is OK if we're doing bind-dynamic */
- if (!option_bool(OPT_CLEVERBIND))
+ if (!option_bool(OPT_CLEVERBIND) || errno == EADDRINUSE)
die(s, daemon->addrbuff, EC_BADNET);
}
else
Cheers,
Simon.
On 22/11/2023 19:27, Petr Menšík wrote:
Hello everyone,
I have received error report RHEL-16398 [1], which I think makes
sense to fix even in the lastest version. I believe it allows
non-intentional another instance running without error. What is
worse, it does not even show any warning that initialization is
incomplete.
Of course the problem at start is those errors happen in time when no
log is available. I think that can be fixed easily by using stderr at
that time. That is patch #1.
Second makes EADDRNOTAVAIL bind errors still hidden, but prints all
other errors at least to stderr. On a system with systemd that should
make it present in journalctl -u dnsmasq anyway. EADDRINUSE is made
fatal, because that would not be usually handled by new addresses
added later. If there is a need to start another dnsmasq instance
without TCP listeners, I think that should be specified more
explicitly. Makes EADDRINUSE fatal the same way as with
--bind-interfaces.
Would you find any other errors, which should be hidden or made
fatal? What would you think of those changes?
1. https://issues.redhat.com/browse/RHEL-16398
_______________________________________________
Dnsmasq-discuss mailing list
Dnsmasq-discuss@lists.thekelleys.org.uk
https://lists.thekelleys.org.uk/cgi-bin/mailman/listinfo/dnsmasq-discuss