On 2023-04-13 21:01 +02, "Peter J. Philipp" <p...@delphinusdns.org> wrote: > On Mon, Apr 10, 2023 at 10:17:08AM +0200, Peter J. Philipp wrote: >> On Sat, Apr 08, 2023 at 08:28:05PM +0200, Peter J. Philipp wrote: >> /cut >> > Apr 6 14:43:05 polarstern unwind[97893]: bad packet: too large: 65552 - >> > pool.ntp.org. IN AAAA >> > Apr 6 14:46:25 polarstern unwind[97893]: bad packet: too large: 65552 - >> > pool.ntp.org.mainrechner.de. IN AAAA >> >> /cut >> > Any thoughts? >> > >> > Best Regards, >> > -peter >> >> OK I traced this log down, by mere accident. It is a condition when the >> forwarder isn't listening on TCP (or UDP). Perhaps we need a better syslog >> for this, I'll see later today if I can produce a patch. >> >> Best Regards, >> -peter > > Alright, I have a patch together with the first patch that I sent. I tested > this by killing my forwarder and pinging centroid.eu (my site), the error > message is correct in my view. But you're the judge of that.
Sorry, this doesn't make any sense. I could never reproduce the -1 or > 65535 case reliably, I see it once in a while, but I can't trigger it. I also can't reproduce it with your instructions. As far as I can work out these weird answer_len numbers come from libworker_event_done_cb() in libworker.c, specifically: (*cb)(cb_arg, rcode, (buf?(void*)sldns_buffer_begin(buf):NULL), (buf?(int)sldns_buffer_limit(buf):0), sec, why_bogus, was_ratelimited); So you are comparing against sldns_buffer_limit(buf), that can just be anything, you can't derive from that "remote server had no listening port". I get "rcode: 2, answer_len: 128" or something else sensibly if I point unwind at a non-responsive forwarder. Since you can reproduce it, can you try this diff please and report back? Does it hit that code path when answer_len is -1 or when answer_len is 65552? diff --git libunbound/libunbound/libworker.c libunbound/libunbound/libworker.c index 11bf5f9db55..6bb34001ee9 100644 --- libunbound/libunbound/libworker.c +++ libunbound/libunbound/libworker.c @@ -680,6 +680,9 @@ libworker_event_done_cb(void* arg, int rcode, sldns_buffer* buf, context_query_delete(q); lock_basic_unlock(&ctx->cfglock); + log_warn("%s: rcode: %d, answer_len: %d\n", __func__, + rcode, buf?(int)sldns_buffer_limit(buf):0); + if(!cancelled) { /* call callback */ int sec = 0; > > ----------------> > Apr 13 20:53:26 polarstern unwind[73363]: terminating > Apr 13 20:53:27 polarstern resolvd[81113]: rebuilding: new unwind > Apr 13 20:54:12 polarstern unwind[16252]: remote nameserver had no listening > port: centroid.eu. IN A > ^Z[1] + Suspended tail -f /var/log/daemon > root@polarstern# dddctl start > starting delphinusdnsd > root@polarstern# fg > tail -f /var/log/daemon > <---------------- > (after this restart of delphinusdnsd forwarder this pinged fine, no more logs) > > The patch is below my .signature, > > Best Regards, > -peter > > Index: resolver.c > =================================================================== > RCS file: /cvs/src/sbin/unwind/resolver.c,v > retrieving revision 1.158 > diff -u -p -u -r1.158 resolver.c > --- resolver.c 8 Feb 2023 08:01:25 -0000 1.158 > +++ resolver.c 13 Apr 2023 18:51:01 -0000 > @@ -954,13 +954,19 @@ resolve_done(struct uw_resolver *res, vo > running_res = --rq->running; > > if (answer_len < LDNS_HEADER_SIZE) { > - log_warnx("bad packet: too short"); > + if (answer_len != -1) > + log_warnx("bad packet: too short"); > goto servfail; > } > > if (answer_len > UINT16_MAX) { > - log_warnx("bad packet: too large: %d - %s", answer_len, > - query_imsg2str(query_imsg)); > + if (answer_len == 65552) { > + log_warnx("remote nameserver had no listening port: %s", > + query_imsg2str(query_imsg); > + } else { > + log_warnx("bad packet: too large: %d - %s", answer_len, > + query_imsg2str(query_imsg)); > + } > goto servfail; > } > answer_header->answer_len = answer_len; > -- In my defence, I have been left unsupervised.