Re: [SOLVED?] BIND: managed-keys-zone: Unable to fetch DNSKEY set '.': timed out

2023-03-13 Thread Casey Deccio


> On Mar 13, 2023, at 4:14 PM, local10  wrote:
> 
> Mar 13, 2023, 21:42 by recovery...@enotuniq.net:
> 
>> Well, it was worth checking.
>> 
>> 
>> Next idea is somewhat more complicated.
>> 
>> Install tcpdump.
>> Run:
>> tcpdump -pni any -s0 -w /tmp/dns.pcap -c 30 udp port 53 or tcp port 53
>> Bounce BIND, wait for a minute at least.
>> Do some DNS queries. One or two will do.
>> Interrupt tcpdump unless it completes by itself.
>> Post dns.pcap.
>> 
> 
> 
> Strangely, the issue resolved itself without me having to do anything. Am 
> really puzzled as to what it was. Perhaps the internet provider suddenly 
> started to block DNS queries but then allowed them again?

Hard to tell without further data, but it's possible.

> If so, why did dig's message say that there was "communications error to 
> 127.0.0.1#53: timed out"? It really gives an impression that dig was failing 
> to connect 127.0.0.1 port 53, on which bind was running.
> 
> # dig www.yahoo.com 
> ;; communications error to 127.0.0.1#53: timed out
> ;; communications error to 127.0.0.1#53: timed out
> ...
> 
> Maybe someone will shed some light on this.

This one is a little misleading.  The fact is that BIND tries really hard to 
resolve your name, trying all sorts of alternate servers and fallbacks to 
account for timeouts, DNSSEC validation failures, and more.  Sometimes that can 
take a really long time.  In one of the outputs you provided previously, you 
showed "timed out" followed by SERVFAIL.  Those are symptoms of this behavior: 
the first query timed out while the resolver was still trying alternatives, and 
the second query returned the cached (SERVFAIL) failure.
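The pattern can be sketched with a toy resolver cache (purely illustrative -- 
this is not BIND's actual logic, and the TTL value here is made up):

```python
# Hypothetical sketch of failure caching: a failed resolution is cached
# briefly, so a retry within that window returns SERVFAIL immediately
# instead of spending time retrying upstream servers again.
SERVFAIL_TTL = 30  # seconds; an assumed value, not BIND's default

cache = {}  # name -> (status, expiry time)

def resolve(name, now, upstream_ok):
    status, expiry = cache.get(name, (None, 0))
    if now < expiry:
        return status              # cached failure: instant SERVFAIL
    if not upstream_ok:
        cache[name] = ('SERVFAIL', now + SERVFAIL_TTL)
        return 'TIMEOUT'           # first query spends time on fallbacks
    return 'NOERROR'

print(resolve('www.yahoo.com', 0, False))   # slow first try: TIMEOUT
print(resolve('www.yahoo.com', 1, False))   # cached failure: SERVFAIL
```

The first call stands in for the query that timed out; the second shows why a 
retry comes back quickly with SERVFAIL.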

Casey



Re: BIND: managed-keys-zone: Unable to fetch DNSKEY set '.': timed out

2023-03-13 Thread Casey Deccio

> On Mar 13, 2023, at 12:08 AM, local10  wrote:
> 
> I have  a local caching DNS server that was working fine for a long time but 
> today, all of a sudden, it stopped resolving queries.
> 
> More info: https://pastebin.com/iW5YeXgS
> 
> Any ideas? Thanks

Based on what I saw in the logs, your resolver is having trouble reaching the 
internet.  It shows problems with both the priming query (./NS) and the trust 
query (./DNSKEY).  Could you try running the following?

$ dig +norec @198.41.0.4 . NS
$ dig +norec @2001:503:ba3e::2:30 . NS
$ dig +norec @198.41.0.4 . DNSKEY
$ dig +norec @2001:503:ba3e::2:30 . DNSKEY

These manually send the same queries to the internet that your resolver is 
attempting.
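For illustration, here is roughly what those queries look like on the wire -- a 
sketch that builds (but does not send) the DNS message dig +norec transmits for 
". NS".  The transaction ID is an arbitrary placeholder:

```python
import struct

# Build -- without sending -- a ". NS" query like `dig +norec . NS`:
# a 12-byte header with all flags clear (RD off, matching +norec), then
# one question: the root name (a single empty label), type NS, class IN.
def build_root_ns_query(txid=0x1234):
    # header: ID, flags, QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=0
    header = struct.pack('!HHHHHH', txid, 0x0000, 1, 0, 0, 0)
    question = b'\x00' + struct.pack('!HH', 2, 1)  # root; NS=2, IN=1
    return header + question

q = build_root_ns_query()
print(len(q))  # 12-byte header + 5-byte question = 17
```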

Cheers,
Casey

Re: locating blocked port

2023-01-31 Thread Casey Deccio



> On Jan 31, 2023, at 8:05 AM, Haines Brown  wrote:
> 
> I have an  application that refuses to start because  its port is 
> blocked. But I have difficulty knowing what port it is

I would try strace, which shows you all system calls being made.  In this case, 
it is probably bind() that is returning an error.

strace -e trace=%net java -jar /usr/local/share/JabRef/JabRef-3.2.jar

Or

strace -e trace=%net java -jar /usr/local/share/JabRef/JabRef-3.2.jar 2>&1 | 
grep bind

For example:

$ cat test.py 
#!/usr/bin/env python3

import socket
s = socket.socket()
try:
   s.bind(('0.0.0.0', 56))
except:
   pass
$ python3 test.py # doesn't print any output
$ strace -e trace=%net python3 test.py 2>&1 | grep bind
bind(3, {sa_family=AF_INET, sin_port=htons(56), sin_addr=inet_addr("0.0.0.0")}, 
16) = -1 EACCES (Permission denied)

The value of sin_port is what you are looking for.
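The EACCES above came from binding a privileged port as an unprivileged user.  
A related failure, EADDRINUSE, can be provoked without special privileges; this 
small sketch shows the same class of error from the caller's side that strace 
exposes from the outside:

```python
import errno
import socket

# Occupy a free port, then try to bind it again: the second bind() fails
# with EADDRINUSE -- the same kind of bind() failure strace reveals when
# an application's port is "blocked".
s1 = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s1.bind(('127.0.0.1', 0))           # port 0: the kernel picks a free port
port = s1.getsockname()[1]

s2 = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
try:
    s2.bind(('127.0.0.1', port))    # conflicts with s1's binding
    err = None
except OSError as e:
    err = e.errno

print(err == errno.EADDRINUSE)      # → True
s1.close()
s2.close()
```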

> How do I know from this what port the java application tried to use?
> 
> I try:
> 
> $ strings $(which jabref) | wc -l
>   56
> 

strings might be helpful (maybe?), but in this case, you are piping it to wc 
-l, which is simply counting the number of printable character sequences that 
were found in jabref.  If that also happens to be the port number, then it is 
coincidental.

> So I try:
> 
>   $ sudo ss -pt state listening 'sport = :56'
>   Recv-Q   Send-QLocal Address:Port   Peer Address:Port Process
> 
> This seems a null return. Does this mean jabref is not using port 
> 56?

Well, it tells you that nothing (including jabref) is listening on TCP port 56, 
but it won't tell you about why something *failed* to listen.  See strace above.

Casey



Re: DNSSEC working but SSHFP reported as insecure

2022-12-04 Thread Casey Deccio



> On Dec 3, 2022, at 12:37 PM, Andre Rodier  wrote:
> 
> On Sat, 2022-12-03 at 12:09 -0700, Casey Deccio wrote:
>> 
>> It could be that your default DNS resolver is not validating.  ssh simply 
>> looks at the result of the DNSSEC validation
>> provided by your default resolver [1], so if it's not validating then you 
>> will never get "secure".  In the example in
>> your original post, you queried 1.1.1.1, which is a validating resolver.  
>> But your default resolver might yield
>> different results.  To test, do the following:
>> 
>> $ dig +dnssec main.homebox.world sshfp
>> 
>> And look for the presence of the "ad" (authenticated data) flag in the 
>> response.
>> 
>> Casey
>> 
>> [1] https://github.com/openssh/openssh-portable/blob/master/dns.c#L230
> 
> Thanks for your suggestion.
> 
> I was already using 1.1.1.1 in /etc/resolv.conf, when I had the issue.
> 
> I am running Debian Bullseye.

Even so, please invoke the dig command above and check that the "ad" flag is 
present in the response.

If you see the "ad" flag there, then run your ssh command again, but before you 
do, start something like the following before you invoke your ssh command:

sudo tcpdump -n -w ssh-dns.pcap port 53

(Modify according to your setup...)

Then open ssh-dns.pcap in Wireshark and inspect the DNS response for the 
presence of the "ad" flag.

Here is my output from running ssh on my (nearly) stock debian bullseye system:

casey@rome:~$ ssh -o VerifyHostKeyDNS=ask -o UpdateHostKeys=no 
casey-test@main.homebox.world
The authenticity of host 'main.homebox.world 
(2001:19f0:7402:86e:5400:4ff:fe38:b9b4)' can't be established.
ECDSA key fingerprint is SHA256:AMS/SI0c2IA2hufsFiTcE61/q7JYA5TtNUT6FRz1dd4.
Matching host key fingerprint found in DNS.
Are you sure you want to continue connecting (yes/no)? no
Host key verification failed.
casey@rome:~$ ssh -o VerifyHostKeyDNS=yes -o UpdateHostKeys=no 
casey-test@main.homebox.world
casey-test@main.homebox.world: Permission denied (publickey).

You can see that when I used VerifyHostKeyDNS=yes, it clearly trusted the host, 
based on the SSHFP record.

Casey


Re: DNSSEC working but SSHFP reported as insecure

2022-12-03 Thread Casey Deccio

> On Dec 3, 2022, at 9:22 AM, Andre Rodier  wrote:
> 
>> ssh -o VerifyHostKeyDNS=yes main.homebox.world
> 
> Yes, this is the default option in my ssh/config file.
> 
> I tried on the command line as well, but same result:


It could be that your default DNS resolver is not validating.  ssh simply looks 
at the result of the DNSSEC validation provided by your default resolver [1], 
so if it's not validating then you will never get "secure".  In the example in 
your original post, you queried 1.1.1.1, which is a validating resolver.  But 
your default resolver might yield different results.  To test, do the following:

$ dig +dnssec main.homebox.world sshfp

And look for the presence of the "ad" (authenticated data) flag in the response.
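If you want to see where that flag lives: AD is a single bit in the DNS 
header's flags field.  This sketch checks for it in fabricated header bytes 
(the transaction ID and flag values are made up for the example):

```python
import struct

AD_BIT = 0x0020  # "authenticated data" bit in the DNS header flags field

def has_ad_flag(dns_message):
    # Header starts with ID (16 bits), then flags (16 bits:
    # QR/opcode/AA/TC/RD/RA/Z/AD/CD/RCODE).
    _, flags = struct.unpack('!HH', dns_message[:4])
    return bool(flags & AD_BIT)

# Minimal fabricated 12-byte headers: flags 0x81a0 sets QR, RD, RA, AD;
# 0x8180 is the same response without AD.
validated = struct.pack('!HHHHHH', 0x1234, 0x81a0, 1, 1, 0, 0)
unvalidated = struct.pack('!HHHHHH', 0x1234, 0x8180, 1, 1, 0, 0)
print(has_ad_flag(validated), has_ad_flag(unvalidated))  # → True False
```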

Casey

[1] https://github.com/openssh/openssh-portable/blob/master/dns.c#L230

Re: DNSSEC working but SSHFP reported as insecure

2022-12-03 Thread Casey Deccio

> On Dec 3, 2022, at 8:30 AM, Andre Rodier  wrote:
> 
> Where am I making a mistake, please ?

The DNSSEC looks fine.  That is, there is a secure chain from the root to the 
SSHFP record (see below).

Have you tried adding the VerifyHostKeyDNS=yes option?

ssh -o VerifyHostKeyDNS=yes main.homebox.world

Casey


[1]
$ dnsviz probe -a . -A -R sshfp main.homebox.world | dnsviz print
No global IPv6 connectivity detected
Analyzing .
Analyzing world
Analyzing homebox.world
Analyzing main.homebox.world
. [.]
  [.]  DNSKEY: 8/20326/257 [.], 8/18733/256 [.]
  [.]RRSIG: ./8/20326 (2022-11-30 - 2022-12-21) [.]
world [.] [.]
  [.]  DS: 8/13081/2 [.]
  [.]RRSIG: ./8/18733 (2022-12-03 - 2022-12-16) [.]
  [.]  DNSKEY: 8/13081/257 [.], 8/5436/256 [.], 8/60063/256 [.]
  [.]RRSIG: world/8/13081 (2022-12-01 - 2022-12-22) [.]
homebox.world [.] [.]
  [.]  DS: 13/8704/2 [.], 13/19691/2 [.], 13/45407/2 [.]
  [.]RRSIG: world/8/5436 (2022-12-02 - 2022-12-23) [.]
  [.]  DNSKEY: 13/19691/257 [.], 13/45407/256 [.], 13/8704/257 [.]
  [.]RRSIG: homebox.world/13/8704 (2022-11-24 - 2022-12-15) [.]
  [.]RRSIG: homebox.world/13/19691 (2022-11-24 - 2022-12-15) [.]
main.homebox.world
  [.]  SSHFP: 1 2 
7cf3701693baeb8406fd0db7182e01bbadc1f639ba4fc2ca7224116cc9d237dc, 2 1 
eb09a2823e9d8a51ef7fe3260e0890a56924da6f, 3 1 
142f2a695a2e06cabab6e19800657c3f0b28301d, 4 1 
35d346e05d1351a78868e033ebe736c3030d3551, 4 2 
052736c5f2e6dce7d41aeeb7f41dbce01d19d2ac9e9ccffab79fb37ab85ce335, 2 2 
c3cdd443653530c94c1b90511f3e07ce8fe1fcbbcd60e37729543a577b0a5a44, 3 2 
4f6dd59b7c671e9fe3265057aef76bc448aef75a4fce35513c17c62e9bb9c8f6, 1 1 
ea89f6c8c8eda5e29e913f4448a816a19624d125
  [.]RRSIG: homebox.world/13/45407 (2022-11-24 - 2022-12-15) [.]



Re: Bug - remote DNS monitoring

2022-09-13 Thread Casey Deccio


> On Aug 30, 2022, at 1:12 PM, Casey Deccio  wrote:
> 
> I am having trouble tracking down a bug in my monitoring setup.  It all 
> happened when I upgraded the monitored host (host B in my example below) to 
> bullseye.  Note that Host A is also running bullseye, but the problem didn't 
> show itself until Host B was upgraded.

With some help over at the bind-users mailing list [1], I discovered that 
nrpe-ng closes stdin when launching the command [2], and the new version of 
nslookup (invoked by check_dns) has issues when stdin is closed [3].

Redirecting stdin to /dev/null fixes the issue:

$ diff -u /usr/lib/python3/dist-packages/nrpe_ng/commands.py{.old,}
--- /usr/lib/python3/dist-packages/nrpe_ng/commands.py.old  2017-08-08 
13:05:02.0 -0600
+++ /usr/lib/python3/dist-packages/nrpe_ng/commands.py  2022-09-13 
17:00:36.767239885 -0600
@@ -85,6 +85,7 @@

 proc = tornado.process.Subprocess(
 run_args,
+stdin=subprocess.DEVNULL,
 stdout=tornado.process.Subprocess.STREAM,
 close_fds=True,
 env=env)

The effect of the one-line change can be illustrated with plain subprocess (a 
generic sketch, not nrpe-ng's tornado code): with stdin=subprocess.DEVNULL the 
child gets a valid, always-empty fd 0 instead of a closed one.

import subprocess

# Launch a child whose stdin is redirected to /dev/null: reads on fd 0
# succeed and return EOF immediately, instead of failing on a closed fd.
r = subprocess.run(
    ['python3', '-c', 'import sys; print(sys.stdin.read() == "")'],
    stdin=subprocess.DEVNULL,
    capture_output=True, text=True)
print(r.stdout.strip())  # prints: True
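The same redirection is easy to verify with plain subprocess (a generic sketch, 
not nrpe-ng's tornado code): with stdin=subprocess.DEVNULL the child sees a 
valid, always-empty fd 0 rather than a closed descriptor.

```python
import subprocess

# Launch a child whose stdin is redirected to /dev/null: reads on fd 0
# succeed and return EOF immediately, instead of failing on a closed fd.
r = subprocess.run(
    ['python3', '-c', 'import sys; print(sys.stdin.read() == "")'],
    stdin=subprocess.DEVNULL,
    capture_output=True, text=True)
print(r.stdout.strip())  # prints: True
```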

I've filed a bug report [4].

Thanks,
Casey

[1] https://lists.isc.org/pipermail/bind-users/2022-September/10.html
[2] https://github.com/bootc/nrpe-ng/blob/master/nrpe_ng/commands.py#L86
[3] https://github.com/libuv/libuv/blob/v1.x/src/unix/core.c#L602
[4] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1019718


Re: TCP: tcp_parse_options: Illegal window scaling value 15 > 14 received

2022-09-07 Thread Casey Deccio

> On Sep 7, 2022, at 12:41 PM, Jim Popovitch  wrote:
> 
> On Wed, 2022-09-07 at 12:37 -0600, Casey Deccio wrote:
>> 
>>> On Sep 7, 2022, at 11:46 AM, Jim Popovitch  wrote:
>>> 
>>> I saw so many of the verbose '15 > 14' logs that I just decided to
>>> net.ipv4.tcp_window_scaling=0 and be done with it. Cleared up the
>>> noise, haven't noticed any problems since. ymmv.
>> 
>> Sounds like you've seen a non-trivial amount of this?
>> 
>> Disabling window scaling with net.ipv4.tcp_window_scaling=0 will "fix" the 
>> logs, but of course, it will also disable window scaling, which means that 
>> you are limiting the size of your congestion window to 64KB.  This 
>> effectively limits the throughput of TCP sessions over "long, fat pipes".
> 
> Yep.  I don't own/run/maintain anything on fat pipes, just destinations such 
> as webservers, email  servers, and dns servers for mailinglists.  If the 
> bandwidth for them is now capped at 2MBs, that's ok in my book.   The notion 
> that everything needs to support 10Gb interfaces and terabyte sized hardware 
> is just not realistic. 

Fair enough.  Just trying to interpret "ymmv" :)

Cheers,
Casey

Re: TCP: tcp_parse_options: Illegal window scaling value 15 > 14 received

2022-09-07 Thread Casey Deccio

> On Sep 7, 2022, at 11:46 AM, Jim Popovitch  wrote:
> 
> I saw so many of the verbose '15 > 14' logs that I just decided to
> net.ipv4.tcp_window_scaling=0 and be done with it. Cleared up the
> noise, haven't noticed any problems since. ymmv.

Sounds like you've seen a non-trivial amount of this?

Disabling window scaling with net.ipv4.tcp_window_scaling=0 will "fix" the 
logs, but of course, it will also disable window scaling, which means that you 
are limiting the size of your congestion window to 64KB.  This effectively 
limits the throughput of TCP sessions over "long, fat pipes".
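The back-of-the-envelope math: without window scaling the receive window tops 
out at 65535 bytes, and sustained throughput is bounded by roughly one window 
per round trip.  The RTT figures below are illustrative:

```python
# With window scaling disabled, the receive window is capped at 64 KiB,
# so per-connection throughput is bounded by window / RTT.
WINDOW = 65535  # bytes: the maximum advertisable window without scaling

def max_throughput_mbps(rtt_seconds):
    return WINDOW * 8 / rtt_seconds / 1e6

# On a long path (~100 ms RTT) that's only about 5 Mb/s per connection;
# on a 10 ms path, about 52 Mb/s.
print(round(max_throughput_mbps(0.100), 1))  # → 5.2
print(round(max_throughput_mbps(0.010), 1))  # → 52.4
```

This is why the cap mostly hurts "long, fat pipes": the longer the RTT, the 
lower the ceiling.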

Casey

Re: net.ipv6.conf.intf.disable_ipv6 behavior changes

2022-09-07 Thread Casey Deccio


> On Sep 3, 2022, at 7:30 PM, Kevin Price  wrote:
> 
> Am 03.09.22 um 06:32 schrieb Casey Deccio:
>>> On Sep 2, 2022, at 8:14 PM, Kevin Price  wrote
> 
>>> We got him. :) Casey, you file the bug report, Okay?
> 
>> Done!  https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1018999
>> Thanks for all the help!
> 
> You are very welcome.
> 
> Thanks a lot for this conversation, which felt very pleasant to me, and
> kind, productive, and helpful, even though especially my initial reply
> was quite tight-lipped. Thanks to our well-working cooperation. We've
> successfully and quite quickly pinpointed the cause of a real-world
> problem that likely affects many others.

Indeed!  Thanks for the kind words and the helpful interactions.  It's a 
two-way street, after all :)

> IMHO, this is a good example of how I wish the Debian/FLOSS community to
> always be. Or any good community, for that matter. If I may: Very well
> done, Casey. *shoulder tap*

:)

> What caught my initial attention was the possibility of the kernel
> broadly changing its behavior within a stable release, which in itself
> would pose a huge problem, which to prevent is the very purpose of
> stable. Glad that turned out to be false. Your appreciativeness
> encouraged me to follow up on this, which rewarded me with quite some
> fun in helping to solve this little puzzle with you, and with the bonus
> of a few decoys in our way. ;D

There's more where that came from :)
https://lists.debian.org/debian-user/2022/08/msg00685.html

I've been sitting on that one for over a year, since I first upgraded some of 
my machines to bullseye.

> Out of curiosity I've subscribed to your
> bug #1018999. Very well written. Its outcome we'll see.

Thanks.  Yes, we'll see what happens.

> 
> As to if, when, and how it might get fixed, I'm not all that optimistic,
> so you might want to stick with any workarounds for a while. (maybe a
> tailored deb package that _Conflicts_: connman and _Recommends_:
> network-manager, or else maybe a kernel boot command line parameter
> "ipv6.disable=1", which completely overrides sysctl, or whatever may
> suit your needs)

I've ended up just hacking my own software to 1) disable IPv6 (again) on all 
interfaces, after they are up; 2) reset the network (in this case, the 
forwarding tables of the switch); and then 3) start the network scenario 
(whatever it is) [1].  It's not as clean, but I don't have to worry about what 
third-party software (e.g., connman) might be installed on someone's system 
that runs my software, or how that software might change, etc.

> In case your bug gets acknowledged, (which is a huge if) I'd expect any
> resolution to appear in stable no sooner than in Bookworm, whenever that
> may be released. (...very purpose of stable...)

It sounds right to me.  Of course, it all depends on if there is agreement that 
the behavior is a bug and how many others it is affecting.

> Also, in case bug
> #1018999 is not going to be fully resolved to your needs, we might
> consider filing a wishlist "bug report" against lxde to at least change
> their recommendation into something less troublesome, such as
> network-manager maybe. Which does not interfere with the user's
> preferences in the same way.

Could be.  I'm not sure how connman is used (by lxde), whether the (current) 
disable_ipv6 behavior by connman is intentional, etc.  I suppose that you and I 
have a sour taste in our mouths because of behavior that is "obviously" buggy, 
but others might see it as babies and bath water.

> Oh BTW, I ought to file another bug report against connman (if not
> already pending) for not being able to be installed via ssh in a DHCP
> environment. (because during postinst it reconfigures the network
> interfaces, failing to use the proper FQDN in DHCP requests, thus
> getting a new IP address assigned and cutting off the ssh session) Not
> quite certain, but I guess this violates some existing Debian policy, or
> else a new Debian policy to come into place rather soon. (bug report
> against debian-policy)

Could be, though admittedly, I'm not expert on Debian policy.

> Thank you Casey for being part of the Debian community. Your
> participation makes Debian a better place to be, so please keep it up!

Thanks!  Glad to be here.  I've been using Debian for over 20 years, but I've 
only recently (re-)subscribed to the user lists :O

Casey

[1] https://github.com/cdeccio/cougarnet/pull/15/files



Re: TCP: tcp_parse_options: Illegal window scaling value 15 > 14 received

2022-09-07 Thread Casey Deccio
Hi Michael,

> On Sep 7, 2022, at 5:49 AM, Michael Grant  wrote:
> 
> I'm seeing this error over and over in /var/log/messages:
> 
> Sep  6 05:02:42 hostname kernel: [408794.655182] TCP: tcp_parse_options: 
> Illegal window scaling value 15 > 14 received
> Sep  6 05:02:43 hostname kernel: [408794.830639] TCP: tcp_parse_options: 
> Illegal window scaling value 15 > 14 received
> Sep  6 05:02:43 hostname kernel: [408794.960811] TCP: tcp_parse_options: 
> Illegal window scaling value 15 > 14 received
> Sep  6 05:02:43 hostname kernel: [408795.180464] TCP: tcp_parse_options: 
> Illegal window scaling value 15 > 14 received
> 
> I've not been able to find much about these messages by searching,
> nothing useful is coming up.  Is anyone else seeing something like
> this?

This is consistent with RFC 7323, Section 2.3 [1], which states:

   "If a
   Window Scale option is received with a shift.cnt value larger than
   14, the TCP SHOULD log the error but MUST use 14 instead of the
   specified value."

>  Is this some sort of attack?

I am not sure.  But the purpose of keeping the window scale below 15 is to 
"insure that new data is never mistakenly considered old and vice versa" [1].  
In any case, it seems to me that 1) your kernel appears to be handling it 
properly (hence the logs) and 2) even if it weren't, it doesn't *seem* like a 
problem for the server as much as for the entity that wanted the data.  Just my 
$0.02.
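The kernel's handling can be sketched as follows (illustrative Python, not the 
actual tcp_parse_options code):

```python
MAX_WSCALE = 14  # RFC 7323: a shift.cnt above 14 is logged but clamped

def parse_window_scale(shift_cnt, log=print):
    # SHOULD log the error but MUST use 14 instead of the offered value.
    if shift_cnt > MAX_WSCALE:
        log(f"TCP: tcp_parse_options: Illegal window scaling value "
            f"{shift_cnt} > {MAX_WSCALE} received")
        return MAX_WSCALE
    return shift_cnt

print(parse_window_scale(15))  # logs the error, then prints 14
```

So each log line corresponds to a peer offering shift.cnt 15, which the kernel 
quietly reduces to 14 and carries on.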

Interestingly, I happen to have some software using different window scale 
values in its interactions with Internet servers.  I just yesterday discovered 
a bug which was occasionally allowing 15 to be used as a window scale value, 
and I have corrected that. I don't know if my software was responsible for the 
log messages that Michael observed, but I have reached out off-list to 
investigate.

Casey

[1] https://www.rfc-editor.org/rfc/rfc7323.html


Re: net.ipv6.conf.intf.disable_ipv6 behavior changes

2022-09-02 Thread Casey Deccio


> On Sep 2, 2022, at 8:14 PM, Kevin Price  wrote:
> 
> Am 03.09.22 um 02:15 schrieb Kevin Price:
>> Let's double check whether our connman is in fact the culprit, and then
>> make arrest. (file bug report)
> 
> We got him. :) Casey, you file the bug report, Okay?

Done!  https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1018999

Thanks for all the help!

Casey



Re: net.ipv6.conf.intf.disable_ipv6 behavior changes

2022-09-02 Thread Casey Deccio


> On Sep 2, 2022, at 1:05 PM, Kevin Price  wrote:
> 
> I suspect your "very little customization" (since you're doing
> networking stuff) or the "VBox Guest Additions" (since they mess with
> network interfaces). In order to test this, I used
> debian-11.3.0-amd64-netinst.iso from the archive to install a vm, but in
> my case QEMU/KVM. German localization and no task selected but
> task-ssh-server.

Still working this, but here's an update:

I did a minimal install using the 11.4 installer.  At the tasksel menu, I 
de-selected both Debian desktop environment and GNOME, so no GUI was installed. 
 I did *no* other installation or configuration--not even VirtualBox Guest 
Additions.  It was truly a minimal install.

With this minimal system, I ran my disable_ipv6 test.  It worked just fine!  
That is, IPv6 remained off on the interfaces when I brought them up (also, the 
interfaces were not "UP" by default).

Then I ran tasksel, added the Debian desktop environment and LXDE, and rebooted.  
At that point, disable_ipv6 does *not* work.

Now, this does seem to narrow it down--sort of.  But the confusing thing is 
that the system on which disable_ipv6 currently *works* is also running LXDE.

>> Scratching my head...
> 
> I'm curious about the outcome, so maybe follow-up to this list please?
> Any potential bugs should be reported. But probably I won't be able to
> spare time to test VirtualBox.

Hard to tell if it's VirtualBox at this point, but as I mentioned above, I have 
*not* installed Guest Additions on this new system--just LXDE (and everything 
that comes with it :))

Casey


Re: net.ipv6.conf.intf.disable_ipv6 behavior changes

2022-09-02 Thread Casey Deccio



> On Sep 2, 2022, at 2:51 AM, Kevin Price  wrote:
> 
> Am 02.09.22 um 06:33 schrieb Casey Deccio:
>> 1) a sanity check (can others confirm the behavior discrepancy?);
> 
> No. My 5.10.0-17 behaves like your 5.10.0-13.

Thanks so much for checking!

> 2) an expectation of *correct* behavior (seems to me like the 5.10.0-13
> behavior is "correct");
> 
> Yes.

I agree.

> and 3) suggestions for next steps.
> 
> FWIW, confirm by booting your 5.10.0-17 system with 5.10.0-13.

Thanks for the idea.  I took your advice and booted my 5.10.0-17 system 
(problem system) with 5.10.0-13.  The problem persisted!  Then I updated my 
"old" (non-problem) system from Debian 11.3 to 11.4 and updated to kernel 
5.10.0-17, and I rebooted.  Still no problems!

Note: the problem system is a brand new install of Debian with only a few 
packages installed (they are also installed on the non-problem system) and very 
little customization.  I used the 11.3.0 netinst image to install, but 
everything is up to date.  I've confirmed the behavior independently on two 
fresh installs.

Another note: I'm running my tests on VMs in VirtualBox.  However, they are 
running on the same version of VirtualBox and even on the same machine. Even 
the version of VBox Guest Additions is the same.

Scratching my head...

Casey


Re: net.ipv6.conf.intf.disable_ipv6 behavior changes

2022-09-01 Thread Casey Deccio



> On Sep 1, 2022, at 10:33 PM, Casey Deccio  wrote:
> 
> I've come across some unexpected changes in interface behavior between 
> linux-image-5.10.0-13-amd64 and linux-image-5.10.0-17-amd64.
> 

Relatedly, there is this behavior change:

linux-image-5.10.0-13-amd64:

$ sudo ip link add test1 type veth peer test2
$ ip addr | grep test[12]
2372: test2@test1:  mtu 1500 qdisc noop state DOWN 
group default qlen 1000
2373: test1@test2:  mtu 1500 qdisc noop state DOWN 
group default qlen 1000


linux-image-5.10.0-17-amd64:

$ sudo ip link add test1 type veth peer test2
$ ip addr | grep test[12]
214: test2@test1:  mtu 1500 qdisc 
noqueue state UP group default qlen 1000
215: test1@test2:  mtu 1500 qdisc 
noqueue state UP group default qlen 1000

Default DOWN vs. default UP.

Casey



net.ipv6.conf.intf.disable_ipv6 behavior changes

2022-09-01 Thread Casey Deccio
I've come across some unexpected changes in interface behavior between 
linux-image-5.10.0-13-amd64 and linux-image-5.10.0-17-amd64.

Consider the following script:

$ cat test.sh
#!/bin/sh
sudo ip link add test1 type veth peer test2
sudo ip link set test1 down
sudo ip link set test2 down
sudo sysctl net.ipv6.conf.test1.disable_ipv6=1
sudo sysctl net.ipv6.conf.test2.disable_ipv6=1
sudo ip link set test1 up
sudo ip link set test2 up

(There might be a simpler way to trigger it, but this one works for me.)

When I run this on a system running linux-image-5.10.0-13-amd64, I get this 
behavior:

$ ./test.sh 
net.ipv6.conf.test1.disable_ipv6 = 1
net.ipv6.conf.test2.disable_ipv6 = 1
$  ip addr | grep -A 3 test[12]
2370: test2@test1:  mtu 1500 qdisc noqueue 
state UP group default qlen 1000
link/ether ea:fc:8a:36:09:fc brd ff:ff:ff:ff:ff:ff
2371: test1@test2:  mtu 1500 qdisc noqueue 
state UP group default qlen 1000
link/ether e2:e0:d2:09:0d:de brd ff:ff:ff:ff:ff:ff
$ sudo sysctl net.ipv6.conf.test1.disable_ipv6
net.ipv6.conf.test1.disable_ipv6 = 1
$ sudo sysctl net.ipv6.conf.test2.disable_ipv6
net.ipv6.conf.test2.disable_ipv6 = 1

No IPv6 addresses, and IPv6 is still disabled.  But when I run on a system 
running linux-image-5.10.0-17-amd64, I get this behavior:

$ ./test.sh 
net.ipv6.conf.test1.disable_ipv6 = 1
net.ipv6.conf.test2.disable_ipv6 = 1
$ ip addr | grep -A 3 test[12]
212: test2@test1:  mtu 1500 qdisc 
noqueue state UP group default qlen 1000
link/ether ce:16:79:86:ea:16 brd ff:ff:ff:ff:ff:ff
inet6 fe80::cc16:79ff:fe86:ea16/64 scope link 
   valid_lft forever preferred_lft forever
213: test1@test2:  mtu 1500 qdisc 
noqueue state UP group default qlen 1000
link/ether b6:8f:2e:59:1e:68 brd ff:ff:ff:ff:ff:ff
inet6 fe80::b48f:2eff:fe59:1e68/64 scope link 
   valid_lft forever preferred_lft forever
$ sudo sysctl net.ipv6.conf.test1.disable_ipv6
net.ipv6.conf.test1.disable_ipv6 = 0
$ sudo sysctl net.ipv6.conf.test2.disable_ipv6
net.ipv6.conf.test2.disable_ipv6 = 0

The interfaces are configured with link-local addresses, and IPv6 is no longer 
disabled.

I looked through the changelog for linux-image-5.10.0-17-amd64 and saw a number 
of changes from upstream involving sysctl, but I couldn't point to any one 
thing that might have caused this.

So... what I'm looking for is 1) a sanity check (can others confirm the 
behavior discrepancy?); 2) an expectation of *correct* behavior (seems to me 
like the 5.10.0-13 behavior is "correct"); and 3) suggestions for next steps.  
This has broken some software I've developed. I have a workaround, but it's not 
very pretty :)

P.S.  For those that are concerned that I'm disabling IPv6, this is for 
teaching the link layer, and it's really hard to do that with all the activity 
associated with IPv6.



Re: Bug - remote DNS monitoring

2022-08-30 Thread Casey Deccio

> On Aug 30, 2022, at 1:40 PM, Nicholas Geovanis  wrote:
> 
> When you run check_dns by hand on Host B, you don't say who you are logged-in 
> as. That can make a difference. Nagios runs its scripts in a known 
> environment which may be different than you expect.
> 


Thanks for the question.  I have run the check_dns script with:

 - an arbitrary, unprivileged user
 - the nagios user (also unprivileged)
 - root (gasp!)

They all work just fine.  Also, in every case, I run tcpdump, and I can see the 
DNS queries and responses going back and forth just fine.  In the strace 
messages, I can also see that the DNS messages were written and read properly.  
I think the issue is in nslookup, some time *after* the send/recv.  But I can't 
narrow it down much more than that.

Casey

Bug - remote DNS monitoring

2022-08-30 Thread Casey Deccio
Hi all,

I am having trouble tracking down a bug in my monitoring setup.  It all 
happened when I upgraded the monitored host (host B in my example below) to 
bullseye.  Note that Host A is also running bullseye, but the problem didn't 
show itself until Host B was upgraded.

Here is the setup:

Host A (monitoring):
Installed: nagios4, nrpe-ng
IP address: 192.0.2.1

Host B (monitored):
Installed: nrpe-ng, monitoring-plugins-standard, bind9-dnsutils
IP address: 192.0.2.2

Host C (monitored through host B):
Installed: bind9
IP address: 192.0.2.3
Configured to answer authoritatively for example.com on port 53.

        nrpe over HTTPS          DNS
Host A ----------------> Host B ----> Host C

On Host B, I run the following:
sudo /usr/bin/python3 /usr/sbin/nrpe-ng --debug -f --config 
/etc/nagios/nrpe-ng.cfg

While that is running, I run the following on Host A:
/usr/lib/nagios/plugins/check_nrpe_ng -H 192.0.2.2 -c check_dns -a example.com 
192.0.2.3 0.1 1.0

The result of running the command on Host A is:
DNS CRITICAL - '/usr/bin/nslookup -sil' msg parsing exited with no address

On Host B, I see the following debug output:
200 POST /v1/check/check_dns (192.0.2.1) 78.05ms
Executing: /usr/lib/nagios/plugins/check_dns -H example.com -s 192.0.2.3 -A -w 
0.1 -c 1.0

When I run this exact command on Host B, I get:
$ /usr/lib/nagios/plugins/check_dns -H example.com -s 192.0.2.3 -A -w 0.1 -c 1.0
DNS OK: 0.070 seconds response time. example.com returns 
192.0.2.10,2001:db8::10|time=0.069825s;0.10;1.00;0.00

Looks good!  When I run nslookup (run by check_dns), it looks good too:
$ /usr/bin/nslookup -sil example.com 192.0.2.3
Server: 192.0.2.3
Address:192.0.2.3#53

Name:   example.com
Address: 192.0.2.10
Name:   example.com
Address: 2001:db8::10

After rerunning nrpe-ng with strace -f, I see something:

[pid 1183842] write(2, "nslookup: ./src/unix/core.c:570:"..., 83) = 83
...
[pid 1183841] read(4, "nslookup: ./src/unix/core.c:570:"..., 4096) = 83

So it appears that the nslookup process is reporting an error.  But I cannot 
reproduce it outside of nrpe-ng.

Any suggestions?

Casey


stty settings

2008-05-06 Thread Casey Deccio
I would like to pipe raw straight data through a terminal without any
special characters, null character conversion,  carriage
return/newline conversion, or echo.  'stty raw' and 'stty sane' seem
to approach this, but not exactly.  Before I start guessing more with
all the individual options in 'man stty', I thought I'd probe the list
to see if anyone knows an appropriate command for this.

Regards,
Casey

