Re: ssh client_loop send disconnnect from Dom0 -> DomU (NetBSD 10.0_BETA/Xen)

2023-06-30 Thread RVP
On Thu, 29 Jun 2023, Matthias Petermann wrote: The former proposal you sent me (net.inet.icmp.bmcastecho=1 and ping -nc10) did not create ARP addresses with no expiration time on my NetBSD 10.0_BETA system. You mentioned this might be a feature of -HEAD - not sure about 10... I should have

Re: ssh client_loop send disconnnect from Dom0 -> DomU (NetBSD 10.0_BETA/Xen)

2023-06-29 Thread Matthias Petermann
Hi, On 30.06.23 07:07, Brian Buhrow wrote: hello. Yes, this behavior is expected. It ensures that there is no conflict between the device on the domu end of the vif port and the device on the dom0 end. This is more sane behavior than FreeBSD, which zeros out the MAC address on the d

Re: ssh client_loop send disconnnect from Dom0 -> DomU (NetBSD 10.0_BETA/Xen)

2023-06-29 Thread Brian Buhrow
hello. Yes, this behavior is expected. It ensures that there is no conflict between the device on the domu end of the vif port and the device on the dom0 end. This is more sane behavior than FreeBSD, which zeros out the MAC address on the dom0 side of the vif. -thanks -Brian

Re: ssh client_loop send disconnnect from Dom0 -> DomU (NetBSD 10.0_BETA/Xen)

2023-06-29 Thread Matthias Petermann
Hello, On 29.06.23 11:58, Matthias Petermann wrote: While I do not want to praise the evening before the day, you deserve some feedback. Both the synthetic test with ssh/dd and my real payload with ssh/dump have been running for easily 6 hours without interruption this morning. I took the ad

Re: ssh client_loop send disconnnect from Dom0 -> DomU (NetBSD 10.0_BETA/Xen)

2023-06-29 Thread Matthias Petermann
Hi Brian, On 26.06.23 16:17, Brian Buhrow wrote: hello. A couple of quick questions based on the conversation and the snippets of logs shown in the e-mails. 1. Is the MAC address shown in the ARP replies the correct one for the dom0? No reason it should be wrong, but it's worth veri

Re: ssh client_loop send disconnnect from Dom0 -> DomU (NetBSD 10.0_BETA/Xen)

2023-06-29 Thread Matthias Petermann
Hi, On 26.06.23 15:37, RVP wrote: On Mon, 26 Jun 2023, Matthias Petermann wrote: Could it still be an ARP related issue? I did a simplified version of the test this morning: Try this test: since you have static IP- & MAC-addresses everywhere in your setup, just add them as static ARP entri

Re: ssh client_loop send disconnnect from Dom0 -> DomU (NetBSD 10.0_BETA/Xen)

2023-06-26 Thread Brian Buhrow
hello. A couple of quick questions based on the conversation and the snippets of logs shown in the e-mails. 1. Is the MAC address shown in the ARP replies the correct one for the dom0? No reason it should be wrong, but it's worth verifying, just in case there is an unknown host replyi

Re: ssh client_loop send disconnnect from Dom0 -> DomU (NetBSD 10.0_BETA/Xen)

2023-06-26 Thread RVP
On Mon, 26 Jun 2023, Matthias Petermann wrote: Could it still be an ARP related issue? I did a simplified version of the test this morning: Try this test: since you have static IP- & MAC-addresses everywhere in your setup, just add them as static ARP entries (skip own address): On each of y

Re: ssh client_loop send disconnnect from Dom0 -> DomU (NetBSD 10.0_BETA/Xen)

2023-06-26 Thread Matthias Petermann
Hi, On 26.06.23 10:41, RVP wrote: On Sun, 25 Jun 2023, Matthias Petermann wrote: Somewhere between 2) and 3) there should be the answer to the question. ``` 08:52:07.595831 ARP, Request who-has vhost2.lan tell srv-net.lan, length 28 08:52:07.595904 ARP, Reply vhost2.lan is-at 88:ae:dd:02:a4:

Re: ssh client_loop send disconnnect from Dom0 -> DomU (NetBSD 10.0_BETA/Xen)

2023-06-26 Thread RVP
On Sun, 25 Jun 2023, Matthias Petermann wrote: Somewhere between 2) and 3) there should be the answer to the question. ``` 08:52:07.595831 ARP, Request who-has vhost2.lan tell srv-net.lan, length 28 08:52:07.595904 ARP, Reply vhost2.lan is-at 88:ae:dd:02:a4:03 (oui Unknown), length 28 08:52:

Re: ssh client_loop send disconnnect from Dom0 -> DomU (NetBSD 10.0_BETA/Xen)

2023-06-25 Thread Matthias Petermann
Hello, On 25.06.23 07:49, Matthias Petermann wrote: 4) Run the test with tcpdump from DomU -> this is currently ongoing. I will follow up as soon as I have the results. This is the follow-up I promised. I was lucky this morning to catch one occurrence of the issue while tcpdump was running in th

Re: ssh client_loop send disconnnect from Dom0 -> DomU (NetBSD 10.0_BETA/Xen)

2023-06-24 Thread RVP
On Sun, 25 Jun 2023, Matthias Petermann wrote: 2) increased the ARP cache timeout net.inet.arp.nd_reachable=120 on both Dom0 and DomU -> this seemed to have an effect at first, but the problem still exists (it's not a measured fact but a feeling that it now happens a bit less oft

Re: ssh client_loop send disconnnect from Dom0 -> DomU (NetBSD 10.0_BETA/Xen)

2023-06-24 Thread Brian Buhrow
Hello. Here are the network configuration settings I've been using for a number of years, all the way through -current. net.inet.tcp.recvbuf_auto=1 net.inet.tcp.sendbuf_auto=1 net.inet.tcp.sendbuf_max=16777216 net.inet.tcp.recvbuf_max=16777216 -thanks -Brian

Re: ssh client_loop send disconnnect from Dom0 -> DomU (NetBSD 10.0_BETA/Xen)

2023-06-24 Thread Matthias Petermann
Hello, On 25.06.23 03:48, RVP wrote: On Sat, 24 Jun 2023, Brian Buhrow wrote: In any case, the fact that you're getting regular delays on your pings suggests there is a delay between the time when the arp cache times out and when it gets refreshed. This would be determined by `net.inet.ar

Re: ssh client_loop send disconnnect from Dom0 -> DomU (NetBSD 10.0_BETA/Xen)

2023-06-24 Thread RVP
On Sat, 24 Jun 2023, Brian Buhrow wrote: In any case, the fact that you're getting regular delays on your pings suggests there is a delay between the time when the arp cache times out and when it gets refreshed. This would be determined by `net.inet.arp.nd_delay' I think (on -HEAD). As a c

Re: ssh client_loop send disconnnect from Dom0 -> DomU (NetBSD 10.0_BETA/Xen)

2023-06-24 Thread Greg Troxel
Brian Buhrow writes: > Hello. The ARP cache timeout used to be 1200 seconds or 20 minutes, > hard coded. Now, it > looks like it's either 1200 seconds or 300 seconds, I'm not sure after a > quick romp through the > kernel source. In any case, The fact that you're getting regular delays

Re: ssh client_loop send disconnnect from Dom0 -> DomU (NetBSD 10.0_BETA/Xen)

2023-06-24 Thread Brian Buhrow
Hello. The ARP cache timeout used to be 1200 seconds or 20 minutes, hard coded. Now, it looks like it's either 1200 seconds or 300 seconds, I'm not sure after a quick romp through the kernel source. In any case, the fact that you're getting regular delays on your pings suggests there

Re: ssh client_loop send disconnnect from Dom0 -> DomU (NetBSD 10.0_BETA/Xen)

2023-06-24 Thread Manuel Bouyer
On Fri, Jun 23, 2023 at 11:37:23PM +, RVP wrote: > On Fri, 23 Jun 2023, Brian Buhrow wrote: > > > hello. My understanding is that the arp caching mechanism works > > regardless of whether > > you use static MAC addresses or dynamically generated ones. > > [...] > > If you then run brconf

Re: ssh client_loop send disconnnect from Dom0 -> DomU (NetBSD 10.0_BETA/Xen)

2023-06-24 Thread RVP
On Sat, 24 Jun 2023, Matthias Petermann wrote: On my Dom0, it looks like there is a timeout for the MAC addresses. The lines below are random but subsequent samples of the "arp -an" command on the Dom0 (192.168.2.50) within a timespan of ~5 minutes. What caught my eye so far: - there seem to

Re: ssh client_loop send disconnnect from Dom0 -> DomU (NetBSD 10.0_BETA/Xen)

2023-06-23 Thread Matthias Petermann
Hello, On 24.06.23 01:37, RVP wrote: On Fri, 23 Jun 2023, Brian Buhrow wrote: hello.  My understanding is that the arp caching mechanism works regardless of whether you use static MAC addresses or dynamically generated ones. [...] If you then run brconfig on the bridge containing the domu

Re: ssh client_loop send disconnnect from Dom0 -> DomU (NetBSD 10.0_BETA/Xen)

2023-06-23 Thread RVP
On Fri, 23 Jun 2023, Brian Buhrow wrote: hello. My understanding is that the arp caching mechanism works regardless of whether you use static MAC addresses or dynamically generated ones. [...] If you then run brconfig on the bridge containing the domu, you'll see the MAC address you

Re: ssh client_loop send disconnnect from Dom0 -> DomU (NetBSD 10.0_BETA/Xen)

2023-06-23 Thread Matthias Petermann
Hello Manuel, On 23.06.23 16:17, Manuel Bouyer wrote: I'm not sure it's Xen-specific, there have been changes in the network stack between -9 and -10 affecting the way ARP and duplicate addresses are managed. Thanks for your attention. I remember you are one of the Xen Gurus RVP recommended

Re: ssh client_loop send disconnnect from Dom0 -> DomU (NetBSD 10.0_BETA/Xen)

2023-06-23 Thread Brian Buhrow
hello. My understanding is that the arp caching mechanism works regardless of whether you use static MAC addresses or dynamically generated ones. The reason is that arp bridges the gap between the layer 2 network, i.e. the MAC addresses, and the layer 3 network, i.e. the IP addresses t

Re: ssh client_loop send disconnnect from Dom0 -> DomU (NetBSD 10.0_BETA/Xen)

2023-06-23 Thread Manuel Bouyer
On Fri, Jun 23, 2023 at 03:52:21PM +0200, Matthias Petermann wrote: > Hi, > > On 23.06.23 02:45, RVP wrote: > > So, the server tries to write data into the socket; write() fails with > > errno = EHOSTDOWN which sshd(8) treats as a fatal error and it exits. > > The client tries to read/write to a c

Re: ssh client_loop send disconnnect from Dom0 -> DomU (NetBSD 10.0_BETA/Xen)

2023-06-23 Thread Matthias Petermann
Hi, On 23.06.23 02:45, RVP wrote: So, the server tries to write data into the socket; write() fails with errno = EHOSTDOWN which sshd(8) treats as a fatal error and it exits. The client tries to read/write to a closed connection, and it too quits. The part which doesn't make sense is the EHOSTD

Re: ssh client_loop send disconnnect from Dom0 -> DomU (NetBSD 10.0_BETA/Xen)

2023-06-22 Thread RVP
On Thu, 22 Jun 2023, Matthias Petermann wrote: ...and from the server log... ``` debug2: channel 0: rcvd adjust 131072 debug2: channel 0: rcvd adjust 131072 debug2: channel 0: rcvd adjust 196608 debug2: channel 0: rcvd adjust 131072 debug2: channel 0: rcvd adjust 131072 debug2: channel 0: rcvd

Re: ssh client_loop send disconnnect from Dom0 -> DomU (NetBSD 10.0_BETA/Xen)

2023-06-22 Thread Brian Buhrow
hello. Actually, on the server side, where you get the "host is down" message, that is a system error from the network stack itself. I've seen it when the arp cache times out and can't be refreshed in a timely manner. What happens if you run an extended ping session between the dom0 a
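Two different system errors show up in this thread: "Broken pipe" on the client and "host is down" on the server. As a minimal sketch (not taken from any message in the thread), the snippet below maps the two errno values to the text the C library reports for them. The helper name `describe` is hypothetical, and the exact wording of the messages depends on the platform's libc.

```python
import errno
import os


def describe(err: int) -> str:
    """Return 'NAME: message' for an errno value (hypothetical helper)."""
    # errno.errorcode maps the numeric value back to its symbolic name.
    name = errno.errorcode.get(err, "UNKNOWN")
    # os.strerror returns the platform's message text for that value.
    return f"{name}: {os.strerror(err)}"


if __name__ == "__main__":
    # The error reported on the server side in this thread.
    print(describe(errno.EHOSTDOWN))
    # The error reported on the client side in this thread.
    print(describe(errno.EPIPE))
```

Seeing the EHOSTDOWN text in an sshd log therefore points at a failed write() in the network stack rather than at anything ssh-specific.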

Re: ssh client_loop send disconnnect from Dom0 -> DomU (NetBSD 10.0_BETA/Xen)

2023-06-22 Thread Matthias Petermann
Hi, On 22.06.23 08:36, RVP wrote: Can you see any errors from sshd(8) in the logs on the DomU? If not, run the sshd server standalone like this: ``` /usr/sbin/sshd -Dddd -E/tmp/s.log ``` then post the `s.log' file after you run something like: ``` $ ssh -E/tmp/c.log -vvv XXX.NET 'dd if=/dev/z

Re: ssh client_loop send disconnnect from Dom0 -> DomU (NetBSD 10.0_BETA/Xen)

2023-06-21 Thread RVP
On Wed, 21 Jun 2023, Matthias Petermann wrote: The log output at the time of the was: ``` debug2: tcpwinsz: 197420 for connection: 3 debug2: channel 0: window 1933312 sent adjust 163840 debug2: tcpwinsz: 197420 for connection: 3 debug2: channel 0: window 1933312 sent adjust 163840 debug2:

Re: ssh client_loop send disconnnect from Dom0 -> DomU (NetBSD 10.0_BETA/Xen)

2023-06-21 Thread Matthias Petermann
On 22.06.23 07:52, Matthias Petermann wrote: On 21.06.23 19:54, Matthias Petermann wrote: 2>log.txt ssh user@srv-net -vvv doas /sbin/dump -X -h 0 -b 64 -0auf - /data/119455aa-6ef8-49e0-b71a-9c87e84014cb > /mnt/test.dump ...just noticed another variation, this time client_loop: send disconnect

Re: ssh client_loop send disconnnect from Dom0 -> DomU (NetBSD 10.0_BETA/Xen)

2023-06-21 Thread Matthias Petermann
On 21.06.23 19:54, Matthias Petermann wrote: 2>log.txt ssh user@srv-net -vvv doas /sbin/dump -X -h 0 -b 64 -0auf - /data/119455aa-6ef8-49e0-b71a-9c87e84014cb > /mnt/test.dump ...just noticed another variation, this time client_loop: send disconnect occurred: https://paste.petermann-it.de/?880

Re: ssh client_loop send disconnnect from Dom0 -> DomU (NetBSD 10.0_BETA/Xen)

2023-06-21 Thread Matthias Petermann
Hello, On 21.06.23 11:22, RVP wrote: On Wed, 21 Jun 2023, RVP wrote: A `Broken pipe' from ssh means the RHS of the pipeline exited prematurely. Is what I said, but, I see that ssh ignores SIGPIPE (network I/O--duh!), so that error message is even odder. Do a `2>log.txt ssh -vvv ...' and p

Re: ssh client_loop send disconnnect from Dom0 -> DomU (NetBSD 10.0_BETA/Xen)

2023-06-21 Thread Michael van Elst
r...@sdf.org (RVP) writes: >I don't get that: there's no pipe there when you do `> file'. So how come >a Broken pipe still? It's the communication between ssh and sshd where ssh can no longer write to a network connection closed by sshd. The problem is to find out why the connection got closed.
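The point above, that "Broken pipe" can come from a closed network connection with no shell pipe involved, can be reproduced in a few lines. This is a minimal sketch under stated assumptions: a local `socket.socketpair()` stands in for the ssh/sshd TCP connection, and the function name `write_after_peer_close` is hypothetical. Python ignores SIGPIPE by default, so the failed write surfaces as an errno instead of killing the process.

```python
import errno
import socket


def write_after_peer_close() -> int:
    """Return the errno raised when writing to a socket whose peer has closed."""
    # Assumption: a connected local socket pair stands in for ssh <-> sshd.
    client, server = socket.socketpair()
    try:
        server.close()  # the "sshd" end closes the connection first
        try:
            client.sendall(b"data")  # the "ssh" end keeps writing
        except OSError as exc:
            return exc.errno
        return 0  # the write unexpectedly succeeded
    finally:
        client.close()


if __name__ == "__main__":
    err = write_after_peer_close()
    # Print the symbolic errno name, or the raw value if it has none.
    print(errno.errorcode.get(err, err))
```

On a real TCP connection the first write after the peer closes may still succeed and only a later one fails, so the timing differs from this local sketch.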

Re: ssh client_loop send disconnnect from Dom0 -> DomU (NetBSD 10.0_BETA/Xen)

2023-06-21 Thread RVP
On Wed, 21 Jun 2023, RVP wrote: A `Broken pipe' from ssh means the RHS of the pipeline exited prematurely. Is what I said, but, I see that ssh ignores SIGPIPE (network I/O--duh!), so that error message is even odder. Do a `2>log.txt ssh -vvv ...' and post the `log.txt' file when you send the

Re: ssh client_loop send disconnnect from Dom0 -> DomU (NetBSD 10.0_BETA/Xen)

2023-06-21 Thread RVP
On Wed, 21 Jun 2023, Matthias Petermann wrote: My mistake... the error message probably was slightly different but still related to the ssh_client_loop. Aah! ssh is stuffing errno in _many_ places, so it's definitely possible. See, for example: src/crypto/external/bsd/openssh/dist/sshbuf-mis

Re: ssh client_loop send disconnnect from Dom0 -> DomU (NetBSD 10.0_BETA/Xen)

2023-06-21 Thread RVP
On Wed, 21 Jun 2023, Matthias Petermann wrote: Before I had dd in place, I used a redirection > $dumpname which results in the same kind of broken pipe issues. I just did verify this by repeating this as an isolated test case. I don't get that: there's no pipe there when you do `> file'. So

Re: ssh client_loop send disconnnect from Dom0 -> DomU (NetBSD 10.0_BETA/Xen)

2023-06-21 Thread Matthias Petermann
On 21.06.23 10:22, RVP wrote: On Wed, 21 Jun 2023, Matthias Petermann wrote: Before I had dd in place, I used a redirection > $dumpname which results in the same kind of broken pipe issues. I just did verify this by repeating this as an isolated test case. I don't get that: there's no pipe

Re: ssh client_loop send disconnnect from Dom0 -> DomU (NetBSD 10.0_BETA/Xen)

2023-06-21 Thread Matthias Petermann
Hello, On 21.06.23 09:31, RVP wrote: On Tue, 20 Jun 2023, Matthias Petermann wrote: problems. Since there is a bit more steam on the system, I get irregular but predictable SSH connection disconnects (ssh client loop send disconnect: Broken pipe). I have already tried all possible combinatio

Re: ssh client_loop send disconnnect from Dom0 -> DomU (NetBSD 10.0_BETA/Xen)

2023-06-21 Thread RVP
On Tue, 20 Jun 2023, Matthias Petermann wrote: problems. Since there is a bit more steam on the system, I get irregular but predictable SSH connection disconnects (ssh client loop send disconnect: Broken pipe). I have already tried all possible combinations of ClientAliveInterval and ServerAli

ssh client_loop send disconnnect from Dom0 -> DomU (NetBSD 10.0_BETA/Xen)

2023-06-20 Thread Matthias Petermann
Hello all, I have a network problem here, and I'm not sure what Xen's contribution to it is. There is one Dom0 and several DomUs. The DomUs are connected via a bridge to the Dom0 and the LAN. The filesystems of the DomUs are backed up to a USB disk attached to the host. To do this, Dom0 calls the