Hi, Ludovic Courtès <l...@gnu.org> skribis:
> We recently experienced a bug on berlin.guix where we’d be locked out of > SSH access because shepherd (0.9.1) would say that the maximum > connection number on the sshd inetd service had been reached. On berlin.guix, which is getting hammered, we see things like this: --8<---------------cut here---------------start------------->8--- Aug 9 23:32:13 localhost shepherd[1]: Service sshd-4183 (PID 55570) exited with 255. Aug 9 23:32:13 localhost shepherd[1]: Service sshd-4183 has been disabled. Aug 9 23:32:13 localhost shepherd[1]: Transient service sshd-4183 terminated, now unregistered. Aug 9 23:32:15 localhost shepherd[1]: Accepted connection on 0.0.0.0:22 from X.X.X.104:39528. Aug 9 23:32:15 localhost shepherd[1]: Service sshd-4189 has been started. Aug 9 23:32:20 localhost shepherd[1]: Accepted connection on 0.0.0.0:22 from X.X.X.104:40378. Aug 9 23:32:21 localhost shepherd[1]: Service sshd-4190 has been started. Aug 9 23:32:25 localhost shepherd[1]: Accepted connection on 0.0.0.0:22 from X.X.X.104:41190. Aug 9 23:32:25 localhost sshd[55635]: error: kex_exchange_identification: Connection closed by remote host Aug 9 23:32:25 localhost sshd[55635]: Connection closed by X.X.X.167 port 50938 Aug 9 23:32:26 localhost shepherd[1]: Service sshd-4191 has been started. Aug 9 23:32:26 localhost shepherd[1]: 7 connections still in use after sshd-4185 termination. Aug 9 23:32:26 localhost shepherd[1]: Service sshd-4185 (PID 55635) exited with 255. Aug 9 23:32:26 localhost shepherd[1]: Service sshd-4185 has been disabled. Aug 9 23:32:26 localhost shepherd[1]: Transient service sshd-4185 terminated, now unregistered. Aug 9 23:32:30 localhost shepherd[1]: Accepted connection on 0.0.0.0:22 from X.X.X.104:41918. Aug 9 23:32:31 localhost shepherd[1]: Service sshd-4192 has been started. Aug 9 23:32:34 localhost sshd[55632]: error: kex_exchange_identification: Connection closed by remote host Aug 9 23:32:34 localhost sshd[55632]: Connection closed by X.X.X.167 port 50966 Aug 9 23:32:34 localhost shepherd[1]: 7 connections still in use after sshd-4184 termination. Aug 9 23:32:34 localhost shepherd[1]: Service sshd-4184 (PID 55632) exited with 255. Aug 9 23:32:34 localhost shepherd[1]: Service sshd-4184 has been disabled. Aug 9 23:32:34 localhost shepherd[1]: Transient service sshd-4184 terminated, now unregistered. Aug 9 23:32:35 localhost shepherd[1]: Accepted connection on 0.0.0.0:22 from X.X.X.104:42736. Aug 9 23:32:36 localhost shepherd[1]: Service sshd-4193 has been started. Aug 9 23:32:40 localhost shepherd[1]: Accepted connection on 0.0.0.0:22 from X.X.X.104:43492. Aug 9 23:32:41 localhost shepherd[1]: Service sshd-4194 has been started. Aug 9 23:32:44 localhost sshd[56155]: error: kex_exchange_identification: Connection closed by remote host Aug 9 23:32:44 localhost sshd[56155]: Connection closed by X.X.X.80 port 52450 Aug 9 23:32:44 localhost shepherd[1]: 8 connections still in use after sshd-4186 termination. Aug 9 23:32:44 localhost shepherd[1]: Service sshd-4186 (PID 56155) exited with 255. Aug 9 23:32:44 localhost shepherd[1]: Service sshd-4186 has been disabled. Aug 9 23:32:44 localhost shepherd[1]: Transient service sshd-4186 terminated, now unregistered. Aug 9 23:32:45 localhost shepherd[1]: Accepted connection on 0.0.0.0:22 from X.X.X.104:44194. Aug 9 23:32:46 localhost shepherd[1]: Service sshd-4195 has been started. Aug 9 23:32:53 localhost shepherd[1]: Accepted connection on 0.0.0.0:22 from X.X.X.104:45170. Aug 9 23:32:53 localhost shepherd[1]: Service sshd-4196 has been started. Aug 9 23:32:56 localhost ntpd[1706]: Soliciting pool server X.X.X.107 Aug 9 23:32:58 localhost shepherd[1]: Accepted connection on 0.0.0.0:22 from X.X.X.104:45846. Aug 9 23:32:58 localhost shepherd[1]: Service sshd-4197 has been started. Aug 9 23:33:03 localhost shepherd[1]: Accepted connection on 0.0.0.0:22 from X.X.X.104:46514. Aug 9 23:33:03 localhost shepherd[1]: Service sshd-4198 has been started. Aug 9 23:33:08 localhost shepherd[1]: Accepted connection on 0.0.0.0:22 from X.X.X.104:47230. Aug 9 23:33:08 localhost shepherd[1]: Service sshd-4199 has been started. Aug 9 23:33:13 localhost shepherd[1]: Accepted connection on 0.0.0.0:22 from X.X.X.104:47940. Aug 9 23:33:13 localhost shepherd[1]: Service sshd-4200 has been started. Aug 9 23:33:17 localhost sshd[56715]: error: kex_exchange_identification: client sent invalid protocol identifier "" Aug 9 23:33:17 localhost sshd[56715]: banner exchange: Connection from X.X.X.104 port 37546: invalid format Aug 9 23:33:17 localhost shepherd[1]: 13 connections still in use after sshd-4188 termination. Aug 9 23:33:17 localhost shepherd[1]: Service sshd-4188 (PID 56715) exited with 255. Aug 9 23:33:17 localhost shepherd[1]: Service sshd-4188 has been disabled. Aug 9 23:33:17 localhost shepherd[1]: Transient service sshd-4188 terminated, now unregistered. Aug 9 23:33:18 localhost shepherd[1]: Accepted connection on 0.0.0.0:22 from X.X.X.104:48680. Aug 9 23:33:18 localhost shepherd[1]: Service sshd-4201 has been started. Aug 9 23:33:23 localhost shepherd[1]: Accepted connection on 0.0.0.0:22 from X.X.X.104:49546. Aug 9 23:33:23 localhost shepherd[1]: Service sshd-4202 has been started. Aug 9 23:33:26 localhost sshd[57102]: error: kex_exchange_identification: client sent invalid protocol identifier "OPT IONS / HTTP/1.0" Aug 9 23:33:26 localhost sshd[57102]: banner exchange: Connection from X.X.X.104 port 40378: invalid format Aug 9 23:33:26 localhost shepherd[1]: 14 connections still in use after sshd-4190 termination. Aug 9 23:33:26 localhost shepherd[1]: Service sshd-4190 (PID 57102) exited with 255. Aug 9 23:33:26 localhost shepherd[1]: Service sshd-4190 has been disabled. Aug 9 23:33:26 localhost shepherd[1]: Transient service sshd-4190 terminated, now unregistered. Aug 9 23:33:28 localhost shepherd[1]: Accepted connection on 0.0.0.0:22 from X.X.X.104:50188. Aug 9 23:33:28 localhost shepherd[1]: Service sshd-4203 has been started. Aug 9 23:33:32 localhost sshd[57360]: error: kex_exchange_identification: banner line contains invalid characters Aug 9 23:33:32 localhost sshd[57360]: banner exchange: Connection from X.X.X.104 port 41918: invalid format Aug 9 23:33:32 localhost shepherd[1]: 14 connections still in use after sshd-4192 termination. Aug 9 23:33:32 localhost shepherd[1]: Service sshd-4192 (PID 57360) exited with 255. Aug 9 23:33:32 localhost shepherd[1]: Service sshd-4192 has been disabled. Aug 9 23:33:32 localhost shepherd[1]: Transient service sshd-4192 terminated, now unregistered. Aug 9 23:33:33 localhost shepherd[1]: Accepted connection on 0.0.0.0:22 from X.X.X.104:50848. Aug 9 23:33:33 localhost shepherd[1]: Service sshd-4204 has been started. Aug 9 23:33:35 localhost sshd[57713]: error: kex_exchange_identification: banner line contains invalid characters Aug 9 23:33:35 localhost sshd[57713]: banner exchange: Connection from X.X.X.104 port 42736: invalid format Aug 9 23:33:35 localhost shepherd[1]: 14 connections still in use after sshd-4193 termination. […] Aug 9 23:33:39 localhost sshd[56941]: error: kex_exchange_identification: client sent invalid protocol identifier "GET / HTTP/1.0" Aug 9 23:33:39 localhost sshd[56941]: banner exchange: Connection from X.X.X.104 port 39528: invalid format Aug 9 23:33:39 localhost shepherd[1]: 13 connections still in use after sshd-4189 termination. Aug 9 23:33:39 localhost shepherd[1]: Service sshd-4189 (PID 56941) exited with 255. Aug 9 23:33:39 localhost shepherd[1]: Service sshd-4189 has been disabled. Aug 9 23:33:39 localhost shepherd[1]: Transient service sshd-4189 terminated, now unregistered. Aug 9 23:33:44 localhost sshd[57874]: error: kex_exchange_identification: banner line contains invalid characters Aug 9 23:33:44 localhost sshd[57874]: banner exchange: Connection from X.X.X.104 port 43492: invalid format Aug 9 23:33:44 localhost shepherd[1]: 12 connections still in use after sshd-4194 termination. Aug 9 23:33:44 localhost shepherd[1]: Service sshd-4194 (PID 57874) exited with 255. Aug 9 23:33:44 localhost shepherd[1]: Service sshd-4194 has been disabled. Aug 9 23:33:44 localhost shepherd[1]: Transient service sshd-4194 terminated, now unregistered. […] Aug 9 23:35:40 localhost shepherd[1]: 4 connections still in use after sshd-4212 termination. Aug 9 23:35:40 localhost shepherd[1]: Service sshd-4212 (PID 59614) exited with 255. Aug 9 23:35:40 localhost shepherd[1]: Service sshd-4212 has been disabled. Aug 9 23:35:40 localhost shepherd[1]: Transient service sshd-4212 terminated, now unregistered. Aug 9 23:35:48 localhost sshd[59712]: error: kex_exchange_identification: banner line contains invalid characters Aug 9 23:35:48 localhost sshd[59712]: banner exchange: Connection from X.X.X.104 port 58812: invalid format Aug 9 23:35:48 localhost shepherd[1]: 3 connections still in use after sshd-4213 termination. Aug 9 23:35:48 localhost shepherd[1]: Service sshd-4213 (PID 59712) exited with 255. Aug 9 23:35:48 localhost shepherd[1]: Service sshd-4213 has been disabled. Aug 9 23:35:48 localhost shepherd[1]: Transient service sshd-4213 terminated, now unregistered. Aug 9 23:35:49 localhost sshd[59891]: error: kex_exchange_identification: banner line contains invalid characters Aug 9 23:35:49 localhost sshd[59891]: banner exchange: Connection from X.X.X.104 port 59748: invalid format Aug 9 23:35:49 localhost shepherd[1]: 2 connections still in use after sshd-4214 termination. Aug 9 23:35:49 localhost shepherd[1]: Service sshd-4214 (PID 59891) exited with 255. Aug 9 23:35:49 localhost shepherd[1]: Service sshd-4214 has been disabled. Aug 9 23:35:49 localhost shepherd[1]: Transient service sshd-4214 terminated, now unregistered. Aug 9 23:36:02 localhost sshd[60000]: error: kex_exchange_identification: banner line contains invalid characters Aug 9 23:36:02 localhost sshd[60000]: banner exchange: Connection from X.X.X.104 port 60776: invalid format Aug 9 23:36:02 localhost shepherd[1]: 1 connection still in use after sshd-4215 termination. Aug 9 23:36:02 localhost shepherd[1]: Service sshd-4215 (PID 60000) exited with 255. Aug 9 23:36:02 localhost shepherd[1]: Service sshd-4215 has been disabled. Aug 9 23:36:02 localhost shepherd[1]: Transient service sshd-4215 terminated, now unregistered. Aug 9 23:36:14 localhost ntpd[1706]: Soliciting pool server X.X.X.191 Aug 9 23:37:21 localhost ntpd[1706]: Soliciting pool server X.X.X.75 Aug 9 23:37:25 localhost shepherd[1]: Accepted connection on 0.0.0.0:22 from X.X.X.93:34472. Aug 9 23:37:25 localhost shepherd[1]: Service sshd-4216 has been started. Aug 9 23:38:28 localhost ntpd[1706]: Soliciting pool server X.X.X.38 Aug 9 23:38:46 localhost sshd[63802]: error: kex_exchange_identification: Connection closed by remote host Aug 9 23:38:46 localhost sshd[63802]: Connection closed by X.X.X.93 port 34472 Aug 9 23:38:46 localhost shepherd[1]: 1 connection still in use after sshd-4216 termination. Aug 9 23:38:46 localhost shepherd[1]: Service sshd-4216 (PID 63802) exited with 255. Aug 9 23:38:46 localhost shepherd[1]: Service sshd-4216 has been disabled. Aug 9 23:38:46 localhost shepherd[1]: Transient service sshd-4216 terminated, now unregistered. --8<---------------cut here---------------end--------------->8--- X.X.X.104 is opening a dozen connections that it keeps alive simultaneously for a while (obviously scanning for vulnerabilities), eventually closing them. The connection count shown in messages does go down to 1 in the end, as expected. As an example, ‘sshd-4189’ was up for 1.5 minute, and that seems to be roughly the lifetime of these. It does mean that for a couple of minutes we had a peak of 18 simultaneous connections: --8<---------------cut here---------------start------------->8--- ludo@berlin ~$ sudo sh -c 'cat /var/log/messages; zcat /var/log/messages.1.gz' | grep "still in use"| cut -c30-| sed '-es/.* \([0-9]\+\) connection.*/\1/g'|sort -un 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 --8<---------------cut here---------------end--------------->8--- That’s all we have so far. Ludo’.