Re: Help request: problems with a 5.1 server and large numbers of ssh users.
On Wed, Nov 19, 2003 at 09:26:10PM -0800, Len Sassaman wrote:
>It is my intuition from this behavior that the sshd master process
>listening for connections is unable to spawn a new process to complete
>the authentication step, and thus the connection is being dropped.
>There is no information of use in dmesg, nor in the system logs. (I've
>cranked up LogLevel to DEBUG3 in sshd_config).

I don't have a solution but a couple of suggestions for further
investigation:

With 150 users logged in (so that no more can log in), what happens
if you start another sshd on a different port (or kill the master
sshd and start another one on port 22)?

What happens if you "ktrace -i" sshd and compare the results when the
150th client logs in to the results when the 151st client fails to log
in? Some doctoring of PIDs with sed or similar will allow you to diff
the output without getting buried in non-differences.

I presume that the clients are attempting to connect from more than one
host (i.e. it's not a resource problem in the client).

Peter
___
[EMAIL PROTECTED] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "[EMAIL PROTECTED]"
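The PID doctoring suggested above could be sketched roughly as follows. This is a minimal Python sketch, not a tested recipe: it assumes the two traces have already been dumped to text (for example with kdump) and that every line starts with a numeric PID column. The function names and the "client150"/"client151" labels are hypothetical.

```python
import difflib
import re

# Assumption: each trace line begins with a numeric PID column,
# e.g. " 63993 sshd     CALL  ...". Adjust the pattern if it does not.
_PID_COLUMN = re.compile(r"^\s*\d+\s")


def normalize_pids(lines):
    """Replace the leading PID column with a fixed placeholder."""
    return [_PID_COLUMN.sub("PID ", line) for line in lines]


def diff_traces(ok_lines, failed_lines):
    """Unified diff of two traces after PID normalization.

    Returns an empty list when the traces differ only in their PIDs.
    """
    return list(
        difflib.unified_diff(
            normalize_pids(ok_lines),
            normalize_pids(failed_lines),
            fromfile="client150",
            tofile="client151",
            lineterm="",
        )
    )
```

Feeding it the trace of the last successful login and the trace of the first failed one should leave only the lines where the two runs actually diverge.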
Re: Help request: problems with a 5.1 server and large numbers of ssh users.
Tim Kientzle wrote:
> Try an 'fstat' when connections start getting dropped. I wonder if
> something (PAM module, maybe?) is opening a file on each connection
> and you're running out of per-process file descriptors.

A similar thing happened here - although it wasn't sshd at fault.

Len mentioned using ldap authentication. nss_ldap and/or pam_ldap use
TCP connections to connect to the LDAP server. In my case there was
another big consumer of persistent ldap connections that caused slapd
to reach its default 1024 descriptor limit (which required a
compile-time adjustment). Found this by tracing the master slapd
process.

-Jamie
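To make the descriptor-exhaustion theory concrete, here is a small Python sketch of the arithmetic. The only number taken from this thread is the 1024 descriptor limit mentioned for slapd; the baseline and per-connection counts are placeholders, and both function names are hypothetical.

```python
import resource


def connections_until_exhaustion(fd_limit, baseline_fds, fds_per_connection):
    """How many connections fit before a process hits its descriptor limit.

    baseline_fds: descriptors the process holds with no clients connected.
    fds_per_connection: descriptors held open for each connected client.
    """
    if fds_per_connection <= 0:
        raise ValueError("fds_per_connection must be positive")
    return max(0, (fd_limit - baseline_fds) // fds_per_connection)


def current_fd_limits():
    """Return the (soft, hard) RLIMIT_NOFILE pair for the calling process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)
```

For example, `connections_until_exhaustion(1024, 24, 2)` returns 500: with a 1024 limit, a process holding 24 descriptors at idle and 2 per client stops accepting work at 500 clients. Comparing such an estimate with the observed cutoff is one way to judge whether a given process is a plausible culprit.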
Re: Help request: problems with a 5.1 server and large numbers of ssh users.
Robert Watson wrote:
> Hmm. Well, it certainly sounds like a resource limit to me, especially if
> it's a nice round number like "150" or "300". However, I'm also having a
> bit of trouble seeing, off the top of my head, which limit it might be.
> It sounds like you've got the ones I would think of. A quick skim of
> sshd.c suggests that it is pretty careful to document various failure
> modes in debugging output. There are one or two failures where it does
> not log, and they include the call to pipe() in the server loop -- if
> that fails, it bails without an error, which is a little surprising.
>
> Could you post server debug output for the first connection to the server
> that fails? This would let us "see how far it got"... In particular,
> whether it did spawn a child process, etc.

I have never gotten this to fail when sshd is running in debug mode
(i.e., sshd -ddd). However, given that it doesn't fork when run with
-d, that still doesn't tell us too much.

When I set LogLevel DEBUG3, this is as much info as I am given in the
auth.log:

  Nov 20 16:39:19 clyde sshd[63993]: Failed none for rabbi from 127.0.0.1 port 62701 ssh2

And this is the debug output for the connection, as seen from the
client:

  bash-2.05b# ssh -vvv -l rabbi localhost
  OpenSSH_3.6.1p2, SSH protocols 1.5/2.0, OpenSSL 0x0090701f
  debug1: Reading configuration data /etc/ssh/ssh_config
  debug1: Rhosts Authentication disabled, originating port will not be trusted.
  debug2: ssh_connect: needpriv 0
  debug1: Connecting to localhost [::1] port 22.
  socket: Protocol not supported
  debug1: Connecting to localhost [127.0.0.1] port 22.
  debug1: Connection established.
  debug1: identity file /root/.ssh/identity type -1
  debug1: identity file /root/.ssh/id_rsa type -1
  debug1: identity file /root/.ssh/id_dsa type -1
  ssh_exchange_identification: Connection closed by remote host

This can't be a system-wide process related resource issue, I don't
think, because once a user connects and authenticates, there are no
problems of note.

I'm leaning toward a socket related limit or user-level limit. However,
since sysctl tells me:

  kern.ipc.maxsockbuf: 262144
  kern.ipc.somaxconn: 16384
  kern.ipc.numopensockets: 2201
  kern.ipc.maxsockets: 49312

I tend to not believe the former, and why the latter would be occurring
escapes me as well.
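The reasoning about the socket limit can be checked with a little arithmetic. The Python sketch below uses the four sysctl values quoted in the message above; the parsing helper and function names are hypothetical and only handle simple "name: value" lines.

```python
# Values copied from the sysctl output quoted in the message above.
SYSCTL_OUTPUT = """\
kern.ipc.maxsockbuf: 262144
kern.ipc.somaxconn: 16384
kern.ipc.numopensockets: 2201
kern.ipc.maxsockets: 49312
"""


def parse_sysctl(text):
    """Parse 'name: value' lines into a dict of integers."""
    values = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if sep and value.strip().isdigit():
            values[name.strip()] = int(value)
    return values


def socket_usage(values):
    """Fraction of the system-wide socket limit currently in use."""
    return values["kern.ipc.numopensockets"] / values["kern.ipc.maxsockets"]


usage = socket_usage(parse_sysctl(SYSCTL_OUTPUT))
print(f"sockets in use: {usage:.1%} of the limit")
```

With 2201 of 49312 sockets open, usage is roughly 4.5% of the limit, which is consistent with the doubt expressed above that the system-wide socket limit is the cause.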
Re: Help request: problems with a 5.1 server and large numbers of ssh users.
Len Sassaman wrote:
> The problem is that after about 150 users log in (300ish sshd
> sessions, since I am using privsep), incoming connections start
> getting dropped.

That number (150) sounds awfully familiar; I feel like I've seen it
somewhere recently. Hmm.

Try an 'fstat' when connections start getting dropped. I wonder if
something (PAM module, maybe?) is opening a file on each connection
and you're running out of per-process file descriptors.

Tim Kientzle
Re: Help request: problems with a 5.1 server and large numbers of ssh users.
On Thu, 20 Nov 2003, Ken Smith wrote:

> On Thu, Nov 20, 2003 at 10:56:08AM -0500, Robert Watson wrote:
>
> > Hmm. Well, it certainly sounds like a resource limit to me, especially if
> > it's a nice round number like "150" or "300".
>
> One possibility might be running out of pseudo-terminals to support the
> login sessions. pty's are created as needed I think, and the code that
> handles it is in sys/kern/tty_pty.c. The limits on it appear to be 256
> ptys:

I thought about that, but the submitter indicated that pty's were not
being allocated. However, that would be a really good thing to verify,
since the numbers come out right... I should really clean up and commit
my pty cleanup at some point, as well as support for
forkpty()/openpty()/etc that avoid the sort of code found below.
Presumably that would be a 5.3 thing.

Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]             Network Associates Laboratories

>
> /*
>  * This function creates and initializes a pts/ptc pair
>  *
>  * pts == /dev/tty[pqrsPQRS][0123456789abcdefghijklmnopqrstuv]
>  * ptc == /dev/pty[pqrsPQRS][0123456789abcdefghijklmnopqrstuv]
>  *
>  * XXX: define and add mapping of upper minor bits to allow more
>  * than 256 ptys.
>  */
>
> I don't know if simply changing the :
>
> static char *names = "pqrsPQRS";
>
> to something longer is all that would be required or if there are
> other factors involved.
>
> --
> Ken Smith
> - From there to here, from here to | [EMAIL PROTECTED]
> there, funny things are everywhere. |
> - Theodore Geisel |
>
Re: Help request: problems with a 5.1 server and large numbers of ssh users.
On Thu, Nov 20, 2003 at 10:56:08AM -0500, Robert Watson wrote:

> Hmm. Well, it certainly sounds like a resource limit to me, especially if
> it's a nice round number like "150" or "300".

One possibility might be running out of pseudo-terminals to support the
login sessions. pty's are created as needed I think, and the code that
handles it is in sys/kern/tty_pty.c. The limits on it appear to be 256
ptys:

/*
 * This function creates and initializes a pts/ptc pair
 *
 * pts == /dev/tty[pqrsPQRS][0123456789abcdefghijklmnopqrstuv]
 * ptc == /dev/pty[pqrsPQRS][0123456789abcdefghijklmnopqrstuv]
 *
 * XXX: define and add mapping of upper minor bits to allow more
 * than 256 ptys.
 */

I don't know if simply changing the :

static char *names = "pqrsPQRS";

to something longer is all that would be required or if there are
other factors involved.

--
Ken Smith
- From there to here, from here to | [EMAIL PROTECTED]
there, funny things are everywhere. |
- Theodore Geisel |
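The 256 figure follows directly from the naming scheme in the quoted comment: 8 letters in names times 32 suffix characters. A small Python sketch of that arithmetic, using only the two character sets shown above. The function name is hypothetical, and this mirrors the comment, not the actual kernel code.

```python
# Character sets taken from the tty_pty.c comment quoted above.
NAMES = "pqrsPQRS"
SUFFIXES = "0123456789abcdefghijklmnopqrstuv"


def pty_slave_names(names=NAMES, suffixes=SUFFIXES):
    """Enumerate every /dev/tty?? name the two-character scheme allows."""
    return [f"/dev/tty{n}{s}" for n in names for s in suffixes]


all_ptys = pty_slave_names()
print(len(all_ptys))  # 8 letters * 32 suffixes = 256
```

Under this scheme each extra letter appended to names adds 32 more device names, which is the idea behind the question above. Whether the rest of the driver would accept the longer string is exactly what remains open.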
Re: Help request: problems with a 5.1 server and large numbers of ssh users.
On Wed, 19 Nov 2003, Len Sassaman wrote:

> It is my intuition from this behavior that the sshd master process
> listening for connections is unable to spawn a new process to complete
> the authentication step, and thus the connection is being dropped. There
> is no information of use in dmesg, nor in the system logs. (I've cranked
> up LogLevel to DEBUG3 in sshd_config).
>
> I have a RedHat Linux server running the 2.4.18-3smp kernel on a dual
> Athlon MP 1800+ and 2048MB RAM that is known to handle 1000 users
> without issue -- so I have to believe the FreeBSD box, though not as
> beefy hardware-wise, should be able to do better than a few hundred
> users. I believe this to be some sort of resource limit issue, but I
> have addressed everything I could think of.

Hmm. Well, it certainly sounds like a resource limit to me, especially if
it's a nice round number like "150" or "300". However, I'm also having a
bit of trouble seeing, off the top of my head, which limit it might be.
It sounds like you've got the ones I would think of. A quick skim of
sshd.c suggests that it is pretty careful to document various failure
modes in debugging output. There are one or two failures where it does
not log, and they include the call to pipe() in the server loop -- if
that fails, it bails without an error, which is a little surprising.

Could you post server debug output for the first connection to the server
that fails? This would let us "see how far it got"... In particular,
whether it did spawn a child process, etc.

Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]             Network Associates Laboratories
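For illustration only, the general pattern of logging a pipe() failure before giving up might look like the Python sketch below. It is not sshd's code and makes no claim about how sshd.c is structured. The function name, the logger name, and the injectable pipe_fn parameter (there so the failure path can be exercised without actually exhausting descriptors) are all hypothetical.

```python
import errno
import logging
import os

log = logging.getLogger("listener-sketch")


def open_pipe_logged(pipe_fn=os.pipe):
    """Create a pipe; on failure, log the errno instead of bailing silently.

    Returns a (read_fd, write_fd) pair, or None if the pipe could not
    be created.
    """
    try:
        return pipe_fn()
    except OSError as exc:
        name = errno.errorcode.get(exc.errno, "unknown")
        log.error("pipe() failed: %s (%s)", name, exc.strerror)
        return None
```

A failure caused by descriptor exhaustion would then show up in the log as EMFILE or ENFILE, which is the kind of hint that is missing when the call fails without any message.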
Help request: problems with a 5.1 server and large numbers of ssh users.
Hi folks,

I have a problem, and I am unable to find previous discussions of it.
Any pointers or clues would be much appreciated.

I have a FreeBSD 5.1 server that needs to be able to handle several
thousand simultaneous ssh sessions from distinct users. (I am using
FreeBSD 5.1 because I need to be able to support ldap authentication.)

Hardware info:

  CPU: AMD Athlon(tm) MP 2000+ (1666.74-MHz 686-class CPU)
  real memory = 1610088448 (1535 MB)
  avail memory = 1558822912 (1486 MB)

My version of ssh is 3.6.1p2 patched to address the security concerns.
(I am not using 3.7.1p because it dropped support for password
authentication with PAM, and I cannot assume keyboard-interactive
authentication will be present in my users' clients.)

All of these users are doing ssh port forwarding, and are not assigned
ptys.

I have not modified login.conf in any way -- the defaults of "no
limits" remain.

The kernel tunables in /boot/loader.conf are set to:

  kern.maxfiles="49312"
  kern.maxproc="24656"
  kern.maxprocperuid="11094"
  kern.ipc.maxsockets="24656"
  kern.ipc.somaxconn="8192"

The kernel is compiled with NMBCLUSTERS=65536 and maxusers=0 (which
defaults to 384).

The problem is that after about 150 users log in (300ish sshd sessions,
since I am using privsep), incoming connections start getting dropped.

i.e.,

  bash-2.05b# ssh -v localhost
  OpenSSH_3.6.1p2, SSH protocols 1.5/2.0, OpenSSL 0x0090701f
  debug1: Reading configuration data /etc/ssh/ssh_config
  debug1: Rhosts Authentication disabled, originating port will not be trusted.
  debug1: Connecting to localhost [::1] port 22.
  socket: Protocol not supported
  debug1: Connecting to localhost [127.0.0.1] port 22.
  debug1: Connection established.
  debug1: identity file /root/.ssh/identity type -1
  debug1: identity file /root/.ssh/id_rsa type -1
  debug1: identity file /root/.ssh/id_dsa type -1
  ssh_exchange_identification: Connection closed by remote host
  debug1: Calling cleanup 0x805f010(0x0)
  bash-2.05b#

It is my intuition from this behavior that the sshd master process
listening for connections is unable to spawn a new process to complete
the authentication step, and thus the connection is being dropped. There
is no information of use in dmesg, nor in the system logs. (I've cranked
up LogLevel to DEBUG3 in sshd_config).

I have a RedHat Linux server running the 2.4.18-3smp kernel on a dual
Athlon MP 1800+ and 2048MB RAM that is known to handle 1000 users
without issue -- so I have to believe the FreeBSD box, though not as
beefy hardware-wise, should be able to do better than a few hundred
users. I believe this to be some sort of resource limit issue, but I
have addressed everything I could think of.

Here's the sysctl vm.zone output:

vm.zone:
ITEM          SIZE    LIMIT  USED FREE  REQUESTS
FFS2 dinode:   256,      0, 1089,  21,     1359
FFS1 dinode:   128,      0,    0,   0,        0
FFS inode:     144,      0, 1089,  59,     1359
SWAPMETA:      276, 121576,    0,   0,        0
unpcb:         140,  65548,  329,  63,    31364
ripcb:         228,  49317,    0,   0,        0
syncache:      136,  15370,    0,  58,    36747
tcptw:          48,  49385, 3812, 255,    89831
tcpcb:         360,  49313, 1048,  63,   195072
inpcb:         228,  49317, 4921,  94,   195072
udpcb:         228,  49317,    1,  33,   114497
socket:        256,  49320, 1383, 102,   340934
KNOTE:          64,      0,    0, 124,   114453
PIPE:          176,      0,  622,  68,    17402
DIRHASH:      1024,      0,  138,   6,      138
NAMEI:        1024,      0,    9,  11,   451791
VNODEPOLL:      76,      0,    0,   0,        0
VNODE:         292,      0, 1473,  35,     1473
g_bio:         144,      0,  259,  49,   186276
VMSPACE:       256,      0,  424,  26,    11035
UPCALL:         44,      0,    0,   0,        0
KSE:            64,      0,  496,  62,      496
KSEGRP:        120,      0,  496,  62,      496
THREAD:        292,      0,  496,  11,      496
PROC:          480,      0,  461,  35,    11074
Files:          60,      0, 6051, 153, 89241268
65536:       65536,      0,    3,   3,        3
32768:       32768,      0,    3,   3,       32
16384:       16384,      0,   56,  22,     1733
8192:         8192,      0,    2,   4,       50
4096:         4096,      0,  736,  44,    11965
2048:         2048,      0,   71,   5,   359215
1024:         1024,      0,  408,  20,   284756
512:           512,      0,  102,  18,    43908
256:           256,      0, 5166,  84,   131327
128:           128,      0, 6784, 253,   535182
64:             64,      0, 3032,  68,    87489
32:             32,      0, 2155, 182,   211243
16:             16,      0, 4485,