Re: Help request: problems with a 5.1 server and large numbers of ssh users.

2003-11-21 Thread Peter Jeremy
On Wed, Nov 19, 2003 at 09:26:10PM -0800, Len Sassaman wrote:
>It is my intuition from this behavior that the sshd master process 
>listening for connections is unable to spawn a new process to complete 
>the authentication step, and thus the connection is being dropped. 
>There is no information of use in dmesg, nor in the system logs. (I've 
>cranked up LogLevel to DEBUG3 in sshd_config).

I don't have a solution but a couple of suggestions for further
investigation:

With 150 users logged in (so that no more can log in), what happens
if you start another sshd on a different port (or kill the master
sshd and start another one on port 22)?

What happens if you "ktrace -i" sshd and compare the results when
the 150th client logs in to the results when the 151st client
fails to log in?  Some doctoring of PIDs with sed or similar will
allow you to diff the output without getting buried in non-differences.
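
The PID doctoring could be sketched as follows. This is only an illustration in Python rather than sed, and it assumes each dumped trace line starts with a numeric PID column followed by the command name; real kdump output may differ by version and flags. The function names and sample lines are made up for the example.

```python
import difflib
import re

# Assumed line shape: "<pid> <command> <record type> ...".
# This is an assumption about the dump format, not a guarantee.
_LEADING_PID = re.compile(r"^\s*\d+(?=\s)")


def mask_pids(lines):
    """Replace the leading PID column of each line with a fixed placeholder."""
    return [_LEADING_PID.sub("  PID", line) for line in lines]


def diff_traces(good, bad):
    """Return a unified diff of two traces after masking their PIDs."""
    return list(difflib.unified_diff(mask_pids(good), mask_pids(bad),
                                     "150th", "151st", lineterm=""))


if __name__ == "__main__":
    # Made-up sample lines, not captured from a real system.
    ok = [" 63990 sshd     CALL  pipe", " 63990 sshd     RET   pipe 0"]
    failed = [" 63993 sshd     CALL  pipe", " 63993 sshd     RET   pipe -1"]
    print("\n".join(diff_traces(ok, failed)))
```

With the PIDs masked, only lines that differ in substance show up in the diff.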

I presume that the clients are attempting to connect from more than
one host (i.e., it's not a resource problem in the client).

Peter
___
[EMAIL PROTECTED] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: Help request: problems with a 5.1 server and large numbers of ssh users.

2003-11-20 Thread Jamie Clark
Tim Kientzle wrote:

> Try an 'fstat' when connections start getting dropped.
> I wonder if something (PAM module, maybe?) is opening a
> file on each connection and you're running out of per-process
> file descriptors.

A similar thing happened here - although it wasn't sshd at fault. Len 
mentioned using ldap authentication.

nss_ldap and/or pam_ldap use TCP connections to connect to the LDAP 
server. In my case there was another big consumer of persistent ldap 
connections that caused slapd to reach its default 1024 descriptor limit 
(which required a compile-time adjustment). I found this by tracing the 
master slapd process.

-Jamie



Re: Help request: problems with a 5.1 server and large numbers of ssh users.

2003-11-20 Thread Len Sassaman
> Hmm.  Well, it certainly sounds like a resource limit to me, especially if
> it's a nice round number like "150" or "300".  However, I'm also having a
> bit of trouble seeing, off the top of my head, which limit it might be.
> It sounds like you've got the ones I would think of.  A quick skim of
> sshd.c suggests that it is pretty careful to document various failure
> modes in debugging output.  There are one or two failures where it does
> not log, and they include the call to pipe() in the server loop -- if that
> fails, it bails without an error, which is a little surprising.  Could you
> post server debug output for the first connection to the server that
> fails?  This would let us "see how far it got"...  In particular, whether
> it did spawn a child process, etc.

I have never gotten this to fail when sshd is running in debug mode 
(i.e., sshd -ddd). However, given that it doesn't fork when run with 
-d, that still doesn't tell us too much.

When I set LogLevel DEBUG3, this is as much info as I am given in the 
auth.log:

Nov 20 16:39:19 clyde sshd[63993]: Failed none for rabbi from 127.0.0.1 port 62701 ssh2

And this is the debug output for the connection, as seen from the 
client:

bash-2.05b# ssh -vvv -l rabbi localhost
OpenSSH_3.6.1p2, SSH protocols 1.5/2.0, OpenSSL 0x0090701f
debug1: Reading configuration data /etc/ssh/ssh_config
debug1: Rhosts Authentication disabled, originating port will not be trusted.
debug2: ssh_connect: needpriv 0
debug1: Connecting to localhost [::1] port 22.
socket: Protocol not supported
debug1: Connecting to localhost [127.0.0.1] port 22.
debug1: Connection established.
debug1: identity file /root/.ssh/identity type -1
debug1: identity file /root/.ssh/id_rsa type -1
debug1: identity file /root/.ssh/id_dsa type -1
ssh_exchange_identification: Connection closed by remote host

This can't be a system-wide process related resource issue, I don't 
think, because once a user connects and authenticates, there are no 
problems of note. I'm leaning toward a socket related limit or 
user-level limit. However, since sysctl tells me:

kern.ipc.maxsockbuf: 262144
kern.ipc.somaxconn: 16384
kern.ipc.numopensockets: 2201
kern.ipc.maxsockets: 49312

I tend to not believe the former, and why the latter would be occurring 
escapes me as well.
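
To put a number on how far the socket counters are from their ceiling, a small sketch like the one below can parse the "name: value" lines and compute the ratio. The function names are invented for this illustration; the sample values are the ones quoted in this message, which work out to roughly 4.5% of kern.ipc.maxsockets in use.

```python
def parse_sysctl(text):
    """Parse 'name: value' lines into a dict of ints, skipping anything else."""
    values = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if sep and value.strip().isdigit():
            values[name.strip()] = int(value)
    return values


def socket_usage(values):
    """Fraction of kern.ipc.maxsockets currently in use."""
    return values["kern.ipc.numopensockets"] / values["kern.ipc.maxsockets"]


if __name__ == "__main__":
    sample = """\
kern.ipc.maxsockbuf: 262144
kern.ipc.somaxconn: 16384
kern.ipc.numopensockets: 2201
kern.ipc.maxsockets: 49312
"""
    print(f"{socket_usage(parse_sysctl(sample)):.1%} of maxsockets in use")
```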



Re: Help request: problems with a 5.1 server and large numbers of ssh users.

2003-11-20 Thread Tim Kientzle
Len Sassaman wrote:

> The problem is that after about 150 users log in (300ish sshd sessions, 
> since I am using privsep), incoming connections start getting dropped. 

That number (150) sounds awfully familiar; I feel like
I've seen it somewhere recently.

Try an 'fstat' when connections start getting dropped.
I wonder if something (PAM module, maybe?) is opening a
file on each connection and you're running out of per-process
file descriptors.

Tim Kientzle
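
One way to read the fstat output for this purpose is to count open descriptors per process. The sketch below is a rough illustration only: it assumes whitespace-separated columns in the order USER CMD PID FD ..., and the sample rows are made up rather than captured from the affected server.

```python
from collections import Counter


def count_descriptors(fstat_text):
    """Count numeric descriptors per (command, pid) in fstat-style output.

    Assumes columns USER CMD PID FD ...; FD entries such as 'text', 'wd'
    or 'root' are not descriptors and are skipped. A trailing '*' on the
    FD value is tolerated.
    """
    counts = Counter()
    for line in fstat_text.splitlines():
        fields = line.split()
        if len(fields) < 4 or not fields[2].isdigit():
            continue  # header or malformed line
        fd = fields[3].rstrip("*")
        if fd.isdigit():
            counts[(fields[1], int(fields[2]))] += 1
    return counts


if __name__ == "__main__":
    # Made-up sample rows for illustration.
    sample = """\
USER     CMD          PID   FD MOUNT      INUM MODE         SZ|DV R/W
root     sshd         412 text /usr      12345 -r-xr-xr-x  200000  r
root     sshd         412   wd /             2 drwxr-xr-x     512  r
root     sshd         412    3* internet stream tcp
root     sshd         412    4* internet stream tcp
"""
    for (cmd, pid), n in count_descriptors(sample).most_common(5):
        print(cmd, pid, n)
```

A process whose count keeps climbing with every new connection would be the one to look at first.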



Re: Help request: problems with a 5.1 server and large numbers of ssh users.

2003-11-20 Thread Robert Watson

On Thu, 20 Nov 2003, Ken Smith wrote:

> On Thu, Nov 20, 2003 at 10:56:08AM -0500, Robert Watson wrote:
> 
> > Hmm.  Well, it certainly sounds like a resource limit to me, especially if
> > it's a nice round number like "150" or "300".
> 
> One possibility might be running out of pseudo-terminals to support the
> login sessions.  pty's are created as needed I think, and the code that
> handles it is in sys/kern/tty_pty.c.  The limits on it appear to be 256
> ptys: 

I thought about that, but the submitter indicated that pty's were not
being allocated.  However, that would be a really good thing to verify,
since the numbers come out right...

I should really clean up and commit my pty cleanup at some point, as well
as support for forkpty()/openpty()/etc that avoid the sort of code found
below.  Presumably that would be a 5.3 thing. 

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories


> 
> /*
>  * This function creates and initializes a pts/ptc pair
>  *
>  * pts == /dev/tty[pqrsPQRS][0123456789abcdefghijklmnopqrstuv]
>  * ptc == /dev/pty[pqrsPQRS][0123456789abcdefghijklmnopqrstuv]
>  *
>  * XXX: define and add mapping of upper minor bits to allow more
>  *  than 256 ptys.
>  */
> 
> I don't know if simply changing the :
> 
>   static char *names = "pqrsPQRS";
> 
> to something longer is all that would be required or if there are
> other factors involved.
> 
> -- 
>   Ken Smith
> - From there to here, from here to  |   [EMAIL PROTECTED]
>   there, funny things are everywhere.   |
>   - Theodore Geisel |
> 



Re: Help request: problems with a 5.1 server and large numbers of ssh users.

2003-11-20 Thread Ken Smith
On Thu, Nov 20, 2003 at 10:56:08AM -0500, Robert Watson wrote:

> Hmm.  Well, it certainly sounds like a resource limit to me, especially if
> it's a nice round number like "150" or "300".

One possibility might be running out of pseudo-terminals to support
the login sessions.  pty's are created as needed I think, and the
code that handles it is in sys/kern/tty_pty.c.  The limits on it
appear to be 256 ptys:

/*
 * This function creates and initializes a pts/ptc pair
 *
 * pts == /dev/tty[pqrsPQRS][0123456789abcdefghijklmnopqrstuv]
 * ptc == /dev/pty[pqrsPQRS][0123456789abcdefghijklmnopqrstuv]
 *
 * XXX: define and add mapping of upper minor bits to allow more
 *  than 256 ptys.
 */

I don't know if simply changing the :

static char *names = "pqrsPQRS";

to something longer is all that would be required or if there are
other factors involved.
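
The 256 figure follows directly from the two character sets in the quoted comment: 8 first letters times 32 second characters. A short sketch that enumerates the names makes the arithmetic concrete. It only models the naming scheme; whether the kernel would accept a longer string is exactly the open question above.

```python
# Character sets copied from the tty_pty.c comment quoted above.
FIRST = "pqrsPQRS"
SECOND = "0123456789abcdefghijklmnopqrstuv"


def pty_names(first=FIRST, second=SECOND):
    """Enumerate slave device names of the form /dev/tty<first><second>."""
    return [f"/dev/tty{a}{b}" for a in first for b in second]


if __name__ == "__main__":
    names = pty_names()
    # Prints the count plus the first and last name in the sequence.
    print(len(names), names[0], names[-1])
```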

-- 
Ken Smith
- From there to here, from here to  |   [EMAIL PROTECTED]
  there, funny things are everywhere.   |
  - Theodore Geisel |


Re: Help request: problems with a 5.1 server and large numbers of ssh users.

2003-11-20 Thread Robert Watson

On Wed, 19 Nov 2003, Len Sassaman wrote:

> It is my intuition from this behavior that the sshd master process
> listening for connections is unable to spawn a new process to complete
> the authentication step, and thus the connection is being dropped. There
> is no information of use in dmesg, nor in the system logs. (I've cranked
> up LogLevel to DEBUG3 in sshd_config). 
> 
> I have a RedHat Linux server running the 2.4.18-3smp kernel on a dual
> Athlon MP 1800+ and 2048MB RAM that is known to handle 1000 users
> without issue -- so I have to believe the FreeBSD box, though not as
> beefy hardware-wise, should be able to do better than a few hundred
> users. I believe this to be some sort of resource limit issue, but I
> have addressed everything I could think of. 

Hmm.  Well, it certainly sounds like a resource limit to me, especially if
it's a nice round number like "150" or "300".  However, I'm also having a
bit of trouble seeing, off the top of my head, which limit it might be. 
It sounds like you've got the ones I would think of.  A quick skim of
sshd.c suggests that it is pretty careful to document various failure
modes in debugging output.  There are one or two failures where it does
not log, and they include the call to pipe() in the server loop -- if that
fails, it bails without an error, which is a little surprising.  Could you
post server debug output for the first connection to the server that
fails?  This would let us "see how far it got"...  In particular, whether
it did spawn a child process, etc.
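
For reference, descriptor exhaustion is one way a pipe() call can fail. The sketch below is not sshd's code; it is a stand-alone illustration that lowers the calling process's own descriptor limit and calls pipe() until it fails, to show the error that results (EMFILE on typical systems). The function name and the limit of 64 are arbitrary choices for the example.

```python
import errno
import os
import resource


def pipes_until_failure(soft_limit=64):
    """Lower this process's descriptor limit, then call pipe() until it fails.

    Returns (pipes_created, errno_of_failure). Every pipe is closed and the
    original limit is restored before returning.
    """
    old_soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    new_soft = soft_limit
    if old_soft != resource.RLIM_INFINITY:
        new_soft = min(soft_limit, old_soft)
    resource.setrlimit(resource.RLIMIT_NOFILE, (new_soft, hard))
    opened = []
    failure = None
    try:
        while failure is None:
            try:
                opened.append(os.pipe())
            except OSError as exc:
                failure = exc.errno
    finally:
        for read_fd, write_fd in opened:
            os.close(read_fd)
            os.close(write_fd)
        resource.setrlimit(resource.RLIMIT_NOFILE, (old_soft, hard))
    return len(opened), failure


if __name__ == "__main__":
    count, err = pipes_until_failure()
    print(count, errno.errorcode.get(err, err))
```

A server that ignores this return value would drop the connection with nothing in its logs, which matches the symptom described.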

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories




Help request: problems with a 5.1 server and large numbers of ssh users.

2003-11-19 Thread Len Sassaman
Hi folks,

I have a problem, and I am unable to find previous discussions of it. 
Any pointers or clues would be much appreciated.

I have a FreeBSD 5.1 server that needs to be able to handle several 
thousand simultaneous ssh sessions from distinct users. (I am using 
FreeBSD 5.1 because I need to be able to support ldap authentication.)

Hardware info:
CPU: AMD Athlon(tm) MP 2000+ (1666.74-MHz 686-class CPU)
real memory  = 1610088448 (1535 MB)
avail memory = 1558822912 (1486 MB)

My version of ssh is 3.6.1p2 patched to address the security concerns. 
(I am not using 3.7.1p because it dropped support for password 
authentication with PAM, and I cannot assume keyboard-interactive 
authentication will be present in my users' clients.)

All of these users are doing ssh port forwarding, and are not assigned 
ptys.

I have not modified login.conf in any way -- the defaults of "no 
limits" remain.

The kernel tunables in /boot/loader.conf are set to:

kern.maxfiles="49312"
kern.maxproc="24656"
kern.maxprocperuid="11094"
kern.ipc.maxsockets="24656"
kern.ipc.somaxconn="8192"

The kernel is compiled with NMBCLUSTERS=65536 and maxusers=0 (which 
defaults to 384).

The problem is that after about 150 users log in (300ish sshd sessions, 
since I am using privsep), incoming connections start getting dropped. 
i.e.,

bash-2.05b# ssh -v localhost
OpenSSH_3.6.1p2, SSH protocols 1.5/2.0, OpenSSL 0x0090701f
debug1: Reading configuration data /etc/ssh/ssh_config
debug1: Rhosts Authentication disabled, originating port will not be trusted.
debug1: Connecting to localhost [::1] port 22.
socket: Protocol not supported
debug1: Connecting to localhost [127.0.0.1] port 22.
debug1: Connection established.
debug1: identity file /root/.ssh/identity type -1
debug1: identity file /root/.ssh/id_rsa type -1
debug1: identity file /root/.ssh/id_dsa type -1
ssh_exchange_identification: Connection closed by remote host
debug1: Calling cleanup 0x805f010(0x0)
bash-2.05b#

It is my intuition from this behavior that the sshd master process 
listening for connections is unable to spawn a new process to complete 
the authentication step, and thus the connection is being dropped. 
There is no information of use in dmesg, nor in the system logs. (I've 
cranked up LogLevel to DEBUG3 in sshd_config).

I have a RedHat Linux server running the 2.4.18-3smp kernel on a dual 
Athlon MP 1800+ and 2048MB RAM that is known to handle 1000 users 
without issue -- so I have to believe the FreeBSD box, though not as 
beefy hardware-wise, should be able to do better than a few hundred 
users. I believe this to be some sort of resource limit issue, but I 
have addressed everything I could think of.

Here's the sysctl vm.zone output:

vm.zone:
ITEM        SIZE,    LIMIT,   USED,   FREE,  REQUESTS
FFS2 dinode: 256,0,   1089, 21, 1359
FFS1 dinode: 128,0,  0,  0,0
FFS inode:   144,0,   1089, 59, 1359
SWAPMETA:276,   121576,  0,  0,0
unpcb:   140,65548,329, 63,31364
ripcb:   228,49317,  0,  0,0
syncache:136,15370,  0, 58,36747
tcptw:48,49385,   3812,255,89831
tcpcb:   360,49313,   1048, 63,   195072
inpcb:   228,49317,   4921, 94,   195072
udpcb:   228,49317,  1, 33,   114497
socket:  256,49320,   1383,102,   340934
KNOTE:64,0,  0,124,   114453
PIPE:176,0,622, 68,17402
DIRHASH:1024,0,138,  6,  138
NAMEI:  1024,0,  9, 11,   451791
VNODEPOLL:76,0,  0,  0,0
VNODE:   292,0,   1473, 35, 1473
g_bio:   144,0,259, 49,   186276
VMSPACE: 256,0,424, 26,11035
UPCALL:   44,0,  0,  0,0
KSE:  64,0,496, 62,  496
KSEGRP:  120,0,496, 62,  496
THREAD:  292,0,496, 11,  496
PROC:480,0,461, 35,11074
Files:60,0,   6051,153, 89241268
65536: 65536,0,  3,  3,3
32768: 32768,0,  3,  3,   32
16384: 16384,0, 56, 22, 1733
8192:   8192,0,  2,  4,   50
4096:   4096,0,736, 44,11965
2048:   2048,0, 71,  5,   359215
1024:   1024,0,408, 20,   284756
512: 512,0,102, 18,43908
256: 256,0,   5166, 84,   131327
128: 128,0,   6784,253,   535182
64:   64,0,   3032, 68,87489
32:   32,0,   2155,182,   211243
16:   16,0,   4485,