Re: New FreeBSD 5.3 e-mail server extremely slow - traced to getpwnam maybe ?
Quoting Kris Kennaway [EMAIL PROTECTED]:

> On Tue, Jan 04, 2005 at 09:27:27PM -0500, Bruce Campbell wrote:
>
> > I wrote a small program:
> >
> >     #include <sys/types.h>
> >     #include <pwd.h>
> >     main( int argc, char *argv[] ) { getpwuid( 13076 ); }
> >
> > and ran it under truss on 5.x and it generated 178,711 lines of
> > output. (the bulk of which is those lseek/read calls as above)
> > ...
>
> Try tuning the pwd_mkdb parameters (see hash(3)) in
> /usr/src/usr.sbin/pwd_mkdb/pwd_mkdb.c and recompile:
>
>     HASHINFO openinfo = {
>             4096,           /* bsize */
>             32,             /* ffactor */
>             256,            /* nelem */
>             2048 * 1024,    /* cachesize */
>             NULL,           /* hash() */
>             0               /* lorder */
>     };
>
> e.g. adjust nelem to 12000 to accommodate your
> significantly-larger-than-average password database.
>
> If this helps, please submit a PR requesting that someone make an
> option to pwd_mkdb to tune this at runtime (or better yet, submit the
> patch to do this yourself - it's straightforward to modify the source
> to do this).

Thanks. That had no effect on the large number of seeks/reads needed
for a getpwuid of a specific uid. I tried boosting nelem further, still
no change.

I suspect the problem is related to some change to the hash functions
between 4.7 and 5.2.1, and I hope to get to the bottom of it today.

I tried getpwnam (as opposed to getpwuid) on two different userids: one
took 1,000 seeks/reads, the other 16,000, so the cost varies widely from
key to key, no doubt depending on how each one gets hashed. On 4.7 a
lookup takes just one or two reads/seeks.

Since every ipop and imap login, every sendmail, and just about
everything else does a getpwnam, I think this is our problem.

--
Bruce Campbell
Engineering Computing
CPH-2374B
University of Waterloo
(519)888-4567 ext 5889

This mail sent through www.mywaterloo.ca
___
freebsd-questions@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: New FreeBSD 5.3 e-mail server extremely slow - traced to getpwnam maybe ?
Quoting Bruce Campbell [EMAIL PROTECTED]:

> On Tue, Jan 04, 2005 at 09:27:27PM -0500, Bruce Campbell wrote:
>
> > I wrote a small program:
> >
> >     #include <sys/types.h>
> >     #include <pwd.h>
> >     main( int argc, char *argv[] ) { getpwuid( 13076 ); }
> >
> > and ran it under truss on 5.x and it generated 178,711 lines of
> > output. (the bulk of which is those lseek/read calls as above)

It looks like the overhaul of getpwent in Apr/2003 to make it thread
safe may be the problem:

    http://www.freebsd.org/cgi/cvsweb.cgi/src/lib/libc/gen/getpwent.c

I've tested the dbm_fetch function independently on a large file, and
it is fine.

I've opened a bug report, and plan to build a replacement 4.x mail
server, as the most deterministic path to restoring adequate e-mail
service to our users.

Can anyone suggest a workaround?

--
Bruce Campbell
Engineering Computing
CPH-2374B
University of Waterloo
(519)888-4567 ext 5889
Re: New FreeBSD 5.3 e-mail server extremely slow - traced to getpwnam maybe ?
Quoting Bruce Campbell [EMAIL PROTECTED]:

> Quoting Bruce Campbell [EMAIL PROTECTED]:
>
> > On Tue, Jan 04, 2005 at 09:27:27PM -0500, Bruce Campbell wrote:
> >
> > > I wrote a small program:
> > >
> > >     #include <sys/types.h>
> > >     #include <pwd.h>
> > >     main( int argc, char *argv[] ) { getpwuid( 13076 ); }
> > >
> > > and ran it under truss on 5.x and it generated 178,711 lines of
> > > output. (the bulk of which is those lseek/read calls as above)
>
> It looks like the overhaul of getpwent in Apr/2003 to make it thread
> safe may be the problem:
>
>     http://www.freebsd.org/cgi/cvsweb.cgi/src/lib/libc/gen/getpwent.c
>
> I've tested the dbm_fetch function independently on a large file, and
> it is fine.
>
> I've opened a bug report, and plan to build a replacement 4.x mail
> server, as the most deterministic path to restoring adequate e-mail
> service to our users.
>
> Can anyone suggest a workaround?

Well, somewhat unbelievably, copying a getpwent.c from 4.7 and remaking
libc on 5.3 with it worked. Load average has gone from 70 to 2.

And, so that this qualifies as a question...

Am I crazy to pull an old getpwnam from 4.7 and blindly build it on 5.3?

--
Bruce Campbell
Engineering Computing
CPH-2374B
University of Waterloo
(519)888-4567 ext 5889
Re: New FreeBSD 5.3 e-mail server extremely slow - traced to getpwnam maybe ?
Quoting Bruce Campbell [EMAIL PROTECTED]:

> ...
> Well, somewhat unbelievably, copying a getpwent.c from 4.7 and
> remaking libc on 5.3 with it worked. Load average has gone from 70
> to 2.

One of my co-workers has found a less kludgey workaround for the high
load problem we were seeing on 5.3 with a large /etc/master.passwd, as
follows:

--- /etc/nsswitch.conf.old      Wed Jan  5 19:23:24 2005
+++ /etc/nsswitch.conf  Wed Jan  5 19:23:43 2005
@@ -1,7 +1,7 @@
-group: compat
+group: files
 group_compat: nis
 hosts: files dns
 networks: files
-passwd: compat
+passwd: files
 passwd_compat: nis
 shells: files

System is purring with load average under 1 now, 200,000 pop/imap
sessions per day and 200,000 e-mails per day, all spamassassinated.

For more details and ongoing followup, see:

    http://www.freebsd.org/cgi/query-pr.cgi?pr=75855

--
Bruce Campbell
Engineering Computing
CPH-2374B
University of Waterloo
(519)888-4567 ext 5889
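To check other hosts for the same setting, a helper along the following lines could flag any nsswitch.conf line that still routes a database through `compat`. This is only a sketch: `nss_line_uses_source` is a hypothetical name, not part of any FreeBSD library, and it inspects one line at a time without interpreting criteria such as `[notfound=return]`.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper: return true if `line` configures database `db`
 * (e.g. "passwd") and lists `source` (e.g. "compat") among its sources.
 * Text after a '#' is treated as a comment and ignored.
 */
bool nss_line_uses_source(const char *line, const char *db, const char *source)
{
    assert(line != NULL && db != NULL && source != NULL);

    size_t dblen = strlen(db);
    size_t srclen = strlen(source);
    const char *p = line + strspn(line, " \t");   /* skip leading blanks */

    /* The database name must be followed directly by ':' so that
     * "passwd" does not match a "passwd_compat:" line. */
    if (strncmp(p, db, dblen) != 0 || p[dblen] != ':')
        return false;
    p += dblen + 1;

    for (;;) {
        p += strspn(p, " \t");
        if (*p == '\0' || *p == '\n' || *p == '#')
            return false;                         /* no more sources */
        size_t toklen = strcspn(p, " \t\n#");
        if (toklen == srclen && strncmp(p, source, srclen) == 0)
            return true;
        p += toklen;
    }
}
```

Feeding it each line of /etc/nsswitch.conf with `db` set to "passwd" or "group" and `source` set to "compat" would report the two lines changed by the diff above.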
Re: New FreeBSD 5.3 e-mail server extremely slow - traced to getpwnam maybe ?
Quoting Kris Kennaway [EMAIL PROTECTED]:

> > Well, no, not quite.
> > old: imap-uw-2002_1,1
> > new: imap-uw-2004a,1
>
> OK, that's where you should start, then. Go back to the software
> configuration that you know is working and see if it still misbehaves.
>
> Kris

Thanks. I shut down imapd/ipop3d completely so I just had sendmail
running, and the load average was still 20-30.

Anyway, I have just found something very odd with both 5.2.1 and 5.3 on
multiple different systems here, including a brand new GENERIC install.

On 5.x, ls -l or ps waux is very slow with our /etc/master.passwd, which
has 11320 entries. I truss'ed those commands, and gave up after watching:

    lseek(4,0x17d000,SEEK_SET) = 1560576 (0x17d000)
    read(0x4,0x8074000,0x1000) = 4096 (0x1000)
    lseek(4,0x17e000,SEEK_SET) = 1564672 (0x17e000)
    read(0x4,0x8062000,0x1000) = 4096 (0x1000)
    lseek(4,0x17f000,SEEK_SET) = 1568768 (0x17f000)
    read(0x4,0x8066000,0x1000) = 4096 (0x1000)
    lseek(4,0x180000,SEEK_SET) = 1572864 (0x180000)

scroll by for 10 minutes. (handle 4 = /etc/spwd.db)

I wrote a small program:

    #include <sys/types.h>
    #include <pwd.h>
    main( int argc, char *argv[] ) { getpwuid( 13076 ); }

and ran it under truss on 5.x and it generated 178,711 lines of output.
(the bulk of which is those lseek/read calls as above)

4.7 (with the same master.passwd file) gave 59 lines of output, which
seems normal.

I'm speculating that imap and sendmail and just about everything use
getpwuid, and that getpwuid is misbehaving on 5.x, especially with a
large master.passwd file.

I will report this through the proper mechanism once I do just a bit
more testing. Perhaps it is a known issue already; I'll look into that
as well. Or perhaps I have messed something up unwittingly, which I have
been known to do.

We do have an extremely busy 5.2.1 system running fine here on the same
hardware, but it has a small /etc/master.passwd, which may explain that
system's success to date.

Thank you to everyone who sent suggestions.

--
Bruce Campbell
Engineering Computing
CPH-2374B
University of Waterloo
(519)888-4567 ext 5889
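The truss line counts above measure system calls; a rough wall-clock figure per lookup can complement them. The following is a minimal sketch, assuming a POSIX system with `clock_gettime`; the function name `time_getpwuid` is hypothetical, and repeated lookups may be helped by the buffer cache, so the number is only indicative.

```c
#define _POSIX_C_SOURCE 200809L   /* for clock_gettime() */

#include <assert.h>
#include <pwd.h>
#include <sys/types.h>
#include <time.h>

/*
 * Hypothetical helper: return the average wall-clock time, in seconds,
 * of one getpwuid() call for `uid`, measured over `iterations` calls.
 */
double time_getpwuid(uid_t uid, int iterations)
{
    struct timespec start, end;

    assert(iterations > 0);

    clock_gettime(CLOCK_MONOTONIC, &start);
    for (int i = 0; i < iterations; i++) {
        /* The result is ignored; NULL just means no such uid. */
        (void)getpwuid(uid);
    }
    clock_gettime(CLOCK_MONOTONIC, &end);

    double elapsed = (double)(end.tv_sec - start.tv_sec)
                   + (double)(end.tv_nsec - start.tv_nsec) / 1e9;
    return elapsed / iterations;
}
```

Calling `time_getpwuid(13076, 100)` from a small `main` on a 4.x and a 5.x host with the same master.passwd would give a per-lookup comparison to set beside the 59 versus 178,711 truss lines.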
Re: New FreeBSD 5.3 e-mail server extremely slow - traced to getpwnam maybe ?
On Tue, Jan 04, 2005 at 09:27:27PM -0500, Bruce Campbell wrote:

> I wrote a small program:
>
>     #include <sys/types.h>
>     #include <pwd.h>
>     main( int argc, char *argv[] ) { getpwuid( 13076 ); }
>
> and ran it under truss on 5.x and it generated 178,711 lines of
> output. (the bulk of which is those lseek/read calls as above)
>
> 4.7 (with same master.passwd file) gave 59 lines of output, which
> seems normal.
>
> I'm speculating that imap and sendmail and just about everything use
> getpwuid and getpwuid is misbehaving on 5.x especially with a large
> master.passwd file.

Try tuning the pwd_mkdb parameters (see hash(3)) in
/usr/src/usr.sbin/pwd_mkdb/pwd_mkdb.c and recompile:

    HASHINFO openinfo = {
            4096,           /* bsize */
            32,             /* ffactor */
            256,            /* nelem */
            2048 * 1024,    /* cachesize */
            NULL,           /* hash() */
            0               /* lorder */
    };

e.g. adjust nelem to 12000 to accommodate your
significantly-larger-than-average password database.

If this helps, please submit a PR requesting that someone make an option
to pwd_mkdb to tune this at runtime (or better yet, submit the patch to
do this yourself - it's straightforward to modify the source to do
this).

Kris
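The runtime option suggested above might start with a validated parse of the element count. The sketch below is an assumption about how such a patch could begin, not code from pwd_mkdb: the function name `parse_nelem` and the idea of a dedicated command-line flag are hypothetical, and the parsed value would still have to be stored in `openinfo.nelem` before the database is created.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Hypothetical helper for a pwd_mkdb tuning option: parse `arg` as a
 * positive decimal element count. On success store it in *nelem and
 * return 0; on any malformed or out-of-range input return -1.
 */
int parse_nelem(const char *arg, unsigned int *nelem)
{
    char *end;
    unsigned long v;

    assert(arg != NULL && nelem != NULL);

    /* Require a leading digit: strtoul() would otherwise accept
     * leading whitespace and a sign. */
    if (arg[0] < '0' || arg[0] > '9')
        return -1;

    errno = 0;
    v = strtoul(arg, &end, 10);
    if (errno != 0 || *end != '\0' || v == 0 || v > UINT_MAX)
        return -1;

    *nelem = (unsigned int)v;
    return 0;
}
```

With this in place, an argument of "12000" would yield the nelem value proposed in the message, while input such as "abc" or "-5" would be rejected before it reaches the hash parameters.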