Re: [AOLSERVER] 3.4 memory usage
Harry Moreau wrote:
Personally, I'm heartened to hear other people see leaks

Yea, me too! I have several systems where nsd 3.2+ad12 will slowly consume memory until, if not restarted, it will crash the system. I had been debating whether I should upgrade to 3.4 and see if it would help, but it sounds to me like the problem is still present there.

janine
Re: [AOLSERVER] 3.4 memory usage
I was travelling yesterday, plus we are still fighting a few fires since the 3.4 upgrade. To answer some of the questions/suggestions people have posted:

1. Yes, I'm sure we're running 7.6 TCL. I ran into a few problems with 8.X because we (intentionally) use poorly-constructed lists in a couple of places and 8.X complains while 7.6 doesn't. Second, I didn't want to deal with the character set issues right now - sounds like a mess. Third, 7.6 is faster than 8.X in the limited testing I've done, even in CPU-bound loops. I'm assuming this is because ns_shares are more efficient in 7.6, but not sure. Here's the test I did:

2. The server that is growing is a special-purpose server. Over 90% of all requests are one of two kinds, which should make tracking down the cause a lot easier. This server may do a lot of execs in some cases, which is why I asked about that. Our main server, with a large variety of pages, seems to be doing fine memory-wise. (And hasn't crashed - YAY!) The special server does around 750K hits/day.

3. We have monitoring tools that complain if a server takes longer than 10 seconds to respond to a request. They have been complaining about this same server. I remember when I ran benchmarks with some version of AS that it would periodically go to sleep for several seconds with the CPU idle. I need to go back and see if I can duplicate that behavior because I think we are seeing it in production.

4. We don't even have adp configured, but I'll check out the fastpath stuff for memory usage. We don't serve any static pages and very few graphics from AS. Here's an interesting stat: around 80% of our graphics hits are for a group of fewer than 20 files. (Not 80% of the graphics data transfers, mind you, but 80% of the hits.)

5. In one of my posts I said this server is not accumulating data in ns_shares, but didn't mean that it doesn't use them at all. All of our servers use ns_shares - a lot. What I meant was that we are not doing anything that would explain why the server would start out with 81MB with the ns_shares loaded, then grow to 240MB. We do create new ns_share array entries while it's running, but not to this extent. We don't use nsv's - IMO that programming model is broken because regular TCL constructs can't be used on nsv's.

Overall, I'd have to say I'm real happy with the upgrade from 2.3.3. 3.4 is more stable and the fact that we now have the source for the server we are running is a HUGE deal. Running 2.3.3 was pretty scary business-wise, especially when AOL bought Netscape and the future of AS seemed a bit in doubt.

Footnote: is anyone (or most people) using zippy? I still haven't tried that.

Jim

Harry Moreau wrote:
Personally, I'm heartened to hear other people see leaks

Yea, me too! I have several systems where nsd 3.2+ad12 will slowly ...

janine
Re: [AOLSERVER] 3.4 memory usage
Try 3.3+ad13. It has a memory leak fix involving TSDs that I back-ported from the 4.0 tree. +-- On Oct 17, Janine Sisk said: Yea, me too! I have several systems where nsd 3.2+ad12 will slowly consume memory until, if not restarted, it will crash the system.
Re: [AOLSERVER] 3.4 memory usage
In glancing at the zippy code, it looks like it used a power-of-2 algorithm, so I figured it might cause less heap fragmentation. I think that might be at least some of the problem. Does the standard gnu/linux memory allocator handle fragmentation poorly/well? +-- On Oct 17, Jim Wilcoxson said: Footnote: is anyone (or most people) using zippy? I still haven't tried that. The zippy malloc will probably use more memory but should improve performance. It should have no effect on memory leaks. I believe AOL uses it in production.
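To make the power-of-2 idea above concrete, here is a minimal Python sketch of size-class rounding. It is a toy model and not zippy's actual code: the function names `size_class` and `internal_waste`, the 16-byte minimum, and the request sizes are all assumptions made up for illustration.

```python
def size_class(nbytes: int) -> int:
    """Round a request up to the next power of two.

    The 16-byte floor is an assumed minimum block size, not taken from zippy.
    """
    size = 16
    while size < nbytes:
        size *= 2
    return size


def internal_waste(requests):
    """Return (bytes requested, bytes handed out) for a list of request sizes."""
    asked = sum(requests)
    given = sum(size_class(n) for n in requests)
    return asked, given


# Hypothetical request sizes, chosen only to show the rounding effect.
asked, given = internal_waste([24, 100, 130, 1000, 5000])
overhead = 100.0 * (given - asked) / asked
print(f"requested {asked} bytes, allocated {given} bytes ({overhead:.0f}% overhead)")
```

Rounding wastes space inside each block, but every freed block in a class can satisfy any later request in that class. That is the usual argument for why such a scheme can fragment the heap less while still using more memory overall.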
Re: [AOLSERVER] 3.4 memory usage
+-- On Oct 17, Jim Wilcoxson said: In glancing at the zippy code, it looks like it used a power-of-2 algorithm, so I figured it might cause less heap fragmentation. I think that might be at least some of the problem. Does the standard gnu/linux memory allocator handle fragmentation poorly/well? I think the standard Linux allocator is dl-malloc, which as I recall has pretty good fragmentation properties. The reason zippy may use more memory is that it keeps a separate pool of memory for each thread. This reduces lock contention but means that less free memory is shared between threads.
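The trade-off described above (per-thread pools avoid locking but share less free memory) can be illustrated with a toy Python model. This is only a sketch under simplifying assumptions: blocks are all one size, threads run one after another rather than concurrently, and the `Allocator` class and `run` helper are hypothetical names, not anything from the AOLserver source.

```python
class Allocator:
    """Toy block allocator that counts blocks obtained from the system."""

    def __init__(self, per_thread: bool):
        self.per_thread = per_thread
        self.free = {}          # pool key -> number of free blocks
        self.from_system = 0    # blocks ever requested from the OS

    def _key(self, thread_id):
        # One pool per thread, or a single shared pool for everyone.
        return thread_id if self.per_thread else "shared"

    def alloc(self, thread_id):
        key = self._key(thread_id)
        if self.free.get(key, 0) > 0:
            self.free[key] -= 1      # reuse a cached block
        else:
            self.from_system += 1    # pool is empty: grow the heap

    def release(self, thread_id):
        key = self._key(thread_id)
        self.free[key] = self.free.get(key, 0) + 1


def run(per_thread, threads=4, blocks=100):
    """Each thread in turn allocates `blocks` blocks, then frees them all."""
    allocator = Allocator(per_thread)
    for t in range(threads):
        for _ in range(blocks):
            allocator.alloc(t)
        for _ in range(blocks):
            allocator.release(t)
    return allocator.from_system


print("shared pool:     ", run(per_thread=False))
print("per-thread pools:", run(per_thread=True))
```

In this model the shared pool never needs more than one thread's worth of blocks, while the per-thread pools each keep their own. The model ignores the other side of the trade: a real shared pool would need a lock on every allocation.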
Re: [AOLSERVER] 3.4 memory usage
Janine Sisk wrote:
Harry Moreau wrote:
Personally, I'm heartened to hear other people see leaks

Yea, me too! I have several systems where nsd 3.2+ad12 will slowly consume memory until, if not restarted, it will crash the system. I had been debating whether I should upgrade to 3.4 and see if it would help, but it sounds to me like the problem is still present there.

janine

For whatever it's worth, we had problems with leaks in 3.2 and 3.3, mainly when threads expired. 3.4 has been behaving for us using tcl8.x...

-mike
--
Mike Hoegeman Email: [EMAIL PROTECTED] Phone: 805-279-7306
Re: [AOLSERVER] 3.4 memory usage
Ok, I must have missed something, or might have been off of the cluetrain too long, but what exactly is 'zippy'? I did a google search, but I was getting mostly 'zippy the pinhead' and other weird stuff! Anyone have an URL or explanation? thanks, --brett On Wed, 17 Oct 2001 09:54:25 -0500 Rob Mayoff [EMAIL PROTECTED] wrote: +-- On Oct 17, Jim Wilcoxson said: In glancing at the zippy code, it looks like it used a power-of-2 algorithm, so I figured it might cause less heap fragmentation. I think that might be at least some of the problem. Does the standard gnu/linux memory allocator handle fragmentation poorly/well? I think the standard Linux allocator is dl-malloc, which as I recall has pretty good fragmentation properties. The reason zippy may use more memory is that it keeps a separate pool of memory for each thread. This reduces lock contention but means that less free memory is shared between threads.
Re: [AOLSERVER] 3.4 memory usage
+-- On Oct 17, Brett Schwarz said: Ok, I must have missed something, or might have been off of the cluetrain too long, but what exactly is 'zippy'? I did a google search, but I was getting mostly 'zippy the pinhead' and other weird stuff! It's the -z flag to nsd.
Re: [AOLSERVER] 3.4 memory usage
zippy is the -z command line option to AS. It causes an AOL-designed memory allocator to be used instead of the standard C library malloc. Properties of zippy are that it has separate heaps for each thread instead of a shared heap, thus avoiding the need to lock when malloc'ing private thread storage, and it uses a different alloc/freelist strategy. The wrappers to choose one vs the other are in thread/memory.c. The zippy allocator is in thread/pool.c Jim Ok, I must have missed something, or might have been off of the cluetrain too long, but what exactly is 'zippy'? I did a google search, but I was getting mostly 'zippy the pinhead' and other weird stuff! Anyone have an URL or explanation? thanks, --brett On Wed, 17 Oct 2001 09:54:25 -0500 Rob Mayoff [EMAIL PROTECTED] wrote: +-- On Oct 17, Jim Wilcoxson said: In glancing at the zippy code, it looks like it used a power-of-2 algorithm, so I figured it might cause less heap fragmentation. I think that might be at least some of the problem. Does the standard gnu/linux memory allocator handle fragmentation poorly/well? I think the standard Linux allocator is dl-malloc, which as I recall has pretty good fragmentation properties. The reason zippy may use more memory is that it keeps a separate pool of memory for each thread. This reduces lock contention but means that less free memory is shared between threads.
Re: [AOLSERVER] 3.4 memory usage
On 15 Oct, Jim Wilcoxson wrote:
After running 12 hours, we're seeing 28 nsd threads using 253MB. Does that still seem reasonable for memory usage? Our baseline for this server is 81MB right after the server starts with around 12 threads. This server handled 762K requests today, total (less than that in the 12 hour period).

I don't use Tcl 7.6 with 3.4, I use 8.3.2, but...

There's a big difference between the memory management of 2.3 and 3.x versions. Some internal data structures are never freed; they get put into pools when they're done with, so you always see leaks to some extent.

Be very careful that you always release db handles. Also, if adp pages terminate unexpectedly, sometimes global variables don't get deleted as they should - and I have suspected, but never proven, that there may be cases where ns_sets don't get deleted properly either.

If your server gets busy periodically, memory usage will increase, but if you time out the threads for inactivity, the growth rate slows. Despite this, restarting periodically is a good thing. We restart our five servers every night.

I have often wondered if the leaks have something to do with the peculiar thread model on linux, where threads and processes are essentially the same thing. Do people see leaks on systems with lower level threads, e.g. Solaris or HP-UX? Unfortunately, I don't have such systems to experiment with.

Personally, I'm heartened to hear other people see leaks; I thought it was just our strange setup that was causing it. We use several unusual add-ons, e.g. a modified thread-safe version of [incr tcl], the tDOM xml parser, extensive auto loading and a modified version of the vhost module.

--
--Harry Moreau [EMAIL PROTECTED] http://www.online.ie - Ireland's premier portal.
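The point above about pooled structures that are never freed can be sketched with a toy Python model. It assumes a single pool of equal-sized objects and made-up load levels; `PooledObjects` and `footprint_after` are hypothetical names for illustration, not AOLserver internals.

```python
class PooledObjects:
    """Objects are recycled through a free list and never returned to the OS."""

    def __init__(self):
        self.in_use = 0
        self.pooled = 0

    def acquire(self):
        if self.pooled:
            self.pooled -= 1   # reuse a pooled object
        self.in_use += 1       # otherwise this is a brand new object

    def release(self):
        self.in_use -= 1
        self.pooled += 1       # kept for reuse instead of being freed

    @property
    def footprint(self):
        return self.in_use + self.pooled


def footprint_after(loads):
    """Apply a sequence of concurrent-load levels; return the footprint after each."""
    pool = PooledObjects()
    history = []
    for load in loads:
        while pool.in_use < load:
            pool.acquire()
        while pool.in_use > load:
            pool.release()
        history.append(pool.footprint)
    return history


# Hypothetical load levels: the footprint tracks the highest load seen so far.
print(footprint_after([10, 50, 5, 30, 80, 20]))
```

In this model the footprint only steps up when a new peak is reached and then stays flat. If that is what is happening, memory that plateaus after the busiest period would point to pooling, whereas memory that keeps climbing under steady load would point to a genuine leak.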
Re: [AOLSERVER] 3.4 memory usage
On Tue, 16 Oct 2001, Dossy wrote:
On 2001.10.15, Jim Wilcoxson [EMAIL PROTECTED] wrote:
After running 12 hours, we're seeing 28 nsd threads using 253MB. Does

That's 9.0 MB per thread. Sounds perfectly reasonable to me. (What a world we live in, where 9MB in a thread -- a lightweight structure -- is reasonable!)

Jim,

Check your adp cache size and your fastpath cache size. Both default to 5MB per thread. Try checking the nstelemetry.adp and see if it provides insight into how you're doing.
Re: [AOLSERVER] 3.4 memory usage
After running 12 hours, we're seeing 28 nsd threads using 253MB. Does that still seem reasonable for memory usage? Our baseline for this server is 81MB right after the server starts with around 12 threads. This server handled 762K requests today, total (less than that in the 12 hour period). AS 3.4, TCL 7.6, Linux 2.2.19

Are you running 7.6 for sure? I thought the team dropped support for 7.6 in the release after 3.3.1. You said earlier that you weren't holding any info in ns_share ... are you creating any nsv variables at all? Also, I'm curious how long your server startup takes with and without setting minthreads==maxthreads. Have you tried running it in this fashion? If there is a potential leak in the creation/deletion of threads, then never deleting them may help memory.

What is your adp cache set to, and do you have a cache for mmap/fastpath set up? If you could post a page somewhere with a *sanitized* listing of the telemetry output it would help.

Hopefully we can track this down,
Carl
[AOLSERVER] 3.4 memory usage
We started 3.4 on a production server this morning and after 90 minutes it looked like this:

Last login: Mon Oct 15 05:29:02 2001
No mail.
$ ps aux|grep nsd
nsadmin 32565  0.0  3.4 40424 36132 ?  S  04:15   0:01 bin/nsd -i -t nsd
nsadmin 32568  0.0  3.4 40424 36132 ?  S  04:15   0:01 bin/nsd -i -t nsd
nsadmin 32569  0.0  3.4 40424 36132 ?  S  04:15   0:02 bin/nsd -i -t nsd
nsadmin 32570  0.1  3.4 40424 36132 ?  S  04:15   0:15 bin/nsd -i -t nsd
nsadmin 32571  0.0  3.4 40424 36132 ?  S  04:15   0:02 bin/nsd -i -t nsd
nsadmin 32572  0.0  3.4 40424 36132 ?  S  04:15   0:00 bin/nsd -i -t nsd
nsadmin 32576  0.0  3.4 40424 36132 ?  S  04:15   0:04 bin/nsd -i -t nsd
nsadmin 32579  1.9  3.4 40424 36132 ?  S  04:15   2:39 bin/nsd -i -t nsd
nsadmin 32582  2.1  3.4 40424 36132 ?  S  04:15   2:55 bin/nsd -i -t nsd
nsadmin 32588  0.1  3.4 40424 36132 ?  S  04:15   0:08 bin/nsd -i -t nsd
nsadmin 32676  2.6  3.4 40424 36132 ?  S  04:17   3:32 bin/nsd -i -t nsd
nsadmin   447  2.6  3.4 40424 36132 ?  S  04:20   3:27 bin/nsd -i -t nsd
nsadmin  5929  1.7  3.4 40424 36132 ?  S  06:00   0:31 bin/nsd -i -t nsd

Almost 12 hours later it looks like this:

# ps aux|grep nsd
nsadmin 32565  0.0  7.6 85976 79552 ?  S  04:15   0:01 bin/nsd -i -t nsd
nsadmin 32568  0.0  7.6 85976 79552 ?  S  04:15   0:04 bin/nsd -i -t nsd
nsadmin 32569  0.0  7.6 85976 79552 ?  S  04:15   0:08 bin/nsd -i -t nsd
nsadmin 32570  0.1  7.6 85976 79552 ?  S  04:15   0:56 bin/nsd -i -t nsd
nsadmin 32571  0.0  7.6 85976 79552 ?  S  04:15   0:10 bin/nsd -i -t nsd
nsadmin 32572  0.0  7.6 85976 79552 ?  S  04:15   0:00 bin/nsd -i -t nsd
nsadmin 32576  0.0  7.6 85976 79552 ?  S  04:15   0:10 bin/nsd -i -t nsd
nsadmin 32579  2.6  7.6 85976 79552 ?  S  04:15  13:38 bin/nsd -i -t nsd
nsadmin 32582  2.4  7.6 85976 79552 ?  S  04:15  12:47 bin/nsd -i -t nsd
nsadmin 32588  0.1  7.6 85976 79552 ?  S  04:15   0:33 bin/nsd -i -t nsd
nsadmin 32676  2.7  7.6 85976 79552 ?  S  04:17  13:57 bin/nsd -i -t nsd
nsadmin   447  2.9  7.6 85976 79552 ?  S  04:20  14:48 bin/nsd -i -t nsd
nsadmin  5929  2.6  7.6 85976 79552 ?  S  06:00  10:49 bin/nsd -i -t nsd
nsadmin 10619  3.3  7.6 85976 79552 ?  S  10:31   4:38 bin/nsd -i -t nsd
nsadmin 10711  3.0  7.6 85976 79552 ?  S  10:34   4:04 bin/nsd -i -t nsd
nsadmin 10712  1.7  7.6 85976 79552 ?  S  10:34   2:19 bin/nsd -i -t nsd
nsadmin 10714  1.4  7.6 85976 79552 ?  S  10:34   1:54 bin/nsd -i -t nsd
nsadmin 10726  1.4  7.6 85976 79552 ?  S  10:35   1:56 bin/nsd -i -t nsd
nsadmin 10727  1.5  7.6 85976 79552 ?  S  10:35   2:02 bin/nsd -i -t nsd

This server does not accumulate any data in ns_shares, so I'm trying to figure out if this 45MB memory growth is reasonable for adding 6 additional threads. It doesn't seem reasonable. Anyone have suggestions for tracking down what's happening?

Is anyone using the zippy allocator in production? Any performance data you can share?

Thanks,
Jim
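For anyone who wants to repeat the comparison, here is a rough Python sketch that reads `ps aux` rows like the ones above and works out the growth per added thread. It assumes every row belongs to the same nsd process and therefore reports the same VSZ and RSS, as both listings show, so the sizes are not summed across rows. The helper names `summarize` and `growth_per_thread` are made up for this example.

```python
def summarize(ps_text):
    """Return (thread count, VSZ in KB, RSS in KB) for `ps aux` rows of one nsd."""
    rows = [line.split() for line in ps_text.strip().splitlines()]
    # ps aux columns: USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND...
    vsz = {int(r[4]) for r in rows}
    rss = {int(r[5]) for r in rows}
    # Threads share one address space, so every row should agree.
    assert len(vsz) == 1 and len(rss) == 1, "rows disagree; more than one process?"
    return len(rows), vsz.pop(), rss.pop()


def growth_per_thread(before, after):
    """before/after are (threads, vsz_kb, rss_kb); returns MB of VSZ per added thread."""
    added_threads = after[0] - before[0]
    added_kb = after[1] - before[1]
    return added_kb / 1024.0 / added_threads


# Two rows copied from the first listing, just to exercise the parser.
sample = """\
nsadmin 32565 0.0 3.4 40424 36132 ? S 04:15 0:01 bin/nsd -i -t nsd
nsadmin 32568 0.0 3.4 40424 36132 ? S 04:15 0:01 bin/nsd -i -t nsd
"""
print(summarize(sample))

# Totals read off the two full listings: 13 rows early, 19 rows later.
early = (13, 40424, 36132)
late = (19, 85976, 79552)
print(f"VSZ grew {(late[1] - early[1]) / 1024.0:.1f} MB, "
      f"{growth_per_thread(early, late):.1f} MB per added thread")
```

Dividing by the number of added threads is only a first-order view; it says nothing about whether the memory went to thread stacks, caches, or something that is never released.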
Re: [AOLSERVER] 3.4 memory usage
I'm curious why you don't just set minthreads = maxthreads at startup to reduce load.

This server does not accumulate any data in ns_shares, so I'm trying to figure out if this 45MB memory growth is reasonable for adding 6 additional threads. It doesn't seem reasonable. Anyone have suggestions for tracking down what's happening?

Remember that the 6 threads may not represent all threads that have been started or died off. If, as you said in a previous post, you have maxthreads = ~40, then 2 MB per thread does not seem unreasonable to me for a production server. Remember also that memory, once allocated, may later be reused in the zippy alloc method. So even if your memory is showing 85 MB, that may be the peak, and it may not grow much larger if peak activity has been hit.

Best Regards,
Carl Garland
Re: [AOLSERVER] 3.4 memory usage
I'm curious why you don't just set minthreads = maxthreads at startup to reduce load.

Because a) I don't know what a good value is for maxthreads, so overestimate it; b) It will take longer to get the server to accept requests when starting up.

This server does not accumulate any data in ns_shares, so I'm trying to figure out if this 45MB memory growth is reasonable for adding 6 additional threads. It doesn't seem reasonable. Anyone have suggestions for tracking down what's happening?

Remember that the 6 threads may not represent all threads that have been started/died off. If like you said in previous post you have maxthreads = ~ 40 then 2 MB per thread does not seem unreasonable to me for a production server. Remember also that once allocated that memory may be later reused in the zippy alloc method. So even if your memory is showing 85 MB that may be what peak was but may not grow much larger if peak activity has been hit.

We're not using detached threads and I didn't have threadtimeout set (do now though), so I think no threads have died. We are using exec on this server, regular allocator, and TCL 7.6. I remember something a while back about exec leaks? Guess the real test will be whether we still need to restart the server every day. The good news is, it hasn't crashed yet like 2.3.3 was doing!

Jim
[AOLSERVER] 3.4 memory usage
After running 12 hours, we're seeing 28 nsd threads using 253MB. Does that still seem reasonable for memory usage? Our baseline for this server is 81MB right after the server starts with around 12 threads. This server handled 762K requests today, total (less than that in the 12 hour period).

AS 3.4, TCL 7.6, Linux 2.2.19

Any guidance? TIA.

Jim

nsadmin  4237  0.0 31.4 253508 244200 ?  S  12:58   0:01 bin/nsd -i -t nsd
nsadmin  4241  0.0 31.4 253508 244200 ?  S  12:58   0:02 bin/nsd -i -t nsd
nsadmin  4242  0.0 31.4 253508 244200 ?  S  12:58   0:01 bin/nsd -i -t nsd
nsadmin  4243  0.0 31.4 253508 244200 ?  S  12:58   0:27 bin/nsd -i -t nsd
nsadmin  4244  0.0 31.4 253508 244200 ?  S  12:58   0:04 bin/nsd -i -t nsd
nsadmin  4245  0.0 31.4 253508 244200 ?  S  12:58   0:00 bin/nsd -i -t nsd
nsadmin  4250  1.1 31.4 253508 244200 ?  S  12:58   6:26 bin/nsd -i -t nsd
nsadmin  4251  1.2 31.4 253508 244200 ?  S  12:58   6:49 bin/nsd -i -t nsd
nsadmin  4252  1.1 31.4 253508 244200 ?  S  12:58   6:16 bin/nsd -i -t nsd
nsadmin  4253  2.4 31.4 253508 244200 ?  S  12:58  13:21 bin/nsd -i -t nsd
nsadmin  4261  1.1 31.4 253508 244200 ?  S  12:59   6:17 bin/nsd -i -t nsd
nsadmin  4263  1.0 31.4 253508 244200 ?  S  12:59   5:51 bin/nsd -i -t nsd
nsadmin  4288  1.1 31.4 253508 244200 ?  S  13:01   6:05 bin/nsd -i -t nsd
nsadmin  4332  1.0 31.4 253508 244200 ?  S  13:02   6:02 bin/nsd -i -t nsd
nsadmin  4540  1.7 31.4 253508 244200 ?  S  13:05   9:22 bin/nsd -i -t nsd
nsadmin  4603  1.0 31.4 253508 244200 ?  S  13:06   5:49 bin/nsd -i -t nsd
nsadmin  6166  1.0 31.4 253508 244200 ?  S  13:33   5:30 bin/nsd -i -t nsd
nsadmin 13523  1.0 31.4 253508 244200 ?  S  15:50   4:12 bin/nsd -i -t nsd
nsadmin 16186  2.0 31.4 253508 244200 ?  S  16:38   6:50 bin/nsd -i -t nsd
nsadmin 16218  0.9 31.4 253508 244200 ?  S  16:39   3:09 bin/nsd -i -t nsd
nsadmin 16387  0.9 31.4 253508 244200 ?  R  16:41   3:01 bin/nsd -i -t nsd
nsadmin 18673  0.9 31.4 253508 244200 ?  S  17:19   2:44 bin/nsd -i -t nsd
nsadmin 18674  0.9 31.4 253508 244200 ?  S  17:19   2:49 bin/nsd -i -t nsd
nsadmin 18675  0.9 31.4 253508 244200 ?  S  17:19   2:53 bin/nsd -i -t nsd
nsadmin 19433  0.8 31.4 253508 244200 ?  S  17:31   2:21 bin/nsd -i -t nsd
nsadmin 23954  0.8 31.4 253508 244200 ?  S  18:34   1:49 bin/nsd -i -t nsd
nsadmin 25704  0.7 31.4 253508 244200 ?  S  19:01   1:31 bin/nsd -i -t nsd
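As a back-of-the-envelope check on the figures above, the growth over the baseline can be worked out in a few lines of Python. The numbers are the ones quoted in the message (81MB at about 12 threads, 253MB at 28 threads, at most 762K requests); the function names are made up for this sketch. The per-request figure assumes all of the growth is tied to request handling, which nothing in the thread establishes.

```python
BASELINE_MB, BASELINE_THREADS = 81, 12     # right after startup
CURRENT_MB, CURRENT_THREADS = 253, 28      # after about 12 hours
REQUESTS_UPPER_BOUND = 762_000             # whole day; the 12 hours saw fewer


def per_thread_mb(total_mb, threads):
    """Total memory divided evenly over the threads."""
    return total_mb / threads


def growth_per_added_thread(mb0, t0, mb1, t1):
    """Memory growth divided by the number of threads added since the baseline."""
    return (mb1 - mb0) / (t1 - t0)


def min_bytes_per_request(mb0, mb1, requests):
    """Growth divided by request count, in bytes (taking 1 MB as 2**20 bytes).

    Because `requests` is an upper bound, the result is a lower bound,
    and only under the assumption that requests caused all of the growth.
    """
    return (mb1 - mb0) * 2**20 / requests


print(f"{per_thread_mb(CURRENT_MB, CURRENT_THREADS):.1f} MB per thread now")
print(f"{per_thread_mb(BASELINE_MB, BASELINE_THREADS):.2f} MB per thread at baseline")
print(f"{growth_per_added_thread(BASELINE_MB, BASELINE_THREADS, CURRENT_MB, CURRENT_THREADS):.2f}"
      " MB per added thread")
print(f"{min_bytes_per_request(BASELINE_MB, CURRENT_MB, REQUESTS_UPPER_BOUND):.0f}"
      " bytes per request at minimum")
```

Each added thread accounts for more memory than the baseline average per thread, which is one way to state why the growth looks larger than thread count alone would explain.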