Re: [AOLSERVER] 3.4 memory usage

2001-10-17 Thread Janine Sisk

Harry Moreau wrote:
 Personally, I'm heartened to hear other people see leaks

Yea, me too!  I have several systems where nsd 3.2+ad12 will slowly
consume memory until, if not restarted, it will crash the system.  I had
been debating whether I should upgrade to 3.4 and see if it would help,
but it sounds to me like the problem is still present there.

janine



Re: [AOLSERVER] 3.4 memory usage

2001-10-17 Thread Jim Wilcoxson

I was travelling yesterday, plus we are still fighting a few fires
since the 3.4 upgrade.  To answer some of the questions/suggestions
people have posted:

1. Yes, I'm sure we're running 7.6 TCL.  I ran into a few problems
with 8.X because we (intentionally) use poorly-constructed lists
in a couple of places and 8.X complains while 7.6 doesn't.  Second,
I didn't want to deal with the character set issues right now - sounds
like a mess.  Third, 7.6 is faster than 8.X in the limited testing
I've done, even in CPU-bound loops.  I'm assuming this is because
ns_shares are more efficient in 7.6, but not sure.  Here's the test
I did:

2. The server that is growing is a special-purpose server.  Over 90%
of all requests are one of two kinds, which should make tracking down
the cause a lot easier.  This server may do a lot of execs in some
cases, which is why I asked about that.  Our main server, with a large
variety of pages, seems to be doing fine memory-wise.  (And hasn't
crashed - YAY!)  The special server does around 750K hits/day.

3. We have monitoring tools that complain if a server takes longer
than 10 seconds to respond to a request.  It has been complaining
about this same server.  I remember when I ran benchmarks with some
version of AS that it would periodically go to sleep for several
seconds with the CPU idle.  I need to go back and see if I can
duplicate that behavior because I think we are seeing it in production.

4. We don't have adp even configured, but I'll check out the fastpath
stuff for memory usage.  We don't serve any static pages and very
few graphics from AS.  Here's an interesting stat: around 80% of
our graphics hits are for a group of less than 20 files.  (Not 80%
of the graphics data transfers mind you, but 80% of the hits)

5. In one of my posts I said this server is not accumulating data in
ns_shares, but didn't mean that it doesn't use them at all.  All of
our servers use ns_shares - a lot.  What I meant was that we are not
doing anything that would explain why the server would start out with
81MB with the ns_shares loaded, then grow to 240MB.  We do create
new ns_share array entries while it's running, but not to this extent.
We don't use nsv's - IMO that programming model is broken because
regular TCL constructs can't be used on nsv's.
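
A rough sketch of the kind of check described in item 3, for anyone who
wants to log the multi-second stalls rather than only alert on them.  It is
not the monitoring tool mentioned above.  The function names and the
30-second socket timeout are assumptions made for illustration; only the
10-second threshold comes from the post.

```python
import time
import urllib.request


def timed_fetch(url, timeout=30.0):
    """Fetch url once and return the elapsed wall-clock seconds.

    Network errors and socket timeouts propagate to the caller.
    """
    start = time.monotonic()
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        resp.read()
    return time.monotonic() - start


def classify(elapsed, threshold=10.0):
    """Label one measurement against the alert threshold (seconds)."""
    return "SLOW" if elapsed > threshold else "ok"


def stall_windows(samples, threshold=10.0):
    """Return the (timestamp, elapsed) pairs that exceeded the threshold."""
    return [(ts, elapsed) for ts, elapsed in samples if elapsed > threshold]
```

Calling timed_fetch once a minute and keeping the (timestamp, elapsed)
pairs would show whether the slow responses cluster at particular times,
which is what one would expect if the server periodically goes idle.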

Overall, I'd have to say I'm real happy with the upgrade from 2.3.3.
3.4 is more stable and the fact that we now have the source for the
server we are running is a HUGE deal.  Running 2.3.3 was pretty
scary business-wise, especially when AOL bought Netscape and the
future of AS seemed a bit in doubt.

Footnote: is anyone (or most people) using zippy?  I still haven't
tried that.

Jim




Re: [AOLSERVER] 3.4 memory usage

2001-10-17 Thread Rob Mayoff

Try 3.3+ad13.  It has a memory leak fix involving TSDs that I
back-ported from the 4.0 tree.

+-- On Oct 17, Janine Sisk said:
 Yea, me too!  I have several systems where nsd 3.2+ad12 will slowly
 consume memory until, if not restarted, it will crash the system.



Re: [AOLSERVER] 3.4 memory usage

2001-10-17 Thread Jim Wilcoxson

In glancing at the zippy code, it looks like it used a power-of-2
algorithm, so I figured it might cause less heap fragmentation.  I
think that might be at least some of the problem.  Does the standard
gnu/linux memory allocator handle fragmentation poorly/well?


 +-- On Oct 17, Jim Wilcoxson said:
  Footnote: is anyone (or most people) using zippy?  I still haven't
  tried that.

 The zippy malloc will probably use more memory but should improve
 performance. It should have no effect on memory leaks. I believe AOL
 uses it in production.




Re: [AOLSERVER] 3.4 memory usage

2001-10-17 Thread Rob Mayoff

+-- On Oct 17, Jim Wilcoxson said:
 In glancing at the zippy code, it looks like it used a power-of-2
 algorithm, so I figured it might cause less heap fragmentation.  I
 think that might be at least some of the problem.  Does the standard
 gnu/linux memory allocator handle fragmentation poorly/well?

I think the standard Linux allocator is dl-malloc, which as I recall has
pretty good fragmentation properties.

The reason zippy may use more memory is that it keeps a separate pool
of memory for each thread.  This reduces lock contention but means that
less free memory is shared between threads.
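
The trade-off described above can be illustrated with a toy model.  This is
only a sketch of the idea, not AOLserver's allocator: blocks are counted
rather than allocated, all blocks are the same size, and the thread names
and burst size are made up.

```python
class Pools:
    """Toy free-list model: counts blocks obtained from the system."""

    def __init__(self, per_thread):
        self.per_thread = per_thread
        self.free = {}       # pool key -> number of cached free blocks
        self.reserved = 0    # blocks ever requested from the system

    def _key(self, thread):
        # One pool per thread, or a single pool shared by everyone.
        return thread if self.per_thread else "shared"

    def alloc(self, thread):
        key = self._key(thread)
        if self.free.get(key, 0) > 0:
            self.free[key] -= 1      # reuse a cached block
        else:
            self.reserved += 1       # nothing cached here: grow the heap

    def release(self, thread):
        key = self._key(thread)
        self.free[key] = self.free.get(key, 0) + 1


def run(per_thread):
    pools = Pools(per_thread)
    for thread in ("t1", "t2", "t3"):
        # Each thread in turn needs 100 blocks, then frees them all.
        for _ in range(100):
            pools.alloc(thread)
        for _ in range(100):
            pools.release(thread)
    return pools.reserved


print(run(per_thread=False), run(per_thread=True))  # prints: 100 300
```

With a shared pool the blocks freed by one thread serve the next, so the
footprint stays at the single-thread peak.  With per-thread pools each
thread keeps its own peak, which is the extra memory traded for less lock
contention.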



Re: [AOLSERVER] 3.4 memory usage

2001-10-17 Thread Mike Hoegeman

Janine Sisk wrote:
 Harry Moreau wrote:

Personally, I'm heartened to hear other people see leaks


 Yea, me too!  I have several systems where nsd 3.2+ad12 will slowly
 consume memory until, if not restarted, it will crash the system.  I had
 been debating whether I should upgrade to 3.4 and see if it would help,
 but it sounds to me like the problem is still present there.

 janine



For whatever it's worth, we had problems with leaks in 3.2 and 3.3,
mainly when threads expired.  3.4 has been behaving for us using Tcl 8.x.

-mike


--
 Mike Hoegeman
 Email: [EMAIL PROTECTED]
 Phone: 805-279-7306



Re: [AOLSERVER] 3.4 memory usage

2001-10-17 Thread Brett Schwarz

Ok, I must have missed something, or might have been off of the cluetrain
too long, but what exactly is 'zippy'? I did a google search, but I was
getting mostly 'zippy the pinhead' and other weird stuff!

Anyone have a URL or explanation?

thanks,

--brett






Re: [AOLSERVER] 3.4 memory usage

2001-10-17 Thread Rob Mayoff

+-- On Oct 17, Brett Schwarz said:
 Ok, I must have missed something, or might have been off of the cluetrain
 too long, but what exactly is 'zippy'? I did a google search, but I was
 getting mostly 'zippy the pinhead' and other weird stuff!

It's the -z flag to nsd.



Re: [AOLSERVER] 3.4 memory usage

2001-10-17 Thread Jim Wilcoxson

zippy is the -z command line option to AS.  It causes an AOL-designed
memory allocator to be used instead of the standard C library malloc.

Properties of zippy are that it has separate heaps for each thread
instead of a shared heap, thus avoiding the need to lock when
malloc'ing private thread storage, and it uses a different
alloc/freelist strategy.

The wrappers to choose one vs the other are in thread/memory.c.
The zippy allocator is in thread/pool.c

Jim
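
To make the "power-of-2" remark from earlier in the thread concrete, here
is a minimal sketch of size-class rounding in general.  It is not taken
from thread/pool.c: the 16-byte minimum class is a hypothetical value and
per-block header overhead is ignored.

```python
def size_class(nbytes, min_class=16):
    """Round a request up to the next power-of-two class (min_class assumed)."""
    size = min_class
    while size < nbytes:
        size *= 2
    return size


def waste(requests):
    """Return (wasted_bytes, reserved_bytes) for a list of request sizes."""
    used = sum(requests)
    reserved = sum(size_class(n) for n in requests)
    return reserved - used, reserved


print([size_class(n) for n in (24, 100, 513, 4096)])  # [32, 128, 1024, 4096]
print(waste([24, 100, 513, 4096]))                    # (547, 5280)
```

Rounding up means a freed block can satisfy any later request in the same
class, which limits external fragmentation, at the price of unused space
inside each block (worst for sizes just above a power of two, such as 513).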







Re: [AOLSERVER] 3.4 memory usage

2001-10-16 Thread Harry Moreau

On 15 Oct, Jim Wilcoxson wrote:
 After running 12 hours, we're seeing 28 nsd threads using 253MB.  Does
 that still seem reasonable for memory usage?  Our baseline for this
 server is 81MB right after the server starts with around 12 threads.
 This server handled 762K requests today, total (less than that in the
 12 hour period).


I don't use Tcl 7.6 with 3.4, I use 8.3.2, but...
There's a big difference between the memory management of 2.3
and 3.x versions.  Some internal data structures are never freed,
they get put into pools when they're done with - so you always see
leaks to some extent.

Be very careful that you always release db handles.  Also, if adp pages
terminate unexpectedly, sometimes global variables don't get deleted as
they should - and I have suspected, but never proven, that there may be
cases where ns_sets don't get deleted properly either.

If your server gets busy periodically, memory usage will increase, but if
you timeout the threads for inactivity, the growth rate slows.  Despite
this, restarting periodically is a good thing.  We restart our five
servers every night.

I have often wondered if the leaks are something to do with the peculiar
thread model on linux - where threads and processes are essentially the
same thing.  Do people see leaks on systems with lower level threads,
e.g. Solaris or HP-UX?  Unfortunately, I don't have such systems to
experiment with.

Personally, I'm heartened to hear other people see leaks; I thought it was
just our strange set up that was causing it - We use several unusual add
ons, e.g. a modified thread safe version of [incr tcl], the tDOM xml
parser, extensive auto loading and a modified version of the vhost module.

--
--Harry Moreau
[EMAIL PROTECTED]
http://www.online.ie - Ireland's premier portal.



Re: [AOLSERVER] 3.4 memory usage

2001-10-16 Thread Peter M. Jansson

On Tue, 16 Oct 2001, Dossy wrote:
 On 2001.10.15, Jim Wilcoxson [EMAIL PROTECTED] wrote:
  After running 12 hours, we're seeing 28 nsd threads using 253MB.  Does
 That's 9.0 MB per thread.  Sounds perfectly reasonable to me.

(What a world we live in, where 9MB in a thread -- a lightweight structure
-- is reasonable!)

Jim,

Check your adp cache size and your fastpath cache size.  Both default to
5MB per thread.

Try checking the nstelemetry.adp and see if it provides insight into how
you're doing.
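
The per-thread figure quoted above, and the growth relative to the 81MB
baseline, work out as follows.  This is only arithmetic on the numbers
reported in the thread (253MB with 28 threads, 81MB with about 12); the
function names are illustrative.

```python
def average_mb_per_thread(total_mb, threads):
    """Total footprint divided evenly across all threads."""
    return total_mb / threads


def marginal_mb_per_thread(base_mb, base_threads, now_mb, now_threads):
    """Growth since the baseline divided by the threads added since then."""
    return (now_mb - base_mb) / (now_threads - base_threads)


avg = average_mb_per_thread(253, 28)
marginal = marginal_mb_per_thread(81, 12, 253, 28)
print(round(avg, 1), marginal)  # 9.0 10.75
```

The marginal figure, 10.75MB for each thread added since startup, is close
to what two 5MB per-thread caches would account for, if both caches are in
fact enabled and sized at that default.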



Re: [AOLSERVER] 3.4 memory usage

2001-10-16 Thread carl garland

 After running 12 hours, we're seeing 28 nsd threads using 253MB. Does
 that still seem reasonable for memory usage? Our baseline for this
 server is 81MB right after the server starts with around 12 threads.
 This server handled 762K requests today, total (less than that in the
 12 hour period).

 AS 3.4, TCL 7.6, Linux 2.2.19

Are you running 7.6 for sure? I thought the team dropped support for 7.6
in the release after 3.3.1.  You said earlier that you weren't holding any
info in ns_share ... are you creating any nsv variables at all?

Also, I'm curious how long your server startup takes with and without
setting minthreads == maxthreads.  Have you tried running it in this
fashion?  If there is a potential leak in the creation/deletion of
threads, then never deleting them may help memory.

What is your adp cache set to, and do you have a cache for mmap/fastpath
set up?  If you could post a page somewhere with a *sanitized* listing of
the telemetry output it would help.

Hopefully we can track this down,
Carl


[AOLSERVER] 3.4 memory usage

2001-10-15 Thread Jim Wilcoxson

We started 3.4 on a production server this morning and after 90 minutes it
looked like this:

Last login: Mon Oct 15 05:29:02 2001
No mail.
$ ps aux|grep nsd
nsadmin  32565  0.0  3.4 40424 36132 ?   S   04:15   0:01 bin/nsd -i -t nsd
nsadmin  32568  0.0  3.4 40424 36132 ?   S   04:15   0:01 bin/nsd -i -t nsd
nsadmin  32569  0.0  3.4 40424 36132 ?   S   04:15   0:02 bin/nsd -i -t nsd
nsadmin  32570  0.1  3.4 40424 36132 ?   S   04:15   0:15 bin/nsd -i -t nsd
nsadmin  32571  0.0  3.4 40424 36132 ?   S   04:15   0:02 bin/nsd -i -t nsd
nsadmin  32572  0.0  3.4 40424 36132 ?   S   04:15   0:00 bin/nsd -i -t nsd
nsadmin  32576  0.0  3.4 40424 36132 ?   S   04:15   0:04 bin/nsd -i -t nsd
nsadmin  32579  1.9  3.4 40424 36132 ?   S   04:15   2:39 bin/nsd -i -t nsd
nsadmin  32582  2.1  3.4 40424 36132 ?   S   04:15   2:55 bin/nsd -i -t nsd
nsadmin  32588  0.1  3.4 40424 36132 ?   S   04:15   0:08 bin/nsd -i -t nsd
nsadmin  32676  2.6  3.4 40424 36132 ?   S   04:17   3:32 bin/nsd -i -t nsd
nsadmin    447  2.6  3.4 40424 36132 ?   S   04:20   3:27 bin/nsd -i -t nsd
nsadmin   5929  1.7  3.4 40424 36132 ?   S   06:00   0:31 bin/nsd -i -t nsd


Almost 12 hours later it looks like this:

# ps aux|grep nsd 
nsadmin  32565  0.0  7.6 85976 79552 ?   S   04:15   0:01 bin/nsd -i -t nsd
nsadmin  32568  0.0  7.6 85976 79552 ?   S   04:15   0:04 bin/nsd -i -t nsd
nsadmin  32569  0.0  7.6 85976 79552 ?   S   04:15   0:08 bin/nsd -i -t nsd
nsadmin  32570  0.1  7.6 85976 79552 ?   S   04:15   0:56 bin/nsd -i -t nsd
nsadmin  32571  0.0  7.6 85976 79552 ?   S   04:15   0:10 bin/nsd -i -t nsd
nsadmin  32572  0.0  7.6 85976 79552 ?   S   04:15   0:00 bin/nsd -i -t nsd
nsadmin  32576  0.0  7.6 85976 79552 ?   S   04:15   0:10 bin/nsd -i -t nsd
nsadmin  32579  2.6  7.6 85976 79552 ?   S   04:15  13:38 bin/nsd -i -t nsd
nsadmin  32582  2.4  7.6 85976 79552 ?   S   04:15  12:47 bin/nsd -i -t nsd
nsadmin  32588  0.1  7.6 85976 79552 ?   S   04:15   0:33 bin/nsd -i -t nsd
nsadmin  32676  2.7  7.6 85976 79552 ?   S   04:17  13:57 bin/nsd -i -t nsd
nsadmin    447  2.9  7.6 85976 79552 ?   S   04:20  14:48 bin/nsd -i -t nsd
nsadmin   5929  2.6  7.6 85976 79552 ?   S   06:00  10:49 bin/nsd -i -t nsd
nsadmin  10619  3.3  7.6 85976 79552 ?   S   10:31   4:38 bin/nsd -i -t nsd
nsadmin  10711  3.0  7.6 85976 79552 ?   S   10:34   4:04 bin/nsd -i -t nsd
nsadmin  10712  1.7  7.6 85976 79552 ?   S   10:34   2:19 bin/nsd -i -t nsd
nsadmin  10714  1.4  7.6 85976 79552 ?   S   10:34   1:54 bin/nsd -i -t nsd
nsadmin  10726  1.4  7.6 85976 79552 ?   S   10:35   1:56 bin/nsd -i -t nsd
nsadmin  10727  1.5  7.6 85976 79552 ?   S   10:35   2:02 bin/nsd -i -t nsd


This server does not accumulate any data in ns_shares, so I'm trying
to figure out if this 45MB memory growth is reasonable for adding 6
additional threads.  It doesn't seem reasonable.  Anyone have
suggestions for tracking down what's happening?

Is anyone using the zippy allocator in production?  Any performance data
you can share?

Thanks,
Jim
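
A minimal sketch of how the two listings above can be compared.  It
assumes, as is usual for LinuxThreads on a 2.2 kernel, that every thread
appears as its own ps row while sharing one address space, so VSZ and RSS
are counted once rather than summed.  The sample rows reuse the sizes and
row counts from the listings (13 rows, then 19); the helper names are
illustrative.

```python
def summarize(ps_rows):
    """Return (row_count, vsz_kb, rss_kb) for rows sharing one address space."""
    sizes = set()
    for row in ps_rows:
        fields = row.split()
        # ps aux columns: USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
        sizes.add((int(fields[4]), int(fields[5])))
    if len(sizes) != 1:
        raise ValueError("rows do not share a single VSZ/RSS pair")
    vsz_kb, rss_kb = sizes.pop()
    return len(ps_rows), vsz_kb, rss_kb


def growth(before_rows, after_rows):
    """Compare two snapshots of the same server."""
    n0, vsz0, rss0 = summarize(before_rows)
    n1, vsz1, rss1 = summarize(after_rows)
    return {
        "threads_added": n1 - n0,
        "vsz_growth_mb": (vsz1 - vsz0) / 1024,
        "rss_growth_mb": (rss1 - rss0) / 1024,
    }


before = ["nsadmin  32565  0.0  3.4 40424 36132 ?   S   04:15   0:01 bin/nsd -i -t nsd"] * 13
after = ["nsadmin  32565  0.0  7.6 85976 79552 ?   S   04:15   0:01 bin/nsd -i -t nsd"] * 19
result = growth(before, after)
print(result)
```

On these numbers the virtual size grows by roughly 44.5MB and the resident
size by roughly 42.4MB while 6 rows are added.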



Re: [AOLSERVER] 3.4 memory usage

2001-10-15 Thread carl garland


I'm curious why you don't just set minthreads = maxthreads at startup to
reduce load.

 This server does not accumulate any data in ns_shares, so I'm trying
 to figure out if this 45MB memory growth is reasonable for adding 6
 additional threads. It doesn't seem reasonable. Anyone have
 suggestions for tracking down what's happening?

Remember that the 6 threads may not represent all threads that have been
started or died off.  If, as you said in a previous post, you have
maxthreads = ~40, then 2 MB per thread does not seem unreasonable to me
for a production server.  Remember also that once allocated, that memory
may be reused later by the zippy allocator.  So even if your memory is
showing 85 MB, that may have been the peak, and it may not grow much
larger if peak activity has already been hit.

Best Regards,
Carl Garland


Re: [AOLSERVER] 3.4 memory usage

2001-10-15 Thread Jim Wilcoxson

 Im curious why you dont just set minthreads = maxthreads at startup to reduce load.

Because a) I don't know what a good value is for maxthreads, so
overestimate it; b) It will take longer to get the server to accept
requests when starting up.

 This server does not accumulate any data in ns_shares, so I'm trying
 to figure out if this 45MB memory growth is reasonable for adding 6
 additional threads. It doesn't seem reasonable. Anyone have
 suggestions for tracking down what's happening?

 Remember that the 6 threads may not represent all threads that have been
 started/died off. If like you said in previous post you have maxthreads
 = ~ 40 then 2 MB per thread does not seem unreasonable to me for a
 production server. Remember also that once allocated that memory may be
 later reused in the zippy alloc method. So even if your memory is
 showing 85 MB that may be what peak was but may not grow much larger if
 peak activity has been hit.

We're not using detached threads and I didn't have threadtimeout set
(do now though), so I think no threads have died.

We are using exec on this server, regular allocator, and TCL 7.6.  I
remember something a while back about exec leaks?

Guess the real test will be whether we still need to restart the server
every day.  The good news is, it hasn't crashed yet like 2.3.3 was doing!

Jim



[AOLSERVER] 3.4 memory usage

2001-10-15 Thread Jim Wilcoxson

After running 12 hours, we're seeing 28 nsd threads using 253MB.  Does
that still seem reasonable for memory usage?  Our baseline for this
server is 81MB right after the server starts with around 12 threads.
This server handled 762K requests today, total (less than that in the
12 hour period).

AS 3.4, TCL 7.6, Linux 2.2.19

Any guidance?  TIA.
Jim

nsadmin   4237  0.0 31.4 253508 244200 ? S   12:58   0:01 bin/nsd -i -t nsd
nsadmin   4241  0.0 31.4 253508 244200 ? S   12:58   0:02 bin/nsd -i -t nsd
nsadmin   4242  0.0 31.4 253508 244200 ? S   12:58   0:01 bin/nsd -i -t nsd
nsadmin   4243  0.0 31.4 253508 244200 ? S   12:58   0:27 bin/nsd -i -t nsd
nsadmin   4244  0.0 31.4 253508 244200 ? S   12:58   0:04 bin/nsd -i -t nsd
nsadmin   4245  0.0 31.4 253508 244200 ? S   12:58   0:00 bin/nsd -i -t nsd
nsadmin   4250  1.1 31.4 253508 244200 ? S   12:58   6:26 bin/nsd -i -t nsd
nsadmin   4251  1.2 31.4 253508 244200 ? S   12:58   6:49 bin/nsd -i -t nsd
nsadmin   4252  1.1 31.4 253508 244200 ? S   12:58   6:16 bin/nsd -i -t nsd
nsadmin   4253  2.4 31.4 253508 244200 ? S   12:58  13:21 bin/nsd -i -t nsd
nsadmin   4261  1.1 31.4 253508 244200 ? S   12:59   6:17 bin/nsd -i -t nsd
nsadmin   4263  1.0 31.4 253508 244200 ? S   12:59   5:51 bin/nsd -i -t nsd
nsadmin   4288  1.1 31.4 253508 244200 ? S   13:01   6:05 bin/nsd -i -t nsd
nsadmin   4332  1.0 31.4 253508 244200 ? S   13:02   6:02 bin/nsd -i -t nsd
nsadmin   4540  1.7 31.4 253508 244200 ? S   13:05   9:22 bin/nsd -i -t nsd
nsadmin   4603  1.0 31.4 253508 244200 ? S   13:06   5:49 bin/nsd -i -t nsd
nsadmin   6166  1.0 31.4 253508 244200 ? S   13:33   5:30 bin/nsd -i -t nsd
nsadmin  13523  1.0 31.4 253508 244200 ? S   15:50   4:12 bin/nsd -i -t nsd
nsadmin  16186  2.0 31.4 253508 244200 ? S   16:38   6:50 bin/nsd -i -t nsd
nsadmin  16218  0.9 31.4 253508 244200 ? S   16:39   3:09 bin/nsd -i -t nsd
nsadmin  16387  0.9 31.4 253508 244200 ? R   16:41   3:01 bin/nsd -i -t nsd
nsadmin  18673  0.9 31.4 253508 244200 ? S   17:19   2:44 bin/nsd -i -t nsd
nsadmin  18674  0.9 31.4 253508 244200 ? S   17:19   2:49 bin/nsd -i -t nsd
nsadmin  18675  0.9 31.4 253508 244200 ? S   17:19   2:53 bin/nsd -i -t nsd
nsadmin  19433  0.8 31.4 253508 244200 ? S   17:31   2:21 bin/nsd -i -t nsd
nsadmin  23954  0.8 31.4 253508 244200 ? S   18:34   1:49 bin/nsd -i -t nsd
nsadmin  25704  0.7 31.4 253508 244200 ? S   19:01   1:31 bin/nsd -i -t nsd