It appears that my original send of this e-mail (sent from the new AOL communicator) was eaten by some spam filters (for reasons only partially understood). I am attempting to resend using AOL 8.0, so if people have problems with this copy as well, please let me know. Meanwhile, I apologize for the redundant send.

-Elizabeth
-----------------------------------------------------------------------------------------------------------------------
Calling all gurus - any advice is greatly appreciated....

We have a server that loads a great deal of Tcl - so much so that the resulting init script is 6.6M in size and contains over 9900 procs. In going from AOLserver 3.5.5 (Tcl 8.4.4) to AOLserver 4.0 (Tcl 8.4.4) we have seen two unexpected effects:

1. The startup of an individual thread has more than doubled in time (to ~4 seconds)
2. The slowdown gets even worse with multiple threads. On server initialization, we start a number of threads simultaneously (~30-40), and the overall server start time has degraded seriously (to ~15 minutes)
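
As a rough sanity check on those two figures, the arithmetic below compares the observed start time with what strictly one-at-a-time startup would cost. It assumes (a simplification, not a measurement) that the ~4 seconds applies to each thread when it initializes alone and that nothing else contributes to startup time; serial_estimate is just an illustrative helper name.

```python
# Back-of-the-envelope check using only the figures quoted above.
# Assumption: each thread needs ~4 s when it initializes on its own.

def serial_estimate(n_threads, per_thread_s):
    """Total time if the threads initialized strictly one after another."""
    return n_threads * per_thread_s

PER_THREAD_S = 4.0       # ~4 seconds per thread
OBSERVED_S = 15 * 60     # ~15 minutes observed with simultaneous startup

for n_threads in (30, 40):
    est = serial_estimate(n_threads, PER_THREAD_S)
    print(f"{n_threads} threads: serial estimate {est:.0f} s, "
          f"observed start time is {OBSERVED_S / est:.1f}x that")
```

Under that assumption the serial estimate is 120-160 seconds, so the simultaneous startup is taking several times longer than running the same initializations back to back would.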

Item number one is still a bit of a mystery. The init script is not dramatically different between our 3.5 server and our 4.0 server. A somewhat unscientific attempt to load the 4.0 init script into a 3.5 server showed a similar slowdown, so we do not currently believe that this slowdown is due to a change in the AOLserver core thread initialization code between 3.5 and 4.0 (but we haven't totally ruled that out). The one difference we do know of is the use of Tcl packages, but it's not yet clear to us how that might dramatically affect the initialization time.

Based on data from various tools (quantify, pstack, etc.), item number two seems to be due to a huge amount of malloc contention as the init scripts are evaluated simultaneously. Because each thread is still being initialized at that point, the thread memory allocator does not help avoid the contention: it must itself call malloc to populate its pool. We have implemented a workaround by adding an optional 'tclinitlock' ns parameter that forces serialization of thread initialization (thus reducing the malloc thrashing and cutting our initialization time from 15 minutes to 6), but this is obviously not an ideal solution. Call stacks of the two most prevalent areas of contention are included at the end of this e-mail.
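
The serialization idea behind that workaround can be sketched as follows. This is a Python illustration of the locking pattern only, not the AOLserver change itself: run_init_script, InitStats and start_threads are hypothetical names, and Python threads will not reproduce the malloc contention. It only shows that, with the lock enabled, at most one thread is inside its initialization at any time.

```python
import contextlib
import threading

def run_init_script(n_procs):
    """Hypothetical stand-in for evaluating the init script in a new interp."""
    return {f"proc_{i}": "body" for i in range(n_procs)}

class InitStats:
    """Tracks how many threads are initializing at the same moment."""
    def __init__(self):
        self._guard = threading.Lock()
        self.active = 0
        self.peak = 0

    def enter(self):
        with self._guard:
            self.active += 1
            self.peak = max(self.peak, self.active)

    def leave(self):
        with self._guard:
            self.active -= 1

def start_threads(n_threads, serialize, n_procs=1000):
    """Start n_threads workers; if serialize is true, one init runs at a time."""
    stats = InitStats()
    # A real lock serializes initialization; nullcontext leaves it concurrent.
    init_lock = threading.Lock() if serialize else contextlib.nullcontext()
    interps = [None] * n_threads

    def worker(idx):
        with init_lock:
            stats.enter()
            try:
                interps[idx] = run_init_script(n_procs)
            finally:
                stats.leave()

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return stats.peak, interps

if __name__ == "__main__":
    peak, _ = start_threads(8, serialize=True)
    print("peak concurrent initializations:", peak)
```

With serialize=True the peak can never exceed 1, which removes the contention but also makes every thread wait in line; that is the trade-off that makes the workaround less than ideal.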

For grins, we have tried some experiments running without the thread memory allocator, and running with Hoard, but both just made the performance worse.

We are then left with two problems to solve:

1. Is there any way to reduce the time to initialize an interp (besides the obvious, but not necessarily feasible, option of reducing the number of procs that are loaded)? Has anyone seen similar behavior, and does anyone have some insight into it?

2. Is there any way to avoid the malloc contention, but still allow concurrent processing?

Our investigation is continuing, but this is really the last known issue before we feel we can declare 4.0 ready for prime time, so any help is greatly appreciated.

Thanks,
-Elizabeth

Call Stacks from malloc contention:

ff099ae0 lwp_sema_wait (f818be78)
feecb288 _park    (f818bdc0, f818be78, 0, feef8a6c, f458bdc0, 0) + 10c
feecacc4 _swtch   (5, feeec9ac, f818be54, f818be50, f818be4c, f818be48) + 128
feecc814 _mutex_adaptive_lock (ff0b9fc8, 4c00, feeec9ac, 1, 4d58, fffeffff) + 120
feecc5c4 _cmutex_lock (ff0b9fc8, ff, feeec9ac, ff045724, 0, 0) + 50
ff045724 malloc   (3f9c, 0, fc, 8, f818b30c, f818b308) + 18
ff2612e0 GetBlocks (6ecb7a8, 8, e0, 0, 1c, 0) + 484
ff25fd48 TclpAlloc (a25, 1b3, feeec9ac, f818adec, 0, ff00) + 124
ff1bc6ec Tcl_Alloc (a25, a25, f818ace4, f818ac08, f818ac88, 2dae8a78) + 18
ff25cfbc Tcl_NewStringObj (1c776170, a24, ff2a857c, ff21500c, ff283ec0, 0) + bc
ff254884 TclCreateProc (6e449e8, 3da7c88, 164637e8, 6ffef70, 2247b4c0, f818af44) + bc
ff2544b4 Tcl_ProcObjCmd (0, 6e449e8, 4, 6eaba74, ff254260, 0) + 254
ff1b4e14 TclEvalObjvInternal (6e449e8, 4, 6eaba74, 0, 0, 0) + 4dc
ff1fb95c TclExecuteByteCode (6e449e8, 297f3a38, ff1e7c7c, 1000, f818b30c, f818b308) + 13e0
ff1fa478 TclCompEvalObj (6e449e8, 715c690, ff2420a8, 3a000000, f818b4d0, f818b4d4) + 224
ff1b6654 Tcl_EvalObjEx (6e449e8, 715c690, 0, 0, ff28b768, 0) + 13c
ff240de4 NamespaceEvalCmd (0, 6e449e8, 4, f818b728, 0, f818b5e4) + 170
ff24019c Tcl_NamespaceObjCmd (0, 6e449e8, 4, f818b728, ff23fffc, f818b674) + 1a0
ff1b4e14 TclEvalObjvInternal (6e449e8, 4, f818b728, 1672e112, 644455, 0) + 4dc
ff1b5d58 Tcl_EvalEx (6e449e8, 1672de38, 64c979, 20000, 0, 734c5c5) + 318
ff34d01c UpdateInterp (734c490, 1, ff34d06c, 734c490, ff2a8a1c, 0) + 68
ff34cf58 InitInterp (6e449e8, 406e0, f818ba64, ffffffff, fffffff8, 734ca20) + 1b8
ff34bc8c Ns_TclAllocateInterp (34c98, ff38620c, ff38620c, f818bbc8, 64, f818bcb8) + b0
ff34b914 Ns_TclEval (f818bb68, 34c98, 7447540, 332d0000, 332d, 7438170) + 1c
ff35be74 NsTclThread (7447538, 7438158, ff35be08, ff2e7d10, 0, 0) + 6c
ff2e4ee0 NsThreadMain (7438158, 0, 0, 0, 0, 0) + 8c
ff2e76ac ThreadMain (7438158, 0, 1, feeee000, 1, feeec9ac) + c
feedbc7c _thread_start (7438158, 0, 0, 0, 0, 0) + 40


ff099ae0 lwp_sema_wait (e798be78)
feecb288 _park    (e798bdc0, e798be78, 0, feef8a6c, ec18bdc0, 0) + 10c
feecacc4 _swtch   (5, feeec9ac, e798be54, e798be50, e798be4c, e798be48) + 128
feecc814 _mutex_adaptive_lock (ff0b9fc8, 4c00, feeec9ac, 1, 4d58, fffeffff) + 120
feecc5c4 _cmutex_lock (ff0b9fc8, ff, feeec9ac, ff045724, e0, 0) + 50
ff045724 malloc   (4933, 3f9c, 3c00, 27c56, 1c, 0) + 18
ff25fc9c TclpAlloc (492a, b6, 4929, 4929, a, a000000) + 78
ff1bc6ec Tcl_Alloc (492a, 492a, 2ca92e60, 4, 74, 30fa6758) + 18
ff239cdc TclRegisterLiteral (e798aa40, 2a55545d, 4929, 0, e798a830, 0) + 364
ff1e85c8 TclCompileScript (2615f930, 2a315390, 644440, 0, e798aa40, 2615cabd) + 588
ff1e7a58 TclSetByteCodeFromAny (2615f930, 261619b0, 0, 0, 1c, 0) + a0
ff1e7c98 SetByteCodeFromAny (2615f930, 261619b0, ff1e7c7c, 1000, e798b2cc, e798b2c8) + 1c
ff1fa30c TclCompEvalObj (2615f930, 261619b0, ff2420a8, 3a000000, e798b490, e798b494) + b8
ff1b6654 Tcl_EvalObjEx (2615f930, 261619b0, 0, 0, ff28b768, 0) + 13c
ff240de4 NamespaceEvalCmd (0, 2615f930, 4, e798b6e8, 0, e798b5a4) + 170
ff24019c Tcl_NamespaceObjCmd (0, 2615f930, 4, e798b6e8, ff23fffc, e798b634) + 1a0
ff1b4e14 TclEvalObjvInternal (2615f930, 4, e798b6e8, 1672e112, 644455, 0) + 4dc
ff1b5d58 Tcl_EvalEx (2615f930, 1672de38, 64c979, 20000, 0, 2615cabd) + 318
ff34d01c UpdateInterp (2615c988, 1, ff34d06c, 2615c988, ff2a8a1c, 0) + 68
ff34cf58 InitInterp (2615f930, 406e0, e798ba24, 7efefeff, 81010100, ff00) + 1b8
ff34bc8c Ns_TclAllocateInterp (34c98, 2262c, 0, 0, 48ad678, ff3688f8) + b0
ff34b914 Ns_TclEval (0, 34c98, 15bfab00, e798bb18, 3a, e798bc28) + 1c
ff354b7c EvalCallback (15ca08e0, e798bc34, ffffffff, 32332d00, 2d00, 48ad678) + 24
ff354c80 NsTclSchedProc (15ca08e0, 26f, 12eac060, ff354c70, 362d, 48ad678) + 10
ff33e5c0 EventThread (6, 48ad660, ff33e3a4, ff2e7d10, 0, 0) + 21c
ff2e4ee0 NsThreadMain (48ad660, 0, 0, 0, 0, 0) + 8c
ff2e76ac ThreadMain (48ad660, 0, 1, feeee000, 1, feeec9ac) + c
feedbc7c _thread_start (48ad660, 0, 0, 0, 0, 0) + 40


-----------------------------------------------
Elizabeth Thomas
Principal Software Engineer
America Online, Inc.



