At 15:17 -0500 1/9/04, Perrin Harkins wrote:
On Fri, 2004-01-09 at 14:52, Stas Bekman wrote:
> We really need more real world benchmarks to make a good judgement. It's
> probably quite certain that the performance is going to be worse if you spawn
> threads, but don't deploy the benefits available exclusively to threads
> (shared opcode tree, shared vars, etc).

That reminds me, does anyone know what happened with the shared opcode tree? Does it not work, or is it just dwarfed by the size of the non-shared stuff? The size problems these guys are having seem to point to little or no sharing happening between threads.
I'm sure you know my PerlMonks article "Things you need to know before programming Perl ithreads" ( http://www.perlmonks.org/index.pl?node_id=288022 ).
I recently ran a little test that showed (at least according to Devel::Size) that you have _at least_ about 250Kbyte of "data" that needs to be copied for every new thread if you _only_ do:
use threads; use threads::shared;
And this number may well be too low, because I don't know for sure whether the CV's in the stash were counted correctly. If they were not, then you would come to about 400Kbyte of "data" for a _bare_ thread.
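A rough sketch of that kind of measurement (an illustration, not the exact test; it assumes Devel::Size is installed and takes its numbers at face value) could look like this:

  #!/usr/bin/perl
  # Size the main:: stash of a bare threaded interpreter -- roughly the
  # data perl_clone() has to copy for every new thread. Note that
  # Devel::Size may not descend into every glob, which is exactly the
  # CV-counting caveat mentioned above.
  use strict;
  use warnings;
  use threads;
  use threads::shared;
  use Devel::Size qw(total_size);

  printf "main:: stash holds about %.0f KB of data\n",
      total_size( \%main:: ) / 1024;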
Loading a few modules, each with its own initializations, adds up _very_ quickly to several Mbytes of "data" that need to be cloned _every_ time you start a thread. And these are _not_ simple copies: all of the stashes need to be walked to make sure that all the [SAHC]V's are properly copied into the new thread. So it takes a _lot_ of CPU as well...
So yes, in general I think you can say that the data copied for each thread quickly dwarfs whatever optrees are shared.
How is this different from fork? When you fork, the OS shares all memory pages between the parent and the child. As variables are modified, memory pages become dirty and unshared. With forking, mutable data (variables) and non-mutable data (opcodes) share the same memory pages, so once a mutable variable changes, the opcodes allocated from the same memory page get unshared too. So you get more and more memory unshared as you go; in the long run (unless you use size-limiting tools) all the memory gets unshared.
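On Linux you can actually watch this unsharing happen. Here is a small sketch (illustrative only; it assumes a kernel that provides /proc/<pid>/smaps, and the numbers are approximate):

  #!/usr/bin/perl
  use strict;
  use warnings;

  # Memory (in KB) that a process still shares with other processes,
  # summed from the Shared_Clean/Shared_Dirty lines of /proc/<pid>/smaps.
  sub shared_kb {
      my ($pid) = @_;
      my $kb = 0;
      open my $fh, '<', "/proc/$pid/smaps" or return 0;
      while (<$fh>) {
          $kb += $1 if /^Shared_(?:Clean|Dirty):\s+(\d+) kB/;
      }
      return $kb;
  }

  my @data = (1) x 500_000;           # mutable data allocated before the fork

  my $pid = fork;
  die "fork failed: $!" unless defined $pid;
  if ($pid == 0) {                    # child
      printf "shared before writes: %d KB\n", shared_kb($$);
      $_++ for @data;                 # dirty the pages that hold @data
      printf "shared after  writes: %d KB\n", shared_kb($$);
      exit 0;
  }
  waitpid $pid, 0;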
With ithreads, the opcode tree is always shared and the mutable data is copied once, at thread-creation time. So your memory consumption should be exactly the same after the first request and after the 1000th request (assuming that you don't allocate any memory at run time). Here you pay more memory up front when the thread is spawned, but it stays the same.
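A minimal sketch of that model (the names and sizes are made up for illustration): anything not marked ':shared' is cloned into each thread when it is spawned, while ':shared' data exists only once:

  #!/usr/bin/perl
  use strict;
  use warnings;
  use threads;
  use threads::shared;

  my @private = (1) x 100_000;    # cloned into every thread by perl_clone()
  my %results :shared;            # a single copy, visible to all threads

  my @workers = map {
      threads->create(sub {
          # This write only touches the thread's own clone of @private.
          $private[0] = threads->tid();
          lock %results;
          $results{ threads->tid() } = scalar @private;
      });
  } 1 .. 4;
  $_->join for @workers;

  print "thread $_ saw $results{$_} private elements\n"
      for sort { $a <=> $b } keys %results;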
So let's say you have an 8MB opcode tree and 4MB of mutable data, for a total process size of 12MB. Using fork you will start off with all 12MB shared and get memory unshared as you go. With threads, you will start off with 4MB of upfront memory consumption per thread, and it will stay the same. Now, if in your fork setup you use a size-limiting tool to restart a process once its shared memory drops to 8MB, you end up with the same 4MB of unshared overhead per process. Besides equal memory usage, you get better run-time performance with threads, because there are no dirty pages to copy as with forks (everything was done at perl_clone() time, which can be arranged long before the request is served), though you also lose some speed to context management.
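Putting those numbers side by side (a back-of-the-envelope model; the 8MB/4MB figures are the illustrative ones above and the worker count is arbitrary):

  #!/usr/bin/perl
  use strict;
  use warnings;

  my $optree  = 8;     # MB, opcode tree
  my $mutable = 4;     # MB, mutable data
  my $workers = 10;

  # prefork: 12MB image shared at the start; the size limiter restarts a
  # child once it has unshared its 4MB of mutable data, so that is the
  # per-child ceiling.
  my $fork_mb   = ($optree + $mutable) + $workers * $mutable;   # 52

  # ithreads: the opcode tree stays shared; the parent and each spawned
  # thread carry their own copy of the mutable data, fixed at spawn time.
  my $thread_mb = $optree + ($workers + 1) * $mutable;          # 52

  printf "prefork worst case: %dMB, ithreads steady state: %dMB\n",
      $fork_mb, $thread_mb;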
So, as you can see, it's quite possible that threads will perform better than forks and consume an equal or smaller amount of memory, provided the opcode tree is bigger than the mutable data.
__________________________________________________________________
Stas Bekman            JAm_pH ------> Just Another mod_perl Hacker
http://stason.org/     mod_perl Guide ---> http://perl.apache.org
mailto:[EMAIL PROTECTED] http://use.perl.org http://apacheweek.com
http://modperlbook.org http://apache.org   http://ticketmaster.com