At 13:26 -0800 1/9/04, Stas Bekman wrote:
Elizabeth Mattijsen wrote:
I'm sure you know my PerlMonks article "Things you need to know
before programming Perl ithreads" (
http://www.perlmonks.org/index.pl?node_id=288022 ).
So yes, in general I think you can say that the data copied for
each thread quickly dwarfs whatever optrees are shared.
How is this different from fork? When you fork, the OS shares all
memory pages between the parent and the child. As variables are
modified, memory pages become dirty and unshared. With forking,
mutable data (variables) and immutable data (opcodes) share the same
memory pages, so once a mutable variable changes, the opcodes
allocated from the same memory page get unshared too. So you get more
and more memory unshared as you go; in the long run (unless you use
size-limiting tools) all the memory gets unshared.
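The copy-on-write semantics described above can be seen at the language level with a minimal fork sketch (variable names are illustrative; the actual page-level sharing is invisible from Perl, but the isolation of writes is not):

```perl
use strict;
use warnings;

# After fork, parent and child share all memory pages copy-on-write.
# A write in the child dirties (unshares) only the pages it touches;
# the parent's copy is never affected.
our $counter = 42;

my $pid = fork();
die "fork failed: $!" unless defined $pid;

if ($pid == 0) {
    # Child: this write triggers copy-on-write on the page(s)
    # holding $counter -- only in the child's address space.
    $counter = 1000;
    exit 0;
}

waitpid($pid, 0);
# Parent still sees the original value.
print "parent counter: $counter\n";   # parent counter: 42
```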
Well, yes. But you forget that when you load module A, usually
modules B..Z are loaded as well, hidden from your direct view. And
Perl has always taken the approach of using more memory rather than
more CPU. So most modules are actually optimized by their authors to
store intermediate results in maybe not so intermediate variables.
Not to mention, many modules build up internal data structures that
may never be altered. Even compile-time constants need to have a CV
in the stash where they exist, even though they're optimized away in
the optree at compile time. And a CV qualifies as "data" as far as
threads are concerned.
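The point about constants is easy to verify: even though `use constant` folds the value into the optree, a CV remains visible in the package stash (the constant name below is just an example):

```perl
use strict;
use warnings;

# The value 42 is inlined wherever ANSWER is used at compile time...
use constant ANSWER => 42;

# ...but a CV for it still sits in the stash, and that CV is "data"
# that each ithread will copy.
print "CV exists in stash\n" if exists &main::ANSWER;
print ANSWER, "\n";
```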
With ithreads, the opcode tree is always shared and mutable data is
copied at the very beginning. So your memory consumption should be
exactly the same after the first request and after the 1000th request
(assuming that you don't allocate any memory at run time). Here you
get more memory consumed at the start of the spawned thread, but it
stays the same.
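The "mutable data is copied at thread creation" behaviour is observable directly. A minimal sketch, assuming a perl built with ithreads (variable names are illustrative):

```perl
use strict;
use warnings;
use threads;            # requires a perl built with ithreads
use threads::shared;

# At thread creation, ithreads copies every mutable variable into the
# new thread; only the optree is shared. Data is shared only when you
# ask for it explicitly with threads::shared.
my $private          = 'original';
my $shared : shared  = 'original';

threads->create(sub {
    $private = 'changed in thread';   # touches the thread's own copy
    $shared  = 'changed in thread';   # touches the one shared copy
})->join;

print "private: $private\n";   # private: original
print "shared:  $shared\n";    # shared:  changed in thread
```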
Well, I see it this way: with threads, you're going to take the hit
for everything possible at the beginning. With fork, you take the hit
only when something _actually_ changes, and spread out over time. I
would take fork() anytime over that.
So let's say you have an 8MB opcode tree and 4MB of mutable data, the
process totalling 12MB. Using fork, you will start off with all 12MB
shared and get memory unshared as you go. With threads, you will
start off with 4MB of upfront memory consumption per thread and it'll
stay the same.
But if you start 100 threads, you'll use 400 MByte, whereas if you
fork 100 times, you'll start off with basically 12 MByte and a bit.
It's the _memory_ usage that is causing the problem.
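The arithmetic behind the two positions can be made explicit. A sketch using Stas's hypothetical 8MB/4MB split (real numbers would also include per-thread interpreter and stack overhead, hence the "and a bit"):

```perl
use strict;
use warnings;

# Hypothetical split from the discussion: sizes in MB.
my ($optree_mb, $data_mb, $workers) = (8, 4, 100);

# ithreads: the optree is shared once, but every thread copies the
# mutable data up front.
my $ithreads_mb = $optree_mb + $data_mb * $workers;   # 8 + 4*100 = 408

# fork: all 12 MB start out shared copy-on-write; pages unshare only
# as they are actually written to.
my $fork_start_mb = $optree_mb + $data_mb;            # 12

printf "ithreads: %d MB up front, fork: %d MB up front\n",
    $ithreads_mb, $fork_start_mb;
```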
On top of that, I think you will find quite the opposite ratio of
optree to mutable data usage. A typical case would more likely be
something like 4MB of optree and 8MB of mutable data.
To prove my point, I have taken my Benchmark::Thread::Size module
(available from CPAN) and tested the behaviour of POSIX with and
without anything exported.