On Thursday 15 December 2005 03:35 pm, Perrin Harkins wrote:
> Not a great way to start your post. Please read this:
> http://modperlbook.org/html/ch10_01.html
My apologies. I do own the book and I have read it, but it was some time ago
and I didn't remember that some of my questions were addressed. My biggest
question was the last one pertaining to the possible trick to optimize
memory. I'm starting to realize after reading over my original post (and your
reply) that I could have been more clear with some of my questions.
> > Also - as of the current Perl 5.8 series, we're still not sharing /
> > doing COW with variable memory (SV/HV/AV) right?
>
> COW doesn't care what's in the memory. It will share pages with
> variables in them if they don't change. Even reading a variable in a
> new context can change it in perl though.
Well, I should have been clearer here that I was no longer talking about
prefork. What I meant by COW was this: if you were going to use multiplicity
with perl_clone, then since the threads would share memory, Perl would
either need to:
(a) create a copy of all variables;
or
(b) use a thread sync primitive like an rwsem and do its own COW on this data.
If the latter were true (perl doing its own COW under multiplicity), it would
seem to me that reading a variable shouldn't automatically require the
variable to be copied (except perhaps under SvMAGIC), though I understand
that if you were using prefork and the OS was doing COW, if you so much as
wrote a byte, you'd invalidate the page.
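
For reference, a minimal sketch of what option (a) looks like from Perl
space, using the core threads module. It assumes a perl built with ithreads
(it checks for that and skips the thread otherwise), and %model is a
hypothetical stand-in for real data. It only illustrates that a new thread
works on its own clone of a variable; it says nothing about what is shared
at the C level.

```perl
use strict;
use warnings;
use Config;

# Hypothetical stand-in for one of the model structures.
my %model = (name => 'orig');

if ($Config{useithreads}) {
    require threads;
    # The new thread gets its own clone of %model, so this write
    # only touches the thread's private copy.
    threads->create(sub { $model{name} = 'changed in thread' })->join;
}
else {
    print "this perl was built without ithreads; skipping the thread\n";
}

# The parent's copy is not affected by the write made in the thread.
print "parent sees: $model{name}\n";
```

Sharing a variable between threads has to be requested explicitly, for
example through threads::shared.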
I remember there being talks in the mod_perl dev docs somewhere about a
GvSHARED proposal... this question was just 'did this ever see the light of
day?'
> How much data are we talking about? If it's more than a few MBs, I
> wouldn't recommend it. Just put it in MySQL, or Cache::FastMmap, or
> BerkeleyDB, and then access only the bits you need from each process.
Oh no, nothing on that order. Each structure consists of a base hash, with
let's say 20 keys, and perhaps a few nested hashes under these keys. What I'm
not yet clear on from my study of perlguts is how much overhead is associated
with a hash, because while the individual structures aren't huge, there could
be a lot of them.
(In essence it's a very flexible data modeller whose models are created during
startup by an ordered dependency launcher 'plugin' system. Certain 'plugins',
if you will, can decorate existing models, so I maintain the hash structure to
describe all the fields of the model.)
Here's another way to put my question... in the case that I do:
    sub _get_Model {
        return {
            hash => [qw(stuff here)],
        };
    }
is this root hash, along with its keys, values, etc., stored with the
variables, or is it in the op tree? I.e., without GvSHARED, would a perl_clone
under multiplicity create a separate copy of this structure per thread?
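
One part of this can be checked from Perl space without touching threads. As
a small sketch only: the anonymous hash constructor runs every time the sub
is called, so each call hands back a freshly built structure rather than a
single stored one. The variable names below are just for illustration, and
this does not settle what perl_clone does with the op tree itself.

```perl
use strict;
use warnings;

sub _get_Model {
    return {
        hash => [qw(stuff here)],
    };
}

my $first  = _get_Model();
my $second = _get_Model();

# References compare by address in numeric context; two live
# structures built by separate calls have different addresses.
print $first == $second ? "same structure\n" : "separate structures\n";

# Changing one copy leaves the other alone.
push @{ $first->{hash} }, 'more';
print scalar(@{ $second->{hash} }), " elements in the second copy\n";
```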
I can't put the data in SQL for performance reasons. Cache::FastMmap, despite
being fast, would worry me as well, especially given that the structures
would have to be serialized/deserialized. At that point, I'd rather just keep
a copy per thread as I currently do. And BDB would be dependency creep, which
is quite difficult to fight in perl sometimes (that's just because there are
a zillion useful CPAN modules) :)
I'm just always looking for ways to optimize really. I've almost completed a
multithreaded SIP server in C that provides a perl environment, sort of a
mod_perl for SIP if you will. This is unfortunately stalled due to $work at
the moment. In the process of doing so, I found both the mod_perl 2 code and
the perl code itself to be enlightening, but honestly the perl code is
beastly and takes time getting used to :)
I find myself wishing there were a comprehensive perlguts book.
> - Perrin
Thanks again.
Cheers,
Chase