Re: Questions about optimizing memory usage

Chase Venters Fri, 16 Dec 2005 01:33:59 -0800

On Thursday 15 December 2005 03:35 pm, Perrin Harkins wrote:
> Not a great way to start your post.  Please read this:
> http://modperlbook.org/html/ch10_01.html


My apologies. I do own the book and I have read it, but it was some time ago 
and I didn't remember that some of my questions were addressed. My biggest 
question was the last one pertaining to the possible trick to optimize 
memory. I'm starting to realize after reading over my original post (and your 
reply) that I could have been more clear with some of my questions.

> >     Also - as of the current Perl 5.8 series, we're still not sharing /
> > doing COW with variable memory (SV/HV/AV) right?
>
> COW doesn't care what's in the memory.  It will share pages with
> variables in them if they don't change.  Even reading a variable in a
> new context can change it in perl though.

Well, I should have been more clear here that I was no longer talking about 
prefork. What I meant by COW here was actually that if you were going to use 
multiplicity with Perl_clone, since the threads would share memory, Perl 
would either need to:

(a) create a copy of all variables;
or
(b) use a thread sync primitive like an rwsem and do its own COW on this data.

If the latter were true (perl doing its own COW under multiplicity), it would 
seem to me that reading a variable shouldn't automatically require the 
variable to be copied (except perhaps under SvMAGIC), though I understand 
that if you were using prefork and the OS was doing COW, if you so much as 
wrote a byte, you'd invalidate the page.
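
To make the "reading can write" point concrete, here is a minimal sketch using the core B module to inspect a scalar's flags before and after it is read in numeric context. The helper name has_int_slot is made up for this illustration, and the exact flags can vary between perl versions, so treat it as a demonstration rather than a statement about every build.

```perl
use strict;
use warnings;
use B ();

# Hypothetical helper: true if the scalar currently carries a cached
# integer value (the IOK flag is set on its SV).
sub has_int_slot {
    my ($ref) = @_;
    return (B::svref_2object($ref)->FLAGS & B::SVf_IOK()) ? 1 : 0;
}

my $value  = "42";                  # created as a plain string
my $before = has_int_slot(\$value);

my $sum = $value + 0;               # only reads $value, but in numeric context

my $after = has_int_slot(\$value);

print "integer value cached before read: $before\n";
print "integer value cached after read:  $after\n";
```

If the flag flips, the SV was written to even though the Perl code only read the variable, which under prefork is enough to dirty the page it lives on.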

I remember there being talks in the mod_perl dev docs somewhere about a 
GvSHARED proposal... this question was just 'did this ever see the light of 
day?'

> How much data are we talking about?  If it's more than a few MBs, I
> wouldn't recommend it.  Just put it in MySQL, or Cache::FastMmap, or
> BerkeleyDB, and then access only the bits you need from each process.

Oh no, nothing on that order. Each structure consists of a base hash with, 
let's say, 20 keys, and perhaps a few nested hashes under these keys. What I'm 
not yet clear on in my study of perlguts is how much overhead is associated 
with a hash, because while the individual structures aren't huge, there could 
be a lot of them.

(In essence it's a very flexible data modeller whose models are created during 
startup by an ordered dependency launcher 'plugin' system. Certain 'plugins', 
if you will, can decorate existing models, so I maintain the hash structure to 
describe all the fields of the model.)

Here's another way to put my question... in the case that I do:

sub _get_Model {
        return {
                hash => [qw(stuff here)],
        };
}

is this root hash (its keys, values, etc.) stored along with variables, or is 
it in the op tree? I.e., without GvSHARED, would a Perl_clone under 
multiplicity create a separate copy of this structure per thread?
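
One way to probe this from Perl space, as a rough sketch rather than an answer about the internals: call the sub twice and compare what comes back, then (only if this perl was built with ithreads) modify the structure inside a thread and check the parent's copy. The variable names are made up for the illustration, and it only shows observable behaviour, not where perl keeps the underlying constants.

```perl
use strict;
use warnings;
use Config;
use Scalar::Util qw(refaddr);

sub _get_Model {
    return {
        hash => [qw(stuff here)],
    };
}

# Two calls, both results kept alive so their addresses can be compared.
my $first    = _get_Model();
my $second   = _get_Model();
my $distinct = refaddr($first) != refaddr($second) ? 1 : 0;

# Changing one result should leave the other untouched.
push @{ $first->{hash} }, 'extra';
my $second_len = scalar @{ $second->{hash} };

print "separate hash per call:   $distinct\n";
print "elements in second copy:  $second_len\n";

# Only attempted when this perl was built with ithreads.
if ($Config{useithreads}) {
    require threads;
    my $thr = threads->create(sub {
        push @{ $first->{hash} }, 'from thread';   # touches the thread's clone
        return scalar @{ $first->{hash} };
    });
    my $seen_in_thread = $thr->join;
    my $seen_in_parent = scalar @{ $first->{hash} };
    print "thread saw $seen_in_thread elements, parent has $seen_in_parent\n";
}
```

If the two calls return different addresses, the hash is being built at run time on each call rather than handed out from one stored copy, and if the parent's count is unchanged after the join, the thread was working on its own clone of the data.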

I can't put the data in SQL for performance reasons. Cache::FastMmap, despite 
being fast, would worry me as well, especially given that the structures 
would have to be serialized/deserialized. At that point, I'd rather just keep 
a copy per thread as I currently do. And BDB would be dependency creep, which 
is quite difficult to fight in perl sometimes (that's just because there are 
a zillion useful CPAN modules) :)

I'm just always looking for ways to optimize really. I've almost completed a 
multithreaded SIP server in C that provides a perl environment, sort of a 
mod_perl for SIP if you will. This is unfortunately stalled due to $work at 
the moment. In the process of doing so, I found both the mod_perl 2 code and 
the perl code itself to be enlightening, but honestly the perl code is 
beastly and takes time getting used to :)

I find myself wishing there were a comprehensive perlguts book.

> - Perrin

Thanks again.

Cheers,
Chase
