I wrote about ZendMM some time ago
(http://julien-pauli.developpez.com/tutoriels/php/internals/zend-memory-manager/);
the article is in French, though ;-)

To keep this short, I would suggest tracing the memory usage with
valgrind/massif. That's not too hard if you know what you are doing;
if not, it can take some time.

Basically, Johannes gave some good hints, but memory management is
hard to reason about. I suggest you don't try to work out how much
memory it "would" take, as the computation is too complex to be
accurate. Only a memory debugger will show you exactly what happens.
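
To make that gap concrete, here is a small Python sketch that only
re-runs the arithmetic from Mike's message quoted below. All figures are
copied from that message; the variable names are mine (what each constant
stands for is my reading, not something he states), and nothing here is
measured.

```python
# Figures copied verbatim from Mike's message; names are illustrative only.
FIXED_TERMS = 640952 + 2165950 + 2165950  # the three constant terms in his sum
COUNT = 276697        # the count he multiplies by (presumably the token count)
ZVAL_GUESS = 114      # per-zval size he assumed
TOTAL = 244000000     # the total he starts from in his second formula

# First formula: one extra zval, plus 1 + 3 zvals per counted item.
estimate = FIXED_TERMS + ZVAL_GUESS + COUNT * ZVAL_GUESS + COUNT * 3 * ZVAL_GUESS

# Second formula: per-zval size that would be needed to reach TOTAL.
implied_zval = (TOTAL - FIXED_TERMS) / 4 / COUNT

print(estimate, round(estimate / 2**20, 2))  # estimated bytes and MiB
print(round(implied_zval, 4))                # implied bytes per zval
```

The estimate comes out at 131146798 bytes, a little over half of the
244000000 he starts from, which is why I would rather trust a memory
debugger than a hand computation here.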

BTW, there might be a small leak inside token_get_all(), as it doesn't
seem to free the memory it allocates. It's not easy to track down,
since it works directly with the lexical scanner.

Julien.P

2011/6/8 Johannes Schlüter <johan...@schlueters.de>:
> On Tue, 2011-06-07 at 21:03 +0200, David Zülke wrote:
>> 144 (not 114!) bytes is for an integer; I'm not quite sure what the
>> overheads are for arrays, which token_get_all() produces in
>> abundance :) An empty array seems to occupy 312 bytes of memory.
>>
>> Also, strings have memory allocated in 8 byte increments as far as I
>> know, so "1" eats up 8 bytes, and "12345678901234567" will consume 24
>> bytes for the raw text, not 17.
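
A minimal Python sketch of the rounding David describes, assuming an
8-byte granularity and counting the trailing NUL byte that Johannes
mentions below (strlen()+1). The helper names are made up for
illustration, and the real allocator may round differently on your
system.

```python
def round_up(n: int, granularity: int = 8) -> int:
    """Round n up to the next multiple of granularity."""
    return -(-n // granularity) * granularity


def string_payload(s: str) -> int:
    """Bytes for the raw text: byte length plus a trailing NUL, rounded up."""
    return round_up(len(s.encode("utf-8")) + 1)


for text in ("1", "12345678901234567"):
    print(len(text), "->", string_payload(text))
```

For the two strings in David's example this gives 8 and 24 bytes.
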
>
> I'm too lazy to do the actual math (well best would be to do
> sizeof(zval), sizeof(HashTable), sizeof(Bucket) on your system) and
> there are few things to consider:
>
>      * The sizes differ between 32-bit and 64-bit systems; on 64-bit
>        there's a difference between Windows and Unix/Linux (on Win a
>        long will still be 32 bit, but pointers 64 bit; on Linux/Unix
>        both are 64 bit)
>      * On some architectures memory segments have to be aligned in some
>        way which might waste memory
>      * As David mentioned HashTables (Arrays) are more complex.
>      * token_get_all() returns an array of (string | array of (long,
>        string, long) )
>      * A long takes sizeof(zval)
>      * A string takes sizeof(zval)+strlen()+1
>      * An array is a HashTable + space for buckets; this includes
>        room for some unused elements
>      * Each element inside the HT needs additional space for a Bucket
>        with some meta data
>      * While running your script you also keep the complete script file
>        in memory. You also keep some temporary parser data in memory
>        while the resulting array is being filled.
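
The per-value rules in the list above can be turned into a rough Python
estimator. This is only a sketch: the three sizes are parameters you
would fill in from sizeof() on your own system, the class and function
names are hypothetical, and it ignores alignment, unused bucket slots,
allocator overhead and the parser's temporary data. Counting one zval
for each array itself is my assumption.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple, Union

Token = Union[str, Tuple[int, str, int]]  # a string, or (long, string, long)


@dataclass
class Sizes:
    """Results of sizeof() on your system; supply your own numbers."""
    zval: int
    hashtable: int
    bucket: int


def long_bytes(sz: Sizes) -> int:
    return sz.zval  # a long takes sizeof(zval)


def string_bytes(sz: Sizes, s: str) -> int:
    return sz.zval + len(s.encode("utf-8")) + 1  # sizeof(zval)+strlen()+1


def array_bytes(sz: Sizes, element_bytes: Sequence[int]) -> int:
    # Assumption: one zval for the array itself, one HashTable, and one
    # Bucket per element (unused slots are not counted).
    return (sz.zval + sz.hashtable
            + len(element_bytes) * sz.bucket + sum(element_bytes))


def token_bytes(sz: Sizes, token: Token) -> int:
    if isinstance(token, str):
        return string_bytes(sz, token)
    _token_id, text, _line = token
    return array_bytes(
        sz, [long_bytes(sz), string_bytes(sz, text), long_bytes(sz)])


def result_bytes(sz: Sizes, tokens: Sequence[Token]) -> int:
    """Estimate for the whole array returned by token_get_all()."""
    return array_bytes(sz, [token_bytes(sz, t) for t in tokens])


# Made-up round numbers, NOT real sizeof() results.
fake = Sizes(zval=10, hashtable=100, bucket=20)
print(result_bytes(fake, [(1, "<?php ", 1), ";"]))
```

Whatever numbers you plug in, treat the result as a lower bound at best.
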
>
> In the end it's not fully trivial to gather the size needed. And I'm
> sure my list is missing loooots of things.
>
> http://schlueters.de/blog/archives/142-HashTables.html has a short
> introduction to HashTables, skipping many of the details.
>
> johannes
>
>> David
>>
>>
>> On 07.06.2011, at 20:26, Mike van Riel wrote:
>>
>> > Am I then also correct to assume that the output of
>> > memory_get_peak_usage is used for determining the memory_limit?
>> >
>> > Also: after correcting with your new information (zval = 114 bytes
>> > instead of 68) I still have a rather large offset:
>> >
>> >    640952+2165950+114+(276697*114)+(276697*3*114)+2165950
>> >      = 131146798 = 125M
>> >
>> > (not trying to be picky here; I just don't understand)
>> >
>> > _If_ my calculations are correct then a zval should be approx 216 bytes
>> > (excluding string contents):
>> >
>> >    ((244000000-640952-2165950-2165950) / 4) / 276697 = 215.9647B
>> >
>> > Mike
>> >
>> > On Tue, 2011-06-07 at 19:50 +0200, David Zülke wrote:
>> >> memory_get_peak_usage() is the maximum amount of memory used by the VM of 
>> >> PHP (but not by some extensions for instance) up until the point where 
>> >> that function is called. So the actual memory usage may be even higher 
>> >> IIRC. But yeah, you're basically right. I've explained in another message 
>> >> why it might be so much more than you expected (zval overhead, basically)
>> >>
>> >> David
>> >
>> >
>> >
>>
>
>
>

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
