Ben Tilly wrote:
> Tom Metro wrote:
>> Any recommendation for tools to do this?
> 
> Devel::Leak and Devel::LeakTrace are my best suggestion.

Devel::LeakTrace hasn't had a release since 2003. I see a
Devel::LeakTrace::Fast released in 2007, which says it is a rewrite of
the former.

http://search.cpan.org/~andya/Devel-LeakTrace-Fast-0.11/lib/Devel/LeakTrace/Fast.pm

  Devel::LeakTrace::Fast...trace[s] SV allocations of a running program.

  At END time Devel::LeakTrace::Fast identifies any remaining variables,
  and reports on the lines in which they came into existence.

Devel::Leak does the same thing, except you get to pick when it starts
and stops examining the allocations.

It seems like either of these would produce a flood of allocations, most of
which would be expected, and neither would tell you anything in an
actual OOM failure scenario. I could see Devel::Leak being useful if
applied to a small chunk of code where you anticipated finding leaks. In
better designed code, you could then widen the scope as you cleaned up
leaks.


> If you are constantly growing memory usage, this can help figure out why.

As noted in my response to Uri, memory growth over the run is expected,
and my suspicion is that recent changes are more likely to have
introduced storage inefficiencies than leaks.

I'd like to be able to answer two questions:
1. In the immediate term, which variables are using significantly more
RAM in the new version of the code compared to the old, and
2. In the long term, which variables are consuming the most memory, such
that optimization efforts can be focused there.
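
Once per-variable sizes have been collected somehow, both questions reduce
to sorting. A rough sketch of what I have in mind, using only core Perl; the
variable names, the byte counts, and the helpers growth_report() and
top_consumers() are all made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical measurements: variable name => size in bytes, taken at the
# same point in the old and new versions of the code on identical input.
my %old = ('%cache' => 1_200_000, '@queue' => 400_000,   '%index' => 90_000);
my %new = ('%cache' => 1_250_000, '@queue' => 2_900_000, '%index' => 90_000);

# Question 1: growth per variable, largest increase first.
# A variable missing from the old run counts as 0 bytes there.
sub growth_report {
    my ($old, $new) = @_;
    my %delta = map { $_ => $new->{$_} - ($old->{$_} // 0) } keys %$new;
    return map  { [ $_, $delta{$_} ] }
           sort { $delta{$b} <=> $delta{$a} || $a cmp $b } keys %delta;
}

# Question 2: the $n biggest consumers in a single run.
sub top_consumers {
    my ($sizes, $n) = @_;
    my @names = sort { $sizes->{$b} <=> $sizes->{$a} || $a cmp $b }
                keys %$sizes;
    $n = @names if $n > @names;
    return @names[0 .. $n - 1];
}

printf "%-8s %+12d bytes\n", @$_ for growth_report(\%old, \%new);
print "largest: ", join(', ', top_consumers(\%new, 2)), "\n";
```

The hard part, of course, is getting trustworthy numbers into those two
hashes in the first place.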


>> The ideal solution would be something that could hook the OOM exception
>> and dump the symbol table along with stats for how much memory each
>> symbol is occupying. Another useful possibility would be dumping the
>> call stack.
> 
> The symbol table is not enough.  It doesn't see data in lexical
> variables. 

I guess I didn't mean literally the symbol table. The heap, with perl
identifiers.


> And figuring out how much memory an array or hash may be
> taking is easier said than done, because doing it means walking the
> array or hash and figuring that out.  But with circular data
> structures you have to keep track of where you have been, which
> requires somewhere to stick that information, but you're already out
> of memory.

I guess that's where the emergency buffer would come into play.
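
For reference, the bookkeeping Ben describes looks roughly like the sketch
below. It uses only core modules; count_reachable() is a name I made up, and
it counts referents rather than bytes, just to show where the extra memory
(the %seen hash) goes:

```perl
use strict;
use warnings;
use Scalar::Util qw(refaddr reftype);

# Count the distinct referents reachable from $root, visiting each
# reference only once so that circular structures terminate.
sub count_reachable {
    my ($root) = @_;
    my %seen;               # refaddr => visit count; the extra memory needed
    my @todo  = ($root);    # explicit stack instead of recursion
    my $count = 0;
    while (@todo) {
        my $item = pop @todo;
        next unless ref $item;              # plain scalars have no children
        next if $seen{ refaddr($item) }++;  # already visited
        $count++;
        my $type = reftype($item);
        if    ($type eq 'ARRAY') { push @todo, @$item }
        elsif ($type eq 'HASH')  { push @todo, values %$item }
        elsif ($type eq 'REF' || $type eq 'SCALAR') { push @todo, $$item }
        # CODE, GLOB, and other reference types are counted but not followed.
    }
    return $count;
}

# A two-node cycle: each hash points at the other.
my %a = (name => 'a');
my %b = (name => 'b', peer => \%a);
$a{peer} = \%b;

print count_reachable(\%a), "\n";   # prints 2
```

The %seen hash grows with the number of references visited, which is
exactly the allocation you can't afford once you are already out of memory.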

Anyway, I think there are other ways of getting good clues as to where
the problem is without having to trap the OOM error.


>> It seems to be an unusual error in that you often see
>> multiple of them, as if the first few are warnings, and then eventually
>> it is fatal.
> 
> Random possibilities.  Could it be that Perl does not always check whether
> it got memory when asked?  So you don't crash until you ask for memory
> somewhere that checked properly.

You wouldn't think it would print "OOM!" if it wasn't checking the
return from malloc.


> Or perhaps Perl doesn't exit on asking for memory, but crashes when it
> tries to use memory it doesn't have.

That's certainly possible, but there's no segfault error, and the
documentation says OOM should be a fatal exit.


> Either way I think you are better off inserting things that drop debug
> state every so often, and then figure out what is growing, and try to
> narrow it down.  

Conor Walsh's mention of Devel::Size suggests one approach: I could log
the size of a handful of the most suspect variables. Apply that to both
the current and last known working version of the code. Then compare
each version running with identical input data to see what has changed.
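
A minimal sketch of that logging, assuming Devel::Size is installed from
CPAN (it is not a core module). The size_line() helper, the log format, and
the stand-in variables %cache and @queue are hypothetical:

```perl
use strict;
use warnings;

# Devel::Size is a CPAN module, not core. Load it at run time so this
# sketch reports "n/a" instead of dying when it is not installed.
my $have_devel_size = eval { require Devel::Size; 1 };

# Hypothetical helper: build one "label name=bytes ..." log line from
# name => reference pairs. total_size() follows nested references.
sub size_line {
    my ($label, %vars) = @_;
    my @fields;
    for my $name (sort keys %vars) {
        my $bytes = $have_devel_size
            ? Devel::Size::total_size($vars{$name})
            : 'n/a';
        push @fields, "$name=$bytes";
    }
    return join ' ', $label, @fields;
}

# Stand-ins for the suspect variables in the real program.
my %cache = map { $_ => 'x' x 100 } 1 .. 1000;
my @queue = (1 .. 5000);

print STDERR size_line('after-load',
    '%cache' => \%cache,
    '@queue' => \@queue,
), "\n";
```

Dropping the same call at the same points in both versions should give two
logs that can be compared line by line.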

If only there was a more automated way of doing this...ah, here we go:
http://search.cpan.org/~cgautam/Devel-DumpSizes-0.01/lib/Devel/DumpSizes.pm

  Devel::DumpSizes - Dump the name and size in bytes (in increasing
  order) of variables that are available at a given point in a script.

  This module was written while debugging a huge long running script.
  The main use being to understand how variable sizes were fluctuating
  during script execution. It uses PadWalker and Devel::Symdump to get
  the variables. It uses Devel::Size to report the size of each
  variable.

That's the ticket...if it works.


> Good luck.

Thanks, and thanks to all for the suggestions.

 -Tom

-- 
Tom Metro
Venture Logic, Newton, MA, USA
"Enterprise solutions through open source."
Professional Profile: http://tmetro.venturelogic.com/

_______________________________________________
Boston-pm mailing list
Boston-pm@mail.pm.org
http://mail.pm.org/mailman/listinfo/boston-pm
