Title: RE: Name advice: check license of dependencies

> On a side note, I discovered that using Data::Dumper on my output 
> object causes memory use to go through the roof.  I think 
> Data::Dumper is chasing into the CPANPLUS data structures and 
> thrashing my machine.  Is anyone familiar enough with CPANPLUS 
> internals to know whether Data::Dumper problems are well
> known, or if  I've stumbled on some new bug?

Assuming you are on Win32, then yes, this is definitely a well-known bug. The main problem is that under normal Win32 builds perl uses the OS's malloc/realloc, which doesn't seem to be smart enough to expand the previously allocated buffer in place when possible. This means that every time DD appends part of the data structure it has to copy the entire existing output string. A second problem is that DD needs to catalog every single SV it encounters in order to detect reference cycles; if there are many SVs involved, that can be a lot of metadata.
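To illustrate the append pattern that triggers the copying (a minimal sketch, not DD's actual internals -- the variable names are made up):

```perl
use strict;
use warnings;

# The dumper effectively grows one big buffer; an allocator that
# can't extend the buffer in place must copy the whole string on
# every append, turning the dump into quadratic work.
my $out = '';
for my $chunk (1 .. 5) {
    $out .= "chunk $chunk\n";   # each .= may realloc-and-copy
}
print length($out), "\n";       # 5 appends of 8 bytes -> 40
```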

It's worth noting that on Win32, setting $Data::Dumper::Useqq=1 (or, in later versions, $Data::Dumper::Useperl=1) will often force DD not to use the XS implementation. The pure-Perl code doesn't suffer from this performance degradation as badly, so a dump that would exhaust your available memory in XS will often finish in a reasonable time in pure Perl.
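A minimal sketch of forcing the pure-Perl implementation (the data structure here is just an arbitrary example):

```perl
use strict;
use warnings;
use Data::Dumper;

# Force the pure-Perl dumper instead of the XS one.
# ($Data::Dumper::Useperl exists in later versions; on older
# versions, $Data::Dumper::Useqq = 1 has the same side effect.)
local $Data::Dumper::Useperl = 1;

my $data = { list => [ 1 .. 5 ], nested => { deep => [ ['x'] ] } };
print Dumper($data);
```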

Another option is to try Data::Dump::Streamer instead. DDS takes longer to dump on average, but it never degrades the way DD does, because it doesn't build its output in memory before printing unless specifically asked to do so. The fact that it's easier to read and much more accurate and correct than DD is another reason to consider it. (It can dump closures properly, including the enclosed variables!)
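A small sketch of the closure case, assuming Data::Dump::Streamer is installed from CPAN and that its exported Dump() prints in void context:

```perl
use strict;
use warnings;
use Data::Dump::Streamer;   # CPAN module, not in core perl

# A closure over $n: Data::Dumper can't represent the captured
# variable, but DDS dumps the sub body together with $n.
my $n = 0;
my $counter = sub { ++$n };

Dump($counter);   # streams the dump out rather than buffering it
```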

BTW, there is one last case where DD has real problems. It relates to pseudo-hashes and a rather insidious bug:

 my $hash_list = [ { foo => [] } ];
 my $x = $hash_list->{foo};

This causes perl to use the address of the [] as the index into the array when doing the pseudo-hash lookup, which can result in the array being extended to a huge size. It's possible that perl itself will be OK with this, but when DD then goes to build an in-memory string containing several million undefs, it gets really unhappy, for obvious reasons. DDS, on the other hand, doesn't suffer from this problem: several million undefs in an array are emitted as a list constructor like (undef) x $count, so while the dump will take a long time, the memory usage will stay low and the program will terminate without exhausting available RAM.
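To see why the index ends up so large: numifying a reference yields its memory address. A minimal sketch:

```perl
use strict;
use warnings;

# Using a reference where a number is expected numifies it to its
# address -- a very large integer, which is exactly what gets used
# as the array index in the pseudo-hash accident above.
my $aref  = [];
my $index = 0 + $aref;   # address of the anonymous array
print "index would be: $index\n";
```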

Yves
