Hello All,

Thank you for your answers. I think you're both right about handling files in chunks instead of loading a huge file into memory at once. However, there's definitely something wrong with the way XML::Simple manages memory, which is my point.
The file I'm loading is 124 MB. There's no reason at all for memory allocation to reach 3 GB. Look at this: the XML file is a simple catalog that contains catalog elements, and each element contains properties like price and other details.

Today I ran a test with XML::Simple. Instead of having it parse the whole file, I opened the file in my script and extracted one element at a time (<item>.*?</item>). I parsed each element with XML::Simple, which returns a hash reference, and added that reference to a main hash created at the beginning of the script. At the end, I had the whole catalog in my main hash, exactly as XML::Simple should have built it from the whole file. I watched memory usage while the script ran, and it never exceeded 150 MB (and that only at the end), which is the logical thing to expect. The script was also very fast: less than 3 minutes for all 30,000 catalog elements.

Anyway, thank you very much for your suggestions; I'm using them as well to get the best solution working.

Cheers,

Paco Zarabozo A.

--------------------------------------------------
From: "Jenda Krynicky" <[email protected]>
Sent: Monday, December 15, 2008 10:09 AM
To: "Active State Perl Mailing List" <[email protected]>
Subject: Re: Perl "Out of Memory!" Issue

> Don't. While XML::Simple is kinda nice if the XML is fairly small and
> simple, once it grows big you'd better reach for a different tool.
> The "parse the whole XML into a maze of objects"-style modules are
> out of the question in this case; their memory footprint would most
> likely be even bigger. So you are left with those that let you process
> the file in chunks. Either stream-based parsers like the SAX modules
> and XML::Parser (IMHO, they lead to code that's hard to understand
> and debug), or parsers that let you specify what tag encloses a
> digestible chunk and then give you the data of that tag and its
> content one at a time, like XML::Twig.
> Or a module that lets you filter the tags as they are encountered,
> transform the data structure as it's built, and process the data
> structure at whatever level(s) is convenient, like XML::Rules.

--------------------------------------------------
From: "Christian Walde" <[email protected]>
Sent: Sunday, December 14, 2008 11:53 PM
To: "Zarabozo, Francisco (GE, Corporate)" <[email protected]>; "Active State Perl Mailing List" <[email protected]>
Subject: Re: Perl "Out of Memory!" Issue

> That's solved in a simple manner. Get a module that doesn't load the
> entire file into memory at once. Such as this:
> http://search.cpan.org/~mirod/XML-Twig-3.32/Twig.pm

_______________________________________________
ActivePerl mailing list
[email protected]
To unsubscribe: http://listserv.ActiveState.com/mailman/mysubs
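For reference, a minimal sketch of the chunk-by-chunk approach described above, with a tiny in-memory catalog standing in for the real 124 MB file. The <item> layout and field names (id, price) are assumptions for illustration, and the regex extraction is deliberately simplistic: it only demonstrates the idea and would break on nested or multi-line items.

```perl
use strict;
use warnings;
use XML::Simple qw(XMLin);   # CPAN module under discussion

# A tiny in-memory catalog standing in for the 124 MB file
# (assumed <item> layout with id and price children).
my $xml = join '',
    map { "<item><id>$_</id><price>" . ($_ * 10) . "</price></item>" } 1 .. 3;

my %catalog;

# Pull one <item>...</item> at a time and parse only that small chunk,
# so XML::Simple never sees the whole document at once.
while ($xml =~ m{(<item>.*?</item>)}g) {
    my $item = XMLin($1);              # hash reference for this element
    $catalog{ $item->{id} } = $item;   # merge into the main hash
}

printf "%d items, item 2 costs %d\n",
    scalar(keys %catalog), $catalog{2}{price};
```

Because each chunk is small, peak memory stays close to the size of the accumulated hash rather than ballooning the way a whole-file XMLin call does.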
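And a minimal sketch of the XML::Twig route recommended in the replies: the parser streams through the document and hands each completed <item> to a handler, so the rest of the file never accumulates in memory. The element and field names are the same illustrative assumptions as above.

```perl
use strict;
use warnings;
use XML::Twig;   # CPAN; processes the document as a stream of chunks

# Same tiny stand-in catalog as before.
my $xml = '<catalog>'
        . join('', map { "<item><id>$_</id><price>" . ($_ * 10) . "</price></item>" } 1 .. 3)
        . '</catalog>';

my %catalog;
my $twig = XML::Twig->new(
    twig_handlers => {
        # Called once per completed <item> element.
        item => sub {
            my ($t, $item) = @_;
            $catalog{ $item->first_child_text('id') } =
                { price => $item->first_child_text('price') };
            $t->purge;   # discard the handled subtree, keeping memory flat
        },
    },
);
$twig->parse($xml);

print scalar(keys %catalog), " items parsed\n";
```

The call to purge inside the handler is what keeps the footprint small: without it, XML::Twig would retain every parsed element just like a whole-document parser.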
