https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=37020
--- Comment #19 from Leo Stoyanov <leo.stoya...@bywatersolutions.com> ---
Created attachment 176921
  --> https://bugs.koha-community.org/bugzilla3/attachment.cgi?id=176921&action=edit
Bug 37020: [alternate] Changed script to use XML::Twig to reduce memory usage.

In the original script, MARC::Batch->new('XML', ...) was holding onto large chunks of parsed data for MARCXML files, resulting in excessive memory usage even after the script tried to free references. A hacky approach was to manually clear the batch's internal structures, but Data::Dumper showed that MARC::Batch wasn't storing data in those specific fields, so the hack did nothing about the underlying caching.

By contrast, XML::Twig can stream XML elements one by one, calling a handler for each and then discarding the parsed chunk.

Note: there is no guarantee that this implementation works for non-XML files as it stands (although in theory it should). The focus is on the XML::Twig implementation as a reference solution. Overall, batching seems to be what is eating up memory.

To test:
1. Run
     perl misc/migration_tools/bulkmarcimport.pl -m=MARCXML -b -d -v --commit=1000 --file=file_path_here
   on a large MARCXML/XML file (for example, 2 GB or greater).
2. On whatever machine or container it is run, the script will likely cause an out-of-memory error and crash the environment.
3. Apply the patch, run "restart_all", and redo step 1. The script should use much less memory to import records from MARCXML/XML files.
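
For readers unfamiliar with the streaming pattern this comment describes (hand one record element to a handler, then discard it), here is a minimal sketch of the same idea. It is not the patch itself: it uses Python's standard-library xml.etree.ElementTree.iterparse rather than XML::Twig, and the names stream_records, collect_id, local_name, and SAMPLE are hypothetical, chosen only for illustration.

```python
import io
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip an optional '{namespace}' prefix from an element tag."""
    return tag.rsplit("}", 1)[-1]


def stream_records(source, handle_record):
    """Call handle_record(elem) for each <record>, then free parsed data.

    Illustrative only: this mirrors the handle-then-discard pattern,
    not the actual bulkmarcimport.pl code.
    """
    count = 0
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)  # first event is the start of the root element
    for event, elem in context:
        if event == "end" and local_name(elem.tag) == "record":
            handle_record(elem)
            count += 1
            root.clear()  # drop already-processed children of the root
    return count


# Tiny in-memory MARCXML sample (made-up data, for demonstration only).
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<collection xmlns="http://www.loc.gov/MARC21/slim">
  <record><controlfield tag="001">rec-1</controlfield></record>
  <record><controlfield tag="001">rec-2</controlfield></record>
</collection>"""

ids = []


def collect_id(record):
    """Example handler: remember the 001 control number of each record."""
    for child in record:
        if local_name(child.tag) == "controlfield" and child.get("tag") == "001":
            ids.append(child.text)


total = stream_records(io.BytesIO(SAMPLE), collect_id)
print(total, ids)
```

The key design point is that only one record is held in memory at a time: once the handler returns, the parsed element is released, so memory use should stay roughly flat no matter how large the input file is.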