Hi Christopher, thanks for the script. It gives us a first hint of what may be going on internally. Next, some profiling output could be helpful, so could you please run the complete script with the following Java option..
  java -Xrunhprof:cpu=samples,depth=15 ...

..and send me the java.hprof.txt file, which will be stored in the directory the code is started from? The Java profiler also provides a "heap" option (see -Xrunhprof:help), but I don’t actually know how to reasonably interpret the output.

For testing purposes, it is sometimes helpful to further reduce the amount of memory that’s assigned to the JVM (via -Xmx...m).

Best,
Christian
___________________________

2013/6/12 Christopher.Ball <[email protected]>:
> Alex,
>
> Here is the script (.bxs file) contents in its partitioned form (broken out
> into 6 separate scripts rather than one script):
>
> SET STRIPNS true
> SET ADDCACHE true
> SET TEXTINDEX false
> SET ATTRINDEX false
>
> OPEN Release-Canonicals-Comparative
>
> XQUERY db:output('
>  -- ' || current-time() || ' -- 
> ')
>
> XQUERY db:output("
> #12")
> SET BINDINGS
> $db=Release-Canonicals-Comparative,$containerSetStart=110001,$containerSetCount=10000
> RUN ..\webapp\release-identification\xquery\generate-comparison-db.xq
>
> XQUERY db:output("
> #13")
> SET BINDINGS
> $db=Release-Canonicals-Comparative,$containerSetStart=120001,$containerSetCount=10000
> RUN ..\webapp\release-identification\xquery\generate-comparison-db.xq
>
> XQUERY db:output('
>  -- ' || current-time() || ' -- 
> ')
>
> If I run it in this partitioned form, it is quite fast, roughly 5 minutes
> per "RUN" command. If I concatenate them all together, it progressively
> slows down, consumes all the available memory, starts to swap memory to
> disc, cpu climbs to 100% and eventually fails with a memory error.
>
> Christopher
>
>
> On 6/12/2013 2:45 AM, Alexander Holupirek wrote:
>
> Christopher,
>
> it may be sufficient if you can pass the script (.bxs) file that you use to
> process the data.
>
> Would that be possible?
> Alex
>
> On 12.06.2013, at 02:46, "Christopher.Ball"
> <[email protected]> wrote:
>
> Christian,
> So I have finally upgraded to BaseX 7.7 and found I am still having the out
> of memory issue.
>
> Given the size and nature of the data I am working with I am at a loss of
> how to provide you with a simple example that replicates the problem.
>
> On the flip side, one behavior I am noticing is that breaking the work into
> discrete chunks in separate batch scripts gives dramatically faster
> performance and avoids the memory error. This strongly suggests that
> something is preventing garbage collection between unrelated tasks in a
> batch script.
>
> Is there any way I can force garbage collection in a batch script? I tried
> closing and reopening databases but that had no effect (actually shocked
> that it did not).
>
> Let me know,
>
> Christopher
>
> On 5/20/2013 6:24 AM, Christian Grün wrote:
>
> Hi Christopher, hi Ben,
>
> yes, this sounds like unwanted behavior, and I believe it should be
> fixable as the command scripts I’ve been working with didn’t cause
> memory leaks. I’ll be glad to track down the possible issues. Could
> (one/both) of you pass me on a script that causes the problems?
>
> Christian
>
> PS: I would be grateful if you could additionally check if the problem
> persists in the latest stable snapshot.
> ___________________________
>
> On Mon, May 20, 2013 at 10:33 AM, Ben Companjen
> <[email protected]>
> wrote:
>
> I recognise your problem, and reported it, but never got back to it
> with more details. I used BaseX client/server 7.5 beta. My first
> database contained 2.7 million documents, but I created a new one from
> an exported subset of 700k documents. That helped lower the memory use
> directly after loading the DB.
>
> Any chance you use the SQL module in your processing?
>
> My guess was that it had been a design choice to keep previously
> opened documents from a database in use in memory. But running out of
> memory probably wasn't ;)
>
> Ben
>
> On 20 May 2013 04:32, Christopher.R.Ball
> <[email protected]>
> wrote:
>
> I have a BaseX script (.bxs) I am running that does queries in batches (sets
> of 5k documents), but as it progresses it bogs down in speed, does not
> release memory between sets even if I force it to close and reopen the db
> between queries, and eventually runs out of memory.
>
> But, if I break the same BaseX script into separate files still doing the
> same exact batches it is extremely fast and memory efficient.
>
> Very suggestive of a memory leak . . .
>
> I am running on BaseX 7.6.1 Beta.
>
> Any thoughts?
>
> Is there a way to force the script to do garbage collection?
>
>
> _______________________________________________
> BaseX-Talk mailing list
>
> [email protected]
> https://mailman.uni-konstanz.de/mailman/listinfo/basex-talk
>
> -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
> Dr. Alexander Holupirek
> |-- Room E 221, 0049 7531 88 2188 (phone) 3577 (fax)
> |-- Database & Information Systems Group, U Konstanz
> `-- https://scikon.uni-konstanz.de/personen/alexander.holupirek/

