Hi Christopher,

thanks for the script. It gives us a first hint at what may be going on
internally. Next, some profiling output could be helpful, so could you
please run the complete script with the following Java option:

  java -Xrunhprof:cpu=samples,depth=15 ...

...and send me the java.hprof.txt file, which will be stored in the
directory the code is started from? The Java profiler also provides a
"heap" option (see -Xrunhprof:help), but I don’t actually know how to
reasonably interpret its output.

For testing purposes, it is sometimes helpful to further reduce the
amount of memory that’s assigned to the JVM (via -Xmx...m).
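
To confirm that a reduced -Xmx value is really picked up, a small
stand-alone class can print the heap limits of the running JVM. This is
only a minimal sketch: HeapReport is a hypothetical helper and not part
of BaseX, and it relies on nothing but the standard java.lang.Runtime
API.

```java
// Hypothetical helper (not part of BaseX): reports the heap limits of the running JVM.
final class HeapReport {
    private static final long MB = 1024L * 1024L;

    /** Maximum heap size the JVM will attempt to use (controlled by -Xmx), in MB. */
    static long maxMb() {
        return Runtime.getRuntime().maxMemory() / MB;
    }

    /** Heap currently occupied, including garbage that has not been collected yet, in MB. */
    static long usedMb() {
        Runtime rt = Runtime.getRuntime();
        return (rt.totalMemory() - rt.freeMemory()) / MB;
    }

    public static void main(String[] args) {
        System.out.println("max heap:  " + maxMb() + " MB");
        System.out.println("used heap: " + usedMb() + " MB");
    }
}
```

After compiling with javac, starting it with the same -Xmx setting as
the script (for example "java -Xmx256m HeapReport") should report a
maximum close to that value; depending on the JVM, the reported number
may be slightly lower than the one given on the command line.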

Best,
Christian
___________________________

2013/6/12 Christopher.Ball <[email protected]>:
> Alex,
>
> Here are the contents of the script (.bxs file) in its partitioned form
> (broken out into 6 separate scripts rather than one script):
>
> SET STRIPNS   true
> SET ADDCACHE  true
> SET TEXTINDEX false
> SET ATTRINDEX false
>
> OPEN Release-Canonicals-Comparative
>
> XQUERY db:output('&#xa; -- ' || current-time() || ' -- &#xa;')
>
> XQUERY db:output("&#xa;#12")
> SET BINDINGS $db=Release-Canonicals-Comparative,$containerSetStart=110001,$containerSetCount=10000
> RUN ..\webapp\release-identification\xquery\generate-comparison-db.xq
>
> XQUERY db:output("&#xa;#13")
> SET BINDINGS $db=Release-Canonicals-Comparative,$containerSetStart=120001,$containerSetCount=10000
> RUN ..\webapp\release-identification\xquery\generate-comparison-db.xq
>
> XQUERY db:output('&#xa; -- ' || current-time() || ' -- &#xa;')
>
> If I run it in this partitioned form, it is quite fast, roughly 5 minutes
> per "RUN" command. If I concatenate them all together, it progressively
> slows down, consumes all the available memory, starts to swap memory to
> disc, cpu climbs to 100% and eventually fails with a memory error.
>
> Christopher
>
> On 6/12/2013 2:45 AM, Alexander Holupirek wrote:
>
> Christopher,
>
> it may be sufficient if you can pass the script (.bxs) file that you use to
> process the data.
>
> Would that be possible?
> Alex
>
> On 12.06.2013, at 02:46, "Christopher.Ball"
> <[email protected]> wrote:
>
> Christian,
> So I have finally upgraded to BaseX 7.7 and found I am still having the out
> of memory issue.
>
> Given the size and nature of the data I am working with, I am at a loss as
> to how to provide you with a simple example that replicates the problem.
>
> On the flip side, one behavior I am noticing is that breaking the work into
> discrete chunks in separate batch scripts gives dramatically faster
> performance and avoids the memory error. This strongly suggests that
> something is preventing garbage collection between unrelated tasks in a
> batch script.
>
> Is there any way I can force garbage collection in a batch script? I tried
> closing and reopening databases but that had no effect (actually shocked
> that it did not).
>
> Let me know,
>
> Christopher
>
> On 5/20/2013 6:24 AM, Christian Grün wrote:
>
> Hi Christopher, hi Ben,
>
> yes, this sounds like unwanted behavior, and I believe it should be
> fixable, as the command scripts I’ve been working with didn’t cause
> memory leaks. I’ll be glad to track down the possible issues. Could
> one (or both) of you pass on to me a script that causes the problems?
>
> Christian
>
> PS: I would be grateful if you could additionally check if the problem
> persists in the latest stable snapshot.
> ___________________________
>
> On Mon, May 20, 2013 at 10:33 AM, Ben Companjen
> <[email protected]>
>  wrote:
>
> I recognise your problem, and reported it, but never got back to it
> with more details. I used BaseX client/server 7.5 beta. My first
> database contained 2.7 million documents, but I created a new one from
> an exported subset of 700k documents. That helped lower the memory use
> directly after loading the DB.
>
> Any chance you use the SQL module in your processing?
>
> My guess was that it had been a design choice to keep previously
> opened documents from a database in use in memory. But running out of
> memory probably wasn't ;)
>
> Ben
>
> On 20 May 2013 04:32, Christopher.R.Ball
> <[email protected]>
>  wrote:
>
> I have a BaseX script (.bxs) I am running that does queries in batches (sets
> of 5k documents), but as it progresses it bogs down in speed, does not
> release memory between sets even if I force it to close and reopen the db
> between queries, and eventually runs out of memory.
>
> But, if I break the same BaseX script into separate files still doing the
> same exact batches it is extremely fast and memory efficient.
>
> Very suggestive of a memory leak . . .
>
> I am running on BaseX 7.6.1 Beta.
>
> Any thoughts?
>
> Is there a way to force the script to do garbage collection?
>
>
> _______________________________________________
> BaseX-Talk mailing list
>
> [email protected]
> https://mailman.uni-konstanz.de/mailman/listinfo/basex-talk
>
> -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
> Dr. Alexander Holupirek
> |-- Room E 221, 0049 7531 88 2188 (phone) 3577 (fax)
> |-- Database & Information Systems Group, U Konstanz
> `-- https://scikon.uni-konstanz.de/personen/alexander.holupirek/
>
>
>
>
