Hi Mark,

The long running query is quite significant.

On 30/05/14 18:26, Mark Feblowitz wrote:
That’s a good idea.

One improvement I’ve already made is relocating the DB to local
disk - having it on a shared filesystem is an even worse idea.

The updates tend to be on the order of 5-20 triples at a time.

If you could batch up changes, that will help for all sorts of reasons; cf. autocommit in JDBC, where many small changes run really slowly.
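
For illustration, a minimal sketch of what batching could look like from a Java client, using the Jena 2.x update API. The endpoint URL (http://localhost:3030/ds/update) and the generated example triples are made up here:

    import com.hp.hpl.jena.update.UpdateExecutionFactory;
    import com.hp.hpl.jena.update.UpdateFactory;
    import com.hp.hpl.jena.update.UpdateRequest;

    public class BatchedPost {
        public static void main(String[] args) {
            // Accumulate many small changes into one INSERT DATA so the
            // server sees a single write transaction (one commit, one
            // disk sync) instead of one per handful of triples.
            StringBuilder triples = new StringBuilder();
            for (int i = 0; i < 500; i++) {
                triples.append("<http://example/s").append(i)
                       .append("> <http://example/p> ").append(i).append(" .\n");
            }
            UpdateRequest request =
                UpdateFactory.create("INSERT DATA { " + triples + " }");
            UpdateExecutionFactory.createRemote(
                request, "http://localhost:3030/ds/update").execute();
        }
    }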

This is part of the issue - write transactions have a significant fixed
cost that you pay even for a (theoretical) transaction of no changes. It
has to write a few bytes and do a disk-sync.  Reads continue during this
time, but longer write times mean there is less chance of the system
being able to write the journal back to the main database.  JENA-567 may
help - it isn't faster (it's slower), but it saves memory.

Read transactions have near zero cost in TDB - Fuseki/TDB is read-centric.

What's more, the TDB block size is 8 Kbytes, so one change in a block is 8K of transaction state - multiplied across the indexes it touches. So 5 triples of change get very little shared-block effect, and the memory footprint is disproportionately large.
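
(A rough illustration, assuming the three default triple indexes SPO/POS/OSP: a 5-triple update that dirties just one 8K leaf block per index already carries 3 x 8K = 24K of blocks in the transaction - more once branch blocks and the node table are counted - for well under 1K of actual triple data.)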

<thinking out loud id=1>

A block size of 1 or 2k for the leaf blocks in TDB, leaving the branch blocks at 8k (they are in different block managers = different files), would be worth experimenting with.

</thinking out loud>

<thinking out loud id=2>

We could provide some netty/mina/... based server that did the moral
equivalent of the SPARQL Protocol (cf. jena-jdbc, jena-client). HTTP is said to be an appreciable cost. That is no judgement of Jetty/Tomcat - it is the nature of HTTP - and "said to be" is cautious wording because I haven't observed it myself: Jetty locally, together with careful streaming of results, seems to be quite effective. Fast encoding of results would be good for both.

</thinking out loud>

I believe I identified the worst culprit, and that was using
OWLFBRuleReasoner rather than RDFSExptRuleReasoner or
TransitiveReasoner. My guess is that the longish query chain over a
large triplestore, using the OWL reasoner, was leading to very long
query times and lots of memory consumption. Do you think that’s a
reasonable guess?

That does look right. Long-running queries, or an intense stream of small back-to-back queries combined with the update pattern, leave no time for the system to flush the journal back to the main database. This leads to growing memory usage and eventually an OOME.
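
As a sketch of the difference in plain Java terms (in Fuseki itself this choice is made in the assembler configuration; the in-memory base model below is just a stand-in for your TDB-backed data):

    import com.hp.hpl.jena.rdf.model.InfModel;
    import com.hp.hpl.jena.rdf.model.Model;
    import com.hp.hpl.jena.rdf.model.ModelFactory;
    import com.hp.hpl.jena.reasoner.Reasoner;
    import com.hp.hpl.jena.reasoner.ReasonerRegistry;

    public class ReasonerChoice {
        public static void main(String[] args) {
            Model base = ModelFactory.createDefaultModel(); // stand-in for the real data

            // Full OWL forward/backward rule reasoner (OWLFBRuleReasoner):
            // the most complete, and by far the most expensive over a
            // large base model.
            Reasoner owl = ReasonerRegistry.getOWLReasoner();

            // RDFS rule reasoner (RDFSExptRuleReasoner): much cheaper,
            // often enough when subclass/subproperty/domain/range
            // inference is all that is needed.
            Reasoner rdfs = ReasonerRegistry.getRDFSReasoner();

            InfModel inf = ModelFactory.createInfModel(rdfs, base);
            System.out.println("Entailed statements: " + inf.size());
        }
    }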

The way I reached that conclusion was to kill Fuseki, which had become
non-responsive even for a small query, and restart it with
RDFSExptRuleReasoner (same DB, with many triples). After that, both the
small query and the multi-join query responded quite quickly.

If necessary, I’ll try to throttle the posts, since I’m in complete
control of the submissions.

That should at least prove whether this discussion has correctly diagnosed the interactions leading to the OOME. What we have is ungraceful ("disgraceful") behaviour as the load reaches system saturation. It ought to be more graceful but, fundamentally, it's always going to be possible to flood a system - any system - with more work than it is capable of.
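
If it comes to that, even a crude client-side throttle would test the theory. A sketch only - the method and parameter names here are made up, and the right gap between posts would have to be found by experiment:

    import java.util.List;
    import com.hp.hpl.jena.update.UpdateExecutionFactory;
    import com.hp.hpl.jena.update.UpdateRequest;

    public class ThrottledPoster {
        // Leave a minimum gap between update posts so the server gets
        // quiet moments in which to flush its journal back to the
        // main database.
        static void postThrottled(List<UpdateRequest> pendingUpdates,
                                  String updateEndpoint,
                                  long minGapMillis) throws InterruptedException {
            long last = 0L;
            for (UpdateRequest request : pendingUpdates) {
                long wait = minGapMillis - (System.currentTimeMillis() - last);
                if (wait > 0) Thread.sleep(wait);
                UpdateExecutionFactory.createRemote(request, updateEndpoint).execute();
                last = System.currentTimeMillis();
            }
        }
    }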

        Andy

Thanks,

Mark

On May 30, 2014, at 12:52 PM, Andy Seaborne <a...@apache.org> wrote:

Mark,

How big are the updates?

An SSD for the database and the journal will help.

Every transaction is a commit, and a commit is a disk operation to
ensure the commit record is permanent.  That is not cheap with a
rotational disk (seek time), and much better with an SSD.

If you are driving Fuseki as hard as possible, something will break
- the proposal in JENA-703 amounts to slowing the clients down as
well as being more defensive.

Andy

On 30/05/14 15:39, Rob Vesse wrote:
Mark

This sounds like the same problem described in
https://issues.apache.org/jira/browse/JENA-689

TL;DR

For a system with no quiescent periods that is continually receiving
updates, the in-memory journal continues to expand until an OOM
occurs. There will be little/no data loss because the journal is a
write-ahead log and is written to disk first (you will lose at most
the data from the transaction that encountered the OOM).  Therefore,
once the system is restarted, the journal will be replayed and
flushed.

See https://issues.apache.org/jira/browse/JENA-567 for an
experimental feature that may mitigate this, and see
https://issues.apache.org/jira/browse/JENA-703 for the issue
tracking the work to remove this limitation.

Rob
