[ https://issues.apache.org/jira/browse/JENA-321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13456407#comment-13456407 ]
Andy Seaborne commented on JENA-321:
------------------------------------
Proposal: a start request does not send an event to every graph, only to the
dataset object. I don't think anything relies on GraphStore-triggered events.
For now, until an event model is sorted out, sending no event is better than
committing to a contract we don't want later.
(Ditto for finish request.)
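A minimal sketch of the proposal, to make the difference in cost concrete. All names here (`RequestListener`, `CountingListener`, `SketchGraphStore`) are hypothetical and are not Jena's API; the sketch only assumes that listeners can be attached either to individual graphs or to the dataset object.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical listener type for this sketch; not part of Jena's API.
interface RequestListener {
    void onStartRequest();
    void onFinishRequest();
}

// Hypothetical listener that only counts the notifications it receives.
class CountingListener implements RequestListener {
    int starts = 0;
    int finishes = 0;
    @Override public void onStartRequest() { starts++; }
    @Override public void onFinishRequest() { finishes++; }
}

// Hypothetical stand-in for a graph store that holds named graphs.
class SketchGraphStore {
    // Listeners attached to individual named graphs.
    private final Map<String, List<RequestListener>> graphListeners = new LinkedHashMap<>();
    // Listeners attached to the dataset object itself.
    private final List<RequestListener> datasetListeners = new ArrayList<>();

    void addGraphListener(String graphName, RequestListener listener) {
        graphListeners.computeIfAbsent(graphName, k -> new ArrayList<>()).add(listener);
    }

    void addDatasetListener(RequestListener listener) {
        datasetListeners.add(listener);
    }

    // Proposed behaviour: notify dataset-level listeners only.
    // The named graphs are never enumerated, so the cost does not
    // depend on how many graphs the store holds.
    // Returns the number of notifications sent.
    int startRequest() {
        for (RequestListener l : datasetListeners) {
            l.onStartRequest();
        }
        return datasetListeners.size();
    }

    // Same rule for the end of a request.
    int finishRequest() {
        for (RequestListener l : datasetListeners) {
            l.onFinishRequest();
        }
        return datasetListeners.size();
    }
}
```

In this sketch, a listener attached to a named graph hears nothing at request start or finish, which is exactly the contract change being proposed: anything that wants request boundaries would have to listen on the dataset.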
Long term: a formal contract for the events on a graphstore/dataset and their
relationship to graph events. There are distinct kinds of dataset: ones that
are raw storage and ones that are a collection of graphs. Perfect uniformity
may not make sense, because it is hard/costly to have triple events on quad
actions where the graph is a view of the dataset.
See also the discussion on JENA-189.
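For context, the issue description quoted below says that a quad store has to scan every quad and apply a distinct operation to list its named graphs. A rough sketch of why that is expensive, using hypothetical `SketchQuad` and `GraphNameScan` types rather than TDB's actual classes:

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical quad for this sketch: a graph name plus a triple, as plain strings.
final class SketchQuad {
    final String graph;
    final String subject;
    final String predicate;
    final String object;

    SketchQuad(String graph, String subject, String predicate, String object) {
        this.graph = graph;
        this.subject = subject;
        this.predicate = predicate;
        this.object = object;
    }
}

final class GraphNameScan {
    // Visits every quad (a full scan), projects the graph component,
    // and keeps each distinct name in an in-memory set.
    // Time grows with the number of quads; memory grows with the
    // number of distinct graph names.
    static Set<String> distinctGraphNames(Iterable<SketchQuad> quads) {
        Set<String> seen = new HashSet<>();
        for (SketchQuad q : quads) {
            seen.add(q.graph);
        }
        return seen;
    }
}
```

Running this before every update is the pattern the proposal above avoids: a dataset-level event needs no list of graph names at all.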
> Update notification events are fired on a per Graph basis instead of
> GraphStore
> -------------------------------------------------------------------------------
>
> Key: JENA-321
> URL: https://issues.apache.org/jira/browse/JENA-321
> Project: Apache Jena
> Issue Type: Improvement
> Components: ARQ
> Reporter: Stephen Allen
> Priority: Minor
>
> Before every update operation starts, UpdateEngineMain attempts to fire
> notification events to listeners that an update is about to occur.
> Unfortunately, it tries to fire an event for each named graph in the system.
> Because TDB represents named graphs as quads, the only way to get a list of
> all the named graphs to fire an event for is to perform an entire table scan,
> project just the graph part of the quad and then perform a distinct operation.
> There are a few problems with this approach:
> 1) This is pretty dang inefficient, as the entire database is scanned on
> every update query
> 2) With a large number of named graphs, you have to fire a lot of events,
> which is also inefficient
> 3) If you have a lot of named graphs, the distinct operation has to store
> every graph name in an in-memory hashset
> A user appears to have run into issue 3). The underlying cause seems to be a
> mismatch in the design of the graph notification. This needs to be
> redesigned to fire a single event for the entire graphstore.
> Relevant code in DatasetGraphTDB.java (line 262).
> 14:08:23 WARN Fuseki :: [1] RC = 500 : Java heap space
> java.lang.OutOfMemoryError: Java heap space
> at java.util.HashMap.resize(HashMap.java:462)
> at java.util.HashMap.addEntry(HashMap.java:755)
> at java.util.HashMap.put(HashMap.java:385)
> at java.util.HashSet.add(HashSet.java:200)
> at org.openjena.atlas.iterator.FilterUnique.accept(FilterUnique.java:35)
> at org.openjena.atlas.iterator.Iter$3.hasNext(Iter.java:187)
> at org.openjena.atlas.iterator.Iter.hasNext(Iter.java:825)
> at org.openjena.atlas.iterator.Iter$4.hasNext(Iter.java:295)
> at com.hp.hpl.jena.sparql.modify.GraphStoreUtils.actionAll(GraphStoreUtils.java:43)
> at com.hp.hpl.jena.sparql.modify.GraphStoreUtils.sendToAll(GraphStoreUtils.java:33)
> at com.hp.hpl.jena.sparql.modify.GraphStoreBasic.startRequest(GraphStoreBasic.java:53)
> at com.hp.hpl.jena.sparql.modify.UpdateEngineMain.execute(UpdateEngineMain.java:37)
> at com.hp.hpl.jena.sparql.modify.UpdateProcessorBase.execute(UpdateProcessorBase.java:56)
> at com.hp.hpl.jena.update.UpdateAction.execute$(UpdateAction.java:330)
> at com.hp.hpl.jena.update.UpdateAction.execute(UpdateAction.java:323)
> at com.hp.hpl.jena.update.UpdateAction.execute(UpdateAction.java:303)
> at com.hp.hpl.jena.update.UpdateAction.execute(UpdateAction.java:255)
> at org.apache.jena.fuseki.servlets.SPARQL_Update.execute(SPARQL_Update.java:235)
> at org.apache.jena.fuseki.servlets.SPARQL_Update.executeForm(SPARQL_Update.java:226)
> at org.apache.jena.fuseki.servlets.SPARQL_Update.perform(SPARQL_Update.java:122)
> at org.apache.jena.fuseki.servlets.SPARQL_ServletBase.doCommon(SPARQL_ServletBase.java:97)
> at org.apache.jena.fuseki.servlets.SPARQL_Update.doPost(SPARQL_Update.java:78)
> at javax.servlet.http.HttpServlet.service(HttpServlet.java:727)
> at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
> at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:547)
> at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:480)
> at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:225)
> at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:941)
> at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:409)
> at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:186)
> at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:875)
> at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:117)
> 14:08:23 INFO Fuseki :: [1] 500 Java heap space