[ https://issues.apache.org/jira/browse/JENA-1068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15020533#comment-15020533 ]
Andy Seaborne edited comment on JENA-1068 at 11/21/15 3:59 PM:
---------------------------------------------------------------

{{DatasetImpl}} is the only implementation of {{Dataset}} in the system, though there may be others in extensions elsewhere.

{{DatasetImpl}} caches {{Graph -> Model}}. It makes assumptions about how the underlying {{DatasetGraph}} works, such as that repeated calls of {{getGraph(Node)}} return the same Java object. The caching in {{DatasetGraphCaching}} meant that usually, but not always, the same Java object was returned.

If the {{DatasetGraphCaching}} cache ejected the graph after some time, then the {{DatasetImpl}} cache is affected. Removing the {{DatasetGraphCaching}} caching means {{getGraph(Node)}} is likely to return a different object each time.

The effect of the {{DatasetImpl}} cache is to reduce the cost of creating a model for a given graph. It is not clear that, in current Jena, model creation costs are significant: some simple profiling did not identify anything. In the past, EnhGraph cache initialization could be detectable, but a change in the caching code got rid of that.

To be safe, the default model can be cached, as it always exists. This preserves one corner-case behaviour of the API (caught by a test): when querying a dataset with only a default model, the model used for the SPARQL query is the same object as the model of the returned resources. Object identity is not a good thing to rely on, but there is a test, so keep the behaviour.

Otherwise, remove the {{DatasetImpl}} cache.


was (Author: andy.seaborne):
{{DatasetImpl}} is the only implementation of {{Dataset}} in the system, though there may be others in extensions elsewhere.

{{DatasetImpl}} caches {{Graph -> Model}}. It makes assumptions about how the underlying {{DatasetGraph}} works, such as that repeated calls of {{getGraph(Node)}} return the same Java object. The caching in {{DatasetGraphCaching}} meant that usually, but not always, the same Java object was returned.

If the {{DatasetGraphCaching}} cache ejected the graph after some time, then the {{DatasetImpl}} cache is affected. Removing the {{DatasetGraphCaching}} caching means {{getGraph(Node)}} is likely to return a different object each time.

The effect of the {{DatasetImpl}} cache is to reduce the cost of creating a model for a given graph. It is not clear that, in current Jena, model creation costs are significant: some simple profiling did not identify anything. In the past, EnhGraph cache initialization could be detectable, but a change in the caching code got rid of that.

Therefore, remove the {{DatasetImpl}} cache as well.


> Remove DatasetGraphCaching
> --------------------------
>
>                 Key: JENA-1068
>                 URL: https://issues.apache.org/jira/browse/JENA-1068
>             Project: Apache Jena
>          Issue Type: Improvement
>          Components: ARQ, SDB
>    Affects Versions: Jena 3.0.0
>            Reporter: Andy Seaborne
>            Assignee: Andy Seaborne
>            Priority: Minor
>             Fix For: Jena 3.0.1
>
>
> This class seems to do nothing much but does add complexity. There is
> caching in {{DatasetImpl}} that competes with this caching. Nowadays, Jena's
> storage systems use {{GraphView}}, which is very lightweight, so caching
> the creation of graphs is not beneficial, and any such cache has to be
> concurrent and transaction safe.
> The internal class {{DatasetGraphCaching.Helper}} is used only in SDB.
> {{Helper}} could be moved into SDB.
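
As a rough illustration of the behaviour proposed in the comment (cache only the default model, create every other model on demand), here is a minimal, self-contained Java sketch. The types {{GraphSketch}}, {{ModelSketch}}, {{DatasetGraphSketch}} and {{DatasetSketch}} are hypothetical stand-ins invented for this example. They are not the Jena classes, and the actual change to {{DatasetImpl}} may well look different.

```java
// Stand-in types for illustration only; these are NOT the Jena classes.

// A lightweight graph view; a null name stands for the default graph.
final class GraphSketch {
    final String name;
    GraphSketch(String name) { this.name = name; }
}

// A model is a thin wrapper around a graph.
final class ModelSketch {
    final GraphSketch graph;
    ModelSketch(GraphSketch graph) { this.graph = graph; }
}

// Hypothetical storage layer without graph caching:
// every call hands out a fresh view object.
final class DatasetGraphSketch {
    GraphSketch getDefaultGraph() { return new GraphSketch(null); }
    GraphSketch getGraph(String name) { return new GraphSketch(name); }
}

// Hypothetical dataset wrapper: only the default model is cached.
final class DatasetSketch {
    private final DatasetGraphSketch dsg;
    private ModelSketch defaultModel; // created lazily, then reused

    DatasetSketch(DatasetGraphSketch dsg) { this.dsg = dsg; }

    // The default graph always exists, so caching its model is safe
    // and keeps object identity stable across calls.
    synchronized ModelSketch getDefaultModel() {
        if (defaultModel == null) {
            defaultModel = new ModelSketch(dsg.getDefaultGraph());
        }
        return defaultModel;
    }

    // Named models are not cached: each call wraps a fresh graph view,
    // so callers must not rely on object identity here.
    ModelSketch getNamedModel(String name) {
        return new ModelSketch(dsg.getGraph(name));
    }
}
```

In this sketch, two calls to {{getDefaultModel()}} return the same object, while two calls to {{getNamedModel("g")}} return distinct objects over the same graph name, so code comparing named models with {{==}} would break.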