[ 
https://issues.apache.org/jira/browse/JENA-1068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15020533#comment-15020533
 ] 

Andy Seaborne edited comment on JENA-1068 at 11/21/15 3:59 PM:
---------------------------------------------------------------

{{DatasetImpl}} is the only implementation of {{Dataset}} in the system - there 
may be others as extensions somewhere.

{{DatasetImpl}} caches {{Graph -> Model}}.  It makes assumptions about how 
the underlying {{DatasetGraph}} works, for example that repeated calls of 
{{getGraph(Node)}} return the same Java object.

The caching in {{DatasetGraphCaching}} meant that usually, but not always, the 
same Java object was returned. If the {{DatasetGraphCaching}} cache evicted 
the graph after some time, then {{DatasetImpl}} was affected. 
Removing the caching means it is likely to be a different object each time.
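
A minimal sketch of why a {{Graph -> Model}} cache depends on object identity. 
All of the types below ({{SketchGraph}}, {{SketchModel}}, {{SketchDatasetGraph}}, 
{{SketchDataset}}) are hypothetical stand-ins written for this illustration; 
they are not the Jena classes and do not reproduce the actual {{DatasetImpl}} 
code. The assumption is that the underlying store hands out a fresh graph 
view on every call.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins for illustration only; these are NOT the Jena classes.
final class SketchGraph {
    final String name;
    SketchGraph(String name) { this.name = name; }
    // equals/hashCode are not overridden, so two views of the same
    // named graph are different map keys.
}

final class SketchModel {
    final SketchGraph graph;
    SketchModel(SketchGraph graph) { this.graph = graph; }
}

// Assumption: the store returns a new lightweight view on each call.
final class SketchDatasetGraph {
    SketchGraph getGraph(String name) { return new SketchGraph(name); }
}

// A Graph -> Model cache keyed on the graph object.
final class SketchDataset {
    private final SketchDatasetGraph dsg;
    private final Map<SketchGraph, SketchModel> cache = new HashMap<>();
    int modelsCreated = 0;

    SketchDataset(SketchDatasetGraph dsg) { this.dsg = dsg; }

    SketchModel getNamedModel(String name) {
        SketchGraph g = dsg.getGraph(name);
        // Only hits if the very same graph object was seen before.
        return cache.computeIfAbsent(g, k -> {
            modelsCreated++;
            return new SketchModel(k);
        });
    }

    int cacheSize() { return cache.size(); }
}
```

Under that assumption, two calls of {{getNamedModel}} with the same name 
create two models and leave two entries in the map: the cache never hits 
and only grows.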

The effect of the {{DatasetImpl}} cache is to reduce costs in model creation 
for a given graph.  It is not clear that, in current Jena, Model costs are 
significant - some simple profiling did not identify anything. In the past, 
EnhGraph cache initialization could be detectable but a change in caching code 
got rid of that.

To be safe, the default model can be cached as it always exists. This 
preserves one corner-case behaviour of the API (caught by a test): the 
model used for querying with SPARQL is the same object as the model of the 
returned resources when querying a dataset with only a default model. 
Object identity is not a good thing to rely on, but there is a test, so keep 
the behaviour.
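
A rough sketch of the proposed shape: keep one cached default model and 
create named models on demand. {{DemoGraph}}, {{DemoModel}}, 
{{DemoDatasetGraph}} and {{DemoDataset}} are hypothetical names used only 
for this illustration, not the Jena API, and the sketch ignores 
concurrency and transactions.

```java
// Hypothetical stand-ins for illustration only; these are NOT the Jena classes.
final class DemoGraph {
    final String name;
    DemoGraph(String name) { this.name = name; }
}

final class DemoModel {
    final DemoGraph graph;
    DemoModel(DemoGraph graph) { this.graph = graph; }
}

// Assumption: every call returns a fresh lightweight view.
final class DemoDatasetGraph {
    DemoGraph getDefaultGraph() { return new DemoGraph("default"); }
    DemoGraph getGraph(String name) { return new DemoGraph(name); }
}

final class DemoDataset {
    private final DemoDatasetGraph dsg;
    // The default graph always exists, so its model can be kept.
    private DemoModel defaultModel;

    DemoDataset(DemoDatasetGraph dsg) { this.dsg = dsg; }

    // Same object on every call (not thread-safe in this sketch).
    DemoModel getDefaultModel() {
        if (defaultModel == null) {
            defaultModel = new DemoModel(dsg.getDefaultGraph());
        }
        return defaultModel;
    }

    // No cache: a new model wrapper on every call.
    DemoModel getNamedModel(String name) {
        return new DemoModel(dsg.getGraph(name));
    }
}
```

With this shape, code that compares default models by identity keeps 
working, while named models carry no identity guarantee.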

Otherwise, remove the {{DatasetImpl}} cache.  


was (Author: andy.seaborne):
{{DatasetImpl}} is the only implementation of {{Dataset}} in the system - there 
may be others as extensions somewhere.

{{DatasetImpl}} caches {{Graph -> Model}}.  It makes assumptions about how 
the underlying {{DatasetGraph}} works, for example that repeated calls of 
{{getGraph(Node)}} return the same Java object.

The caching in {{DatasetGraphCaching}} meant that usually, but not always, the 
same Java object was returned. If the {{DatasetGraphCaching}} cache evicted 
the graph after some time, then {{DatasetImpl}} was affected. 
Removing the caching means it is likely to be a different object each time.

The effect of the {{DatasetImpl}} cache is to reduce costs in model creation 
for a given graph.  It is not clear that, in current Jena, Model costs are 
significant - some simple profiling did not identify anything. In the past, 
EnhGraph cache initialization could be detectable but a change in caching code 
got rid of that.

Therefore, remove the {{DatasetImpl}} cache as well.

> Remove DatasetGraphCaching
> --------------------------
>
>                 Key: JENA-1068
>                 URL: https://issues.apache.org/jira/browse/JENA-1068
>             Project: Apache Jena
>          Issue Type: Improvement
>          Components: ARQ, SDB
>    Affects Versions: Jena 3.0.0
>            Reporter: Andy Seaborne
>            Assignee: Andy Seaborne
>            Priority: Minor
>             Fix For: Jena 3.0.1
>
>
> This class seems to do nothing much but does add complexity.  There is 
> caching in {{DatasetImpl}} that competes with this caching.  Nowadays, Jena's 
> storage systems use {{GraphView}}, which is very lightweight, so caching 
> the creation of graphs is not beneficial, and the cache has to be concurrent 
> and transaction safe.
> The internal class {{DatasetGraphCaching.Helper}} is used only in SDB.  
> {{Helper}} could be moved into SDB.



