[jira] [Commented] (JENA-626) SPARQL Query Caching

ASF GitHub Bot (JIRA) Fri, 18 Dec 2015 06:33:06 -0800

    [ 
https://issues.apache.org/jira/browse/JENA-626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15064001#comment-15064001
 ]


ASF GitHub Bot commented on JENA-626:
-------------------------------------

Github user afs commented on the pull request:

    https://github.com/apache/jena/pull/95#issuecomment-165791089
  
    Thank you for pushing on with this.   It's looking good and nicely targeted 
at just the Fuseki codebase.
    
    I've cleaned up the white space in the master branch of Jena for the files 
this PR touches, which will make comparison easier.
    
    Some design discussion points:
    
    **Cache per dataset**
    
    There is currently one cache for the system.  This should be one cache per dataset, 
so there will need to be
    a map `ConcurrentHashMap<String, Cache>` where the string is the service URI.
    
    `getCache()` becomes `getCache(uri)`.
    
    Now when an update happens, only the cache for the dataset needs to be 
invalidated.
    It also leaves open the possibility of different cache settings for 
different datasets.
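
    A minimal sketch of the per-dataset map, not the PR's code: `DatasetCaches` 
    and `invalidate` are hypothetical names, and a plain map stands in for 
    whatever `Cache` type the PR uses.

    ```java
    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    // Hypothetical registry: one cache per dataset, keyed by the service URI.
    class DatasetCaches {
        // Stand-in cache type (assumption): cache key -> cached response.
        private final Map<String, Map<Object, Object>> caches = new ConcurrentHashMap<>();

        // getCache() becomes getCache(uri); the cache is created on first use.
        Map<Object, Object> getCache(String serviceUri) {
            return caches.computeIfAbsent(serviceUri, uri -> new ConcurrentHashMap<>());
        }

        // Called after an update: only the cache for that dataset is cleared.
        void invalidate(String serviceUri) {
            Map<Object, Object> cache = caches.get(serviceUri);
            if (cache != null) {
                cache.clear();
            }
        }
    }
    ```

    Because each dataset has its own entry, per-dataset settings could later be 
    applied where the cache is created.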
    
    **No Cache**
    
    What happens if `SPARQL_Query` (SPARQL_QueryDataset) is used directly?
    In other words, do things work for a dataset if called directly, not via 
SPARQL_Query_Cache?
    
    Then whether the cache is used can be as simple as switching the SPARQL_* 
implementation.
    
    **Cache key**
    
    Can the cache key be simply based on the query object+responsetype?  
Queries are equal if they are the same structure.
    
    See org.apache.jena.atlas.lib.Pair as an example of a class that is good as 
a key
    because it provides `.hashCode` and `.equals` based on its structure, not 
its object identity.
    It devolves work to the two elements.
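
    As an illustration only, a key in that style might look like the sketch 
    below. `CacheKey` is a hypothetical class; the query is typed as `Object` 
    and the response type as `String` purely to keep the example self-contained.

    ```java
    import java.util.Objects;

    // Hypothetical cache key: equality is structural, not by object identity.
    final class CacheKey {
        private final Object query;         // stand-in for the query object
        private final String responseType;  // stand-in for the response type

        CacheKey(Object query, String responseType) {
            this.query = query;
            this.responseType = responseType;
        }

        @Override
        public boolean equals(Object other) {
            if (this == other) return true;
            if (!(other instanceof CacheKey)) return false;
            CacheKey that = (CacheKey) other;
            // Devolve the comparison to the two elements.
            return Objects.equals(query, that.query)
                && Objects.equals(responseType, that.responseType);
        }

        @Override
        public int hashCode() {
            return Objects.hash(query, responseType);
        }
    }
    ```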
    
    **Inherit to share code**
    
    SPARQL_Query_Cache seems to have a lot of the operations of SPARQL_Query.  
Can the code be DRYed (DRY - Don't Repeat Yourself)?
    
    It looks to me like SPARQL_Query_Cache can share with SPARQL_Query. It 
needs to do that by extending SPARQL_QueryDataset, then
    overriding `execute()` (which will need to be made protected in 
SPARQL_Query), then adding generateKey, getKey, getCache.
    
    To call `SPARQL_Query` from `SPARQL_Query_Cache` on the no-cache-hit path, 
instead of:
    
    ```
    ActionSPARQL queryServlet = new SPARQL_QueryDataset() ;
    queryServlet.executeLifecycle(action) ;
    ```
    use
    ```
        super.execute(....)
    ```
    That way, a new `SPARQL_QueryDataset` does not get created each time.
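
    The shape of that inheritance can be sketched as below. The class names 
    and the `execute(String)` signature are hypothetical stand-ins; the real 
    `execute()` arguments in Fuseki are not shown here.

    ```java
    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    // Stand-in for SPARQL_QueryDataset (assumption: simplified signature).
    class QueryDatasetSketch {
        int executions = 0;

        // Made protected so that a subclass can override and reuse it.
        protected String execute(String queryString) {
            executions++;
            return "results for " + queryString;
        }
    }

    // Stand-in for SPARQL_Query_Cache: extends the query servlet, adds a cache.
    class QueryCacheSketch extends QueryDatasetSketch {
        private final Map<String, String> cache = new ConcurrentHashMap<>();

        @Override
        protected String execute(String queryString) {
            String cached = cache.get(queryString);
            if (cached != null) {
                return cached;                            // cache hit
            }
            String result = super.execute(queryString);   // no-cache-hit path
            cache.put(queryString, result);
            return result;
        }
    }
    ```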



> SPARQL Query Caching
> --------------------
>
>                 Key: JENA-626
>                 URL: https://issues.apache.org/jira/browse/JENA-626
>             Project: Apache Jena
>          Issue Type: Improvement
>            Reporter: Andy Seaborne
>            Assignee: Saikat Maitra
>              Labels: java, linked_data, rdf, sparql
>
> Add a caching layer to Fuseki to cache the results of SPARQL Query requests.  
> This cache should allow for in-memory and disk-based caching, configuration 
> and cache management, and coordination with data modification.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
