Hello Andy, Osma,

Thank you for your feedback.

I am also more inclined towards option 2, implementing the SPARQL_Cache
servlet as part of Fuseki. Having access to Fuseki's internals and the
admin console makes a lot of sense.

I really like the idea of selective CacheEntry invalidation at the graph
or triple level. I will think about it in more detail for the implementation.
As of now, each CacheEntry is set to expire after 5 minutes.
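
To make the combination of a time-based expiry and graph-level
invalidation concrete, here is a minimal sketch. It is not Fuseki code:
the class SparqlCacheSketch and its methods are hypothetical names, and
it assumes that the set of graphs a query read from is known when its
result is stored, which is the hard part in practice.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch of a query-result cache; not part of Fuseki. */
final class SparqlCacheSketch {
    /** Entries expire 5 minutes after they are stored. */
    static final long TTL_MILLIS = 5 * 60 * 1000L;

    /** A cached result plus the graphs it was computed from (assumed known). */
    static final class CacheEntry {
        final String result;
        final Set<String> graphs;
        final long createdAt;

        CacheEntry(String result, Set<String> graphs, long createdAt) {
            this.result = result;
            this.graphs = new HashSet<>(graphs); // defensive copy
            this.createdAt = createdAt;
        }

        boolean isExpired(long now) {
            return now - createdAt >= TTL_MILLIS;
        }
    }

    private final Map<String, CacheEntry> entries = new ConcurrentHashMap<>();

    /** Stores a result keyed by the query string. */
    void put(String query, String result, Set<String> graphs, long now) {
        entries.put(query, new CacheEntry(result, graphs, now));
    }

    /** Returns the cached result, or null on a miss or an expired entry. */
    String get(String query, long now) {
        CacheEntry e = entries.get(query);
        if (e == null) {
            return null;
        }
        if (e.isExpired(now)) {
            entries.remove(query, e); // drop only this stale entry
            return null;
        }
        return e.result;
    }

    /** Called when an update touches one graph: drop only dependent entries. */
    void invalidateGraph(String graphUri) {
        entries.values().removeIf(e -> e.graphs.contains(graphUri));
    }

    /** Fallback when the affected graphs are unknown: flush everything. */
    void invalidateAll() {
        entries.clear();
    }

    int size() {
        return entries.size();
    }
}
```

The time is passed in as a parameter only to keep the sketch easy to
test; a servlet would more likely call System.currentTimeMillis().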

Regards
Saikat

On Fri, Mar 13, 2015 at 4:33 PM, Osma Suominen <osma.suomi...@helsinki.fi>
wrote:

> On 13/03/15 12:37, Andy Seaborne wrote:
>
>> To check my understanding, let me put that in my own words: This is
>> essentially fronting the Fuseki server with another webapp that
>> intercepts all requests to the Fuseki server.  ("All requests" because
>> it has to see update to manage the cache).
>>
> [...]
>
>> Both will work - personally, I'd go for option 2 because the
>> Sparql_Cache servlet can have detailed access to the internals of
>> Fuseki, for example the service registry and the admin interface.  It
>> can be integrated into the UI (in Fuseki2) so the admin console can be
>> extended to manage the cache.  See the stats page in the Fuseki2 UI.  If
>> it's external, the management will be a separate function.
>>
>
> I agree with Andy. I don't see much of a benefit for option 1 (an external
> service) since you can already deploy a reverse proxy such as Varnish or
> nginx in front of Fuseki and get pretty much the same benefits. See my
> quick tutorial for installing Varnish in front of Fuseki [1]. (This simple
> configuration will not cache POST requests, but it is possible to tweak
> Varnish to cache those as well.)
>
> The problem with an external cache is that it cannot easily be made aware
> of changes in the data. So when anything in the data changes, you have to
> arrange to flush the cache - in most cases the entire cache, since it's
> very difficult to know whether it has affected the results of a particular
> query.
>
> Where I do see a benefit is option 2, which is integrated into the
> Fuseki servlet and is aware of updates. Even just flushing the entire
> cache when an update comes in would be an improvement on using an external
> cache, since at least there would be no possibility of stale data being
> served. But if the cache could somehow be made aware of which parts of the
> data (perhaps on a graph level, if not triple level) contributed to the
> cached results of a particular query, it could perhaps do even smarter
> invalidation and not throw away everything when a single triple changes.
>
> -Osma
>
>
> [1] https://github.com/NatLibFi/Skosmos/wiki/FusekiTuning#http-caching
>
>
>
> --
> Osma Suominen
> D.Sc. (Tech), Information Systems Specialist
> National Library of Finland
> P.O. Box 26 (Teollisuuskatu 23)
> 00014 HELSINGIN YLIOPISTO
> Tel. +358 50 3199529
> osma.suomi...@helsinki.fi
> http://www.nationallibrary.fi
>
