Hello Andy, Osma,

Thank you for your feedback.
I am also inclined towards option 2: implementing the SPARQL_Cache servlet as part of Fuseki. Having access to Fuseki's internals and the admin console makes a lot of sense.

I really like the idea of selective CacheEntry invalidation at the graph or triple level; I will think about it in more detail for the implementation. As of now, each CacheEntry is set to expire after 5 minutes.

Regards
Saikat

On Fri, Mar 13, 2015 at 4:33 PM, Osma Suominen <osma.suomi...@helsinki.fi> wrote:

> On 13/03/15 12:37, Andy Seaborne wrote:
>
>> To check my understanding, let me put that in my own words: This is
>> essentially fronting the Fuseki server with another webapp that
>> intercepts all requests to the Fuseki server. ("All requests" because
>> it has to see update to manage the cache).
>>
> [...]
>
>> Both will work - personally, I'd go for option 2 because the
>> Sparql_Cache servlet can have detailed access to the internals of
>> Fuseki, for example the service registry and the admin interface. It
>> can be integrated into the UI (in Fuseki2) so the admin console can be
>> extended to manage the cache. See the stats page in the Fuseki2 UI. If
>> it's external, the management will be a separate function.
>>
>
> I agree with Andy. I don't see much of a benefit for option 1 (an external
> service) since you can already deploy a reverse proxy such as Varnish or
> nginx in front of Fuseki and get pretty much the same benefits. See my
> quick tutorial for installing Varnish in front of Fuseki [1]. (this simple
> configuration will not cache POST requests, but it is possible to tweak
> Varnish to cache those as well)
>
> The problem with an external cache is that it cannot easily be made aware
> of changes in the data. So when anything in the data changes, you have to
> arrange to flush the cache - in most cases the entire cache, since it's
> very difficult to know whether it has affected the results of a particular
> query.
>
> Where I can see the benefit is basically option 2 that is integrated to
> the Fuseki servlet and is aware of updates. Even just flushing the entire
> cache when an update comes in would be an improvement on using an external
> cache, since at least there would be no possibility of stale data being
> served. But if the cache could somehow be made aware of which parts of the
> data (perhaps on a graph level, if not triple level) contributed to the
> cached results of a particular query, it could perhaps do even smarter
> invalidation and not throw away everything when a single triple changes.
>
> -Osma
>
>
> [1] https://github.com/NatLibFi/Skosmos/wiki/FusekiTuning#http-caching
>
>
>
> --
> Osma Suominen
> D.Sc. (Tech), Information Systems Specialist
> National Library of Finland
> P.O. Box 26 (Teollisuuskatu 23)
> 00014 HELSINGIN YLIOPISTO
> Tel. +358 50 3199529
> osma.suomi...@helsinki.fi
> http://www.nationallibrary.fi
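
To make the graph-level invalidation idea from the thread concrete, here is a rough sketch of how a cache entry could combine a fixed expiry (the 5 minutes mentioned above) with per-graph invalidation. This is only an illustration under assumptions, not the actual servlet code: the class name SparqlResultCache, its methods, and the idea that the caller supplies the set of graph URIs a query read from are all hypothetical, and nothing here uses Fuseki's own API. Working out which graphs contributed to a result is the hard part and is not addressed.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;

/** Hypothetical sketch of a query-result cache; not part of Fuseki. */
final class SparqlResultCache {

    /** One cached result plus the graphs it is assumed to depend on. */
    private static final class CacheEntry {
        final String result;
        final Set<String> graphs;
        final long expiresAtMillis;

        CacheEntry(String result, Set<String> graphs, long expiresAtMillis) {
            this.result = result;
            this.graphs = graphs;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<String, CacheEntry> entries = new ConcurrentHashMap<>();
    private final long ttlMillis;
    private final LongSupplier clock; // injectable so expiry can be tested

    SparqlResultCache(long ttlMillis, LongSupplier clock) {
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    /** Stores a result keyed by query text, tagged with the graphs it read. */
    void put(String query, String result, Set<String> graphs) {
        long expiresAt = clock.getAsLong() + ttlMillis;
        entries.put(query, new CacheEntry(result, Set.copyOf(graphs), expiresAt));
    }

    /** Returns the cached result, or null if it is absent or has expired. */
    String get(String query) {
        CacheEntry entry = entries.get(query);
        if (entry == null) {
            return null;
        }
        if (clock.getAsLong() >= entry.expiresAtMillis) {
            entries.remove(query, entry); // drop the stale entry lazily
            return null;
        }
        return entry.result;
    }

    /** On an update to one graph, drop only the entries that read that graph. */
    void invalidateGraph(String graphUri) {
        entries.values().removeIf(entry -> entry.graphs.contains(graphUri));
    }

    /** Fallback when the affected graphs are unknown: flush everything. */
    void invalidateAll() {
        entries.clear();
    }

    int size() {
        return entries.size();
    }
}
```

With something like this, an update to one graph would call invalidateGraph with that graph's URI, while an update whose scope cannot be determined would fall back to invalidateAll, which is the "flush the entire cache" behaviour Osma describes as the baseline.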