Re: hot swap tdb behind fuseki

Andy Seaborne Sun, 15 Feb 2015 11:35:17 -0800

On 13/02/15 17:05, Paul Tyson wrote:

On Fri, 2015-02-13 at 16:49 +0000, Andy Seaborne wrote:

On 12/02/15 22:09, Paul Tyson wrote:

On Wed, 2015-02-11 at 10:53 +0000, Andy Seaborne wrote:

Paul,


You can add a new, pre-built database to a running Fuseki2 server with a
new name but you can't hot swap an existing name at the moment in Fuseki
itself (1 or 2).


I built jena-fuseki2 out of the development repository.


Fine if you want to do that but there are nightly builds for convenience
(these are not releases):

https://repository.apache.org/content/repositories/snapshots/org/apache/jena/jena-fuseki/


Ah, I didn't know about that (no link on downloads page). I did it the
hard way, but learned a few things.

Fuseki2 has not been formally released - that's a development snapshot built from source each night or so.

The download page is for things formally voted on, PGP-signed, and archived by the Apache foundation.

I don't see
anything in the administrative console about adding a new tdb location.
Is that what the "Add persistent dataset" feature is supposed to do?


Yes.

That didn't work in my set-up. It didn't add the new name to the list of
existing datasets (it did however seem to reserve the name, because I
couldn't add the same name again).


Hmm - it does for me (Firefox, Ubuntu, latest development code).

"add new dataset"
    check persistent - add name - press "create dataset"

and it jumps to the "existing datasets" which show the new dataset.
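
For scripted set-ups, the same "add persistent dataset" step can probably be driven over HTTP instead of through the browser. The sketch below only builds the request and does not send it. It assumes the Fuseki 2 administration endpoint `/$/datasets` with form parameters `dbName` and `dbType`, which may differ in a given development snapshot; the server URL and the dataset name are hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_add_dataset_request(server_url: str, name: str) -> Request:
    """Build (but do not send) a request asking Fuseki 2 to create a TDB dataset.

    Assumption: the admin endpoint is <server>/$/datasets and accepts the
    form fields dbName and dbType. Check this against the running server.
    """
    body = urlencode({"dbName": name, "dbType": "tdb"}).encode("ascii")
    return Request(
        server_url.rstrip("/") + "/$/datasets",
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


# Hypothetical server address and dataset name.
request = build_add_dataset_request("http://localhost:3030/", "ds2")
print(request.get_method(), request.full_url)
```

Passing the result to `urllib.request.urlopen` would perform the call; a secured server would also need credentials, which this sketch leaves out.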

(maybe you have cached old javascript because IIRC this is a fixed bug -
but I have had a lot of problems with aggressive caching by browsers -
control-F5 may not fix it)


OK, probably a crippled browser (icecat). I saw the new tdb directories
in run/databases, and when I restarted fuseki the new datasets appeared.


Good!

Even aside from the name reuse issue (solvable), it is stopping use of
the old database that is a problem on Windows, and this is not
fixable; Java lacks the capability [1].  It needs a JVM restart to
release the old files in that case.  It works on Linux.


I'm running it on Linux. Are you saying that if the jena-fuseki2 code
allowed reuse of the dataset name, you could rebind a new tdb location
to the same name without restarting the server? (Or, more generally,
just refresh the data from the existing location, to account for
filesystem links that may have changed.)


It should allow it - it currently doesn't - that is JENA-869. When I last
looked at the innards, the internal management of the name had
got messy/unclear.  Sorry - plain old messy bug.


I'm thinking the balanced reverse proxy setup would be the best
solution, as you indicated previously. Thanks for the help.

Regards,
--Paul


        Andy


Thanks,
--Paul


A proper fix to all this isn't likely before Fuseki 2.0.0 -- it would
allow an external name to be reused with a different database location
which half fixes the problem.  At least the internal caches of a
database go away.

Your idea of having multiple servers and a load balancer (or a
reverse proxy) to maintain the external valid URL and switch between
them is a good one.  At Epimorphics, we do this in production - we
typically have two copies of the database, two servers, for fault
tolerance reasons and for load spikes.  We swap servers by changing
the load balancers, or a moral equivalent if the backends are stacks of
several machines (sometimes, we run presentation on a different machine
to the database - it depends on the scale of the system).
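
The switching idea can be sketched independently of any particular load balancer: one stable external name, several backends, and a single pointer saying which backend is live. This is a minimal illustration, not the Epimorphics set-up; the class name, backend labels, and URLs are all hypothetical.

```python
import threading


class BackendSwitch:
    """Track which backend server currently answers for the public URL."""

    def __init__(self, backends: dict, active: str):
        if active not in backends:
            raise KeyError(f"unknown backend: {active}")
        self._backends = dict(backends)
        self._active = active
        self._lock = threading.Lock()

    def active_url(self) -> str:
        """Return the URL that requests should be forwarded to right now."""
        with self._lock:
            return self._backends[self._active]

    def switch_to(self, name: str) -> str:
        """Make another backend live and return the name of the old one."""
        with self._lock:
            if name not in self._backends:
                raise KeyError(f"unknown backend: {name}")
            previous, self._active = self._active, name
            return previous


# Hypothetical backends: "b" has been restarted on the newly built database.
switch = BackendSwitch(
    {"a": "http://10.0.0.1:3030/ds", "b": "http://10.0.0.2:3030/ds"},
    active="a",
)
old = switch.switch_to("b")
print(old, "->", switch.active_url())
```

After the switch, the old backend no longer receives traffic and can be restarted on the next database copy, which keeps the restart invisible to clients.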

        Andy

[1]
http://bugs.java.com/view_bug.do?bug_id=4724038
and several others.

On 11/02/15 00:07, Paul Tyson wrote:

On Tue, 2015-02-10 at 23:38 +0000, Stian Soiland-Reyes wrote:

Are you using the tdb to swap just for reading, or would you need to
synchronize transactions?

Below I'll assume you mean 'reading', and that you want to swap
because you have a 'newer' tdb store from "somewhere".


Yes, that's the use case exactly.



With fuseki2 (I have not checked with Fuseki 1 which only has a REST
interface) you can "hot add" a new tdb store in the web interface. If
the tdb directory with the given name already exists under
/etc/fuseki/databases it will be re-used and made live immediately.


I currently use fuseki 1 but was looking to upgrade to fuseki2 anyway,
so this sounds like it will solve my problem.

In my setup I use this in combination with data loading, so that I can
load "offline" with tdbloader2 and then immediately make it live in an
existing running Fuseki 2.


I create the new tdb (using tdbloader) and then update a symlink to
target the new tdb. Currently I restart fuseki and the new tdb is read
from the symlink. I want to eliminate the restart. It sounds like the
fuseki2 web interface will allow this.
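
The symlink update described above can be done so that the link never dangles: create a temporary link next to the real one, then rename it over the old link. This is a sketch for POSIX systems with hypothetical paths; it only repoints the link, and a running Fuseki still has to be restarted (or otherwise told) before it reads the new location.

```python
import os
import tempfile


def swap_symlink(link_path: str, new_target: str) -> None:
    """Repoint link_path at new_target by renaming a temporary link over it."""
    # The temporary link lives beside the real one so the rename stays on
    # a single filesystem.
    tmp_link = f"{link_path}.tmp-{os.getpid()}"
    os.symlink(new_target, tmp_link)
    try:
        # Renaming over an existing symlink replaces the link itself,
        # not the directory it points to.
        os.replace(tmp_link, link_path)
    except OSError:
        os.unlink(tmp_link)
        raise


# Demonstration in a scratch directory with hypothetical database names.
with tempfile.TemporaryDirectory() as root:
    old_db = os.path.join(root, "tdb-old")
    new_db = os.path.join(root, "tdb-new")
    os.mkdir(old_db)
    os.mkdir(new_db)
    current = os.path.join(root, "current")
    os.symlink(old_db, current)
    swap_symlink(current, new_db)
    print(os.path.basename(os.readlink(current)))
```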


There's unfortunately an open issue in the web interface with removing
and adding a store with the same name -
https://issues.apache.org/jira/browse/JENA-869 - so if you try this
now with the current SNAPSHOT of Fuseki 2 you would have to make a new
database name for every swap and copy the tdb store into that before
adding it in the user interface.  You could probably hide/simplify
that name from the URI with a simple Apache httpd ProxyPass or
RewriteRule.
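
The "new database name for every swap" workaround can be scripted roughly as follows: copy the freshly loaded tdb store into the server's databases directory under a name made unique with a timestamp. The function name, the naming scheme, and the paths are hypothetical, and the copy still has to be added to Fuseki afterwards.

```python
import shutil
import tempfile
import time
from pathlib import Path
from typing import Optional


def stage_copy(source_tdb: str, databases_dir: str, base_name: str,
               stamp: Optional[str] = None) -> Path:
    """Copy a tdb directory to <databases_dir>/<base_name>-<stamp>."""
    stamp = stamp or time.strftime("%Y%m%d%H%M%S")
    target = Path(databases_dir) / f"{base_name}-{stamp}"
    if target.exists():
        # Never overwrite a store that a running server may be using.
        raise FileExistsError(str(target))
    shutil.copytree(source_tdb, target)
    return target


# Demonstration with scratch directories standing in for real stores.
with tempfile.TemporaryDirectory() as root:
    source = Path(root) / "loaded"
    source.mkdir()
    (source / "placeholder.dat").write_bytes(b"")
    databases = Path(root) / "databases"
    databases.mkdir()
    staged = stage_copy(str(source), str(databases), "ds", stamp="20150215")
    print(staged.name)
```

A proxy rule in front of the server would then map the stable public name onto whichever timestamped name is current.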


Thanks for the pointers and warning. I'll see if I can work it out.

Regards,
--Paul




On 10 February 2015 at 19:19, Paul Tyson <phty...@sbcglobal.net> wrote:

I've looked through the user documentation but did not find a clue to
this problem. I have not dug too deeply into the code.

The problem is to safely re-initialize a running fuseki server to read a
new tdb location.

I've thought of using 2 (or more) jetty or tomcat workers in a
load-balancing configuration, which would allow staged restarts. But
before I go there I thought I would ask if there is an easier way.

Does anyone have a usage pattern for this, or can point me to some
documentation or classes that would get me started?

Thanks,
--Paul
