Re: DROP ALL behavior

Andy Seaborne Fri, 05 Sep 2014 14:13:54 -0700

On 05/09/14 12:58, Hugh Cayless wrote:

Hello all,


I've used Jena-Fuseki previously, but when I needed to reload all my data
(I'm using TDB), I've generally erased the contents of my data directory
and recreated it because it's faster than dropping the graph. I'm noticing
now though that if I issue a SPARQL DROP ALL update, the graph does indeed
get dropped, but if I check the size of my data directory, it's the same as
it was. When my data gets added back, the data directory gets that much
larger, eventually causing me to run out of free space on the volume.

Is there some sort of vacuum procedure I need to run to clear the stale
data? Or a reset command that will restore the contents of the data
directory to its default, empty state? It would be nice to be able to do
this without stopping Fuseki, as it will be serving other databases besides
the one I'm currently messing with.

Thanks,
Hugh


Hugh,

Space is not recycled back to the OS, so files do not get smaller. Space is partially reused, but it could be better.

The node table is not cleaned up. NodeIds are reused if RDF data is added again with the same URIs or literals. BNodes will likely be fresh ones, so they do waste space in the node table. The cost of reference counting node usage would be very high.
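
As a rough illustration of the node table point above, here is a toy model in Python. It is not TDB's code or on-disk layout; the class `ToyNodeTable` and its methods are invented for this sketch. It only shows why reloading the same URIs and literals costs no new entries, while blank nodes add entries on every load.

```python
import itertools


class ToyNodeTable:
    """Append-only term -> id table; entries are never removed (toy model)."""

    def __init__(self):
        self._ids = {}
        self._bnode_counter = itertools.count()

    def node_id(self, term):
        # The same URI or literal always maps to the id it was first given.
        if term not in self._ids:
            self._ids[term] = len(self._ids)
        return self._ids[term]

    def fresh_bnode(self):
        # Each load mints new blank node labels, so each one is a new entry.
        label = f"_:b{next(self._bnode_counter)}"
        return self.node_id(label)

    def __len__(self):
        return len(self._ids)
```

Loading `http://example.org/a` plus one blank node, dropping the triples, and loading the same data again leaves this table with three entries: the URI once and two blank nodes.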

In the indexes, space should be reused, but it isn't reused as well as it should be, and it's only reused within the same JVM run. A restart loses the chance to reuse the space.
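
To make the restart effect concrete, here is another toy model in Python. It assumes, purely for illustration, that freed blocks are tracked only in memory; `ToyBlockFile` is a made-up name and this is not how TDB's index files are actually implemented.

```python
class ToyBlockFile:
    """File of fixed-size blocks with an in-memory free list (toy model)."""

    def __init__(self, num_blocks=0):
        self.num_blocks = num_blocks  # survives a restart: the file length
        self._free = []               # lost on restart: held in memory only

    def allocate(self):
        # Prefer a freed block; otherwise grow the file by one block.
        if self._free:
            return self._free.pop()
        block = self.num_blocks
        self.num_blocks += 1
        return block

    def release(self, block):
        self._free.append(block)

    def restart(self):
        """Simulate a new JVM run: same file length, empty free list."""
        return ToyBlockFile(self.num_blocks)
```

In this model, allocating three blocks, releasing them and allocating three more keeps the file at three blocks. Releasing them, calling `restart()` and then allocating three blocks grows it to six.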


I'm afraid the only reset is to stop the server and delete the files.
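
A minimal sketch of that reset in Python, using only the standard library. It assumes the server has already been stopped; the function names are invented here and the directory you pass in is whatever your TDB location happens to be. Nothing in it is a Fuseki or TDB API.

```python
import shutil
from pathlib import Path


def dir_size_bytes(path):
    """Total size in bytes of all regular files under path."""
    return sum(p.stat().st_size for p in Path(path).rglob("*") if p.is_file())


def reset_data_dir(path):
    """Delete everything inside path but keep the directory itself.

    Only call this once the server that owns the directory has been stopped.
    """
    root = Path(path)
    for entry in root.iterdir():
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)
        else:
            entry.unlink()
```

Calling `dir_size_bytes` before and after `reset_data_dir` is a quick way to confirm the space really was released.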

Fuseki2 will add the option of deleting a database. However, on MS Windows, the well-known Java bug that memory-mapped files can't be deleted until the JVM exits blocks even this.


        Andy
