Hi all,

We're running Fuseki with a few TDB datasets, and it seems to be acting
rather inefficiently.

Here are the version numbers:

> [root@opendata<production> ~]# /usr/bin/java -jar /usr/share/java/fuseki-server.jar --version
> Jena:       VERSION: 2.7.5-SNAPSHOT
> Jena:       BUILD_DATE: 2012-10-21T09:26:22+0100
> ARQ:        VERSION: 2.9.5-SNAPSHOT
> ARQ:        BUILD_DATE: 2012-10-21T09:29:20+0100
> TDB:        VERSION: 0.9.5-SNAPSHOT
> TDB:        BUILD_DATE: 2012-10-21T09:40:32+0100
> Fuseki:     VERSION: 0.2.6-SNAPSHOT
> Fuseki:     BUILD_DATE: 2012-10-21T09:44:10+0100

Here's top:

> top - 11:29:56 up 123 days, 18:52,  1 user,  load average: 1.06, 1.20, 1.27
> Tasks: 208 total,   1 running, 207 sleeping,   0 stopped,   0 zombie
> Cpu(s):100.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
> Mem:   6132016k total,  5290072k used,   841944k free,    93864k buffers
> Swap:   499704k total,   499704k used,        0k free,  1290944k cached
>
>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
> 22324 fuseki    20   0 6624m 1.2g  23m S 99.3 20.0   1172:44 java


You can see the trends at
<http://opendata.oucs.ox.ac.uk/oucs.ox.ac.uk/opendata.oucs.ox.ac.uk/cpu.html>.

The journal files look like this:

> [root@opendata<production> tdb]# ls */journal.jrnl -lh
> -rw-r--r-- 1 fuseki fuseki 3.5M Jun 15 03:55 courses/journal.jrnl
> -rw-r--r-- 1 fuseki fuseki  22M Jun 15 03:37 equipment/journal.jrnl
> -rw-r--r-- 1 fuseki fuseki 5.2M Jun 15 02:17 itservices/journal.jrnl
> -rw-r--r-- 1 fuseki fuseki 448M Jun 15 03:59 public/journal.jrnl
> -rw-r--r-- 1 fuseki fuseki  18M Jun 12 14:41 seesec/journal.jrnl

Looking at the Fuseki logs, there have been various quiet periods where
there shouldn't have been any read locks, and I would have thought the
journals would have been cleared then (particularly as non-public stores
don't attract search engines or "users").

We're getting quite a number of Java heap space errors. The java process
is started with "-Xmx1g" (that's right, right? :D).
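
To double-check that the running process really has that setting, the
heap options can be read from its command line. This is a minimal
sketch; heap_flags is a hypothetical helper name, and the PID in the
usage line is the one from the top output above.

```shell
# Print any -Xms/-Xmx/-Xss options found in a command-line string.
heap_flags() {
    # Split the command line on spaces, one argument per line,
    # then keep only the JVM memory-sizing options.
    printf '%s\n' "$1" | tr ' ' '\n' | grep -E '^-X(ms|mx|ss)' || true
}

# Usage against the live process (PID taken from top):
#   heap_flags "$(ps -o args= -p 22324)"
```

If nothing is printed, the JVM is running with its default heap size
rather than the one intended.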

The TDB DBs also seem to be growing over time disproportionately to any
increase in triples. For example, the entire TDB directory for our
public store is 28GB on disk; dumping it and reloading it recently put
it at 97MB. The trend can be seen at
<http://opendata.oucs.ox.ac.uk/oucs.ox.ac.uk/opendata.oucs.ox.ac.uk/df.html>;
the sudden drops on the by-year graph are me dumping and reloading. The
increase in disk usage in the last few days is, I suspect, something
else.

I'm thinking this could be managed by periodically shutting down Fuseki,
applying the journal, reloading the store, and then setting Fuseki going
again. However, I'm loath to do this without understanding why it gets
the way it does.
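
Roughly, the procedure I have in mind would look like the sketch below.
It uses the tdbdump and tdbloader command-line tools from the Jena
distribution; everything else is an assumption for illustration: the
TDB_ROOT path, the service stop/start commands, and the compact_store
name are all hypothetical. It does nothing special about the journal,
so whether a dump taken after shutdown sees everything in it would need
checking first. By default it only prints the commands.

```shell
#!/bin/sh
# Sketch: compact one TDB dataset by dumping it and reloading the dump
# into a fresh directory. Paths and service commands are placeholders.
set -eu

TDB_ROOT="${TDB_ROOT:-/path/to/tdb}"            # hypothetical location
STOP_CMD="${STOP_CMD:-service fuseki stop}"     # hypothetical
START_CMD="${START_CMD:-service fuseki start}"  # hypothetical
DRY_RUN="${DRY_RUN:-1}"                         # 1 = only print the commands

run() {
    # Always show the command; execute it only when DRY_RUN=0.
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        sh -c "$*"
    fi
}

compact_store() {
    store="$1"
    loc="$TDB_ROOT/$store"
    dump="$TDB_ROOT/$store.nq"

    run "$STOP_CMD"
    run "tdbdump --loc='$loc' > '$dump'"        # N-Quads dump of the dataset
    run "mkdir -p '$loc.new'"
    run "tdbloader --loc='$loc.new' '$dump'"    # bulk-load into a fresh store
    run "mv '$loc' '$loc.old'"                  # keep the old store as a fallback
    run "mv '$loc.new' '$loc'"
    run "$START_CMD"
}

# Run for the store named on the command line, if any.
if [ "$#" -gt 0 ]; then
    compact_store "$1"
fi
```

Because of "set -e", a failed dump or load stops the script before the
directories are swapped, leaving the original store in place (and
Fuseki stopped). File ownership would also need restoring if this runs
as root rather than as the fuseki user.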

Any thoughts? Answers of "yes, we've fixed this; you need to upgrade"
are perfectly reasonable ;-).

Yours,

Alex

-- 
Alexander Dutton
Linked Open Data Architect, Office of the CIO; data.ox.ac.uk, OxPoints
IT Services, University of Oxford

