I have had the machine running for hours now but, to be fair, I haven't produced any load in the meantime.

On 24.10.22 14:17, Andy Seaborne wrote:
Hi Bob, good article!

Especially the "check your data before loading" bit.


https://bobdc.com/miscfiles/dataset2.ttl
You can remove all those "rdfs:subClassOf" triples. That all happens automatically.

On 23/10/2022 20:36, Bob DuCharme wrote:
> The good news is that I have gotten Fuseki running on a free tier AWS
> EC2 instance with very little trouble and was able to use the HTML
> interface and the SPARQL endpoint, as described at
> https://www.bobdc.com/blog/ec2fuseki/
>
> The bad news: it just randomly stops, even when there has been no
> querying activity, typically after 30-60 minutes of being up:
>
>    17:17:50 INFO  Server          ::   OS:     Linux
> 5.10.144-127.601.amzn2.x86_64 amd64
>    17:17:50 INFO  Server          ::   PID:    3314
>    17:17:51 INFO  Server          :: Started 2022/10/23 17:17:51 UTC on
> port 3030
>    Killed
>
> The instance has 1GB of memory. I had only loaded 162K of data.
>
> Should I set JVM_ARGS differently from the default?

Yes - as Lorenz says.

The heap needs to be smaller than the machine's memory, leaving a bit of space for everything else (OS, file system cache). A rough guess: 0.75G.
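For example (a sketch; the exact value depends on what else runs on the instance), the heap limit can be passed through the JVM_ARGS environment variable, which the fuseki-server script passes on to the java command it launches:

```shell
# Cap the Java heap at 750 MB, leaving roughly 250 MB for the OS and
# file system cache on a 1 GB instance. "/dataset2" is Bob's dataset
# name from the blog post; adjust as needed.
JVM_ARGS="-Xmx750m" ./fuseki-server --mem /dataset2
```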


What I think is happening is that even when "nothing" is happening, there is still a small amount of work going on: not from Fuseki itself but, for example, the UI pings the server and a bit of Java runs.

The heap will slowly increase because there is no pressure to do a full GC. If the heap size is set larger than the machine's memory, eventually a request to grow the heap beyond what the OS allows happens, and the OS kills the process. There is no Java/Fuseki log message.

Even though this work is very small, on a t2.micro "eventually" might be quite soon.
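Since the kill comes from the OS rather than Java, the evidence ends up in the kernel log, not in Fuseki's log. On Linux, something like this should show it (assuming it really was the OOM killer):

```shell
# The OOM killer logs to the kernel ring buffer, not to the
# Java/Fuseki logs. Either of these should show the kill:
dmesg | grep -i "killed process"
# or, on systemd machines such as Amazon Linux 2:
journalctl -k | grep -i "out of memory"
```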


Another factor you may come across later, when using TDB2 on a small instance, is that the TDB2 caches will need to be tuned smaller for safety. Most likely, at 162K all the data ends up in RAM and the node table cache never gets very big, so it won't be a problem.

Since it's 162K of data and read-only ("publishing"), I'd try putting everything in memory at startup.

# Transactional in-memory dataset.
PREFIX :    <#>
PREFIX ja:  <http://jena.hpl.hp.com/2005/11/Assembler#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

:dataset rdf:type ja:MemoryDataset ;
    ja:data "data1.trig" ;  ## Or a triples format such as .ttl.
    .

which is equivalent to

   fuseki-server --file DATA --update /dataset2 ## --update optional

or load with a script. The downside: updates are lost when the server restarts.
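"Load with a script" could be, for example, a POST to the dataset's data endpoint using the SPARQL Graph Store Protocol (the /dataset2 name and data1.ttl file here are just illustrative, matching the setup above):

```shell
# Upload a Turtle file into the default graph of the running server
# via the SPARQL Graph Store Protocol endpoint.
curl -X POST \
     -H 'Content-Type: text/turtle' \
     --data-binary @data1.ttl \
     'http://localhost:3030/dataset2/data?default'
```

This needs the server started with --update, and with an in-memory dataset the loaded data is still lost when the server stops.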


It is possible to tune down the TDB cache sizes. And, if anyone is really desperate, there are 32-bit JVMs (but don't go there unless you really have to).

The mechanism is rather clunky to apply from Fuseki at the moment.

    Andy

>
> Thanks,
>
> Bob
>
