I have had the machine running for hours now but, to be fair, I
haven't put any load on it in the meantime.
On 24.10.22 14:17, Andy Seaborne wrote:
Hi Bob, good article!
Especially the "check your data before loading" bit.
https://bobdc.com/miscfiles/dataset2.ttl
You can remove all those "rdfs:subClassOf" triples. That all happens
automatically.
On 23/10/2022 20:36, Bob DuCharme wrote:
> The good news is that I have gotten Fuseki running on a free tier AWS
> EC2 instance with very little trouble and was able to use the HTML
> interface and the SPARQL endpoint, as described at
> https://www.bobdc.com/blog/ec2fuseki/
>
> The bad news: it just randomly stops, even when there has been no
> querying activity, typically after 30-60 minutes of being up:
>
> 17:17:50 INFO Server :: OS: Linux 5.10.144-127.601.amzn2.x86_64 amd64
> 17:17:50 INFO Server :: PID: 3314
> 17:17:51 INFO Server :: Started 2022/10/23 17:17:51 UTC on port 3030
> Killed
>
> The instance has 1GB of memory. I had only loaded 162K of data.
>
> Should I set JVM_ARGS different from the default?
Yes - as Lorenz says.
It needs to be less than the machine size, and allow a bit of other
space (OS, file system cache). Guess: 0.75G.
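As a rough sketch of that sizing (not a tested recommendation): the snippet below takes the 1GB machine size from this thread, caps the heap at about three quarters of it, and exports the result as JVM_ARGS, the variable mentioned above. The names machine_mb and heap_mb are only illustrative, and -Xmx is the standard JVM option for the maximum heap size.

```shell
# Machine memory in MB; 1024 matches the 1GB instance in this thread.
machine_mb=1024

# Leave roughly a quarter for the OS and the file system cache
# (the 0.75 factor is the guess given above, not a measured value).
heap_mb=$(( machine_mb * 3 / 4 ))

# Export so that a server started from this shell picks it up.
export JVM_ARGS="-Xmx${heap_mb}M"
echo "$JVM_ARGS"
```

Start the server from the same shell afterwards so that it inherits JVM_ARGS.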
What I think is happening is that even when "nothing" is happening,
there is still some small amount of work going on. Not from Fuseki
itself but, for example, UI pings the server, a bit of Java runs.
The heap will slowly increase because there is no pressure to do a
full GC and, if the heap size is set larger than the machine,
eventually a request to grow the heap larger than the OS allows
happens and the OS kills the process. There is no Java/Fuseki log message.
Even though this work is very small, on a t2.micro "eventually" might
be quite soon.
Another factor you may come across later, when using TDB2 on a small
instance, is that the TDB2 caches will need tuning smaller for safety.
Most likely, at 162K all the data ends up in RAM and the node table
cache isn't large so it won't be a problem because it never gets very
big.
For 162K of data, and it's read-only ("publishing"), I'd try putting
everything in-memory at startup.
    # Transactional in-memory dataset.
    :dataset rdf:type ja:MemoryDataset ;
        ja:data "data1.trig" ; ## Or a triples format like .ttl.
        .
which is
    fuseki-server --file DATA --update /dataset2 ## --update optional
or load with a script. The downside: updates are lost.
It is possible to tune down TDB cache sizes. And if anyone is really
desperate, 32-bit JVMs (but don't unless you really have to).
The mechanism is rather clunky to apply from Fuseki at the moment.
Andy
>
> Thanks,
>
> Bob
>