It's often misunderstood, but Java programs use memory in addition to the
configured heap.  In my experience Fuseki sometimes uses a LOT more, more
than I could explain.  Some of the folks here (Andy for sure) spent some
time looking at it with me and weren't able to come to any conclusions.
You can look through the list archives for the discussion, maybe 6 months
ago.
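
If you want to see where the memory outside the heap is going, a minimal
sketch along these lines may help.  It uses the standard
java.lang.management beans to print heap, non-heap, and NIO buffer pool
usage; the "mapped" pool counts memory-mapped files, which is the off-heap
mechanism Rob describes below.  The class name JvmMemoryReport and its
method names are made up for illustration.  It only reports on the JVM it
runs in, so to inspect a running Fuseki you would read the same beans over
JMX (with jconsole, for example) rather than run this standalone.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

// Hypothetical helper, not part of Jena or Fuseki.
public class JvmMemoryReport {

    private static final long MIB = 1024L * 1024L;

    /** Converts a byte count to whole mebibytes (rounds toward zero). */
    public static long toMiB(long bytes) {
        return bytes / MIB;
    }

    /** Builds a short text report for the JVM this code is running in. */
    public static String report() {
        StringBuilder sb = new StringBuilder();
        MemoryMXBean mem = ManagementFactory.getMemoryMXBean();

        // Heap: the region bounded by -Xmx.
        MemoryUsage heap = mem.getHeapMemoryUsage();
        sb.append(String.format("heap used=%d MiB committed=%d MiB%n",
                toMiB(heap.getUsed()), toMiB(heap.getCommitted())));

        // Non-heap: metaspace, code cache and similar JVM-managed areas.
        MemoryUsage nonHeap = mem.getNonHeapMemoryUsage();
        sb.append(String.format("non-heap used=%d MiB committed=%d MiB%n",
                toMiB(nonHeap.getUsed()), toMiB(nonHeap.getCommitted())));

        // NIO buffer pools: typically "direct" and "mapped".
        // Neither is limited by -Xmx.
        for (BufferPoolMXBean pool :
                ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            sb.append(String.format("buffer pool %s: used=%d MiB count=%d%n",
                    pool.getName(), toMiB(pool.getMemoryUsed()),
                    pool.getCount()));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(report());
    }
}
```

Even these numbers will not account for everything the OS charges to the
process (thread stacks and native allocations are not listed here), so
treat them as a lower bound and compare against what top or ps reports.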

I ended up significantly overallocating memory to the instance and being
done with it.

How much RAM does your instance have?  You mentioned -Xmx5600M and total
usage of 17GB RAM+swap, so it sounds like you have maybe 8GB of RAM?  I'd
try 16GB and see how it does; watch the total memory usage.



On Tue, Jan 29, 2019 at 9:43 AM Mikael Pesonen <mikael.peso...@lingsoft.fi>
wrote:

>
>
>
> On 29/01/2019 16:28, Rob Vesse wrote:
> > This may be partly a case of a simple looking query having unexpected
> > execution semantics.  Strictly speaking your query says select all
> > triples in the specific graph then join them with this list of values
> > for ?s.  Now the optimiser should, and does appear to, do the right
> > thing and flip the join order i.e. it uses the concrete values from the
> > VALUES block to search for triples with those subjects in the specific
> > graph.  However if the query had other elements involved the optimiser
> > might not kick in; a better query would place the VALUES prior to using
> > the variables defined in the VALUES block.
> Thanks for the reminder on VALUES order
> >
> > This sounds like memory/cache thrashing.  From what you have described,
> > running variants on this query 50k times, you are basically walking
> > over your entire dataset extracting it piece by piece?
> The dataset is larger; these small sets (VALUES) are coming from our
> external index for similar document search. The index returns an id, and
> the related metadata is fetched from Jena.
> >
> > Assuming the Graph URI and the URIs in your VALUES block change in each
> > query then every query is looking at a different section of the
> > database causing a lot of data to be cached and then evicted both in
> > terms of on-heap memory structures (the node table cache) and
> > potentially also for the off heap memory mapped files which may be
> > being paged in and out as the code traverses the B-Tree indexes.
> >
> > Is there also some other query involved that extracts the Graph URIs
> > and Subject URIs of interest that is being executed in parallel with
> > the script?  Or has the input from the script been pre-calculated ahead
> > of time, comes from elsewhere etc?
> There is no parallelism on our part in this case. Only one PHP script
> running and making GSP calls.
> >
> > Rob
> >
> > On 29/01/2019, 14:06, "Mikael Pesonen" <mikael.peso...@lingsoft.fi> wrote:
> >
> >
> >      Server:
> >
> >      /usr/bin/java
> >      -Dlog4j.configuration=file:/home/text/tools/apache-jena-fuseki-3.9.0/log4j.properties
> >      -Xmx5600M -jar fuseki-server.jar --update --port 3030
> >      --loc=/home/text/tools/jena_data_test/ /ds
> >
> >      No custom configs, default installation package.
> >
> >
> >      SPARQL similar to this (returns 5-10 triples):
> >
> >      CONSTRUCT { ?s ?p ?o }
> >      FROM <https://resource.lingsoft.fi/4f13c609-48b4-4e4d-a40b-2d7946f88234/>
> >      WHERE
> >      {
> >               ?s ?p ?o
> >
> >      VALUES ?s {lsr:10609f75-5cf3-4544-8fc1-c361778c3bd8
> >      lsr:88d0bb8c-35d8-4051-a27d-a0d93af77985
> >      lsr:fc7b2c65-453e-469b-9c5d-8c7ee4ee6902
> >      lsr:239c6da0-4c24-4539-a277-c9756d6257ee
> >      lsr:2ef0190d-6271-447a-992f-6225fc440897
> >      lsr:6aaf601c-ccf4-4e59-9757-1a463db49fa9
> >      lsr:d7c9dc96-cd61-4a31-b466-bb2491a3ceaf
> >      lsr:6f6802cf-0336-4234-90b8-cc8780058f0d
> >      lsr:d1e2751b-4332-4d57-95e4-ca8070c16782
> >      lsr:81053775-4722-4a00-b3f7-33d4feb3629b}
> >      }
> >
> >
> >      I solved this by adding a sleep to the script. So I guess it's
> >      about the Java memory manager not getting time to free memory?
> >      Even with the sleep it was barely doable, memory consumption
> >      changing rapidly between 1.5 and 6 gigs.
> >
> >
> >
> >      On 29/01/2019 15:50, Andy Seaborne wrote:
> >      > Mikael,
> >      >
> >      > There aren't enough details except to mention the suspects like
> >      > sorting.
> >      >
> >      > With all the questions on the list, I personally don't track the
> >      > details of each installation so please also remind me of your
> >      > current setup.
> >      >
> >      >     Andy
> >      >
> >      > On 29/01/2019 11:32, Mikael Pesonen wrote:
> >      >>
> >      >> I'm not able to run a basic read-only script without running out
> >      >> of memory on the server.
> >      >>
> >      >> Consumption goes to 7+gigs (VM 10+ gigs), then system kills
> >      >> Fuseki when running out of memory.
> >      >> All I'm running is simple sparql query getting few triples of
> >      >> resource. This is run for about 50k times.
> >      >>
> >      >> All settings are default, using GSP.
> >      >>
> >      >>
> >
> >      --
> >      Lingsoft - 30 years of Leading Language Management
> >
> >      www.lingsoft.fi
> >
> >      Speech Applications - Language Management - Translation - Reader's
> and Writer's Tools - Text Tools - E-books and M-books
> >
> >      Mikael Pesonen
> >      System Engineer
> >
> >      e-mail: mikael.peso...@lingsoft.fi
> >      Tel. +358 2 279 3300
> >
> >      Time zone: GMT+2
> >
> >      Helsinki Office
> >      Eteläranta 10
> >      FI-00130 Helsinki
> >      FINLAND
> >
> >      Turku Office
> >      Kauppiaskatu 5 A
> >      FI-20100 Turku
> >      FINLAND
> >
> >
> >
> >
> >
> >
>

-- 
Dan Pritts
ICPSR Computing & Network Services
University of Michigan
