Hmm, I thought Neo4j's page cache used java.nio.ByteBuffer, which eventually does mmap, and hence is technically backed by the OS page cache? I understand all of your other points.
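(For what it's worth, whether a java.nio buffer is backed by the OS page cache depends on how it was created. A small stand-alone JDK sketch, nothing Neo4j-specific: ByteBuffer.allocateDirect hands back anonymous off-heap memory with no file behind it, while FileChannel.map really does mmap a file region that the kernel's page cache backs.)

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class BufferKinds {
    public static void main(String[] args) throws IOException {
        // Case 1: allocateDirect -- anonymous off-heap memory. No file is
        // involved, so the OS page cache plays no role in caching it.
        ByteBuffer direct = ByteBuffer.allocateDirect(4096);
        System.out.println("allocateDirect isDirect=" + direct.isDirect()
                + " (off-heap, not file-backed)");

        // Case 2: FileChannel.map -- an mmap'ed view of a file. Reads fault
        // pages in through the OS page cache; residency is the kernel's call.
        Path tmp = Files.createTempFile("bufferkinds", ".bin");
        try (FileChannel ch = FileChannel.open(tmp,
                StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            ch.write(ByteBuffer.wrap(new byte[4096]));
            MappedByteBuffer mapped = ch.map(FileChannel.MapMode.READ_ONLY, 0, 4096);
            System.out.println("FileChannel.map isDirect=" + mapped.isDirect()
                    + " (file-backed via the OS page cache)");
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

Both report isDirect=true, which makes them easy to conflate, but only the mapped one is backed by the OS page cache. Which of the two routes Neo4j's own page cache takes internally is exactly the question here.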
On Wed, Jul 29, 2015 at 12:31 AM Chris Vest <[email protected]> wrote:

> Neo4j has its own page cache, the size of which is controlled by
> dbms.pagecache.memory. This page cache is not backed by the OS page cache,
> and it is not used for Lucene’s memory mapping. Lucene does its own
> independent IO, and thus benefits from the index files fitting in the OS
> page cache.
>
> --
> Chris Vest
> System Engineer, Neo Technology
> [ skype: mr.chrisvest, twitter: chvest ]
>
>
> On 29 Jul 2015, at 02:33, Zongheng Yang <[email protected]> wrote:
>
> Hi Neo4j devs,
>
> My application does the following: constantly do some Lucene index
> lookups, then loop over the result nodes and collect the IDs:
>
>     ResourceIterator<Node> nodes = graphDb.findNodes(
>             label, "name" + attr, search);
>     Set<Long> userIds = new HashSet<Long>();
>     while (nodes.hasNext()) {
>         userIds.add(nodes.next().getId());
>     }
>
> *Environment.* Linux box, 15GB RAM, 2GB JVM heap. The Neo4j store files
> total 29GB on disk; the Lucene indexes total 6GB. Using Neo4j 2.2 embedded;
> cache_type is set to none.
>
> *Symptom 1.* When the Neo4j page cache size (dbms.pagecache.memory) is
> set low enough (<= 8.5GB) -- hence leaving enough space for the Lucene
> indexes -- the latency looks good.
>
> *Symptom 2.* However, when it is set slightly larger -- to 9.5GB or 10GB
> -- the following starts to happen during the queries: *constant* high IO
> wait; the OS constantly reads in tens of MBs; a *constant* stream of 3k+
> maj_flt for the Java process. It seems *as if the indexes could not
> evict the Neo4j pages*, or in other words, as if *the index pages were
> being independently LRU-cached*. The CPU constantly waits for IO to bring
> in some pages (I'd guess most likely all Lucene pages) to do any work
> (1% usr usage every ~10 seconds).
> This is very surprising to me, as I'd expect that even in memory-constrained
> cases like this, the following would happen: the Lucene indexes would
> compete against, and eventually win over, the Neo4j store pages (brought
> into memory by a full warmup done at start time) in the OS page cache, and
> hence the high IO would occur initially but decrease to none later (5.8 GB
> of indexes should fit comfortably in 15GB RAM).
>
> Could someone explain why the above would be happening?
>
> Zongheng
>
> --
> You received this message because you are subscribed to the Google Groups
> "Neo4j" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> For more options, visit https://groups.google.com/d/optout.
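As a back-of-envelope check on the numbers in the thread (my arithmetic, not anything measured), here is how much RAM each dbms.pagecache.memory setting leaves for the OS page cache once the 2GB JVM heap is taken out of the 15GB total, ignoring the OS itself and any other processes:

```java
import java.util.Locale;

public class MemoryBudget {
    public static void main(String[] args) {
        double ramGb = 15.0;   // machine RAM (from the thread)
        double heapGb = 2.0;   // JVM heap
        double luceneGb = 6.0; // total size of the Lucene index files

        // Whatever the heap and Neo4j's page cache don't claim is roughly
        // what the OS page cache has left for keeping Lucene's files resident.
        for (double pageCacheGb : new double[] {8.5, 9.5, 10.0}) {
            double leftForOs = ramGb - heapGb - pageCacheGb;
            System.out.printf(Locale.ROOT,
                    "pagecache=%.1fGB -> ~%.1fGB left for the OS cache (Lucene needs up to %.1fGB)%n",
                    pageCacheGb, leftForOs, luceneGb);
        }
    }
}
```

Under this crude tally even the "good" 8.5GB setting leaves only ~4.5GB, less than the ~6GB of index files, so the indexes were likely never fully resident to begin with; the 9.5-10GB settings just shrink the OS cache further.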
