Hmm, I thought Neo4j's page cache used java.nio.ByteBuffer, which eventually does mmap, and hence is technically backed by the OS page cache? I understand all of your other points.
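(For what it's worth, whether a java.nio buffer is backed by the OS page cache depends on how it was created. A small stand-alone JDK sketch, nothing Neo4j-specific: ByteBuffer.allocateDirect hands back anonymous off-heap memory with no file behind it, while FileChannel.map really does mmap a file region that the kernel's page cache backs.)

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class BufferKinds {
    public static void main(String[] args) throws IOException {
        // Case 1: allocateDirect -- anonymous off-heap memory. No file is
        // involved, so the OS page cache plays no role in caching it.
        ByteBuffer direct = ByteBuffer.allocateDirect(4096);
        System.out.println("allocateDirect isDirect=" + direct.isDirect()
                + " (off-heap, not file-backed)");

        // Case 2: FileChannel.map -- an mmap'ed view of a file. Reads fault
        // pages in through the OS page cache; residency is the kernel's call.
        Path tmp = Files.createTempFile("bufferkinds", ".bin");
        try (FileChannel ch = FileChannel.open(tmp,
                StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            ch.write(ByteBuffer.wrap(new byte[4096]));
            MappedByteBuffer mapped = ch.map(FileChannel.MapMode.READ_ONLY, 0, 4096);
            System.out.println("FileChannel.map isDirect=" + mapped.isDirect()
                    + " (file-backed via the OS page cache)");
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

Both report isDirect=true, which makes them easy to conflate, but only the mapped one is backed by the OS page cache. Which of the two routes Neo4j's own page cache takes internally is exactly the question here.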
On Wed, Jul 29, 2015 at 12:31 AM Chris Vest <[email protected]> wrote:

> Neo4j has its own page cache, the size of which is controlled by
> dbms.pagecache.memory. This page cache is not backed by the OS page cache,
> and it is not used for Lucene’s memory mapping. Lucene does its own
> independent IO, and thus benefits from the index files fitting in the OS
> page cache.
>
> --
> Chris Vest
> System Engineer, Neo Technology
> [ skype: mr.chrisvest, twitter: chvest ]
>
>
> On 29 Jul 2015, at 02:33, Zongheng Yang <[email protected]> wrote:
>
> Hi Neo4j devs,
>
> My application does the following: constantly do some Lucene index
> lookups, then loop over the result nodes and collect the IDs:
>
>     ResourceIterator<Node> nodes = graphDb.findNodes(
>             label, "name" + attr, search);
>     Set<Long> userIds = new HashSet<Long>();
>     while (nodes.hasNext()) {
>         userIds.add(nodes.next().getId());
>     }
>
> *Environment.* Linux box, 15GB RAM, 2GB JVM heap. The Neo4j store files
> total 29GB on disk; the Lucene indexes total 6GB. Using Neo4j 2.2 embedded;
> cache_type is set to none.
>
> *Symptom 1.* When the Neo4j page cache size (dbms.pagecache.memory) is
> set low enough (<= 8.5GB) -- hence leaving enough space for the Lucene
> indexes -- the latency looks good.
>
> *Symptom 2.* However, when it is set slightly larger -- to 9.5GB or 10GB
> -- the following starts to happen during the queries: *constant* high IO
> wait; the OS constantly reads in tens of MBs; a *constant* stream of 3k+
> maj_flt for the Java process. It seems *as if the indexes could not
> evict the Neo4j pages*, or in other words, as if *the index pages were
> being independently LRU-cached*. The CPU constantly waits for IO to bring
> in some pages (I'd guess most likely all Lucene pages) to do any work
> (1% usr usage every ~10 seconds).
> This is very surprising to me, as I'd expect that even in memory-constrained
> cases like this, the following would happen: the Lucene indexes would
> compete against, and eventually win over, the Neo4j store pages (brought
> into memory by a full warmup done at start time) in the OS page cache, and
> hence the high IO would occur initially but decrease to none later (5.8 GB
> of indexes should fit comfortably in 15GB RAM).
>
> Could someone explain why the above would be happening?
>
> Zongheng
>
> --
> You received this message because you are subscribed to the Google Groups
> "Neo4j" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> For more options, visit https://groups.google.com/d/optout.
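As a back-of-envelope check on the numbers in the thread (my arithmetic, not anything measured), here is how much RAM each dbms.pagecache.memory setting leaves for the OS page cache once the 2GB JVM heap is taken out of the 15GB total, ignoring the OS itself and any other processes:

```java
import java.util.Locale;

public class MemoryBudget {
    public static void main(String[] args) {
        double ramGb = 15.0;   // machine RAM (from the thread)
        double heapGb = 2.0;   // JVM heap
        double luceneGb = 6.0; // total size of the Lucene index files

        // Whatever the heap and Neo4j's page cache don't claim is roughly
        // what the OS page cache has left for keeping Lucene's files resident.
        for (double pageCacheGb : new double[] {8.5, 9.5, 10.0}) {
            double leftForOs = ramGb - heapGb - pageCacheGb;
            System.out.printf(Locale.ROOT,
                    "pagecache=%.1fGB -> ~%.1fGB left for the OS cache (Lucene needs up to %.1fGB)%n",
                    pageCacheGb, leftForOs, luceneGb);
        }
    }
}
```

Under this crude tally even the "good" 8.5GB setting leaves only ~4.5GB, less than the ~6GB of index files, so the indexes were likely never fully resident to begin with; the 9.5-10GB settings just shrink the OS cache further.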
