I think a bit of elaboration might be in order. EmbeddedReadOnlyGraphDatabase was created for one specific purpose:
Being able to interactively introspect a graph without having to shut down the application that uses it. Specifically, the tools that we wanted to support with this were the Neo4j shell and Neoclipse.

EmbeddedReadOnlyGraphDatabase (EROGD) has two major issues with the way caching is done internally in Neo4j (one issue with each cache):

- When the EROGD reads data from the file system it will, like a normal EmbeddedGraphDatabase (EGD), cache the node and relationship objects. If a normal EGD modifies the graph "under the feet" of the EROGD, there is no way for the EROGD to know that the data in its cache is now stale, which leads to an inconsistent view of the graph. If, for example, the EROGD has cached Node[15] with the information that it is connected to some other node through Relationship[344], and Relationship[344] is then deleted, you will get an InvalidRecordException (as you described). And of course, if relationships are added to Node[15], the EROGD will not see them at all (until Node[15] is evicted from the cache due to not being used for a while).

- Neo4j also caches data on the filesystem level by memory mapping (mmap) hot regions of the store files. Writes to these regions are not flushed to the actual file until the mmapped window is evicted for being less hot than other windows, or until the Neo4j transaction log is rotated. This means that from the point of view of the EROGD, the data actually written to disk will look inconsistent, which also leads to InvalidRecordException. The situation is made even more complicated by the fact that Unix operating systems will attempt to share memory mapped data from the same file between multiple processes, while the normal EGD and the EROGD will not make the same decisions about which regions to mmap; they might not even decide on the same size for mmap windows.
We haven't tested how well different operating systems deal with reading data that was written to an mmap region through non-mmap syscalls from a different process; most likely this varies from OS to OS.

The second of these problems is of course the worst, since it cannot be worked around. The first one can be mitigated by configuring Neo4j to not use the object cache, by passing the cache_type=none parameter to the constructor of the EROGD. This should really be made the default for EROGD, unless we decide to remove EROGD completely.

I hope that sheds some light on why you experience these problems with EmbeddedReadOnlyGraphDatabase, and on what the intention behind creating it was.

As a side note, I had a different idea for how to solve the introspection-of-live-graph problem at the time EmbeddedReadOnlyGraphDatabase was created: create a network-based implementation of the GraphDatabaseService API and connect directly to the running instance. This would completely avoid the cache staleness problem, at the cost of network overhead for each graph operation, which is probably fine for tooling purposes. With the JVM agent attach protocol it would even be possible to inject such a server into a running graph database that wasn't originally configured for it.

In fact, I implemented this as the RemoteGraphDatabase subproject. Since my colleagues did not share my vision for that idea, the project didn't receive much attention after its initial inception. It was also never really used for these purposes, but rather misused for building applications, which led us to deprecate it. When we later discovered a severe bug in the implementation of the remote transaction handling logic, we removed the project completely. I still believe this to be a superior model for tools, but I would build it differently if I were to build it today.
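As a rough illustration of the first caching problem described above (the stale object cache) and of why cache_type=none mitigates it, here is a toy model in Python. It is only a sketch and not Neo4j code: the names Store, ReadOnlyView, InvalidRecord and demo are made up for this example, and only the cache_type setting name and its value none are taken from the text.

```python
class InvalidRecord(Exception):
    """Toy stand-in for the InvalidRecordException mentioned in the text."""


class Store:
    """Hypothetical stand-in for the store files shared by both instances."""

    def __init__(self):
        self.relationships = {}  # rel_id -> (start_node, end_node)
        self.node_rels = {}      # node_id -> set of rel_ids

    def create_relationship(self, rel_id, start, end):
        self.relationships[rel_id] = (start, end)
        self.node_rels.setdefault(start, set()).add(rel_id)
        self.node_rels.setdefault(end, set()).add(rel_id)

    def delete_relationship(self, rel_id):
        start, end = self.relationships.pop(rel_id)
        self.node_rels[start].discard(rel_id)
        self.node_rels[end].discard(rel_id)


class ReadOnlyView:
    """Hypothetical reader that optionally caches per-node relationship ids."""

    def __init__(self, store, cache_type="soft"):
        self.store = store
        self.cache_type = cache_type
        self._node_cache = {}  # node_id -> frozenset of rel_ids

    def relationship_ids(self, node_id):
        caching = self.cache_type != "none"
        if caching and node_id in self._node_cache:
            return self._node_cache[node_id]  # possibly stale
        rel_ids = frozenset(self.store.node_rels.get(node_id, ()))
        if caching:
            self._node_cache[node_id] = rel_ids
        return rel_ids

    def neighbours(self, node_id):
        result = []
        for rel_id in sorted(self.relationship_ids(node_id)):
            if rel_id not in self.store.relationships:
                raise InvalidRecord(f"Relationship[{rel_id}] not in use")
            start, end = self.store.relationships[rel_id]
            result.append(end if start == node_id else start)
        return result


def demo(cache_type):
    store = Store()
    store.create_relationship(344, 15, 16)
    view = ReadOnlyView(store, cache_type)
    view.neighbours(15)                     # reader warms its cache for Node[15]
    store.delete_relationship(344)          # writer changes the graph "under the feet"
    store.create_relationship(345, 15, 17)  # of the reader
    try:
        return view.neighbours(15)
    except InvalidRecord as e:
        return str(e)
```

With any caching setting other than "none", demo fails on the deleted Relationship[344] and never sees the new relationship; with "none" it re-reads the store and sees the current state. The toy says nothing about the second (mmap) problem, which disabling the object cache does not address.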
-tobias

On Mon, Aug 1, 2011 at 4:48 PM, Jim Webber <j...@neotechnology.com> wrote:

> Hi Mathias,
>
> EmbeddedReadOnlyGraphDatabase is not quite what it seems, and I think
> should be deprecated/removed. The correct way for database instances to
> become consistent is through the HA protocol.
>
> Jim
> _______________________________________________
> Neo4j mailing list
> User@lists.neo4j.org
> https://lists.neo4j.org/mailman/listinfo/user
>

--
Tobias Ivarsson <tobias.ivars...@neotechnology.com>
Hacker, Neo Technology
www.neotechnology.com
Cellphone: +46 706 534857