Hi all,

Accessing a remote database for read-centric purposes should be done through HA. Even if we could bind a read-only local instance of EGD to a data store on disk, its caches will fall out of sync with the on-disk store.
HA avoids this because it's a proper protocol for synchronising databases.

Jim

On 2 Aug 2011, at 21:01, Utility Mail wrote:

> I agree!
> In my opinion, remote access to a live instance of a GD is really something
> to hope for. Let me explain my current test case with Neo4j: I created an
> instance of an EmbeddedGraphDatabase that continuously ingests CSV files
> coming from a polling service. At the same time I need to create a second
> service, independent of the first one, that queries the GD (to retrieve,
> not to modify). I tried EROGD, but the active index segment became
> corrupted!
> Even if EGD is thread safe and I can create multiple threads sharing the
> same instance of the GD, what should I do when, as in my case, I need an
> independent service (app) accessing the EGD at the same time?
>
> Paolo Forte
>
> P.S.
> I'm not sure whether it is related, and it is surely a gap in my knowledge
> of webadmin, but how can I check my EGD status (number of nodes, edges,
> etc.) via webadmin while it is ingesting new data?
>
> On 1 Aug 2011, at 20:52, Tobias Ivarsson
> <tobias.ivars...@neotechnology.com> wrote:
>
>> I think a bit of elaboration might be in order.
>>
>> EmbeddedReadOnlyGraphDatabase was created for one specific purpose:
>>
>> Being able to interactively introspect a graph without having to shut down
>> the application that uses it.
>>
>> Specifically the tools that we wanted to support with this were the Neo4j
>> shell and Neoclipse.
>>
>> EmbeddedReadOnlyGraphDatabase (EROGD) has two major issues with the way
>> caching is done internally in Neo4j (one issue with each cache):
>>
>> - When the EROGD reads data from the file system it will, like a normal
>>   EGD, cache the node and relationship objects. If a normal EGD modifies
>>   the graph "under the feet" of the EROGD, there is no way for the EROGD
>>   to know that the data in cache is now stale, which will lead to an
>>   inconsistent view of the graph.
>>   If for example the EROGD has cached Node[15] with the information that
>>   it is connected to some other node through Relationship[344], and
>>   Relationship[344] is deleted, you will get InvalidRecordException (as
>>   you described). And of course if relationships are added to Node[15],
>>   these will not be seen at all by the EROGD (until Node[15] is evicted
>>   from the cache due to not being used for a while).
>> - Neo4j also caches data on the filesystem level by memory mapping (mmap)
>>   hot regions of the store files. Writes to these regions will not be
>>   flushed to the actual file until the mmapped window is evicted due to
>>   being less hot than other windows, or when the transaction log for
>>   Neo4j is rotated. This means that from the p.o.v. of the EROGD the
>>   actual data written to disk will look inconsistent, which would also
>>   lead to InvalidRecordException. This situation is actually made even
>>   more complicated by the fact that Unix operating systems will attempt
>>   to share memory mapped data from the same file between multiple
>>   processes, but the normal EGD and the EROGD will not make the same
>>   decisions on which regions to mmap; they might not even decide on the
>>   same size for mmap windows. We haven't tested how well different
>>   operating systems deal with reading data that was written to an mmap
>>   region through non-mmap syscalls from a different process; most likely
>>   this varies from OS to OS.
>>
>> The second of these problems is of course the worst, since it cannot be
>> worked around. The first one can be mitigated by configuring Neo4j to not
>> use the object cache, by passing the cache_type=none parameter to the
>> constructor of the EROGD. This should really be made the default for
>> EROGD, unless we decide to completely remove EROGD.
>>
>> I hope that sheds some light on the reasons why you experience these
>> problems with EmbeddedReadOnlyGraphDatabase, and what the intention of
>> creating it was.
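
To make the first (object cache) issue concrete, here is a rough toy model in plain Java. It is only a sketch: Store, Reader and StaleCacheDemo are made-up names and none of this is Neo4j code. A HashMap stands in for the store files, one reader keeps its own object cache, and the other reads through every time, which is roughly what I would expect cache_type=none to give you.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model only: none of these classes are Neo4j APIs. */
public class StaleCacheDemo {

    /** Stands in for the on-disk store: node id -> id of its single relationship. */
    static class Store {
        private final Map<Integer, Integer> relationshipOfNode = new HashMap<>();

        void connect(int nodeId, int relationshipId) {
            relationshipOfNode.put(nodeId, relationshipId);
        }

        void deleteRelationshipOf(int nodeId) {
            relationshipOfNode.remove(nodeId);
        }

        Integer read(int nodeId) {
            return relationshipOfNode.get(nodeId); // null if the node has no relationship
        }
    }

    /** A read-only view of the store that may keep its own object cache. */
    static class Reader {
        private final Store store;
        private final boolean useCache;
        private final Map<Integer, Integer> cache = new HashMap<>();

        Reader(Store store, boolean useCache) {
            this.store = store;
            this.useCache = useCache;
        }

        Integer relationshipOf(int nodeId) {
            if (!useCache) {
                return store.read(nodeId); // always goes back to the store
            }
            if (!cache.containsKey(nodeId)) {
                cache.put(nodeId, store.read(nodeId)); // first read populates the cache
            }
            return cache.get(nodeId); // later reads never notice store changes
        }
    }

    public static void main(String[] args) {
        Store store = new Store();
        store.connect(15, 344); // Node[15] is connected through Relationship[344]

        Reader cached = new Reader(store, true);
        Reader uncached = new Reader(store, false);
        cached.relationshipOf(15);   // warms the cache with 344
        uncached.relationshipOf(15);

        // The writer deletes the relationship "under the feet" of both readers.
        store.deleteRelationshipOf(15);

        System.out.println("cached reader:   " + cached.relationshipOf(15));
        System.out.println("uncached reader: " + uncached.relationshipOf(15));
    }
}
```

The cached reader still reports relationship 344 after the delete, while the uncached reader sees that it is gone. Note that this models only the object cache; it says nothing about the mmap problem, which is not helped by turning the object cache off.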
>>
>> As a side note I can mention that I had a different idea for how to solve
>> the introspection-of-live-graph problem at the time
>> EmbeddedReadOnlyGraphDatabase was created: create a network-based
>> implementation of the GraphDatabaseService API and connect directly to the
>> running instance. This would completely avoid the cache staleness problem,
>> but at the cost of network overhead for each graph operation, which is
>> probably fine for tooling purposes. With the JVM agent attach protocol it
>> would be possible to inject such a server into a running graph database that
>> wasn't originally configured for it. I in fact implemented this as the
>> RemoteGraphDatabase subproject.
>> Since my colleagues did not share my vision about that idea, this project
>> didn't receive much attention after its initial inception. It was also never
>> really used for these purposes, but rather misused for building
>> applications, leading us to deprecate the project. When we then later
>> discovered a severe bug in the implementation of the remote transaction
>> handling logic, we completely removed the project.
>> I still believe this to be a superior model for tools, but would build it
>> differently if I were to build it today.
>>
>> -tobias
>>
>> On Mon, Aug 1, 2011 at 4:48 PM, Jim Webber <j...@neotechnology.com> wrote:
>>
>>> Hi Mathias,
>>>
>>> EmbeddedReadOnlyGraphDatabase is not quite what it seems, and I think it
>>> should be deprecated/removed. The correct way for database instances to
>>> become consistent is through the HA protocol.
>>>
>>> Jim
>>> _______________________________________________
>>> Neo4j mailing list
>>> User@lists.neo4j.org
>>> https://lists.neo4j.org/mailman/listinfo/user
>>
>> --
>> Tobias Ivarsson <tobias.ivars...@neotechnology.com>
>> Hacker, Neo Technology
>> www.neotechnology.com
>> Cellphone: +46 706 534857