Wow, thanks a lot for the quick and kind reply! I did browse the package that you suggested and hope that the following notes reflect the underlying data/storage model accurately (in case anybody else is interested):
Neo maintains 6 files (excluding indexes):

- Node Storage
  - Node records are fixed length and of the form
    [node_id, relationship_id, property_id], where the latter two are
    pointers to the respective records in another file.
  - Do you store a number of such records for each node, namely
    max(#relations, #properties)? Or is there just one such record, with
    the other relationships and properties retrieved by following the
    linked lists in the respective storage containers?
- Relationship Storage
  - Relationship records contain the id of the relationship, the id of the
    relationship type, pointers to the source and destination nodes, as
    well as pointers to the next and previous relationships on both nodes
    (i.e. a doubly linked list), and a pointer to the next (first)
    property associated with the edge.
- RelationshipType Storage
  - ID plus associated data: a string (dynamic length)
- 3 files to store the properties, which are variable length. These
  distinguish between string and array properties, plus one index file on
  properties.
  - Properties are stored in blocks, and Neo4j keeps track of which blocks
    are available.
  - Property records contain the property type (char, string, etc.), index
    block id, previous and next block, previous and next property, the
    associated values, plus some header information.

I hope that this is a somewhat accurate description of what is going on.

One thing that I could not figure out: how do you prevent the relationships
of a single node from getting scattered throughout the relationship storage
file, which would lead to a lot of seeking if I wanted to retrieve the
relationships of a node? And similarly for the properties?

Thank you,
Matthias

On Tue, May 5, 2009 at 5:14 PM, Anders Nawroth <and...@neotechnology.com> wrote:

> hi!
>
> > Your best shot right now is to search the mailing list archives and
> > look at the code.
>
> And here's the searchable mail archive:
> http://www.mail-archive.com/user@lists.neo4j.org/info.html
>
> /anders
>
> > If you are interested in how nodes and relationships
> > are stored on disk I would suggest taking a look at the
> > org.neo4j.impl.nioneo.store package.
> >
> > There is also a batch insert implementation (that will be available in
> > the b9 release) that can be found in the org.neo4j.impl.batchinsert
> > package. The batch insert implementation works directly with the
> > nioneo.store package and drops support for transactions and the like
> > to achieve maximum insert speed. It may be easier to start looking
> > there since transactions tend to make things more complicated.
> >
> > Regards,
> > Johan
> >
> > On Tue, May 5, 2009 at 7:50 AM, Matthias Broecheler
> > <matth...@knowledgefrominformation.com> wrote:
> >
> >> Hello everybody,
> >>
> >> I just spent a good deal of time learning about Neo4j and got very
> >> interested in the database framework. It seems applicable to the kind of
> >> things I would like to do on graph datasets. However, I could not find any
> >> information on the underlying data model of neo4j. How are nodes/edges
> >> stored on disk? How do you optimize retrieval? What kind of index structures
> >> do you use? Do you have any information for contributing developers?
> >>
> >> Thank you very much and keep up the good work.
> >> Best,
> >> Matthias
> >>
> > _______________________________________________
> > Neo mailing list
> > User@lists.neo4j.org
> > https://lists.neo4j.org/mailman/listinfo/user
>
> --
> Anders Nawroth [and...@neotechnology.com]
> GTalk, Skype: anders.nawroth
> Phone: +46 737 894 163
> http://twitter.com/nawroth
> http://blog.nawroth.com/

--
Matthias Broecheler
Department of Computer Science
University of Maryland
4468 A.V. Williams Building
College Park, MD 20742
USA
Phone: +1.240.476.7110
E-Mail: matth...@cs.umd.edu
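
P.S. In case it helps anybody reading the notes above, here is a rough Python sketch of the record layout as described there: fixed-length node and relationship records, with each relationship sitting in two doubly linked chains (one per endpoint). It is only an illustration of that description, not the actual nioneo.store format. The field widths and order, the NIL marker, and all names (Store, add_relationship, relationships_of) are assumptions made up for this example; the node id is taken to be implied by the record's position, and self-loops are not handled.

```python
import struct

# Hypothetical layouts: field widths and order are illustrative only,
# not Neo4j's actual on-disk format.
NODE = struct.Struct(">ii")        # first_rel_id, first_prop_id
REL = struct.Struct(">iiiiiiii")   # type_id, src, dst, src_prev, src_next,
                                   # dst_prev, dst_next, first_prop_id
NIL = -1                           # "no record" marker (assumption)


class Store:
    """Fixed-length records in one buffer; record id * size = byte offset."""

    def __init__(self, layout):
        self.layout = layout
        self.buf = bytearray()

    def append(self, *fields):
        self.buf += self.layout.pack(*fields)
        return len(self.buf) // self.layout.size - 1

    def read(self, rec_id):
        offset = rec_id * self.layout.size
        return list(self.layout.unpack_from(self.buf, offset))

    def write(self, rec_id, fields):
        self.layout.pack_into(self.buf, rec_id * self.layout.size, *fields)


def add_relationship(nodes, rels, src, dst, type_id):
    """Insert a relationship at the head of both endpoints' chains (src != dst)."""
    src_head = nodes.read(src)[0]
    dst_head = nodes.read(dst)[0]
    rel_id = rels.append(type_id, src, dst, NIL, src_head, NIL, dst_head, NIL)
    for node, old_head in ((src, src_head), (dst, dst_head)):
        if old_head != NIL:
            rec = rels.read(old_head)
            # Update the "prev" pointer on whichever side this node occupies.
            rec[3 if rec[1] == node else 5] = rel_id
            rels.write(old_head, rec)
        node_rec = nodes.read(node)
        node_rec[0] = rel_id          # the node now points at the new head
        nodes.write(node, node_rec)
    return rel_id


def relationships_of(nodes, rels, node):
    """Follow the node's chain; every hop is one lookup by record offset."""
    rel_id = nodes.read(node)[0]
    while rel_id != NIL:
        rec = rels.read(rel_id)
        yield rel_id
        rel_id = rec[4] if rec[1] == node else rec[6]


nodes, rels = Store(NODE), Store(REL)
a, b, c = (nodes.append(NIL, NIL) for _ in range(3))
add_relationship(nodes, rels, a, b, 0)   # relationship 0
add_relationship(nodes, rels, c, a, 0)   # relationship 1
add_relationship(nodes, rels, b, c, 0)   # relationship 2
print(list(relationships_of(nodes, rels, a)))   # [1, 0]
```

In this sketch there is exactly one record per node, and everything else is reached by following pointers. It also makes the scattering question concrete: consecutive relationships of one node can sit at arbitrary offsets in the relationship buffer, so each hop may be a separate seek.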