Wow, thanks a lot for the quick and kind reply!

I did browse the package that you suggested and hope that the following
notes reflect the underlying data/storage model accurately (in case anybody
else is interested):

Neo maintains 6 files (excluding indexes):

   - Node Storage
      - Node records are fixed length and of the form
      [node_id, relationship_id, property_id], where the latter two are
      pointers to the respective records in another file.
      - Do you store a number of such records for each node, namely
      max(#relations, #properties)? Or is there just one such record, with
      the other relationships and properties retrieved by following the
      linked lists in the respective storage containers?
   - Relationship Storage
      - Relationship records contain the id of the relationship, the id of
      the relationship type, pointers to the source and destination nodes,
      pointers to the next and previous relationships on both nodes (i.e. a
      doubly linked list), and a pointer to the next (first) property
      associated with the edge.
   - RelationshipType Storage
      - ID plus associated data: a string (dynamic length)
   - 3 files to store the properties, which are variable length. The store
   distinguishes between string and array properties. There is also one
   index file on properties.
      - Properties are stored in blocks, and Neo4j tracks which blocks are
      available.
      - Property records contain the property type (char, string, etc.),
      index block id, previous and next block, previous and next property,
      associated values, plus some header information.
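
To make the "fixed length" point concrete, here is a minimal Python sketch of
how I picture a node record being located by its id. The field widths (three
4-byte integers), the -1 "no pointer" sentinel, and the function names are my
own assumptions for illustration; they are not taken from the Neo4j code.

```python
import struct

# Assumed layout: [node_id, first_relationship_id, first_property_id],
# each a 4-byte big-endian signed int. The real on-disk widths may differ.
NODE_RECORD = struct.Struct(">iii")
NO_POINTER = -1  # assumed sentinel meaning "no relationship/property"


def write_node(buf, node_id, first_rel_id, first_prop_id):
    """Write a node record at the offset implied by its id."""
    offset = node_id * NODE_RECORD.size
    NODE_RECORD.pack_into(buf, offset, node_id, first_rel_id, first_prop_id)


def read_node(buf, node_id):
    """Fixed-length records allow a direct lookup: offset = id * record size."""
    return NODE_RECORD.unpack_from(buf, node_id * NODE_RECORD.size)


# A bytearray stands in for the node storage file (room for 4 records).
store = bytearray(NODE_RECORD.size * 4)
write_node(store, 2, 7, NO_POINTER)
print(read_node(store, 2))  # (2, 7, -1)
```

If this picture is right, finding a node record needs no index at all, since
the id alone determines the position in the file.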

I hope that this is a somewhat accurate description of what is going on.
One thing that I could not figure out: how do you prevent the relationships
of a single node from getting scattered throughout the relationship storage
file, which would lead to a lot of seeking if I wanted to retrieve all the
relationships of a node? And similarly for the properties?
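
To illustrate the concern, here is a small Python sketch of how I understand
the relationship chain of one node would be walked. The class, the field
names, the -1 sentinel, and the dict standing in for the relationship file are
all hypothetical; only the overall shape (per-node next/previous pointers
inside each relationship record) comes from my notes above.

```python
from dataclasses import dataclass

NONE = -1  # assumed "end of chain" marker


@dataclass
class Rel:
    rel_id: int
    type_id: int
    src: int
    dst: int
    src_prev: int = NONE
    src_next: int = NONE
    dst_prev: int = NONE
    dst_next: int = NONE
    first_prop: int = NONE


def relationships_of(node_id, first_rel, rels):
    """Follow the chain for node_id, returning record ids in visit order."""
    visited = []
    rel_id = first_rel
    while rel_id != NONE:
        rel = rels[rel_id]
        visited.append(rel_id)
        # Each record carries one "next" pointer per endpoint; pick the one
        # that belongs to the node whose chain we are walking.
        rel_id = rel.src_next if rel.src == node_id else rel.dst_next
    return visited


# Node 0 takes part in relationships 5, 1 and 9, stored far apart by id.
rels = {
    5: Rel(5, 0, src=0, dst=1, src_next=1),
    1: Rel(1, 0, src=2, dst=0, dst_prev=5, dst_next=9),
    9: Rel(9, 0, src=0, dst=3, src_prev=1),
}
print(relationships_of(0, 5, rels))  # [5, 1, 9]
```

The visit order 5, 1, 9 is what worries me: if record ids map to file
positions, each step may jump to a different part of the file.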

Thank you,
Matthias


On Tue, May 5, 2009 at 5:14 PM, Anders Nawroth <and...@neotechnology.com> wrote:

> hi!
>
> > Your best shot right now is to search the mailing list archives and
> > look at the code.
>
> And here's the searchable mail archive:
> http://www.mail-archive.com/user@lists.neo4j.org/info.html
>
> /anders
>
>
> > If you are interested in how nodes and relationships
> > are stored on disk, I would suggest taking a look at the
> > org.neo4j.impl.nioneo.store package.
> >
> > There is also a batch insert implementation (that will be available in
> > b9 release) that can be found in the org.neo4j.impl.batchinsert
> > package. The batch insert implementation works directly with the
> > nioneo.store package and drops support for transactions and the like
> > to achieve maximum insert speed. It may be easier to start looking
> > there since transactions tend to make things more complicated.
> >
> > Regards,
> > Johan
> >
> > On Tue, May 5, 2009 at 7:50 AM, Matthias Broecheler
> > <matth...@knowledgefrominformation.com> wrote:
> >
> >> Hello everybody,
> >>
> >> I just spent a good deal of time learning about Neo4j and got very
> >> interested in the database framework. It seems applicable to the kind of
> >> things I would like to do on graph datasets. However, I could not find any
> >> information on the underlying data model of neo4j. How are nodes/edges
> >> stored on disk? How do you optimize retrieval? What kind of index structures
> >> do you use? Do you have any information for contributing developers?
> >>
> >> Thank you very much and keep up the good work.
> >> Best,
> >>    Matthias
> >>
> > _______________________________________________
> > Neo mailing list
> > User@lists.neo4j.org
> > https://lists.neo4j.org/mailman/listinfo/user
> >
>
>
> --
> Anders Nawroth [and...@neotechnology.com]
> GTalk, Skype: anders.nawroth
> Phone: +46 737 894 163
> http://twitter.com/nawroth
> http://blog.nawroth.com/
>



-- 
Matthias Broecheler
Department of Computer Science
University of Maryland

4468 A.V. Williams Building
College Park, MD 20742
USA

Phone: +1.240.476.7110
E-Mail: matth...@cs.umd.edu