Re: [Neo4j] Modelling with neo4j

Peter Hunsberger Sat, 24 Sep 2011 20:48:14 -0700

I'm going to take a slightly different tack here than the responses you've
got so far...

First, as others have pointed out, this is three entities and two
relationships:

    joe - is a - janitor - at the - school

This is important: the lighter weight the relationships, the fewer problems
you are going to have with needing some form of meta type for them.  You
really want to follow the basic OO lead here and stick to "isa" and "hasa".
The "at the" is essentially a "hasa" (a school has a janitor).  If you can
map your relationships to these two simple concepts then you've got a
realistic model; otherwise you probably need to refactor, abstract or
normalize (take your pick depending on your background; they're essentially
the same thing at this level of modelling).
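
A minimal sketch of that two-relationship reading, in plain Python rather than
the Neo4j API. The `Graph` class and the edge labels `ISA` and `HASA` are
illustrative names assumed for this example, not anything Neo4j provides:

```python
from collections import defaultdict


class Graph:
    """Tiny in-memory directed graph with labelled edges (illustrative only)."""

    def __init__(self):
        # source node -> list of (label, target node)
        self.edges = defaultdict(list)

    def relate(self, source, label, target):
        self.edges[source].append((label, target))

    def targets(self, source, label):
        return [t for (l, t) in self.edges[source] if l == label]


g = Graph()
g.relate("joe", "ISA", "janitor")      # joe - is a - janitor
g.relate("school", "HASA", "janitor")  # "at the", read as: a school has a janitor

print(g.targets("joe", "ISA"))
print(g.targets("school", "HASA"))
```

Only two edge labels are needed for the whole sentence, which is the point of
keeping relationships lightweight.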

That leads to point number two.  Modelling is modelling is modelling, and
although a graph database might let you get up and running easily, it's not
going to save you from having to model your domain properly in the long run.
The worse a job you do up front, the more work it's going to be to fix it
later.  One common pattern is that if you think you have a meta type then
use that for your entity and add the details that make it a specific
subtype as instance data:

    a janitor is a type of job

therefore the entity should be "job", not "janitor"; job type is a property
of job

    school is a type of building (or maybe even more abstractly a type of
    location)

therefore the entity should be "building"; building type is a property of
building. In this case it may instead be "location", depending on exactly
what the domain is going to be used for, though most likely a building
"hasa" location...

As long as your number of subtypes is some reasonably small number then this
pattern of abstraction works well.  Here, "reasonably small" is in the range
of things you don't mind coding up an enum for in your code.  (How's that for
a non-committal type of guideline? I'll pick 10 as an arbitrary limit if
pushed.)
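
As a rough sketch of this pattern (plain Python, not Neo4j code), the subtype
becomes an enum-valued property on the more general entity. The extra enum
members `TEACHER` and `OFFICE` and the helper `jobs_of_type` are hypothetical,
added only to make the example non-trivial:

```python
from dataclasses import dataclass
from enum import Enum


class JobType(Enum):
    JANITOR = "janitor"
    TEACHER = "teacher"  # hypothetical second subtype


class BuildingType(Enum):
    SCHOOL = "school"
    OFFICE = "office"    # hypothetical second subtype


@dataclass
class Job:
    job_type: JobType            # "janitor" is instance data, not an entity


@dataclass
class Building:
    building_type: BuildingType  # likewise for "school"


def jobs_of_type(jobs, wanted):
    """Filter jobs on the subtype property."""
    return [job for job in jobs if job.job_type is wanted]


joes_job = Job(job_type=JobType.JANITOR)
the_school = Building(building_type=BuildingType.SCHOOL)
all_jobs = [joes_job, Job(job_type=JobType.TEACHER)]

print(jobs_of_type(all_jobs, JobType.JANITOR))
```

If the enum grows past the point where you'd be happy maintaining it by hand,
that is the signal that the subtype probably deserves more structure.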

Note that this need for abstraction is true for relational or graph
databases. A relational database can behave polymorphically with respect to
type just as much as a graph database; the difference is that with a
relational database, as you make the model more abstract, the number of joins
needed to fully resolve type goes up (assuming it's a fully normalized
model).  With a graph database the type can always be just an edge away.
However, in this case the cost is the number of relationships that must be
examined.  There is no free lunch: the space / time trade-off will always be
there, and that is what you have to worry about as you determine whether you
want to abstract more and build more of a meta model.
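
To make the "relationships examined" cost concrete, here is a deliberately
naive sketch in plain Python. It assumes a linear scan over the relationships
attached to a type node, which is a worst case chosen for illustration and not
a claim about how Neo4j traverses or indexes; all names are made up for the
example:

```python
def find_instance(instances_of_type, wanted):
    """Scan the relationships on a type node; return (found, examined)."""
    examined = 0
    for instance in instances_of_type:
        examined += 1
        if instance == wanted:
            return True, examined
    return False, examined


# Going from an instance to its type is a single edge...
type_of = {f"job-{i}": "job" for i in range(1000)}

# ...but going from the type node to one particular instance may mean
# looking at every relationship attached to that type node.
instances_of_job = list(type_of)

found, examined = find_instance(instances_of_job, "job-999")
print(type_of["job-999"], found, examined)
```

The more abstract the type node, the more instances hang off it, and the
larger that scan becomes.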

This brings us to my third point.  Your layers are perfectly realizable in a
graph database (or a relational database for that matter).  There is no
reason why the entity "janitor" can't have an "isa" relationship with
"person" and it in turn can have an "isa" relationship with the metatype
"mammal" if need be. Same with school, building and even structure if you
need to go that far. If you really need this, proceed with caution: the
metatypes are going to have many relationships as your instances grow, and
the cost of maintaining the metadata this way could get expensive, not so
much in terms of space (since you've normalized out the metadata) but in
terms of the time needed to traverse all the relationships.  However, if you
have millions of janitors or schools and all of their equivalent subtypes
then this may be the way to go.  I.e., if you have M subtypes of job
where M is a medium-sized number (say 10 < M < 50) then you can reduce
the size of the index on job by instead implementing the M different
subtypes (you more-or-less divide the index size for jobs by M).  As I said,
the cost is that you now have a whole bunch more relationships to the
metatype, but if you don't normally have to touch them, or if finding
specific instances of a subtype is a common and / or maybe expensive
operation for some reason, you've got the right model.
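
A small sketch of both halves of this point, in plain Python and under
simplifying assumptions: the "isa" chain is a child-to-parent dictionary, and
instances are assumed to be spread evenly over the M subtypes (real data
rarely is). The numbers N and M are arbitrary illustrations:

```python
from collections import Counter

# Part 1: an "isa" chain stored as child -> parent.
ISA = {"janitor": "person", "person": "mammal"}


def ancestors(entity):
    """Follow "isa" edges upward; return every type reached, nearest first."""
    chain = []
    while entity in ISA:
        entity = ISA[entity]
        chain.append(entity)
    return chain


# Part 2: one index on "job" versus one index per subtype.
N = 100_000  # number of job instances (illustrative)
M = 20       # number of job subtypes, a "medium-sized" M

single_index_size = N

# Assumption: instance i belongs to subtype i % M, i.e. an even spread.
per_subtype_index_sizes = Counter(i % M for i in range(N))
largest_subtype_index = max(per_subtype_index_sizes.values())

print(ancestors("janitor"))
print(single_index_size, largest_subtype_index)
```

With an even spread each per-subtype index holds N / M entries; the price, as
noted above, is the extra "isa" relationships up to the metatype.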

Last point.  What about the case where you have more than 50, or whatever
you consider some large number of subtypes?  In that case your model is
likely wrong.  There's hopefully a way to split the subtypes into
categories.  If some categories overlap then figure out what makes them
overlap and split that off as a category (type or metatype) of its own.  In
other words, it's time to refactor, which takes us full circle back to point
1, and it's now time to go get some sleep.

Hope this helps.

Peter Hunsberger

On Sat, Sep 24, 2011 at 12:52 AM, loldrup <lold...@gmail.com> wrote:

> I'm trying to figure out how to model the world most flexibly (okay, so I'm
> sticking to modelling organisations for now, but still). My main problem
> seems to occur when I want to allow the model to naturally expand in
> complexity. Say we have the following relationship:
>
> Joe is a janitor at the school.
>
> This can easily be modelled with two entities and a relationship. Now say I
> have some common properties for janitors. I would have to make a link from
> the janitor-relation to some node denoting the type 'janitor' which could
> then hold information on these common things. Unfortunately, relationships
> don't support that.
>
> Long story short: the problem is that sometimes I want my things to act as
> things, sometimes as types, sometimes as interfaces, and I cannot know in
> advance which of these modalities I'm going to need.
>
> Therefore, I'm considering going with this model:
>
> Imagine a graph in three layers. The lower layer represents things, the
> middle layer represents types and the upper layer represents interfaces.
> Initially I populate only the lowest layer, but as the need arises I go back
> and promote various things to also be types or interfaces. These then crop
> up in the second and third layer of the graph, respectively. When this
> happens, a vertical relationship is added between the element in the lower
> layer and its new type/interface in the higher layers.
>
> Now the question is: how to model this scheme in neo4j? A number of
> challenges pop up:
>
> * Neo4j relationships cannot be n-ary, so every relationship must be
> modelled with a hyperrelationship, thus allowing future relations to the
> second and third layers.
>
> * In a modalities-are-a-changing-paradigm it doesn't really make sense to
> distinguish between relations and entities; at different points in time,
> one element may have to act in the roles of both. Neo4j however makes a
> fundamental distinction between the two things. I could choose to model
> all relationships as nodes, but will that not make graph traversals messy?
>
> * Neo4j doesn't come with a strongly typed distinction between such three
> layers of modality
>
> --
> View this message in context:
> http://neo4j-community-discussions.438527.n3.nabble.com/Modelling-with-neo4j-tp3363823p3363823.html
> Sent from the Neo4j Community Discussions mailing list archive at
> Nabble.com.
> _______________________________________________
> Neo4j mailing list
> User@lists.neo4j.org
> https://lists.neo4j.org/mailman/listinfo/user
>
