Re: [Neo4j] Traversal RelationshipExpander

2010-11-26 Thread Kalin Wilson Development
Thanks again, Craig. I think we're on the same track. I agree about the 'join' 
node. I'm not quite sure what to call it yet in my model but the concept looks 
right.
Thanks for the traversal tips, they make sense too.

Kalin


Re: [Neo4j] Traversal RelationshipExpander

2010-11-25 Thread Kalin Wilson Development
Thanks for your suggestions regarding my model, Craig. I agree about the ids. 
Using a generalization for Asset Types might be something to look at.
I'm still stuck on wanting to capture the general nature of an Asset as well as
the specific triple of Producer-Asset-Consumer. I'll have to think about your
statement that Asset is specific to a Producer. In my current model, an Asset
may have multiple Producers and multiple Consumers. The concept of an Asset is
more general than a single Producer-Consumer relationship.
Maybe it's appropriate to duplicate the Asset node (not the actual node but the 
properties that identify an Asset). My relational DB mind doesn't want that 
kind of duplication.

Some of the questions I want to answer:
1. Given a Consumer, what Assets does it subscribe to?
2. Given a Consumer, what Producers is it dependent upon?
3. Given a Producer, what Assets does it contribute to?
4. Given a Producer, what Consumers are dependent on it?
5. Given an Asset, what Producers contribute to it and what Consumers subscribe 
to it?

There's other node types and properties within the graph that aren't important 
or that I can't discuss (this is a mock model anyway).

What is the consensus about duplicating node data within a network? I can see 
how using indexing or a hierarchy, as you've pointed out, might help with that.
Part of my hangup is that I'm looking at using Neo4J as an adjunct to a RDBMS 
to store dependency relationships. Each node will have information that ties 
back to the RDBMS for lookups. But that doesn't require absolutely unique 
nodes. Keeping the two DBs in sync will be a challenge but I don't have the 
option to push all of the data into Neo4J and I'd rather not manage the 
dependencies in the RDBMS.

Actually, now that I've talked around it, I think I see how your model would 
work for what I want to do. I'd have to see how a Traverser would return the 
results.
Thanks for your time,
 Kalin


Re: [Neo4j] Traversal RelationshipExpander

2010-11-25 Thread Craig Taverner
Hi Kalin,

I'm not sure I follow about duplicating. The suggestion I made did not
involve any duplicating. The AssetType nodes would contain properties
appropriate to the type of asset, and the Asset node would contain
properties only appropriate to that instance (or none if none are
appropriate). Perhaps it is easier if I revert to your original names, use
Asset for the type again, and come up with a new name for the specific
instance? Then you get something like:

Producer[1] --(contributes_to)--> [X] <--(subscribes_to)-- Consumer1
                                   |
                                 (IS_A)
                                   |
                                   V
                               Asset[P] <--(subscribes_to)-- Consumer2


Now you can see that the X is really just like a join table in a relational
database. In fact my suggestion is very similar to a common refactoring in a
relational database: when you have a foreign key to another table and then
need to add properties to that relationship, you create an intermediate
'join table' and add the properties there.

I definitely still think you will need to expand your
producer-asset-consumer triple to be a producer-x-asset-consumer, where the
'x' is the 'join table' that allows a consumer to subscribe to assets by a
particular producer. Which is what you want, right?

I will answer, based on my model suggestion, each of the queries you ask
below:

1. Given a Consumer, what Assets does it subscribe to?


Traverse from consumer along both outgoing 'subscribes_to' and outgoing
'is_a' relationships, and you will get all assets regardless of whether the
consumer subscribes to the asset in general, or the specific producer-asset.


 2. Given a Consumer, what Producers is it dependent upon?


Traverse from consumer along outgoing 'subscribes_to', incoming 'is_a' and
incoming 'contributes_to' relationships, and you will get all producers of
all assets (specific or otherwise) that the consumer subscribes to.


 3. Given a Producer, what Assets does it contribute to?


Traverse from producer along outgoing 'contributes_to' and outgoing 'is_a'
relationships, and you get the Assets that they contribute to.


 4. Given a Producer, what Consumers are dependent on it?


Traverse from producer along outgoing 'contributes_to', outgoing 'is_a' and
incoming 'subscribes_to' relationships, and you get all Consumers that
depend on assets that producer contributes to. This one is more subtle,
though, because it depends on what you mean by 'dependent on'. The
traverser I described will not exclude consumers that subscribe to assets
that are produced by both the given producer and others (giving the
consumer the choice of producer). If you only want consumers that have no
choice, leave the 'is_a' relationship out of the traverser.


 5. Given an Asset, what Producers contribute to it and what Consumers
 subscribe to it?


For producers traverse on incoming 'is_a' and incoming 'contributes_to'. For
consumers traverse on incoming 'is_a' and incoming 'subscribes_to'.
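
The five traversals above can be sketched without Neo4j at all. The block below is a minimal plain-Java illustration, not the Traversal framework API: the sample graph, the second producer (Producer2 with its own join node Y), the `query` helper and the ">" / "<" notation for outgoing and incoming steps are all assumptions made up for this example. Node kinds are recognised by name prefix purely to keep the sketch short.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// In-memory stand-in for the graph; names and the ">" / "<" step notation are illustrative only.
class JoinNodeQueriesSketch {

    // Each row is {from, type, to}. X and Y are the 'join' nodes.
    static final String[][] RELS = {
        {"Producer1", "contributes_to", "X"}, {"X", "is_a", "AssetP"},
        {"Producer2", "contributes_to", "Y"}, {"Y", "is_a", "AssetP"},
        {"Consumer1", "subscribes_to", "X"},
        {"Consumer2", "subscribes_to", "AssetP"},
    };

    /**
     * Walks from start along the allowed steps and returns the reached nodes
     * whose name begins with kind. ">type" follows outgoing relationships,
     * "<type" follows incoming ones.
     */
    static Set<String> query(String start, String kind, String... steps) {
        List<String> allowed = Arrays.asList(steps);
        Set<String> seen = new HashSet<>();
        Deque<String> todo = new ArrayDeque<>();
        seen.add(start);
        todo.add(start);
        while (!todo.isEmpty()) {
            String node = todo.remove();
            for (String[] r : RELS) {
                String next = null;
                if (r[0].equals(node) && allowed.contains(">" + r[1])) next = r[2];
                if (r[2].equals(node) && allowed.contains("<" + r[1])) next = r[0];
                if (next != null && seen.add(next)) todo.add(next);
            }
        }
        Set<String> result = new TreeSet<>(); // sorted for stable output
        for (String n : seen) {
            if (n.startsWith(kind) && !n.equals(start)) result.add(n);
        }
        return result;
    }

    public static void main(String[] args) {
        // 1. Assets a consumer subscribes to: [AssetP]
        System.out.println(query("Consumer1", "Asset", ">subscribes_to", ">is_a"));
        // 2. Producers a consumer depends on: [Producer1, Producer2]
        System.out.println(query("Consumer2", "Producer", ">subscribes_to", "<is_a", "<contributes_to"));
        // 3. Assets a producer contributes to: [AssetP]
        System.out.println(query("Producer1", "Asset", ">contributes_to", ">is_a"));
        // 4. Consumers depending on a producer: [Consumer1, Consumer2],
        //    then without 'is_a' (no choice of producer): [Consumer1]
        System.out.println(query("Producer1", "Consumer", ">contributes_to", ">is_a", "<subscribes_to"));
        System.out.println(query("Producer1", "Consumer", ">contributes_to", "<subscribes_to"));
        // 5. Producers and consumers of an asset: [Producer1, Producer2], [Consumer1, Consumer2]
        System.out.println(query("AssetP", "Producer", "<is_a", "<contributes_to"));
        System.out.println(query("AssetP", "Consumer", "<is_a", "<subscribes_to"));
    }
}
```

Dropping ">is_a" in query 4 is what restricts the result to consumers tied to that one producer, since the walk can then no longer reach the shared Asset node.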

There's other node types and properties within the graph that aren't
 important or that I can't discuss (this is a mock model anyway).


Hopefully the fact that I renamed my AssetType back to your original Asset,
and have a new 'join table', possibly called 'producer-asset' for the
additional concept I added before will make it easier to see how to fit this
model into your existing structures.

What is the consensus about duplicating node data within a network? I can
 see how using indexing or a hierarchy, as you've pointed out, might help
 with that.
 Part of my hangup is that I'm looking at using Neo4J as an adjunct to a
 RDBMS to store dependency relationships. Each node will have information
 that ties back to the RDBMS for lookups. But that doesn't require absolutely
 unique nodes. Keeping the two DBs in sync will be a challenge but I don't
 have the option to push all of the data into Neo4J and I'd rather not manage
 the dependencies in the RDBMS.


I dislike duplicating data in neo4j as much as I dislike duplicating it in a
RDBMS. I'm sorry if I gave the impression in my previous email that I was
suggesting duplication. My suggestion was just the addition of this
'join-table' idea (an extra node in the graph), to make it possible to
capture both the concept of a consumer subscribing to an asset and that of
subscribing to an asset by a particular producer.

Actually, now that I've talked around it, I think I see how your model would
 work for what I want to do. I'd have to see how a Traverser would return the
 results.
 Thanks for your time,


Super. Perhaps I should have read your complete mail before writing this new
one, but hopefully the new one, especially all the traverser suggestions,
has helped clarify further? :-)

Regards, Craig
___
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user


[Neo4j] Traversal RelationshipExpander

2010-11-24 Thread Kalin Wilson
   I'm enjoying Neo4J so far. The new Traversal framework has a lot of
   potential. However, I'd like to propose an extension to the
   RelationshipExpander interface or have someone tell me of another way
   to accomplish a task.


   Here is an outline of the basics of my network:


   Producer --contributes_to--> Asset <--subscribes_to-- Consumer (various
   properties on each type of Node and Relationship)


   A Consumer may subscribe to an Asset generically or the subscribes_to
   relationship may have a property, producers, which is an array of ids
   (long) of Producer nodes that specifically produce assets for that
   Consumer.


   Given a list of Producer nodes, I want to retrieve all paths from a
   Producer to all of its Consumers through Assets. When at an Asset node
   the subscribes_to relationship should only be traversed if the
   relationship has no producers property (meaning the related Consumer
   consumes from all producers of that asset) or if the producers property
   contains the id of the Producer that we started with.


   My first approach was to implement RelationshipExpander to determine
   which relationships from Asset to traverse. However, since all the
   Expand() method has to work with is the current Node, I don't have
   enough information to make the decision above. I would need the current
   Path in order to know what Producer related to the current Asset the
   Traversal came from.


   Right now I'm using this description which will give me
   Producer-Asset-Consumer paths but won't exclude based on producers on
   subscribes_to:


   TraversalDescription producerToConsumer = Traversal.description()
       .depthFirst().uniqueness(Uniqueness.RELATIONSHIP_PATH)
       .relationships(RelTypes.Contributes_to, Direction.OUTGOING)
       .relationships(RelTypes.Subscribes_to, Direction.BOTH)
       .prune(Traversal.pruneAfterDepth(2));



   If there is a way to accomplish what I describe above with the current
   framework, I'm open to suggestions, including a different network design
   to track the Producer to Consumer relationship as it relates to Asset.


   Alternatively, I suggest that there may be situations where the Path
   context is needed to make the decision on what relationships to
   traverse from a Node, hence perhaps add Expand(Path p) to the
   RelationshipExpander interface.
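
To make the proposal concrete, here is a rough sketch of what a path-aware expander could look like. It is plain Java with no Neo4j dependency: `PathExpander`, `Rel`, `PRODUCER_AWARE` and `nextNodes` are hypothetical names invented for this illustration, the path is reduced to a list of node ids, and `producers` being null stands for the property being absent.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Plain-Java stand-ins for illustration only; none of these types are Neo4j API.
class PathAwareExpanderSketch {

    /** A relationship at an Asset; producers is null when the property is absent. */
    static final class Rel {
        final String type;
        final long otherNode;
        final long[] producers;

        Rel(String type, long otherNode, long[] producers) {
            this.type = type;
            this.otherNode = otherNode;
            this.producers = producers;
        }
    }

    /** Hypothetical path-aware expander: it sees the node ids visited so far. */
    interface PathExpander {
        List<Rel> expand(List<Long> pathSoFar, List<Rel> relsAtEndNode);
    }

    /** Keeps a subscribes_to relationship only if it is generic or names the start producer. */
    static final PathExpander PRODUCER_AWARE = (pathSoFar, relsAtEndNode) -> {
        long startProducer = pathSoFar.get(0); // only known because the path is available
        List<Rel> kept = new ArrayList<>();
        for (Rel rel : relsAtEndNode) {
            boolean generic = rel.producers == null;
            boolean namesStart = !generic
                    && Arrays.stream(rel.producers).anyMatch(id -> id == startProducer);
            if (!rel.type.equals("subscribes_to") || generic || namesStart) {
                kept.add(rel);
            }
        }
        return kept;
    };

    /** Ids of the nodes the expander would continue to from the end of the path. */
    static List<Long> nextNodes(List<Long> pathSoFar, List<Rel> relsAtEndNode) {
        List<Long> ids = new ArrayList<>();
        for (Rel rel : PRODUCER_AWARE.expand(pathSoFar, relsAtEndNode)) {
            ids.add(rel.otherNode);
        }
        return ids;
    }

    public static void main(String[] args) {
        // Asset 10 has three subscribers: 20 (generic), 21 (producer 1 only), 22 (producer 2 only).
        List<Rel> atAsset = Arrays.asList(
                new Rel("subscribes_to", 20L, null),
                new Rel("subscribes_to", 21L, new long[] {1L}),
                new Rel("subscribes_to", 22L, new long[] {2L}));
        System.out.println(nextNodes(Arrays.asList(1L, 10L), atAsset)); // prints [20, 21]
    }
}
```

An expander that only receives the current node (the Asset, id 10 here) could not tell which of the three relationships to keep, which is the gap described above.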


   Thanks,

   Kalin
   --
   Kalin Wilson
   http://www.kalinwilson.com
   


Re: [Neo4j] Traversal RelationshipExpander

2010-11-24 Thread Mattias Persson
Hi Kalin,

To begin with I'm not fond of storing ids as properties... that's what
relationships are for. So I'd perhaps insert a middle node between Asset
and Consumer which then also can have relationships to Producers.

Anyways, to get that behaviour you can use a filter which will exclude
unwanted paths.

Traversal.description().uniqueness(RELATIONSHIP_PATH)
   .relationships(Contributes_to, OUTGOING)
   .relationships(Subscribes_to, INCOMING)
   .filter(new Predicate<Path>() {
       public boolean accept(Path position) {
           // Only full Producer-Asset-Consumer paths (two relationships) qualify.
           if ( position.length() != 2 ) return false;
           Relationship subscribesToRel = position.lastRelationship();
           if ( /* check properties on subscribesToRel is OK */ ) return true;
           return false;
       }
   });

Do you need the pruneAfterDepth(2) here? I don't think so, because I don't
think your traverser will be able to go deeper anyway, but that's just a
detail.
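
The check left as a comment in the filter could look roughly like the following. This is a minimal sketch in plain Java with no Neo4j dependency, so the property value is passed in directly; `SubscriptionCheck` and `allows` are hypothetical names. It assumes, as in Kalin's description, that `producers` is a `long[]` of Producer node ids and that a missing property (modelled here as null) means the consumer accepts every producer.

```java
import java.util.Arrays;

// Illustrative helper only; not part of the Neo4j API.
class SubscriptionCheck {

    /**
     * Decides whether a subscribes_to relationship should count as a result.
     *
     * @param producers       value of the relationship's "producers" property,
     *                        or null when the property is absent (assumed convention)
     * @param startProducerId id of the Producer node the traversal started from
     */
    static boolean allows(long[] producers, long startProducerId) {
        if (producers == null) {
            return true; // generic subscription: consumes from every producer
        }
        return Arrays.stream(producers).anyMatch(id -> id == startProducerId);
    }

    public static void main(String[] args) {
        System.out.println(allows(null, 7L));                // true
        System.out.println(allows(new long[] {3L, 7L}, 7L)); // true
        System.out.println(allows(new long[] {3L}, 7L));     // false
    }
}
```

Inside the filter, the two arguments would presumably come from `subscribesToRel.getProperty("producers", null)` and `position.startNode().getId()`; treat that wiring as an untested assumption about how the property is stored.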


-- 
Mattias Persson, [matt...@neotechnology.com]
Hacker, Neo Technology
www.neotechnology.com


Re: [Neo4j] Traversal RelationshipExpander

2010-11-24 Thread Craig Taverner
Hi,

I also do not like having the producers ids in the relationship. This is
like having a non-indexed foreign key. I think the right solution is to
change the database structure to match the intention of the model.

I'll go out on a limb here and make some assumptions about what you really
mean by your model. You want to ask 'who consumes assets by this producer'.
Your problem is that what you are calling an Asset is actually a type of
Asset, but some consumers are interested in a specific asset, not just all
assets of that type, and so you are having to add extra data to resolve the
type in the 'subscribes' relationship. A far better solution is to have a
real asset node. By that I mean a 'Ford Mustang '68' node instead of a 'car'
node.

Then the case where the subscribes relationship has no producers property
really means the Consumer subscribes to the AssetType, and the case where
there are producers properties means the Consumer subscribes to those
specific Assets (where Asset is now specific to a producer).

Then the model will have no foreign keys, only relationships, and a plain
old standard out-of-the-box traverser will give you your answer :-)

Some further clarifications of the relationships I see in the graph:

   - Assets can have IS_A relationships to AssetTypes
   - Producers CONTRIBUTE_TO Assets, never AssetTypes (just like Ford
   produces the 'Ford Mustang', not 'cars')
   - Consumers can SUBSCRIBE_TO either Assets directly (i.e. specific
   models), or AssetTypes (this deals with your two cases before, but no longer
   needs ids in properties)

The traverser can traverse directly from the Producer to the Consumer
without any complications. Just follow the right relationships in the right
order, and only return Consumers.

The paths could look like:

Producer1 --(contributes_to)--> AssetX <--(subscribes_to)-- Consumer1
                                  |
                                (IS_A)
                                  |
                                  V
                              AssetTypeP <--(subscribes_to)-- Consumer2

Notice that traversing to consumers that are subscribing to specific assets
(assets by specific producers) is a shorter path than traversing to
consumers that subscribe to assets by any producer (asset types). This
should have no impact on the traverser. Just remember to include the IS_A
relationship type (with the right direction) to get the results you want.
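
As a rough illustration of the two path lengths, the sample graph above can be walked breadth-first in plain Java. This is only a sketch and does not use the Neo4j Traversal framework: the `Rel` class, the `step` and `consumersOf` helpers and the trick of spotting consumers by name prefix are all assumptions made for the example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// In-memory stand-in for the graph; node and helper names are illustrative only.
class ProducerToConsumerSketch {

    static final class Rel {
        final String from;
        final String type;
        final String to;

        Rel(String from, String type, String to) {
            this.from = from;
            this.type = type;
            this.to = to;
        }
    }

    static final List<Rel> GRAPH = new ArrayList<>();
    static {
        GRAPH.add(new Rel("Producer1", "contributes_to", "AssetX"));
        GRAPH.add(new Rel("AssetX", "is_a", "AssetTypeP"));
        GRAPH.add(new Rel("Consumer1", "subscribes_to", "AssetX"));
        GRAPH.add(new Rel("Consumer2", "subscribes_to", "AssetTypeP"));
    }

    /** Neighbours reached via outgoing contributes_to / is_a or incoming subscribes_to. */
    static List<String> step(String node) {
        List<String> next = new ArrayList<>();
        for (Rel r : GRAPH) {
            boolean outgoing = r.from.equals(node)
                    && (r.type.equals("contributes_to") || r.type.equals("is_a"));
            boolean incoming = r.to.equals(node) && r.type.equals("subscribes_to");
            if (outgoing) next.add(r.to);
            if (incoming) next.add(r.from);
        }
        return next;
    }

    /** Breadth-first walk from a producer; returns each consumer with its path length. */
    static Map<String, Integer> consumersOf(String producer) {
        Map<String, Integer> depth = new LinkedHashMap<>();
        Deque<String> queue = new ArrayDeque<>();
        depth.put(producer, 0);
        queue.add(producer);
        while (!queue.isEmpty()) {
            String node = queue.remove();
            for (String n : step(node)) {
                if (!depth.containsKey(n)) {
                    depth.put(n, depth.get(node) + 1);
                    queue.add(n);
                }
            }
        }
        // Only return consumers; in this sample they are recognised by name.
        depth.keySet().removeIf(name -> !name.startsWith("Consumer"));
        return depth;
    }

    public static void main(String[] args) {
        System.out.println(consumersOf("Producer1")); // prints {Consumer1=2, Consumer2=3}
    }
}
```

Consumer1 is found after two hops and Consumer2 after three, yet both come out of the same walk, which is why the extra IS_A hop needs no special handling.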

Cheers, Craig


Re: [Neo4j] Traversal RelationshipExpander

2010-11-24 Thread Kalin Wilson Development
Thanks Mattias. I agree about storing the IDs, it doesn't feel right but I'm 
still working out the best network model.
Thanks for the example, I guess I need to relook at when to use a filter vs 
control the traversal other ways.
