Re: [Neo4j] [Neo] Newbie question

2010-05-25 Thread Peter Neubauer
Thomas,
if your dataset is that small, you could just start at any point, e.g.
iterate over all cars and check whether each car node is connected to
the right nodes (the example only shows this for the year node). I
updated the GIST at http://gist.github.com/411699 to reflect this.
This is a very crude example, but given your amount of data, you could
start either at the cars or at e.g. a color or year node, which you
could find using Lucene (not included in this example; I am connecting
the nodes to the reference node instead).
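
To make the brute-force idea concrete, here is a minimal in-memory sketch
in Python. It is not the code from the gist and it does not use the Neo4j
API; the dictionary layout, the relationship names (YEAR, COLOR,
MANUFACTURER) and the sample cars are all assumptions made up for
illustration.

```python
# Hypothetical in-memory stand-in for the car graph (not the Neo4j API).
# Each car maps a relationship type to the node it points to.
cars = {
    "car1": {"YEAR": 2000, "COLOR": "Silver", "MANUFACTURER": "SAAB"},
    "car2": {"YEAR": 2001, "COLOR": "Silver", "MANUFACTURER": "SAAB"},
    "car3": {"YEAR": 2000, "COLOR": "Grey", "MANUFACTURER": "SAAB"},
    "car4": {"YEAR": 2002, "COLOR": "Silver", "MANUFACTURER": "Volvo"},
}


def find_cars(cars, years, color, manufacturer):
    """Visit every car and keep those connected to all the wanted nodes."""
    matches = []
    for name, rels in cars.items():
        if (rels["YEAR"] in years
                and rels["COLOR"] == color
                and rels["MANUFACTURER"] == manufacturer):
            matches.append(name)
    return matches


result = find_cars(cars, years={2000, 2001}, color="Silver", manufacturer="SAAB")
print(result)
```

With a few thousand cars, a full scan like this touches every car once per
query, which is why it is only reasonable for small datasets.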

Does that help?

Cheers,

/peter neubauer

COO and Sales, Neo Technology

GTalk:  neubauer.peter
Skype   peter.neubauer
Phone   +46 704 106975
LinkedIn   http://www.linkedin.com/in/neubauer
Twitter  http://twitter.com/peterneubauer

http://www.neo4j.org   - Your high performance graph database.
http://www.thoughtmade.com - Scandinavia's coolest Bring-a-Thing party.



On Mon, May 24, 2010 at 11:18 PM, Thomas Sant'ana wrote:
>> > 3) Let's say I have a Car graph, with:
>> >
>> > ReferenceNode (RN) --> Cars
>> > RN --> Manufacturers --> Ford / GM / SAAB / Volvo etc.
>> > RN --> Colors --> Grey, Silver,
>> > RN --> MakeYear --> 2000, 2001, 2002
>> >
>> > A given car has a relation to a Manufacturer, a Color, and a MakeYear:
>> >
>> > Cars --> aCar
>> > aCar --> Silver
>> > aCar --> 2000
>> > aCar --> SAAB
>> >
>> > How can I get all the 2000 - 2001, Silver, SAAB?
>>
>> I have made the basic structure in http://gist.github.com/411699 .
>> Now, the question is about how best to optimize set operations between
>> the different criteria. You could start at the year_2000 node (finding
>> it with an index lookup for the year 2000, to be added to the code) as
>> I did and iterate through all car nodes, and return everything that
>> has a COLOR connection to the Silver node, and a MANUFACTURER
>> connection to SAAB. However, with multiple years, you could even add
>> an index on top of e.g. the year nodes in order to be able to
>> efficiently select multiple years. Do you have some more info on the
>> dataset sizes of your domain so I can flesh the GIST out a bit with
>> your search?
>>
>>
> I'm still looking at the information. I used the car example to make it
> simple to describe. My dataset should be small: some 5000 car-like
> objects and some 25000 related objects. I want to use relations so that
> objects like color, make, year etc. decorate the base objects. One reason
> for this is that I can incrementally decorate the objects as I release
> versions of my software.
>
> Thomas
___
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user


Re: [Neo4j] [Neo] Newbie question

2010-05-25 Thread Craig Taverner
> How can I get all the 2000 - 2001, Silver, SAAB?
>

This is a query based on multiple properties and I think there are three
options for this:

   - Identify the most selective property and traverse on that, testing for
   the others during the traversal. I think this is what Peter was
   suggesting. So, for example, if we feel that selecting on the make year
   will limit the result set the most, traverse down the year tree to the
   cars with the correct years, and for all cars found test whether they
   have relations back to the 'silver' and 'saab' nodes.
   - Make a Lucene index on the combinations of properties you are likely to
   search for. Essentially you are creating a composite key, like
   '2000-silver-saab', for each car and indexing that in Lucene. This can make
   it slightly tricky to deal with ranges, but a small integer range like
   2000-2001 is simply two queries, so not too bad. Adding new properties to
   the cars requires creating new keys for those cars as well.
   - Create composite index nodes for all unique combinations you are likely
   to search for. So instead of ref-->colors-->silver-->carA and
   ref-->make-->saab-->carA, you would have
   ref-->colors-makes-->silver-saab-->carA. Since you are likely to have many
   'Silver SAAB's, this is still a decent index tree. If, however, the complete
   combination of properties becomes too diverse (so the index nodes are as
   plentiful as the car nodes), this becomes less efficient. For your example,
   it looks good.
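
As a rough illustration of the composite key option, the sketch below uses
a plain Python dict in place of a Lucene index, so it shows only the shape
of the idea and not any Lucene or Neo4j call. The sample cars and the
helper names (composite_key, lookup) are hypothetical. The year range is
handled as one exact-match lookup per year.

```python
from collections import defaultdict

# Hypothetical sample data: (car, year, color, make). Not from the thread.
CARS = [
    ("carA", 2000, "silver", "saab"),
    ("carB", 2001, "silver", "saab"),
    ("carC", 2000, "grey", "saab"),
    ("carD", 2002, "silver", "volvo"),
]


def composite_key(year, color, make):
    """Build a key such as '2000-silver-saab'."""
    return f"{year}-{color}-{make}"


# A plain dict stands in for the Lucene index: key -> list of cars.
index = defaultdict(list)
for car, year, color, make in CARS:
    index[composite_key(year, color, make)].append(car)


def lookup(years, color, make):
    """Run one exact-match lookup per year and concatenate the hits."""
    hits = []
    for year in years:
        hits.extend(index.get(composite_key(year, color, make), []))
    return hits


silver_saabs = lookup(range(2000, 2002), "silver", "saab")
print(silver_saabs)
```

Note that every property in the key has to be known at indexing time, which
is the cost mentioned above when new properties are added to the cars.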

This last option is my personal favourite (as you can see by searching the
mailing lists :-)
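
A minimal sketch of the composite index node idea, again with plain Python
structures rather than real graph nodes: each distinct (color, make) pair
plays the role of one index node such as 'silver-saab', and the year is
tested only on the cars under that node. The sample data and the function
name cars_under are assumptions for illustration.

```python
from collections import defaultdict

# Hypothetical sample data: (car, year, color, make). Not from the thread.
CARS = [
    ("carA", 2000, "silver", "saab"),
    ("carB", 2001, "silver", "saab"),
    ("carC", 2003, "silver", "saab"),
    ("carD", 2000, "grey", "saab"),
    ("carE", 2002, "silver", "volvo"),
]

# One entry per distinct (color, make) pair, standing in for an index node
# and the cars that hang off it.
index_nodes = defaultdict(list)
for car, year, color, make in CARS:
    index_nodes[(color, make)].append((car, year))


def cars_under(color, make, first_year, last_year):
    """Jump to one index node, then test the year on its cars only."""
    return [car for car, year in index_nodes.get((color, make), [])
            if first_year <= year <= last_year]


found = cars_under("silver", "saab", 2000, 2001)
# The closer this ratio gets to 1, the less the index nodes narrow things down.
ratio = len(index_nodes) / len(CARS)
print(found, ratio)
```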


Re: [Neo4j] [Neo] Newbie question

2010-05-25 Thread Thomas Sant'ana
On Tue, May 25, 2010 at 6:00 AM, Peter Neubauer <
peter.neuba...@neotechnology.com> wrote:

> Thomas,
> if your dataset is that small, you could just start at any point, e.g.
> iterate over all cars and check whether each car node is connected to
> the right nodes (the example only shows this for the year node). I
> updated the GIST at http://gist.github.com/411699 to reflect this.
> This is a very crude example, but given your amount of data, you could
> start either at the cars or at e.g. a color or year node, which you
> could find using Lucene (not included in this example; I am connecting
> the nodes to the reference node instead).
>
> Does that help?
>

Sure does.

I know I have few objects to traverse, but I was more curious about how to
handle a larger set of objects. What has made me consider Neo is the
seamless way to enrich my data set with new relations as I progress. My idea
is to add relations to represent aspects of the objects that can be searched
for. For example: one day I create a crawler that will update nodes to map
the model number to some feature (say the number of doors the car has). Then
I can extend my UI to allow users to search for that criterion too (in an
ad-hoc manner).

In such a case adding Lucene indexes may not be that easy. If the graph is
huge, maybe it would make sense to start the traversal from the nodes that
will yield the smallest starting graph.
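
Roughly what I have in mind, as a small Python sketch with sets instead of
real nodes. The node names, the sample cars and the function match_all are
made up for illustration, and how to find out cheaply how many cars hang
off each node in Neo is an open question here.

```python
# Hypothetical reverse lookups: criterion node -> set of cars connected to it.
cars_by_node = {
    "year:2000": {"car1", "car3"},
    "color:Silver": {"car1", "car2", "car4", "car5"},
    "make:SAAB": {"car1", "car2", "car3"},
}


def match_all(criteria, cars_by_node):
    """Start from the criterion with the fewest cars, then test the rest."""
    ordered = sorted(criteria, key=lambda node: len(cars_by_node[node]))
    start, others = ordered[0], ordered[1:]
    return {car for car in cars_by_node[start]
            if all(car in cars_by_node[node] for node in others)}


matches = match_all(["color:Silver", "make:SAAB", "year:2000"], cars_by_node)
print(matches)
```

Here the year node has the fewest cars, so only two cars are visited instead
of the four silver ones. A new criterion (say, number of doors) would just
be one more entry in the lookup.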

Thomas


Re: [Neo4j] [Neo] Newbie question

2010-05-26 Thread Peter Neubauer
Thomas,
sure, the smallest starting graph is probably a viable approach
initially. If things get big, you can always add Lucene indexes as you
go, and search in them to get a smaller initial set of nodes.

HTH

Cheers,

/peter neubauer


On Tue, May 25, 2010 at 3:39 PM, Thomas Sant'ana wrote:
> In such a case adding Lucene indexes may not be that easy. If the graph is
> huge, maybe it would make sense to start the traversal from the nodes that
> will yield the smallest starting graph.
>
> Thomas