
Re: [Neo] Traversers in the REST API

2010-04-09 Thread Alastair James

>
> Why not just slap memcached in the middle?  Would help with scalability
>   as well, plus you could keep cached results keyed by query params in
>   there if needed.  Just a thought...
>

Yes, but in my mind that says "you can't use Neo without a 3rd party caching
layer", which seems a little crazy: it adds complexity and leads to
inevitable 'eventual consistency' as the many different cached views
expire. Plus, there would still be the memory overhead of inflating 1000+
nodes from memcached.

Anyway, if you are caching, it is better to cache the generated HTML
pages (or page fragments), something that any website of scale will do
already, so caching in the middleware won't help.

However, caching does not help in situations where everybody sees a slightly
different view of the data, or where the search state space is large enough to
make the chance of a cache hit quite small.

Al
___
Neo mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user

Re: [Neo] Traversers in the REST API

2010-04-09 Thread rick . bullotta

   Why not just slap memcached in the middle?  Would help with scalability
   as well, plus you could keep cached results keyed by query params in
   there if needed.  Just a thought...



   -------- Original Message --------
   Subject: Re: [Neo] Traversers in the REST API
   From: Alastair James 
   Date: Fri, April 09, 2010 8:32 am
   To: Neo user discussions 

Re: [Neo] Traversers in the REST API

2010-04-09 Thread Alastair James

> Since in many cases the results of a query will need to be reformed into
> their associated domain objects

Unlikely to be the case over the HTTP API. It's unlikely that people will
create domain objects in (e.g.) PHP; they will just use the data directly.

> Pagination is kinda tricky if the data changes between subsequent
>   requests for "pages".  Since pagination is generally used for UIs, a
>   common approach is to place the entire dataset (or a cursor, depending
>   on where the data is coming from) in a session object.  Regardless of
>   where it is kept, if you want to deal with data changes, you either
>   have to a) invalidate the "cached" dataset if data changes or b) keep a
>   copy of the whole dataset around in its "as queried" state so that
>   subsequent paging requests are consistent.  Either case involves
>   keeping a fairly big duplicate data structure on the server or middle
>   tier and violates one of the objectives of REST-ful APIs, which is that
>   of statelessness.  For that reason, I personally think the REST-ful API
>   shouldn't deal with paging.  It should probably be done at some
>   intermediate level as needed by applications.  We can certainly build a
>   separate API that we can all leverage if needed, but I don't think it
>   should be in the core REST-ful layer.
>

Well, I think for my use cases (websites), it's likely that users don't flick
between pages that often. For example, on many sites, users will view page 1
and select an item, and very few move on to page 2. It's a very different
usage pattern compared to a traditional desktop UI, so there
is absolutely no need to hold the sorted set on the server in a cursor-type
way.

A typical use case for me would be 1000+ matching rows, with 90%+ of page
views for the first 10, 5% for the next 10, etc. You can clearly see that
sending the entire result set of 1000+ rows over HTTP/JSON is inefficient.

Of course, caching between the web server and the neo HTTP API can help, but
not in all cases, and it seems silly to rely on this.

Al




-- 
Dr Alastair James
CTO James Publishing Ltd.
http://www.linkedin.com/pub/3/914/163

www.worldreviewer.com

WINNER Travolution Awards Best Travel Information Website 2009
WINNER IRHAS Awards, Los Angeles, Best Travel Website 2008
WINNER Travolution Awards Best New Online Travel Company 2008
WINNER Travel Weekly Magellan Award 2008
WINNER Yahoo! Finds of the Year 2007

"Noli nothis permittere te terere!"

Re: [Neo] Traversers in the REST API

2010-04-09 Thread rick . bullotta

   Since in many cases the results of a query will need to be reformed into
   their associated domain objects, we've chosen to do our sorting at that
   point (and on the server).  We do our (primary) filtering within the
   traversal/DB->domain object processes.  That seems to work well.



   Pagination is kinda tricky if the data changes between subsequent
   requests for "pages".  Since pagination is generally used for UIs, a
   common approach is to place the entire dataset (or a cursor, depending
   on where the data is coming from) in a session object.  Regardless of
   where it is kept, if you want to deal with data changes, you either
   have to a) invalidate the "cached" dataset if data changes or b) keep a
   copy of the whole dataset around in its "as queried" state so that
   subsequent paging requests are consistent.  Either case involves
   keeping a fairly big duplicate data structure on the server or middle
   tier and violates one of the objectives of REST-ful APIs, which is that
   of statelessness.  For that reason, I personally think the REST-ful API
   shouldn't deal with paging.  It should probably be done at some
   intermediate level as needed by applications.  We can certainly build a
   separate API that we can all leverage if needed, but I don't think it
   should be in the core REST-ful layer.



   Just my $0.02, after taxes.







   -------- Original Message ----
   Subject: Re: [Neo] Traversers in the REST API
   From: Tobias Ivarsson 
   Date: Fri, April 09, 2010 4:00 am
   To: Neo user discussions 

Re: [Neo] Traversers in the REST API

2010-04-09 Thread Tobias Ivarsson

I definitely agree that limiting or paging a set of results is probably not
very useful without some sort of sorting. The (only) benefit of pushing
sorting to the client is that the client might be able to filter the result
further before sorting it. Since sorting is generally the most expensive
operation it should be done as late as possible. However the idea of
semi-sorting, to get only one page of sorted results at each request, that
was mentioned in some thread yesterday sounds quite compelling.
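
A minimal sketch of what "semi-sorting" could look like, assuming it means
keeping only the best (offset + limit) items while the results stream in, so
the full result set is never sorted. The names topPage and compare are
hypothetical, and nothing here is part of any Neo4j API. Each call starts from
scratch, so no state is held between page requests.

```javascript
// Keep a small sorted buffer of at most (offset + limit) items.
function topPage(items, compare, offset, limit) {
  const keep = offset + limit;
  const best = []; // always sorted, never longer than `keep`
  for (const item of items) {
    // Binary search for the insertion point (after any equal items).
    let lo = 0;
    let hi = best.length;
    while (lo < hi) {
      const mid = (lo + hi) >> 1;
      if (compare(best[mid], item) <= 0) lo = mid + 1;
      else hi = mid;
    }
    if (lo < keep) {
      best.splice(lo, 0, item);
      if (best.length > keep) best.pop(); // drop the current worst item
    }
  }
  return best.slice(offset);
}

const scores = [5, 1, 9, 3, 7, 2, 8];
const descending = (a, b) => b - a;
const firstPage = topPage(scores, descending, 0, 3);  // the three highest
const secondPage = topPage(scores, descending, 3, 3); // the next three
```

The cost per page grows with offset + limit rather than with the size of the
whole result set, which suits the case where most requests are for page 1.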

I agree that an equivalent of LIMIT, OFFSET and ORDER BY is a good target.

As to indexing: the structure of the graph IS the index to a large extent.
This means that a well designed graph would often not need paging if the
traversal is done right. There are however some cases where this is hard to
accomplish and we need to work on supporting those cases better.

Remember that a Graph Database is NOT a Relational Database. A lot of the
ideas people have about databases are based on their knowledge of Relational
Databases. I understand that it can be hard, but if that baggage could be
left at the door it would make things a lot easier. Nobody is saying that
Relational Databases are dead (except for some publicity stunts); far from
it! What we (and a lot of other people) are saying is that the age of "one
database to rule them all" is over. Different problems are best solved with
different kinds of databases: RDBMSes are great at some, K/V stores at some,
and Graph Databases at some. Then there are some problems that
are best solved with a combination of two or more (kinds of) databases,
where each database brings its own strengths to the table, and is used only
for the things it is good at.

That's enough deviation from the topic. My conclusions remain the same as
they were before this discussion started. I will state them in as few words
as possible, in bullet point form, to convey them as clearly as I can:
* The REST API will probably need result set limiting or pagination.
* Limiting and pagination will require (server side) sorting.
* Sorting can be better implemented if it's implemented in the core of the
  traversal framework.
* Limiting / pagination can be deferred for a while until we know what it
  needs to look like (from looking at actual uses).
* (Server side) sorting can be deferred until we need it for limiting /
  pagination.

Peace,
Tobias

On Thu, Apr 8, 2010 at 10:17 PM, Michael Ludwig  wrote:

> Tobias Ivarsson schrieb am 08.04.2010 um 18:23:27 (+0200)
> [Re: [Neo] Traversers in the REST API]:
>
> > On Wed, Apr 7, 2010 at 3:05 PM, Alastair James 
> > wrote:
>
> > > when we start talking about returning 1000s of nodes in JSON over
> > > HTTP just to get the first 10 this is clearly sub-optimal (as I
> > > build websites this is a very common use case). So, as you say,
> > > sorting and limiting can wait, but I suspect the HTTP API would
> > > benefit from offering it. Limiting need not require changes to the
> > > core API, it could be implemented as a second stage in the HTTP API
> > > code prior to output encoding.
> >
> > For paging / limiting: yes, you are absolutely right, this would not
> > effect the core API at all, only the REST API. Limiting/paging is
> > something we would probably add to the REST API before sorting.
>
> Limiting and paging usually go hand in hand with sorting, in my
> experience. Why would anyone want to page through an unsorted
> collection?
>
> > Sorting might be a similar case, but I still think the client would be
> > better fitted to do sorting well.
>
> The server has indexes to support the sorting. (If it doesn't, it has a
> problem anyway.) What does the client have to support sorting? So how
> would it be better fitted to do sorting well?
>
> > But once paging / limiting is added it would be quite natural / useful
> > to add sorting as well. What I want to avoid is keeping state on the
> > server while waiting for the client to request the next page.
>
> If you ensure a binary tree index is used to do the sorting, you should
> be fine.
>
> --
> Michael Ludwig
> ___
> Neo mailing list
> User@lists.neo4j.org
> https://lists.neo4j.org/mailman/listinfo/user
>

-- 
Tobias Ivarsson 
Hacker, Neo Technology
www.neotechnology.com
Cellphone: +46 706 534857

Re: [Neo] Traversers in the REST API

2010-04-08 Thread Alastair James

On 8 April 2010 21:17, Michael Ludwig  wrote:

> Limiting and paging usually go hand in hand with sorting, in my
> experience. Why would anyone want to page through an unsorted
> collection?
>

It's quite possible that you might want the nodes in the order they were
found (e.g. the closest matching nodes first). However, I agree: sorting by
an arbitrary property is very useful!

Al

Re: [Neo] Traversers in the REST API

2010-04-08 Thread Michael Ludwig

Tobias Ivarsson schrieb am 08.04.2010 um 18:23:27 (+0200)
[Re: [Neo] Traversers in the REST API]:

> On Wed, Apr 7, 2010 at 3:05 PM, Alastair James 
> wrote:

> > when we start talking about returning 1000s of nodes in JSON over
> > HTTP just to get the first 10 this is clearly sub-optimal (as I
> > build websites this is a very common use case). So, as you say,
> > sorting and limiting can wait, but I suspect the HTTP API would
> > benefit from offering it. Limiting need not require changes to the
> > core API, it could be implemented as a second stage in the HTTP API
> > code prior to output encoding.
> 
> For paging / limiting: yes, you are absolutely right, this would not
> effect the core API at all, only the REST API. Limiting/paging is
> something we would probably add to the REST API before sorting.

Limiting and paging usually go hand in hand with sorting, in my
experience. Why would anyone want to page through an unsorted
collection?

> Sorting might be a similar case, but I still think the client would be
> better fitted to do sorting well.

The server has indexes to support the sorting. (If it doesn't, it has a
problem anyway.) What does the client have to support sorting? So how
would it be better fitted to do sorting well?

> But once paging / limiting is added it would be quite natural / useful
> to add sorting as well. What I want to avoid is keeping state on the
> server while waiting for the client to request the next page.

If you ensure a binary tree index is used to do the sorting, you should
be fine.

-- 
Michael Ludwig

Re: [Neo] Traversers in the REST API

2010-04-08 Thread Alastair James

>What I want to avoid
>is keeping state on the server while waiting for the client to request the
>next page.

You are quite right. However, I think for many use cases (e.g. generating a
paginated list of results on a webpage) it would not be necessary to store
state on the server.

Keeping state like that would be more similar to a SQL cursor. What I am
talking about is simply the equivalent of SQL LIMIT, OFFSET and ORDER BY.
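
To illustrate the stateless variant, here is a rough sketch in which every
request carries its own paging parameters and the query is simply re-run each
time, so the server holds no cursor or session. The parameter names
(order_by, offset, limit) and the function handlePageRequest are assumptions
made for the example, not an existing REST API.

```javascript
// Stateless paging: everything needed to build the page is in the request.
function handlePageRequest(queryString, runTraversal) {
  const params = new URLSearchParams(queryString);
  const orderBy = params.get("order_by");
  const offset = Number(params.get("offset") || 0);
  const limit = Number(params.get("limit") || 10);

  const rows = runTraversal(); // re-run on every request, nothing is kept
  if (orderBy) {
    // ORDER BY: ascending on the requested property.
    rows.sort((a, b) =>
      a[orderBy] < b[orderBy] ? -1 : a[orderBy] > b[orderBy] ? 1 : 0
    );
  }
  // OFFSET and LIMIT
  return rows.slice(offset, offset + limit);
}

// Stand-in for a real traversal; returns a fresh array on each call.
const fakeTraversal = () => [
  { name: "c", score: 3 },
  { name: "a", score: 1 },
  { name: "b", score: 2 },
];
const page = handlePageRequest("order_by=score&offset=1&limit=2", fakeTraversal);
```

The trade-off is the one discussed in this thread: if the data changes between
two requests, consecutive pages may overlap or skip items.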

Cheers

Al

On 8 April 2010 17:23, Tobias Ivarsson wrote:

> What I want to avoid
> is keeping state on the server while waiting for the client to request the
> next page.
>


Re: [Neo] Traversers in the REST API

2010-04-08 Thread Tobias Ivarsson

On Wed, Apr 7, 2010 at 3:05 PM, Alastair James  wrote:

> Cheers guys. All sounds good. One comment:
>
>
> > As for sorting: yes, that is a comment on the API as a whole. We have
> > opted at not providing sorting, since there are good sorting facilities
> > available in the JRE already. Since that makes it easy for the user to
> > implement their own sorting it would be sub optimal for Neo4j to provide
> > sorting. Since sorting is a costly operation (both in time and space) it
> > should be done as late in the process as possible, probably with a lot of
> > user code in between the traversal and the place where the sorting
> > actually takes place. This has been our thinking in the REST API as well,
> > meaning that sorting will be left to the client. It is possible that we
> > will return to this decision and add sorting to the REST API, and that it
> > might trickle down to the core API. Features like this are however much
> > easier to add than to remove, which is why it is not implemented at the
> > moment.
> >
>
> Well, thats the case when Neo is running in the same JVM as the user code,
> but when we start talking about returning 1000s of nodes in JSON over HTTP
> just to get the first 10 this is clearly sub-optimal (as I build websites
> this is a very common use case). So, as you say, sorting and limiting can
> wait, but I suspect the HTTP API would benefit from offering it. Limiting
> need not require changes to the core API, it could be implemented as a
> second stage in the HTTP API code prior to output encoding.
>

For paging / limiting: yes, you are absolutely right, this would not affect
the core API at all, only the REST API. Limiting/paging is something we
would probably add to the REST API before sorting.

Sorting might be a similar case, but I still think the client would be
better fitted to do sorting well. But once paging / limiting is added it
would be quite natural / useful to add sorting as well. What I want to avoid
is keeping state on the server while waiting for the client to request the
next page.

Cheers,
-- 
Tobias Ivarsson 
Hacker, Neo Technology
www.neotechnology.com
Cellphone: +46 706 534857

Re: [Neo] Traversers in the REST API

2010-04-07 Thread Alastair James

Cheers guys. All sounds good. One comment:


> As for sorting: yes, that is a comment on the API as a whole. We have opted
> at not providing sorting, since there are good sorting facilities available
> in the JRE already. Since that makes it easy for the user to implement
> their own sorting it would be sub optimal for Neo4j to provide sorting.
> Since sorting is a costly operation (both in time and space) it should be
> done as late in the process as possible, probably with a lot of user code
> in between the traversal and the place where the sorting actually takes
> place. This has been our thinking in the REST API as well, meaning that
> sorting will be left to the client. It is possible that we will return to
> this decision and add sorting to the REST API, and that it might trickle
> down to the core API. Features like this are however much easier to add
> than to remove, which is why it is not implemented at the moment.
>

Well, that's the case when Neo is running in the same JVM as the user code,
but when we start talking about returning 1000s of nodes in JSON over HTTP
just to get the first 10 this is clearly sub-optimal (as I build websites
this is a very common use case). So, as you say, sorting and limiting can
wait, but I suspect the HTTP API would benefit from offering it. Limiting
need not require changes to the core API, it could be implemented as a
second stage in the HTTP API code prior to output encoding.
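
Roughly what I mean by a second stage, as a sketch only: the traversal
result is trimmed just before it is encoded as JSON, and the core API is not
touched. The functions limitResults and encodePage and the response shape are
made up for illustration.

```javascript
// Second stage: cut the result down to the requested window.
function limitResults(nodes, offset, limit) {
  // Coerce missing or negative values to 0 so the slice is well defined.
  const start = Math.max(0, offset | 0);
  const count = Math.max(0, limit | 0);
  return nodes.slice(start, start + count);
}

// Output encoding happens after limiting, so only one page is serialised.
function encodePage(nodes, offset, limit) {
  const page = limitResults(nodes, offset, limit);
  return JSON.stringify({ total: nodes.length, offset: offset, items: page });
}

// 1000 matching nodes, but only the first 10 go over the wire.
const allNodes = Array.from({ length: 1000 }, (_, i) => ({ id: i }));
const body = encodePage(allNodes, 0, 10);
```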

Cheers

Al

Re: [Neo] Traversers in the REST API

2010-04-07 Thread Mattias Persson

2010/4/7 Alastair James 

> >
> > > These two ways of traversing a graph complement each other, it's not
> > > that one is "better" than the other. Would you agree on this?
> >
>
> I think I agree. I would hope to be able to use XPath/Gremlin style
> querying for most things, and a more programatic system for more complex
> ones.
>
Something like this will come along as well rather soon (I suspect)

>
> > a JSON document describing the traverser, like:
> >
> >   { "order": "depth first",
> > "uniquness": "node",
> > "return evaluator":
> >{ "language": "javascript",
> >  "body": "function shouldReturn( pos ) {...}" },
> > "prune evaluator":
> >{ "language": "javascript",
> >  "body": "function" },
> > "relationships": [
> >{ "direction": "outgoing",
> >  "type": "KNOWS" },
> >{ "type": "LOVES" }
> > ],
> > "max depth": 4 }
>
> Looks good for my needs. Using javascript in this form looks sensible.
>
> Any idea about the performance implications of using a javax.scripting
> language here? I guess not too severe.
>
As for JavaScript (the Rhino engine at the moment), the code snippet will be
compiled into Java code the first time and then simply re-run on subsequent
calls.
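
The compile-once idea can be sketched in plain JavaScript as a cache keyed by
the script body. This is only an analogy and says nothing about how Rhino or
the server actually do it; getEvaluator and the cache are hypothetical names.

```javascript
// Each distinct script body is compiled once; later calls reuse the function.
const compiled = new Map();
let compileCount = 0;

function getEvaluator(body) {
  let fn = compiled.get(body);
  if (!fn) {
    // Wrap the source in parentheses so it evaluates to a function value.
    fn = new Function("return (" + body + ");")();
    compiled.set(body, fn);
    compileCount += 1;
  }
  return fn;
}

const source = "function shouldReturn(pos) { return pos.depth > 0; }";
const first = getEvaluator(source);  // compiles
const second = getEvaluator(source); // served from the cache
```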

>
> Is there any need for a shared context between calls to the evaluators? So
> I could store custom information and access it again when traversing
> further nodes. So, you could passing a 'context' object (with its initial
> values) that gets passed in as a second parameter to each evaluator
> function. Then again, this is probably bad practice
>
We'll see what can be done about context and storing information in between
calls; there's definitely a need for that.

>
> Any idea how you will handle pagination? Obviously sorting is an issue as
> you are unlikely to want the nodes in the traversal order. In my mind it
> would be nice to allow the return evaluator to return a 'sorting value'
> that indicates that nodes rank in the result set. E.g. sorting on a score
> attribute of the node:
>
> function shouldReturn(pos)
> {
>if (!some_condition) return false;
>
>return pos.currentNode().score;
> }
>
> But I guess this is a comment on the Neo API as a whole?
>
> Cheers
>
> Al
> ___
> Neo mailing list
> User@lists.neo4j.org
> https://lists.neo4j.org/mailman/listinfo/user
>



-- 
Mattias Persson, [matt...@neotechnology.com]
Hacker, Neo Technology
www.neotechnology.com

Re: [Neo] Traversers in the REST API

2010-04-07 Thread Tobias Ivarsson

Our thinking on how to handle pagination is "not yet": we'll get something
that works first, and then add pagination (in a number of places) later on.

As for sorting: yes, that is a comment on the API as a whole. We have opted
not to provide sorting, since there are good sorting facilities available
in the JRE already. Since that makes it easy for users to implement their
own sorting, it would be suboptimal for Neo4j to provide sorting. Since
sorting is a costly operation (both in time and space) it should be done as
late in the process as possible, probably with a lot of user code in between
the traversal and the place where the sorting actually takes place. This has
been our thinking in the REST API as well, meaning that sorting will be left
to the client. It is possible that we will return to this decision and add
sorting to the REST API, and that it might trickle down to the core API.
Features like this are however much easier to add than to remove, which is
why it is not implemented at the moment.

Cheers,
Tobias

On Wed, Apr 7, 2010 at 9:47 AM, Alastair James  wrote:

> >
> > > These two ways of traversing a graph complement each other, it's not
> that
> > > one is "better" than the other. Would you agree on this?
> >
>
> I think I agree. I would hope to be able to use XPath/Gremlin style
> querying
> for most things, and a more programatic system for more complex ones.
>
> > a JSON document describing the traverser, like:
> >
> >   { "order": "depth first",
> > "uniquness": "node",
> > "return evaluator":
> >{ "language": "javascript",
> >  "body": "function shouldReturn( pos ) {...}" },
> > "prune evaluator":
> >{ "language": "javascript",
> >  "body": "function" },
> > "relationships": [
> >{ "direction": "outgoing",
> >  "type": "KNOWS" },
> >{ "type": "LOVES" }
> > ],
> > "max depth": 4 }
>
> Looks good for my needs. Using javascript in this form looks sensible.
>
> Any idea about the performance implications of using a javax.scripting
> language here? I guess not too severe.
>
> Is there any need for a shared context between calls to the evaluators? So
> I
> could store custom information and access it again when traversing further
> nodes. So, you could passing a 'context' object (with its initial values)
> that gets passed in as a second parameter to each evaluator function. Then
> again, this is probably bad practice
>
> Any idea how you will handle pagination? Obviously sorting is an issue as
> you are unlikely to want the nodes in the traversal order. In my mind it
> would be nice to allow the return evaluator to return a 'sorting value'
> that
> indicates that nodes rank in the result set. E.g. sorting on a score
> attribute of the node:
>
> function shouldReturn(pos)
> {
>if (!some_condition) return false;
>
>return pos.currentNode().score;
> }
>
> But I guess this is a comment on the Neo API as a whole?
>
> Cheers
>
> Al
> ___
> Neo mailing list
> User@lists.neo4j.org
> https://lists.neo4j.org/mailman/listinfo/user
>

-- 
Tobias Ivarsson 
Hacker, Neo Technology
www.neotechnology.com
Cellphone: +46 706 534857

Re: [Neo] Traversers in the REST API

2010-04-07 Thread Alastair James

>
> > These two ways of traversing a graph complement each other, it's not that
> > one is "better" than the other. Would you agree on this?
>

I think I agree. I would hope to be able to use XPath/Gremlin style querying
for most things, and a more programmatic system for more complex ones.

> a JSON document describing the traverser, like:
>
>   { "order": "depth first",
>     "uniqueness": "node",
>     "return evaluator":
>       { "language": "javascript",
>         "body": "function shouldReturn( pos ) {...}" },
>     "prune evaluator":
>       { "language": "javascript",
>         "body": "function" },
>     "relationships": [
>       { "direction": "outgoing",
>         "type": "KNOWS" },
>       { "type": "LOVES" }
>     ],
>     "max depth": 4 }

Looks good for my needs. Using javascript in this form looks sensible.
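
To check my understanding of the description format, here is a toy
interpreter over an in-memory edge list. It is a sketch under assumptions: the
graph shape is invented, a relationship without a "direction" is treated as
"follow both ways" (my guess), depth-first order and node uniqueness are
hard-coded, and the two evaluators are left out.

```javascript
// Toy graph: each edge is { from, to, type }.
const edges = [
  { from: "a", to: "b", type: "KNOWS" },
  { from: "b", to: "c", type: "KNOWS" },
  { from: "c", to: "a", type: "LOVES" },
  { from: "b", to: "d", type: "HATES" },
];

// Neighbours of `node` allowed by the "relationships" part of the description.
function neighbours(node, relationships) {
  const out = [];
  for (const e of edges) {
    for (const r of relationships) {
      if (e.type !== r.type) continue;
      const dir = r.direction || "both"; // assumption: no direction means both
      if (e.from === node && dir !== "incoming") out.push(e.to);
      if (e.to === node && dir !== "outgoing") out.push(e.from);
    }
  }
  return out;
}

// Depth-first traversal, visiting each node once, bounded by "max depth".
function traverse(start, description) {
  const visited = new Set([start]);
  const order = [];
  (function visit(node, depth) {
    order.push(node);
    if (depth >= description["max depth"]) return;
    for (const next of neighbours(node, description.relationships)) {
      if (visited.has(next)) continue;
      visited.add(next);
      visit(next, depth + 1);
    }
  })(start, 0);
  return order;
}

const description = {
  "relationships": [
    { "direction": "outgoing", "type": "KNOWS" },
    { "type": "LOVES" },
  ],
  "max depth": 4,
};
const visitedOrder = traverse("a", description);
```

Node "d" is never reached because HATES is not among the listed relationship
types.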

Any idea about the performance implications of using a javax.script
language here? I guess not too severe.

Is there any need for a shared context between calls to the evaluators, so
that I could store custom information and access it again when traversing
further nodes? You could pass a 'context' object (with its initial values)
as a second parameter to each evaluator function. Then
again, this is probably bad practice.
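
Something along these lines is what I have in mind, as a sketch only. The
traversal loop, the position objects and the context fields (seen, max) are
all invented for the example.

```javascript
// The loop hands the same mutable `context` object to every evaluator call.
function traverse(positions, evaluator, context) {
  const returned = [];
  for (const pos of positions) {
    if (evaluator(pos, context)) returned.push(pos);
  }
  return returned;
}

// Example evaluator: return at most `context.max` nodes, counting in context.
function shouldReturn(pos, context) {
  if (context.seen >= context.max) return false;
  context.seen += 1;
  return true;
}

const ctx = { seen: 0, max: 2 };
const result = traverse([{ id: 1 }, { id: 2 }, { id: 3 }], shouldReturn, ctx);
```

The downside is that evaluators stop being pure functions of the position,
which is probably why it feels like bad practice.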

Any idea how you will handle pagination? Obviously sorting is an issue as
you are unlikely to want the nodes in the traversal order. In my mind it
would be nice to allow the return evaluator to return a 'sorting value' that
indicates the node's rank in the result set. E.g. sorting on a score
attribute of the node:

function shouldReturn(pos)
{
    if (!some_condition) return false;

    return pos.currentNode().score;
}

But I guess this is a comment on the Neo API as a whole?
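
A sketch of how the caller could use such a 'sorting value', assuming the
evaluator returns false to skip a node and a number to rank it. The position
shape (pos.node with published and score fields) and collectSorted are
hypothetical.

```javascript
// Evaluator: false means "do not return", a number is the node's rank.
function shouldReturn(pos) {
  if (!pos.node.published) return false;
  return pos.node.score;
}

function collectSorted(positions, evaluator) {
  const hits = [];
  for (const pos of positions) {
    const rank = evaluator(pos);
    // Compare against false explicitly so a rank of 0 is still returned.
    if (rank !== false) hits.push({ node: pos.node, rank: rank });
  }
  hits.sort((a, b) => b.rank - a.rank); // highest rank first
  return hits.map((h) => h.node);
}

const positions = [
  { node: { name: "x", score: 2, published: true } },
  { node: { name: "y", score: 9, published: false } },
  { node: { name: "z", score: 0, published: true } },
  { node: { name: "w", score: 5, published: true } },
];
const ranked = collectSorted(positions, shouldReturn);
```

One wrinkle: a score of 0 is falsy, so the caller has to test for false
explicitly rather than for truthiness.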

Cheers

Al

Re: [Neo] Traversers in the REST API

2010-03-31 Thread Marko Rodriguez

Hi,

>> 
> As I see it traversers are fundamentally different from pipes/gremlin, where
> the pipes/gremlin style is very explicit in which steps to traverse. F.ex.
> you must specify that it must first traverse relationships of type X and
> then of type Y a.s.o. and have an exact knowledge of the graph you're
> traversing.

Yes--you can specify label-types in order, or you don't have to:

while null() != $_
   $_ := ./outE/inV[g:print(.)]
   end

or if you only want it to take LOVES or KNOWS relationships

while null() != $_
  $_ := ./outE[@label=g:list('LOVES','KNOWS')]/inV[g:print(.)]
  end

> The neo4j traversers don't have that (although they will have soon), but
> they instead can traverse any graph with arbitrary depth and all that...
> f.ex. if you are "standing" on a file in a tree on depth D (i.e. there are D
> parent folders above it) and you'd like to know the OWNER of that file. If
> the OWNER relationships sit on the top-level folders and you want to find
> them, you'd just tell the traverser to traverse incoming TREE_CHILD and
> outgoing OWNER relationships and return nodes/paths which end with an OWNER
> relationship and you have your results, however deep the file is in the
> tree.

Let me see here

while null() != $_
  $_ := ./inE[@label='TREE_CHILD']/outV/outE[@label='OWNER']/inV[g:print(.)]/../..
  end

Little gross looking... but I think that is what you are saying...

> These two ways of traversing a graph complement each other, it's not that
> one is "better" than the other. Would you agree on this?

I will agree to nothing without my lawyers! :{P  
(note: I use { as I have a mustache these days.)

There are many ways to skin a cat---climb a mountain---using a mailing list.
But in the end, there will only be 4 or so winners.

Take care,
Marko.

http://markorodriguez.com

Re: [Neo] Traversers in the REST API

2010-03-31 Thread Mattias Persson

2010/3/30 Marko Rodriguez 

> Hi guys,
>
> For what it's worth...
>
> I have yet to use the Neo4j traversal framework because it simply is not
> expressive enough. The traverser framework is like a single-relational
> traverser on a multi-relational graph. You only allow or disallow certain
> edge labels--not the ordered concatenation of labels. Moreover, even with
> ordered labels defined, the choices that a traverser makes at every element
> (edge and vertex) should be predicated on general purpose
> computing---predicated on the state/history of the walker, the properties of
> the elements, ... anything.
>
> >
> >    "relationships": [
> >      { "direction": "outgoing",
> >        "type": "KNOWS" },
> >      { "type": "LOVES" }
> >    ],
> >    "max depth": 4 }
>
> What if I want to find all the people that love my most popular (by
> eigenvector centrality) friends who also know me? Thus, simply taking
> "knows" and "loves" relationships arbitrarily doesn't tell me that. What
> comes into play in such situations are edge weights, ordered paths, loops,
> sampling, etc.
>
> A general purpose traverser framework requires that you be able to define
> adjacency arbitrarily. The traverser must be able to ask the engineer, at
> every step (edge and vertex [1]), what do you mean by "adjacent"? Where
> do I go next?
>
> [1] edges should be treated just as vertices---they have properties that
> can be reasoned on. many times you want edges returned, not just vertices.
>
As I see it, traversers are fundamentally different from pipes/gremlin, where
the pipes/gremlin style is very explicit about which steps to traverse. For
example, you must specify that it should first traverse relationships of type
X and then of type Y and so on, and you need exact knowledge of the graph
you're traversing.

The neo4j traversers don't have that (although they will soon), but they can
instead traverse any graph to arbitrary depth. For example, say you are
"standing" on a file in a tree at depth D (i.e. there are D parent folders
above it) and you'd like to know the OWNER of that file. If the OWNER
relationships sit on the top-level folders, you'd just tell the traverser to
traverse incoming TREE_CHILD and outgoing OWNER relationships and return the
nodes/paths which end with an OWNER relationship, and you have your results,
however deep the file is in the tree.
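
A toy version of the file/OWNER example, using an in-memory list of
relationships instead of Neo4j. The node names, the `rels` layout and the
`findOwners` function are invented for illustration; only the relationship
types TREE_CHILD and OWNER come from the description:

```javascript
// Toy in-memory graph; each relationship is { from, type, to }.
const rels = [
  { from: "root", type: "TREE_CHILD", to: "home" },
  { from: "home", type: "TREE_CHILD", to: "docs" },
  { from: "docs", type: "TREE_CHILD", to: "report.txt" },
  { from: "home", type: "OWNER", to: "alice" },
];

// Follow incoming TREE_CHILD and outgoing OWNER from `start`, to any depth,
// and return the end nodes of paths whose last relationship is an OWNER.
function findOwners(start) {
  const owners = [];
  const seen = new Set([start]);
  const stack = [start];
  while (stack.length > 0) {
    const node = stack.pop();
    for (const rel of rels) {
      if (rel.type === "OWNER" && rel.from === node) {
        owners.push(rel.to);                 // path ends with OWNER: return it
      } else if (rel.type === "TREE_CHILD" && rel.to === node && !seen.has(rel.from)) {
        seen.add(rel.from);
        stack.push(rel.from);                // step up to the parent folder
      }
    }
  }
  return owners;
}

console.log(findOwners("report.txt")); // finds alice, three folders up
```

The caller never states how deep the file is; the same two rules apply at
every level, which is the contrast with spelling out each step explicitly.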

These two ways of traversing a graph complement each other; it's not that
one is "better" than the other. Would you agree on this?

>
> Take care,
> Marko.
>
> http://markorodriguez.com
>



-- 
Mattias Persson, [matt...@neotechnology.com]
Hacker, Neo Technology
www.neotechnology.com

Re: [Neo] Traversers in the REST API

2010-03-30 Thread Michael Ludwig

Mattias Persson wrote on 30.03.2010 at 16:06:49 (+0200):

> a JSON document describing the traverser, like:
> 
>   { "order": "depth first",
>     "uniqueness": "node",
>     "return evaluator":
>       { "language": "javascript",
>         "body": "function shouldReturn( pos ) {...}" },
>     "prune evaluator":
>       { "language": "javascript",
>         "body": "function" },
>     "relationships": [
>       { "direction": "outgoing",
>         "type": "KNOWS" },
>       { "type": "LOVES" }
>     ],
>     "max depth": 4 }

> Looking at the "prune evaluator" and "return evaluator" it'd be nice
> to define them in some language, f.ex javascript, ruby or python or
> whatever. We're initially thinking of using javax.script.* stuff
> (ScriptEngine) for that, it'd probably be enough, at least to get
> things going.

XSLT, which builds on XPath, works by having the processor traverse the
tree and the user define templates featuring a match pattern. For every
node, the processor dispatches to the best matching template, from where
you can control further processing.

Now those match patterns are a subset of XPath, and rightly so: If the
user were given the full power of XPath, it would easily get horribly
expensive to determine the best matching template for a given node.

Likewise in a graph traversal, wouldn't it be reasonable to only allow
something with restricted expressive and imperative power, like the
match patterns in XSLT?
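
One way to picture such a restricted pattern language, as a sketch only: the
server accepts a small declarative object instead of a script body, and
matching is a handful of field comparisons. The key names (`type`,
`direction`, `maxDepth`) and both functions are hypothetical:

```javascript
// Hypothetical restricted pattern: only these keys are understood.
const ALLOWED_KEYS = ["type", "direction", "maxDepth"];

// Reject anything outside the restricted vocabulary (e.g. a script body).
function validatePattern(pattern) {
  for (const key of Object.keys(pattern)) {
    if (!ALLOWED_KEYS.includes(key)) {
      throw new Error("unsupported pattern key: " + key);
    }
  }
  return pattern;
}

// Matching is a few field comparisons, so its cost per step is bounded.
function matches(pattern, step) {
  if (pattern.type !== undefined && pattern.type !== step.type) return false;
  if (pattern.direction !== undefined && pattern.direction !== step.direction) return false;
  if (pattern.maxDepth !== undefined && step.depth > pattern.maxDepth) return false;
  return true;
}

const pattern = validatePattern({ type: "KNOWS", direction: "outgoing", maxDepth: 4 });
console.log(matches(pattern, { type: "KNOWS", direction: "outgoing", depth: 2 })); // true
console.log(matches(pattern, { type: "LOVES", direction: "outgoing", depth: 2 })); // false
```

The trade-off is the one raised above: the engine can reason about and bound
the cost of every pattern, at the price of not being able to express
arbitrary logic.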

-- 
Michael Ludwig

Re: [Neo] Traversers in the REST API

2010-03-30 Thread Tobias Ivarsson

On Tue, Mar 30, 2010 at 4:20 PM, Marko Rodriguez wrote:

> Hi guys,
>
> For what it's worth...
>
> I have yet to use the Neo4j traversal framework because it simply is not
> expressive enough. The traverser framework is like a single-relational
> traverser on a multi-relational graph. You only allow or disallow certain
> edge labels--not the ordered concatenation of labels. Moreover, even with
> ordered labels defined, the choices that a traverser makes at every element
> (edge and vertex) should be predicated on general purpose
> computing---predicated on the state/history of the walker, the properties of
> the elements, ... anything.
>

Good thing we are building this on the new traversal framework that we are
working on then ;)

Some of the features you mention as lacking in the current/previous traversal
framework are supported in the new framework, and others are on the roadmap.
Those features will be exposed through the REST API as well when they are
ready. This will include revising the way you declare which relationships to
traverse. What we would like to be able to say is:

  First expand relationships of type A, B or C, then of type T, U, V or W.
  Then, if the previous was T or U, the next should be X; if the previous was
  V or W, the next should be an arbitrary depth of Y relationships.

And of course you should be able to have different kinds of filters in each
step (on both nodes and relationships), not only selection based on
relationship type. This is not implemented yet, but as the new traversal API
evolves we plan to let the REST API follow.
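
The A/B/C, then T/U/V/W, then X or Y rule can be read as a small state
machine over the relationship types walked so far. This is only a sketch of
that reading: `allowedNext` and `isValidPath` are invented names, and treating
the X branch as "exactly one X, then stop" is an assumption:

```javascript
// Given the relationship types walked so far, return the types allowed next.
function allowedNext(path) {
  if (path.length === 0) return ["A", "B", "C"];
  if (path.length === 1) return ["T", "U", "V", "W"];
  const second = path[1];
  if (second === "T" || second === "U") {
    return path.length === 2 ? ["X"] : [];   // assumed: exactly one X, then stop
  }
  return ["Y"];                              // after V or W: any number of Y
}

// Check a whole path against the rule, one step at a time.
function isValidPath(types) {
  for (let i = 0; i < types.length; i++) {
    if (!allowedNext(types.slice(0, i)).includes(types[i])) return false;
  }
  return true;
}

console.log(isValidPath(["A", "T", "X"]));            // true
console.log(isValidPath(["B", "V", "Y", "Y", "Y"]));  // true
console.log(isValidPath(["A", "T", "Y"]));            // false: T must be followed by X
```

A real traverser would call something like `allowedNext` while expanding,
rather than validating finished paths, but the rule itself is the same.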

>
> >
> >    "relationships": [
> >      { "direction": "outgoing",
> >        "type": "KNOWS" },
> >      { "type": "LOVES" }
> >    ],
> >    "max depth": 4 }
>
> What if I want to find all the people that love my most popular (by
> eigenvector centrality) friends who also know me? Thus, simply taking
> "knows" and "loves" relationships arbitrarily doesn't tell me that. What
> comes into play in such situations are edge weights, ordered paths, loops,
> sampling, etc.
>
> A general purpose traverser framework requires that you be able to define
> adjacency arbitrarily. The traverser must be able to ask the engineer, at
> every step (edge and vertex [1]), what do you mean by "adjacent"? Where
> do I go next?
>

Exactly, depth first and breadth first are just two very basic
implementations of this. The new traversal framework will eventually have
support for letting the engineer provide the selector for where-to-go-next.
The first use case that comes to mind for me is for implementing best first
traversals, but I know that there are other things one might want to write,
and I am sure there are even more that I haven't thought of. When this is
implemented we will add some means for exposing it to the users of the REST
API as well, but for now our idea is to make the REST API useful with what
we have, and to get the new traversal framework in front of users.
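
To illustrate what a user-provided where-to-go-next selector could look like,
here is a sketch in which depth first, breadth first and best first differ
only in the function that picks the next frontier entry. The `traverse`
function, the `selectNext` signature and the toy graph with `score` values are
all hypothetical; this is not the new traversal framework's API:

```javascript
// Generic traversal: `selectNext` decides which frontier entry to expand next.
function traverse(graph, start, selectNext) {
  const order = [];
  const seen = new Set([start]);
  const frontier = [start];
  while (frontier.length > 0) {
    const index = selectNext(frontier);
    const node = frontier.splice(index, 1)[0];
    order.push(node);
    for (const next of graph[node].out) {
      if (!seen.has(next)) {
        seen.add(next);
        frontier.push(next);
      }
    }
  }
  return order;
}

const graph = {
  a: { score: 0, out: ["b", "c"] },
  b: { score: 1, out: ["d"] },
  c: { score: 5, out: ["e"] },
  d: { score: 9, out: [] },
  e: { score: 0, out: [] },
};

const breadthFirst = () => 0;                          // oldest entry first
const depthFirst = (frontier) => frontier.length - 1;  // newest entry first
const bestFirst = (frontier) => {                      // highest score first
  let best = 0;
  for (let i = 1; i < frontier.length; i++) {
    if (graph[frontier[i]].score > graph[frontier[best]].score) best = i;
  }
  return best;
};

console.log(traverse(graph, "a", breadthFirst)); // a, b, c, d, e
console.log(traverse(graph, "a", depthFirst));   // a, c, e, b, d
console.log(traverse(graph, "a", bestFirst));    // a, c, b, d, e
```

The linear scan in `bestFirst` keeps the sketch short; a priority queue would
be the usual choice for larger frontiers.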

>
> [1] edges should be treated just as vertices---they have properties that
> can be reasoned on. many times you want edges returned, not just vertices.
>
> Take care,
> Marko.
>
> http://markorodriguez.com
>

-- 
Tobias Ivarsson 
Hacker, Neo Technology
www.neotechnology.com
Cellphone: +46 706 534857

Re: [Neo] Traversers in the REST API

2010-03-30 Thread Marko Rodriguez

Hi guys,

For what it's worth...

I have yet to use the Neo4j traversal framework because it simply is not
expressive enough. The traverser framework is like a single-relational
traverser on a multi-relational graph. You only allow or disallow certain edge
labels--not the ordered concatenation of labels. Moreover, even with ordered
labels defined, the choices that a traverser makes at every element (edge and
vertex) should be predicated on general purpose computing---predicated on the
state/history of the walker, the properties of the elements, ... anything.

> 
>    "relationships": [
>      { "direction": "outgoing",
>        "type": "KNOWS" },
>      { "type": "LOVES" }
>    ],
>    "max depth": 4 }

What if I want to find all the people that love my most popular (by
eigenvector centrality) friends who also know me? Simply taking "knows" and
"loves" relationships arbitrarily doesn't tell me that. What comes into play
in such situations are edge weights, ordered paths, loops, sampling, etc.
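
To make that query concrete, here is one possible reading of it as three
ordered steps over a toy edge list. Everything here is invented for
illustration: the names, the `admirersOfTopFriends` function, and the
`centrality` table, which is a precomputed stand-in rather than an actual
eigenvector centrality computation:

```javascript
// Toy data; `centrality` stands in for a precomputed eigenvector centrality.
const centrality = { bob: 0.9, carol: 0.4, dave: 0.7 };
const edges = [
  { from: "me", type: "KNOWS", to: "bob" },
  { from: "bob", type: "KNOWS", to: "me" },
  { from: "me", type: "KNOWS", to: "carol" },
  { from: "carol", type: "KNOWS", to: "me" },
  { from: "me", type: "KNOWS", to: "dave" },     // dave does not know me back
  { from: "erin", type: "LOVES", to: "bob" },
  { from: "frank", type: "LOVES", to: "carol" },
  { from: "gina", type: "LOVES", to: "dave" },
];

const has = (from, type, to) =>
  edges.some((e) => e.from === from && e.type === type && e.to === to);

function admirersOfTopFriends(me, topN) {
  // Step 1: outgoing KNOWS, kept only if the friend also KNOWS me.
  const friends = edges
    .filter((e) => e.from === me && e.type === "KNOWS" && has(e.to, "KNOWS", me))
    .map((e) => e.to);
  // Step 2: rank those friends by centrality and keep the top N.
  const top = friends.sort((a, b) => centrality[b] - centrality[a]).slice(0, topN);
  // Step 3: incoming LOVES on the selected friends.
  return edges
    .filter((e) => e.type === "LOVES" && top.includes(e.to))
    .map((e) => e.from);
}

console.log(admirersOfTopFriends("me", 1)); // erin
```

Each step depends on the result of the one before it, and step 2 ranks on a
vertex property; a flat list of allowed relationship types plus a maximum
depth has no place to say either.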

A general purpose traverser framework requires that you be able to define
adjacency arbitrarily. The traverser must be able to ask the engineer, at every
step (edge and vertex [1]), what do you mean by "adjacent"? Where do I go
next?

[1] Edges should be treated just as vertices---they have properties that can be
reasoned on. Many times you want edges returned, not just vertices.

Take care,
Marko.

http://markorodriguez.com

