I think Jim makes a great point about the difference between paging and
streaming: one is controlled by the client, the other by the server. There
is a related point to be made, which is that paging does not, and cannot,
guarantee a consistent total result set. Since the database can change
between page requests, the pages can be inconsistent with each other. It is
possible for the same record to appear in two pages, or for a record to be
missed entirely. This is certainly how relational databases behave in this
regard.
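
To make the duplicate and missed-record cases concrete, here is a minimal
sketch in Python. The record list and the fetch_page helper are made up for
illustration (they stand in for offset/limit style paging and are not taken
from Neo4j or any real database API).

```python
# Hypothetical helper: mimics offset/limit paging over a sorted result set.
def fetch_page(records, page, size):
    """Return one page of the records as they exist at call time."""
    start = page * size
    return sorted(records)[start:start + size]

# Case 1: an insert between page requests duplicates a record.
records = [10, 20, 30, 40, 50, 60]
page0 = fetch_page(records, 0, 3)
records.append(5)                      # another client writes between requests
page1 = fetch_page(records, 1, 3)
duplicated = set(page0) & set(page1)   # records seen on both pages

# Case 2: a delete between page requests causes a record to be skipped.
records2 = [10, 20, 30, 40, 50, 60]
first = fetch_page(records2, 0, 3)
records2.remove(10)                    # another client deletes between requests
second = fetch_page(records2, 1, 3)
missed = set(records2) - set(first) - set(second)  # still stored, never seen

print(duplicated, missed)
```

In both cases each page is correct on its own at the moment it is fetched;
it is only the combination of pages that is inconsistent.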

In the streaming case, by contrast, we expect a complete and consistent
result set, unless, of course, the client cuts off the stream. The use cases
are very different: paging is about getting a peek at the data, and rarely
about paging all the way to the end, whereas streaming is about getting the
entire result, just streamed for efficiency.
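
As a rough sketch of that expectation, again in Python: the stream_results
generator below is hypothetical, and it assumes the server can pin a
consistent view of the result when the stream starts (here simply by copying
the list). It is not a description of how the Neo4j server does or would do
this.

```python
from itertools import islice

def stream_results(records, chunk_size):
    """Yield the whole result in chunks, from a snapshot taken at the start."""
    snapshot = sorted(records)  # assumption: a consistent view can be pinned
    for start in range(0, len(snapshot), chunk_size):
        yield snapshot[start:start + chunk_size]

records = [10, 20, 30, 40, 50, 60]
stream = stream_results(records, 2)

received = list(next(stream))   # first chunk arrives; the snapshot now exists
records.append(5)               # concurrent writes while the stream is open
records.remove(40)
for chunk in stream:            # the rest of the stream is unaffected
    received.extend(chunk)

# A client that loses interest just stops reading after the first chunk.
partial = list(islice(stream_results(records, 2), 1))

print(received, partial)
```

The full read sees exactly the records that existed when the stream began,
and the only way to get less than that is for the client to stop early.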

On Thu, Apr 21, 2011 at 5:00 PM, Jim Webber <j...@neotechnology.com> wrote:

> This is indeed a good dialogue. The pagination versus streaming was
> something I'd previously had in my mind as orthogonal issues, but I like the
> direction this is going. Let's break it down to fundamentals:
>
> As a remote client, I want to be just as rich and performant as a local
> client. Unfortunately, Deutsch, Amdahl and Einstein are against me on that,
> and I don't think I am tough enough to defeat those guys.
>
> So what are my choices? I know I have to be more "granular" to try to
> alleviate some of the network penalty, so doing operations in bulk sounds
> great.
>
> Now what I need to decide is whether I control the rate at which those bulk
> operations occur or whether the server does. If I want to control those
> operations, then paging seems sensible. Otherwise a streamed (chunked)
> encoding scheme would make sense if I'm happy for the server to throw
> results back at me at its own pace. Or indeed you can mix both so that pages
> are streamed.
>
> In either case if I get bored of those results, I'll stop paging or I'll
> terminate the connection.
>
> So what does this mean for implementation on the server? I guess this is
> important since it affects the likelihood of the Neo Tech team implementing
> it.
>
> If the server supports pagination, it means we need a paging controller in
> memory per paginated result set being created. If we assume that we'll only
> go forward in pages, that's effectively just a wrapper around the traversal
> that's been uploaded. The overhead should be modest, and apart from the
> paging controller and the traverser, it doesn't need much state. We would
> need to add some logic to the representation code to support "next" links,
> but that seems a modest task.
>
> If the server streams, we will need to decouple the representation
> generation from the existing representation logic since that builds an
> in-memory representation which is then flushed. Instead we'll need a
> streaming representation implementation which seems to be a reasonable
> amount of engineering. We'll also need a new streaming binding to the REST
> server in JAX-RS land.
>
> I'm still a bit concerned about how "rude" it is for a client to just drop
> a streaming connection. I've asked Mark Nottingham for his authoritative
> opinion on that. But still, this does seem popular and feasible.
>
> Jim
>
> _______________________________________________
> Neo4j mailing list
> User@lists.neo4j.org
> https://lists.neo4j.org/mailman/listinfo/user
>