Re: [DISCUSSION][API][PROPOSAL] Adding Stream generator to Sling API

Jason E Bailey Mon, 04 Jun 2018 10:22:06 -0700

Thanks Robert!

- Jason

On Mon, Jun 4, 2018, at 11:45 AM, Robert Munteanu wrote:

> There are two questions I want to raise:
> 
> 1. API related
> 
> We have two methods:
> 
> - stream()
> - stream(Predicate<Resource>)
> 
> which traverse the whole tree.

the base stream() would traverse the whole tree, the branch selector parameter 
(Predicate<Resource>) is used to limit the traversal. The most common use case 
I give for this if you have a site with 10,000 pages on it. The number of nodes 
or resources in that structure could come close to 1 million nodes. If, 
however, you can find if a page matches your needs from the page node itself . 
You don't have to descend into the jcr:content sub tree.  So instead of 
traversing a million nodes, you just traverse 10,000

As Dan mentioned, you could also just have that branch selector itself limit 
the depth of the search. 

> There were questions of whether this is OK or if we maybe need to
> stream the children only. Could the API additions allow that?
> 
> We can start with something like
> 
> - stream(Predicate<Resource>, Set<TraversalOption>... )
> - stream(Predicate<Resource>)
> - stream(Set<TraversalOption>...)

If the traversal option defines only the children of the existing resource, 
then there is no need to define a branch selector.  It's exclusionary. Which, 
makes the combinations kind of weird,  It would make more sense to just name 
the method for getting the children as a stream something unique.

> 2. Implementation related
> 
> I think that the 'hasNext' method is not lazy, eagerly loading the
> direct children of a resource when there are no more resources to load.
> 
> If the resource has a large number of children we can end up taking
> lots of memory, so I think we need to be more lazy with it, since the
> streams framework can take advantage of that (IIRC). Is it possible to
> ajust the implementation this way?

Maybe. You would still need to have the hasNext() method correctly identify 
whether there is a next or not, so you would need to have a loop iterate 
through and checking.  I would have to somehow find my way back to the correct 
location the next time the method is called. My biggest concern there is with 
the unordered children. Do I need to maintain state of the underling iteration 
that I'm checking?

I'd ask that we go ahead and get it in as is, and I could pursue being more 
lazy on that call. I don't know how long that would take me and I'm still 
waiting for the gods of the internet to provide me a job where I can work on 
Sling as my job full time.

> Thanks,
> 
> Robert

Re: [DISCUSSION][API][PROPOSAL] Adding Stream generator to Sling API

Reply via email to