Re: [jr3] Orderable child nodes: required (to be the default)?

Tobias Bocanegra Fri, 20 Jan 2012 18:01:41 -0800

nt:unstructured has orderable childnodes, and we also preach to use
nt:unstructured whereever you can. Also, i assume that a lot of
applications benefit from node ordering (like a CMS). So I think that
the majority of the repositories have a lot of nodes with ordered
childnodes. thus i would not put too much effort in having an
efficient non-ordered format, but making the ordered storage as fast
as possible. using lexicographical ordering for unordered child nodes
would certainly be a good option, as i can reduce the lookup (but not
insert). also, i think that re-orderings happen relatively rarely.


regards, toby

On Fri, Jan 20, 2012 at 9:13 AM, Stefan Guggisberg
<[email protected]> wrote:
> On Fri, Jan 20, 2012 at 5:43 PM, Jukka Zitting <[email protected]> 
> wrote:
>> Hi,
>>
>> Thinking about this more generally, it would definitely be useful to
>> be able to use a more efficient underlying storage for unorderable
>> nodes. However, there still are hard use cases that require us to
>> support orderable nodes as indicated by the type of the parent node.
>> Thus the implementation basically has two options:
>>
>> 1) Use a single data structure for all nodes, like Jackrabbit
>> currently does. This simplifies the implementation but prevents us
>> from enjoying the performance and scalability benefits of unorderable
>> nodes.
>>
>> 2) Use two data structures, one for orderable and another for
>> unorderable nodes. This is more complex (not least because it links
>> node types to the underlying storage model), but is probably required
>> if we want to efficiently support up to millions of child nodes per a
>> single parent.
>
> i agree that it's probably required for efficiently handling both large
> and small child node lists.
>
> i am not sure that it necessarily means linking node types to the
> underlying storage modal. the microkernel prototype currently has
> no notion of node types and that's IMO a good thing.
>
> node types have a way to dominant role in the current jackrabbit
> core implementation. jackrabbit started out as the official reference
> implementation for jsr-170, the focus was on supporting every
> feature of the spec.
>
> in jr3 i guess we can and should trade some 'note type correctness'
> for improved efficiency.
>
>>
>> It might also be possible to have the underlying storage model
>> unorderable, and just include extra ordering metadata at a higher
>> layer where also node types are handled. Option 2b, if you like, with
>> probably fairly significant overhead when iterating over nodes (the
>> implementation would probably need to pre-fetch all child nodes in
>> advance to access their ordering metadata).
>>
>> If we do have native support for unorderable nodes, then I wouldn't
>> promise any particular ordering (alphabetic, insertion order, etc.)
>> but rather leave it undefined like in Java's Map interface. The
>> underlying implementation is then free to use whatever ordering it
>> thinks is best.
>
> i agree.
>
> cheers
> stefan
>
>>
>> PS. Note that ordering affects not just node traversal, but also the
>> document ordering used in search results. Though in practice we
>> already now disable document ordering of search results by default.
>>
>> BR,
>>
>> Jukka Zitting

Re: [jr3] Orderable child nodes: required (to be the default)?

Reply via email to