nt:unstructured has orderable child nodes, and we also preach using nt:unstructured wherever you can. Also, I assume that a lot of applications (like a CMS) benefit from node ordering. So I think that the majority of repositories have a lot of nodes with ordered child nodes. Thus I would not put too much effort into an efficient non-ordered format, but rather into making the ordered storage as fast as possible. Using lexicographical ordering for unordered child nodes would certainly be a good option, as it can reduce the cost of a lookup (but not of an insert). Also, I think that re-orderings happen relatively rarely.
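To illustrate the lookup vs. insert point, here is a minimal sketch, assuming child names are simply kept in a name-sorted in-memory list. The class `SortedChildNames` and its methods are hypothetical and not part of Jackrabbit or the microkernel prototype.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical sketch: child node names kept in lexicographical order. */
final class SortedChildNames {
    private final List<String> names = new ArrayList<>();

    /** Lookup is a binary search, so it needs O(log n) comparisons. */
    boolean contains(String name) {
        return Collections.binarySearch(names, name) >= 0;
    }

    /**
     * Insert finds the position quickly, but still has to shift all
     * later entries, so it stays O(n) in the worst case.
     * Same-name siblings are ignored in this sketch.
     */
    boolean add(String name) {
        int pos = Collections.binarySearch(names, name);
        if (pos >= 0) {
            return false; // name already present
        }
        names.add(-pos - 1, name); // -pos - 1 is the insertion point
        return true;
    }

    /** Read-only view of the names in lexicographical order. */
    List<String> list() {
        return Collections.unmodifiableList(names);
    }
}
```

A real store would more likely use a tree-shaped structure so that inserts are cheap as well; the flat list is only meant to show where the sorted order helps and where it does not.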
regards, toby

On Fri, Jan 20, 2012 at 9:13 AM, Stefan Guggisberg <[email protected]> wrote:
> On Fri, Jan 20, 2012 at 5:43 PM, Jukka Zitting <[email protected]> wrote:
>> Hi,
>>
>> Thinking about this more generally, it would definitely be useful to
>> be able to use a more efficient underlying storage for unorderable
>> nodes. However, there still are hard use cases that require us to
>> support orderable nodes as indicated by the type of the parent node.
>> Thus the implementation basically has two options:
>>
>> 1) Use a single data structure for all nodes, like Jackrabbit
>> currently does. This simplifies the implementation but prevents us
>> from enjoying the performance and scalability benefits of unorderable
>> nodes.
>>
>> 2) Use two data structures, one for orderable and another for
>> unorderable nodes. This is more complex (not least because it links
>> node types to the underlying storage model), but is probably required
>> if we want to efficiently support up to millions of child nodes per a
>> single parent.
>
> i agree that it's probably required for efficiently handling both large
> and small child node lists.
>
> i am not sure that it necessarily means linking node types to the
> underlying storage model. the microkernel prototype currently has
> no notion of node types and that's IMO a good thing.
>
> node types have a way too dominant role in the current jackrabbit
> core implementation. jackrabbit started out as the official reference
> implementation for jsr-170, the focus was on supporting every
> feature of the spec.
>
> in jr3 i guess we can and should trade some 'node type correctness'
> for improved efficiency.
>
>>
>> It might also be possible to have the underlying storage model
>> unorderable, and just include extra ordering metadata at a higher
>> layer where also node types are handled. Option 2b, if you like, with
>> probably fairly significant overhead when iterating over nodes (the
>> implementation would probably need to pre-fetch all child nodes in
>> advance to access their ordering metadata).
>>
>> If we do have native support for unorderable nodes, then I wouldn't
>> promise any particular ordering (alphabetic, insertion order, etc.)
>> but rather leave it undefined like in Java's Map interface. The
>> underlying implementation is then free to use whatever ordering it
>> thinks is best.
>
> i agree.
>
> cheers
> stefan
>
>>
>> PS. Note that ordering affects not just node traversal, but also the
>> document ordering used in search results. Though in practice we
>> already now disable document ordering of search results by default.
>>
>> BR,
>>
>> Jukka Zitting
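
For reference, the "option 2b" idea quoted above (an unorderable store plus ordering metadata kept at a higher layer) could look roughly like the sketch below. Everything in it is an assumption made for illustration: the class `OrderedChildView`, the use of a `HashMap` as the unordered store, and a simple counter as the ordering metadata.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: ordering metadata layered over an unordered store. */
final class OrderedChildView {
    // Stand-in for the unordered storage: child name -> ordering key.
    // HashMap promises no iteration order, like the storage described above.
    private final Map<String, Long> orderKeys = new HashMap<>();
    private long nextKey = 0;

    /** Adds a child at the end; re-adding an existing name is a no-op. */
    void addChild(String name) {
        if (!orderKeys.containsKey(name)) {
            orderKeys.put(name, nextKey++);
        }
    }

    /**
     * Ordered iteration has to fetch every child first and then sort by
     * the ordering metadata, which is the overhead mentioned above.
     */
    List<String> childrenInOrder() {
        List<String> names = new ArrayList<>(orderKeys.keySet());
        names.sort(Comparator.comparingLong((String n) -> orderKeys.get(n)));
        return names;
    }
}
```

The cost shows up in `childrenInOrder()`: even a caller that only wants the first few children pays for loading and sorting all of them.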
