Hi,
>So we can implement a "paginated tree traversal"
Yes, I think that's a good first step, something for oak-core that can be
re-used in multiple places. It might also make sense to create a JCR
version, for other use cases.
Regards,
Thomas
Hi Thomas,
On Fri, Feb 24, 2017 at 1:09 PM, Thomas Mueller wrote:
> 9) Sorting of paths is needed, so that the repository can be processed
> bit by bit. For that, the following logic is used, recursively: read at
> most 1000 child nodes. If there are more than 1000, then
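The quoted message is cut off, but the paginated traversal it describes can be sketched roughly as follows. This is a minimal illustration, not the Oak API: `Node`, `page`, and the in-memory tree are assumptions. The key property is that children are kept in sorted order, so the traversal order is stable and a page can resume strictly after the last path seen.

```java
import java.util.*;

// Sketch of a "paginated tree traversal": children are stored sorted,
// so depth-first order is stable and resumable from the last seen path.
public class PaginatedTraversal {

    static class Node {
        final String path;
        final TreeMap<String, Node> children = new TreeMap<>();
        Node(String path) { this.path = path; }
        Node add(String name) {
            Node c = new Node(path.equals("/") ? "/" + name : path + "/" + name);
            children.put(name, c);
            return c;
        }
    }

    // Return up to 'limit' paths in depth-first sorted order, starting
    // strictly after 'resumeAfter' (null means start at the root).
    static List<String> page(Node root, String resumeAfter, int limit) {
        List<String> out = new ArrayList<>();
        collect(root, new boolean[]{resumeAfter == null}, resumeAfter, limit, out);
        return out;
    }

    private static void collect(Node n, boolean[] started, String resumeAfter,
                                int limit, List<String> out) {
        if (out.size() >= limit) return;
        if (started[0]) {
            out.add(n.path);
        } else if (n.path.equals(resumeAfter)) {
            started[0] = true; // resume point found; emit from the next node on
        }
        for (Node c : n.children.values()) {
            collect(c, started, resumeAfter, limit, out);
            if (out.size() >= limit) return;
        }
    }
}
```

Calling `page` repeatedly with the last returned path would walk the whole tree in chunks (e.g. 1000 nodes at a time, as in the quoted logic).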
Hi,
My suggestion is to _not_ support "resumable" operations on a large tree,
but instead to avoid large operations altogether. I wouldn't call my
solution "sharding", though, but rather "bit-by-bit reindexing". Some more
details: for indexing (especially synchronous property indexes) I suggest to do the
Hi,
A quick side-question related to what Stefan mentioned earlier:
> A stable traversal order at a given revision + node seems like a
> prerequisite to me.
Javadoc of NodeState#getChildNodeEntries says:
" Multiple iterations are guaranteed to return the child nodes in
the same order, but the
Hi,
For re-indexing, there are actually two problems:
* Indexing can take multiple days, so being able to resume would be nice
* For synchronous indexes, indexing creates a large commit, which is
problematic (especially for MongoDB)
To solve both problems ("kill two birds with one stone"), we could instead try
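The message is truncated here, but one way to address the large-commit problem would be to split the index update into many small commits. A minimal sketch under that assumption; `commit` is a hypothetical callback, not the Oak commit API:

```java
import java.util.*;
import java.util.function.Consumer;

// Sketch: apply index updates in many small commits instead of one large
// one, so no single commit grows too big (the MongoDB concern above).
public class BatchedCommits {
    static int commitInBatches(List<String> updates, int batchSize,
                               Consumer<List<String>> commit) {
        int commits = 0;
        for (int i = 0; i < updates.size(); i += batchSize) {
            // Commit one bounded slice of the pending updates.
            commit.accept(updates.subList(i, Math.min(i + batchSize, updates.size())));
            commits++;
        }
        return commits;
    }
}
```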
On 20/02/17 15:27, Alex Parvulescu wrote:
What about walking the revision history in smaller chunks?
Given a repository history of revisions r0, r1, ..., r100, the indexer
would now diff [r0, r100], which can be resource intensive. What if it diffs
by a window of size 10: [r0, r9] (mark), [r10,
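The windowed-diff idea can be sketched as follows. `Revision` is just a string label here; making each window's end revision the next window's start is my assumption, so that the windows compose to the full diff [r0, r100]:

```java
import java.util.*;

// Sketch: split the diff over a long revision history into fixed-size
// windows. Each window [start, end] shares its end with the next window's
// start, so applying the window diffs in order equals the full diff.
public class WindowedDiff {
    static List<String[]> windows(List<String> revs, int size) {
        List<String[]> out = new ArrayList<>();
        for (int i = 0; i + 1 < revs.size(); i += size) {
            int end = Math.min(i + size, revs.size() - 1);
            out.add(new String[]{revs.get(i), revs.get(end)});
        }
        return out;
    }
}
```

After each window the indexer could mark its progress, so only the current window has to be redone after a crash.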
On 20/02/2017 13:44, Marcel Reutegger wrote:
>
> Instead of the revision, the implementation can also rely on a
> checkpoint that marks the snapshot of the repository as the basis of
> the large-tree-operation.
I was thinking the same. We may rely on checkpoints and then store
additional info the
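The message is cut off, but the idea as I read it could be sketched like this: pin a repository snapshot with a checkpoint and persist progress info next to it, so the operation can resume after a restart. `CheckpointStore` here is an in-memory stand-in, not the Oak NodeStore checkpoint API:

```java
import java.util.*;

// Sketch: a resumable large-tree operation keyed by a checkpoint id.
// The checkpoint pins the snapshot; the last processed path is stored
// alongside it so a restart continues where the previous run stopped.
public class ResumableOp {

    static class CheckpointStore {
        final Map<String, String> info = new HashMap<>();
        int nextId = 0;
        String checkpoint() { return "cp-" + (nextId++); } // pin a snapshot
        void setInfo(String cp, String lastPath) { info.put(cp, lastPath); }
        String getInfo(String cp) { return info.get(cp); }
        void release(String cp) { info.remove(cp); }
    }

    // Process 'paths' in order; every 'batch' items, record progress so a
    // restart can resume from getInfo(cp).
    static List<String> run(CheckpointStore store, String cp,
                            List<String> paths, int batch) {
        List<String> processed = new ArrayList<>();
        String last = store.getInfo(cp);
        int start = last == null ? 0 : paths.indexOf(last) + 1;
        for (int i = start; i < paths.size(); i++) {
            processed.add(paths.get(i));
            if (processed.size() % batch == 0) store.setInfo(cp, paths.get(i));
        }
        if (!paths.isEmpty()) store.setInfo(cp, paths.get(paths.size() - 1));
        return processed;
    }
}
```

Once the operation finishes, releasing the checkpoint would discard both the pinned snapshot and the progress info.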