[ https://issues.apache.org/jira/browse/OAK-6081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Chetan Mehrotra updated OAK-6081:
---------------------------------
    Description:
To enable better management of indexing-related operations, especially reindexing on large repository setups, we should implement some tooling as part of oak-run.

The tool would support:
# For a DocumentNodeStore setup, it would be possible to connect oak-run to a live cluster, and it would take care of indexing -> storing the index on disk -> merging the index -> importing it back at the end. This would ensure that the live setup faces minimal disruption and is not heavily loaded.
# For a SegmentNodeStore setup, it would be possible to index on a cloned setup and then provide a way to copy the index back.

Future Enhancements
# *Resumable traversal* - It should be possible to reindex a large repository with resumable traversal, so that even if indexing breaks due to some issue it can resume from the last state (OAK-5833).
# *Multithreaded traversal* - Current indexing is single-threaded and hence can take a long time on a large repository. The plan here is to support multithreaded indexing, where each thread is assigned a part of the repository tree to index and the indexes are merged at the end.

  was:
To enable better management of indexing-related operations, especially reindexing on large repository setups, we should implement some tooling as part of oak-run.

The tool would support:
# *Resumable traversal* - It should be possible to reindex a large repository with resumable traversal, so that even if indexing breaks due to some issue it can resume from the last state (OAK-5833).
# *Multithreaded traversal* - Current indexing is single-threaded and hence can take a long time on a large repository. The plan here is to support multithreaded indexing, where each thread is assigned a part of the repository tree to index and the indexes are merged at the end.
# For a DocumentNodeStore setup, it would be possible to connect oak-run to a live cluster, and it would take care of indexing -> storing the index on disk -> merging the index -> importing it back at the end. This would ensure that the live setup faces minimal disruption and is not heavily loaded.
# For a SegmentNodeStore setup, it would be possible to index on a cloned setup and then provide a way to copy the index back.

> Indexing tooling via oak-run
> ----------------------------
>
>                 Key: OAK-6081
>                 URL: https://issues.apache.org/jira/browse/OAK-6081
>             Project: Jackrabbit Oak
>          Issue Type: New Feature
>          Components: indexing, run
>            Reporter: Chetan Mehrotra
>            Assignee: Chetan Mehrotra
>             Fix For: 1.8, 1.7.4
>
>
> To enable better management of indexing-related operations, especially
> reindexing on large repository setups, we should implement some tooling as
> part of oak-run.
> The tool would support:
> # For a DocumentNodeStore setup, it would be possible to connect oak-run to a
> live cluster, and it would take care of indexing -> storing the index on disk
> -> merging the index -> importing it back at the end. This would ensure that
> the live setup faces minimal disruption and is not heavily loaded.
> # For a SegmentNodeStore setup, it would be possible to index on a cloned
> setup and then provide a way to copy the index back.
> Future Enhancements
> # *Resumable traversal* - It should be possible to reindex a large repository
> with resumable traversal, so that even if indexing breaks due to some issue
> it can resume from the last state (OAK-5833).
> # *Multithreaded traversal* - Current indexing is single-threaded and hence
> can take a long time on a large repository. The plan here is to support
> multithreaded indexing, where each thread is assigned a part of the
> repository tree to index and the indexes are merged at the end.
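
The multithreaded traversal idea described in the issue (one subtree per thread, partial indexes merged at the end) can be illustrated with a small, self-contained sketch. This is not Oak code and does not use any Oak or oak-run API: the nested-dict repository, the `type` property, and the names `index_subtree`, `merge_indexes`, and `reindex` are all hypothetical, chosen only to show the partition-then-merge shape under those assumptions.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

# Hypothetical toy repository: each node has optional "props" and "children".
REPO = {
    "props": {"type": "root"},
    "children": {
        "content": {
            "props": {"type": "folder"},
            "children": {
                "a": {"props": {"type": "page"}},
                "b": {"props": {"type": "page"}},
            },
        },
        "apps": {
            "props": {"type": "folder"},
            "children": {"c": {"props": {"type": "page"}}},
        },
    },
}


def index_subtree(path, node):
    """Traverse one subtree and return a partial index: type value -> set of paths."""
    partial = defaultdict(set)
    stack = [(path, node)]
    while stack:
        current_path, current = stack.pop()
        value = current.get("props", {}).get("type")
        if value is not None:
            partial[value].add(current_path)
        for name, child in current.get("children", {}).items():
            stack.append((current_path.rstrip("/") + "/" + name, child))
    return partial


def merge_indexes(partials):
    """Merge partial indexes into one index with sorted path lists."""
    merged = defaultdict(set)
    for partial in partials:
        for value, paths in partial.items():
            merged[value] |= paths
    return {value: sorted(paths) for value, paths in merged.items()}


def reindex(root, workers=4):
    """Index the root node itself, then hand each top-level subtree to a worker thread."""
    partials = [index_subtree("/", {"props": root.get("props", {})})]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [
            pool.submit(index_subtree, "/" + name, child)
            for name, child in root.get("children", {}).items()
        ]
        partials.extend(future.result() for future in futures)
    return merge_indexes(partials)


if __name__ == "__main__":
    print(reindex(REPO))
```

Partitioning by top-level child is the simplest possible split; a real repository would likely need a more balanced partitioning, since subtrees can differ greatly in size.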