Hello all,

while discussing backup strategies for our Jackrabbit repository, the following "feature" came to mind: what about having the cluster journal replicate the persistent storage?

Our goal is to keep a copy of our repository in a second database. Ideally it would lag behind a little, so that it reflects the data from, e.g., an hour ago. The first step would be to instruct a specialized cluster node to share the same journal but use a different persistent storage. The second step would be to tell the cluster sync thread to sync revisions from a given offset up to the current global revision.

I read the source code and found the method "doExternal" in the SharedItemStateManager, which is responsible for applying changes from the journal. Inside I found the line "state.copy(currentState, true)", which applies the full state from the journal to the actual state. Is this correct?

The clustering documentation says that every cluster node must have access to the same persistent storage, because the property values are not included in the journal. How does this relate to "state.copy"? The backup repository would of course share the same datastore as the main repository, so the journal would only need to carry the values from the persistent storage. Is this really as big a performance issue as the docs say?

What else would have to be done, besides adding functionality for item creation in "doExternal", to implement this cluster-for-replication feature? Are there any obstacles we don't see?

I think such a feature would be a big leap toward high availability for a JCR repository, because it avoids the time-consuming index (re)creation when restoring a backup. Just keep a second, replicated repository and you're done! With the delay function for replication we could also avoid data-corruption issues...

regards
Markus
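
P.S. To make the "sync from a given offset, but with a delay" idea a bit more concrete, here is a rough, self-contained Java sketch. It does not use any Jackrabbit API: the names JournalRecord, DelayedReplay and sync are hypothetical and only illustrate the selection logic, assuming each journal record carries a revision number and the time it was written, and that the journal is ordered by revision.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical journal entry (not a Jackrabbit class): a revision plus its write time. */
final class JournalRecord {
    final long revision;
    final long writtenAtMillis;

    JournalRecord(long revision, long writtenAtMillis) {
        this.revision = revision;
        this.writtenAtMillis = writtenAtMillis;
    }
}

/** Hypothetical delayed consumer for the backup node; an illustration only. */
final class DelayedReplay {
    private final long delayMillis;
    private long localRevision; // last revision already applied to the backup storage

    DelayedReplay(long delayMillis, long startRevision) {
        this.delayMillis = delayMillis;
        this.localRevision = startRevision;
    }

    /**
     * Selects the records that may be applied now: those newer than the local
     * revision and at least delayMillis old. Assumes the journal is ordered by revision.
     */
    List<JournalRecord> sync(List<JournalRecord> journal, long nowMillis) {
        List<JournalRecord> applied = new ArrayList<>();
        for (JournalRecord record : journal) {
            if (record.revision <= localRevision) {
                continue; // already applied in an earlier sync
            }
            if (nowMillis - record.writtenAtMillis < delayMillis) {
                break; // too recent; stop here so revisions are never applied out of order
            }
            applied.add(record); // a real node would write the change to its own storage here
            localRevision = record.revision;
        }
        return applied;
    }

    long localRevision() {
        return localRevision;
    }
}
```

With a delay of one hour, the backup node would call sync periodically and stay roughly an hour behind the global revision; stopping the backup node's sync after a corruption is noticed would leave that hour as a window to react.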
