[ https://issues.apache.org/jira/browse/OAK-2619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Julian Sedding updated OAK-2619:
--------------------------------
    Attachment: OAK-2619-patch

Patch that provides merge support. To do so efficiently, the copy algorithm is changed to merge leaf nodes first and then work its way up to the root. This makes it possible to determine, in a single traversal, whether a node has changes.

I ran some tests locally with the following results:

*Source Repository*
CRX2 repository with TarPM + FileDataStore, containing ~5 million nodes of production content.

*Target Repository*
Oak with TarMK + FileDataStore

The upgrade was run with 1GB heap space, merge enabled and binaries copied by reference. I also restricted the paths being copied, using the feature from OAK-2573, to only ~500k nodes, of which ~70% are binaries. No versions were copied. The merge runs had no content changes in the source repository.

{noformat}
Results          Run 1       Run 2 (merge)
Without patch    1.008 min   1.037 min
With patch       1.146 min   40.50 s
{noformat}

Also, in separate test runs, I logged the diff seen by the commit hooks. Without the patch, the merge shows many changes in the copied content; with the patch applied, it shows none.

> Support merging content during upgrade
> --------------------------------------
>
>                 Key: OAK-2619
>                 URL: https://issues.apache.org/jira/browse/OAK-2619
>             Project: Jackrabbit Oak
>          Issue Type: New Feature
>          Components: upgrade
>    Affects Versions: 1.1.7
>            Reporter: Julian Sedding
>            Priority: Minor
>         Attachments: OAK-2619-patch
>
>
> When upgrading from Jackrabbit 2 to Oak there are several scenarios that
> could benefit from the ability to merge content rather than overwrite it.
> Especially in combination with OAK-2586, i.e. support to include/exclude
> selected paths from the copy operation, merging can become very useful.
> # Start vanilla product with an empty repo that writes some initial content,
> then copy content from a Jackrabbit 2 repo into this instance
> # Unify content from several Jackrabbit 2 repositories into a single Oak repo
> # Copy all content 1 week before the actual migration, then merge in the diff
> on migration day



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
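
For illustration only: the comment above describes merging leaf nodes first and working up to the root, so that a single traversal can tell whether a node has changes. The sketch below shows that general idea as a post-order merge over a toy in-memory tree. It is not the attached patch and does not use Oak's API; the `Node` class and the `merge` method are hypothetical names made up for this example.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;

public class BottomUpMerge {

    /** Hypothetical in-memory node; NOT Oak's NodeState/NodeBuilder API. */
    public static final class Node {
        String value;
        final Map<String, Node> children = new LinkedHashMap<>();

        public Node(String value) {
            this.value = value;
        }

        /** Adds a child and returns this node, to allow chained construction. */
        public Node child(String name, Node node) {
            children.put(name, node);
            return this;
        }
    }

    /**
     * Merges source into target, visiting children before their parent
     * (post-order). Returns true if anything in this subtree was written,
     * so a parent learns whether its subtree changed without a second pass.
     */
    public static boolean merge(Node source, Node target) {
        boolean changed = false;
        for (Map.Entry<String, Node> entry : source.children.entrySet()) {
            Node targetChild = target.children.get(entry.getKey());
            if (targetChild == null) {
                // Child is missing in the target: create it, then fill it in.
                targetChild = new Node(null);
                target.children.put(entry.getKey(), targetChild);
                changed = true;
            }
            // Non-short-circuit |= so the child subtree is always visited.
            changed |= merge(entry.getValue(), targetChild);
        }
        // Only touch this node's own value if it actually differs.
        if (!Objects.equals(source.value, target.value)) {
            target.value = source.value;
            changed = true;
        }
        return changed;
    }

    public static void main(String[] args) {
        Node source = new Node("root")
                .child("a", new Node("1").child("b", new Node("2")));
        Node target = new Node(null);

        System.out.println("first run changed:  " + merge(source, target));
        System.out.println("second run changed: " + merge(source, target));
    }
}
```

In this toy model, a second merge over an unchanged source writes nothing and reports no change for any subtree, which corresponds to the "no changes with the patch applied" observation for the merge run.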