[ https://issues.apache.org/jira/browse/OAK-2663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14552311#comment-14552311 ]
Thomas Mueller commented on OAK-2663:
-------------------------------------

I made some changes to the patch; for example, I split the combined method filterFailingUniques. I used a more conservative approach to detect duplicates: the node itself is checked (same as in the old code), not just the existence of the node (as in the patch). While adding entries, it is now checked that the same path is only indexed once.

I made those changes because a test case showed problems with the existing approach: if an index is outdated (for example, because nodes were removed directly in MongoDB), then re-adding nodes that are already indexed results in a duplicate key exception. This is now prevented.

I also removed some untested and currently unused code (ContentMirrorStoreStrategy.exists), but added a comment to this issue in case this code is ever needed.

> Unique property index can trigger OOM during upgrade of large repository
> ------------------------------------------------------------------------
>
>                 Key: OAK-2663
>                 URL: https://issues.apache.org/jira/browse/OAK-2663
>             Project: Jackrabbit Oak
>          Issue Type: Bug
>          Components: upgrade
>            Reporter: Chetan Mehrotra
>            Assignee: Thomas Mueller
>              Labels: performance
>             Fix For: 1.3.0, 1.2.3, 1.0.15
>
>         Attachments: OAK-2663.patch
>
>
> {{PropertyIndexEditor}}, when configured for a unique index, maintains an
> in-memory state of the indexed property in {{keysToCheckForUniqueness}}. This
> set accumulates all the unique values being indexed.
> In case of an upgrade, where the complete upgrade is performed in a single
> commit, this state can become very large. Furthermore, on exit the editor
> validates that all such values are actually unique by iterating over all of
> them.
> We should look into other possible ways to enforce the uniqueness constraint.
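
For illustration, the two checks described in the comment (each path is indexed only once per key, and duplicates are decided by looking at the node itself rather than at the mere existence of an index entry) could be sketched roughly as follows. This is not the Oak implementation: the class UniqueIndexSketch, its methods, and the in-memory maps standing in for the index and the repository are all hypothetical and only show the idea.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; none of these names are Oak APIs.
final class UniqueIndexSketch {
    // Index content: property value -> paths of the nodes indexed under it.
    private final Map<String, Set<String>> index = new HashMap<>();
    // Stand-in for the repository: path -> current value of the indexed property.
    private final Map<String, String> nodes = new HashMap<>();

    // Adds (or re-adds) a node and indexes it.
    void putNode(String path, String value) {
        nodes.put(path, value);
        // Using a Set means the same path is indexed at most once per key,
        // even if the index already holds an outdated entry for that path.
        index.computeIfAbsent(value, k -> new LinkedHashSet<>()).add(path);
    }

    // Simulates removing a node without updating the index (outdated index).
    void removeNodeBypassingIndex(String path) {
        nodes.remove(path);
    }

    // Number of distinct paths currently indexed under the given value.
    int indexedPathCount(String value) {
        Set<String> paths = index.get(value);
        return paths == null ? 0 : paths.size();
    }

    // Conservative check: an index entry counts only if the node itself
    // still exists and still carries the value.
    boolean violatesUniqueness(String value) {
        int live = 0;
        for (String path : index.getOrDefault(value, Collections.<String>emptySet())) {
            if (value.equals(nodes.get(path))) {
                live++;
            }
        }
        return live > 1;
    }
}
```

In this sketch, an entry left behind by a node that was removed behind the index's back is ignored by the check, and re-adding that node does not create a second entry for the same path.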