[ https://issues.apache.org/jira/browse/OAK-10657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17819969#comment-17819969 ]
Julian Reschke edited comment on OAK-10657 at 2/23/24 12:39 PM:
----------------------------------------------------------------

Trying to write down an alternate approach:

- While persisting the branch commits, we are persisting large :childOrder properties repeatedly. In practice, only the last value is needed, so the previous ones could be cleaned up.
- We currently do not keep information about when (revision) and where (_id) we have set :childOrder.
- The "clean" approach would be to maintain a map of _id/revision that tells us in which revision we last set :childOrder. That could be used to pair the setting of the new value with a removal of the previous one.
- But we may be able to simplify that: just maintain a list of _all_ revisions that changed :childOrder, and any time we need to set a new value for :childOrder, nuke the entries for all of these revisions. This would be harmless because an extra REMOVE_MAP_ENTRY operation is essentially free, except for a small overhead in processing.

EDIT: opened OAK-10660 to track this.
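The simplified bookkeeping from the last bullet of the comment could be sketched roughly as follows. This is only an illustration under stated assumptions: `ChildOrderTracker` and `setChildOrder` are hypothetical names, operations are modelled as plain strings rather than Oak's `UpdateOp` objects, and the real commit path is not shown.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Oak. Operations are plain strings here,
// standing in for real update operations.
final class ChildOrderTracker {

    // All revisions in which :childOrder was set, regardless of document.
    private final List<String> childOrderRevisions = new ArrayList<>();

    /**
     * Returns the operations for setting :childOrder on document {@code id}
     * at {@code revision}: one removal per previously recorded revision,
     * followed by the new set operation.
     */
    List<String> setChildOrder(String id, String revision) {
        List<String> ops = new ArrayList<>();
        // Removing an entry that was never written for this document is
        // assumed to be harmless, so no per-document bookkeeping is kept.
        for (String old : childOrderRevisions) {
            ops.add("REMOVE_MAP_ENTRY " + id + " :childOrder " + old);
        }
        ops.add("SET_MAP_ENTRY " + id + " :childOrder " + revision);
        childOrderRevisions.add(revision);
        return ops;
    }
}
```

The trade-off in this sketch is the one named in the comment: removals are emitted even for documents that were not touched in a given revision, in exchange for not having to track which _id each revision applied to.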
> MongoDocumentStore: shrink in-DB documents after updates fail due to 16MB limit
> -------------------------------------------------------------------------------
>
>                 Key: OAK-10657
>                 URL: https://issues.apache.org/jira/browse/OAK-10657
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: documentmk, mongomk
>            Reporter: Julian Reschke
>            Assignee: Julian Reschke
>            Priority: Major
>
> To address the 16MB/childorder issue, there are many potential approaches:
> - make GC more aggressive
> - try to change updates to remove "in-between" changes of ":childOrder" property
> - change the data model of ":childOrder"
> - try to shrink the document in the DB once a size-related exception happens
> This ticket is about the last of these options.
> Proposal:
> - improve the exception thrown by the document store so that it can be acted upon
> - in document store utils, add a method that inspects a document and produces UpdateOps suitable to shrink the document
> - DocumentNodeStore commit could catch the exception, obtain update ops, apply them, and retry once (this should be dependent on a feature toggle)

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
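The control flow of the proposal in the quoted issue (catch, shrink, retry once, behind a toggle) might look roughly like the sketch below. All names here (`DocumentTooLargeException`, `Store`, `ShrinkingCommitter`, `shrinkOpsFor`) are hypothetical and chosen for illustration; they are not Oak's actual classes, and operations are again modelled as plain strings.

```java
import java.util.List;
import java.util.function.Function;

// Hypothetical exception carrying the _id of the oversized document,
// so that the caller can act upon it.
class DocumentTooLargeException extends RuntimeException {
    final String id;

    DocumentTooLargeException(String id) {
        super("document too large: " + id);
        this.id = id;
    }
}

// Hypothetical, minimal stand-in for a document store.
interface Store {
    void apply(String op);
}

final class ShrinkingCommitter {
    private final Store store;
    // Assumed helper: inspects the document and returns ops that shrink it.
    private final Function<String, List<String>> shrinkOpsFor;
    // Stand-in for the feature toggle mentioned in the proposal.
    private final boolean shrinkEnabled;

    ShrinkingCommitter(Store store,
                       Function<String, List<String>> shrinkOpsFor,
                       boolean shrinkEnabled) {
        this.store = store;
        this.shrinkOpsFor = shrinkOpsFor;
        this.shrinkEnabled = shrinkEnabled;
    }

    void commit(String op) {
        try {
            store.apply(op);
        } catch (DocumentTooLargeException e) {
            if (!shrinkEnabled) {
                throw e;
            }
            for (String shrinkOp : shrinkOpsFor.apply(e.id)) {
                store.apply(shrinkOp);
            }
            // Retry exactly once; a second failure propagates to the caller.
            store.apply(op);
        }
    }
}
```

With the toggle off, the sketch behaves as if no shrinking logic existed, which keeps the change easy to disable.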