[ 
https://issues.apache.org/jira/browse/OAK-10657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17819969#comment-17819969
 ] 

Julian Reschke edited comment on OAK-10657 at 2/23/24 12:39 PM:
----------------------------------------------------------------

Trying to write down an alternate approach...:
 - While persisting the branch commits, we are persisting large :childOrder 
properties repeatedly. In practice, only the last value is needed, so the 
previous ones could be cleaned up.
 - We currently do not keep information about when (revision) and where (_id) 
we have set :childOrder.
 - The "clean" approach would be to maintain a map of _id/revision that tells 
us in which revision we last set :childOrder. That could be used to pair the 
setting of the new value with a removal of the previous one.
 - But we may be able to simplify that: just maintain a list of _all_ revisions 
that changed :childOrder, and any time we need to set a new value for 
:childOrder, nuke the entries for all of these revisions. This would be 
harmless because an extra REMOVE_MAP_ENTRY operation is essentially free, 
except for a small overhead in processing.

EDIT: opened OAK-10660 to track this
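
A minimal sketch of the simplified variant described in the last bullet, assuming a toy in-memory model rather than Oak's actual classes. The class ChildOrderCleanupSketch, its methods, and the ids and revision strings are all hypothetical. It keeps one set of all revisions that changed :childOrder (without tracking the _id) and, on every new write, removes the entries for all of them on the target document before setting the new value:

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical, self-contained model; none of these types are Oak classes.
final class ChildOrderCleanupSketch {

    // All revisions in which :childOrder was written so far, regardless of _id.
    private final Set<String> childOrderRevisions = new LinkedHashSet<>();

    // Toy stand-in for the store: _id -> (revision -> :childOrder value).
    private final Map<String, Map<String, String>> docs = new HashMap<>();

    // Counts removals issued, to make the "small overhead" visible.
    int removeOpsIssued = 0;

    void setChildOrder(String id, String revision, String value) {
        Map<String, String> entries =
                docs.computeIfAbsent(id, k -> new LinkedHashMap<>());
        // Remove the entry for every known revision. For revisions that never
        // touched this document, the removal is a no-op.
        for (String old : childOrderRevisions) {
            entries.remove(old);
            removeOpsIssued++;
        }
        entries.put(revision, value);
        childOrderRevisions.add(revision);
    }

    Map<String, String> entries(String id) {
        return docs.getOrDefault(id, Collections.<String, String>emptyMap());
    }

    public static void main(String[] args) {
        ChildOrderCleanupSketch s = new ChildOrderCleanupSketch();
        s.setChildOrder("1:/a", "r1", "[x]");
        s.setChildOrder("1:/b", "r2", "[p]");
        s.setChildOrder("1:/a", "r3", "[x, y]");
        System.out.println(s.entries("1:/a"));
        System.out.println(s.entries("1:/b"));
    }
}
```

In this model each document ends up with only its latest :childOrder entry, at the cost of some removals that hit nothing.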


> MongoDocumentStore: shrink in-DB documents after updates fail due to 16MB 
> limit
> -------------------------------------------------------------------------------
>
>                 Key: OAK-10657
>                 URL: https://issues.apache.org/jira/browse/OAK-10657
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: documentmk, mongomk
>            Reporter: Julian Reschke
>            Assignee: Julian Reschke
>            Priority: Major
>
> To address the 16MB/childorder issue, there are many potential approaches:
> - make GC more aggressive 
> - try to change updates to remove "in-between" changes of ":childOrder" 
> property
> - change the data model of ":childOrder"
> - try to shrink the document in the DB once a size-related exception happens
> This ticket is about the last of these options.
> Proposal:
> - improve exception thrown by document store so that it can be acted upon
> - in document store utils add a method that inspects a document and produces 
> UpdateOps suitable to shrink the document
> - DocumentNodeStore commit could catch exception, obtain update ops, apply 
> them, and retry once (this should be dependent on a feature toggle)
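
The proposed control flow could look roughly like the sketch below. This is an illustration only: the exception type, the Store interface, and commitWithRetry are hypothetical names and not part of Oak's API. It shows a single retry after shrinking, guarded by a feature toggle:

```java
// Hypothetical, self-contained model; none of these types are Oak classes.
final class ShrinkAndRetrySketch {

    /** Hypothetical exception that carries the _id of the oversized document. */
    static final class DocumentTooLargeException extends RuntimeException {
        final String id;

        DocumentTooLargeException(String id) {
            super("document too large: " + id);
            this.id = id;
        }
    }

    /** Hypothetical view of the store, reduced to what the sketch needs. */
    interface Store {
        // Applies the pending commit; may throw DocumentTooLargeException.
        void applyCommit();

        // Inspects the document and applies update operations that shrink it.
        void shrink(String id);
    }

    static void commitWithRetry(Store store, boolean shrinkEnabled) {
        try {
            store.applyCommit();
        } catch (DocumentTooLargeException e) {
            if (!shrinkEnabled) {
                throw e; // feature toggle off: keep the current behaviour
            }
            store.shrink(e.id);
            // Retry exactly once; a second failure propagates to the caller.
            store.applyCommit();
        }
    }
}
```

Retrying only once keeps the failure mode simple: if shrinking did not help, the original kind of exception still reaches the caller.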



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
