[jira] [Comment Edited] (OAK-10657) MongoDocumentStore: shrink in-DB documents after updates fail due to 16MB limit

2024-02-23 Thread Julian Reschke (Jira)


[ 
https://issues.apache.org/jira/browse/OAK-10657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17819969#comment-17819969
 ] 

Julian Reschke edited comment on OAK-10657 at 2/23/24 12:39 PM:


Trying to write down an alternate approach...:
 - While persisting the branch commits, we are persisting large :childOrder 
properties repeatedly. In practice, only the last value is needed, so the 
previous ones could be cleaned up.
 - We currently do not keep information about when (revision) and where (_id) 
we have set :childOrder.
 - The "clean" approach would be to maintain a map of _id/revision that tells 
us in which revision we last set :childOrder. That could be used to pair the 
setting of the new value with a removal of the previous one.
 - But we may be able to simplify that: just maintain a list of _all_ revisions 
that changed :childOrder, and any time we need to set a new value for 
:childOrder, nuke the entries for all of these revisions. This would be 
harmless because an extra REMOVE_MAP_ENTRY operation is essentially free, 
except for a small overhead in processing.

EDIT: opened OAK-10660 to track this
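
The simplified variant from the last bullet could be sketched roughly as follows. This is only an illustration under stated assumptions: the class ChildOrderCleanupSketch, the Op type and the string-typed revisions are hypothetical stand-ins, not Oak's actual UpdateOp or Revision classes. The tracker keeps one flat list of every revision that wrote :childOrder (not keyed by _id), and each new write is paired with a removal for all of them, relying on the removal being a no-op when the document has no entry for that revision.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; none of these types are Oak's real classes. */
final class ChildOrderCleanupSketch {

    /** Simplified stand-in for a single operation on a map-valued property. */
    static final class Op {
        final String type;      // "SET_MAP_ENTRY" or "REMOVE_MAP_ENTRY"
        final String property;
        final String revision;
        final String value;     // null for removals

        Op(String type, String property, String revision, String value) {
            this.type = type;
            this.property = property;
            this.revision = revision;
            this.value = value;
        }
    }

    // Flat list of all revisions in which :childOrder was written,
    // deliberately not keyed by document _id.
    private final List<String> childOrderRevisions = new ArrayList<>();

    /** Builds the operations for writing a new :childOrder value to one document. */
    List<Op> setChildOrder(String revision, String value) {
        List<Op> ops = new ArrayList<>();
        // Remove the entries of all earlier revisions; removals that do not
        // match an entry in the target document are assumed to be no-ops.
        for (String old : childOrderRevisions) {
            ops.add(new Op("REMOVE_MAP_ENTRY", ":childOrder", old, null));
        }
        ops.add(new Op("SET_MAP_ENTRY", ":childOrder", revision, value));
        childOrderRevisions.add(revision);
        return ops;
    }

    public static void main(String[] args) {
        ChildOrderCleanupSketch tracker = new ChildOrderCleanupSketch();
        tracker.setChildOrder("r1", "[a]");
        List<Op> ops = tracker.setChildOrder("r2", "[a, b]");
        for (Op op : ops) {
            System.out.println(op.type + " " + op.property + " @ " + op.revision);
        }
    }
}
```

The trade-off is the one named above: the number of removals grows with the number of tracked revisions, in exchange for not having to remember which document each revision touched.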


was (Author: reschke):
Trying to write down an alternate approach...:
 - While persisting the branch commits, we are persisting large :childOrder 
properties repeatedly. In practice, only the last value is needed, so the 
previous ones could be cleaned up.
 - We currently do not keep information about when (revision) and where (_id) 
we have set :childOrder.
 - The "clean" approach would be to maintain a map of _id/revision that tells 
us in which revision we last set :childOrder. That could be used to pair the 
setting of the new value with a removal of the previous one.
 - But we may be able to simplify that: just maintain a list of _all_ revisions 
that changed :childOrder, and any time we need to set a new value for 
:childOrder, nuke the entries for all of these revisions. This would be 
harmless because an extra REMOVE_MAP_ENTRY operation is essentially free, 
except for a small overhead in processing.

> MongoDocumentStore: shrink in-DB documents after updates fail due to 16MB 
> limit
> ---
>
> Key: OAK-10657
> URL: https://issues.apache.org/jira/browse/OAK-10657
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: documentmk, mongomk
>Reporter: Julian Reschke
>Assignee: Julian Reschke
>Priority: Major
>
> To address the 16MB/childorder issue, there are many potential approaches:
> - make GC more aggressive 
> - try to change updates to remove "in-between" changes of ":childOrder" 
> property
> - change the data model of ":childOrder"
> - try to shrink document in DB once size related exception happens
> This ticket is about the last of these options.
> Proposal:
> - improve exception thrown by document store so that it can be acted upon
> - in document store utils add a method that inspects a document and produces 
> UpdateOps suitable to shrink the document
> - DocumentNodeStore commit could catch exception, obtain update ops, apply 
> them, and retry once (this should be dependent on a feature toggle)
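
As a rough illustration of the proposal quoted above, the catch/shrink/retry-once flow might look like the sketch below. Everything in it is an assumption made for the example: DocumentTooLargeException, the Store interface, shrinkOpsFor and the shrinkEnabled flag are hypothetical names, updates are plain strings, and nothing here is Oak's actual API.

```java
import java.util.List;

/** Hypothetical sketch of "shrink and retry once"; not Oak's real classes. */
final class ShrinkAndRetrySketch {

    /** Assumed exception that identifies the document that grew too large. */
    static final class DocumentTooLargeException extends RuntimeException {
        final String documentId;

        DocumentTooLargeException(String documentId) {
            super("document exceeds size limit: " + documentId);
            this.documentId = documentId;
        }
    }

    /** Assumed minimal view of a document store plus the shrink utility. */
    interface Store {
        /** Applies an update; may throw DocumentTooLargeException. */
        void apply(String update);

        /** Inspects the document and returns updates suitable to shrink it. */
        List<String> shrinkOpsFor(String documentId);
    }

    /**
     * Applies the update. On a size-related failure, and only if the feature
     * toggle is on, applies the shrink updates and retries exactly once.
     * A failure of the retry propagates to the caller.
     */
    static void commit(Store store, String update, boolean shrinkEnabled) {
        try {
            store.apply(update);
        } catch (DocumentTooLargeException e) {
            if (!shrinkEnabled) {
                throw e;
            }
            for (String op : store.shrinkOpsFor(e.documentId)) {
                store.apply(op);
            }
            store.apply(update); // single retry, no further handling
        }
    }
}
```

Carrying the document id in the exception is what makes it possible to act on the failure, which corresponds to the first point of the proposal.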



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (OAK-10657) MongoDocumentStore: shrink in-DB documents after updates fail due to 16MB limit

2024-02-22 Thread Stefan Egli (Jira)


[ 
https://issues.apache.org/jira/browse/OAK-10657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17819633#comment-17819633
 ] 

Stefan Egli edited comment on OAK-10657 at 2/22/24 12:44 PM:

Reading the commit root is at least one of the ways, yes. -Unless, for example, 
DocumentNodeStoreBranch starts to cache branch commit revisions (or something 
along those lines); there might be some room for optimization perhaps.-  
(Scratch that, Branch has all of that already.) But resolving the commit 
value is the classic approach, yes.


was (Author: egli):
Reading commit root is at least one of the ways, yes. Unless for example 
DocumentNodeStoreBranch starts to cache branch commit revisions (or something 
along those lines).. there might be some room for optimization perhaps. But 
resolving the commit value is the classic approach yes.



[jira] [Comment Edited] (OAK-10657) MongoDocumentStore: shrink in-DB documents after updates fail due to 16MB limit

2024-02-21 Thread Stefan Egli (Jira)


[ 
https://issues.apache.org/jira/browse/OAK-10657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17819280#comment-17819280
 ] 

Stefan Egli edited comment on OAK-10657 at 2/21/24 2:46 PM:


Or, to put it a bit more simply:

* 5. would be after-commit large property checks - in the background
* 6. would be heuristic-based before-commit large property checks

while
* 4. would be on-exception large property check - on the spot


was (Author: egli):
Or, to put it a bit more simply:

* 5. would be after-commit large property checks - in the background
* 6. would be heuristic-based before-commit large property checks
