[jira] [Comment Edited] (OAK-10660) DocumentNodeStore: avoid repeated commits of :childOrder in branch commits
[ https://issues.apache.org/jira/browse/OAK-10660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17821059#comment-17821059 ] Julian Reschke edited comment on OAK-10660 at 2/27/24 9:05 AM: --- TODO: - retest with [~stefanegli]'s change - integrate feature toggle - add a DocumentStore test proving that we can remove map entries that do not exist (OAK-10673) was (Author: reschke): TODO: - retest with [~stefanegli]'s change - integrate feature toggle - add a DocumentStore test proving that we can remove map entries that do not exist > DocumentNodeStore: avoid repeated commits of :childOrder in branch commits > -- > > Key: OAK-10660 > URL: https://issues.apache.org/jira/browse/OAK-10660 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: documentmk >Reporter: Julian Reschke >Assignee: Julian Reschke >Priority: Major > > - While persisting the branch commits, we are persisting large :childOrder > properties repeatedly. In practice, only the last value is needed, so the > previous ones could be cleaned up. > - We currently do not keep information about when (revision) and where (_id) > we have set :childOrder. > - The "clean" approach would be to maintain a map of _id/revision that tells > us in which revision we last set :childOrder. That could be used to pair the > setting of the new value with a removal of the previous one. > - But we may be able to simplify that: just maintain a list of _all_ > revisions that changed :childOrder, and any time we need to set a new value > for :childOrder, nuke the entries for all of these revisions. This would be > harmless because an extra REMOVE_MAP_ENTRY operation is essentially free, > except fo ra small overhead in processing. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Comment Edited] (OAK-10660) DocumentNodeStore: avoid repeated commits of :childOrder in branch commits
[ https://issues.apache.org/jira/browse/OAK-10660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17820636#comment-17820636 ] Stefan Egli edited comment on OAK-10660 at 2/26/24 9:17 AM: In DocumentNodeStore the revisions are actually already tracked in the {{org.apache.jackrabbit.oak.plugins.document.Branch}} - which is reachable via the {{org.apache.jackrabbit.oak.plugins.document.UnmergedBranches}} - try {noformat} // rev being the branch commit revision vector dns.getBranches().getBranch(rev); {noformat} That contains the unmerged branches of the local instance - which I guess is what we want (IOW I don't think we need a new revisions set) was (Author: egli): In DocumentNodeStore the revisions are actually already tracked in the {{org.apache.jackrabbit.oak.plugins.document.Branch}} - which is reachable via the {{org.apache.jackrabbit.oak.plugins.document.UnmergedBranches}} - try {noformat} // rev being the branch commit revision vector dns.getBranches().getBranch(rev); {noformat} That contains the unmerged branches of the local instance - which I guess is what we want. > DocumentNodeStore: avoid repeated commits of :childOrder in branch commits > -- > > Key: OAK-10660 > URL: https://issues.apache.org/jira/browse/OAK-10660 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: documentmk >Reporter: Julian Reschke >Assignee: Julian Reschke >Priority: Major > > - While persisting the branch commits, we are persisting large :childOrder > properties repeatedly. In practice, only the last value is needed, so the > previous ones could be cleaned up. > - We currently do not keep information about when (revision) and where (_id) > we have set :childOrder. > - The "clean" approach would be to maintain a map of _id/revision that tells > us in which revision we last set :childOrder. That could be used to pair the > setting of the new value with a removal of the previous one. > - But we may be able to simplify that: just maintain a list of _all_ > revisions that changed :childOrder, and any time we need to set a new value > for :childOrder, nuke the entries for all of these revisions. This would be > harmless because an extra REMOVE_MAP_ENTRY operation is essentially free, > except fo ra small overhead in processing. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Comment Edited] (OAK-10660) DocumentNodeStore: avoid repeated commits of :childOrder in branch commits
[ https://issues.apache.org/jira/browse/OAK-10660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17820045#comment-17820045 ] Julian Reschke edited comment on OAK-10660 at 2/23/24 12:42 PM: Here: {noformat} diff --git a/oak-store-document/src/main/java/org/apache/jackrabbit/oak/plugins/document/DocumentNodeStoreBranch.java b/oak-store-document/src/main/java/org/apache/jackrabbit/oak/plugins/document/DocumentNodeStoreBranch.java index 1b04f62fa5..5cdf3901da 100644 --- a/oak-store-document/src/main/java/org/apache/jackrabbit/oak/plugins/document/DocumentNodeStoreBranch.java +++ b/oak-store-document/src/main/java/org/apache/jackrabbit/oak/plugins/document/DocumentNodeStoreBranch.java @@ -78,6 +78,9 @@ class DocumentNodeStoreBranch implements NodeStoreBranch { /** The maximum number of updates to keep in memory */ private final int updateLimit;+ /** Revisions written by us */ + private final Set revisions = new HashSet<>(); + /** * State of the this branch. Either {@link Unmodified}, {@link InMemory}, {@link Persisted}, * {@link ResetFailed} or {@link Merged}. @@ -321,6 +324,7 @@ class DocumentNodeStoreBranch implements NodeStoreBranch { c.apply(); rev = store.done(c, base.getRootRevision().isBranch(), info); success = true; + revisions.add(c.getRevision()); } finally { if (!success) { store.canceled(c); {noformat} would be a good place to track what revisions are relevant. Now we need to figure out how to pass this down to the place where we create the UpdateOps. was (Author: reschke): Here: {noformat} diff --git a/oak-store-document/src/main/java/org/apache/jackrabbit/oak/plugins/document/DocumentNodeStoreBranch.java b/oak-store-document/src/main/java/org/apache/jackrabbit/oak/plugins/document/DocumentNodeStoreBranch.java index 1b04f62fa5..5cdf3901da 100644 --- a/oak-store-document/src/main/java/org/apache/jackrabbit/oak/plugins/document/DocumentNodeStoreBranch.java +++ b/oak-store-document/src/main/java/org/apache/jackrabbit/oak/plugins/document/DocumentNodeStoreBranch.java @@ -78,6 +78,9 @@ class DocumentNodeStoreBranch implements NodeStoreBranch { /** The maximum number of updates to keep in memory */ private final int updateLimit;+ /** Revisions written by us */ + private final Set revisions = new HashSet<>(); + /** * State of the this branch. Either {@link Unmodified}, {@link InMemory}, {@link Persisted}, * {@link ResetFailed} or {@link Merged}. @@ -321,6 +324,7 @@ class DocumentNodeStoreBranch implements NodeStoreBranch { c.apply(); rev = store.done(c, base.getRootRevision().isBranch(), info); success = true; + revisions.add(c.getRevision()); } finally { if (!success) { store.canceled(c); {noformat} would be a good place to track what revisions are relevant. Now we need to figure out how to pass this down to the place where we create the {UpdateOp}s. > DocumentNodeStore: avoid repeated commits of :childOrder in branch commits > -- > > Key: OAK-10660 > URL: https://issues.apache.org/jira/browse/OAK-10660 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: documentmk >Reporter: Julian Reschke >Assignee: Julian Reschke >Priority: Major > > - While persisting the branch commits, we are persisting large :childOrder > properties repeatedly. In practice, only the last value is needed, so the > previous ones could be cleaned up. > - We currently do not keep information about when (revision) and where (_id) > we have set :childOrder. > - The "clean" approach would be to maintain a map of _id/revision that tells > us in which revision we last set :childOrder. That could be used to pair the > setting of the new value with a removal of the previous one. > - But we may be able to simplify that: just maintain a list of _all_ > revisions that changed :childOrder, and any time we need to set a new value > for :childOrder, nuke the entries for all of these revisions. This would be > harmless because an extra REMOVE_MAP_ENTRY operation is essentially free, > except fo ra small overhead in processing. -- This message was sent by Atlassian Jira (v8.20.10#820010)