[jira] [Commented] (OAK-2685) Track root state revision when reading the tree
[ https://issues.apache.org/jira/browse/OAK-2685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14383216#comment-14383216 ]

Chetan Mehrotra commented on OAK-2685:
--

This would be quite useful and would obviate the need for changes proposed in the observation logic as part of OAK-2669

> Track root state revision when reading the tree
> ---
>
> Key: OAK-2685
> URL: https://issues.apache.org/jira/browse/OAK-2685
> Project: Jackrabbit Oak
> Issue Type: Improvement
> Components: core, mongomk
> Reporter: Marcel Reutegger
> Assignee: Marcel Reutegger
> Fix For: 1.3.1
>
>
> Currently the DocumentNodeState has two revisions:
> - {{getRevision()}} returns the read revision of this node state. This
> revision was used to read the node state from the underlying {{NodeDocument}}.
> - {{getLastRevision()}} returns the revision when this node state was last
> modified. This revision also reflects changes done further below the tree
> when the node state was not directly affected by a change.
> The lastRevision of a state is then used as the read revision of the child
> node states. This avoids reading the entire tree again with a different
> revision after the head revision changed because of a commit.
> This approach has at least two problems related to comparing node states:
> - It does not work well with the current DiffCache implementation and affects
> the hit rate of this cache. The DiffCache is pro-actively populated after a
> commit. The key for a diff is a combination of previous and current commit
> revision and the path. The value then tells what child nodes were
> added/removed/changed. As the comparison of node states proceeds and
> traverses the tree, the revision of a state may go back in time because the
> lastRevision is used as the read revision of the child nodes. This will cause
> misses in the diff cache, because the revisions do not match the previous and
> current commit revisions as used to create the cache entries. OAK-2562 tried
> to address this by keeping the read revision for child nodes at the read
> revision of the parent in calls of compareAgainstBaseState() when there is a
> diff cache hit. However, it turns out node state comparison does not always
> start at the root state. The {{EventQueue}} implementation in oak-jcr will
> start at the paths as indicated by the filter of the listener. This means,
> OAK-2562 is not effective in this case and the diff needs to be calculated
> again based on a set of revisions, which is different from the original
> commit.
> - When a diff is calculated for a parent with many child nodes, the
> {{DocumentNodeStore}} will perform a query on the underlying
> {{DocumentStore}} to get child nodes modified after a given timestamp. This
> timestamp is derived from the lower revision of the two lastRevisions of the
> parent node states to compare. The query gets problematic for the
> {{DocumentStore}} if the timestamp is too far in the past. This will happen
> when the parent node (and sub-tree) was not modified for some time. E.g. the
> {{MongoDocumentStore}} has an index on the _id and the _modified field. But
> if there are many child nodes the _id index will not be that helpful and if
> the timestamp is too far in the past, the _modified index is not selective
> either. This problem was already reported in OAK-1970 and linked issues.
> Both of the above problems could be addressed by keeping track of the read
> revision of the root node state in each of the node states as the tree is
> traversed. The revision of the root state would then be used e.g. to derive
> the timestamp for the _modified constraint in the query. Because the revision
> of the root state is rather recent, the _modified constraint is very
> selective and the index on it would be the preferred choice.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
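The diff cache miss described in OAK-2685 can be illustrated with a minimal Java sketch. This is not the Oak DiffCache: the class `DiffCacheSketch`, its `Key` type, and the use of plain `long` values as revisions are assumptions made purely for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

/** Hypothetical stand-in for a diff cache; not the actual Oak DiffCache. */
final class DiffCacheSketch {

    /** Key as described in the issue: previous and current commit revision plus the path. */
    static final class Key {
        final long from;
        final long to;
        final String path;

        Key(long from, long to, String path) {
            this.from = from;
            this.to = to;
            this.path = path;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Key)) {
                return false;
            }
            Key k = (Key) o;
            return from == k.from && to == k.to && path.equals(k.path);
        }

        @Override
        public int hashCode() {
            return Objects.hash(from, to, path);
        }
    }

    private final Map<Key, String> entries = new HashMap<>();

    /** Populated after a commit, using the commit revisions (assumed to be plain longs here). */
    void populate(long previousCommit, long currentCommit, String path, String changedChildren) {
        entries.put(new Key(previousCommit, currentCommit, path), changedChildren);
    }

    /** Returns the cached diff, or null when the revisions do not match a cached commit. */
    String lookup(long from, long to, String path) {
        return entries.get(new Key(from, to, path));
    }
}
```

With an entry populated for revisions (10, 11) at `/content`, a lookup with the same pair of revisions hits. A lookup that uses an older lastRevision such as 7 as the base misses, so the diff would have to be calculated again.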
[jira] [Commented] (OAK-2349) DiffCache based on persistent cache
[ https://issues.apache.org/jira/browse/OAK-2349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14383211#comment-14383211 ]

Chetan Mehrotra commented on OAK-2349:
--

With OAK-2669 we would be using the persistent cache, so the goal of this issue would be met.

> DiffCache based on persistent cache
> ---
>
> Key: OAK-2349
> URL: https://issues.apache.org/jira/browse/OAK-2349
> Project: Jackrabbit Oak
> Issue Type: Improvement
> Components: core, mongomk
> Reporter: Marcel Reutegger
> Assignee: Thomas Mueller
> Fix For: 1.3.0
>
>
> There is currently an in-memory and MongoDB based DiffCache implementation.
> It would be good to replace them with an implementation based on the
> persistent cache introduced with OAK-2191. This reduces traffic to MongoDB
> and can also be used in combination with the RDB backend.
[jira] [Updated] (OAK-2592) Commit does not ensure w:majority
[ https://issues.apache.org/jira/browse/OAK-2592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael Marth updated OAK-2592:
---
Fix Version/s: (was: 1.4)
               1.3.1

> Commit does not ensure w:majority
> -
>
> Key: OAK-2592
> URL: https://issues.apache.org/jira/browse/OAK-2592
> Project: Jackrabbit Oak
> Issue Type: Bug
> Components: core, mongomk
> Reporter: Marcel Reutegger
> Assignee: Marcel Reutegger
> Fix For: 1.3.1
>
>
> The MongoDocumentStore uses {{findAndModify()}} to commit a transaction. This
> operation does not allow an application specified write concern and always
> uses the MongoDB default write concern {{Acknowledged}}. This means a commit
> may not make it to a majority of a replica set when the primary fails. From a
> MongoDocumentStore perspective it may appear as if a write was successful and
> later reverted. See also the test in OAK-1641.
> To fix this, we'd probably have to change the MongoDocumentStore to avoid
> {{findAndModify()}} and use {{update()}} instead.
[jira] [Updated] (OAK-1980) Use index on non-root node
[ https://issues.apache.org/jira/browse/OAK-1980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael Marth updated OAK-1980:
---
Fix Version/s: (was: 1.4)
               1.3.0

> Use index on non-root node
> --
>
> Key: OAK-1980
> URL: https://issues.apache.org/jira/browse/OAK-1980
> Project: Jackrabbit Oak
> Issue Type: Improvement
> Components: core
> Reporter: Marcel Reutegger
> Assignee: Davide Giannella
> Fix For: 1.3.0
>
> Attachments: OAK-1980.patch
>
>
> Oak is able to maintain indexes on any location in the hierarchy. However, the
> lookup for most index implementations only makes use of an index under the
> root node. There are various TODOs in the code regarding this, e.g. in
> PropertyIndex. Looking up an index along the filter path adds some additional
> cost, but should be within reasonable bounds.
[jira] [Updated] (OAK-2556) do intermediate commit during async indexing
[ https://issues.apache.org/jira/browse/OAK-2556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael Marth updated OAK-2556:
---
Fix Version/s: (was: 1.4)
               1.3.1

> do intermediate commit during async indexing
>
>
> Key: OAK-2556
> URL: https://issues.apache.org/jira/browse/OAK-2556
> Project: Jackrabbit Oak
> Issue Type: Bug
> Components: oak-lucene
> Affects Versions: 1.0.11
> Reporter: Stefan Egli
> Fix For: 1.3.1
>
>
> A recent issue found at a customer unveils a potential issue with the async
> indexer. Reading {{AsyncIndexUpdate.updateIndex}}, it looks like it is doing
> the entire update of the async indexer *in one go*, i.e. in one commit.
> However, when there is, for some reason, a huge diff that the async indexer
> has to process, the 'one big commit' can become gigantic. In fact, there is
> no limit to the size of the commit.
> So the suggestion is to do intermediate commits while the async indexer is
> running. This is acceptable because an asynchronously maintained index is
> never 100% up-to-date anyway, so it would not make much of a difference if
> the indexer committed after every 100 or 1000 changes.
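The intermediate-commit idea from OAK-2556 can be sketched in a few lines of Java. The class `BatchedIndexUpdateSketch` and its methods are hypothetical and do not reflect the real `AsyncIndexUpdate` code; the "commit" here only records how many changes it covered.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of intermediate commits; not AsyncIndexUpdate itself. */
final class BatchedIndexUpdateSketch {

    private final int batchSize;
    private final List<Integer> commitSizes = new ArrayList<>();
    private int pending;

    BatchedIndexUpdateSketch(int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.batchSize = batchSize;
    }

    /** Records one indexed change and commits once the batch is full. */
    void onChange() {
        pending++;
        if (pending >= batchSize) {
            commit();
        }
    }

    /** Commits whatever is left at the end of the indexing cycle. */
    void finish() {
        if (pending > 0) {
            commit();
        }
    }

    /** Stand-in for a real commit: only remembers how many changes it covered. */
    private void commit() {
        commitSizes.add(pending);
        pending = 0;
    }

    List<Integer> commitSizes() {
        return commitSizes;
    }
}
```

With a batch size of 1000, a diff of 2500 changes would be split into three commits instead of one, which bounds the size of each individual commit.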
[jira] [Updated] (OAK-1769) Better cooperation for conflicting updates across cluster nodes
[ https://issues.apache.org/jira/browse/OAK-1769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael Marth updated OAK-1769:
---
Fix Version/s: (was: 1.4)
               1.3.1

> Better cooperation for conflicting updates across cluster nodes
> ---
>
> Key: OAK-1769
> URL: https://issues.apache.org/jira/browse/OAK-1769
> Project: Jackrabbit Oak
> Issue Type: Improvement
> Components: core, mongomk
> Reporter: Marcel Reutegger
> Assignee: Marcel Reutegger
> Fix For: 1.3.1
>
>
> Every now and then we see commit failures in a cluster when many sessions try
> to update the same property or perform some other conflicting update.
> The current implementation will retry the merge after a delay, but chances
> are some session on another cluster node again changed the property in the
> meantime. This will lead to yet another retry until the limit is reached and
> the commit fails. The conflict logic is quite unfair, because it favors the
> winning session.
> The implementation should be improved to show fairer behavior across
> cluster nodes when there are conflicts caused by competing sessions.
[jira] [Updated] (OAK-1550) Incorrect handling of addExistingNode conflict in NodeStore
[ https://issues.apache.org/jira/browse/OAK-1550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael Marth updated OAK-1550:
---
Fix Version/s: (was: 1.4)
               1.3.1

> Incorrect handling of addExistingNode conflict in NodeStore
> ---
>
> Key: OAK-1550
> URL: https://issues.apache.org/jira/browse/OAK-1550
> Project: Jackrabbit Oak
> Issue Type: Bug
> Components: mongomk
> Reporter: Michael Dürig
> Assignee: Marcel Reutegger
> Fix For: 1.3.1
>
>
> {{MicroKernel.rebase}} says: "addExistingNode: node has been added that is
> different from a node of the same name that has been added to the trunk."
> However, the {{NodeStore}} implementation
> # throws a {{CommitFailedException}} itself instead of annotating the
> conflict,
> # also treats equal children with the same name as a conflict.
[jira] [Updated] (OAK-1553) More sophisticated conflict resolution when concurrently adding nodes
[ https://issues.apache.org/jira/browse/OAK-1553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael Marth updated OAK-1553:
---
Fix Version/s: (was: 1.4)
               1.3.1

> More sophisticated conflict resolution when concurrently adding nodes
> -
>
> Key: OAK-1553
> URL: https://issues.apache.org/jira/browse/OAK-1553
> Project: Jackrabbit Oak
> Issue Type: Improvement
> Components: core, mk, mongomk, segmentmk
> Reporter: Michael Dürig
> Assignee: Michael Dürig
> Labels: concurrency
> Fix For: 1.3.1
>
> Attachments: OAK-1553.patch
>
>
> {{MicroKernel.rebase}} currently specifies: "addExistingNode: A node has been
> added that is different from a node of the same name that has been added to
> the trunk."
> This is somewhat troublesome in the case where the same node with different
> but non-conflicting child items is added concurrently:
> {code}
> f.add("fo").add("u1"); commit();
> f.add("fo").add("u2"); commit();
> {code}
> currently fails with a conflict because {{fo}} is not the same node in
> both cases. See discussion http://markmail.org/message/flst4eiqvbp4gi3z
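One way to picture the more lenient resolution asked for in OAK-1553 is a recursive merge that only reports a conflict when both sides set the same property to different values. The following Java sketch is an assumption-laden illustration: `AddNodeMergeSketch`, its `Node` type, and the merge rule are hypothetical and are not taken from the attached patch or from the Oak code base.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch of merging concurrently added nodes; not the Oak implementation. */
final class AddNodeMergeSketch {

    /** Minimal in-memory node with string properties and named children. */
    static final class Node {
        final Map<String, String> properties = new TreeMap<>();
        final Map<String, Node> children = new TreeMap<>();

        /** Returns the child with the given name, creating it if it does not exist. */
        Node add(String name) {
            return children.computeIfAbsent(name, n -> new Node());
        }
    }

    private AddNodeMergeSketch() {
    }

    /**
     * Merges 'ours' into 'theirs'. Returns false as soon as a property exists on both
     * sides with different values. Note that 'theirs' may be partially modified when
     * false is returned; a real implementation would merge into a copy.
     */
    static boolean merge(Node theirs, Node ours) {
        for (Map.Entry<String, String> p : ours.properties.entrySet()) {
            String existing = theirs.properties.putIfAbsent(p.getKey(), p.getValue());
            if (existing != null && !existing.equals(p.getValue())) {
                return false;
            }
        }
        for (Map.Entry<String, Node> c : ours.children.entrySet()) {
            Node target = theirs.children.get(c.getKey());
            if (target == null) {
                theirs.children.put(c.getKey(), c.getValue());
            } else if (!merge(target, c.getValue())) {
                return false;
            }
        }
        return true;
    }
}
```

Under this rule the example from the issue merges cleanly: both sides add `fo`, and the result contains the children `u1` and `u2`.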
[jira] [Updated] (OAK-2685) Track root state revision when reading the tree
[ https://issues.apache.org/jira/browse/OAK-2685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael Marth updated OAK-2685:
---
Fix Version/s: 1.3.1

> Track root state revision when reading the tree
> ---
>
> Key: OAK-2685
> URL: https://issues.apache.org/jira/browse/OAK-2685
> Project: Jackrabbit Oak
> Issue Type: Improvement
> Components: core, mongomk
> Reporter: Marcel Reutegger
> Assignee: Marcel Reutegger
> Fix For: 1.3.1
[jira] [Updated] (OAK-2349) DiffCache based on persistent cache
[ https://issues.apache.org/jira/browse/OAK-2349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael Marth updated OAK-2349:
---
Fix Version/s: (was: 1.4)
               1.3.0

> DiffCache based on persistent cache
> ---
>
> Key: OAK-2349
> URL: https://issues.apache.org/jira/browse/OAK-2349
> Project: Jackrabbit Oak
> Issue Type: Improvement
> Components: core, mongomk
> Reporter: Marcel Reutegger
> Assignee: Thomas Mueller
> Fix For: 1.3.0
[jira] [Updated] (OAK-2622) dynamic cache allocation
[ https://issues.apache.org/jira/browse/OAK-2622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael Marth updated OAK-2622:
---
Fix Version/s: (was: 1.4)
               1.3.1

> dynamic cache allocation
>
>
> Key: OAK-2622
> URL: https://issues.apache.org/jira/browse/OAK-2622
> Project: Jackrabbit Oak
> Issue Type: Improvement
> Components: mongomk
> Affects Versions: 1.0.12
> Reporter: Stefan Egli
> Fix For: 1.3.1
>
>
> At the moment mongoMk's various caches are configurable (OAK-2546) but
> otherwise static in terms of size. Different use cases might require
> different allocations of the sub-caches though. And it might not always be
> possible to find a good configuration upfront for all use cases.
> We might be able to allocate the overall cache size dynamically to the
> different sub-caches, based for example on how heavily loaded each cache is
> or how well it performs.
[jira] [Created] (OAK-2689) Test failure: QueryResultTest.testGetSize
Michael Dürig created OAK-2689:
--

Summary: Test failure: QueryResultTest.testGetSize
Key: OAK-2689
URL: https://issues.apache.org/jira/browse/OAK-2689
Project: Jackrabbit Oak
Issue Type: Bug
Components: core
Environment: Jenkins, Ubuntu: https://builds.apache.org/job/Apache%20Jackrabbit%20Oak%20matrix/
Reporter: Michael Dürig
Fix For: 1.2

{{org.apache.jackrabbit.core.query.QueryResultTest.testGetSize}} fails every couple of builds:

{noformat}
junit.framework.AssertionFailedError: Wrong size of NodeIterator in result expected:<48> but was:<-1>
	at junit.framework.Assert.fail(Assert.java:50)
	at junit.framework.Assert.failNotEquals(Assert.java:287)
	at junit.framework.Assert.assertEquals(Assert.java:67)
	at junit.framework.Assert.assertEquals(Assert.java:134)
	at org.apache.jackrabbit.core.query.QueryResultTest.testGetSize(QueryResultTest.java:47)
{noformat}

Failure seen at builds: 29, 39, 59

See e.g. https://builds.apache.org/job/Apache%20Jackrabbit%20Oak%20matrix/59/jdk=jdk-1.6u45,label=Ubuntu,nsfixtures=DOCUMENT_NS,profile=unittesting/testReport/junit/org.apache.jackrabbit.core.query/QueryResultTest/testGetSize/
[jira] [Resolved] (OAK-2596) more (jmx) instrumentation for observation queue
[ https://issues.apache.org/jira/browse/OAK-2596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael Dürig resolved OAK-2596.
Resolution: Fixed

Fixed at http://svn.apache.org/r1669362. Sorry for the slightly misleading commit message. The fix is based on Jackrabbit 2.10. No snapshots involved.

> more (jmx) instrumentation for observation queue
>
>
> Key: OAK-2596
> URL: https://issues.apache.org/jira/browse/OAK-2596
> Project: Jackrabbit Oak
> Issue Type: Improvement
> Components: core
> Affects Versions: 1.0.12
> Reporter: Stefan Egli
> Assignee: Michael Dürig
> Priority: Blocker
> Labels: monitoring, observation
> Fix For: 1.1.8
>
>
> While debugging issues with the observation queue it would be handy to have
> more detailed information available. At the moment you can only see one value
> for the length of the queue: the maximum over all queues. It is unclear
> whether the queue is that long for only one or for many listeners. It is also
> unclear whether the listener is slow or the engine that produces the events
> for the listener.
> So I'd suggest adding the following details, possibly exposed via JMX:
> # add queue length details to each of the observation listeners
> # have a history of the last, e.g., 1000 events per listener showing a) how long
> the event took to be created/generated and b) how long the listener took to
> process it. Sometimes averages are not detailed enough, so such in-depth
> information might become useful. (Not sure about the feasibility of '1000'
> here - maybe that could be configurable though - just putting the idea out
> here).
> # have some information about whether a listener is currently 'reading events
> from the cache' or whether it has to go to e.g. Mongo
> # maybe have a 'top 10' of listeners that have the largest queue at the moment
> to easily allow navigation instead of having to go through all (e.g. 200)
> listeners manually each time.
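The 'top 10' suggestion from OAK-2596 amounts to sorting listeners by their current queue length and keeping the first few. A minimal Java sketch follows; `ListenerQueueStatsSketch` and the map of listener names to queue lengths are assumptions for illustration and are not the JMX API that was added for this issue.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

/** Hypothetical helper that ranks observation listeners by queue length. */
final class ListenerQueueStatsSketch {

    private ListenerQueueStatsSketch() {
    }

    /** Returns the names of the n listeners with the longest queues, longest first. */
    static List<String> topByQueueLength(Map<String, Integer> queueLengths, int n) {
        return queueLengths.entrySet().stream()
                .sorted(Map.Entry.<String, Integer>comparingByValue().reversed())
                .limit(n)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }
}
```

The order of listeners with equal queue lengths depends on the iteration order of the input map, so a real implementation would probably add a tie-breaker such as the listener name.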
[jira] [Resolved] (OAK-2661) Glob restriction test failures on Jenkins
[ https://issues.apache.org/jira/browse/OAK-2661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

angela resolved OAK-2661.
-
Resolution: Fixed

> Glob restriction test failures on Jenkins
> -
>
> Key: OAK-2661
> URL: https://issues.apache.org/jira/browse/OAK-2661
> Project: Jackrabbit Oak
> Issue Type: Bug
> Components: core
> Environment: Jenkins, Ubuntu: https://builds.apache.org/job/Apache%20Jackrabbit%20Oak%20matrix/
> Reporter: Michael Dürig
> Assignee: angela
> Labels: CI, jenkins
> Fix For: 1.2
>
>
> The following tests fail often on Jenkins:
> {noformat}
> testGlobRestriction2(org.apache.jackrabbit.oak.jcr.security.authorization.ReadTest): Authorizable with ID group2 already exists
> testGlobRestriction3(org.apache.jackrabbit.oak.jcr.security.authorization.ReadTest): Authorizable with ID group2 already exists
> {noformat}
> See https://builds.apache.org/job/Apache%20Jackrabbit%20Oak%20matrix/33/jdk=jdk1.8.0_11,label=Ubuntu,nsfixtures=DOCUMENT_NS,profile=unittesting/
[jira] [Commented] (OAK-2661) Glob restriction test failures on Jenkins
[ https://issues.apache.org/jira/browse/OAK-2661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14382125#comment-14382125 ]

angela commented on OAK-2661:
-

IMHO, the changes committed in revision 1669363 should fix the problem. Please reopen if the failures still appear on Jenkins.

> Glob restriction test failures on Jenkins
> -
>
> Key: OAK-2661
> URL: https://issues.apache.org/jira/browse/OAK-2661
> Project: Jackrabbit Oak
> Issue Type: Bug
> Components: core
> Environment: Jenkins, Ubuntu: https://builds.apache.org/job/Apache%20Jackrabbit%20Oak%20matrix/
> Reporter: Michael Dürig
> Assignee: angela
> Labels: CI, jenkins
> Fix For: 1.2
[jira] [Commented] (OAK-2661) Glob restriction test failures on Jenkins
[ https://issues.apache.org/jira/browse/OAK-2661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14382122#comment-14382122 ]

angela commented on OAK-2661:
-

Oh, and one of the 3 affected tests didn't save the group cleanup.

> Glob restriction test failures on Jenkins
> -
>
> Key: OAK-2661
> URL: https://issues.apache.org/jira/browse/OAK-2661
> Project: Jackrabbit Oak
> Issue Type: Bug
> Components: core
> Environment: Jenkins, Ubuntu: https://builds.apache.org/job/Apache%20Jackrabbit%20Oak%20matrix/
> Reporter: Michael Dürig
> Assignee: angela
> Labels: CI, jenkins
> Fix For: 1.2
[jira] [Commented] (OAK-2669) Use Consolidated diff for local changes with persistent cache to avoid calculating diff again
[ https://issues.apache.org/jira/browse/OAK-2669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14382060#comment-14382060 ]

Marcel Reutegger commented on OAK-2669:
---

I also don't particularly like the {{ContentChangeInfo}} hidden behind the {{ContentChangeInfoProvider}} interface. But I think it is a good prototype to validate the general approach with the local diff cache.

At least for the root node revision part of the {{ContentChangeInfo}} we could get rid of it by implementing OAK-2685. The root revision would then be available in any {{DocumentNodeState}}. The isLocalChange() can probably also be replaced with a check for a matching base revision in the consolidated diff.

I'll come up with a patch for OAK-2685 and then we may be able to merge the two ideas and remove the {{ContentChangeInfo}}.

> Use Consolidated diff for local changes with persistent cache to avoid calculating diff again
> -
>
> Key: OAK-2669
> URL: https://issues.apache.org/jira/browse/OAK-2669
> Project: Jackrabbit Oak
> Issue Type: Improvement
> Components: mongomk
> Reporter: Chetan Mehrotra
> Fix For: 1.3.0
>
> Attachments: OAK-2669-A.patch
>
>
> Currently the diff logic in DocumentMK makes use of DiffCache, which has an
> in-memory implementation and a Mongo based implementation. Given that we need
> fast observation support for local changes, it would be better to make
> use of the persistent cache. After discussing with [~mreutegg], the following
> changes need to be made to the current logic:
> # Have Commit#applyChanges push the commit diff to the persistent cache with
> the current commit revision as key
> # In compare, pull the diff out of the persistent cache and use it if present.
> Note that this diff is for the complete tree, whereas the JSOP diff currently
> used is only per node. So we need to change the way the diff is pushed back
> to NodeStateDiff
> The above change should avoid hitting Mongo altogether for determining the
> diff. The only extra work performed in the diff calculation would be
> determining the node state view for the base revision. Later we can think of
> also including the node state base revision as part of the diff, so as to
> avoid this extra work altogether and rely on the node state from the
> persistent cache for that as well.
> See also http://markmail.org/thread/bzmwcp7k4wmtw6od
[jira] [Commented] (OAK-2661) Glob restriction test failures on Jenkins
[ https://issues.apache.org/jira/browse/OAK-2661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14382047#comment-14382047 ]

angela commented on OAK-2661:
-

well... that looks like the result of a lazy test-writer that doesn't make sure she creates a unique group :-)

> Glob restriction test failures on Jenkins
> -
>
> Key: OAK-2661
> URL: https://issues.apache.org/jira/browse/OAK-2661
> Project: Jackrabbit Oak
> Issue Type: Bug
> Components: core
> Environment: Jenkins, Ubuntu: https://builds.apache.org/job/Apache%20Jackrabbit%20Oak%20matrix/
> Reporter: Michael Dürig
> Labels: CI, jenkins
> Fix For: 1.2
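The "Authorizable with ID group2 already exists" failures point at a fixed group ID being reused across tests. A common way to avoid such collisions is to derive test IDs from a random UUID, as in the Java sketch below. The helper `UniqueTestIds` is hypothetical; whether the change committed for OAK-2661 took this approach is not stated in the thread.

```java
import java.util.UUID;

/** Hypothetical test helper; the actual fix for OAK-2661 may differ. */
final class UniqueTestIds {

    private UniqueTestIds() {
    }

    /** Returns the prefix followed by an underscore and a random UUID. */
    static String unique(String prefix) {
        return prefix + "_" + UUID.randomUUID();
    }
}
```

A test would then create its group with `UniqueTestIds.unique("group2")` instead of the literal `group2`, and it would still need to remove the group and save that removal during teardown.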
[jira] [Assigned] (OAK-2661) Glob restriction test failures on Jenkins
[ https://issues.apache.org/jira/browse/OAK-2661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

angela reassigned OAK-2661:
---
Assignee: angela

> Glob restriction test failures on Jenkins
> -
>
> Key: OAK-2661
> URL: https://issues.apache.org/jira/browse/OAK-2661
> Project: Jackrabbit Oak
> Issue Type: Bug
> Components: core
> Environment: Jenkins, Ubuntu: https://builds.apache.org/job/Apache%20Jackrabbit%20Oak%20matrix/
> Reporter: Michael Dürig
> Assignee: angela
> Labels: CI, jenkins
> Fix For: 1.2
[jira] [Resolved] (OAK-1826) Empty directories not cleaned up when gc run on FileDataStore
[ https://issues.apache.org/jira/browse/OAK-1826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marcel Reutegger resolved OAK-1826.
---
Resolution: Fixed

Jackrabbit dependency in oak trunk is now 2.10.0. Resolving as fixed.

> Empty directories not cleaned up when gc run on FileDataStore
> -
>
> Key: OAK-1826
> URL: https://issues.apache.org/jira/browse/OAK-1826
> Project: Jackrabbit Oak
> Issue Type: Bug
> Components: blob
> Reporter: Amit Jain
> Assignee: Amit Jain
> Priority: Minor
> Fix For: 1.1.8
>
>
> The garbage collection only deletes the particular files identified as
> garbage. Any empty directories remaining after this operation are not cleaned
> up.
[jira] [Resolved] (OAK-2632) Upgrade Jackrabbit dependency to 2.10.0
[ https://issues.apache.org/jira/browse/OAK-2632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marcel Reutegger resolved OAK-2632.
---
Resolution: Fixed

Updated in trunk: http://svn.apache.org/r1669356

> Upgrade Jackrabbit dependency to 2.10.0
> ---
>
> Key: OAK-2632
> URL: https://issues.apache.org/jira/browse/OAK-2632
> Project: Jackrabbit Oak
> Issue Type: Task
> Reporter: Michael Dürig
> Assignee: Marcel Reutegger
> Priority: Blocker
> Fix For: 1.1.8
[jira] [Commented] (OAK-2661) Glob restriction test failures on Jenkins
[ https://issues.apache.org/jira/browse/OAK-2661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14382024#comment-14382024 ]

Michael Dürig commented on OAK-2661:

[~anchela], do you know what could be the cause here? Currently this fails roughly 1 out of 5 builds.

> Glob restriction test failures on Jenkins
> -
>
> Key: OAK-2661
> URL: https://issues.apache.org/jira/browse/OAK-2661
> Project: Jackrabbit Oak
> Issue Type: Bug
> Components: core
> Environment: Jenkins, Ubuntu: https://builds.apache.org/job/Apache%20Jackrabbit%20Oak%20matrix/
> Reporter: Michael Dürig
> Labels: CI, jenkins
> Fix For: 1.2
[jira] [Updated] (OAK-2679) Query engine: cache execution plans
[ https://issues.apache.org/jira/browse/OAK-2679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Mueller updated OAK-2679: Attachment: OAK-2679.patch OAK-2679.patch should be slightly better, and allow for clearing the cache via JMX. The cache size is still unlimited, and plans for joins are not cached. > Query engine: cache execution plans > --- > > Key: OAK-2679 > URL: https://issues.apache.org/jira/browse/OAK-2679 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: core, query >Reporter: Thomas Mueller >Assignee: Thomas Mueller > Fix For: 1.3.0 > > Attachments: OAK-2679.patch, executionplancache.patch > > > If there are many indexes, preparing a query can take a long time, in > relation to executing the query. > The query execution plans can be cached. The cache should be invalidated if > there are new indexes, or indexes are changed; a simple solution might be to > use a timeout, and / or a manual cache clean via JMX or so. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
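The caching idea in the description (reuse a prepared plan, drop it after a timeout, allow a manual clear) can be sketched roughly as follows. This is only an illustration and not the code in OAK-2679.patch: the class PlanCacheSketch, its methods and the injected clock are hypothetical names, and the plan type is left generic. The clear() method stands in for what a JMX operation might call; no JMX wiring is shown. As in the comment above, the size is unbounded.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.LongSupplier;

/** Hypothetical plan cache keyed by statement, invalidated by timeout or manually. */
final class PlanCacheSketch<P> {

    /** A cached plan together with the time at which it was prepared. */
    private static final class Entry<P> {
        final P plan;
        final long createdAtMillis;

        Entry(P plan, long createdAtMillis) {
            this.plan = plan;
            this.createdAtMillis = createdAtMillis;
        }
    }

    private final Map<String, Entry<P>> entries = new ConcurrentHashMap<>();
    private final long timeoutMillis;
    private final LongSupplier clock; // injected so that the timeout is testable

    PlanCacheSketch(long timeoutMillis, LongSupplier clock) {
        this.timeoutMillis = timeoutMillis;
        this.clock = clock;
    }

    /** Returns the cached plan, or prepares a new one if it is missing or too old. */
    P get(String statement, Function<String, P> planner) {
        long now = clock.getAsLong();
        Entry<P> entry = entries.get(statement);
        if (entry == null || now - entry.createdAtMillis >= timeoutMillis) {
            // Two threads may both prepare the same plan here; the last write wins.
            entry = new Entry<>(planner.apply(statement), now);
            entries.put(statement, entry);
        }
        return entry.plan;
    }

    /** Manual invalidation, e.g. after an index definition changed. */
    void clear() {
        entries.clear();
    }

    int size() {
        return entries.size();
    }
}
```

A timeout only bounds how long a stale plan can survive after an index change; it does not detect the change itself, which is why the manual clear is kept as a second mechanism.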
[jira] [Updated] (OAK-2688) Segment.readString optimization
[ https://issues.apache.org/jira/browse/OAK-2688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Mueller updated OAK-2688: Attachment: OAK-2688.patch A first patch (I did not yet measure performance) > Segment.readString optimization > --- > > Key: OAK-2688 > URL: https://issues.apache.org/jira/browse/OAK-2688 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: oak-core >Reporter: Thomas Mueller >Assignee: Thomas Mueller > Fix For: 1.3.0 > > Attachments: OAK-2688.patch > > > The method Segment.readString is called a lot, and even a small optimization > would improve performance for some use cases. Currently it uses a concurrent > hash map to cache strings. It might be possible to speed up this cache. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (OAK-2688) Segment.readString optimization
Thomas Mueller created OAK-2688: --- Summary: Segment.readString optimization Key: OAK-2688 URL: https://issues.apache.org/jira/browse/OAK-2688 Project: Jackrabbit Oak Issue Type: Improvement Components: oak-core Reporter: Thomas Mueller Assignee: Thomas Mueller Fix For: 1.3.0 The method Segment.readString is called a lot, and even a small optimization would improve performance for some use cases. Currently it uses a concurrent hash map to cache strings. It might be possible to speed up this cache. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
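One possible direction for a cheaper cache than a concurrent hash map is a fixed-size, lossy array of slots without any locking. The sketch below is an assumption about what such a cache could look like, not the content of OAK-2688.patch: FastStringCacheSketch, the int offset key and the loader callback are all hypothetical, and no performance claim is made.

```java
import java.util.function.IntFunction;

/** Hypothetical lossy string cache: a fixed array of slots, no locking. */
final class FastStringCacheSketch {

    /** Immutable slot, so a racy read sees either a complete slot or none. */
    private static final class Slot {
        final int offset;
        final String value;

        Slot(int offset, String value) {
            this.offset = offset;
            this.value = value;
        }
    }

    private final Slot[] slots;
    private final int mask;

    FastStringCacheSketch(int size) {
        if (size <= 0 || Integer.bitCount(size) != 1) {
            throw new IllegalArgumentException("size must be a power of two");
        }
        this.slots = new Slot[size];
        this.mask = size - 1;
    }

    /** Returns the cached string for the offset, loading it on a miss. */
    String get(int offset, IntFunction<String> loader) {
        // Mix the high bits into the low bits before masking to a slot index.
        int index = (offset ^ (offset >>> 16)) & mask;
        Slot slot = slots[index];
        if (slot != null && slot.offset == offset) {
            return slot.value;
        }
        // Miss or collision: load and overwrite whatever was in the slot.
        String value = loader.apply(offset);
        slots[index] = new Slot(offset, value);
        return value;
    }
}
```

The trade-off is that colliding offsets evict each other, so the hit rate depends on the slot count; in exchange a lookup is one array read and one comparison.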
[jira] [Comment Edited] (OAK-2246) UUID collision check is not does not work in transient space
[ https://issues.apache.org/jira/browse/OAK-2246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14381977#comment-14381977 ] angela edited comment on OAK-2246 at 3/26/15 2:52 PM: -- i partially reverted the changes made with OAK-1037 in my local checkout and the commented 'transient' test passes. so, there it really looks like something is wrong with the changes made in OAK-1037. was (Author: anchela): i partially reverted the changes made with OAK-1037 and the commented 'transient' test passes. so, there it really looks like something is wrong with the changes made in OAK-1037. > UUID collision check is not does not work in transient space > > > Key: OAK-2246 > URL: https://issues.apache.org/jira/browse/OAK-2246 > Project: Jackrabbit Oak > Issue Type: Bug > Components: jcr >Affects Versions: 1.1.1 >Reporter: Tobias Bocanegra >Assignee: Chetan Mehrotra > Fix For: 1.4 > > > I think OAK-1037 broke the system view import. > test case: > 1. create a new node with a uuid (referenceable, or new user) > 2. import systemview with IMPORT_UUID_COLLISION_REPLACE_EXISTING > 3. save() > result: > {noformat} > javax.jcr.nodetype.ConstraintViolationException: OakConstraint0030: > Uniqueness constraint violated at path [/] for one of the property in > [jcr:uuid] having value e358efa4-89f5-3062-b10d-d7316b65649e > {noformat} > expected: > * imported content should replace the existing node - even in transient space. > note: > * if you perform a save() after step 1, everything works. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (OAK-2246) UUID collision check is not does not work in transient space
[ https://issues.apache.org/jira/browse/OAK-2246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14381977#comment-14381977 ] angela commented on OAK-2246: - i partially reverted the changes made with OAK-1037 and the commented 'transient' test passes. so, there it really looks like something is wrong with the changes made in OAK-1037. > UUID collision check is not does not work in transient space > > > Key: OAK-2246 > URL: https://issues.apache.org/jira/browse/OAK-2246 > Project: Jackrabbit Oak > Issue Type: Bug > Components: jcr >Affects Versions: 1.1.1 >Reporter: Tobias Bocanegra >Assignee: Chetan Mehrotra > Fix For: 1.4 > > > I think OAK-1037 broke the system view import. > test case: > 1. create a new node with a uuid (referenceable, or new user) > 2. import systemview with IMPORT_UUID_COLLISION_REPLACE_EXISTING > 3. save() > result: > {noformat} > javax.jcr.nodetype.ConstraintViolationException: OakConstraint0030: > Uniqueness constraint violated at path [/] for one of the property in > [jcr:uuid] having value e358efa4-89f5-3062-b10d-d7316b65649e > {noformat} > expected: > * imported content should replace the existing node - even in transient space. > note: > * if you perform a save() after step 1, everything works. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (OAK-2476) Move our CI to Jenkins
[ https://issues.apache.org/jira/browse/OAK-2476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14381905#comment-14381905 ] Tommaso Teofili commented on OAK-2476: -- with the help of [~mduerig] we enabled notifications to oak-dev@ :) > Move our CI to Jenkins > -- > > Key: OAK-2476 > URL: https://issues.apache.org/jira/browse/OAK-2476 > Project: Jackrabbit Oak > Issue Type: Task >Reporter: Tommaso Teofili >Assignee: Tommaso Teofili >Priority: Critical > Labels: CI, build, infrastructure > Fix For: 1.2 > > > We should strive to stabilize our CI setup; so far we have used Buildbot > and Travis. > It seems ASF Jenkins can run jobs on different environments (*nix, > Windows and others), so we can evaluate it and check whether it better addresses > our needs.
[jira] [Resolved] (OAK-2658) Test failures in TarMK standby: Address already in use
[ https://issues.apache.org/jira/browse/OAK-2658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Dürig resolved OAK-2658. Resolution: Fixed There was a typo on the setup. Fixed at http://svn.apache.org/r1669323 > Test failures in TarMK standby: Address already in use > -- > > Key: OAK-2658 > URL: https://issues.apache.org/jira/browse/OAK-2658 > Project: Jackrabbit Oak > Issue Type: Bug > Components: oak-tarmk-standby > Environment: Jenkins, Ubuntu: > https://builds.apache.org/job/Apache%20Jackrabbit%20Oak%20matrix/ >Reporter: Michael Dürig >Assignee: Michael Dürig > Labels: CI, jenkins > Fix For: 1.2 > > > The following tests fail probably all for the same reason: > {noformat} > testProxySkippedBytes(org.apache.jackrabbit.oak.plugins.segment.standby.ExternalSharedStoreIT): > proxy not started > testProxySkippedBytesIntermediateChange(org.apache.jackrabbit.oak.plugins.segment.standby.ExternalSharedStoreIT): > proxy not started > testProxyFlippedStartByte(org.apache.jackrabbit.oak.plugins.segment.standby.ExternalSharedStoreIT): > proxy not started > testProxyFlippedIntermediateByte(org.apache.jackrabbit.oak.plugins.segment.standby.ExternalSharedStoreIT): > proxy not started > testProxyFlippedIntermediateByte2(org.apache.jackrabbit.oak.plugins.segment.standby.ExternalSharedStoreIT): > proxy not started > testProxyFlippedIntermediateByteChange(org.apache.jackrabbit.oak.plugins.segment.standby.ExternalSharedStoreIT): > proxy not started > testProxyFlippedIntermediateByteChange2(org.apache.jackrabbit.oak.plugins.segment.standby.ExternalSharedStoreIT): > proxy not started > {noformat} > Stacktraces always look something like: > {noformat} > java.lang.Exception: proxy not started > at > org.apache.jackrabbit.oak.plugins.segment.NetworkErrorProxy.reset(NetworkErrorProxy.java:87) > at > org.apache.jackrabbit.oak.plugins.segment.standby.DataStoreTestBase.useProxy(DataStoreTestBase.java:176) > at > 
org.apache.jackrabbit.oak.plugins.segment.standby.DataStoreTestBase.testProxySkippedBytes(DataStoreTestBase.java:118) > {noformat} > See > https://builds.apache.org/job/Apache%20Jackrabbit%20Oak%20matrix/31/jdk=latest1.7,label=Ubuntu,nsfixtures=DOCUMENT_NS,profile=integrationTesting/console -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (OAK-2246) UUID collision check is not does not work in transient space
[ https://issues.apache.org/jira/browse/OAK-2246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14381754#comment-14381754 ] angela commented on OAK-2246: - added (commented) test case in rev. 1669327 > UUID collision check is not does not work in transient space > > > Key: OAK-2246 > URL: https://issues.apache.org/jira/browse/OAK-2246 > Project: Jackrabbit Oak > Issue Type: Bug > Components: jcr >Affects Versions: 1.1.1 >Reporter: Tobias Bocanegra >Assignee: Chetan Mehrotra > Fix For: 1.4 > > > I think OAK-1037 broke the system view import. > test case: > 1. create a new node with a uuid (referenceable, or new user) > 2. import systemview with IMPORT_UUID_COLLISION_REPLACE_EXISTING > 3. save() > result: > {noformat} > javax.jcr.nodetype.ConstraintViolationException: OakConstraint0030: > Uniqueness constraint violated at path [/] for one of the property in > [jcr:uuid] having value e358efa4-89f5-3062-b10d-d7316b65649e > {noformat} > expected: > * imported content should replace the existing node - even in transient space. > note: > * if you perform a save() after step 1, everything works. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (OAK-2687) Introduce Dynamic Groups
[ https://issues.apache.org/jira/browse/OAK-2687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14381726#comment-14381726 ] angela commented on OAK-2687: - the concept of dynamic groups might be convenient in similar cases, where groups contain "everyone-matching-certain-characteristics" as members and those characteristics can easily be determined during the authentication step (e.g. specific credentials, credentials attributes, login-name matching certain patterns etc). > Introduce Dynamic Groups > > > Key: OAK-2687 > URL: https://issues.apache.org/jira/browse/OAK-2687 > Project: Jackrabbit Oak > Issue Type: New Feature > Components: core, jcr >Reporter: angela >Assignee: angela > Fix For: 1.4 > > > we may consider extending the jackrabbit user management API by the concept > of dynamic groups that would have the following characteristics: > - the group in the repository is just a marker > - the group members are not stored with the group and are not revealed by > regular membership operations such as 'getMembers', 'getDeclaredMembers', > 'memberOf', 'declaredMemberOf' > - the dynamic group membership is only evaluated upon authentication (e.g. in > the principal provider implementation) based on implementation details both > in the principal provider and the login module. > one example to illustrate the concept of the dynamic groups is the 'Everyone' > principal where every principal of the default principal management > implementation is member of. for consistency, this group principal already > requires special treatment in the user management implementation in case > there exists an 'everyone' group (match by principal name only). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (OAK-2687) Introduce Dynamic Groups
angela created OAK-2687: --- Summary: Introduce Dynamic Groups Key: OAK-2687 URL: https://issues.apache.org/jira/browse/OAK-2687 Project: Jackrabbit Oak Issue Type: New Feature Components: core, jcr Reporter: angela Fix For: 1.4 we may consider extending the jackrabbit user management API by the concept of dynamic groups that would have the following characteristics: - the group in the repository is just a marker - the group members are not stored with the group and are not revealed by regular membership operations such as 'getMembers', 'getDeclaredMembers', 'memberOf', 'declaredMemberOf' - the dynamic group membership is only evaluated upon authentication (e.g. in the principal provider implementation) based on implementation details both in the principal provider and the login module. one example to illustrate the concept of the dynamic groups is the 'Everyone' principal where every principal of the default principal management implementation is member of. for consistency, this group principal already requires special treatment in the user management implementation in case there exists an 'everyone' group (match by principal name only). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
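To make the proposal a bit more concrete, here is a minimal sketch of membership that is evaluated only at authentication time from login attributes. Everything in it is hypothetical (DynamicGroupSketch, the attribute map, the "loginName" key); it does not use the Jackrabbit user management API and says nothing about how a principal provider or login module would actually be wired.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;
import java.util.function.Predicate;

/** Hypothetical registry of dynamic groups; members are computed, never stored. */
final class DynamicGroupSketch {

    // group name -> rule deciding membership from the attributes known at login
    private final Map<String, Predicate<Map<String, String>>> rules = new LinkedHashMap<>();

    /** Registers a marker group together with its membership rule. */
    void register(String groupName, Predicate<Map<String, String>> rule) {
        rules.put(groupName, rule);
    }

    /** Evaluated once per authentication: returns the names of all matching groups. */
    Set<String> resolve(Map<String, String> loginAttributes) {
        Set<String> groups = new LinkedHashSet<>();
        for (Map.Entry<String, Predicate<Map<String, String>>> rule : rules.entrySet()) {
            if (rule.getValue().test(loginAttributes)) {
                groups.add(rule.getKey());
            }
        }
        return groups;
    }
}
```

In this sketch an 'everyone' group would simply be a rule that always matches, while a rule such as "login name starts with ext-" covers the pattern-matching case mentioned in the comment on this issue.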
[jira] [Assigned] (OAK-2687) Introduce Dynamic Groups
[ https://issues.apache.org/jira/browse/OAK-2687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] angela reassigned OAK-2687: --- Assignee: angela > Introduce Dynamic Groups > > > Key: OAK-2687 > URL: https://issues.apache.org/jira/browse/OAK-2687 > Project: Jackrabbit Oak > Issue Type: New Feature > Components: core, jcr >Reporter: angela >Assignee: angela > Fix For: 1.4 > > > we may consider extending the jackrabbit user management API by the concept > of dynamic groups that would have the following characteristics: > - the group in the repository is just a marker > - the group members are not stored with the group and are not revealed by > regular membership operations such as 'getMembers', 'getDeclaredMembers', > 'memberOf', 'declaredMemberOf' > - the dynamic group membership is only evaluated upon authentication (e.g. in > the principal provider implementation) based on implementation details both > in the principal provider and the login module. > one example to illustrate the concept of the dynamic groups is the 'Everyone' > principal where every principal of the default principal management > implementation is member of. for consistency, this group principal already > requires special treatment in the user management implementation in case > there exists an 'everyone' group (match by principal name only). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (OAK-2476) Move our CI to Jenkins
[ https://issues.apache.org/jira/browse/OAK-2476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14381723#comment-14381723 ] Tommaso Teofili commented on OAK-2476: -- yay! :) > Move our CI to Jenkins > -- > > Key: OAK-2476 > URL: https://issues.apache.org/jira/browse/OAK-2476 > Project: Jackrabbit Oak > Issue Type: Task >Reporter: Tommaso Teofili >Assignee: Tommaso Teofili >Priority: Critical > Labels: CI, build, infrastructure > Fix For: 1.2 > > > We should strive to stabilize our CI setup; so far we have used Buildbot > and Travis. > It seems ASF Jenkins can run jobs on different environments (*nix, > Windows and others), so we can evaluate it and check whether it better addresses > our needs.
[jira] [Created] (OAK-2686) Persistent cache: log activity and timing data, and possible optimizations
Thomas Mueller created OAK-2686: --- Summary: Persistent cache: log activity and timing data, and possible optimizations Key: OAK-2686 URL: https://issues.apache.org/jira/browse/OAK-2686 Project: Jackrabbit Oak Issue Type: Improvement Components: core Reporter: Thomas Mueller Assignee: Thomas Mueller Fix For: 1.3.0 The persistent cache most likely reduces performance in some use cases, but currently it's hard to find out whether that's the case. Activity should be captured (and logged at debug level) if possible, for example writing, reading, writing in the foreground / background, opening and closing, switching the generation, moving entries from the old to the new generation. Adding entries to the cache could be completely decoupled from the foreground thread if they are added to the persistent cache in a separate thread. It might be better to only write entries if they were accessed often. To do this, entries could be put in the persistent cache once they are evicted from the in-memory cache, instead of when they are added to the cache. If that's done, we would maintain some data (for example the access count) on which we can filter.
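The "write on eviction, filtered by access count" idea could look roughly like the sketch below. It is a sketch under stated assumptions only: EvictionWriteSketch is a hypothetical class, a plain HashMap stands in for the persistent cache, the in-memory cache is a simple LRU built on java.util.LinkedHashMap, and the background thread mentioned above is left out.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical in-memory LRU that persists entries only when they are evicted. */
final class EvictionWriteSketch<K, V> {

    /** Value plus the number of times it was read while in memory. */
    private static final class Counted<V> {
        final V value;
        int accessCount;

        Counted(V value) {
            this.value = value;
        }
    }

    // Stand-in for the persistent cache (assumption: a plain map).
    private final Map<K, V> persistent = new HashMap<>();
    private final LinkedHashMap<K, Counted<V>> memory;

    EvictionWriteSketch(final int maxEntries, final int minAccessCount) {
        // accessOrder = true makes the eldest entry the least recently used one.
        this.memory = new LinkedHashMap<K, Counted<V>>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, Counted<V>> eldest) {
                if (size() <= maxEntries) {
                    return false;
                }
                // Filter: only entries that were read often enough are persisted.
                if (eldest.getValue().accessCount >= minAccessCount) {
                    persistent.put(eldest.getKey(), eldest.getValue().value);
                }
                return true;
            }
        };
    }

    synchronized void put(K key, V value) {
        memory.put(key, new Counted<>(value));
    }

    /** Reads from memory first, then from the persistent stand-in. */
    synchronized V get(K key) {
        Counted<V> counted = memory.get(key);
        if (counted != null) {
            counted.accessCount++;
            return counted.value;
        }
        return persistent.get(key);
    }

    synchronized boolean isPersisted(K key) {
        return persistent.containsKey(key);
    }
}
```

With this shape, entries that are added and never read again cost no persistent write at all, which is the case the description suspects of hurting performance.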
[jira] [Commented] (OAK-2476) Move our CI to Jenkins
[ https://issues.apache.org/jira/browse/OAK-2476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14381704#comment-14381704 ] Michael Dürig commented on OAK-2476: First build passed, finally! https://builds.apache.org/job/Apache%20Jackrabbit%20Oak%20matrix/54/ > Move our CI to Jenkins > -- > > Key: OAK-2476 > URL: https://issues.apache.org/jira/browse/OAK-2476 > Project: Jackrabbit Oak > Issue Type: Task >Reporter: Tommaso Teofili >Assignee: Tommaso Teofili >Priority: Critical > Labels: CI, build, infrastructure > Fix For: 1.2 > > > We should strive to stabilize our CI setup; so far we have used Buildbot > and Travis. > It seems ASF Jenkins can run jobs on different environments (*nix, > Windows and others), so we can evaluate it and check whether it better addresses > our needs.
[jira] [Commented] (OAK-2669) Use Consolidated diff for local changes with persistent cache to avoid calculating diff again
[ https://issues.apache.org/jira/browse/OAK-2669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14381683#comment-14381683 ] Michael Dürig commented on OAK-2669: I'd prefer if we could avoid passing the {{ContentChangeInfo}} instances through all the calls. Maybe OAK-2685 opens new approaches here? Otherwise we should aim to make it more general so other implementations could plugin whatever need comes up for them in the future. OTOH, let's first evaluate this to see whether it helps addressing the current performance problems. Once we have a clearer picture on that we can take up the design discussions again. > Use Consolidated diff for local changes with persistent cache to avoid > calculating diff again > - > > Key: OAK-2669 > URL: https://issues.apache.org/jira/browse/OAK-2669 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: mongomk >Reporter: Chetan Mehrotra > Fix For: 1.3.0 > > Attachments: OAK-2669-A.patch > > > Currently the diff logic in DocumentMK makes use of DiffCache which has an in > memory implementation and a Mongo based implementation. Given that we need to > have a fast observation support for local changes it would be better to make > use of persistent cache. After discussing with [~mreutegg] following changes > need to be done in current logic > # Have the Commit#applyChanges push the commit diff to persistent cache with > current commit revision as key > # In compare pull out the diff from persistent cache and if present use that. > Note that this diff is for complete tree compared to current JSOP diff used > which is only per node level. So need to change the way diff is pushed back > to NodeStateDiff > Above change should avoid hitting mongo all together for determining the > diff. Only extra work performed in diff calculation would be determining the > node state view for the base revision. 
Later we can think of also including > the node state base revision as part of the diff so as to avoid this extra work > altogether and rely on the node state from the persistent cache for that work as well > See also http://markmail.org/thread/bzmwcp7k4wmtw6od
[jira] [Commented] (OAK-2680) Report a full observation queue situation to the logfile
[ https://issues.apache.org/jira/browse/OAK-2680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14381670#comment-14381670 ] Marc Pfaff commented on OAK-2680: - IMHO this hook is already available, with the BackgroundObserver's abstract added() method, no? It does allow clients to know about the current queue size and do logging and the like based on that, like the ChangeProcessor for example. > Report a full observation queue situation to the logfile > > > Key: OAK-2680 > URL: https://issues.apache.org/jira/browse/OAK-2680 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: oak-core >Affects Versions: 1.1.7 >Reporter: Marc Pfaff >Assignee: Michael Dürig >Priority: Minor > Fix For: 1.1.8 > > Attachments: OAK-2680.patch > > > This is an improvement request for an explicit warning in the log file > when the BackgroundObserver's queue maximum is reached. > Currently, in that case, a warning is logged from the ChangeProcessor > observer only. But as each observer has its own queue, a warning from a more > central place covering all observers would be helpful.
[jira] [Created] (OAK-2685) Track root state revision when reading the tree
Marcel Reutegger created OAK-2685: - Summary: Track root state revision when reading the tree Key: OAK-2685 URL: https://issues.apache.org/jira/browse/OAK-2685 Project: Jackrabbit Oak Issue Type: Improvement Components: core, mongomk Reporter: Marcel Reutegger Assignee: Marcel Reutegger Currently the DocumentNodeState has two revisions: - {{getRevision()}} returns the read revision of this node state. This revision was used to read the node state from the underlying {{NodeDocument}}. - {{getLastRevision()}} returns the revision when this node state was last modified. This revision also reflects changes done further below the tree when the node state was not directly affected by a change. The lastRevision of a state is then used as the read revision of the child node states. This avoids reading the entire tree again with a different revision after the head revision changed because of a commit. This approach has at least two problems related to comparing node states: - It does not work well with the current DiffCache implementation and affects the hit rate of this cache. The DiffCache is pro-actively populated after a commit. The key for a diff is a combination of previous and current commit revision and the path. The value then tells what child nodes were added/removed/changed. As the comparison of node states proceeds and traverses the tree, the revision of a state may go back in time because the lastRevision is used as the read revision of the child nodes. This will cause misses in the diff cache, because the revisions do not match the previous and current commit revisions as used to create the cache entries. OAK-2562 tried to address this by keeping the read revision for child nodes at the read revision of the parent in calls of compareAgainstBaseState() when there is a diff cache hit. However, it turns out node state comparison does not always start at the root state. 
The {{EventQueue}} implementation in oak-jcr will start at the paths as indicated by the filter of the listener. This means, OAK-2562 is not effective in this case and the diff needs to be calculated again based on a set of revisions, which is different from the original commit. - When a diff is calculated for a parent with many child nodes, the {{DocumentNodeStore}} will perform a query on the underlying {{DocumentStore}} to get child nodes modified after a given timestamp. This timestamp is derived from the lower revision of the two lastRevisions of the parent node states to compare. The query gets problematic for the {{DocumentStore}} if the timestamp is too far in the past. This will happen when the parent node (and sub-tree) was not modified for some time. E.g. the {{MongoDocumentStore}} has an index on the _id and the _modified field. But if there are many child nodes the _id index will not be that helpful and if the timestamp is too far in the past, the _modified index is not selective either. This problem was already reported in OAK-1970 and linked issues. Both of the above problems could be addressed by keeping track of the read revision of the root node state in each of the node states as the tree is traversed. The revision of the root state would then be used e.g. to derive the timestamp for the _modified constraint in the query. Because the revision of the root state is rather recent, the _modified constraint is very selective and the index on it would be the preferred choice. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
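The effect on the _modified constraint can be shown with a small worked example. The sketch below is a deliberately simplified model and not DocumentNodeState: revisions are reduced to plain long timestamps, RevisionTrackingSketch and its methods are hypothetical, and taking the lower of the two root revisions is an assumption about how the timestamp would be derived.

```java
/** Hypothetical model: each state remembers the read revision of its root. */
final class RevisionTrackingSketch {

    static final class State {
        final String path;
        final long rootRevision; // read revision of the root this state was reached from
        final long lastRevision; // when this state (or something below it) last changed

        State(String path, long rootRevision, long lastRevision) {
            this.path = path;
            this.rootRevision = rootRevision;
            this.lastRevision = lastRevision;
        }

        /** The root revision is handed down unchanged while traversing the tree. */
        State child(String name, long childLastRevision) {
            String childPath = path.equals("/") ? "/" + name : path + "/" + name;
            return new State(childPath, rootRevision, childLastRevision);
        }
    }

    /** Current approach: timestamp from the lower of the two lastRevisions. */
    static long currentModifiedSince(State base, State current) {
        return Math.min(base.lastRevision, current.lastRevision);
    }

    /** Proposed approach (assumed form): timestamp from the lower root revision. */
    static long proposedModifiedSince(State base, State current) {
        return Math.min(base.rootRevision, current.rootRevision);
    }
}
```

For a parent that was last touched at time 100 and is compared between roots read at 1000 and 1010, the current derivation asks for children modified since 100, while the root-based one asks for children modified since 1000, which is the more selective constraint the description is after.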
[jira] [Commented] (OAK-2680) Report a full observation queue situation to the logfile
[ https://issues.apache.org/jira/browse/OAK-2680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14381649#comment-14381649 ] Michael Dürig commented on OAK-2680: I agree with the problem but not with the proposed solution. I think we are starting to mix concerns too much here. Maybe we should better remove the logging again and provide better hooks for clients to do the logging themselves. > Report a full observation queue situation to the logfile > > > Key: OAK-2680 > URL: https://issues.apache.org/jira/browse/OAK-2680 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: oak-core >Affects Versions: 1.1.7 >Reporter: Marc Pfaff >Assignee: Michael Dürig >Priority: Minor > Fix For: 1.1.8 > > Attachments: OAK-2680.patch > > > This is an improvement request for an explicit warning in the log file > when the BackgroundObserver's queue maximum is reached. > Currently, in that case, a warning is logged from the ChangeProcessor > observer only. But as each observer has its own queue, a warning from a more > central place covering all observers would be helpful.
[jira] [Updated] (OAK-1327) Cleanup NodeStore and MK implementations
[ https://issues.apache.org/jira/browse/OAK-1327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Dürig updated OAK-1327: --- Labels: modularization (was: ) > Cleanup NodeStore and MK implementations > > > Key: OAK-1327 > URL: https://issues.apache.org/jira/browse/OAK-1327 > Project: Jackrabbit Oak > Issue Type: Wish > Components: core, mk, segmentmk >Reporter: angela > Labels: modularization > Fix For: 1.4 > > Attachments: OAK-1327.patch > > > as discussed during the oak-call today, i would like to cleanup the code base > before we officially release OAK. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (OAK-2669) Use Consolidated diff for local changes with persistent cache to avoid calculating diff again
[ https://issues.apache.org/jira/browse/OAK-2669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14381524#comment-14381524 ] Chetan Mehrotra edited comment on OAK-2669 at 3/26/15 7:56 AM: --- attaching the [initial patch|^OAK-2669-A.patch] to get a review done for the approach being taken *Changes related to Observation* * Introduces a new {{ContentChangeInfoProvider}} which provides a {{ContentChangeInfo}} which encapsulates the before, after and commit info * Observation logic (i.e. {{Continuation}} would ensure that {{NodeStateDiff}} implementation passed to the {{NodeState#compareAgainstBaseState}} implements the new provider interface and thus provides access to the root states and commit info details * {{DocumentNodeStore}} compare logic would check if the diff is of new type and then extracts the root nodeState for the commit. Using the root nodeState revision and path for which diff needs to be performed it looks up in the {{LocalDiffCache}} *Changes related to diff handling* * Introduces a {{LocalDiffCache}} which captures the local changes provided during commit and consolidates the diff across various changed path and caches it with key being the commit revision * It also support {{PersistentCache}} * If required this feature can be disabled. However once enabled there would be two diff related caches ** diffCache - This is used as per current usage. However if local diff cache is enabled then local diff would *not be pushed* to this cache. However while doing a diff calculation this cache would be used if there is a miss in localDiffCache or the change is external ** localDiffCache - Cache solely dedicated to capture the local changes diff *ToDo* * Need to determine the best way to serialize the consolidated diff as string. The diff string are again json encoded its not possible to serialize the consolidated diff as JSON. 
Current code uses a crude encoding and decoding logic [~mreutegg] [~mduerig] Can you review the approach taken. In the meantime I am working on adding more testcases [~tmueller] Any thoughts on best way to serialize the {{ConsolidatedDiff}} for persistent cache. Also do have a look on persistent cache integration was (Author: chetanm): attaching the [initial patch|^OAK-2669-A.patch] to get a review done for the approach being taken *Changes related to Observation* * Introduces a new {{ContentChangeInfoProvider}} which provides a {{ContentChangeInfo}} which encapsulates the before, after and commit info * Observation logic (i.e. {{Continuation}} would ensure that {{NodeStateDiff}} implementation passed to the {{NodeState#compareAgainstBaseState}} implements the new provider interface and thus provides access to the root states and commit info details * {{DocumentNodeStore}} compare logic would check if the diff is of new type and then extracts the root nodeState for the commit. Using the root nodeState revision and path for which diff needs to be performed it looks up in the {{LocalDiffCache}} *Changes related to diff handling* * Introduces a {{LocalDiffCache}} which captures the local changes provided during commit and consolidates the diff across various changed path and caches it with key being the commit revision * It also support {{PersistentCache}} * If required this feature can be disabled. However once enabled there would be two diff related caches ** diffCache - This is used as per current usage. However if local diff cache is enabled then local diff would *not be pushed* to this cache. However while doing a diff calculation this cache would be used if there is a miss in localDiffCache or the change is external ** localDiffCache - Cache solely dedicated to capture the local changes diff *ToDo* * Need to determine the best way to serialize the consolidated diff as string. 
The diff string are again json encoded its not possible to serialize the consolidated diff as JSON. Current code uses a crude encoding and decoding logic [~mreutegg] [~mduerig] Can you review the approach taken. In the meantime I am working on adding more testcases [~tmueller] Any thoughts on best way to serialize the {{ConsolidatedDiff}} for persistent cache > Use Consolidated diff for local changes with persistent cache to avoid > calculating diff again > - > > Key: OAK-2669 > URL: https://issues.apache.org/jira/browse/OAK-2669 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: mongomk >Reporter: Chetan Mehrotra > Fix For: 1.3.0 > > Attachments: OAK-2669-A.patch > > > Currently the diff logic in DocumentMK makes use of DiffCache which has an in > memory implementation and a Mongo based implementation. Given that we need to > have a fast obse
[jira] [Updated] (OAK-2669) Use Consolidated diff for local changes with persistent cache to avoid calculating diff again
[ https://issues.apache.org/jira/browse/OAK-2669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chetan Mehrotra updated OAK-2669:
---------------------------------
    Attachment: OAK-2669-A.patch

Attaching the [initial patch|^OAK-2669-A.patch] to get a review done for the approach being taken.

*Changes related to Observation*
* Introduces a new {{ContentChangeInfoProvider}} which provides a {{ContentChangeInfo}} encapsulating the before state, the after state and the commit info
* Observation logic (i.e. {{Continuation}}) would ensure that the {{NodeStateDiff}} implementation passed to {{NodeState#compareAgainstBaseState}} implements the new provider interface and thus provides access to the root states and commit info details
* {{DocumentNodeStore}} compare logic would check if the diff is of the new type and then extract the root nodeState for the commit. Using the root nodeState revision and the path for which the diff needs to be performed, it looks up the diff in the {{LocalDiffCache}}

*Changes related to diff handling*
* Introduces a {{LocalDiffCache}} which captures the local changes provided during commit, consolidates the diff across the various changed paths and caches it with the commit revision as key
* It also supports {{PersistentCache}}
* If required this feature can be disabled. However once enabled there would be two diff related caches
** diffCache - This is used as per current usage. However if the local diff cache is enabled then the local diff would *not be pushed* to this cache. While doing a diff calculation this cache would still be used if there is a miss in localDiffCache or the change is external
** localDiffCache - Cache solely dedicated to capturing the diff of local changes

*ToDo*
* Need to determine the best way to serialize the consolidated diff as a string. The diff strings are themselves JSON encoded, so it is not possible to serialize the consolidated diff as JSON. Current code uses a crude encoding and decoding logic

[~mreutegg] [~mduerig] Can you review the approach taken? In the meantime I am working on adding more testcases.

[~tmueller] Any thoughts on the best way to serialize the {{ConsolidatedDiff}} for the persistent cache?

> Use Consolidated diff for local changes with persistent cache to avoid calculating diff again
> ---------------------------------------------------------------------------------------------
>
>                 Key: OAK-2669
>                 URL: https://issues.apache.org/jira/browse/OAK-2669
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: mongomk
>            Reporter: Chetan Mehrotra
>             Fix For: 1.3.0
>
>         Attachments: OAK-2669-A.patch
>
>
> Currently the diff logic in DocumentMK makes use of DiffCache, which has an
> in-memory implementation and a Mongo based implementation. Given that we need
> fast observation support for local changes, it would be better to make use of
> the persistent cache. After discussing with [~mreutegg], the following changes
> need to be done in the current logic
> # Have Commit#applyChanges push the commit diff to the persistent cache with
> the current commit revision as key
> # In compare, pull the diff out of the persistent cache and, if present, use it.
> Note that this diff is for the complete tree, whereas the JSOP diff currently
> used is only per node level. So the way the diff is pushed back to
> NodeStateDiff needs to change
> The above change should avoid hitting Mongo altogether for determining the
> diff. The only extra work performed in the diff calculation would be
> determining the node state view for the base revision. Later we can also
> consider including the node state base revision as part of the diff, so as to
> avoid this extra work altogether and rely on the node state from the
> persistent cache for that work as well
> See also http://markmail.org/thread/bzmwcp7k4wmtw6od

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
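The lookup order described for the two caches can be sketched roughly as follows. This is a simplified illustration, not code from the patch. It uses plain maps instead of Oak's cache classes, treats revisions as strings, and assumes that a path missing from a commit's consolidated diff means nothing changed at that path. {{TwoLevelDiffLookup}}, {{DiffLoader}} and all method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch only: hypothetical stand-in, not Oak's DiffCache or LocalDiffCache.
final class TwoLevelDiffLookup {

    /** Computes a diff from the backing store when both caches miss. */
    interface DiffLoader {
        String load(String fromRev, String toRev, String path);
    }

    // commit revision -> (path -> diff); filled at commit time, local changes only
    private final Map<String, Map<String, String>> localDiffCache = new HashMap<>();

    // "from/to@path" -> diff; the existing cache, also used for external changes
    private final Map<String, String> diffCache = new HashMap<>();

    /** Called on commit: stores the consolidated diff under the commit revision. */
    void commitLocal(String commitRev, Map<String, String> diffsByPath) {
        localDiffCache.put(commitRev, new HashMap<>(diffsByPath));
    }

    /** Looks in the local diff cache first, then the diff cache, then the loader. */
    String getChanges(String fromRev, String toRev, String path, DiffLoader loader) {
        Map<String, String> local = localDiffCache.get(toRev);
        if (local != null) {
            String diff = local.get(path);
            // Assumption: a path absent from the consolidated diff was not changed.
            return diff != null ? diff : "";
        }
        // Miss in the local cache (or an external change): use the regular diff cache.
        String key = fromRev + "/" + toRev + "@" + path;
        String diff = diffCache.get(key);
        if (diff == null) {
            diff = loader.load(fromRev, toRev, path);
            diffCache.put(key, diff);
        }
        return diff;
    }
}
```

Note that the sketch only matches on the target revision, so it assumes the requested diff is for exactly that commit. Local diffs are never written to {{diffCache}}, which mirrors the "not be pushed" behaviour described in the comment.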
[jira] [Commented] (OAK-2641) FilterImpl violates nullability contract
[ https://issues.apache.org/jira/browse/OAK-2641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14381499#comment-14381499 ]

Thomas Mueller commented on OAK-2641:
-------------------------------------

Both patches are OK for me, the second one is probably a bit better. Should I commit it?

> FilterImpl violates nullability contract
> ----------------------------------------
>
>                 Key: OAK-2641
>                 URL: https://issues.apache.org/jira/browse/OAK-2641
>             Project: Jackrabbit Oak
>          Issue Type: Bug
>          Components: core
>            Reporter: Michael Dürig
>             Fix For: 1.2
>
>         Attachments: OAK-2641.patch, OAK-2641_2.patch
>
>
> {{FilterImpl#getSupertypes}}, {{FilterImpl#getPrimaryTypes}} and
> {{FilterImpl#getMixinTypes}} might all return {{null}} although {{Filter}}'s
> contract mandates \@Nonnull.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
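For illustration, one common way to honour a non-null contract on a getter is to return an empty set when the field was never set. The sketch below shows that pattern only. It is not taken from either attached patch and may differ from what was committed; {{TypeFilterSketch}} is a hypothetical class, not {{FilterImpl}}.

```java
import java.util.Collections;
import java.util.Set;

// Hypothetical class illustrating a null-safe getter; not Oak's FilterImpl.
final class TypeFilterSketch {

    // May be left unset by callers, so the field itself can be null.
    private Set<String> supertypes;

    void setSupertypes(Set<String> supertypes) {
        this.supertypes = supertypes;
    }

    // Never returns null: an unset field is reported as an empty set.
    Set<String> getSupertypes() {
        return supertypes == null ? Collections.<String>emptySet() : supertypes;
    }
}
```

Whether an empty set is the right substitute depends on how callers interpret "no types", which the issue text does not specify.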