[jira] [Updated] (OAK-3529) NodeStore API should expose an Instance ID
[ https://issues.apache.org/jira/browse/OAK-3529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella updated OAK-3529: -- Attachment: OAK-3529-3.patch Attaching a [third patch|^OAK-3529-3.patch]; the previous one missed the OSGi aspects of the DocumentNodeStore registration and the baseline plugin complaints. [~mduerig], please review. > NodeStore API should expose an Instance ID > -- > > Key: OAK-3529 > URL: https://issues.apache.org/jira/browse/OAK-3529 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: core >Reporter: Davide Giannella >Assignee: Davide Giannella > Fix For: 1.4 > > Attachments: OAK-3529-1.patch, OAK-3529-2.patch, OAK-3529-3.patch > > > To better leverage cluster-oriented algorithms (discovery, atomic > operations), it would be very helpful if the NodeStore could expose a > unique instance id. > This can be the same as a cluster ID. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
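The idea in the description is small enough to sketch. The following is a hypothetical illustration, not the attached patch: the class and method names are invented, and the fallback to a random id for standalone stores is an assumption beyond what the issue states.

```java
import java.util.UUID;

// Hypothetical sketch of the OAK-3529 proposal (not the actual patch):
// expose a unique instance id, reusing the cluster id when one exists
// and falling back to a random id for standalone stores.
public class InstanceIdSketch {
    /** Returns a stable id for this store instance. */
    static String instanceId(Integer clusterId) {
        return clusterId != null
                ? clusterId.toString()          // same as the cluster ID
                : UUID.randomUUID().toString(); // standalone fallback
    }
}
```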
[jira] [Updated] (OAK-3637) Bulk document updates in RDBDocumentStore
[ https://issues.apache.org/jira/browse/OAK-3637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tomek Rękawek updated OAK-3637: --- Attachment: (was: OAK-3637.patch) > Bulk document updates in RDBDocumentStore > - > > Key: OAK-3637 > URL: https://issues.apache.org/jira/browse/OAK-3637 > Project: Jackrabbit Oak > Issue Type: Technical task > Components: rdbmk >Reporter: Tomek Rękawek > Fix For: 1.4 > > Attachments: OAK-3637.patch > > > Implement the [batch createOrUpdate|OAK-3662] in the RDBDocumentStore. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (OAK-3714) RDBDocumentStore diagnostics for Oracle might not contain index information
[ https://issues.apache.org/jira/browse/OAK-3714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Julian Reschke updated OAK-3714: Fix Version/s: 1.2.9 > RDBDocumentStore diagnostics for Oracle might not contain index information > --- > > Key: OAK-3714 > URL: https://issues.apache.org/jira/browse/OAK-3714 > Project: Jackrabbit Oak > Issue Type: Technical task > Components: documentmk, rdbmk >Affects Versions: 1.3.11, 1.2.8, 1.0.24 >Reporter: Julian Reschke >Assignee: Julian Reschke >Priority: Minor > Fix For: 1.2.9, 1.3.12 > > > ...when the table name contains lowercase characters. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (OAK-3716) Rewrite change on split
Marcel Reutegger created OAK-3716: - Summary: Rewrite change on split Key: OAK-3716 URL: https://issues.apache.org/jira/browse/OAK-3716 Project: Jackrabbit Oak Issue Type: Improvement Components: core, documentmk Reporter: Marcel Reutegger For continuous revision GC and performance reasons it may be beneficial to rewrite changes when they are split to a previous document. A _commitRoot entry should be replaced with a corresponding _revisions entry and the revision of a branch change replaced with the corresponding merge revision. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (OAK-3611) upgrade H2DB dependency to 1.4.190
[ https://issues.apache.org/jira/browse/OAK-3611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15035862#comment-15035862 ] Julian Reschke commented on OAK-3611: - ...if this is the right thing for trunk, the same should be true for 1.2 and 1.0, no? > upgrade H2DB dependency to 1.4.190 > -- > > Key: OAK-3611 > URL: https://issues.apache.org/jira/browse/OAK-3611 > Project: Jackrabbit Oak > Issue Type: Task > Components: core >Reporter: Julian Reschke >Assignee: Thomas Mueller > Fix For: 1.3.12 > > > (we are currently at 1.4.185) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (OAK-3701) Exception in JcrRemotingServlet at startup
[ https://issues.apache.org/jira/browse/OAK-3701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetan Mehrotra resolved OAK-3701. -- Resolution: Fixed Backing fix in Jackrabbit would be part of 2.11.3. On Oak side switched to xml config with 1717633 > Exception in JcrRemotingServlet at startup > -- > > Key: OAK-3701 > URL: https://issues.apache.org/jira/browse/OAK-3701 > Project: Jackrabbit Oak > Issue Type: Bug > Components: examples >Reporter: Chetan Mehrotra >Assignee: Chetan Mehrotra >Priority: Minor > Fix For: 1.3.12 > > > On startup currently an exception is seen related to protected handlers > loading > {noformat} > 01.12.2015 11:10:36.194 *ERROR* [main] > org.apache.jackrabbit.server.remoting.davex.ProtectedRemoveManager > /WEB-INF/protectedHandlers.properties > java.lang.ClassNotFoundException: /WEB-INF/protectedHandlers.properties > at java.lang.Class.forName0(Native Method) ~[na:1.7.0_55] > at java.lang.Class.forName(Class.java:190) ~[na:1.7.0_55] > at > org.apache.jackrabbit.server.remoting.davex.ProtectedRemoveManager.createHandler(ProtectedRemoveManager.java:84) > [jackrabbit-jcr-server-2.11.3-SNAPSHOT.jar:na] > at > org.apache.jackrabbit.server.remoting.davex.ProtectedRemoveManager.(ProtectedRemoveManager.java:56) > [jackrabbit-jcr-server-2.11.3-SNAPSHOT.jar:na] > at > org.apache.jackrabbit.server.remoting.davex.JcrRemotingServlet.init(JcrRemotingServlet.java:276) > [jackrabbit-jcr-server-2.11.3-SNAPSHOT.jar:na] > at javax.servlet.GenericServlet.init(GenericServlet.java:244) > [javax.servlet-api-3.1.0.jar:3.1.0] > at > org.eclipse.jetty.servlet.ServletHolder.initServlet(ServletHolder.java:612) > [jetty-servlet-9.2.8.v20150217.jar:9.2.8.v20150217] > at > org.eclipse.jetty.servlet.ServletHolder.initialize(ServletHolder.java:395) > [jetty-servlet-9.2.8.v20150217.jar:9.2.8.v20150217] > at > org.eclipse.jetty.servlet.ServletHandler.initialize(ServletHandler.java:871) > [jetty-servlet-9.2.8.v20150217.jar:9.2.8.v20150217] > at > 
org.eclipse.jetty.servlet.ServletContextHandler.startContext(ServletContextHandler.java:298) > [jetty-servlet-9.2.8.v20150217.jar:9.2.8.v20150217] > at > org.eclipse.jetty.webapp.WebAppContext.startWebapp(WebAppContext.java:1349) > [jetty-webapp-9.2.8.v20150217.jar:9.2.8.v20150217] > at > org.eclipse.jetty.maven.plugin.JettyWebAppContext.startWebapp(JettyWebAppContext.java:296) > [jetty-maven-plugin-9.2.8.v20150217.jar:9.2.8.v20150217] > at > org.eclipse.jetty.webapp.WebAppContext.startContext(WebAppContext.java:1342) > [jetty-webapp-9.2.8.v20150217.jar:9.2.8.v20150217] > at > org.eclipse.jetty.server.handler.ContextHandler.doStart(ContextHandler.java:741) > [jetty-server-9.2.8.v20150217.jar:9.2.8.v20150217] > at > org.eclipse.jetty.webapp.WebAppContext.doStart(WebAppContext.java:505) > [jetty-webapp-9.2.8.v20150217.jar:9.2.8.v20150217] > at > org.eclipse.jetty.maven.plugin.JettyWebAppContext.doStart(JettyWebAppContext.java:365) > [jetty-maven-plugin-9.2.8.v20150217.jar:9.2.8.v20150217] > at > org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:68) > [jetty-util-9.2.8.v20150217.jar:9.2.8.v20150217] > at > org.eclipse.jetty.util.component.ContainerLifeCycle.start(ContainerLifeCycle.java:132) > [jetty-util-9.2.8.v20150217.jar:9.2.8.v20150217] > at > org.eclipse.jetty.util.component.ContainerLifeCycle.doStart(ContainerLifeCycle.java:114) > [jetty-util-9.2.8.v20150217.jar:9.2.8.v20150217] > at > org.eclipse.jetty.server.handler.AbstractHandler.doStart(AbstractHandler.java:61) > [jetty-server-9.2.8.v20150217.jar:9.2.8.v20150217] > at > org.eclipse.jetty.server.handler.ContextHandlerCollection.doStart(ContextHandlerCollection.java:163) > [jetty-server-9.2.8.v20150217.jar:9.2.8.v20150217] > at > org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:68) > [jetty-util-9.2.8.v20150217.jar:9.2.8.v20150217] > at > org.eclipse.jetty.util.component.ContainerLifeCycle.start(ContainerLifeCycle.java:132) > 
[jetty-util-9.2.8.v20150217.jar:9.2.8.v20150217] > at > org.eclipse.jetty.util.component.ContainerLifeCycle.doStart(ContainerLifeCycle.java:114) > [jetty-util-9.2.8.v20150217.jar:9.2.8.v20150217] > at > org.eclipse.jetty.server.handler.AbstractHandler.doStart(AbstractHandler.java:61) > [jetty-server-9.2.8.v20150217.jar:9.2.8.v20150217] > at > org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:68) > [jetty-util-9.2.8.v20150217.jar:9.2.8.v20150217] > at >
[jira] [Commented] (OAK-3716) Rewrite change on split
[ https://issues.apache.org/jira/browse/OAK-3716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15035865#comment-15035865 ] Marcel Reutegger commented on OAK-3716: --- We'd have to be careful when such a change is introduced because previous documents with _revisions entries are currently not garbage collected, whereas documents that only contain _commitRoot entries are garbage collected. > Rewrite change on split > --- > > Key: OAK-3716 > URL: https://issues.apache.org/jira/browse/OAK-3716 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: core, documentmk >Reporter: Marcel Reutegger > > For continuous revision GC and performance reasons it may be beneficial to > rewrite changes when they are split to a previous document. A _commitRoot > entry should be replaced with a corresponding _revisions entry and the > revision of a branch change replaced with the corresponding merge revision. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
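The rewrite being proposed can be pictured with a toy transform. None of the following is DocumentNodeStore code; the helper, its signature, and the map representation are all invented for illustration, under the assumption that a _commitRoot entry is an indirection to the commit root while a _revisions entry carries the resolved commit value itself.

```java
import java.util.HashMap;
import java.util.Map;

// Toy illustration of rewrite-on-split (hypothetical, not Oak code):
// resolve each _commitRoot indirection into a self-contained _revisions
// entry, so the split-off previous document no longer depends on the
// commit root document.
public class SplitRewriteSketch {
    /** Resolves each _commitRoot entry into a _revisions entry. */
    static Map<String, String> toRevisions(Map<String, String> commitRoot,
                                           Map<String, String> resolved) {
        Map<String, String> revisions = new HashMap<>();
        for (Map.Entry<String, String> e : commitRoot.entrySet()) {
            // look up the resolved commit value for this revision
            revisions.put(e.getKey(), resolved.get(e.getKey()));
        }
        return revisions;
    }
}
```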
[jira] [Comment Edited] (OAK-3637) Bulk document updates in RDBDocumentStore
[ https://issues.apache.org/jira/browse/OAK-3637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15022202#comment-15022202 ] Julian Reschke edited comment on OAK-3637 at 12/2/15 2:49 PM: -- I've run the {{CreateManyChildNodesTest}} benchmark, using the MySQL. The number of children created in one iteration is set to 100 (I wanted to have at least a few iterations during a 5-minute test). Results are as follows: {noformat} ### latency: 0ms, bulk size: 100 ### C min 10% 50% 90% max N bulk (OAK-3637) 1 87 92 98 118 4501577 sequential (SNAPSHOT) 1 368 375 397 461 862 470 ### latency: 20ms, bulk size: 100 ### C min 10% 50% 90% max N bulk (OAK-3637) 177267754787279477973 22 sequential (SNAPSHOT) 1 42639 42639 42686 42769 42769 5 {noformat} was (Author: tomek.rekawek): I've run the {{CreateManyChildNodesTest}} benchmark, using the MySQL. The numer of children created in one iteration is set to 100 (I wanted to have at least a few iterations during a 5-minute test). Results are as follows: {noformat} ### latency: 0ms, bulk size: 100 ### C min 10% 50% 90% max N bulk (OAK-3637) 1 87 92 98 118 4501577 sequential (SNAPSHOT) 1 368 375 397 461 862 470 ### latency: 20ms, bulk size: 100 ### C min 10% 50% 90% max N bulk (OAK-3637) 177267754787279477973 22 sequential (SNAPSHOT) 1 42639 42639 42686 42769 42769 5 {noformat} > Bulk document updates in RDBDocumentStore > - > > Key: OAK-3637 > URL: https://issues.apache.org/jira/browse/OAK-3637 > Project: Jackrabbit Oak > Issue Type: Technical task > Components: rdbmk >Reporter: Tomek Rękawek > Fix For: 1.4 > > Attachments: OAK-3637.patch > > > Implement the [batch createOrUpdate|OAK-3662] in the RDBDocumentStore. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (OAK-3637) Bulk document updates in RDBDocumentStore
[ https://issues.apache.org/jira/browse/OAK-3637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15036018#comment-15036018 ] Julian Reschke commented on OAK-3637: - Tried the patches without changing the test...: {{code}} # CreateManyChildNodesTest C min 10% 50% 90% max N Oak-RDB1 19490 19490 19675 21248 21248 7 {{code}} {{code}} # CreateManyChildNodesTest C min 10% 50% 90% max N Oak-RDB155935672579358786223 30 {{code}} So yes, a nice win! > Bulk document updates in RDBDocumentStore > - > > Key: OAK-3637 > URL: https://issues.apache.org/jira/browse/OAK-3637 > Project: Jackrabbit Oak > Issue Type: Technical task > Components: rdbmk >Reporter: Tomek Rękawek > Fix For: 1.4 > > Attachments: OAK-3637.patch > > > Implement the [batch createOrUpdate|OAK-3662] in the RDBDocumentStore. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (OAK-3637) Bulk document updates in RDBDocumentStore
[ https://issues.apache.org/jira/browse/OAK-3637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15036018#comment-15036018 ] Julian Reschke edited comment on OAK-3637 at 12/2/15 4:01 PM: -- Tried the patches without changing the test...: {noformat} # CreateManyChildNodesTest C min 10% 50% 90% max N Oak-RDB1 19490 19490 19675 21248 21248 7 {noformat} {noformat} # CreateManyChildNodesTest C min 10% 50% 90% max N Oak-RDB155935672579358786223 30 {noformat} So yes, a nice win! was (Author: reschke): Tried the patches without changing the test...: {{code}} # CreateManyChildNodesTest C min 10% 50% 90% max N Oak-RDB1 19490 19490 19675 21248 21248 7 {{code}} {{code}} # CreateManyChildNodesTest C min 10% 50% 90% max N Oak-RDB155935672579358786223 30 {{code}} So yes, a nice win! > Bulk document updates in RDBDocumentStore > - > > Key: OAK-3637 > URL: https://issues.apache.org/jira/browse/OAK-3637 > Project: Jackrabbit Oak > Issue Type: Technical task > Components: rdbmk >Reporter: Tomek Rękawek > Fix For: 1.4 > > Attachments: OAK-3637.patch > > > Implement the [batch createOrUpdate|OAK-3662] in the RDBDocumentStore. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
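The gap in the benchmark numbers above is dominated by round trips to the database. As a rough model (a hypothetical helper, not RDBDocumentStore code), batching divides the round-trip count, and hence the latency cost, by the bulk size:

```java
// Rough model of why batched createOrUpdate wins at non-zero latency
// (hypothetical helper, not RDBDocumentStore code): with per-statement
// latency, total cost is proportional to round trips, and batching
// divides the number of round trips by the bulk size.
public class RoundTripModel {
    /** Latency cost in ms for n updates sent in batches of batchSize. */
    static long costMs(int n, int batchSize, long latencyMs) {
        int roundTrips = (n + batchSize - 1) / batchSize; // ceiling division
        return roundTrips * latencyMs;
    }
}
```

With the benchmark's 100 children per iteration and 20 ms latency, sequential writes pay 100 round trips (2000 ms of latency alone) while a bulk size of 100 pays one (20 ms), which is consistent with the order-of-magnitude gap measured above.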
[jira] [Commented] (OAK-3436) Prevent missing checkpoint due to unstable topology from causing complete reindexing
[ https://issues.apache.org/jira/browse/OAK-3436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15036063#comment-15036063 ] Julian Sedding commented on OAK-3436: - Cons for cluster node specific async indexing - Diffs need to be generated on all nodes -> increased load on mongo/rdb - More checkpoints -> more garbage that cannot be collected -> larger db size Also - Would each cluster node have its own index data? -- If no: how would the data be merged? -- If yes: --- con: storage overhead --- con: new cluster nodes need to index everything after joining the cluster (can possibly be remedied fairly easily) If this doesn't make any sense, I probably misunderstood the suggestion. > Prevent missing checkpoint due to unstable topology from causing complete > reindexing > > > Key: OAK-3436 > URL: https://issues.apache.org/jira/browse/OAK-3436 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: query >Reporter: Chetan Mehrotra >Assignee: Chetan Mehrotra > Labels: resilience > Fix For: 1.2.9, 1.0.25, 1.3.12 > > Attachments: AsyncIndexUpdateClusterTest.java, OAK-3436-0.patch > > > Async indexing logic relies on the embedding application to ensure that the async > indexing job is run as a singleton in a cluster. For Sling based apps it > depends on Sling Discovery support. At times it is seen that if the > topology is not stable then different cluster nodes can consider themselves the > leader and execute the async indexing job concurrently. > This can cause problems, as both cluster nodes might not see the same repository > state (due to write skew and eventual consistency) and might remove the > checkpoint which the other cluster node is still relying upon. E.g. consider > a 2 node cluster N1 and N2 where both are performing async indexing. > # Base state - CP1 is the checkpoint for the "async" job > # N2 starts indexing and removes changes CP1 to CP2.
For Mongo the > checkpoints are saved in the {{settings}} collection # N1 also decides to execute indexing but has not yet seen the latest > repository state, so it still thinks that CP1 is the base checkpoint and tries to > read it. However, CP1 is already removed from {{settings}} and this makes N1 > think that the checkpoint is missing, so it decides to reindex everything! > To avoid this, the topology must be stable, but at the Oak level we should still handle > such a case and avoid doing a full reindexing. So we would need to have a > {{MissingCheckpointStrategy}} similar to {{MissingIndexEditorStrategy}} as > done in OAK-2203 > Possible approaches > # A1 - Fail the indexing run if the checkpoint is missing - A checkpoint can go > missing for valid and invalid reasons. Need to see what the valid > scenarios are where a checkpoint can go missing > # A2 - When a checkpoint is created, also store the creation time. When a > checkpoint is found to be missing and it's a *recent* checkpoint, then fail the > run. E.g. we would fail the run while the checkpoint found to be missing is > less than an hour old (for a just started instance take the startup time into account) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
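Approach A2 from the description above amounts to a small policy check. This is a sketch under assumptions: the class and method names are hypothetical, not the actual Oak API, and the one-hour window is the example value from the description.

```java
import java.util.concurrent.TimeUnit;

// Sketch of approach A2 (hypothetical names, not the actual Oak API):
// store a creation time with each checkpoint; when the base checkpoint
// is missing but recent, fail the indexing run instead of falling back
// to a full reindex.
public class MissingCheckpointPolicy {
    static final long RECENT_MS = TimeUnit.HOURS.toMillis(1);

    /** True if the run should fail rather than reindex everything. */
    static boolean failRun(long checkpointCreatedMs, long nowMs) {
        // A checkpoint this young should still exist; treat its absence
        // as a transient topology problem and retry later.
        return nowMs - checkpointCreatedMs < RECENT_MS;
    }
}
```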
[jira] [Commented] (OAK-3692) java.lang.NoClassDefFoundError: org/apache/lucene/index/sorter/Sorter$DocComparator
[ https://issues.apache.org/jira/browse/OAK-3692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15035875#comment-15035875 ] Tommaso Teofili commented on OAK-3692: -- thanks a lot [~catholicon] and [~chetanm] for looking into this and fixing it :) > java.lang.NoClassDefFoundError: > org/apache/lucene/index/sorter/Sorter$DocComparator > --- > > Key: OAK-3692 > URL: https://issues.apache.org/jira/browse/OAK-3692 > Project: Jackrabbit Oak > Issue Type: Bug > Components: lucene >Reporter: Vikas Saurabh >Assignee: Vikas Saurabh >Priority: Blocker > Fix For: 1.3.12 > > Attachments: OAK-3692.patch > > > I'm getting following exception while trying to include oak trunk build into > AEM: > {noformat} > 27.11.2015 20:41:25.946 *ERROR* [oak-lucene-2] > org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexProvider Uncaught > exception in > org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexProvider@36ba558b > java.lang.NoClassDefFoundError: > org/apache/lucene/index/sorter/Sorter$DocComparator > at > org.apache.jackrabbit.oak.plugins.index.lucene.util.SuggestHelper.getLookup(SuggestHelper.java:108) > at > org.apache.jackrabbit.oak.plugins.index.lucene.IndexNode.(IndexNode.java:106) > at > org.apache.jackrabbit.oak.plugins.index.lucene.IndexNode.open(IndexNode.java:69) > at > org.apache.jackrabbit.oak.plugins.index.lucene.IndexTracker$1.leave(IndexTracker.java:98) > at > org.apache.jackrabbit.oak.spi.commit.EditorDiff.childNodeChanged(EditorDiff.java:153) > at > org.apache.jackrabbit.oak.plugins.segment.MapRecord$3.childNodeChanged(MapRecord.java:444) > at > org.apache.jackrabbit.oak.plugins.segment.MapRecord.compare(MapRecord.java:487) > at > org.apache.jackrabbit.oak.plugins.segment.MapRecord.compareBranch(MapRecord.java:565) > at > org.apache.jackrabbit.oak.plugins.segment.MapRecord.compare(MapRecord.java:470) > at > org.apache.jackrabbit.oak.plugins.segment.MapRecord.compare(MapRecord.java:436) > at > 
org.apache.jackrabbit.oak.plugins.segment.SegmentNodeState.compareAgainstBaseState(SegmentNodeState.java:583) > at > org.apache.jackrabbit.oak.spi.commit.EditorDiff.childNodeChanged(EditorDiff.java:148) > at > org.apache.jackrabbit.oak.plugins.segment.MapRecord$2.childNodeChanged(MapRecord.java:403) > at > org.apache.jackrabbit.oak.plugins.segment.MapRecord$3.childNodeChanged(MapRecord.java:444) > at > org.apache.jackrabbit.oak.plugins.segment.MapRecord.compare(MapRecord.java:487) > at > org.apache.jackrabbit.oak.plugins.segment.MapRecord.compare(MapRecord.java:436) > at > org.apache.jackrabbit.oak.plugins.segment.MapRecord.compare(MapRecord.java:394) > at > org.apache.jackrabbit.oak.plugins.segment.SegmentNodeState.compareAgainstBaseState(SegmentNodeState.java:583) > at > org.apache.jackrabbit.oak.spi.commit.EditorDiff.process(EditorDiff.java:52) > at > org.apache.jackrabbit.oak.plugins.index.lucene.IndexTracker.update(IndexTracker.java:108) > at > org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexProvider.contentChanged(LuceneIndexProvider.java:73) > at > org.apache.jackrabbit.oak.spi.commit.BackgroundObserver$1$1.call(BackgroundObserver.java:131) > at > org.apache.jackrabbit.oak.spi.commit.BackgroundObserver$1$1.call(BackgroundObserver.java:125) > at java.util.concurrent.FutureTask.run(FutureTask.java:262) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > Caused by: java.lang.ClassNotFoundException: > org.apache.lucene.index.sorter.Sorter$DocComparator not found by > org.apache.jackrabbit.oak-lucene [95] > at > org.apache.felix.framework.BundleWiringImpl.findClassOrResourceByDelegation(BundleWiringImpl.java:1573) > at > org.apache.felix.framework.BundleWiringImpl.access$400(BundleWiringImpl.java:79) > at > 
org.apache.felix.framework.BundleWiringImpl$BundleClassLoader.loadClass(BundleWiringImpl.java:2018) > at java.lang.ClassLoader.loadClass(ClassLoader.java:358) > ... 27 common frames omitted > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (OAK-3717) Make it possible to declare SynonymFilter within Analyzer with WN dictionary
Tommaso Teofili created OAK-3717: Summary: Make it possible to declare SynonymFilter within Analyzer with WN dictionary Key: OAK-3717 URL: https://issues.apache.org/jira/browse/OAK-3717 Project: Jackrabbit Oak Issue Type: Improvement Components: lucene Reporter: Tommaso Teofili Assignee: Tommaso Teofili Fix For: 1.3.12 Currently one can compose Lucene Analyzers via [composition|http://jackrabbit.apache.org/oak/docs/query/lucene.html#Create_analyzer_via_composition] within an index definition. It'd be good to be able to also use {{SynonymFilter}} in there, possibly combined with {{WordNetSynonymParser}} to leverage WordNet synonym files. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (OAK-3715) SegmentWriter reduce buffer size for reading binaries
[ https://issues.apache.org/jira/browse/OAK-3715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15035878#comment-15035878 ] Thomas Mueller commented on OAK-3715: - It looks good to me. Where the difference will probably show up is on Java GC logs, and in heap allocations statistics, when using {noformat} -Xrunhprof:heap=sites,depth=4 {noformat} > SegmentWriter reduce buffer size for reading binaries > - > > Key: OAK-3715 > URL: https://issues.apache.org/jira/browse/OAK-3715 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: segmentmk >Reporter: Alex Parvulescu >Assignee: Alex Parvulescu >Priority: Minor > Fix For: 1.3.12 > > Attachments: OAK-3715.patch > > > The SegmentWriter uses an initial buffer size of 256k for reading input > streams binaries that need to be persisted, then it checks if the input is > smaller than 16k to verify if it can be inlined or not. [0] > In the case the input binary is small and can be inlined (<16k), the initial > buffer size is too wasteful and could be reduced to 16k, and if needed > increased to 256k after the threshold check is passed. > [0] > https://github.com/apache/jackrabbit-oak/blob/trunk/oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/segment/SegmentWriter.java#L495 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
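The buffering change being reviewed can be sketched in isolation. This is hypothetical code, not the actual SegmentWriter: only the two sizes (16k inline threshold, 256k large buffer) come from the issue, everything else is illustrative.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Sketch of the OAK-3715 idea (hypothetical, not SegmentWriter code):
// start with the small 16k buffer used for the inline-threshold check,
// and only allocate the large 256k buffer when the binary turns out to
// be bigger than the threshold.
public class AdaptiveBufferReader {
    static final int INLINE_LIMIT = 16 * 1024;   // inline threshold
    static final int LARGE_BUFFER = 256 * 1024;  // previous fixed allocation

    /** Reads the stream fully; small inputs never allocate the 256k buffer. */
    static byte[] readAll(InputStream in) throws IOException {
        byte[] small = new byte[INLINE_LIMIT];
        int read = 0, n;
        while (read < small.length
                && (n = in.read(small, read, small.length - read)) != -1) {
            read += n;
        }
        if (read < small.length) {
            // Fits under the inline threshold: big buffer never needed.
            byte[] result = new byte[read];
            System.arraycopy(small, 0, result, 0, read);
            return result;
        }
        // Larger input: now grow to the big buffer and keep reading.
        ByteArrayOutputStream out = new ByteArrayOutputStream(LARGE_BUFFER);
        out.write(small, 0, read);
        byte[] big = new byte[LARGE_BUFFER];
        while ((n = in.read(big)) != -1) {
            out.write(big, 0, n);
        }
        return out.toByteArray();
    }

    /** Demo helper: read an in-memory byte array and report its length. */
    static int readLength(byte[] data) {
        try {
            return readAll(new ByteArrayInputStream(data)).length;
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }
}
```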
[jira] [Updated] (OAK-3637) Bulk document updates in RDBDocumentStore
[ https://issues.apache.org/jira/browse/OAK-3637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tomek Rękawek updated OAK-3637: --- Attachment: OAK-3637.patch > Bulk document updates in RDBDocumentStore > - > > Key: OAK-3637 > URL: https://issues.apache.org/jira/browse/OAK-3637 > Project: Jackrabbit Oak > Issue Type: Technical task > Components: rdbmk >Reporter: Tomek Rękawek > Fix For: 1.4 > > Attachments: OAK-3637.patch > > > Implement the [batch createOrUpdate|OAK-3662] in the RDBDocumentStore. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (OAK-3594) Consider using LuceneDictionary in suggester
[ https://issues.apache.org/jira/browse/OAK-3594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tommaso Teofili resolved OAK-3594. -- Resolution: Fixed Fix Version/s: (was: 1.4) 1.3.11 implemented in OAK-3149 > Consider using LuceneDictionary in suggester > > > Key: OAK-3594 > URL: https://issues.apache.org/jira/browse/OAK-3594 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: lucene >Reporter: Tommaso Teofili >Assignee: Tommaso Teofili > Fix For: 1.3.11 > > > Currently Lucene suggester is based on {{DocumentDictionary}} which builds > suggestions upon stored values of a certain field (in this case _:suggest_), > however it may be better to stick to plain indexed terms via a > {{LuceneDictionary}} as this would allow to save some space in the index > (:suggest field wouldn't have to be stored) and we can leverage per index > (configurable) analyzer in order to tweak how suggestions will be returned: > using a _KeywordAnalyzer_ would result in same behaviour we currently have, > using a tokenizing Analyzer will result in term level suggestions (tokens > instead of field values). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (OAK-3714) RDBDocumentStore diagnostics for Oracle might not contain index information
[ https://issues.apache.org/jira/browse/OAK-3714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Julian Reschke updated OAK-3714: Fix Version/s: 1.3.12 > RDBDocumentStore diagnostics for Oracle might not contain index information > --- > > Key: OAK-3714 > URL: https://issues.apache.org/jira/browse/OAK-3714 > Project: Jackrabbit Oak > Issue Type: Technical task > Components: documentmk, rdbmk >Affects Versions: 1.3.11, 1.2.8, 1.0.24 >Reporter: Julian Reschke >Assignee: Julian Reschke >Priority: Minor > Fix For: 1.3.12 > > > ...when the table name contains lowercase characters. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (OAK-3637) Bulk document updates in RDBDocumentStore
[ https://issues.apache.org/jira/browse/OAK-3637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15035958#comment-15035958 ] Tomek Rękawek commented on OAK-3637: Thanks for the comment. I removed the changes related to ResultSets outside the new bulk methods. > Bulk document updates in RDBDocumentStore > - > > Key: OAK-3637 > URL: https://issues.apache.org/jira/browse/OAK-3637 > Project: Jackrabbit Oak > Issue Type: Technical task > Components: rdbmk >Reporter: Tomek Rękawek > Fix For: 1.4 > > Attachments: OAK-3637.patch > > > Implement the [batch createOrUpdate|OAK-3662] in the RDBDocumentStore. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (OAK-3715) SegmentWriter reduce buffer size for reading binaries
[ https://issues.apache.org/jira/browse/OAK-3715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Parvulescu resolved OAK-3715. -- Resolution: Fixed Fix Version/s: 1.3.12 collected some offline feedback from Thomas, pushed the patch in with http://svn.apache.org/viewvc?rev=1717636=rev > SegmentWriter reduce buffer size for reading binaries > - > > Key: OAK-3715 > URL: https://issues.apache.org/jira/browse/OAK-3715 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: segmentmk >Reporter: Alex Parvulescu >Assignee: Alex Parvulescu >Priority: Minor > Fix For: 1.3.12 > > Attachments: OAK-3715.patch > > > The SegmentWriter uses an initial buffer size of 256k for reading input > streams binaries that need to be persisted, then it checks if the input is > smaller than 16k to verify if it can be inlined or not. [0] > In the case the input binary is small and can be inlined (<16k), the initial > buffer size is too wasteful and could be reduced to 16k, and if needed > increased to 256k after the threshold check is passed. > [0] > https://github.com/apache/jackrabbit-oak/blob/trunk/oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/segment/SegmentWriter.java#L495 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (OAK-3686) Solr suggestion results should have 1 row per suggestion with appropriate column names
[ https://issues.apache.org/jira/browse/OAK-3686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15036029#comment-15036029 ] Tommaso Teofili commented on OAK-3686: -- your patch looks good [~catholicon], thanks! > Solr suggestion results should have 1 row per suggestion with appropriate > column names > -- > > Key: OAK-3686 > URL: https://issues.apache.org/jira/browse/OAK-3686 > Project: Jackrabbit Oak > Issue Type: Task > Components: solr >Reporter: Vikas Saurabh >Assignee: Vikas Saurabh >Priority: Minor > Fix For: 1.3.12 > > Attachments: OAK-3686.patch > > > Currently the suggest query returns just one row with {{rep:suggest()}} column > containing a string that needs to be parsed. > It'd be better if each suggestion were returned as an individual row with column > names such as {{suggestion}}, {{weight}}(???), etc. > (This is essentially the same issue as OAK-3509 but for solr) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (OAK-3686) Solr suggestion results should have 1 row per suggestion with appropriate column names
[ https://issues.apache.org/jira/browse/OAK-3686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tommaso Teofili resolved OAK-3686. -- Resolution: Fixed fixed in r1717656. > Solr suggestion results should have 1 row per suggestion with appropriate > column names > -- > > Key: OAK-3686 > URL: https://issues.apache.org/jira/browse/OAK-3686 > Project: Jackrabbit Oak > Issue Type: Task > Components: solr >Reporter: Vikas Saurabh >Assignee: Vikas Saurabh >Priority: Minor > Fix For: 1.3.12 > > Attachments: OAK-3686.patch > > > Currently the suggest query returns just one row with {{rep:suggest()}} column > containing a string that needs to be parsed. > It'd be better if each suggestion were returned as an individual row with column > names such as {{suggestion}}, {{weight}}(???), etc. > (This is essentially the same issue as OAK-3509 but for solr) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (OAK-3637) Bulk document updates in RDBDocumentStore
[ https://issues.apache.org/jira/browse/OAK-3637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15035941#comment-15035941 ] Julian Reschke commented on OAK-3637: - That's a lot to review. It would be helpful if this patch only contained stuff that's actually related to the change. For instance, I see quite a few changes that move the ResultSet out of the try block so it can be closed in "finally". Unless I'm missing something, that's pointless, as it's already implied by closing the Statement object. > Bulk document updates in RDBDocumentStore > - > > Key: OAK-3637 > URL: https://issues.apache.org/jira/browse/OAK-3637 > Project: Jackrabbit Oak > Issue Type: Technical task > Components: rdbmk >Reporter: Tomek Rękawek > Fix For: 1.4 > > Attachments: OAK-3637.patch > > > Implement the [batch createOrUpdate|OAK-3662] in the RDBDocumentStore. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
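The closing behavior the review comment relies on can be modeled with toy AutoCloseable classes. These are stand-ins, not java.sql types: the JDBC contract says Statement.close() also closes the ResultSet it produced, so a separate finally block for the ResultSet adds nothing.

```java
// Toy model of the review comment (stand-in classes, not java.sql):
// per the JDBC contract, closing a Statement also closes any ResultSet
// it created, so a dedicated finally block for the ResultSet is redundant.
public class CloseChainSketch {
    static class ToyResultSet implements AutoCloseable {
        boolean closed;
        @Override public void close() { closed = true; }
    }

    static class ToyStatement implements AutoCloseable {
        private ToyResultSet rs;
        ToyResultSet executeQuery() { rs = new ToyResultSet(); return rs; }
        @Override public void close() { if (rs != null) rs.close(); }
    }

    /** Closing only the statement still closes its result set. */
    static boolean resultSetClosedAfterStatementClose() {
        ToyStatement stmt = new ToyStatement();
        ToyResultSet rs = stmt.executeQuery();
        try {
            // ... read rows here ...
        } finally {
            stmt.close(); // closes rs as well; no separate rs.close() needed
        }
        return rs.closed;
    }
}
```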
[jira] [Resolved] (OAK-3714) RDBDocumentStore diagnostics for Oracle might not contain index information
[ https://issues.apache.org/jira/browse/OAK-3714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Julian Reschke resolved OAK-3714. - Resolution: Fixed Fix Version/s: 1.0.25 trunk: http://svn.apache.org/r1717632 1.2: http://svn.apache.org/r1717635 1.0: http://svn.apache.org/r1717640 > RDBDocumentStore diagnostics for Oracle might not contain index information > --- > > Key: OAK-3714 > URL: https://issues.apache.org/jira/browse/OAK-3714 > Project: Jackrabbit Oak > Issue Type: Technical task > Components: documentmk, rdbmk >Affects Versions: 1.3.11, 1.2.8, 1.0.24 >Reporter: Julian Reschke >Assignee: Julian Reschke >Priority: Minor > Fix For: 1.2.9, 1.0.25, 1.3.12 > > > ...when the table name contains lowercase characters. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (OAK-3436) Prevent missing checkpoint due to unstable topology from causing complete reindexing
[ https://issues.apache.org/jira/browse/OAK-3436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15035899#comment-15035899 ] Manfred Baedke commented on OAK-3436: - bq. What would be the possible cons of having each node run async indexing itself with its own set of checkpoints? So far I don't see relevant cons. [~chetanm], [~alex.parvulescu], wdyt? bq. To generate cluster-node-specific checkpoints, OAK-3529 could help. Thx. > Prevent missing checkpoint due to unstable topology from causing complete > reindexing > > > Key: OAK-3436 > URL: https://issues.apache.org/jira/browse/OAK-3436 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: query >Reporter: Chetan Mehrotra >Assignee: Chetan Mehrotra > Labels: resilience > Fix For: 1.2.9, 1.0.25, 1.3.12 > > Attachments: AsyncIndexUpdateClusterTest.java, OAK-3436-0.patch > > > Async indexing logic relies on the embedding application to ensure that the async > indexing job is run as a singleton in a cluster. For Sling-based apps it > depends on Sling Discovery support. At times it has been seen that if the > topology is not stable, different cluster nodes can consider themselves the > leader and execute the async indexing job concurrently. > This can cause problems, as the cluster nodes might not see the same repository > state (due to write skew and eventual consistency) and might remove a checkpoint > which another cluster node is still relying upon. For example, consider > a 2-node cluster N1 and N2 where both are performing async indexing. > # Base state - CP1 is the checkpoint for the "async" job > # N2 starts indexing, indexes the changes between CP1 and CP2, and removes CP1. For Mongo the > checkpoints are saved in the {{settings}} collection > # N1 also decides to execute indexing but has not yet seen the latest > repository state, so it still thinks that CP1 is the base checkpoint and tries to > read it. However, CP1 has already been removed from {{settings}}, which makes N1 > think that the checkpoint is missing, and it decides to reindex everything! 
> To avoid this the topology must be stable, but at the Oak level we should still handle > such a case and avoid doing a full reindexing. So we would need a > {{MissingCheckpointStrategy}} similar to the {{MissingIndexEditorStrategy}} as > done in OAK-2203 > Possible approaches > # A1 - Fail the indexing run if the checkpoint is missing - A missing checkpoint can > have valid and invalid reasons. Need to see what the valid > scenarios are where a checkpoint can go missing > # A2 - When a checkpoint is created, also store the creation time. When a > checkpoint is found to be missing and it is a *recent* checkpoint, then fail the > run. For example, we would fail the run as long as the missing checkpoint is > less than an hour old (for a just-started instance, take the startup time into account) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
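Approach A2 above boils down to a small policy: treat a missing checkpoint as fatal for the run only while it is recent, and allow a full reindex only once it is older than a threshold. A minimal sketch, assuming a hypothetical MissingCheckpointPolicy class (names are illustrative, not actual Oak API):

```java
// Sketch of approach A2 from OAK-3436: fail the async indexing run when
// the missing checkpoint is recent (likely an unstable-topology glitch),
// allow reindexing only when it is older than a threshold.
// MissingCheckpointPolicy is a hypothetical name, not actual Oak API.
public class MissingCheckpointPolicy {

    private final long maxAgeMillis;

    public MissingCheckpointPolicy(long maxAgeMillis) {
        this.maxAgeMillis = maxAgeMillis;
    }

    /**
     * @param checkpointCreatedAt creation time stored with the checkpoint
     * @param now                 current time
     * @return true if the run should fail (checkpoint is recent);
     *         false if a full reindex is acceptable
     */
    public boolean failRun(long checkpointCreatedAt, long now) {
        return now - checkpointCreatedAt < maxAgeMillis;
    }

    public static void main(String[] args) {
        MissingCheckpointPolicy policy =
                new MissingCheckpointPolicy(60 * 60 * 1000L); // one hour
        long now = System.currentTimeMillis();
        // Missing checkpoint created 5 minutes ago: fail the run.
        System.out.println(policy.failRun(now - 5 * 60 * 1000L, now)); // true
        // Missing checkpoint created 2 hours ago: allow reindexing.
        System.out.println(policy.failRun(now - 2 * 60 * 60 * 1000L, now)); // false
    }
}
```

Storing the creation time next to the checkpoint (and accounting for instance startup time, as the description notes) is what makes this decision possible at all.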
[jira] [Commented] (OAK-3710) Continuous revision GC
[ https://issues.apache.org/jira/browse/OAK-3710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15036140#comment-15036140 ] Vikas Saurabh commented on OAK-3710: [~mreutegg], I was further discussing this with [~chetanm] and it seemed that we might be able to 'reduce' the number of document writes during 'rewrite commit entries (step 3.2)' if we introduce some sort of early document re-write attached to lastRev updates. Chetan had concerns around slowing down background-write, so we might want to do it in a separate thread with a queue of docs similar to pending-last-revs. The idea is that a document whose lastRev is to be updated is also scanned for revisions from the same cluster node older than the lastRev being updated, i.e. for a lastRev update of r-0-2=r-X-2, we can clean properties with revisions r-Y-2 where Y < X. > Continuous revision GC > -- > > Key: OAK-3710 > URL: https://issues.apache.org/jira/browse/OAK-3710 > Project: Jackrabbit Oak > Issue Type: New Feature > Components: core, documentmk >Reporter: Marcel Reutegger > > Implement continuous revision GC cleaning up documents older than a given > threshold (e.g. one day). This issue is related to OAK-3070 where each GC run > starts where the last one finished. > This will avoid peak load on the system as we see it right now, when GC is > triggered once a day. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (OAK-3710) Continuous revision GC
[ https://issues.apache.org/jira/browse/OAK-3710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15036140#comment-15036140 ] Vikas Saurabh edited comment on OAK-3710 at 12/2/15 5:36 PM: - [~mreutegg], I was further discussing this with [~chetanm] and it seemed that we might be able to 'reduce' the number of document writes during 'rewrite commit entries (step 3.2)' if we introduce some sort of early document re-write attached to lastRev updates. Chetan had concerns around slowing down background-write, so we might want to do it in a separate thread with a queue of docs similar to pending-last-revs. The idea is that a document whose lastRev is to be updated is also scanned for revisions from the same cluster node older than the lastRev being updated, i.e. for a lastRev update of r-0-2=r-X-2, we can clean properties with revisions r-Y-2 where Y < X. > Continuous revision GC > -- > > Key: OAK-3710 > URL: https://issues.apache.org/jira/browse/OAK-3710 > Project: Jackrabbit Oak > Issue Type: New Feature > Components: core, documentmk >Reporter: Marcel Reutegger > > Implement continuous revision GC cleaning up documents older than a given > threshold (e.g. one day). This issue is related to OAK-3070 where each GC run > starts where the last one finished. > This will avoid peak load on the system as we see it right now, when GC is > triggered once a day. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
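The pruning rule in the comment above (for a lastRev update r-0-2=r-X-2, property revisions r-Y-2 with Y < X become cleanup candidates) amounts to comparing revisions from the same cluster node against the new lastRev. A minimal sketch, using a hypothetical Rev class rather than Oak's actual Revision:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch of the pruning rule from the OAK-3710 comment: while updating
// lastRev for a cluster node, revisions from the *same* cluster node
// that are strictly older than the new lastRev are cleanup candidates.
// Rev is a hypothetical stand-in for Oak's Revision class.
public class RevisionPruneDemo {

    static class Rev {
        final long timestamp;
        final int clusterId;
        Rev(long timestamp, int clusterId) {
            this.timestamp = timestamp;
            this.clusterId = clusterId;
        }
    }

    static List<Rev> cleanupCandidates(List<Rev> revs, Rev newLastRev) {
        List<Rev> candidates = new ArrayList<>();
        for (Rev r : revs) {
            // same cluster node, strictly older than the new lastRev
            if (r.clusterId == newLastRev.clusterId
                    && r.timestamp < newLastRev.timestamp) {
                candidates.add(r);
            }
        }
        return candidates;
    }

    public static void main(String[] args) {
        List<Rev> revs = Arrays.asList(
                new Rev(10, 2), new Rev(20, 2), new Rev(15, 1));
        // lastRev update to r-30-2: r-10-2 and r-20-2 qualify, while
        // r-15-1 belongs to another cluster node and is kept.
        System.out.println(cleanupCandidates(revs, new Rev(30, 2)).size()); // prints "2"
    }
}
```

Doing this scan off the critical path (the separate queue-fed thread Vikas mentions) keeps the background-write unaffected.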
[jira] [Commented] (OAK-3662) Create bulk createOrUpdate method and use it in Commit
[ https://issues.apache.org/jira/browse/OAK-3662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15036103#comment-15036103 ] Julian Reschke commented on OAK-3662: - We don't have any unit test coverage for the new DocumentStore method, right? We really need that, for instance in BasicDocumentStoreTest. > Create bulk createOrUpdate method and use it in Commit > -- > > Key: OAK-3662 > URL: https://issues.apache.org/jira/browse/OAK-3662 > Project: Jackrabbit Oak > Issue Type: Technical task > Components: documentmk >Reporter: Tomek Rękawek > Fix For: 1.4 > > Attachments: OAK-3662.patch > > > The {{DocumentStore#createOrUpdate(Collection, UpdateOp)}} method is invoked > in a loop in {{Commit#applyToDocumentStore()}}, once for each changed > node. Investigate if it's possible to implement a batch version of the > createOrUpdate method. It should return all documents before they are > modified, so the Commit class can discover conflicts (if there are any). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (OAK-3436) Prevent missing checkpoint due to unstable topology from causing complete reindexing
[ https://issues.apache.org/jira/browse/OAK-3436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15035499#comment-15035499 ] Davide Giannella commented on OAK-3436: --- I'm jumping in and may not have properly understood the issue, so I may be wrong. Nevertheless, here's my idea. What would be the possible cons of having each node run async indexing itself with its own set of checkpoints? Something like Manfred said. To generate cluster-node-specific checkpoints, OAK-3529 could help. > Prevent missing checkpoint due to unstable topology from causing complete > reindexing > > > Key: OAK-3436 > URL: https://issues.apache.org/jira/browse/OAK-3436 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: query >Reporter: Chetan Mehrotra >Assignee: Chetan Mehrotra > Labels: resilience > Fix For: 1.2.9, 1.0.25, 1.3.12 > > Attachments: AsyncIndexUpdateClusterTest.java, OAK-3436-0.patch > > > Async indexing logic relies on the embedding application to ensure that the async > indexing job is run as a singleton in a cluster. For Sling-based apps it > depends on Sling Discovery support. At times it has been seen that if the > topology is not stable, different cluster nodes can consider themselves the > leader and execute the async indexing job concurrently. > This can cause problems, as the cluster nodes might not see the same repository > state (due to write skew and eventual consistency) and might remove a checkpoint > which another cluster node is still relying upon. For example, consider > a 2-node cluster N1 and N2 where both are performing async indexing. > # Base state - CP1 is the checkpoint for the "async" job > # N2 starts indexing, indexes the changes between CP1 and CP2, and removes CP1. For Mongo the > checkpoints are saved in the {{settings}} collection > # N1 also decides to execute indexing but has not yet seen the latest > repository state, so it still thinks that CP1 is the base checkpoint and tries to > read it. 
However, CP1 has already been removed from {{settings}}, which makes N1 > think that the checkpoint is missing, and it decides to reindex everything! > To avoid this the topology must be stable, but at the Oak level we should still handle > such a case and avoid doing a full reindexing. So we would need a > {{MissingCheckpointStrategy}} similar to the {{MissingIndexEditorStrategy}} as > done in OAK-2203 > Possible approaches > # A1 - Fail the indexing run if the checkpoint is missing - A missing checkpoint can > have valid and invalid reasons. Need to see what the valid > scenarios are where a checkpoint can go missing > # A2 - When a checkpoint is created, also store the creation time. When a > checkpoint is found to be missing and it is a *recent* checkpoint, then fail the > run. For example, we would fail the run as long as the missing checkpoint is > less than an hour old (for a just-started instance, take the startup time into account) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (OAK-3140) DataStore / BlobStore: add a method to pass a "type" when writing
[ https://issues.apache.org/jira/browse/OAK-3140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15035523#comment-15035523 ] Thomas Mueller commented on OAK-3140: - It would make sense to also add the "path", if available. This is to support OAK-3402 (Multiplexing DocumentStore support in Oak layer). > DataStore / BlobStore: add a method to pass a "type" when writing > - > > Key: OAK-3140 > URL: https://issues.apache.org/jira/browse/OAK-3140 > Project: Jackrabbit Oak > Issue Type: New Feature > Components: blob >Reporter: Thomas Mueller >Assignee: Thomas Mueller > Labels: performance > > Currently, the BlobStore interface has a method "String writeBlob(InputStream > in)". This issue is about adding a new method "String writeBlob(String type, > InputStream in)", for the following reasons (in no particular order): > * Store some binaries (for example Lucene index files) in a different place, > in order to safely and quickly run garbage collection just on those files. > * Store some binaries in a slow, some in a fast storage or location. > * Disable calculating the content hash (de-duplication) for some binaries. > * Store some binaries in a shared storage (for fast cross-repository > copying), and some in local storage. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
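The proposed API shape — an overload of writeBlob taking a "type" hint so the implementation can route binaries to different storage — can be sketched as follows. TypedBlobStore and InMemoryBlobStore are illustrative toys, not Oak's actual BlobStore interface or any of its implementations:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;

// Sketch of the OAK-3140 proposal: writeBlob gains a "type" (and, per the
// comment, possibly a "path") hint so the store can route the binary.
// TypedBlobStore/InMemoryBlobStore are toys, not Oak's BlobStore API.
public class TypedBlobStoreDemo {

    interface TypedBlobStore {
        String writeBlob(InputStream in);
        // proposed overload: "type" is a routing hint, e.g. "lucene-index"
        String writeBlob(String type, InputStream in);
    }

    static class InMemoryBlobStore implements TypedBlobStore {
        final Map<String, byte[]> blobs = new HashMap<>();
        final Map<String, String> types = new HashMap<>();
        int next = 0;

        public String writeBlob(InputStream in) {
            return writeBlob("default", in);
        }

        public String writeBlob(String type, InputStream in) {
            try {
                String id = "blob-" + next++;
                blobs.put(id, in.readAllBytes());
                types.put(id, type); // a real store might pick storage by type
                return id;
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }

    public static void main(String[] args) {
        InMemoryBlobStore store = new InMemoryBlobStore();
        String id = store.writeBlob("lucene-index",
                new ByteArrayInputStream(new byte[] {1, 2, 3}));
        System.out.println(id + " -> " + store.types.get(id));
    }
}
```

A real implementation would use the hint to pick a storage location or skip de-duplication, as the bullet points in the description suggest.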
[jira] [Created] (OAK-3709) CugValidator should ignore node type definitions
angela created OAK-3709: --- Summary: CugValidator should ignore node type definitions Key: OAK-3709 URL: https://issues.apache.org/jira/browse/OAK-3709 Project: Jackrabbit Oak Issue Type: Bug Components: authorization-cug Reporter: angela Assignee: angela -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (OAK-3718) JCR observation should be visible in SessionMBean
Jörg Hoh created OAK-3718: - Summary: JCR observation should be visible in SessionMBean Key: OAK-3718 URL: https://issues.apache.org/jira/browse/OAK-3718 Project: Jackrabbit Oak Issue Type: Improvement Affects Versions: 1.0.24 Reporter: Jörg Hoh I am looking for long-running sessions which are not (explicitly or implicitly) refreshed. As a session which has a registered JCR observation event listener gets refreshed implicitly, it would be good to see in the SessionMBean whether a JCR observation event listener is registered for this session. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (OAK-3494) MemoryDiffCache should also check parent paths before falling to Loader (or returning null)
[ https://issues.apache.org/jira/browse/OAK-3494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikas Saurabh updated OAK-3494: --- Fix Version/s: 1.0.25 > MemoryDiffCache should also check parent paths before falling to Loader (or > returning null) > --- > > Key: OAK-3494 > URL: https://issues.apache.org/jira/browse/OAK-3494 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: core, mongomk >Reporter: Vikas Saurabh >Assignee: Marcel Reutegger > Labels: candidate_oak_1_0, candidate_oak_1_2, performance > Fix For: 1.3.10, 1.2.9, 1.0.25 > > Attachments: OAK-3494-1.patch, OAK-3494-2.patch, > OAK-3494-TestCase.patch, OAK-3494.patch > > > Each entry in {{MemoryDiffCache}} is keyed with {{(path, fromRev, toRev)}} > for the list of modified children at {{path}}. A diff calculated by > {{DocumentNodeStore.diffImpl}} at '/' (passively via loader) or > {{JournalEntry.applyTo}} (actively) fills each path for which there are > modified children (including the hierarchy). > But, if an observer calls {{compareWithBaseState}} on an unmodified sub-tree, > the observer will still go down to {{diffImpl}} although a cached parent entry > could be used to answer the query. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
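The optimization described in OAK-3494 — consult cached diff entries for ancestor paths before invoking the loader — can be sketched with a plain map standing in for the cache. The cache layout and class below are illustrative simplifications, not Oak's actual MemoryDiffCache:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the OAK-3494 idea: before falling through to the diff loader
// for (path, fromRev, toRev), walk up the cached parent entries. If some
// cached ancestor diff does not list the child leading down to "path",
// nothing under that child changed, so the loader can be skipped.
// This cache layout is a simplification, not Oak's MemoryDiffCache.
public class DiffCacheSketch {

    // key: "path@fromRev:toRev" -> space-separated names of modified children
    private final Map<String, String> cache = new HashMap<>();

    public void put(String path, String revs, String modifiedChildren) {
        cache.put(path + "@" + revs, modifiedChildren);
    }

    /** true if cached ancestor entries prove the sub-tree is unmodified */
    public boolean knownUnmodified(String path, String revs) {
        String node = path;
        while (!node.equals("/")) {
            String parent = parent(node);
            String diff = cache.get(parent + "@" + revs);
            if (diff != null) {
                // decisive: if "node" is not among the parent's modified
                // children, nothing underneath it changed either
                return !(" " + diff + " ").contains(" " + name(node) + " ");
            }
            node = parent; // no entry at this level; try one level up
        }
        return false; // nothing cached up to the root: ask the loader
    }

    private static String parent(String path) {
        int idx = path.lastIndexOf('/');
        return idx == 0 ? "/" : path.substring(0, idx);
    }

    private static String name(String path) {
        return path.substring(path.lastIndexOf('/') + 1);
    }

    public static void main(String[] args) {
        DiffCacheSketch c = new DiffCacheSketch();
        c.put("/", "r1:r2", "a"); // only /a changed between r1 and r2
        // /b is not in the root diff, so /b/c cannot have changed:
        System.out.println(c.knownUnmodified("/b/c", "r1:r2")); // true
        // /a did change and nothing is cached for /a itself: ask the loader.
        System.out.println(c.knownUnmodified("/a/x", "r1:r2")); // false
    }
}
```

A false result means "not proven unmodified" and falls back to the existing diffImpl path, so the sketch only ever avoids work, never changes results.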
[jira] [Comment Edited] (OAK-3494) MemoryDiffCache should also check parent paths before falling to Loader (or returning null)
[ https://issues.apache.org/jira/browse/OAK-3494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15036670#comment-15036670 ] Vikas Saurabh edited comment on OAK-3494 at 12/2/15 10:24 PM: -- Backported r1707509 and r1710800 into: * 1.2 at http://svn.apache.org/r1717683 * 1.0 at http://svn.apache.org/r1717690 was (Author: catholicon): Backported into 1.2 at http://svn.apache.org/r1717683 > MemoryDiffCache should also check parent paths before falling to Loader (or > returning null) > --- > > Key: OAK-3494 > URL: https://issues.apache.org/jira/browse/OAK-3494 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: core, mongomk >Reporter: Vikas Saurabh >Assignee: Marcel Reutegger > Labels: candidate_oak_1_0, candidate_oak_1_2, performance > Fix For: 1.3.10, 1.2.9, 1.0.25 > > Attachments: OAK-3494-1.patch, OAK-3494-2.patch, > OAK-3494-TestCase.patch, OAK-3494.patch > > > Each entry in {{MemoryDiffCache}} is keyed with {{(path, fromRev, toRev)}} > for the list of modified children at {{path}}. A diff calcualted by > {{DocumentNodeStore.diffImpl}} at '/' (passively via loader) or > {{JournalEntry.applyTo}} (actively) fill each path for which there are > modified children (including the hierarchy) > But, if an observer calls {{compareWithBaseState}} on a unmodified sub-tree, > the observer will still go down to {{diffImpl}} although cached parent entry > can be used to answer the query. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (OAK-3223) Remove MongoDiffCache
[ https://issues.apache.org/jira/browse/OAK-3223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikas Saurabh updated OAK-3223: --- Fix Version/s: 1.0.25 > Remove MongoDiffCache > - > > Key: OAK-3223 > URL: https://issues.apache.org/jira/browse/OAK-3223 > Project: Jackrabbit Oak > Issue Type: Task > Components: core, mongomk >Reporter: Marcel Reutegger >Assignee: Marcel Reutegger >Priority: Minor > Fix For: 1.3.4, 1.2.9, 1.0.25 > > > The MongoDiffCache is not used anymore and can be removed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (OAK-3223) Remove MongoDiffCache
[ https://issues.apache.org/jira/browse/OAK-3223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikas Saurabh updated OAK-3223: --- Fix Version/s: 1.2.9 > Remove MongoDiffCache > - > > Key: OAK-3223 > URL: https://issues.apache.org/jira/browse/OAK-3223 > Project: Jackrabbit Oak > Issue Type: Task > Components: core, mongomk >Reporter: Marcel Reutegger >Assignee: Marcel Reutegger >Priority: Minor > Fix For: 1.3.4, 1.2.9 > > > The MongoDiffCache is not used anymore and can be removed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (OAK-3223) Remove MongoDiffCache
[ https://issues.apache.org/jira/browse/OAK-3223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15036668#comment-15036668 ] Vikas Saurabh commented on OAK-3223: Backported into 1.2 at http://svn.apache.org/r1717683 > Remove MongoDiffCache > - > > Key: OAK-3223 > URL: https://issues.apache.org/jira/browse/OAK-3223 > Project: Jackrabbit Oak > Issue Type: Task > Components: core, mongomk >Reporter: Marcel Reutegger >Assignee: Marcel Reutegger >Priority: Minor > Fix For: 1.3.4, 1.2.9 > > > The MongoDiffCache is not used anymore and can be removed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (OAK-3223) Remove MongoDiffCache
[ https://issues.apache.org/jira/browse/OAK-3223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15036668#comment-15036668 ] Vikas Saurabh edited comment on OAK-3223 at 12/2/15 10:23 PM: -- Backported r1695571 into: * 1.2 at http://svn.apache.org/r1717683 * 1.0 at http://svn.apache.org/r1717690 was (Author: catholicon): Backported into 1.2 at http://svn.apache.org/r1717683 > Remove MongoDiffCache > - > > Key: OAK-3223 > URL: https://issues.apache.org/jira/browse/OAK-3223 > Project: Jackrabbit Oak > Issue Type: Task > Components: core, mongomk >Reporter: Marcel Reutegger >Assignee: Marcel Reutegger >Priority: Minor > Fix For: 1.3.4, 1.2.9 > > > The MongoDiffCache is not used anymore and can be removed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (OAK-3720) Update script console bundle version to 1.0.2
Chetan Mehrotra created OAK-3720: Summary: Update script console bundle version to 1.0.2 Key: OAK-3720 URL: https://issues.apache.org/jira/browse/OAK-3720 Project: Jackrabbit Oak Issue Type: Task Components: examples Reporter: Chetan Mehrotra Assignee: Chetan Mehrotra Priority: Minor Fix For: 1.3.12 Script Console bundle version 1.0.2 has been released, which has a fix for FELIX-5120; that issue was causing an exception stacktrace on startup. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (OAK-3719) Test failure: ManyChildNodesTest
Chetan Mehrotra created OAK-3719: Summary: Test failure: ManyChildNodesTest Key: OAK-3719 URL: https://issues.apache.org/jira/browse/OAK-3719 Project: Jackrabbit Oak Issue Type: Task Components: documentmk Reporter: Chetan Mehrotra Assignee: Chetan Mehrotra Priority: Minor Fix For: 1.3.12 At times {{ManyChildNodesTest#manyChildNodes}} fails like {noformat} Stack Trace: java.lang.AssertionError: 2147483647 > 1601 at org.junit.Assert.fail(Assert.java:88) at org.junit.Assert.assertTrue(Assert.java:41) at org.apache.jackrabbit.oak.plugins.document.ManyChildNodesTest.manyChildNodes(ManyChildNodesTest.java:63) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) {noformat} This happens because during the test the cluster lease logic can also make calls which get included in the test result. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (OAK-3719) Test failure: ManyChildNodesTest
[ https://issues.apache.org/jira/browse/OAK-3719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15037261#comment-15037261 ] Chetan Mehrotra edited comment on OAK-3719 at 12/3/15 5:01 AM: --- Fixed with 1717712 by only intercepting calls for Nodes collection in TestStore was (Author: chetanm): Fixed with 1717712 > Test failure: ManyChildNodesTest > - > > Key: OAK-3719 > URL: https://issues.apache.org/jira/browse/OAK-3719 > Project: Jackrabbit Oak > Issue Type: Task > Components: documentmk >Reporter: Chetan Mehrotra >Assignee: Chetan Mehrotra >Priority: Minor > Fix For: 1.3.12 > > > At times {{ManyChildNodesTest#manyChildNodes}} fails like > {noformat} > Stack Trace: > java.lang.AssertionError: 2147483647 > 1601 > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.assertTrue(Assert.java:41) > at > org.apache.jackrabbit.oak.plugins.document.ManyChildNodesTest.manyChildNodes(ManyChildNodesTest.java:63) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) > at java.lang.reflect.Method.invoke(Method.java:597) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) > {noformat} > This happens because during the test Cluster lease logic can also make call > which get included in test result. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (OAK-2110) performance issues with VersionGarbageCollector
[ https://issues.apache.org/jira/browse/OAK-2110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Julian Reschke updated OAK-2110: Component/s: rdbmk doc > performance issues with VersionGarbageCollector > --- > > Key: OAK-2110 > URL: https://issues.apache.org/jira/browse/OAK-2110 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: doc, mongomk, rdbmk >Reporter: Julian Reschke >Assignee: Julian Reschke > Fix For: 1.4 > > > This one currently special-cases Mongo. For other persistences, it > - fetches *all* documents > - filters by SD_TYPE > - filters by lastmod of versions > - deletes what remains > This is not only inefficient but also fails with OutOfMemory for any larger > repo. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (OAK-3494) MemoryDiffCache should also check parent paths before falling to Loader (or returning null)
[ https://issues.apache.org/jira/browse/OAK-3494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetan Mehrotra updated OAK-3494: - Labels: performance (was: candidate_oak_1_0 candidate_oak_1_2 performance) > MemoryDiffCache should also check parent paths before falling to Loader (or > returning null) > --- > > Key: OAK-3494 > URL: https://issues.apache.org/jira/browse/OAK-3494 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: core, mongomk >Reporter: Vikas Saurabh >Assignee: Marcel Reutegger > Labels: performance > Fix For: 1.3.10, 1.2.9, 1.0.25 > > Attachments: OAK-3494-1.patch, OAK-3494-2.patch, > OAK-3494-TestCase.patch, OAK-3494.patch > > > Each entry in {{MemoryDiffCache}} is keyed with {{(path, fromRev, toRev)}} > for the list of modified children at {{path}}. A diff calcualted by > {{DocumentNodeStore.diffImpl}} at '/' (passively via loader) or > {{JournalEntry.applyTo}} (actively) fill each path for which there are > modified children (including the hierarchy) > But, if an observer calls {{compareWithBaseState}} on a unmodified sub-tree, > the observer will still go down to {{diffImpl}} although cached parent entry > can be used to answer the query. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (OAK-3719) Test failure: ManyChildNodesTest
[ https://issues.apache.org/jira/browse/OAK-3719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetan Mehrotra resolved OAK-3719. -- Resolution: Fixed Fixed with 1717712 > Test failure: ManyChildNodesTest > - > > Key: OAK-3719 > URL: https://issues.apache.org/jira/browse/OAK-3719 > Project: Jackrabbit Oak > Issue Type: Task > Components: documentmk >Reporter: Chetan Mehrotra >Assignee: Chetan Mehrotra >Priority: Minor > Fix For: 1.3.12 > > > At times {{ManyChildNodesTest#manyChildNodes}} fails like > {noformat} > Stack Trace: > java.lang.AssertionError: 2147483647 > 1601 > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.assertTrue(Assert.java:41) > at > org.apache.jackrabbit.oak.plugins.document.ManyChildNodesTest.manyChildNodes(ManyChildNodesTest.java:63) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) > at java.lang.reflect.Method.invoke(Method.java:597) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) > {noformat} > This happens because during the test Cluster lease logic can also make call > which get included in test result. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (OAK-3709) CugValidator should ignore node type definitions
[ https://issues.apache.org/jira/browse/OAK-3709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] angela updated OAK-3709: Description: since the node type definitions contain the reserved names as part of the nt definitions, the validator must omit that part during the verification. > CugValidator should ignore node type definitions > > > Key: OAK-3709 > URL: https://issues.apache.org/jira/browse/OAK-3709 > Project: Jackrabbit Oak > Issue Type: Bug > Components: authorization-cug >Reporter: angela >Assignee: angela > > since the node type definitions contain the reserved names as part of the nt > definitions, the validator must omit that part during the verification. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (OAK-3709) CugValidator should ignore node type definitions
[ https://issues.apache.org/jira/browse/OAK-3709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] angela resolved OAK-3709. - Resolution: Fixed Fix Version/s: 1.3.12 Committed revision 1717596. > CugValidator should ignore node type definitions > > > Key: OAK-3709 > URL: https://issues.apache.org/jira/browse/OAK-3709 > Project: Jackrabbit Oak > Issue Type: Bug > Components: authorization-cug >Reporter: angela >Assignee: angela > Fix For: 1.3.12 > > > since the node type definitions contain the reserved names as part of the nt > definitions, the validator must omit that part during the verification. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (OAK-1981) Implement full scale Revision GC for DocumentNodeStore
[ https://issues.apache.org/jira/browse/OAK-1981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcel Reutegger resolved OAK-1981. --- Resolution: Duplicate Fix Version/s: (was: 1.4) Resolving this issue as a duplicate of the DocumentMK Revision GC epic introduced a while ago. Missing GC features are listed in the epic and are more up-to-date there. > Implement full scale Revision GC for DocumentNodeStore > -- > > Key: OAK-1981 > URL: https://issues.apache.org/jira/browse/OAK-1981 > Project: Jackrabbit Oak > Issue Type: New Feature > Components: mongomk >Reporter: Chetan Mehrotra >Assignee: Marcel Reutegger > Labels: resilience, scalability > > So far we have implemented garbage collection in some form with OAK-1341. > Those approaches help us remove quite a bit of garbage (mostly due to deleted > nodes), but still some part is left. > However, full GC is still not performed, due to which some of the old revision > related data cannot be GCed, like > * Revision info present in revision maps of various commit roots > * Revisions related to unmerged branches (OAK-1926) > * Revision data created due to a property being modified by different cluster nodes > So having a tool which can perform the above GC would be helpful. For a start we > can have an implementation which takes a brute-force approach and scans the whole > repo (which would take quite a bit of time) and later we can evolve it, or allow > system admins to determine to what level GC has to be done. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (OAK-3710) Continuous revision GC
Marcel Reutegger created OAK-3710: - Summary: Continuous revision GC Key: OAK-3710 URL: https://issues.apache.org/jira/browse/OAK-3710 Project: Jackrabbit Oak Issue Type: New Feature Components: core, documentmk Reporter: Marcel Reutegger Implement continuous revision GC cleaning up documents older than a given threshold (e.g. one day). This issue is related to OAK-3070 where each GC run starts where the last one finished. This will avoid peak load on the system as we see it right now, when GC is triggered once a day. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (OAK-3712) Clean up old and uncommitted changes
Marcel Reutegger created OAK-3712: - Summary: Clean up old and uncommitted changes Key: OAK-3712 URL: https://issues.apache.org/jira/browse/OAK-3712 Project: Jackrabbit Oak Issue Type: New Feature Components: core, documentmk Reporter: Marcel Reutegger Clean up old and uncommitted changes in the main document. This issue is related to OAK-2392, which is specifically about changes on binary properties and effect on blob GC. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (OAK-2860) RDBBlobStore: seen insert failures due to duplicate keys
[ https://issues.apache.org/jira/browse/OAK-2860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15035713#comment-15035713 ] Julian Reschke commented on OAK-2860: - trunk: http://svn.apache.org/r1678938 1.2: http://svn.apache.org/r1679216 1.0: http://svn.apache.org/r1678951 > RDBBlobStore: seen insert failures due to duplicate keys > > > Key: OAK-2860 > URL: https://issues.apache.org/jira/browse/OAK-2860 > Project: Jackrabbit Oak > Issue Type: Bug > Components: blob, rdbmk >Affects Versions: 1.0.13, 1.2.2 >Reporter: Julian Reschke >Assignee: Julian Reschke > Labels: resilience > Fix For: 1.3.1, 1.0.14, 1.2.3 > > Attachments: OAK-2860.diff > > > In production, we've seen exceptions like this: > {noformat} > org.apache.jackrabbit.oak.plugins.document.rdb.RDBBlobStore insert document > failed for id > bd89b0745aa22429234f17dfc3e2a35b744dc6e86f5e8094a4153b2366c4d822 w > ith length 14691 (check max size of datastore_data.data) > com.ibm.db2.jcc.am.SqlIntegrityConstraintViolationException: DB2 SQL Error: > SQLCODE=-803, SQLSTATE=23505, SQLERRMC=1;DB2INST1.DATASTORE_DATA, > DRIVER=4.16.53 > at com.ibm.db2.jcc.am.fd.a(fd.java:735) > at com.ibm.db2.jcc.am.fd.a(fd.java:60) > at com.ibm.db2.jcc.am.fd.a(fd.java:127) > at com.ibm.db2.jcc.am.to.b(to.java:2422) > at com.ibm.db2.jcc.am.to.c(to.java:2405) > at com.ibm.db2.jcc.t4.ab.l(ab.java:408) > at com.ibm.db2.jcc.t4.ab.a(ab.java:62) > at com.ibm.db2.jcc.t4.o.a(o.java:50) > at com.ibm.db2.jcc.t4.ub.b(ub.java:220) > at com.ibm.db2.jcc.am.uo.sc(uo.java:3526) > at com.ibm.db2.jcc.am.uo.b(uo.java:4489) > at com.ibm.db2.jcc.am.uo.mc(uo.java:2833) > at com.ibm.db2.jcc.am.uo.execute(uo.java:2808) > at sun.reflect.GeneratedMethodAccessor941.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) > at java.lang.reflect.Method.invoke(Method.java:600) > at > 
org.apache.tomcat.jdbc.pool.interceptor.AbstractQueryReport$StatementProxy.invoke(AbstractQueryReport.java:235) > at com.sun.proxy.$Proxy259.execute(Unknown Source) > at sun.reflect.GeneratedMethodAccessor941.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) > at java.lang.reflect.Method.invoke(Method.java:600) > at > org.apache.tomcat.jdbc.pool.interceptor.StatementDecoratorInterceptor$StatementProxy.invoke(StatementDecoratorInterceptor.java:252) > at com.sun.proxy.$Proxy259.execute(Unknown Source) > at > org.apache.jackrabbit.oak.plugins.document.rdb.RDBBlobStore.storeBlockInDatabase(RDBBlobStore.java:374) > at > org.apache.jackrabbit.oak.plugins.document.rdb.RDBBlobStore.storeBlock(RDBBlobStore.java:340) > {noformat} > This seems to indicate that the key is present in _data but not in _meta. We > need to find out whether that's caused by an earlier problem, or whether > storeInBlock is supposed to handle this. > (Note that the actual exception message about "check max size of > datastore_data.data" is misleading; it's due to an earlier attempt to > diagnose DB config problems) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
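The failure mode above (an INSERT hitting a duplicate key because the row already exists in the _data table) is commonly handled with an insert-then-verify pattern: on a duplicate key, check that the stored bytes match instead of failing the write. A sketch over an in-memory map standing in for the table; BlockStoreSketch is hypothetical and not RDBBlobStore's actual code:

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Sketch of an insert-or-verify pattern for a content-addressed block
// store: a duplicate key is tolerated as long as the existing bytes
// match (same hash => same content). The map stands in for the
// datastore_data table; this is not RDBBlobStore's implementation.
public class BlockStoreSketch {

    private final Map<String, byte[]> data = new HashMap<>();

    /** @return true if inserted, false if an identical row already existed */
    public boolean storeBlock(String id, byte[] block) {
        byte[] existing = data.putIfAbsent(id, block);
        if (existing == null) {
            return true; // fresh insert
        }
        // Duplicate key: for a content-addressed store the bytes must
        // match; anything else indicates corruption and should surface.
        if (!Arrays.equals(existing, block)) {
            throw new IllegalStateException("conflicting data for id " + id);
        }
        return false;
    }

    public static void main(String[] args) {
        BlockStoreSketch store = new BlockStoreSketch();
        System.out.println(store.storeBlock("bd89b0", new byte[] {1, 2})); // true
        System.out.println(store.storeBlock("bd89b0", new byte[] {1, 2})); // false
    }
}
```

Whether storeBlock should tolerate the duplicate this way, or whether the _data/_meta mismatch points to an earlier bug, is exactly the open question in the comment.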
[jira] [Created] (OAK-3711) Clean up _revision entries on commit root documents
Marcel Reutegger created OAK-3711: - Summary: Clean up _revision entries on commit root documents Key: OAK-3711 URL: https://issues.apache.org/jira/browse/OAK-3711 Project: Jackrabbit Oak Issue Type: New Feature Components: core, documentmk Reporter: Marcel Reutegger The _revisions entries on commit root documents are currently not cleaned up and accumulate in split documents. One possible solution may be to ensure that there are no uncommitted changes up to certain revisions. Older revisions would then be considered valid and commit information on the commit root document wouldn't be needed anymore. For regular commits this is probably not that difficult. However, changes from branch commits require the merge revision set in the commit entry on the commit root to decide when those changes were made visible to other sessions. A simple solution could be to rewrite such changes with the merge revision. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (OAK-2843) Broadcasting cache
[ https://issues.apache.org/jira/browse/OAK-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15035683#comment-15035683 ] Thomas Mueller commented on OAK-2843: - I could now verify the cache works as expected. My test was: * Two cluster nodes, using the MongoDB document store. * Delete the persistent cache files. * Use the persistent cache setting as follows (OSGi configuration): {noformat} persistentCache="crx-quickstart/repository/cache,size\=1024,binary\=0,broadcast\=tcp:key 123" {noformat} * Read all nodes of the repository (called "traversal check" in our application). * This took 20 seconds (because it had to load all nodes from MongoDB). * Do the same on the other cluster node, which only took 5 seconds. I ran the same test without the broadcasting cache enabled, that is just with {noformat} persistentCache="crx-quickstart/repository/cache,size\=1024,binary\=0" {noformat} The first time it took 24 seconds on _each_ cluster node (because both cluster nodes have to load all data from MongoDB, if the persistent cache is empty). The second time it took 5 seconds. After a restart (but without deleting the local persistent cache), it also took 5 seconds. > Broadcasting cache > -- > > Key: OAK-2843 > URL: https://issues.apache.org/jira/browse/OAK-2843 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: mongomk >Reporter: Thomas Mueller >Assignee: Thomas Mueller > Fix For: 1.3.12 > > > In a cluster environment, we could speed up reading if the cache(s) broadcast > data to other instances. This would avoid bottlenecks at the storage layer > (MongoDB, RDBMs). > The configuration metadata (IP addresses and ports of where to send data to, > a unique identifier of the repository and the cluster nodes, possibly > encryption key) rarely changes and can be stored in the same place as we > store cluster metadata (cluster info collection). That way, in many cases no > manual configuration is needed. We could use TCP/IP and / or UDP. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Moved] (OAK-3713) Remove dep cycle between plugins/tree and spi.security
[ https://issues.apache.org/jira/browse/OAK-3713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] angela moved JCR-3936 to OAK-3713: -- Component/s: (was: core) core Workflow: no-reopen-closed (was: no-reopen-closed, patch-avail) Key: OAK-3713 (was: JCR-3936) Project: Jackrabbit Oak (was: Jackrabbit Content Repository) > Remove dep cycle between plugins/tree and spi.security > -- > > Key: OAK-3713 > URL: https://issues.apache.org/jira/browse/OAK-3713 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: core >Reporter: angela >Assignee: angela > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (OAK-3662) Create bulk createOrUpdate method and use it in Commit
[ https://issues.apache.org/jira/browse/OAK-3662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tomek Rękawek updated OAK-3662: --- Attachment: (was: OAK-3662.patch) > Create bulk createOrUpdate method and use it in Commit > -- > > Key: OAK-3662 > URL: https://issues.apache.org/jira/browse/OAK-3662 > Project: Jackrabbit Oak > Issue Type: Technical task > Components: documentmk >Reporter: Tomek Rękawek > Fix For: 1.4 > > Attachments: OAK-3662.patch > > > The {{DocumentStore#createOrUpdate(Collection, UpdateOp)}} method is invoked > in a loop in the {{Commit#applyToDocumentStore()}}, once for each changed > node. Investigate if it's possible to implement a batch version of the > createOrUpdate method. It should return all documents before they are > modified, so the Commit class can discover conflicts (if there are any). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (OAK-3559) Bulk document updates in MongoDocumentStore
[ https://issues.apache.org/jira/browse/OAK-3559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tomek Rękawek updated OAK-3559: --- Attachment: OAK-3559.patch > Bulk document updates in MongoDocumentStore > --- > > Key: OAK-3559 > URL: https://issues.apache.org/jira/browse/OAK-3559 > Project: Jackrabbit Oak > Issue Type: Sub-task > Components: mongomk >Reporter: Tomek Rękawek > Fix For: 1.4 > > Attachments: OAK-3559.patch > > > Using the MongoDB [Bulk > API|https://docs.mongodb.org/manual/reference/method/Bulk/#Bulk] implement > the [batch version of createOrUpdate method|OAK-3662]. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (OAK-3559) Bulk document updates in MongoDocumentStore
[ https://issues.apache.org/jira/browse/OAK-3559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tomek Rękawek updated OAK-3559: --- Attachment: (was: OAK-3559.patch) > Bulk document updates in MongoDocumentStore > --- > > Key: OAK-3559 > URL: https://issues.apache.org/jira/browse/OAK-3559 > Project: Jackrabbit Oak > Issue Type: Sub-task > Components: mongomk >Reporter: Tomek Rękawek > Fix For: 1.4 > > Attachments: OAK-3559.patch > > > Using the MongoDB [Bulk > API|https://docs.mongodb.org/manual/reference/method/Bulk/#Bulk] implement > the [batch version of createOrUpdate method|OAK-3662]. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (OAK-3662) Create bulk createOrUpdate method and use it in Commit
[ https://issues.apache.org/jira/browse/OAK-3662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tomek Rękawek updated OAK-3662: --- Attachment: OAK-3662.patch > Create bulk createOrUpdate method and use it in Commit > -- > > Key: OAK-3662 > URL: https://issues.apache.org/jira/browse/OAK-3662 > Project: Jackrabbit Oak > Issue Type: Technical task > Components: documentmk >Reporter: Tomek Rękawek > Fix For: 1.4 > > Attachments: OAK-3662.patch > > > The {{DocumentStore#createOrUpdate(Collection, UpdateOp)}} method is invoked > in a loop in the {{Commit#applyToDocumentStore()}}, once for each changed > node. Investigate if it's possible to implement a batch version of the > createOrUpdate method. It should return all documents before they are > modified, so the Commit class can discover conflicts (if there are any). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (OAK-3637) Bulk document updates in RDBDocumentStore
[ https://issues.apache.org/jira/browse/OAK-3637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tomek Rękawek updated OAK-3637: --- Attachment: (was: OAK-3637.patch) > Bulk document updates in RDBDocumentStore > - > > Key: OAK-3637 > URL: https://issues.apache.org/jira/browse/OAK-3637 > Project: Jackrabbit Oak > Issue Type: Technical task > Components: rdbmk >Reporter: Tomek Rękawek > Fix For: 1.4 > > Attachments: OAK-3637.patch > > > Implement the [batch createOrUpdate|OAK-3662] in the RDBDocumentStore. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (OAK-3559) Bulk document updates in MongoDocumentStore
[ https://issues.apache.org/jira/browse/OAK-3559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14986971#comment-14986971 ] Tomek Rękawek edited comment on OAK-3559 at 12/2/15 1:27 PM: - h4. New bulk update method The patch adds a new {{createOrUpdate(Collection collection, List updateOps)}} method to the {{DocumentStore}} interface. The MongoDB implementation uses the Bulk API. The RDB and Memory document stores have been extended with a naive implementation iterating over {{updateOps}}. The Mongo implementation works as follows: 1. For each {{UpdateOp}}, try to read the assigned document from the cache. Add them to {{oldDocs}}. 2. Prepare a list of all {{UpdateOps}} that don't have their documents yet and read them in one {{find()}} call. Add the results to {{oldDocs}}. 3. Prepare a bulk update. For each remaining {{UpdateOp}}, add the following operation: * Find the document with the same id and the same {{mod_count}} as in {{oldDocs}}. * Apply the changes from the {{UpdateOp}}. 4. Execute the bulk update. If some other process modifies the target documents between points 2 and 3, the {{mod_count}} will be increased as well and the bulk update will fail for the concurrently modified docs. The method will then remove the failed documents from {{oldDocs}} and restart the process from point 2. It will stop after the 3rd iteration. h4. Changes in the Commit class The new method is used in {{Commit#applyToDocumentStore}}. If it fails (e.g. there have been more than 3 unsuccessful retries in the Mongo implementation), there is a fallback to the classic approach, applying one update after another. h4. Changes in the CommitQueue and ConflictException Introducing bulk updates means that we may have conflicts in many revisions at the same time. That's the reason why the {{ConflictException}} now contains a revision list, rather than a single revision number. 
In order to resolve conflicts in the {{DocumentNodeStoreBranch#merge0}} method, the {{CommitQueue#suspendUntil()}} has been extended as well. It now allows passing a list of revisions and suspends execution until all of them are visible. was (Author: tomek.rekawek): The pull request has been created here: https://github.com/apache/jackrabbit-oak/pull/43 The patch can be downloaded from: https://patch-diff.githubusercontent.com/raw/apache/jackrabbit-oak/pull/43.diff h4. New bulk update method The patch adds a new {{createOrUpdate(Collection collection, List updateOps)}} method to the {{DocumentStore}} interface. The MongoDB implementation uses the Bulk API. The RDB and Memory document stores have been extended with a naive implementation iterating over {{updateOps}}. The Mongo implementation works as follows: 1. For each {{UpdateOp}}, try to read the assigned document from the cache. Add them to {{oldDocs}}. 2. Prepare a list of all {{UpdateOps}} that don't have their documents yet and read them in one {{find()}} call. Add the results to {{oldDocs}}. 3. Prepare a bulk update. For each remaining {{UpdateOp}}, add the following operation: * Find the document with the same id and the same {{mod_count}} as in {{oldDocs}}. * Apply the changes from the {{UpdateOp}}. 4. Execute the bulk update. If some other process modifies the target documents between points 2 and 3, the {{mod_count}} will be increased as well and the bulk update will fail for the concurrently modified docs. The method will then remove the failed documents from {{oldDocs}} and restart the process from point 2. It will stop after the 3rd iteration. h4. Changes in the Commit class The new method is used in {{Commit#applyToDocumentStore}}. If it fails (e.g. there have been more than 3 unsuccessful retries in the Mongo implementation), there is a fallback to the classic approach, applying one update after another. h4. 
Changes in the CommitQueue and ConflictException Introducing bulk updates means that we may have conflicts in many revisions at the same time. That's the reason why the {{ConflictException}} now contains a revision list, rather than a single revision number. In order to resolve conflicts in the {{DocumentNodeStoreBranch#merge0}} method, the {{CommitQueue#suspendUntil()}} has been extended as well. It now allows passing a list of revisions and suspends execution until all of them are visible. > Bulk document updates in MongoDocumentStore > --- > > Key: OAK-3559 > URL: https://issues.apache.org/jira/browse/OAK-3559 > Project: Jackrabbit Oak > Issue Type: Sub-task > Components: mongomk >Reporter: Tomek Rękawek > Fix For: 1.4 > > Attachments: OAK-3559.patch > > > Using the MongoDB [Bulk > API|https://docs.mongodb.org/manual/reference/method/Bulk/#Bulk] implement > the [batch version of createOrUpdate method|OAK-3662].
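The optimistic retry loop described in the comment above can be sketched in miniature. This is not Oak's actual DocumentStore/UpdateOp API — the in-memory map, the bare modCount long, and an increment standing in for a real update are all simplifying assumptions for illustration:

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch of a bulk createOrUpdate with optimistic concurrency on mod_count.
class BulkUpdateSketch {
    // Simulated document store: document id -> current mod_count.
    static final Map<String, Long> store = new HashMap<>();

    // One bulk pass: an update only applies if the document's mod_count
    // still matches the value snapshotted into oldDocs.
    static List<String> bulkPass(Map<String, Long> oldDocs) {
        List<String> failed = new ArrayList<>();
        for (Map.Entry<String, Long> e : oldDocs.entrySet()) {
            long current = store.getOrDefault(e.getKey(), 0L);
            if (current == e.getValue()) {
                store.put(e.getKey(), current + 1); // "apply" the UpdateOp
            } else {
                failed.add(e.getKey()); // concurrently modified: retry later
            }
        }
        return failed;
    }

    // Retries up to 3 times, re-reading only the failed documents each time.
    static List<String> createOrUpdate(Collection<String> ids) {
        List<String> remaining = new ArrayList<>(ids);
        for (int attempt = 0; attempt < 3 && !remaining.isEmpty(); attempt++) {
            Map<String, Long> oldDocs = new HashMap<>();
            for (String id : remaining) {
                oldDocs.put(id, store.getOrDefault(id, 0L)); // re-read docs
            }
            remaining = bulkPass(oldDocs);
        }
        return remaining; // ids still conflicting after 3 iterations
    }
}
```

A caller in the spirit of Commit#applyToDocumentStore would treat a non-empty return value as the signal to fall back to applying the updates one by one.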
[jira] [Updated] (OAK-3713) Remove dep cycle between plugins/tree/TreeTypeProvider and spi.security
[ https://issues.apache.org/jira/browse/OAK-3713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] angela updated OAK-3713: Summary: Remove dep cycle between plugins/tree/TreeTypeProvider and spi.security (was: Remove dep cycle between plugins/tree and spi.security) > Remove dep cycle between plugins/tree/TreeTypeProvider and spi.security > --- > > Key: OAK-3713 > URL: https://issues.apache.org/jira/browse/OAK-3713 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: core >Reporter: angela >Assignee: angela > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (OAK-3637) Bulk document updates in RDBDocumentStore
[ https://issues.apache.org/jira/browse/OAK-3637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tomek Rękawek updated OAK-3637: --- Attachment: OAK-3637.patch > Bulk document updates in RDBDocumentStore > - > > Key: OAK-3637 > URL: https://issues.apache.org/jira/browse/OAK-3637 > Project: Jackrabbit Oak > Issue Type: Technical task > Components: rdbmk >Reporter: Tomek Rękawek > Fix For: 1.4 > > Attachments: OAK-3637.patch > > > Implement the [batch createOrUpdate|OAK-3662] in the RDBDocumentStore. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (OAK-3494) MemoryDiffCache should also check parent paths before falling to Loader (or returning null)
[ https://issues.apache.org/jira/browse/OAK-3494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15036670#comment-15036670 ] Vikas Saurabh commented on OAK-3494: Backported into 1.2 at http://svn.apache.org/r1717683 > MemoryDiffCache should also check parent paths before falling to Loader (or > returning null) > --- > > Key: OAK-3494 > URL: https://issues.apache.org/jira/browse/OAK-3494 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: core, mongomk >Reporter: Vikas Saurabh >Assignee: Marcel Reutegger > Labels: candidate_oak_1_0, candidate_oak_1_2, performance > Fix For: 1.3.10, 1.2.9 > > Attachments: OAK-3494-1.patch, OAK-3494-2.patch, > OAK-3494-TestCase.patch, OAK-3494.patch > > > Each entry in {{MemoryDiffCache}} is keyed with {{(path, fromRev, toRev)}} > for the list of modified children at {{path}}. A diff calculated by > {{DocumentNodeStore.diffImpl}} at '/' (passively via loader) or > {{JournalEntry.applyTo}} (actively) fills each path for which there are > modified children (including the hierarchy). > But, if an observer calls {{compareWithBaseState}} on an unmodified sub-tree, > the observer will still go down to {{diffImpl}} although the cached parent entry > can be used to answer the query. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
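The improvement the issue describes — consulting cached ancestor entries before falling through to the loader — can be sketched roughly as follows. The map-based cache and the string return value are stand-ins, not Oak's real MemoryDiffCache internals:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Sketch: answer diff queries for unmodified sub-trees from a cached
// ancestor entry instead of falling through to the expensive loader.
class DiffCacheSketch {
    // Cache: path -> names of children modified between (fromRev, toRev).
    static final Map<String, Set<String>> cache = new HashMap<>();

    static String getDiff(String path) {
        Set<String> entry = cache.get(path);
        if (entry != null) {
            return String.join(",", entry);
        }
        // Walk up the hierarchy: if a cached ancestor does not list the
        // next path segment as modified, the whole sub-tree is unchanged.
        String p = path;
        while (!p.equals("/")) {
            int idx = p.lastIndexOf('/');
            String parent = idx == 0 ? "/" : p.substring(0, idx);
            String name = p.substring(idx + 1);
            Set<String> parentEntry = cache.get(parent);
            if (parentEntry != null) {
                // Modified along the way: nothing can be concluded, so
                // fall back to the loader (diffImpl); otherwise empty diff.
                return parentEntry.contains(name) ? loadDiff(path) : "";
            }
            p = parent;
        }
        return loadDiff(path); // no cached ancestor at all
    }

    // Stand-in for the expensive DocumentNodeStore.diffImpl call.
    static String loadDiff(String path) {
        return "<loaded:" + path + ">";
    }
}
```

With a cached root entry listing only child "a" as modified, a query for "/b" (or anything below it) returns an empty diff without touching the loader.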
[jira] [Updated] (OAK-3494) MemoryDiffCache should also check parent paths before falling to Loader (or returning null)
[ https://issues.apache.org/jira/browse/OAK-3494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikas Saurabh updated OAK-3494: --- Fix Version/s: 1.2.9 > MemoryDiffCache should also check parent paths before falling to Loader (or > returning null) > --- > > Key: OAK-3494 > URL: https://issues.apache.org/jira/browse/OAK-3494 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: core, mongomk >Reporter: Vikas Saurabh >Assignee: Marcel Reutegger > Labels: candidate_oak_1_0, candidate_oak_1_2, performance > Fix For: 1.3.10, 1.2.9 > > Attachments: OAK-3494-1.patch, OAK-3494-2.patch, > OAK-3494-TestCase.patch, OAK-3494.patch > > > Each entry in {{MemoryDiffCache}} is keyed with {{(path, fromRev, toRev)}} > for the list of modified children at {{path}}. A diff calculated by > {{DocumentNodeStore.diffImpl}} at '/' (passively via loader) or > {{JournalEntry.applyTo}} (actively) fills each path for which there are > modified children (including the hierarchy). > But, if an observer calls {{compareWithBaseState}} on an unmodified sub-tree, > the observer will still go down to {{diffImpl}} although the cached parent entry > can be used to answer the query. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (OAK-3714) RDBDocumentStore diagnostics for Oracle might not contain index information
Julian Reschke created OAK-3714: --- Summary: RDBDocumentStore diagnostics for Oracle might not contain index information Key: OAK-3714 URL: https://issues.apache.org/jira/browse/OAK-3714 Project: Jackrabbit Oak Issue Type: Technical task Affects Versions: 1.0.24, 1.2.8, 1.3.11 Reporter: Julian Reschke Assignee: Julian Reschke Priority: Minor ...when the table name contains lowercase characters. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (OAK-3713) Remove dep cycle between plugins/tree/TreeTypeProvider and spi.security
[ https://issues.apache.org/jira/browse/OAK-3713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] angela resolved OAK-3713. - Resolution: Fixed Fix Version/s: 1.3.12 > Remove dep cycle between plugins/tree/TreeTypeProvider and spi.security > --- > > Key: OAK-3713 > URL: https://issues.apache.org/jira/browse/OAK-3713 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: core >Reporter: angela >Assignee: angela > Fix For: 1.3.12 > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (OAK-3715) SegmentWriter reduce buffer size for reading binaries
Alex Parvulescu created OAK-3715: Summary: SegmentWriter reduce buffer size for reading binaries Key: OAK-3715 URL: https://issues.apache.org/jira/browse/OAK-3715 Project: Jackrabbit Oak Issue Type: Improvement Components: segmentmk Reporter: Alex Parvulescu Assignee: Alex Parvulescu Priority: Minor The SegmentWriter uses an initial buffer size of 256k for reading input stream binaries that need to be persisted, then it checks if the input is smaller than 16k to verify if it can be inlined or not. In case the input binary is small and can be inlined (<16k), the initial buffer size is too wasteful and could be reduced to 16k, and _if needed_ increased to 256k after the threshold check is passed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (OAK-3715) SegmentWriter reduce buffer size for reading binaries
[ https://issues.apache.org/jira/browse/OAK-3715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Parvulescu updated OAK-3715: - Attachment: OAK-3715.patch proposed patch. fyi [~mduerig], [~tmueller] > SegmentWriter reduce buffer size for reading binaries > - > > Key: OAK-3715 > URL: https://issues.apache.org/jira/browse/OAK-3715 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: segmentmk >Reporter: Alex Parvulescu >Assignee: Alex Parvulescu >Priority: Minor > Attachments: OAK-3715.patch > > > The SegmentWriter uses an initial buffer size of 256k for reading input > stream binaries that need to be persisted, then it checks if the input is > smaller than 16k to verify if it can be inlined or not. > In case the input binary is small and can be inlined (<16k), the initial > buffer size is too wasteful and could be reduced to 16k, and _if needed_ > increased to 256k after the threshold check is passed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (OAK-3715) SegmentWriter reduce buffer size for reading binaries
[ https://issues.apache.org/jira/browse/OAK-3715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Parvulescu updated OAK-3715: - Description: The SegmentWriter uses an initial buffer size of 256k for reading input stream binaries that need to be persisted, then it checks if the input is smaller than 16k to verify if it can be inlined or not. [0] In case the input binary is small and can be inlined (<16k), the initial buffer size is too wasteful and could be reduced to 16k, and if needed increased to 256k after the threshold check is passed. [0] https://github.com/apache/jackrabbit-oak/blob/trunk/oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/segment/SegmentWriter.java#L495 was: The SegmentWriter uses an initial buffer size of 256k for reading input stream binaries that need to be persisted, then it checks if the input is smaller than 16k to verify if it can be inlined or not. In case the input binary is small and can be inlined (<16k), the initial buffer size is too wasteful and could be reduced to 16k, and _if needed_ increased to 256k after the threshold check is passed. > SegmentWriter reduce buffer size for reading binaries > - > > Key: OAK-3715 > URL: https://issues.apache.org/jira/browse/OAK-3715 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: segmentmk >Reporter: Alex Parvulescu >Assignee: Alex Parvulescu >Priority: Minor > Attachments: OAK-3715.patch > > > The SegmentWriter uses an initial buffer size of 256k for reading input > stream binaries that need to be persisted, then it checks if the input is > smaller than 16k to verify if it can be inlined or not. [0] > In case the input binary is small and can be inlined (<16k), the initial > buffer size is too wasteful and could be reduced to 16k, and if needed > increased to 256k after the threshold check is passed. 
> [0] > https://github.com/apache/jackrabbit-oak/blob/trunk/oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/segment/SegmentWriter.java#L495 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
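The two-stage buffering proposed in OAK-3715 can be illustrated with a small standalone reader. The 16k/256k constants come from the issue text; the method name and everything else here is a hypothetical sketch, not SegmentWriter's actual code:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

// Sketch: read with a small buffer first; only allocate the large one
// once the stream is proven bigger than the inline threshold.
class AdaptiveReadSketch {
    static final int INLINE_THRESHOLD = 16 * 1024;   // 16k inline limit
    static final int LARGE_BUFFER = 256 * 1024;      // 256k bulk buffer

    static byte[] readAll(InputStream in) throws IOException {
        // Read threshold + 1 bytes: enough to decide inlineability
        // without ever allocating the 256k buffer for small binaries.
        byte[] small = new byte[INLINE_THRESHOLD + 1];
        int n = 0;
        while (n < small.length) {
            int r = in.read(small, n, small.length - n);
            if (r < 0) {
                break; // end of stream
            }
            n += r;
        }
        if (n <= INLINE_THRESHOLD) {
            return Arrays.copyOf(small, n); // small binary: would be inlined
        }
        // Larger than the threshold: switch to the big buffer from here on.
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(small, 0, n);
        byte[] large = new byte[LARGE_BUFFER];
        int r;
        while ((r = in.read(large)) >= 0) {
            out.write(large, 0, r);
        }
        return out.toByteArray();
    }
}
```

For a binary under the 16k threshold this allocates roughly 16k instead of 256k, which is the saving the issue is after.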
[jira] [Commented] (OAK-3710) Continuous revision GC
[ https://issues.apache.org/jira/browse/OAK-3710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15035849#comment-15035849 ] Marcel Reutegger commented on OAK-3710: --- Had an offline discussion with Chetan and Vikas about how to implement this feature. The basic ideas are: - Remember T' as the lowest revision time of _lastRev entries on the root document. - Scan through documents that have a _modified >= T, where T is read from the settings collection. Use a value of 0 if T is undefined. - For each document: -- remove changes (committed and uncommitted) that are older than {{maxRevisionAge}} (see also OAK-3712) -- rewrite commit entries of remaining committed changes and set local _revisions entries accordingly (may collide with split operations!) - Store T' in the settings collection as the starting point of the next cycle - Remove split documents with {{_sdMaxRevTime}} < T (see also OAK-3711) In addition, it would also be good to change the way documents are split. Currently _commitRoot entries are moved to previous documents. I think it would be better to rewrite the change on split and replace _commitRoot with _revisions entries with the correct commit value. This reduces dependency on the commit root document. > Continuous revision GC > -- > > Key: OAK-3710 > URL: https://issues.apache.org/jira/browse/OAK-3710 > Project: Jackrabbit Oak > Issue Type: New Feature > Components: core, documentmk >Reporter: Marcel Reutegger > > Implement continuous revision GC cleaning up documents older than a given > threshold (e.g. one day). This issue is related to OAK-3070 where each GC run > starts where the last one finished. > This will avoid peak load on the system as we see it right now, when GC is > triggered once a day. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
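The checkpointed scan sketched in the discussion above might look like this in outline. The settings-collection checkpoint and the _modified index are simulated with plain fields, and the cleanup itself is reduced to collecting ids — all of this is illustrative, not Oak code:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch: incremental revision GC where each cycle only considers
// documents modified since the previous cycle's checkpoint T.
class ContinuousGcSketch {
    static long checkpoint = 0;                       // T (0 when undefined)
    static final Map<String, Long> modified = new HashMap<>(); // id -> _modified

    // One GC cycle: visit documents with _modified >= T, collect those whose
    // changes are older than maxRevisionAge, then persist T' for next time.
    static List<String> runCycle(long now, long maxRevisionAge) {
        List<String> cleaned = new ArrayList<>();
        for (Map.Entry<String, Long> e : modified.entrySet()) {
            if (e.getValue() >= checkpoint && now - e.getValue() > maxRevisionAge) {
                cleaned.add(e.getKey()); // old changes would be removed here
            }
        }
        checkpoint = now; // store T' in the settings collection
        return cleaned;
    }
}
```

Because each run advances the checkpoint, work is spread over many small cycles rather than one daily scan over everything, which is the peak-load problem the issue describes.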
[jira] [Updated] (OAK-3611) upgrade H2DB dependency to 1.4.190
[ https://issues.apache.org/jira/browse/OAK-3611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Mueller updated OAK-3611: Fix Version/s: 1.3.12 > upgrade H2DB dependency to 1.4.190 > -- > > Key: OAK-3611 > URL: https://issues.apache.org/jira/browse/OAK-3611 > Project: Jackrabbit Oak > Issue Type: Task > Components: core >Reporter: Julian Reschke >Assignee: Thomas Mueller > Fix For: 1.3.12 > > > (we are currently at 1.4.185) -- This message was sent by Atlassian JIRA (v6.3.4#6332)