[jira] [Updated] (OAK-3529) NodeStore API should expose an Instance ID

2015-12-02 Thread Davide Giannella (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Davide Giannella updated OAK-3529:
--
Attachment: OAK-3529-3.patch

Attaching a [third patch|^OAK-3529-3.patch]. I forgot the OSGi aspects of the
DocumentNodeStore registration and the baseline plugin complaints.

[~mduerig] please review.

> NodeStore API should expose an Instance ID
> --
>
> Key: OAK-3529
> URL: https://issues.apache.org/jira/browse/OAK-3529
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: core
>Reporter: Davide Giannella
>Assignee: Davide Giannella
> Fix For: 1.4
>
> Attachments: OAK-3529-1.patch, OAK-3529-2.patch, OAK-3529-3.patch
>
>
> To better leverage cluster-oriented algorithms (discovery, atomic
> operations), it would be very helpful if the NodeStore could expose a
> unique instance ID.
> This can be the same as a cluster ID.
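For illustration, a minimal sketch of what such an exposure could look like; the {{Clusterable}} name and method are illustrative assumptions, not necessarily what the attached patch does:

{code:java}
// Illustrative sketch only; names are assumptions, not the patch contents.
public interface Clusterable {

    /**
     * A stable, unique identifier for this NodeStore instance.
     * For a DocumentNodeStore this could be derived from the cluster id.
     */
    String getInstanceId();
}
{code}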



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (OAK-3637) Bulk document updates in RDBDocumentStore

2015-12-02 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/OAK-3637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tomek Rękawek updated OAK-3637:
---
Attachment: (was: OAK-3637.patch)

> Bulk document updates in RDBDocumentStore
> -
>
> Key: OAK-3637
> URL: https://issues.apache.org/jira/browse/OAK-3637
> Project: Jackrabbit Oak
>  Issue Type: Technical task
>  Components: rdbmk
>Reporter: Tomek Rękawek
> Fix For: 1.4
>
> Attachments: OAK-3637.patch
>
>
> Implement the [batch createOrUpdate|OAK-3662] in the RDBDocumentStore.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (OAK-3714) RDBDocumentStore diagnostics for Oracle might not contain index information

2015-12-02 Thread Julian Reschke (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Reschke updated OAK-3714:

Fix Version/s: 1.2.9

> RDBDocumentStore diagnostics for Oracle might not contain index information
> ---
>
> Key: OAK-3714
> URL: https://issues.apache.org/jira/browse/OAK-3714
> Project: Jackrabbit Oak
>  Issue Type: Technical task
>  Components: documentmk, rdbmk
>Affects Versions: 1.3.11, 1.2.8, 1.0.24
>Reporter: Julian Reschke
>Assignee: Julian Reschke
>Priority: Minor
> Fix For: 1.2.9, 1.3.12
>
>
> ...when the table name contains lowercase characters.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (OAK-3716) Rewrite change on split

2015-12-02 Thread Marcel Reutegger (JIRA)
Marcel Reutegger created OAK-3716:
-

 Summary: Rewrite change on split
 Key: OAK-3716
 URL: https://issues.apache.org/jira/browse/OAK-3716
 Project: Jackrabbit Oak
  Issue Type: Improvement
  Components: core, documentmk
Reporter: Marcel Reutegger


For continuous revision GC and performance reasons it may be beneficial to
rewrite changes when they are split to a previous document. A _commitRoot entry
should be replaced with a corresponding _revisions entry, and the revision of a
branch change replaced with the corresponding merge revision.
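To illustrate the proposed rewrite (hypothetical entries, not taken from an actual document):

{noformat}
before split (commit info lives on a separate commit root):
  _commitRoot : { "r123-0-1" : "0" }

after split (the previous document carries the commit info itself):
  _revisions  : { "r123-0-1" : "c" }
{noformat}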



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (OAK-3611) upgrade H2DB dependency to 1.4.190

2015-12-02 Thread Julian Reschke (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-3611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15035862#comment-15035862
 ] 

Julian Reschke commented on OAK-3611:
-

...if this is the right thing for trunk, the same should be true for 1.2 and 
1.0, no?

> upgrade H2DB dependency to 1.4.190
> --
>
> Key: OAK-3611
> URL: https://issues.apache.org/jira/browse/OAK-3611
> Project: Jackrabbit Oak
>  Issue Type: Task
>  Components: core
>Reporter: Julian Reschke
>Assignee: Thomas Mueller
> Fix For: 1.3.12
>
>
> (we are currently at 1.4.185)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (OAK-3701) Exception in JcrRemotingServlet at startup

2015-12-02 Thread Chetan Mehrotra (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Mehrotra resolved OAK-3701.
--
Resolution: Fixed

The backing fix in Jackrabbit will be part of 2.11.3. On the Oak side, switched
to XML config with r1717633.

> Exception in JcrRemotingServlet at startup
> --
>
> Key: OAK-3701
> URL: https://issues.apache.org/jira/browse/OAK-3701
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: examples
>Reporter: Chetan Mehrotra
>Assignee: Chetan Mehrotra
>Priority: Minor
> Fix For: 1.3.12
>
>
> On startup currently an exception is seen related to protected handlers 
> loading
> {noformat}
> 01.12.2015 11:10:36.194 *ERROR* [main] 
> org.apache.jackrabbit.server.remoting.davex.ProtectedRemoveManager 
> /WEB-INF/protectedHandlers.properties
> java.lang.ClassNotFoundException: /WEB-INF/protectedHandlers.properties
> at java.lang.Class.forName0(Native Method) ~[na:1.7.0_55]
> at java.lang.Class.forName(Class.java:190) ~[na:1.7.0_55]
> at 
> org.apache.jackrabbit.server.remoting.davex.ProtectedRemoveManager.createHandler(ProtectedRemoveManager.java:84)
>  [jackrabbit-jcr-server-2.11.3-SNAPSHOT.jar:na]
> at 
> org.apache.jackrabbit.server.remoting.davex.ProtectedRemoveManager.(ProtectedRemoveManager.java:56)
>  [jackrabbit-jcr-server-2.11.3-SNAPSHOT.jar:na]
> at 
> org.apache.jackrabbit.server.remoting.davex.JcrRemotingServlet.init(JcrRemotingServlet.java:276)
>  [jackrabbit-jcr-server-2.11.3-SNAPSHOT.jar:na]
> at javax.servlet.GenericServlet.init(GenericServlet.java:244) 
> [javax.servlet-api-3.1.0.jar:3.1.0]
> at 
> org.eclipse.jetty.servlet.ServletHolder.initServlet(ServletHolder.java:612) 
> [jetty-servlet-9.2.8.v20150217.jar:9.2.8.v20150217]
> at 
> org.eclipse.jetty.servlet.ServletHolder.initialize(ServletHolder.java:395) 
> [jetty-servlet-9.2.8.v20150217.jar:9.2.8.v20150217]
> at 
> org.eclipse.jetty.servlet.ServletHandler.initialize(ServletHandler.java:871) 
> [jetty-servlet-9.2.8.v20150217.jar:9.2.8.v20150217]
> at 
> org.eclipse.jetty.servlet.ServletContextHandler.startContext(ServletContextHandler.java:298)
>  [jetty-servlet-9.2.8.v20150217.jar:9.2.8.v20150217]
> at 
> org.eclipse.jetty.webapp.WebAppContext.startWebapp(WebAppContext.java:1349) 
> [jetty-webapp-9.2.8.v20150217.jar:9.2.8.v20150217]
> at 
> org.eclipse.jetty.maven.plugin.JettyWebAppContext.startWebapp(JettyWebAppContext.java:296)
>  [jetty-maven-plugin-9.2.8.v20150217.jar:9.2.8.v20150217]
> at 
> org.eclipse.jetty.webapp.WebAppContext.startContext(WebAppContext.java:1342) 
> [jetty-webapp-9.2.8.v20150217.jar:9.2.8.v20150217]
> at 
> org.eclipse.jetty.server.handler.ContextHandler.doStart(ContextHandler.java:741)
>  [jetty-server-9.2.8.v20150217.jar:9.2.8.v20150217]
> at 
> org.eclipse.jetty.webapp.WebAppContext.doStart(WebAppContext.java:505) 
> [jetty-webapp-9.2.8.v20150217.jar:9.2.8.v20150217]
> at 
> org.eclipse.jetty.maven.plugin.JettyWebAppContext.doStart(JettyWebAppContext.java:365)
>  [jetty-maven-plugin-9.2.8.v20150217.jar:9.2.8.v20150217]
> at 
> org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:68)
>  [jetty-util-9.2.8.v20150217.jar:9.2.8.v20150217]
> at 
> org.eclipse.jetty.util.component.ContainerLifeCycle.start(ContainerLifeCycle.java:132)
>  [jetty-util-9.2.8.v20150217.jar:9.2.8.v20150217]
> at 
> org.eclipse.jetty.util.component.ContainerLifeCycle.doStart(ContainerLifeCycle.java:114)
>  [jetty-util-9.2.8.v20150217.jar:9.2.8.v20150217]
> at 
> org.eclipse.jetty.server.handler.AbstractHandler.doStart(AbstractHandler.java:61)
>  [jetty-server-9.2.8.v20150217.jar:9.2.8.v20150217]
> at 
> org.eclipse.jetty.server.handler.ContextHandlerCollection.doStart(ContextHandlerCollection.java:163)
>  [jetty-server-9.2.8.v20150217.jar:9.2.8.v20150217]
> at 
> org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:68)
>  [jetty-util-9.2.8.v20150217.jar:9.2.8.v20150217]
> at 
> org.eclipse.jetty.util.component.ContainerLifeCycle.start(ContainerLifeCycle.java:132)
>  [jetty-util-9.2.8.v20150217.jar:9.2.8.v20150217]
> at 
> org.eclipse.jetty.util.component.ContainerLifeCycle.doStart(ContainerLifeCycle.java:114)
>  [jetty-util-9.2.8.v20150217.jar:9.2.8.v20150217]
> at 
> org.eclipse.jetty.server.handler.AbstractHandler.doStart(AbstractHandler.java:61)
>  [jetty-server-9.2.8.v20150217.jar:9.2.8.v20150217]
> at 
> org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:68)
>  [jetty-util-9.2.8.v20150217.jar:9.2.8.v20150217]
> at 
> 

[jira] [Commented] (OAK-3716) Rewrite change on split

2015-12-02 Thread Marcel Reutegger (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-3716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15035865#comment-15035865
 ] 

Marcel Reutegger commented on OAK-3716:
---

We'd have to be careful when such a change is introduced because previous 
documents with _revisions entries are currently not garbage collected, whereas 
documents that only contain _commitRoot entries are garbage collected.

> Rewrite change on split
> ---
>
> Key: OAK-3716
> URL: https://issues.apache.org/jira/browse/OAK-3716
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: core, documentmk
>Reporter: Marcel Reutegger
>
> For continuous revision GC and performance reasons it may be beneficial to 
> rewrite changes when they are split to a previous document. A _commitRoot 
> entry should be replaced with a corresponding _revisions entry, and the 
> revision of a branch change replaced with the corresponding merge revision.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (OAK-3637) Bulk document updates in RDBDocumentStore

2015-12-02 Thread Julian Reschke (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-3637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15022202#comment-15022202
 ] 

Julian Reschke edited comment on OAK-3637 at 12/2/15 2:49 PM:
--

I've run the {{CreateManyChildNodesTest}} benchmark using MySQL. The number of
children created in one iteration is set to 100 (I wanted to have at least a
few iterations during a 5-minute test). Results are as follows:

{noformat}
###  latency: 0ms, bulk size: 100 ###

                        C     min     10%     50%     90%     max       N
bulk (OAK-3637)         1      87      92      98     118     450    1577
sequential (SNAPSHOT)   1     368     375     397     461     862     470

### latency: 20ms, bulk size: 100 ###

                        C     min     10%     50%     90%     max       N
bulk (OAK-3637)         1    7726    7754    7872    7947    7973      22
sequential (SNAPSHOT)   1   42639   42639   42686   42769   42769       5
{noformat}
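For context on why the bulk path is so much faster, the effect is essentially JDBC batching: one round trip per batch instead of one per document. A minimal, hypothetical sketch (table and column names invented, not the actual RDBDocumentStore schema):

{code:java}
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;
import java.util.Map;

class BulkUpdateSketch {
    // updates is a list of (id, serialized data) pairs
    void updateBulk(Connection con, List<Map.Entry<String, String>> updates)
            throws SQLException {
        PreparedStatement stmt = con.prepareStatement(
                "update NODES set DATA = ? where ID = ?");
        try {
            for (Map.Entry<String, String> doc : updates) {
                stmt.setString(1, doc.getValue());
                stmt.setString(2, doc.getKey());
                stmt.addBatch();         // queued client-side, no round trip yet
            }
            stmt.executeBatch();         // one round trip for the whole batch
        } finally {
            stmt.close();
        }
    }
}
{code}

With 20ms network latency, the sequential path pays that latency for every statement, which is where the large sequential numbers above come from.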


was (Author: tomek.rekawek):
I've run the {{CreateManyChildNodesTest}} benchmark, using the MySQL. The numer 
of children created in one iteration is set to 100 (I wanted to have at least a 
few iterations during a 5-minute test). Results are as follows:

{noformat}
###  latency: 0ms, bulk size: 100 ###

                        C     min     10%     50%     90%     max       N
bulk (OAK-3637)         1      87      92      98     118     450    1577
sequential (SNAPSHOT)   1     368     375     397     461     862     470

### latency: 20ms, bulk size: 100 ###

                        C     min     10%     50%     90%     max       N
bulk (OAK-3637)         1    7726    7754    7872    7947    7973      22
sequential (SNAPSHOT)   1   42639   42639   42686   42769   42769       5
{noformat}

> Bulk document updates in RDBDocumentStore
> -
>
> Key: OAK-3637
> URL: https://issues.apache.org/jira/browse/OAK-3637
> Project: Jackrabbit Oak
>  Issue Type: Technical task
>  Components: rdbmk
>Reporter: Tomek Rękawek
> Fix For: 1.4
>
> Attachments: OAK-3637.patch
>
>
> Implement the [batch createOrUpdate|OAK-3662] in the RDBDocumentStore.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (OAK-3637) Bulk document updates in RDBDocumentStore

2015-12-02 Thread Julian Reschke (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-3637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15036018#comment-15036018
 ] 

Julian Reschke commented on OAK-3637:
-

Tried the patches without changing the test...:
{{code}}
# CreateManyChildNodesTest      C     min     10%     50%     90%     max       N
Oak-RDB                         1   19490   19490   19675   21248   21248       7
{{code}}
{{code}}
# CreateManyChildNodesTest      C     min     10%     50%     90%     max       N
Oak-RDB                         1    5593    5672    5793    5878    6223      30
{{code}}

So yes, a nice win!


> Bulk document updates in RDBDocumentStore
> -
>
> Key: OAK-3637
> URL: https://issues.apache.org/jira/browse/OAK-3637
> Project: Jackrabbit Oak
>  Issue Type: Technical task
>  Components: rdbmk
>Reporter: Tomek Rękawek
> Fix For: 1.4
>
> Attachments: OAK-3637.patch
>
>
> Implement the [batch createOrUpdate|OAK-3662] in the RDBDocumentStore.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (OAK-3637) Bulk document updates in RDBDocumentStore

2015-12-02 Thread Julian Reschke (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-3637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15036018#comment-15036018
 ] 

Julian Reschke edited comment on OAK-3637 at 12/2/15 4:01 PM:
--

Tried the patches without changing the test...:
{noformat}
# CreateManyChildNodesTest      C     min     10%     50%     90%     max       N
Oak-RDB                         1   19490   19490   19675   21248   21248       7
{noformat}
{noformat}
# CreateManyChildNodesTest      C     min     10%     50%     90%     max       N
Oak-RDB                         1    5593    5672    5793    5878    6223      30
{noformat}

So yes, a nice win!



was (Author: reschke):
Tried the patches without changing the test...:
{{code}}
# CreateManyChildNodesTest      C     min     10%     50%     90%     max       N
Oak-RDB                         1   19490   19490   19675   21248   21248       7
{{code}}
{{code}}
# CreateManyChildNodesTest      C     min     10%     50%     90%     max       N
Oak-RDB                         1    5593    5672    5793    5878    6223      30
{{code}}

So yes, a nice win!


> Bulk document updates in RDBDocumentStore
> -
>
> Key: OAK-3637
> URL: https://issues.apache.org/jira/browse/OAK-3637
> Project: Jackrabbit Oak
>  Issue Type: Technical task
>  Components: rdbmk
>Reporter: Tomek Rękawek
> Fix For: 1.4
>
> Attachments: OAK-3637.patch
>
>
> Implement the [batch createOrUpdate|OAK-3662] in the RDBDocumentStore.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (OAK-3436) Prevent missing checkpoint due to unstable topology from causing complete reindexing

2015-12-02 Thread Julian Sedding (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-3436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15036063#comment-15036063
 ] 

Julian Sedding commented on OAK-3436:
-

Cons for cluster node specific async indexing
- Diffs need to be generated on all nodes -> increased load on mongo/rdb
- More checkpoints -> more garbage that cannot be collected -> larger db size

Also
- Would each cluster node have its own index data?
-- If no: how would the data be merged?
-- If yes:
--- con: storage overhead
--- con: new cluster nodes need to index everything after joining the cluster 
(can possibly be remedied fairly easily)

If this doesn't make any sense, I probably misunderstood the suggestion.


> Prevent missing checkpoint due to unstable topology from causing complete 
> reindexing
> 
>
> Key: OAK-3436
> URL: https://issues.apache.org/jira/browse/OAK-3436
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: query
>Reporter: Chetan Mehrotra
>Assignee: Chetan Mehrotra
>  Labels: resilience
> Fix For: 1.2.9, 1.0.25, 1.3.12
>
> Attachments: AsyncIndexUpdateClusterTest.java, OAK-3436-0.patch
>
>
> Async indexing logic relies on the embedding application to ensure that the 
> async indexing job runs as a singleton in a cluster. For Sling based apps it 
> depends on Sling Discovery support. At times it has been seen that if the 
> topology is not stable, different cluster nodes can consider themselves the 
> leader and execute the async indexing job concurrently.
> This can cause problems, as the two cluster nodes might not see the same 
> repository state (due to write skew and eventual consistency) and might 
> remove a checkpoint which the other cluster node is still relying upon. For 
> example, consider a 2 node cluster N1 and N2 where both are performing async 
> indexing.
> # Base state - CP1 is the checkpoint for the "async" job
> # N2 starts indexing and replaces checkpoint CP1 with CP2. For Mongo the 
> checkpoints are saved in the {{settings}} collection
> # N1 also decides to execute indexing but has not yet seen the latest 
> repository state, so it still thinks that CP1 is the base checkpoint and 
> tries to read it. However, CP1 has already been removed from {{settings}}, 
> which makes N1 think that the checkpoint is missing, and it decides to 
> reindex everything!
> To avoid this the topology must be stable, but at the Oak level we should 
> still handle such a case and avoid doing a full reindex. So we would need a 
> {{MissingCheckpointStrategy}}, similar to the {{MissingIndexEditorStrategy}} 
> introduced in OAK-2203.
> Possible approaches (a rough sketch of A2 follows this description):
> # A1 - Fail the indexing run if the checkpoint is missing - A checkpoint can 
> go missing for both valid and invalid reasons. Need to see what the valid 
> scenarios are in which a checkpoint can go missing
> # A2 - When a checkpoint is created, also store the creation time. When a 
> checkpoint is found to be missing and it is a *recent* checkpoint, fail the 
> run. For example, we would fail the run while the missing checkpoint is less 
> than an hour old (for a just-started instance, take startup time into account)
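A rough sketch of approach A2, under stated assumptions ({{getCheckpointCreationTime}} is a hypothetical helper; the real strategy wiring would differ):

{code:java}
// Sketch only: fail the run for a recent missing checkpoint, since it was
// then most likely removed by a concurrent indexer rather than expired.
boolean shouldFullReindex(NodeStore store, String checkpoint, long nowMillis) {
    if (store.retrieve(checkpoint) != null) {
        return false;                        // checkpoint present, normal run
    }
    long created = getCheckpointCreationTime(checkpoint); // hypothetical
    if (nowMillis - created < 60 * 60 * 1000L) {
        // recent checkpoint gone missing: fail this run and retry later
        throw new IllegalStateException("checkpoint " + checkpoint + " missing");
    }
    return true;                             // genuinely stale: full reindex
}
{code}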



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (OAK-3692) java.lang.NoClassDefFoundError: org/apache/lucene/index/sorter/Sorter$DocComparator

2015-12-02 Thread Tommaso Teofili (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-3692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15035875#comment-15035875
 ] 

Tommaso Teofili commented on OAK-3692:
--

thanks a lot [~catholicon] and [~chetanm] for looking into this and fixing it :)

> java.lang.NoClassDefFoundError: 
> org/apache/lucene/index/sorter/Sorter$DocComparator
> ---
>
> Key: OAK-3692
> URL: https://issues.apache.org/jira/browse/OAK-3692
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: lucene
>Reporter: Vikas Saurabh
>Assignee: Vikas Saurabh
>Priority: Blocker
> Fix For: 1.3.12
>
> Attachments: OAK-3692.patch
>
>
> I'm getting following exception while trying to include oak trunk build into 
> AEM:
> {noformat}
> 27.11.2015 20:41:25.946 *ERROR* [oak-lucene-2] 
> org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexProvider Uncaught 
> exception in 
> org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexProvider@36ba558b
> java.lang.NoClassDefFoundError: 
> org/apache/lucene/index/sorter/Sorter$DocComparator
> at 
> org.apache.jackrabbit.oak.plugins.index.lucene.util.SuggestHelper.getLookup(SuggestHelper.java:108)
> at 
> org.apache.jackrabbit.oak.plugins.index.lucene.IndexNode.(IndexNode.java:106)
> at 
> org.apache.jackrabbit.oak.plugins.index.lucene.IndexNode.open(IndexNode.java:69)
> at 
> org.apache.jackrabbit.oak.plugins.index.lucene.IndexTracker$1.leave(IndexTracker.java:98)
> at 
> org.apache.jackrabbit.oak.spi.commit.EditorDiff.childNodeChanged(EditorDiff.java:153)
> at 
> org.apache.jackrabbit.oak.plugins.segment.MapRecord$3.childNodeChanged(MapRecord.java:444)
> at 
> org.apache.jackrabbit.oak.plugins.segment.MapRecord.compare(MapRecord.java:487)
> at 
> org.apache.jackrabbit.oak.plugins.segment.MapRecord.compareBranch(MapRecord.java:565)
> at 
> org.apache.jackrabbit.oak.plugins.segment.MapRecord.compare(MapRecord.java:470)
> at 
> org.apache.jackrabbit.oak.plugins.segment.MapRecord.compare(MapRecord.java:436)
> at 
> org.apache.jackrabbit.oak.plugins.segment.SegmentNodeState.compareAgainstBaseState(SegmentNodeState.java:583)
> at 
> org.apache.jackrabbit.oak.spi.commit.EditorDiff.childNodeChanged(EditorDiff.java:148)
> at 
> org.apache.jackrabbit.oak.plugins.segment.MapRecord$2.childNodeChanged(MapRecord.java:403)
> at 
> org.apache.jackrabbit.oak.plugins.segment.MapRecord$3.childNodeChanged(MapRecord.java:444)
> at 
> org.apache.jackrabbit.oak.plugins.segment.MapRecord.compare(MapRecord.java:487)
> at 
> org.apache.jackrabbit.oak.plugins.segment.MapRecord.compare(MapRecord.java:436)
> at 
> org.apache.jackrabbit.oak.plugins.segment.MapRecord.compare(MapRecord.java:394)
> at 
> org.apache.jackrabbit.oak.plugins.segment.SegmentNodeState.compareAgainstBaseState(SegmentNodeState.java:583)
> at 
> org.apache.jackrabbit.oak.spi.commit.EditorDiff.process(EditorDiff.java:52)
> at 
> org.apache.jackrabbit.oak.plugins.index.lucene.IndexTracker.update(IndexTracker.java:108)
> at 
> org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexProvider.contentChanged(LuceneIndexProvider.java:73)
> at 
> org.apache.jackrabbit.oak.spi.commit.BackgroundObserver$1$1.call(BackgroundObserver.java:131)
> at 
> org.apache.jackrabbit.oak.spi.commit.BackgroundObserver$1$1.call(BackgroundObserver.java:125)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.ClassNotFoundException: 
> org.apache.lucene.index.sorter.Sorter$DocComparator not found by 
> org.apache.jackrabbit.oak-lucene [95]
> at 
> org.apache.felix.framework.BundleWiringImpl.findClassOrResourceByDelegation(BundleWiringImpl.java:1573)
> at 
> org.apache.felix.framework.BundleWiringImpl.access$400(BundleWiringImpl.java:79)
> at 
> org.apache.felix.framework.BundleWiringImpl$BundleClassLoader.loadClass(BundleWiringImpl.java:2018)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
> ... 27 common frames omitted
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (OAK-3717) Make it possible to declare SynonymFilter within Analyzer with WN dictionary

2015-12-02 Thread Tommaso Teofili (JIRA)
Tommaso Teofili created OAK-3717:


 Summary: Make it possible to declare SynonymFilter within Analyzer 
with WN dictionary
 Key: OAK-3717
 URL: https://issues.apache.org/jira/browse/OAK-3717
 Project: Jackrabbit Oak
  Issue Type: Improvement
  Components: lucene
Reporter: Tommaso Teofili
Assignee: Tommaso Teofili
 Fix For: 1.3.12


Currently one can compose Lucene Analyzers via 
[composition|http://jackrabbit.apache.org/oak/docs/query/lucene.html#Create_analyzer_via_composition]
 within an index definition. It'd be good to also be able to use 
{{SynonymFilter}} in there, possibly backed by a {{WordnetSynonymParser}} 
to leverage WordNet synonym files.
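A sketch with plain Lucene 4.x classes, independent of Oak's index definition wiring (exception handling omitted; {{wn_s.pl}} is the standard WordNet prolog file):

{code:java}
WordnetSynonymParser parser = new WordnetSynonymParser(true, true,
        new StandardAnalyzer(Version.LUCENE_47));
parser.parse(new InputStreamReader(new FileInputStream("wn_s.pl"), "UTF-8"));
final SynonymMap synonyms = parser.build();

Analyzer analyzer = new Analyzer() {
    @Override
    protected TokenStreamComponents createComponents(String field, Reader reader) {
        Tokenizer source = new StandardTokenizer(Version.LUCENE_47, reader);
        TokenStream sink = new SynonymFilter(source, synonyms, true);
        return new TokenStreamComponents(source, sink);
    }
};
{code}

The open question for this issue is how to express such a composition declaratively in the index definition.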



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (OAK-3715) SegmentWriter reduce buffer size for reading binaries

2015-12-02 Thread Thomas Mueller (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-3715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15035878#comment-15035878
 ] 

Thomas Mueller commented on OAK-3715:
-

It looks good to me.

Where the difference will probably show up is in Java GC logs and in heap 
allocation statistics, when using 
{noformat}
-Xrunhprof:heap=sites,depth=4
{noformat}


> SegmentWriter reduce buffer size for reading binaries
> -
>
> Key: OAK-3715
> URL: https://issues.apache.org/jira/browse/OAK-3715
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: segmentmk
>Reporter: Alex Parvulescu
>Assignee: Alex Parvulescu
>Priority: Minor
> Fix For: 1.3.12
>
> Attachments: OAK-3715.patch
>
>
> The SegmentWriter uses an initial buffer size of 256k for reading the input 
> streams of binaries that need to be persisted; it then checks whether the 
> input is smaller than 16k to decide if it can be inlined or not. [0]
> In the case where the input binary is small and can be inlined (<16k), the 
> initial buffer size is too wasteful and could be reduced to 16k and, if 
> needed, increased to 256k after the threshold check is passed.
> [0] 
> https://github.com/apache/jackrabbit-oak/blob/trunk/oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/segment/SegmentWriter.java#L495
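A sketch of the proposed allocation strategy under the constants from the description (16k inline threshold, 256k full buffer; {{IOUtils}} is commons-io, {{writeValueRecord}} is a hypothetical helper, and this is not the actual SegmentWriter code):

{code:java}
// Sketch only; constants and helper names are assumptions.
private static final int INLINE_LIMIT = 16 * 1024;   // inline threshold
private static final int FULL_BUFFER  = 256 * 1024;  // current initial size

byte[] data = new byte[INLINE_LIMIT + 1];            // enough to detect "too big"
int n = IOUtils.read(stream, data, 0, data.length);
if (n <= INLINE_LIMIT) {
    // small binary: can be inlined, the 256k buffer was never needed
    return writeValueRecord(n, data);                // hypothetical helper
}
// large binary: only now allocate the full buffer and keep reading
data = Arrays.copyOf(data, FULL_BUFFER);
n += IOUtils.read(stream, data, n, data.length - n);
// ...continue with the existing block-writing logic
{code}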



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (OAK-3637) Bulk document updates in RDBDocumentStore

2015-12-02 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/OAK-3637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tomek Rękawek updated OAK-3637:
---
Attachment: OAK-3637.patch

> Bulk document updates in RDBDocumentStore
> -
>
> Key: OAK-3637
> URL: https://issues.apache.org/jira/browse/OAK-3637
> Project: Jackrabbit Oak
>  Issue Type: Technical task
>  Components: rdbmk
>Reporter: Tomek Rękawek
> Fix For: 1.4
>
> Attachments: OAK-3637.patch
>
>
> Implement the [batch createOrUpdate|OAK-3662] in the RDBDocumentStore.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (OAK-3594) Consider using LuceneDictionary in suggester

2015-12-02 Thread Tommaso Teofili (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tommaso Teofili resolved OAK-3594.
--
   Resolution: Fixed
Fix Version/s: (was: 1.4)
   1.3.11

implemented in OAK-3149

> Consider using LuceneDictionary in suggester
> 
>
> Key: OAK-3594
> URL: https://issues.apache.org/jira/browse/OAK-3594
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: lucene
>Reporter: Tommaso Teofili
>Assignee: Tommaso Teofili
> Fix For: 1.3.11
>
>
> Currently the Lucene suggester is based on {{DocumentDictionary}}, which 
> builds suggestions upon stored values of a certain field (in this case 
> _:suggest_). However, it may be better to stick to plain indexed terms via a 
> {{LuceneDictionary}}, as this would allow saving some space in the index 
> (the :suggest field wouldn't have to be stored), and we could leverage the 
> per index (configurable) analyzer to tweak how suggestions are returned: 
> using a _KeywordAnalyzer_ would result in the same behaviour we currently 
> have, while using a tokenizing Analyzer would result in term-level 
> suggestions (tokens instead of field values).
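For reference, the difference in plain Lucene 4.x terms (a sketch, not Oak's actual {{SuggestHelper}} code; the {{":suggest"}} field name is taken from the description):

{code:java}
IndexReader reader = DirectoryReader.open(directory);
// Dictionary backed by indexed terms; nothing needs to be stored.
Dictionary dict = new LuceneDictionary(reader, ":suggest");
AnalyzingSuggester suggester = new AnalyzingSuggester(analyzer);
suggester.build(dict);
List<Lookup.LookupResult> hits = suggester.lookup("jack", false, 10);
{code}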



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (OAK-3714) RDBDocumentStore diagnostics for Oracle might not contain index information

2015-12-02 Thread Julian Reschke (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Reschke updated OAK-3714:

Fix Version/s: 1.3.12

> RDBDocumentStore diagnostics for Oracle might not contain index information
> ---
>
> Key: OAK-3714
> URL: https://issues.apache.org/jira/browse/OAK-3714
> Project: Jackrabbit Oak
>  Issue Type: Technical task
>  Components: documentmk, rdbmk
>Affects Versions: 1.3.11, 1.2.8, 1.0.24
>Reporter: Julian Reschke
>Assignee: Julian Reschke
>Priority: Minor
> Fix For: 1.3.12
>
>
> ...when the table name contains lowercase characters.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (OAK-3637) Bulk document updates in RDBDocumentStore

2015-12-02 Thread JIRA

[ 
https://issues.apache.org/jira/browse/OAK-3637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15035958#comment-15035958
 ] 

Tomek Rękawek commented on OAK-3637:


Thanks for the comment. I removed the changes related to ResultSets outside the 
new bulk methods.

> Bulk document updates in RDBDocumentStore
> -
>
> Key: OAK-3637
> URL: https://issues.apache.org/jira/browse/OAK-3637
> Project: Jackrabbit Oak
>  Issue Type: Technical task
>  Components: rdbmk
>Reporter: Tomek Rękawek
> Fix For: 1.4
>
> Attachments: OAK-3637.patch
>
>
> Implement the [batch createOrUpdate|OAK-3662] in the RDBDocumentStore.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (OAK-3715) SegmentWriter reduce buffer size for reading binaries

2015-12-02 Thread Alex Parvulescu (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Parvulescu resolved OAK-3715.
--
   Resolution: Fixed
Fix Version/s: 1.3.12

collected some offline feedback from Thomas, pushed the patch in with 
http://svn.apache.org/viewvc?rev=1717636=rev

> SegmentWriter reduce buffer size for reading binaries
> -
>
> Key: OAK-3715
> URL: https://issues.apache.org/jira/browse/OAK-3715
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: segmentmk
>Reporter: Alex Parvulescu
>Assignee: Alex Parvulescu
>Priority: Minor
> Fix For: 1.3.12
>
> Attachments: OAK-3715.patch
>
>
> The SegmentWriter uses an initial buffer size of 256k for reading input 
> streams binaries that need to be persisted, then it checks if the input is 
> smaller than 16k to verify if it can be inlined or not. [0]
> In the case the input binary is small and can be inlined (<16k), the initial 
> buffer size is too wasteful and could be reduced to 16k, and if needed 
> increased to 256k after the threshold check is passed.
> [0] 
> https://github.com/apache/jackrabbit-oak/blob/trunk/oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/segment/SegmentWriter.java#L495



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (OAK-3686) Solr suggestion results should have 1 row per suggestion with appropriate column names

2015-12-02 Thread Tommaso Teofili (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-3686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15036029#comment-15036029
 ] 

Tommaso Teofili commented on OAK-3686:
--

your patch looks good [~catholicon], thanks!

> Solr suggestion results should have 1 row per suggestion with appropriate 
> column names
> --
>
> Key: OAK-3686
> URL: https://issues.apache.org/jira/browse/OAK-3686
> Project: Jackrabbit Oak
>  Issue Type: Task
>  Components: solr
>Reporter: Vikas Saurabh
>Assignee: Vikas Saurabh
>Priority: Minor
> Fix For: 1.3.12
>
> Attachments: OAK-3686.patch
>
>
> Currently the suggest query returns just one row, with the {{rep:suggest()}} 
> column containing a string that needs to be parsed.
> It'd be better if each suggestion were returned as an individual row, with 
> column names such as {{suggestion}}, {{weight}} (???), etc.
> (This is essentially the same issue as OAK-3509, but for Solr.)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (OAK-3686) Solr suggestion results should have 1 row per suggestion with appropriate column names

2015-12-02 Thread Tommaso Teofili (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tommaso Teofili resolved OAK-3686.
--
Resolution: Fixed

fixed in r1717656.

> Solr suggestion results should have 1 row per suggestion with appropriate 
> column names
> --
>
> Key: OAK-3686
> URL: https://issues.apache.org/jira/browse/OAK-3686
> Project: Jackrabbit Oak
>  Issue Type: Task
>  Components: solr
>Reporter: Vikas Saurabh
>Assignee: Vikas Saurabh
>Priority: Minor
> Fix For: 1.3.12
>
> Attachments: OAK-3686.patch
>
>
> Currently the suggest query returns just one row, with the {{rep:suggest()}} 
> column containing a string that needs to be parsed.
> It'd be better if each suggestion were returned as an individual row, with 
> column names such as {{suggestion}}, {{weight}} (???), etc.
> (This is essentially the same issue as OAK-3509, but for Solr.)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (OAK-3637) Bulk document updates in RDBDocumentStore

2015-12-02 Thread Julian Reschke (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-3637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15035941#comment-15035941
 ] 

Julian Reschke commented on OAK-3637:
-

That's a lot to review. It would be helpful if this patch only contained changes 
that are actually related to this issue. For instance, I see quite a few changes 
that move the ResultSet out of the try block so it can be closed in "finally". 
Unless I'm missing something, that's pointless, as it is already implied by 
closing the Statement object.
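For illustration: the JDBC contract ({{java.sql.Statement#close}}) specifies that closing a Statement also closes its current ResultSet, so the shorter pattern is already safe (sketch with invented table/column names and a hypothetical {{process}} consumer):

{code:java}
PreparedStatement stmt = connection.prepareStatement(
        "select DATA from NODES where ID = ?");
try {
    stmt.setString(1, id);
    ResultSet rs = stmt.executeQuery();
    while (rs.next()) {
        process(rs.getString(1));   // hypothetical consumer
    }
} finally {
    stmt.close();                   // implicitly closes rs as well
}
{code}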

> Bulk document updates in RDBDocumentStore
> -
>
> Key: OAK-3637
> URL: https://issues.apache.org/jira/browse/OAK-3637
> Project: Jackrabbit Oak
>  Issue Type: Technical task
>  Components: rdbmk
>Reporter: Tomek Rękawek
> Fix For: 1.4
>
> Attachments: OAK-3637.patch
>
>
> Implement the [batch createOrUpdate|OAK-3662] in the RDBDocumentStore.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (OAK-3714) RDBDocumentStore diagnostics for Oracle might not contain index information

2015-12-02 Thread Julian Reschke (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Reschke resolved OAK-3714.
-
   Resolution: Fixed
Fix Version/s: 1.0.25

trunk: http://svn.apache.org/r1717632
1.2: http://svn.apache.org/r1717635
1.0: http://svn.apache.org/r1717640


> RDBDocumentStore diagnostics for Oracle might not contain index information
> ---
>
> Key: OAK-3714
> URL: https://issues.apache.org/jira/browse/OAK-3714
> Project: Jackrabbit Oak
>  Issue Type: Technical task
>  Components: documentmk, rdbmk
>Affects Versions: 1.3.11, 1.2.8, 1.0.24
>Reporter: Julian Reschke
>Assignee: Julian Reschke
>Priority: Minor
> Fix For: 1.2.9, 1.0.25, 1.3.12
>
>
> ...when the table name contains lowercase characters.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (OAK-3436) Prevent missing checkpoint due to unstable topology from causing complete reindexing

2015-12-02 Thread Manfred Baedke (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-3436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15035899#comment-15035899
 ] 

Manfred Baedke commented on OAK-3436:
-

bq. What would be the possible cons of having each node async-index itself 
with its own set of checkpoints?

So far I don't see relevant cons. [~chetanm], [~alex.parvulescu], wdyt?

bq. To generate cluster node specific checkpoints, OAK-3529 could help.

Thx.

> Prevent missing checkpoint due to unstable topology from causing complete 
> reindexing
> 
>
> Key: OAK-3436
> URL: https://issues.apache.org/jira/browse/OAK-3436
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: query
>Reporter: Chetan Mehrotra
>Assignee: Chetan Mehrotra
>  Labels: resilience
> Fix For: 1.2.9, 1.0.25, 1.3.12
>
> Attachments: AsyncIndexUpdateClusterTest.java, OAK-3436-0.patch
>
>
> Async indexing logic relies on the embedding application to ensure that the 
> async indexing job runs as a singleton in a cluster. For Sling based apps it 
> depends on Sling Discovery support. At times it has been seen that if the 
> topology is not stable, different cluster nodes can consider themselves the 
> leader and execute the async indexing job concurrently.
> This can cause problems, as the two cluster nodes might not see the same 
> repository state (due to write skew and eventual consistency) and might 
> remove a checkpoint which the other cluster node is still relying upon. For 
> example, consider a 2 node cluster N1 and N2 where both are performing async 
> indexing.
> # Base state - CP1 is the checkpoint for the "async" job
> # N2 starts indexing and replaces checkpoint CP1 with CP2. For Mongo the 
> checkpoints are saved in the {{settings}} collection
> # N1 also decides to execute indexing but has not yet seen the latest 
> repository state, so it still thinks that CP1 is the base checkpoint and 
> tries to read it. However, CP1 has already been removed from {{settings}}, 
> which makes N1 think that the checkpoint is missing, and it decides to 
> reindex everything!
> To avoid this the topology must be stable, but at the Oak level we should 
> still handle such a case and avoid doing a full reindex. So we would need a 
> {{MissingCheckpointStrategy}}, similar to the {{MissingIndexEditorStrategy}} 
> introduced in OAK-2203.
> Possible approaches:
> # A1 - Fail the indexing run if the checkpoint is missing - A checkpoint can 
> go missing for both valid and invalid reasons. Need to see what the valid 
> scenarios are in which a checkpoint can go missing
> # A2 - When a checkpoint is created, also store the creation time. When a 
> checkpoint is found to be missing and it is a *recent* checkpoint, fail the 
> run. For example, we would fail the run while the missing checkpoint is less 
> than an hour old (for a just-started instance, take startup time into account)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (OAK-3710) Continuous revision GC

2015-12-02 Thread Vikas Saurabh (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-3710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15036140#comment-15036140
 ] 

Vikas Saurabh commented on OAK-3710:


[~mreutegg], I was further discussing this with [~chetanm], and it seemed that 
we might be able to reduce the number of document writes during 'rewrite commit 
entries (step 3.2)' if we introduce some sort of early document re-write 
attached to lastRev updates. Chetan had concerns about slowing down 
background-write, so we might want to do it in a separate thread with a queue 
of docs, similar to pending-last-revs (a rough sketch follows below).
The idea is that a document whose lastRev is about to be updated would also be 
scanned for revisions from the same cluster node that are older than the 
lastRev it is being updated from, i.e. for a lastRev update of r-0-2=r-X-2, we 
can clean properties with revisions r-Y-2 where Y < X.
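A very rough sketch of that idea (all names hypothetical):

{code:java}
// Dedicated queue + thread, analogous to the pending-last-revs handling,
// so background-write itself is not slowed down.
final BlockingQueue<String> rewriteCandidates = new LinkedBlockingQueue<String>();

Thread rewriter = new Thread(new Runnable() {
    public void run() {
        while (!Thread.currentThread().isInterrupted()) {
            try {
                String id = rewriteCandidates.take();
                cleanOldRevisions(id);   // hypothetical: drop r-Y-2 entries, Y < X
            } catch (InterruptedException e) {
                return;
            }
        }
    }
}, "early-document-rewrite");
rewriter.setDaemon(true);
rewriter.start();
{code}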

> Continuous revision GC
> --
>
> Key: OAK-3710
> URL: https://issues.apache.org/jira/browse/OAK-3710
> Project: Jackrabbit Oak
>  Issue Type: New Feature
>  Components: core, documentmk
>Reporter: Marcel Reutegger
>
> Implement continuous revision GC cleaning up documents older than a given 
> threshold (e.g. one day). This issue is related to OAK-3070 where each GC run 
> starts where the last one finished.
> This will avoid peak load on the system as we see it right now, when GC is 
> triggered once a day.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (OAK-3710) Continuous revision GC

2015-12-02 Thread Vikas Saurabh (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-3710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15036140#comment-15036140
 ] 

Vikas Saurabh edited comment on OAK-3710 at 12/2/15 5:36 PM:
-

[~mreutegg], I was further discussing this with [~chetanm], and it seemed that 
we might be able to reduce the number of document writes during 'rewrite commit 
entries (step 3.2)' if we introduce some sort of early document re-write 
attached to lastRev updates. Chetan had concerns about slowing down 
background-write, so we might want to do it in a separate thread with a queue 
of docs, similar to pending-last-revs.
The idea is that a document whose lastRev is about to be updated would also be 
scanned for revisions from the same cluster node that are older than the 
lastRev it is being updated from, i.e. for a lastRev update of r-0-2=r-X-2, we 
can clean properties with revisions r-Y-2 where Y < X.

> Continuous revision GC
> --
>
> Key: OAK-3710
> URL: https://issues.apache.org/jira/browse/OAK-3710
> Project: Jackrabbit Oak
>  Issue Type: New Feature
>  Components: core, documentmk
>Reporter: Marcel Reutegger
>
> Implement continuous revision GC cleaning up documents older than a given 
> threshold (e.g. one day). This issue is related to OAK-3070 where each GC run 
> starts where the last one finished.
> This will avoid peak load on the system as we see it right now, when GC is 
> triggered once a day.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (OAK-3662) Create bulk createOrUpdate method and use it in Commit

2015-12-02 Thread Julian Reschke (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-3662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15036103#comment-15036103
 ] 

Julian Reschke commented on OAK-3662:
-

We don't have any unit test coverage for the new DocumentStore method, right? 
We really need that, for instance in BasicDocumentStoreTest.
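Something along these lines, as a sketch (the bulk {{createOrUpdate}} signature is assumed from the OAK-3662 patch; {{ds}} is the store under test):

{code:java}
@Test
public void bulkCreateOrUpdate() {
    List<UpdateOp> ops = new ArrayList<UpdateOp>();
    for (int i = 0; i < 10; i++) {
        String id = this.getClass().getName() + ".bulkCreateOrUpdate" + i;
        UpdateOp op = new UpdateOp(id, true);
        op.set("_id", id);
        op.set("prop", i);
        ops.add(op);
    }
    List<NodeDocument> oldDocs = ds.createOrUpdate(Collection.NODES, ops);
    // all documents were new, so every returned "before" state is null
    for (NodeDocument old : oldDocs) {
        assertNull(old);
    }
}
{code}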

> Create bulk createOrUpdate method and use it in Commit
> --
>
> Key: OAK-3662
> URL: https://issues.apache.org/jira/browse/OAK-3662
> Project: Jackrabbit Oak
>  Issue Type: Technical task
>  Components: documentmk
>Reporter: Tomek Rękawek
> Fix For: 1.4
>
> Attachments: OAK-3662.patch
>
>
> The {{DocumentStore#createOrUpdate(Collection, UpdateOp)}} method is invoked 
> in a loop in {{Commit#applyToDocumentStore()}}, once for each changed 
> node. Investigate whether it's possible to implement a batch version of the 
> createOrUpdate method. It should return all documents as they were before 
> modification, so the Commit class can discover conflicts (if there are any).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (OAK-3436) Prevent missing checkpoint due to unstable topology from causing complete reindexing

2015-12-02 Thread Davide Giannella (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-3436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15035499#comment-15035499
 ] 

Davide Giannella commented on OAK-3436:
---

I'm jumping in and may not have properly understood the issue, so I
may be wrong. Nevertheless, here's my idea.

What would be the possible cons of having each node async-index
itself with its own set of checkpoints? Something like Manfred said.

To generate cluster node specific checkpoints, OAK-3529 could help.



> Prevent missing checkpoint due to unstable topology from causing complete 
> reindexing
> 
>
> Key: OAK-3436
> URL: https://issues.apache.org/jira/browse/OAK-3436
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: query
>Reporter: Chetan Mehrotra
>Assignee: Chetan Mehrotra
>  Labels: resilience
> Fix For: 1.2.9, 1.0.25, 1.3.12
>
> Attachments: AsyncIndexUpdateClusterTest.java, OAK-3436-0.patch
>
>
> Async indexing logic relies on the embedding application to ensure that the 
> async indexing job runs as a singleton in a cluster. For Sling based apps it 
> depends on Sling Discovery support. At times it has been seen that if the 
> topology is not stable, different cluster nodes can consider themselves the 
> leader and execute the async indexing job concurrently.
> This can cause problems, as the two cluster nodes might not see the same 
> repository state (due to write skew and eventual consistency) and might 
> remove a checkpoint which the other cluster node is still relying upon. For 
> example, consider a 2 node cluster N1 and N2 where both are performing async 
> indexing.
> # Base state - CP1 is the checkpoint for the "async" job
> # N2 starts indexing and replaces checkpoint CP1 with CP2. For Mongo the 
> checkpoints are saved in the {{settings}} collection
> # N1 also decides to execute indexing but has not yet seen the latest 
> repository state, so it still thinks that CP1 is the base checkpoint and 
> tries to read it. However, CP1 has already been removed from {{settings}}, 
> which makes N1 think that the checkpoint is missing, and it decides to 
> reindex everything!
> To avoid this the topology must be stable, but at the Oak level we should 
> still handle such a case and avoid doing a full reindex. So we would need a 
> {{MissingCheckpointStrategy}}, similar to the {{MissingIndexEditorStrategy}} 
> introduced in OAK-2203.
> Possible approaches:
> # A1 - Fail the indexing run if the checkpoint is missing - A checkpoint can 
> go missing for both valid and invalid reasons. Need to see what the valid 
> scenarios are in which a checkpoint can go missing
> # A2 - When a checkpoint is created, also store the creation time. When a 
> checkpoint is found to be missing and it is a *recent* checkpoint, fail the 
> run. For example, we would fail the run while the missing checkpoint is less 
> than an hour old (for a just-started instance, take startup time into account)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (OAK-3140) DataStore / BlobStore: add a method to pass a "type" when writing

2015-12-02 Thread Thomas Mueller (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-3140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15035523#comment-15035523
 ] 

Thomas Mueller commented on OAK-3140:
-

It would make sense to also add the "path", if available. This is to support 
OAK-3402 (Multiplexing DocumentStore support in Oak layer).

> DataStore / BlobStore: add a method to pass a "type" when writing
> -
>
> Key: OAK-3140
> URL: https://issues.apache.org/jira/browse/OAK-3140
> Project: Jackrabbit Oak
>  Issue Type: New Feature
>  Components: blob
>Reporter: Thomas Mueller
>Assignee: Thomas Mueller
>  Labels: performance
>
> Currently, the BlobStore interface has a method "String writeBlob(InputStream 
> in)". This issue is about adding a new method "String writeBlob(String type, 
> InputStream in)", for the following reasons (in no particular order):
> * Store some binaries (for example Lucene index files) in a different place, 
> in order to safely and quickly run garbage collection just on those files.
> * Store some binaries in a slow, some in a fast storage or location.
> * Disable calculating the content hash (de-duplication) for some binaries.
> * Store some binaries in a shared storage (for fast cross-repository 
> copying), and some in local storage.
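A sketch of the resulting interface (the second method signature is quoted from this issue's description; the javadoc wording is mine):

{code:java}
public interface BlobStore {

    String writeBlob(InputStream in) throws IOException;

    /**
     * Like {@link #writeBlob(InputStream)}, but with a hint describing the
     * kind of binary (e.g. "index" for Lucene index files), allowing the
     * store to pick a location, skip de-duplication, and so on.
     */
    String writeBlob(String type, InputStream in) throws IOException;
}
{code}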



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (OAK-3709) CugValidator should ignore node type definitions

2015-12-02 Thread angela (JIRA)
angela created OAK-3709:
---

 Summary: CugValidator should ignore node type definitions
 Key: OAK-3709
 URL: https://issues.apache.org/jira/browse/OAK-3709
 Project: Jackrabbit Oak
  Issue Type: Bug
  Components: authorization-cug
Reporter: angela
Assignee: angela






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (OAK-3718) JCR observation should be visible in SessionMBean

2015-12-02 Thread JIRA
Jörg Hoh created OAK-3718:
-

 Summary: JCR observation should be visible in SessionMBean
 Key: OAK-3718
 URL: https://issues.apache.org/jira/browse/OAK-3718
 Project: Jackrabbit Oak
  Issue Type: Improvement
Affects Versions: 1.0.24
Reporter: Jörg Hoh


I am looking for long-running sessions which are not (explicitly or 
implicitly) refreshed. As a session which has a registered JCR observation 
event listener gets refreshed implicitly, it would be good to see in the 
SessionMBean whether a JCR observation event handler is registered for this 
session.
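A sketch of what the addition could look like (attribute name invented):

{code:java}
public interface SessionMBean {
    // ... existing attributes ...

    /**
     * @return true if at least one JCR observation event listener is
     *         registered for this session; such sessions are refreshed
     *         implicitly and are not candidates for stale-session warnings.
     */
    boolean isObservationListenerRegistered();
}
{code}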



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (OAK-3494) MemoryDiffCache should also check parent paths before falling to Loader (or returning null)

2015-12-02 Thread Vikas Saurabh (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikas Saurabh updated OAK-3494:
---
Fix Version/s: 1.0.25

> MemoryDiffCache should also check parent paths before falling to Loader (or 
> returning null)
> ---
>
> Key: OAK-3494
> URL: https://issues.apache.org/jira/browse/OAK-3494
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: core, mongomk
>Reporter: Vikas Saurabh
>Assignee: Marcel Reutegger
>  Labels: candidate_oak_1_0, candidate_oak_1_2, performance
> Fix For: 1.3.10, 1.2.9, 1.0.25
>
> Attachments: OAK-3494-1.patch, OAK-3494-2.patch, 
> OAK-3494-TestCase.patch, OAK-3494.patch
>
>
> Each entry in {{MemoryDiffCache}} is keyed with {{(path, fromRev, toRev)}} 
> for the list of modified children at {{path}}. A diff calculated by 
> {{DocumentNodeStore.diffImpl}} at '/' (passively via loader) or 
> {{JournalEntry.applyTo}} (actively) fills each path for which there are 
> modified children (including the hierarchy).
> But if an observer calls {{compareWithBaseState}} on an unmodified sub-tree, 
> the observer will still go down to {{diffImpl}}, although the cached parent 
> entry could be used to answer the query (see the sketch below).
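A sketch of the proposed lookup (helper names are approximations, not the committed patch):

{code:java}
// Before delegating to the loader: if a cached diff for the parent exists
// and does not list this child as changed, the subtree is unmodified.
String checkParents(String path, Revision from, Revision to) {
    String parent = PathUtils.getParentPath(path);
    String name = PathUtils.getName(path);
    String parentDiff = getCachedDiff(parent, from, to);   // hypothetical
    if (parentDiff != null && !changedChildren(parentDiff).contains(name)) {
        return "";        // empty diff: child is unmodified
    }
    return null;          // unknown: fall through to the loader
}
{code}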



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (OAK-3494) MemoryDiffCache should also check parent paths before falling to Loader (or returning null)

2015-12-02 Thread Vikas Saurabh (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-3494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15036670#comment-15036670
 ] 

Vikas Saurabh edited comment on OAK-3494 at 12/2/15 10:24 PM:
--

Backported r1707509 and r1710800 into:
* 1.2 at http://svn.apache.org/r1717683
* 1.0 at http://svn.apache.org/r1717690


was (Author: catholicon):
Backported into 1.2 at http://svn.apache.org/r1717683

> MemoryDiffCache should also check parent paths before falling to Loader (or 
> returning null)
> ---
>
> Key: OAK-3494
> URL: https://issues.apache.org/jira/browse/OAK-3494
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: core, mongomk
>Reporter: Vikas Saurabh
>Assignee: Marcel Reutegger
>  Labels: candidate_oak_1_0, candidate_oak_1_2, performance
> Fix For: 1.3.10, 1.2.9, 1.0.25
>
> Attachments: OAK-3494-1.patch, OAK-3494-2.patch, 
> OAK-3494-TestCase.patch, OAK-3494.patch
>
>
> Each entry in {{MemoryDiffCache}} is keyed with {{(path, fromRev, toRev)}} 
> for the list of modified children at {{path}}. A diff calculated by 
> {{DocumentNodeStore.diffImpl}} at '/' (passively via loader) or 
> {{JournalEntry.applyTo}} (actively) fills each path for which there are 
> modified children (including the hierarchy).
> But if an observer calls {{compareWithBaseState}} on an unmodified sub-tree, 
> the observer will still go down to {{diffImpl}}, although the cached parent 
> entry could be used to answer the query.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (OAK-3223) Remove MongoDiffCache

2015-12-02 Thread Vikas Saurabh (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikas Saurabh updated OAK-3223:
---
Fix Version/s: 1.0.25

> Remove MongoDiffCache
> -
>
> Key: OAK-3223
> URL: https://issues.apache.org/jira/browse/OAK-3223
> Project: Jackrabbit Oak
>  Issue Type: Task
>  Components: core, mongomk
>Reporter: Marcel Reutegger
>Assignee: Marcel Reutegger
>Priority: Minor
> Fix For: 1.3.4, 1.2.9, 1.0.25
>
>
> The MongoDiffCache is not used anymore and can be removed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (OAK-3223) Remove MongoDiffCache

2015-12-02 Thread Vikas Saurabh (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikas Saurabh updated OAK-3223:
---
Fix Version/s: 1.2.9

> Remove MongoDiffCache
> -
>
> Key: OAK-3223
> URL: https://issues.apache.org/jira/browse/OAK-3223
> Project: Jackrabbit Oak
>  Issue Type: Task
>  Components: core, mongomk
>Reporter: Marcel Reutegger
>Assignee: Marcel Reutegger
>Priority: Minor
> Fix For: 1.3.4, 1.2.9
>
>
> The MongoDiffCache is not used anymore and can be removed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (OAK-3223) Remove MongoDiffCache

2015-12-02 Thread Vikas Saurabh (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-3223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15036668#comment-15036668
 ] 

Vikas Saurabh commented on OAK-3223:


Backported into 1.2 at http://svn.apache.org/r1717683

> Remove MongoDiffCache
> -
>
> Key: OAK-3223
> URL: https://issues.apache.org/jira/browse/OAK-3223
> Project: Jackrabbit Oak
>  Issue Type: Task
>  Components: core, mongomk
>Reporter: Marcel Reutegger
>Assignee: Marcel Reutegger
>Priority: Minor
> Fix For: 1.3.4, 1.2.9
>
>
> The MongoDiffCache is not used anymore and can be removed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (OAK-3223) Remove MongoDiffCache

2015-12-02 Thread Vikas Saurabh (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-3223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15036668#comment-15036668
 ] 

Vikas Saurabh edited comment on OAK-3223 at 12/2/15 10:23 PM:
--

Backported r1695571 into:
* 1.2 at http://svn.apache.org/r1717683
* 1.0 at http://svn.apache.org/r1717690


was (Author: catholicon):
Backported into 1.2 at http://svn.apache.org/r1717683

> Remove MongoDiffCache
> -
>
> Key: OAK-3223
> URL: https://issues.apache.org/jira/browse/OAK-3223
> Project: Jackrabbit Oak
>  Issue Type: Task
>  Components: core, mongomk
>Reporter: Marcel Reutegger
>Assignee: Marcel Reutegger
>Priority: Minor
> Fix For: 1.3.4, 1.2.9
>
>
> The MongoDiffCache is not used anymore and can be removed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (OAK-3720) Update script console bundle version to 1.0.2

2015-12-02 Thread Chetan Mehrotra (JIRA)
Chetan Mehrotra created OAK-3720:


 Summary: Update script console bundle version to 1.0.2
 Key: OAK-3720
 URL: https://issues.apache.org/jira/browse/OAK-3720
 Project: Jackrabbit Oak
  Issue Type: Task
  Components: examples
Reporter: Chetan Mehrotra
Assignee: Chetan Mehrotra
Priority: Minor
 Fix For: 1.3.12


Script Console bundle version 1.0.2 has been released with a fix for FELIX-5120, 
which was causing an exception stack trace on startup.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (OAK-3719) Test failure: ManyChildNodesTest

2015-12-02 Thread Chetan Mehrotra (JIRA)
Chetan Mehrotra created OAK-3719:


 Summary: Test failure: ManyChildNodesTest 
 Key: OAK-3719
 URL: https://issues.apache.org/jira/browse/OAK-3719
 Project: Jackrabbit Oak
  Issue Type: Task
  Components: documentmk
Reporter: Chetan Mehrotra
Assignee: Chetan Mehrotra
Priority: Minor
 Fix For: 1.3.12


At times {{ManyChildNodesTest#manyChildNodes}} fails like

{noformat}
Stack Trace:
java.lang.AssertionError: 2147483647 > 1601
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.assertTrue(Assert.java:41)
at 
org.apache.jackrabbit.oak.plugins.document.ManyChildNodesTest.manyChildNodes(ManyChildNodesTest.java:63)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
{noformat}

This happens because during the test the cluster lease logic can also make 
calls which get included in the test result.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (OAK-3719) Test failure: ManyChildNodesTest

2015-12-02 Thread Chetan Mehrotra (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-3719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15037261#comment-15037261
 ] 

Chetan Mehrotra edited comment on OAK-3719 at 12/3/15 5:01 AM:
---

Fixed with 1717712 by only intercepting calls for Nodes collection in TestStore
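Roughly like the following (a sketch; the actual field and counter names in the test's {{TestStore}} may differ):

{code:java}
@Override
public <T extends Document> T find(Collection<T> collection, String key) {
    if (collection == Collection.NODES) {
        // only count calls against the Nodes collection, so cluster lease
        // updates on other collections don't pollute the test counters
        finds.incrementAndGet();
    }
    return super.find(collection, key);
}
{code}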


was (Author: chetanm):
Fixed with 1717712

> Test failure: ManyChildNodesTest 
> -
>
> Key: OAK-3719
> URL: https://issues.apache.org/jira/browse/OAK-3719
> Project: Jackrabbit Oak
>  Issue Type: Task
>  Components: documentmk
>Reporter: Chetan Mehrotra
>Assignee: Chetan Mehrotra
>Priority: Minor
> Fix For: 1.3.12
>
>
> At times {{ManyChildNodesTest#manyChildNodes}} fails like
> {noformat}
> Stack Trace:
> java.lang.AssertionError: 2147483647 > 1601
> at org.junit.Assert.fail(Assert.java:88)
> at org.junit.Assert.assertTrue(Assert.java:41)
> at 
> org.apache.jackrabbit.oak.plugins.document.ManyChildNodesTest.manyChildNodes(ManyChildNodesTest.java:63)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
> {noformat}
> This happens because during the test the cluster lease logic can also make 
> calls which get included in the test result.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (OAK-2110) performance issues with VersionGarbageCollector

2015-12-02 Thread Julian Reschke (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-2110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Reschke updated OAK-2110:

Component/s: rdbmk
 doc

> performance issues with VersionGarbageCollector
> ---
>
> Key: OAK-2110
> URL: https://issues.apache.org/jira/browse/OAK-2110
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: doc, mongomk, rdbmk
>Reporter: Julian Reschke
>Assignee: Julian Reschke
> Fix For: 1.4
>
>
> This one currently special-cases Mongo. For other persistences, it
> - fetches *all* documents
> - filters by SD_TYPE
> - filters by lastmod of versions
> - deletes what remains
> This is not only inefficient, but it also fails with an OutOfMemoryError for 
> any larger repo.
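
For illustration only, the generic path amounts to something like this (a toy 
sketch with assumed accessors, not actual Oak code):

{noformat}
// Load *all* documents, filter in memory, delete what remains:
// memory use grows with repository size, hence the OutOfMemoryError.
List<String> toDelete = new ArrayList<>();
for (NodeDocument doc : store.query(Collection.NODES, MIN_ID, MAX_ID, Integer.MAX_VALUE)) {
    if (doc.isSplitDocument()                           // filter by SD_TYPE
            && doc.getLastModified() < oldestToKeep) {  // filter by lastmod
        toDelete.add(doc.getId());
    }
}
store.remove(Collection.NODES, toDelete);
{noformat}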



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (OAK-3494) MemoryDiffCache should also check parent paths before falling to Loader (or returning null)

2015-12-02 Thread Chetan Mehrotra (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Mehrotra updated OAK-3494:
-
Labels: performance  (was: candidate_oak_1_0 candidate_oak_1_2 performance)

> MemoryDiffCache should also check parent paths before falling to Loader (or 
> returning null)
> ---
>
> Key: OAK-3494
> URL: https://issues.apache.org/jira/browse/OAK-3494
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: core, mongomk
>Reporter: Vikas Saurabh
>Assignee: Marcel Reutegger
>  Labels: performance
> Fix For: 1.3.10, 1.2.9, 1.0.25
>
> Attachments: OAK-3494-1.patch, OAK-3494-2.patch, 
> OAK-3494-TestCase.patch, OAK-3494.patch
>
>
> Each entry in {{MemoryDiffCache}} is keyed with {{(path, fromRev, toRev)}} 
> for the list of modified children at {{path}}. A diff calculated by 
> {{DocumentNodeStore.diffImpl}} at '/' (passively via loader) or 
> {{JournalEntry.applyTo}} (actively) fills each path for which there are 
> modified children (including the hierarchy).
> But, if an observer calls {{compareWithBaseState}} on an unmodified sub-tree, 
> the observer will still go down to {{diffImpl}}, although the cached parent 
> entry could be used to answer the query.
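
A rough sketch of the proposed lookup (hypothetical cache and loader names, 
not the actual patch):

{noformat}
// Before falling back to the loader, walk up the hierarchy: if a cached
// ancestor diff exists and does not list the relevant child as modified,
// the sub-tree is unchanged and an empty diff can be returned immediately.
String getChanges(Revision from, Revision to, String path,
                  Callable<String> loader) throws Exception {
    String cached = cache.get(path, from, to);
    if (cached != null) {
        return cached;
    }
    String p = path;
    while (!PathUtils.denotesRoot(p)) {
        String name = PathUtils.getName(p);
        String parent = PathUtils.getParentPath(p);
        String parentDiff = cache.get(parent, from, to);
        if (parentDiff != null) {
            if (parentDiff.contains("\"" + name + "\"")) {
                break;      // child is modified: compute the real diff
            }
            return "";      // cached ancestor says: sub-tree unchanged
        }
        p = parent;         // no entry at this level, try the ancestor
    }
    return loader.call();   // compute the diff the expensive way
}
{noformat}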



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (OAK-3719) Test failure: ManyChildNodesTest

2015-12-02 Thread Chetan Mehrotra (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Mehrotra resolved OAK-3719.
--
Resolution: Fixed

Fixed with 1717712

> Test failure: ManyChildNodesTest 
> -
>
> Key: OAK-3719
> URL: https://issues.apache.org/jira/browse/OAK-3719
> Project: Jackrabbit Oak
>  Issue Type: Task
>  Components: documentmk
>Reporter: Chetan Mehrotra
>Assignee: Chetan Mehrotra
>Priority: Minor
> Fix For: 1.3.12
>
>
> At times {{ManyChildNodesTest#manyChildNodes}} fails like
> {noformat}
> Stack Trace:
> java.lang.AssertionError: 2147483647 > 1601
> at org.junit.Assert.fail(Assert.java:88)
> at org.junit.Assert.assertTrue(Assert.java:41)
> at 
> org.apache.jackrabbit.oak.plugins.document.ManyChildNodesTest.manyChildNodes(ManyChildNodesTest.java:63)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
> {noformat}
> This happens because during the test Cluster lease logic can also make call 
> which get included in test result. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (OAK-3709) CugValidator should ignore node type definitions

2015-12-02 Thread angela (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

angela updated OAK-3709:

Description: Since the node type definitions contain the reserved names as 
part of the nt definitions, the validator must omit that part during the 
verification.

> CugValidator should ignore node type definitions
> 
>
> Key: OAK-3709
> URL: https://issues.apache.org/jira/browse/OAK-3709
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: authorization-cug
>Reporter: angela
>Assignee: angela
>
> Since the node type definitions contain the reserved names as part of the nt 
> definitions, the validator must omit that part during the verification.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (OAK-3709) CugValidator should ignore node type definitions

2015-12-02 Thread angela (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

angela resolved OAK-3709.
-
   Resolution: Fixed
Fix Version/s: 1.3.12

Committed revision 1717596.


> CugValidator should ignore node type definitions
> 
>
> Key: OAK-3709
> URL: https://issues.apache.org/jira/browse/OAK-3709
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: authorization-cug
>Reporter: angela
>Assignee: angela
> Fix For: 1.3.12
>
>
> Since the node type definitions contain the reserved names as part of the nt 
> definitions, the validator must omit that part during the verification.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (OAK-1981) Implement full scale Revision GC for DocumentNodeStore

2015-12-02 Thread Marcel Reutegger (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-1981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcel Reutegger resolved OAK-1981.
---
   Resolution: Duplicate
Fix Version/s: (was: 1.4)

Resolving this issue as a duplicate of the DocumentMK Revision GC epic introduced 
a while ago. The missing GC features are listed in the epic and are more up to 
date there.

> Implement full scale Revision GC for DocumentNodeStore
> --
>
> Key: OAK-1981
> URL: https://issues.apache.org/jira/browse/OAK-1981
> Project: Jackrabbit Oak
>  Issue Type: New Feature
>  Components: mongomk
>Reporter: Chetan Mehrotra
>Assignee: Marcel Reutegger
>  Labels: resilience, scalability
>
> So far we have implemented garbage collection in some form with OAK-1341. 
> Those approaches help us remove quite a bit of garbage (mostly due to deleted 
> nodes), but still some part is left.
> However, full GC is still not performed, due to which some of the old 
> revision-related data cannot be GCed, such as:
> * Revision info present in the revision maps of various commit roots
> * Revisions related to unmerged branches (OAK-1926)
> * Revision data created due to properties being modified by different cluster nodes
> So having a tool which can perform the above GC would be helpful. For a start 
> we can have an implementation which takes a brute-force approach and scans the 
> whole repo (this would take quite a bit of time) and later evolve it, or allow 
> system admins to determine to what level GC has to be done.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (OAK-3710) Continuous revision GC

2015-12-02 Thread Marcel Reutegger (JIRA)
Marcel Reutegger created OAK-3710:
-

 Summary: Continuous revision GC
 Key: OAK-3710
 URL: https://issues.apache.org/jira/browse/OAK-3710
 Project: Jackrabbit Oak
  Issue Type: New Feature
  Components: core, documentmk
Reporter: Marcel Reutegger


Implement continuous revision GC cleaning up documents older than a given 
threshold (e.g. one day). This issue is related to OAK-3070 where each GC run 
starts where the last one finished.

This will avoid peak load on the system as we see it right now, when GC is 
triggered once a day.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (OAK-3712) Clean up old and uncommitted changes

2015-12-02 Thread Marcel Reutegger (JIRA)
Marcel Reutegger created OAK-3712:
-

 Summary: Clean up old and uncommitted changes
 Key: OAK-3712
 URL: https://issues.apache.org/jira/browse/OAK-3712
 Project: Jackrabbit Oak
  Issue Type: New Feature
  Components: core, documentmk
Reporter: Marcel Reutegger


Clean up old and uncommitted changes in the main document. This issue is 
related to OAK-2392, which is specifically about changes on binary properties 
and effect on blob GC.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (OAK-2860) RDBBlobStore: seen insert failures due to duplicate keys

2015-12-02 Thread Julian Reschke (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-2860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15035713#comment-15035713
 ] 

Julian Reschke commented on OAK-2860:
-

trunk: http://svn.apache.org/r1678938
1.2: http://svn.apache.org/r1679216
1.0: http://svn.apache.org/r1678951

> RDBBlobStore: seen insert failures due to duplicate keys
> 
>
> Key: OAK-2860
> URL: https://issues.apache.org/jira/browse/OAK-2860
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: blob, rdbmk
>Affects Versions: 1.0.13, 1.2.2
>Reporter: Julian Reschke
>Assignee: Julian Reschke
>  Labels: resilience
> Fix For: 1.3.1, 1.0.14, 1.2.3
>
> Attachments: OAK-2860.diff
>
>
> In production, we've seen exceptions like this:
> {noformat}
>  org.apache.jackrabbit.oak.plugins.document.rdb.RDBBlobStore insert document 
> failed for id 
> bd89b0745aa22429234f17dfc3e2a35b744dc6e86f5e8094a4153b2366c4d822 with 
> length 14691 (check max size of datastore_data.data)
> com.ibm.db2.jcc.am.SqlIntegrityConstraintViolationException: DB2 SQL Error: 
> SQLCODE=-803, SQLSTATE=23505, SQLERRMC=1;DB2INST1.DATASTORE_DATA, 
> DRIVER=4.16.53
> at com.ibm.db2.jcc.am.fd.a(fd.java:735)
> at com.ibm.db2.jcc.am.fd.a(fd.java:60)
> at com.ibm.db2.jcc.am.fd.a(fd.java:127)
> at com.ibm.db2.jcc.am.to.b(to.java:2422)
> at com.ibm.db2.jcc.am.to.c(to.java:2405)
> at com.ibm.db2.jcc.t4.ab.l(ab.java:408)
> at com.ibm.db2.jcc.t4.ab.a(ab.java:62)
> at com.ibm.db2.jcc.t4.o.a(o.java:50)
> at com.ibm.db2.jcc.t4.ub.b(ub.java:220)
> at com.ibm.db2.jcc.am.uo.sc(uo.java:3526)
> at com.ibm.db2.jcc.am.uo.b(uo.java:4489)
> at com.ibm.db2.jcc.am.uo.mc(uo.java:2833)
> at com.ibm.db2.jcc.am.uo.execute(uo.java:2808)
> at sun.reflect.GeneratedMethodAccessor941.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:600)
> at 
> org.apache.tomcat.jdbc.pool.interceptor.AbstractQueryReport$StatementProxy.invoke(AbstractQueryReport.java:235)
> at com.sun.proxy.$Proxy259.execute(Unknown Source)
> at sun.reflect.GeneratedMethodAccessor941.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:600)
> at 
> org.apache.tomcat.jdbc.pool.interceptor.StatementDecoratorInterceptor$StatementProxy.invoke(StatementDecoratorInterceptor.java:252)
> at com.sun.proxy.$Proxy259.execute(Unknown Source)
> at 
> org.apache.jackrabbit.oak.plugins.document.rdb.RDBBlobStore.storeBlockInDatabase(RDBBlobStore.java:374)
> at 
> org.apache.jackrabbit.oak.plugins.document.rdb.RDBBlobStore.storeBlock(RDBBlobStore.java:340)
> {noformat}
> This seems to indicate that the key is present in _data but not in _meta. We 
> need to find out whether that's caused by an earlier problem, or whether 
> storeInBlock is supposed to handle this.
> (Note that the actual exception message about "check max size of 
> datastore_data.data" is misleading; it's due to an earlier attempt to 
> diagnose DB config problems)
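
One way such an insert race is commonly handled (a sketch of the general idea, 
not necessarily the committed fix): since the block id is a content hash, a 
duplicate key for identical bytes can be treated as success.

{noformat}
// Hypothetical helpers (insertDataRow, readDataRow, ensureMetaRow):
try {
    insertDataRow(id, data);
} catch (SQLIntegrityConstraintViolationException e) {
    byte[] existing = readDataRow(id);
    if (!Arrays.equals(existing, data)) {
        throw e;             // same id but different bytes: real problem
    }
    // same content already stored: the duplicate key is benign
}
ensureMetaRow(id);           // make _meta consistent with _data
{noformat}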



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (OAK-3711) Clean up _revision entries on commit root documents

2015-12-02 Thread Marcel Reutegger (JIRA)
Marcel Reutegger created OAK-3711:
-

 Summary: Clean up _revision entries on commit root documents
 Key: OAK-3711
 URL: https://issues.apache.org/jira/browse/OAK-3711
 Project: Jackrabbit Oak
  Issue Type: New Feature
  Components: core, documentmk
Reporter: Marcel Reutegger


The _revisions entries on commit root documents are currently not cleaned up 
and accumulate in split documents.

One possible solution may be to ensure that there are no uncommitted changes up 
to certain revisions. Older revisions would then be considered valid and commit 
information on the commit root document wouldn't be needed anymore.

For regular commits this is probably not that difficult. However, changes from 
branch commits require the merge revision set in the commit entry on the commit 
root to decide when those changes were made visible to other sessions. A simple 
solution could be to rewrite such changes with the merge revision.
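
For illustration, a toy sketch of that rewrite (hypothetical structures and 
revision strings, not Oak's actual document format):

{noformat}
// A property's local changes, keyed by revision string. A branch change
// is keyed by its branch revision and needs the commit root's entry to
// resolve visibility; rewriting it under the merge revision makes the
// change self-contained.
Map<String, String> changes = new HashMap<>();
changes.put("br-r15-0-1", "\"some value\"");  // branch revision (made up)

String branchRev = "br-r15-0-1";
String mergeRev  = "r16-0-1";                 // merge revision (made up)

String value = changes.remove(branchRev);
changes.put(mergeRev, value);                 // now committed at mergeRev
{noformat}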



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (OAK-2843) Broadcasting cache

2015-12-02 Thread Thomas Mueller (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15035683#comment-15035683
 ] 

Thomas Mueller commented on OAK-2843:
-

I could now verify that the cache works as expected. My test was: 

* Two cluster nodes, using the MongoDB document store.
* Delete the persistent cache files.
* Using the persistent cache setting as follows (OSGi configuration):
{noformat}
persistentCache="crx-quickstart/repository/cache,size\=1024,binary\=0,broadcast\=tcp:key 123"
{noformat}
* Read all nodes of the repository (called "traversal check" in our 
application).
* This took 20 seconds (because it had to load all nodes from MongoDB).
* Do the same on the other cluster node, which only took 5 seconds.

I ran the same test without the broadcasting cache enabled, that is just with 
{noformat}
persistentCache="crx-quickstart/repository/cache,size\=1024,binary\=0"
{noformat}
The first time it took 24 seconds on _each_ cluster node (because both cluster 
nodes have to load all data from MongoDB, if the persistent cache is empty). 
The second time it took 5 seconds. After a restart (but without deleting the 
local persistent cache), it also took 5 seconds.

> Broadcasting cache
> --
>
> Key: OAK-2843
> URL: https://issues.apache.org/jira/browse/OAK-2843
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: mongomk
>Reporter: Thomas Mueller
>Assignee: Thomas Mueller
> Fix For: 1.3.12
>
>
> In a cluster environment, we could speed up reading if the cache(s) broadcast 
> data to other instances. This would avoid bottlenecks at the storage layer 
> (MongoDB, RDBMSs).
> The configuration metadata (IP addresses and ports of where to send data to, 
> a unique identifier of the repository and the cluster nodes, possibly 
> encryption key) rarely changes and can be stored in the same place as we 
> store cluster metadata (cluster info collection). That way, in many cases no 
> manual configuration is needed. We could use TCP/IP and / or UDP.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Moved] (OAK-3713) Remove dep cycle between plugins/tree and spi.security

2015-12-02 Thread angela (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

angela moved JCR-3936 to OAK-3713:
--

Component/s: (was: core)
 core
   Workflow: no-reopen-closed  (was: no-reopen-closed, patch-avail)
Key: OAK-3713  (was: JCR-3936)
Project: Jackrabbit Oak  (was: Jackrabbit Content Repository)

> Remove dep cycle between plugins/tree and spi.security
> --
>
> Key: OAK-3713
> URL: https://issues.apache.org/jira/browse/OAK-3713
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: core
>Reporter: angela
>Assignee: angela
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (OAK-3662) Create bulk createOrUpdate method and use it in Commit

2015-12-02 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/OAK-3662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tomek Rękawek updated OAK-3662:
---
Attachment: (was: OAK-3662.patch)

> Create bulk createOrUpdate method and use it in Commit
> --
>
> Key: OAK-3662
> URL: https://issues.apache.org/jira/browse/OAK-3662
> Project: Jackrabbit Oak
>  Issue Type: Technical task
>  Components: documentmk
>Reporter: Tomek Rękawek
> Fix For: 1.4
>
> Attachments: OAK-3662.patch
>
>
> The {{DocumentStore#createOrUpdate(Collection, UpdateOp)}} method is invoked 
> in a loop in the {{Commit#applyToDocumentStore()}}, once for each changed 
> node. Investigate if it's possible to implement a batch version of the 
> createOrUpdate method. It should return all documents as they were before 
> being modified, so the Commit class can discover conflicts (if there are any).
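
A hypothetical shape for such a batch variant (a sketch only; the names are 
assumptions):

{noformat}
// Returns each document's state *before* its update was applied,
// aligned with updateOps, so the caller can detect conflicts.
<T extends Document> List<T> createOrUpdate(Collection<T> collection,
                                            List<UpdateOp> updateOps);
{noformat}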



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (OAK-3559) Bulk document updates in MongoDocumentStore

2015-12-02 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/OAK-3559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tomek Rękawek updated OAK-3559:
---
Attachment: OAK-3559.patch

> Bulk document updates in MongoDocumentStore
> ---
>
> Key: OAK-3559
> URL: https://issues.apache.org/jira/browse/OAK-3559
> Project: Jackrabbit Oak
>  Issue Type: Sub-task
>  Components: mongomk
>Reporter: Tomek Rękawek
> Fix For: 1.4
>
> Attachments: OAK-3559.patch
>
>
> Using the MongoDB [Bulk 
> API|https://docs.mongodb.org/manual/reference/method/Bulk/#Bulk] implement 
> the [batch version of createOrUpdate method|OAK-3662].



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (OAK-3559) Bulk document updates in MongoDocumentStore

2015-12-02 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/OAK-3559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tomek Rękawek updated OAK-3559:
---
Attachment: (was: OAK-3559.patch)

> Bulk document updates in MongoDocumentStore
> ---
>
> Key: OAK-3559
> URL: https://issues.apache.org/jira/browse/OAK-3559
> Project: Jackrabbit Oak
>  Issue Type: Sub-task
>  Components: mongomk
>Reporter: Tomek Rękawek
> Fix For: 1.4
>
> Attachments: OAK-3559.patch
>
>
> Using the MongoDB [Bulk 
> API|https://docs.mongodb.org/manual/reference/method/Bulk/#Bulk] implement 
> the [batch version of createOrUpdate method|OAK-3662].



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (OAK-3662) Create bulk createOrUpdate method and use it in Commit

2015-12-02 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/OAK-3662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tomek Rękawek updated OAK-3662:
---
Attachment: OAK-3662.patch

> Create bulk createOrUpdate method and use it in Commit
> --
>
> Key: OAK-3662
> URL: https://issues.apache.org/jira/browse/OAK-3662
> Project: Jackrabbit Oak
>  Issue Type: Technical task
>  Components: documentmk
>Reporter: Tomek Rękawek
> Fix For: 1.4
>
> Attachments: OAK-3662.patch
>
>
> The {{DocumentStore#createOrUpdate(Collection, UpdateOp)}} method is invoked 
> in a loop in the {{Commit#applyToDocumentStore()}}, once for each changed 
> node. Investigate if it's possible to implement a batch version of the 
> createOrUpdate method. It should return all documents as they were before 
> being modified, so the Commit class can discover conflicts (if there are any).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (OAK-3637) Bulk document updates in RDBDocumentStore

2015-12-02 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/OAK-3637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tomek Rękawek updated OAK-3637:
---
Attachment: (was: OAK-3637.patch)

> Bulk document updates in RDBDocumentStore
> -
>
> Key: OAK-3637
> URL: https://issues.apache.org/jira/browse/OAK-3637
> Project: Jackrabbit Oak
>  Issue Type: Technical task
>  Components: rdbmk
>Reporter: Tomek Rękawek
> Fix For: 1.4
>
> Attachments: OAK-3637.patch
>
>
> Implement the [batch createOrUpdate|OAK-3662] in the RDBDocumentStore.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (OAK-3559) Bulk document updates in MongoDocumentStore

2015-12-02 Thread JIRA

[ 
https://issues.apache.org/jira/browse/OAK-3559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14986971#comment-14986971
 ] 

Tomek Rękawek edited comment on OAK-3559 at 12/2/15 1:27 PM:
-

h4. New bulk update method

The patch adds a new {{createOrUpdate(Collection collection, List 
updateOps)}} method to the {{DocumentStore}} interface. The MongoDB 
implementation uses the Bulk API. The RDB and Memory document stores have been 
extended with a naive implementation iterating over {{updateOps}}. The Mongo 
implementation works as follows:

1. For each {{UpdateOp}}, try to read the assigned document from the cache. Add 
them to {{oldDocs}}.
2. Prepare a list of all {{UpdateOps}} that don't have their documents yet and 
read them in one {{find()}} call. Add the results to {{oldDocs}}.
3. Prepare a bulk update. For each remaining {{UpdateOp}} add the following 
operation:
* Find the document with the same id and the same {{mod_count}} as in 
{{oldDocs}}.
* Apply the changes from the {{UpdateOps}}.

4. Execute the bulk update.

If some other process modifies the target documents between points 2 and 3, the 
{{mod_count}} will be increased as well and the bulk update will fail for the 
concurrently modified docs. The method will then remove the failed documents 
from {{oldDocs}} and restart the process from point 2. It will stop after the 
3rd iteration.
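
A condensed sketch of this retry loop (hypothetical helper names, not the 
exact patch code):

{noformat}
// Sketch only: bulk createOrUpdate with up to 3 attempts. Each bulk
// entry matches on (_id, _modCount) taken from oldDocs, so documents
// modified concurrently simply fail to match and are retried.
List<NodeDocument> bulkCreateOrUpdate(List<UpdateOp> ops) {
    Map<String, NodeDocument> oldDocs = readFromCacheOrStore(ops); // steps 1+2
    for (int attempt = 0; attempt < 3 && !ops.isEmpty(); attempt++) {
        BulkResult result = executeBulkUpdate(ops, oldDocs);       // steps 3+4
        ops = result.getFailedOps();             // concurrently modified docs
        for (UpdateOp failed : ops) {
            oldDocs.remove(failed.getId());      // force a fresh read
        }
        oldDocs.putAll(findAll(ops));            // restart from step 2
    }
    return new ArrayList<>(oldDocs.values());    // state before modification
}
{noformat}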

h4. Changes in the Commit class

The new method is used in {{Commit#applyToDocumentStore}}. If it fails (e.g. 
there have been more than 3 unsuccessful retries in the Mongo implementation), 
there is a fallback to the classic approach, applying one update after another.

h4. Changes in the CommitQueue and ConflictException

Introducing bulk updates means that we may have conflicts in many revisions at 
the same time. That's the reason why the {{ConflictException}} now contains a 
revision list rather than a single revision number. In order to resolve 
conflicts in the {{DocumentNodeStoreBranch#merge0}} method, 
{{CommitQueue#suspendUntil()}} has been extended as well: it now allows passing 
a list of revisions and suspends execution until all of them are visible.


was (Author: tomek.rekawek):
The pull request has been created here:
https://github.com/apache/jackrabbit-oak/pull/43

The patch can be downloaded from:
https://patch-diff.githubusercontent.com/raw/apache/jackrabbit-oak/pull/43.diff

h4. New bulk update method

The patch adds a new {{createOrUpdate(Collection collection, List 
updateOps)}} method to the {{DocumentStore}} interface. The MongoDB 
implementation uses the Bulk API. The RDB and Memory document stores have been 
extended with a naive implementation iterating over {{updateOps}}. The Mongo 
implementation works as follows:

1. For each {{UpdateOp}}, try to read the assigned document from the cache. Add 
them to {{oldDocs}}.
2. Prepare a list of all {{UpdateOps}} that don't have their documents yet and 
read them in one {{find()}} call. Add the results to {{oldDocs}}.
3. Prepare a bulk update. For each remaining {{UpdateOp}} add the following 
operation:
* Find the document with the same id and the same {{mod_count}} as in 
{{oldDocs}}.
* Apply the changes from the {{UpdateOps}}.

4. Execute the bulk update.

If some other process modifies the target documents between points 2 and 3, the 
{{mod_count}} will be increased as well and the bulk update will fail for the 
concurrently modified docs. The method will then remove the failed documents 
from {{oldDocs}} and restart the process from point 2. It will stop after the 
3rd iteration.

h4. Changes in the Commit class

The new method is used in {{Commit#applyToDocumentStore}}. If it fails (e.g. 
there have been more than 3 unsuccessful retries in the Mongo implementation), 
there is a fallback to the classic approach, applying one update after another.

h4. Changes in the CommitQueue and ConflictException

Introducing bulk updates means that we may have conflicts in many revisions at 
the same time. That's the reason why the {{ConflictException}} now contains a 
revision list rather than a single revision number. In order to resolve 
conflicts in the {{DocumentNodeStoreBranch#merge0}} method, 
{{CommitQueue#suspendUntil()}} has been extended as well: it now allows passing 
a list of revisions and suspends execution until all of them are visible.

> Bulk document updates in MongoDocumentStore
> ---
>
> Key: OAK-3559
> URL: https://issues.apache.org/jira/browse/OAK-3559
> Project: Jackrabbit Oak
>  Issue Type: Sub-task
>  Components: mongomk
>Reporter: Tomek Rękawek
> Fix For: 1.4
>
> Attachments: OAK-3559.patch
>
>
> Using the MongoDB [Bulk 
> API|https://docs.mongodb.org/manual/reference/method/Bulk/#Bulk] implement 
> the [batch version of createOrUpdate 

[jira] [Updated] (OAK-3713) Remove dep cycle between plugins/tree/TreeTypeProvider and spi.security

2015-12-02 Thread angela (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

angela updated OAK-3713:

Summary: Remove dep cycle between plugins/tree/TreeTypeProvider and 
spi.security  (was: Remove dep cycle between plugins/tree and spi.security)

> Remove dep cycle between plugins/tree/TreeTypeProvider and spi.security
> ---
>
> Key: OAK-3713
> URL: https://issues.apache.org/jira/browse/OAK-3713
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: core
>Reporter: angela
>Assignee: angela
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (OAK-3637) Bulk document updates in RDBDocumentStore

2015-12-02 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/OAK-3637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tomek Rękawek updated OAK-3637:
---
Attachment: OAK-3637.patch

> Bulk document updates in RDBDocumentStore
> -
>
> Key: OAK-3637
> URL: https://issues.apache.org/jira/browse/OAK-3637
> Project: Jackrabbit Oak
>  Issue Type: Technical task
>  Components: rdbmk
>Reporter: Tomek Rękawek
> Fix For: 1.4
>
> Attachments: OAK-3637.patch
>
>
> Implement the [batch createOrUpdate|OAK-3662] in the RDBDocumentStore.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (OAK-3494) MemoryDiffCache should also check parent paths before falling to Loader (or returning null)

2015-12-02 Thread Vikas Saurabh (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-3494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15036670#comment-15036670
 ] 

Vikas Saurabh commented on OAK-3494:


Backported into 1.2 at http://svn.apache.org/r1717683

> MemoryDiffCache should also check parent paths before falling to Loader (or 
> returning null)
> ---
>
> Key: OAK-3494
> URL: https://issues.apache.org/jira/browse/OAK-3494
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: core, mongomk
>Reporter: Vikas Saurabh
>Assignee: Marcel Reutegger
>  Labels: candidate_oak_1_0, candidate_oak_1_2, performance
> Fix For: 1.3.10, 1.2.9
>
> Attachments: OAK-3494-1.patch, OAK-3494-2.patch, 
> OAK-3494-TestCase.patch, OAK-3494.patch
>
>
> Each entry in {{MemoryDiffCache}} is keyed with {{(path, fromRev, toRev)}} 
> for the list of modified children at {{path}}. A diff calculated by 
> {{DocumentNodeStore.diffImpl}} at '/' (passively via loader) or 
> {{JournalEntry.applyTo}} (actively) fills each path for which there are 
> modified children (including the hierarchy).
> But, if an observer calls {{compareWithBaseState}} on an unmodified sub-tree, 
> the observer will still go down to {{diffImpl}}, although the cached parent 
> entry could be used to answer the query.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (OAK-3494) MemoryDiffCache should also check parent paths before falling to Loader (or returning null)

2015-12-02 Thread Vikas Saurabh (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikas Saurabh updated OAK-3494:
---
Fix Version/s: 1.2.9

> MemoryDiffCache should also check parent paths before falling to Loader (or 
> returning null)
> ---
>
> Key: OAK-3494
> URL: https://issues.apache.org/jira/browse/OAK-3494
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: core, mongomk
>Reporter: Vikas Saurabh
>Assignee: Marcel Reutegger
>  Labels: candidate_oak_1_0, candidate_oak_1_2, performance
> Fix For: 1.3.10, 1.2.9
>
> Attachments: OAK-3494-1.patch, OAK-3494-2.patch, 
> OAK-3494-TestCase.patch, OAK-3494.patch
>
>
> Each entry in {{MemoryDiffCache}} is keyed with {{(path, fromRev, toRev)}} 
> for the list of modified children at {{path}}. A diff calculated by 
> {{DocumentNodeStore.diffImpl}} at '/' (passively via loader) or 
> {{JournalEntry.applyTo}} (actively) fills each path for which there are 
> modified children (including the hierarchy).
> But, if an observer calls {{compareWithBaseState}} on an unmodified sub-tree, 
> the observer will still go down to {{diffImpl}}, although the cached parent 
> entry could be used to answer the query.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (OAK-3714) RDBDocumentStore diagnostics for Oracle might not contain index information

2015-12-02 Thread Julian Reschke (JIRA)
Julian Reschke created OAK-3714:
---

 Summary: RDBDocumentStore diagnostics for Oracle might not contain 
index information
 Key: OAK-3714
 URL: https://issues.apache.org/jira/browse/OAK-3714
 Project: Jackrabbit Oak
  Issue Type: Technical task
Affects Versions: 1.0.24, 1.2.8, 1.3.11
Reporter: Julian Reschke
Assignee: Julian Reschke
Priority: Minor


...when the table name contains lowercase characters.
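
A plausible cause, sketched under the assumption that the diagnostics look up 
index metadata via Oracle's dictionary views: Oracle stores unquoted 
identifiers in upper case, so a lowercase table name finds no rows unless it 
is normalized first.

{noformat}
// Hypothetical lookup, not the actual RDBDocumentStore code:
PreparedStatement ps = con.prepareStatement(
        "select index_name from user_indexes where table_name = ?");
// normalize: Oracle catalogs store unquoted names in upper case
ps.setString(1, tableName.toUpperCase(Locale.ENGLISH));
{noformat}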



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (OAK-3713) Remove dep cycle between plugins/tree/TreeTypeProvider and spi.security

2015-12-02 Thread angela (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

angela resolved OAK-3713.
-
   Resolution: Fixed
Fix Version/s: 1.3.12

> Remove dep cycle between plugins/tree/TreeTypeProvider and spi.security
> ---
>
> Key: OAK-3713
> URL: https://issues.apache.org/jira/browse/OAK-3713
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: core
>Reporter: angela
>Assignee: angela
> Fix For: 1.3.12
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (OAK-3715) SegmentWriter reduce buffer size for reading binaries

2015-12-02 Thread Alex Parvulescu (JIRA)
Alex Parvulescu created OAK-3715:


 Summary: SegmentWriter reduce buffer size for reading binaries
 Key: OAK-3715
 URL: https://issues.apache.org/jira/browse/OAK-3715
 Project: Jackrabbit Oak
  Issue Type: Improvement
  Components: segmentmk
Reporter: Alex Parvulescu
Assignee: Alex Parvulescu
Priority: Minor


The SegmentWriter uses an initial buffer size of 256k for reading input streams 
of binaries that need to be persisted; it then checks if the input is smaller 
than 16k to decide whether it can be inlined or not.
In case the input binary is small and can be inlined (<16k), the initial 256k 
buffer is wasteful; it could be reduced to 16k and, _if needed_, increased to 
256k after the threshold check is passed.
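
A sketch of the proposed two-step buffering (an assumption about the patch, 
with made-up helper names):

{noformat}
// Start with a buffer at the 16k inline threshold; only grow to 256k
// once the stream turns out to be larger than the inline limit.
byte[] buf = new byte[16 * 1024];
int n = readFully(stream, buf);             // hypothetical helper
if (n < buf.length) {
    writeInlineValueRecord(buf, n);         // small enough to inline
} else {
    buf = Arrays.copyOf(buf, 256 * 1024);   // threshold passed: grow once
    // ... keep reading into the larger buffer and persist block records
}
{noformat}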



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (OAK-3715) SegmentWriter reduce buffer size for reading binaries

2015-12-02 Thread Alex Parvulescu (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Parvulescu updated OAK-3715:
-
Attachment: OAK-3715.patch

Proposed patch. FYI [~mduerig], [~tmueller].

> SegmentWriter reduce buffer size for reading binaries
> -
>
> Key: OAK-3715
> URL: https://issues.apache.org/jira/browse/OAK-3715
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: segmentmk
>Reporter: Alex Parvulescu
>Assignee: Alex Parvulescu
>Priority: Minor
> Attachments: OAK-3715.patch
>
>
> The SegmentWriter uses an initial buffer size of 256k for reading input 
> streams of binaries that need to be persisted; it then checks if the input is 
> smaller than 16k to decide whether it can be inlined or not.
> In case the input binary is small and can be inlined (<16k), the initial 256k 
> buffer is wasteful; it could be reduced to 16k and, _if needed_, increased to 
> 256k after the threshold check is passed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (OAK-3715) SegmentWriter reduce buffer size for reading binaries

2015-12-02 Thread Alex Parvulescu (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Parvulescu updated OAK-3715:
-
Description: 
The SegmentWriter uses an initial buffer size of 256k for reading input streams 
of binaries that need to be persisted; it then checks if the input is smaller 
than 16k to decide whether it can be inlined or not. [0]
In case the input binary is small and can be inlined (<16k), the initial 256k 
buffer is wasteful; it could be reduced to 16k and, if needed, increased to 
256k after the threshold check is passed.

[0] 
https://github.com/apache/jackrabbit-oak/blob/trunk/oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/segment/SegmentWriter.java#L495

  was:
The SegmentWriter uses an initial buffer size of 256k for reading input streams 
of binaries that need to be persisted; it then checks if the input is smaller 
than 16k to decide whether it can be inlined or not.
In case the input binary is small and can be inlined (<16k), the initial 256k 
buffer is wasteful; it could be reduced to 16k and, _if needed_, increased to 
256k after the threshold check is passed.


> SegmentWriter reduce buffer size for reading binaries
> -
>
> Key: OAK-3715
> URL: https://issues.apache.org/jira/browse/OAK-3715
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: segmentmk
>Reporter: Alex Parvulescu
>Assignee: Alex Parvulescu
>Priority: Minor
> Attachments: OAK-3715.patch
>
>
> The SegmentWriter uses an initial buffer size of 256k for reading input 
> streams of binaries that need to be persisted; it then checks if the input is 
> smaller than 16k to decide whether it can be inlined or not. [0]
> In case the input binary is small and can be inlined (<16k), the initial 256k 
> buffer is wasteful; it could be reduced to 16k and, if needed, increased to 
> 256k after the threshold check is passed.
> [0] 
> https://github.com/apache/jackrabbit-oak/blob/trunk/oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/segment/SegmentWriter.java#L495



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (OAK-3710) Continuous revision GC

2015-12-02 Thread Marcel Reutegger (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-3710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15035849#comment-15035849
 ] 

Marcel Reutegger commented on OAK-3710:
---

Had an offline discussion with Chetan and Vikas about how to implement this 
feature. The basic ideas are (a sketch follows the list):

- Remember T' as the lowest revision time of the _lastRev entries on the root 
document.
- Scan through documents that have a _modified >= T, where T is read from the 
settings collection. Use a value of 0 if T is undefined.
- For each document:
-- remove changes (committed and uncommitted) that are older than 
{{maxRevisionAge}} (see also OAK-3712)
-- rewrite the commit entries of the remaining committed changes and set local 
_revisions entries accordingly (this may collide with split operations!)
- Store T' in the settings collection as the starting point of the next cycle.
- Remove split documents with {{_sdMaxRevTime}} < T (see also OAK-3711).
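
A minimal control-loop sketch of these steps (hypothetical names throughout, 
not an actual Oak API):

{noformat}
// Sketch of one continuous-GC cycle as outlined above.
long t = settings.getLong("lastGCTime", 0L);     // T, 0 if undefined
long tPrime = root.minLastRevTime();             // T' from root _lastRev

for (NodeDocument doc : store.query(NODES, "_modified >= " + t)) {
    doc.removeChangesOlderThan(maxRevisionAge);  // committed and uncommitted
    doc.rewriteCommitEntries();                  // set local _revisions
}
settings.setLong("lastGCTime", tPrime);          // start of the next cycle
removeSplitDocs("_sdMaxRevTime < " + t);         // see OAK-3711
{noformat}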

In addition, it would also be good to change the way documents are split. 
Currently, _commitRoot entries are moved to previous documents. I think it would 
be better to rewrite the change on split and replace _commitRoot with 
_revisions entries carrying the correct commit value. This reduces the 
dependency on the commit root document.

> Continuous revision GC
> --
>
> Key: OAK-3710
> URL: https://issues.apache.org/jira/browse/OAK-3710
> Project: Jackrabbit Oak
>  Issue Type: New Feature
>  Components: core, documentmk
>Reporter: Marcel Reutegger
>
> Implement continuous revision GC cleaning up documents older than a given 
> threshold (e.g. one day). This issue is related to OAK-3070 where each GC run 
> starts where the last one finished.
> This will avoid peak load on the system as we see it right now, when GC is 
> triggered once a day.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (OAK-3611) upgrade H2DB dependency to 1.4.190

2015-12-02 Thread Thomas Mueller (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Mueller updated OAK-3611:

Fix Version/s: 1.3.12

> upgrade H2DB dependency to 1.4.190
> --
>
> Key: OAK-3611
> URL: https://issues.apache.org/jira/browse/OAK-3611
> Project: Jackrabbit Oak
>  Issue Type: Task
>  Components: core
>Reporter: Julian Reschke
>Assignee: Thomas Mueller
> Fix For: 1.3.12
>
>
> (we are currently at 1.4.185)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)