[jira] [Updated] (OAK-3071) Add a compound index for _modified + _id

2015-08-24 Thread Michael Marth (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Marth updated OAK-3071:
---
Labels: performance resilience  (was: )

 Add a compound index for _modified + _id
 

 Key: OAK-3071
 URL: https://issues.apache.org/jira/browse/OAK-3071
 Project: Jackrabbit Oak
  Issue Type: Improvement
  Components: mongomk
Reporter: Chetan Mehrotra
Assignee: Chetan Mehrotra
  Labels: performance, resilience
 Fix For: 1.3.5


 As explained in OAK-1966, the diff logic makes a call like
 bq. db.nodes.find({ _id: { $gt: "3:/content/foo/01/", $lt: 
 "3:/content/foo010" }, _modified: { $gte: 1405085300 } }).sort({_id:1})
 For better and more deterministic query performance we would need to create a 
 compound index like \{_modified:1, _id:1\}. This index would ensure that 
 Mongo does not have to perform an object scan while evaluating such a query.
 Care must be taken that the index is only created by default for a fresh 
 setup. For existing setups we should expose a JMX operation which a system 
 administrator can invoke during a maintenance window to create the required 
 index.
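 The index and the query it targets can be sketched in pure Python (the
 pymongo-style key spec and all sample ids/timestamps below are illustrative
 assumptions, not taken from Oak):

```python
# Hedged sketch: the proposed compound index as a pymongo-style key spec,
# plus a pure-Python model of the diff query it is meant to cover.
compound_index = [("_modified", 1), ("_id", 1)]

def diff_query(docs, id_gt, id_lt, modified_gte):
    # id range scan plus _modified lower bound, result sorted by _id
    hits = [d for d in docs
            if id_gt < d["_id"] < id_lt and d["_modified"] >= modified_gte]
    return sorted(hits, key=lambda d: d["_id"])

docs = [
    {"_id": "3:/content/foo/02", "_modified": 1405085400},
    {"_id": "3:/content/foo/03", "_modified": 1405085000},  # below _modified bound
]
print([d["_id"] for d in diff_query(docs, "3:/content/foo/01/",
                                    "3:/content/foo010", 1405085300)])
# -> ['3:/content/foo/02']
```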



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (OAK-2779) DocumentNodeStore should provide option to set initial cache size as percentage of MAX VM size

2015-08-24 Thread Michael Marth (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-2779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Marth updated OAK-2779:
---
Labels: performance resilience  (was: )

 DocumentNodeStore should provide option to set initial cache size as 
 percentage of MAX VM size
 --

 Key: OAK-2779
 URL: https://issues.apache.org/jira/browse/OAK-2779
 Project: Jackrabbit Oak
  Issue Type: Improvement
  Components: mongomk
Affects Versions: 1.2
Reporter: Will McGauley
  Labels: performance, resilience
 Fix For: 1.3.5


 Currently the DocumentNodeStore provides a way to configure various cache 
 parameters, including the cache size and the distribution of that size across 
 the various caches.  The distribution is done as a % of the total cache size, 
 which is very helpful, but the overall cache size can only be set as a 
 literal value.
 It would be helpful to derive a good default from the available VM 
 memory as a %, instead of a literal value.  That way the cache size 
 would not need to be set by each customer, and a better initial experience 
 would be achieved.  
 I suggest that 25% of the max VM size would be a good starting point.
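 The proposed default can be sketched as follows (the function name and the
 25% default are illustrative; Oak's actual configuration API may differ):

```python
# Hedged sketch: derive the default cache size as a percentage of the
# maximum VM heap instead of requiring a literal value.
def default_cache_size(max_heap_bytes, percentage=25):
    # integer size in bytes, `percentage` of the max heap
    return max_heap_bytes * percentage // 100

# e.g. a 4 GB heap with the suggested 25% default
print(default_cache_size(4 * 1024**3))  # -> 1073741824 (1 GB)
```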





[jira] [Updated] (OAK-1575) DocumentNS: Implement refined conflict resolution for addExistingNode conflicts

2015-08-24 Thread Michael Marth (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-1575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Marth updated OAK-1575:
---
Labels: resilience  (was: )

 DocumentNS: Implement refined conflict resolution for addExistingNode 
 conflicts
 ---

 Key: OAK-1575
 URL: https://issues.apache.org/jira/browse/OAK-1575
 Project: Jackrabbit Oak
  Issue Type: Sub-task
  Components: mongomk
Reporter: Michael Dürig
Assignee: Marcel Reutegger
  Labels: resilience
 Fix For: 1.4


 Implement refined conflict resolution for addExistingNode conflicts as 
 defined in the parent issue for the document NS.





[jira] [Updated] (OAK-2242) provide a way to update the created timestamp of a NodeDocument

2015-08-24 Thread Michael Marth (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Marth updated OAK-2242:
---
Labels: performance  (was: )

 provide a way to update the created timestamp of a NodeDocument
 -

 Key: OAK-2242
 URL: https://issues.apache.org/jira/browse/OAK-2242
 Project: Jackrabbit Oak
  Issue Type: Improvement
  Components: mongomk
Affects Versions: 1.1.1
Reporter: Julian Reschke
Assignee: Julian Reschke
  Labels: performance
 Fix For: 1.4


 Both the MongoDocumentStore and the RDBDocumentStore maintain a _modCount 
 property, which uniquely identifies a version of a document in the 
 persistence.
 Sometimes we read data from the persistence although we might already have 
 the document cached. This happens:
 a) when the cached document is older than what the caller asked for
 b) when running a query (for instance when looking up the children of a node)
 In both cases we currently replace the cache entry with a newly built 
 NodeDocument.
 It would make sense to re-use the existing document instead. (This would 
 probably require modifying the created timestamp, but would avoid the 
 trouble of having to update the cache at all.)
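 The proposed reuse can be sketched as follows (a simplified model, not the
 Oak implementation; the cache here is a plain dict):

```python
# Hedged sketch: reuse the cached document when its _modCount matches the
# freshly read one, instead of replacing the cache entry with a newly
# built NodeDocument.
def merge_into_cache(cache, fresh):
    cached = cache.get(fresh["_id"])
    if cached is not None and cached["_modCount"] == fresh["_modCount"]:
        return cached            # same persisted version: keep the cached object
    cache[fresh["_id"]] = fresh  # newer version: replace the entry
    return fresh

cache = {"0:/": {"_id": "0:/", "_modCount": 7}}
same = merge_into_cache(cache, {"_id": "0:/", "_modCount": 7})
print(same is cache["0:/"])  # -> True, no cache update needed
```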





[jira] [Updated] (OAK-1557) Mark documents as deleted

2015-08-24 Thread Michael Marth (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-1557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Marth updated OAK-1557:
---
Fix Version/s: (was: 1.4)
   1.3.6

 Mark documents as deleted
 -

 Key: OAK-1557
 URL: https://issues.apache.org/jira/browse/OAK-1557
 Project: Jackrabbit Oak
  Issue Type: Improvement
  Components: mongomk
Reporter: Marcel Reutegger
Assignee: Chetan Mehrotra
  Labels: performance, resilience
 Fix For: 1.3.6


 This is an improvement to make a certain use case more efficient. When there 
 is a parent node with frequently added and removed child nodes, reading 
 the current list of child nodes becomes inefficient, because the decision 
 whether a node exists at a certain revision is made in the DocumentNodeStore 
 and no filtering is done on the MongoDB side.
 So far we figured this would be solved automatically by the MVCC garbage 
 collection, when documents for deleted nodes are removed. However, for 
 locations in the repository where nodes are added and deleted again 
 frequently (think of a temp folder), the issue pops up before the GC has had 
 a chance to clean up.
 The Document should have an additional field, which is set when the node is 
 deleted in the most recent revision. Based on this field the 
 DocumentNodeStore can limit the query to MongoDB to documents that are not 
 deleted.
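 The proposed filtering can be sketched as follows (the field name
 {{_deleted}} is an assumption for illustration; the issue only says "an
 additional field"):

```python
# Hedged sketch: with a flag such as "_deleted" set when a node is removed
# in its most recent revision, the child query can exclude deleted
# documents on the MongoDB side instead of in the DocumentNodeStore.
def live_children(docs):
    return [d for d in docs if not d.get("_deleted", False)]

docs = [
    {"_id": "2:/tmp/a"},
    {"_id": "2:/tmp/b", "_deleted": True},  # deleted in its latest revision
]
print([d["_id"] for d in live_children(docs)])  # -> ['2:/tmp/a']
```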





[jira] [Updated] (OAK-2242) provide a way to update the created timestamp of a NodeDocument

2015-08-24 Thread Michael Marth (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Marth updated OAK-2242:
---
Fix Version/s: (was: 1.4)
   1.3.7

 provide a way to update the created timestamp of a NodeDocument
 -

 Key: OAK-2242
 URL: https://issues.apache.org/jira/browse/OAK-2242
 Project: Jackrabbit Oak
  Issue Type: Improvement
  Components: mongomk
Affects Versions: 1.1.1
Reporter: Julian Reschke
Assignee: Julian Reschke
  Labels: performance
 Fix For: 1.3.7


 Both the MongoDocumentStore and the RDBDocumentStore maintain a _modCount 
 property, which uniquely identifies a version of a document in the 
 persistence.
 Sometimes we read data from the persistence although we might already have 
 the document cached. This happens:
 a) when the cached document is older than what the caller asked for
 b) when running a query (for instance when looking up the children of a node)
 In both cases we currently replace the cache entry with a newly built 
 NodeDocument.
 It would make sense to re-use the existing document instead. (This would 
 probably require modifying the created timestamp, but would avoid the 
 trouble of having to update the cache at all.)





[jira] [Updated] (OAK-2621) Too many reads for child nodes

2015-08-24 Thread Michael Marth (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-2621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Marth updated OAK-2621:
---
Fix Version/s: (was: 1.3.5)
   1.3.7

 Too many reads for child nodes
 --

 Key: OAK-2621
 URL: https://issues.apache.org/jira/browse/OAK-2621
 Project: Jackrabbit Oak
  Issue Type: Improvement
  Components: mongomk
Affects Versions: 1.0
Reporter: Marcel Reutegger
  Labels: performance
 Fix For: 1.3.7


 The DocumentNodeStore issues a lot of reads when sibling nodes that are also 
 indexed with a property index are deleted.
 The following calls will become a hotspot:
 {noformat}
   at 
 org.apache.jackrabbit.oak.plugins.document.mongo.MongoDocumentStore.query(MongoDocumentStore.java:406)
   at 
 org.apache.jackrabbit.oak.plugins.document.DocumentNodeStore.readChildDocs(DocumentNodeStore.java:846)
   at 
 org.apache.jackrabbit.oak.plugins.document.DocumentNodeStore.readChildren(DocumentNodeStore.java:788)
   at 
 org.apache.jackrabbit.oak.plugins.document.DocumentNodeStore.getChildren(DocumentNodeStore.java:753)
   at 
 org.apache.jackrabbit.oak.plugins.document.DocumentNodeState.getChildNodeCount(DocumentNodeState.java:194)
   at 
 org.apache.jackrabbit.oak.plugins.memory.ModifiedNodeState.getChildNodeCount(ModifiedNodeState.java:198)
   at 
 org.apache.jackrabbit.oak.plugins.memory.MutableNodeState.getChildNodeCount(MutableNodeState.java:265)
   at 
 org.apache.jackrabbit.oak.plugins.memory.MemoryNodeBuilder.getChildNodeCount(MemoryNodeBuilder.java:293)
   at 
 org.apache.jackrabbit.oak.plugins.index.property.strategy.ContentMirrorStoreStrategy.prune(ContentMirrorStoreStrategy.java:456)
 {noformat}
 I think the code triggering this issue is in 
 {{ModifiedNodeState.getChildNodeCount()}}. It keeps track of already deleted 
 children and requests {{max += deleted}}. The actual {{max}} is always 1 as 
 requested from {{ContentMirrorStoreStrategy.prune()}}, but as more nodes get 
 deleted, a higher {{max}} gets passed to 
 {{DocumentNodeState.getChildNodeCount()}}. The DocumentNodeStore then checks 
 whether it has the children in the cache, only to find that the cache entry 
 has too few entries and it needs to fetch one more.
 It would be best to have a minimum number of child nodes to fetch from 
 MongoDB in this case. E.g. when NodeState.getChildNodeEntries() is called, 
 the DocumentNodeState fetches 100 children.
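 The suggested minimum batch can be sketched as follows (the value 100 is the
 issue's example; the function name is invented):

```python
# Hedged sketch: always fetch at least a minimum batch of children so that
# repeated "one more child" requests do not each turn into a MongoDB read.
MIN_CHILD_FETCH = 100  # value suggested in the issue

def children_to_fetch(requested):
    return max(requested, MIN_CHILD_FETCH)

print(children_to_fetch(1))    # -> 100
print(children_to_fetch(250))  # -> 250
```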





[jira] [Updated] (OAK-3071) Add a compound index for _modified + _id

2015-08-24 Thread Michael Marth (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Marth updated OAK-3071:
---
Fix Version/s: (was: 1.3.5)
   1.3.7

 Add a compound index for _modified + _id
 

 Key: OAK-3071
 URL: https://issues.apache.org/jira/browse/OAK-3071
 Project: Jackrabbit Oak
  Issue Type: Improvement
  Components: mongomk
Reporter: Chetan Mehrotra
Assignee: Chetan Mehrotra
  Labels: performance, resilience
 Fix For: 1.3.7


 As explained in OAK-1966, the diff logic makes a call like
 bq. db.nodes.find({ _id: { $gt: "3:/content/foo/01/", $lt: 
 "3:/content/foo010" }, _modified: { $gte: 1405085300 } }).sort({_id:1})
 For better and more deterministic query performance we would need to create a 
 compound index like \{_modified:1, _id:1\}. This index would ensure that 
 Mongo does not have to perform an object scan while evaluating such a query.
 Care must be taken that the index is only created by default for a fresh 
 setup. For existing setups we should expose a JMX operation which a system 
 administrator can invoke during a maintenance window to create the required 
 index.





[jira] [Updated] (OAK-2492) Flag Document having many children

2015-08-24 Thread Michael Marth (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-2492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Marth updated OAK-2492:
---
Fix Version/s: (was: 1.4)
   1.3.7

 Flag Document having many children
 --

 Key: OAK-2492
 URL: https://issues.apache.org/jira/browse/OAK-2492
 Project: Jackrabbit Oak
  Issue Type: Improvement
  Components: mongomk
Reporter: Chetan Mehrotra
Assignee: Chetan Mehrotra
  Labels: performance
 Fix For: 1.3.7


 The current DocumentMK logic for performing a diff of child nodes works as 
 follows:
 # Get the children for the _before_ revision up to MANY_CHILDREN_THRESHOLD 
 (which defaults to 50). Note that the current logic for fetching child nodes 
 also adds the child {{NodeDocument}}s to the {{Document}} cache and reads the 
 complete Document for those children
 # Get the children for the _after_ revision with the same limit
 # If the child list is complete, do a direct diff on the fetched children
 # If the list is not complete, i.e. the number of children exceeds the 
 threshold, fall back to a query-based diff (also see OAK-1970)
 So in cases where the number of children is large, all the work done in #1 
 above is wasted and should be avoided. To do that we can mark parent nodes 
 which have many children with a special flag like {{_manyChildren}}. Once 
 such nodes are marked, the diff logic can check for the flag and skip the 
 work done in #1.
 This is similar to the way we mark nodes which have at least one child 
 (OAK-1117)
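 The flag check can be sketched as follows (the {{_manyChildren}} name is
 from the issue; the surrounding logic is a simplified assumption):

```python
# Hedged sketch: skip the bounded child fetch entirely when the parent is
# flagged as having many children, and go straight to the query-based diff.
MANY_CHILDREN_THRESHOLD = 50  # default mentioned in the issue

def diff_strategy(parent_doc):
    if parent_doc.get("_manyChildren"):
        return "query-based diff"  # skip the wasted fetch of step #1
    return "bounded fetch, then diff if complete"

print(diff_strategy({"_id": "1:/content", "_manyChildren": True}))
# -> query-based diff
```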





[jira] [Updated] (OAK-3018) Use batch-update in backgroundWrite

2015-08-24 Thread Michael Marth (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Marth updated OAK-3018:
---
Fix Version/s: (was: 1.3.5)
   1.3.7

 Use batch-update in backgroundWrite
 ---

 Key: OAK-3018
 URL: https://issues.apache.org/jira/browse/OAK-3018
 Project: Jackrabbit Oak
  Issue Type: Improvement
  Components: mongomk
Reporter: Stefan Egli
  Labels: performance
 Fix For: 1.3.7


 (From an earlier [post on the 
 list|http://markmail.org/thread/mkrvhkfabit4osli]) The 
 DocumentNodeStore.backgroundWrite goes through the heavy work of updating the 
 lastRev for all pending changes and does so in a hierarchical, depth-first 
 manner. Unfortunately, if the pending changes all come from separate commits 
 (which is not unlikely), the updates are sent to mongo in individual update 
 calls (whenever the lastRev differs), which, if there are many changes, 
 results in many calls to mongo.
 OAK-2066 is about extending the DocumentStore API with a batch-update method. 
 Once available, it should be used in {{backgroundWrite}} as 
 well.
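 The batching idea can be sketched as follows (this models only the grouping
 step; the actual OAK-2066 batch-update API may look different):

```python
# Hedged sketch: group pending lastRev updates by their revision value so
# each distinct lastRev becomes one batched update call instead of one
# call per document.
from collections import defaultdict

def batch_last_rev_updates(pending):
    """pending: {doc_id: lastRev} -> {lastRev: [doc_ids]}"""
    batches = defaultdict(list)
    for doc_id, last_rev in sorted(pending.items()):
        batches[last_rev].append(doc_id)
    return dict(batches)

pending = {"1:/a": "r15-0-1", "1:/b": "r15-0-1", "1:/c": "r16-0-1"}
print(batch_last_rev_updates(pending))
# -> {'r15-0-1': ['1:/a', '1:/b'], 'r16-0-1': ['1:/c']}
```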





[jira] [Updated] (OAK-3066) Persistent cache for previous documents

2015-08-24 Thread Michael Marth (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Marth updated OAK-3066:
---
Fix Version/s: (was: 1.3.5)
   1.3.7

 Persistent cache for previous documents
 ---

 Key: OAK-3066
 URL: https://issues.apache.org/jira/browse/OAK-3066
 Project: Jackrabbit Oak
  Issue Type: Improvement
  Components: core, mongomk
Reporter: Marcel Reutegger
Assignee: Vikas Saurabh
  Labels: performance
 Fix For: 1.3.7


 Previous (aka split) documents contain old revisions and are immutable. 
 Those documents should go into the persistent cache to reduce 
 calls to the underlying DocumentStore.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Closed] (OAK-3222) RDBDocumentStore: add missing RDBHelper support for JOURNAL table

2015-08-24 Thread Davide Giannella (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Davide Giannella closed OAK-3222.
-

Bulk close for 1.0.19

 RDBDocumentStore: add missing RDBHelper support for JOURNAL table
 -

 Key: OAK-3222
 URL: https://issues.apache.org/jira/browse/OAK-3222
 Project: Jackrabbit Oak
  Issue Type: Sub-task
  Components: rdbmk
Affects Versions: 1.2.3
Reporter: Julian Reschke
Assignee: Julian Reschke
 Fix For: 1.2.4, 1.0.19








[jira] [Updated] (OAK-2744) Change default cache distribution ratio if persistent cache is enabled

2015-08-24 Thread Michael Marth (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-2744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Marth updated OAK-2744:
---
Fix Version/s: (was: 1.3.5)
   1.3.7

 Change default cache distribution ratio if persistent cache is enabled
 --

 Key: OAK-2744
 URL: https://issues.apache.org/jira/browse/OAK-2744
 Project: Jackrabbit Oak
  Issue Type: Improvement
  Components: mongomk
Reporter: Chetan Mehrotra
Assignee: Chetan Mehrotra
  Labels: performance
 Fix For: 1.3.7


 By default the cache memory in DocumentNodeStore is distributed in the 
 following ratio:
 * nodeCache - 25%
 * childrenCache - 10%
 * docChildrenCache - 3%
 * diffCache - 5%
 * documentCache - is given the rest, i.e. 57%
 However, lately we have found that with the persistent cache enabled we can 
 lower the share allocated to the document cache. That would reduce the time 
 spent invalidating cache entries during periodic reads. So far we have been 
 using the following ratio in a few setups and it is turning out well:
 * nodeCachePercentage=35
 * childrenCachePercentage=20
 * diffCachePercentage=30
 * docChildrenCachePercentage=10
 * documentCache - is given the rest, i.e. 5%
 We should use the above distribution by default if the persistent cache is 
 found to be enabled.
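 The proposed distribution can be sketched as follows (a simplified model;
 the dict keys are illustrative and differ from Oak's actual option names):

```python
# Hedged sketch: compute per-cache sizes from the ratios proposed for
# setups with the persistent cache enabled.
PERSISTENT_CACHE_RATIOS = {
    "nodeCache": 35,
    "childrenCache": 20,
    "diffCache": 30,
    "docChildrenCache": 10,
}  # documentCache gets the remainder, i.e. 5%

def distribute(total_bytes, ratios=PERSISTENT_CACHE_RATIOS):
    sizes = {name: total_bytes * pct // 100 for name, pct in ratios.items()}
    sizes["documentCache"] = total_bytes - sum(sizes.values())  # the rest
    return sizes

sizes = distribute(256 * 1024**2)  # e.g. a 256 MB total cache
print(sizes["nodeCache"])  # -> 93952409
```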





[jira] [Closed] (OAK-3180) Versioning: improve diagnostics when version history state is broken

2015-08-24 Thread Davide Giannella (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Davide Giannella closed OAK-3180.
-

Bulk close for 1.0.19

 Versioning: improve diagnostics when version history state is broken
 

 Key: OAK-3180
 URL: https://issues.apache.org/jira/browse/OAK-3180
 Project: Jackrabbit Oak
  Issue Type: Improvement
  Components: core
Affects Versions: 1.2.3, 1.3.3, 1.0.18
Reporter: Julian Reschke
Assignee: Julian Reschke
 Fix For: 1.0.19

 Attachments: OAK-3180.diff


 Users suffering from the problem described in OAK-3169 may encounter NPEs 
 upon checkin(), as ReadWriteVersionManager.checkin() does not check the 
 return value of getExistingBaseVersion() for null. Even if we can't fix the 
 underlying problem easily, we should at least provide better diagnostics.





[jira] [Updated] (OAK-2836) Create diff cache entry for merged persisted branch

2015-08-24 Thread Michael Marth (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-2836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Marth updated OAK-2836:
---
Fix Version/s: (was: 1.3.5)
   1.3.7

 Create diff cache entry for merged persisted branch
 ---

 Key: OAK-2836
 URL: https://issues.apache.org/jira/browse/OAK-2836
 Project: Jackrabbit Oak
  Issue Type: Improvement
  Components: core, mongomk
Reporter: Marcel Reutegger
  Labels: performance
 Fix For: 1.3.7


 The diff cache is currently not populated with an entry when a persisted 
 branch in the DocumentNodeStore is merged. This means the diff needs to be 
 calculated later, which may affect performance when events are generated.





[jira] [Closed] (OAK-3189) CLONE - MissingLastRevSeeker non MongoDS may fail with OOM

2015-08-24 Thread Davide Giannella (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Davide Giannella closed OAK-3189.
-

Bulk close for 1.0.19

 CLONE - MissingLastRevSeeker non MongoDS may fail with OOM
 --

 Key: OAK-3189
 URL: https://issues.apache.org/jira/browse/OAK-3189
 Project: Jackrabbit Oak
  Issue Type: Task
  Components: core, rdbmk
Affects Versions: 1.0.18
Reporter: Julian Reschke
Assignee: Julian Reschke
 Fix For: 1.0.19


 (This clones OAK-2208, as that never made it into the 1.0 branch.)
 This code currently has a hardwired optimization for MongoDB (returning an 
 Iterable over a DBCursor). For all other persistences, a Java List of all 
 matching NodeDocuments will be built.
 I see two ways to address this:
 1) Generalize the Mongo approach, where a query to the persistence can return 
 a live iterator, or
 2) Stick with the public DS API, but leverage paging (get N nodes at once, 
 and then keep calling query() again with the right starting ID).
 2) sounds simpler, but is not transactional; [~mreutegg] would that be 
 sufficient?
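 Option 2 can be sketched as follows (a simplified model of paging against
 query(); all names here are invented, and the real DocumentStore API takes
 more parameters):

```python
# Hedged sketch: page through the store with repeated bounded queries
# instead of materialising all matching documents in one list.
def paged_query(query, batch_size=100):
    """query(from_id, limit) must return docs with ascending "_id"."""
    from_id = ""
    while True:
        batch = query(from_id, batch_size)
        yield from batch
        if len(batch) < batch_size:
            return                      # short batch: no more documents
        from_id = batch[-1]["_id"]      # resume after the last id seen

all_docs = [{"_id": "1:/n%03d" % i} for i in range(7)]
fake_query = lambda frm, lim: [d for d in all_docs if d["_id"] > frm][:lim]
print(len(list(paged_query(fake_query, batch_size=3))))  # -> 7
```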





[jira] [Updated] (OAK-2610) Annotate intermediate docs with property names

2015-08-24 Thread Michael Marth (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-2610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Marth updated OAK-2610:
---
Fix Version/s: (was: 1.3.5)
   1.3.7

 Annotate intermediate docs with property names
 --

 Key: OAK-2610
 URL: https://issues.apache.org/jira/browse/OAK-2610
 Project: Jackrabbit Oak
  Issue Type: Improvement
  Components: mongomk
Reporter: Marcel Reutegger
Assignee: Marcel Reutegger
  Labels: performance
 Fix For: 1.3.7


 Reading through a ValueMap can be very inefficient if the changes of a given
 property are distributed sparsely across the previous documents. The current
 implementation has to scan through the entire set of previous documents to
 collect the changes.
 Intermediate documents should have additional information about which 
 properties are present on the referenced previous documents.





[jira] [Closed] (OAK-3221) JournalTest may fail on machine with slow I/O

2015-08-24 Thread Davide Giannella (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Davide Giannella closed OAK-3221.
-

Bulk close for 1.0.19

 JournalTest may fail on machine with slow I/O
 -

 Key: OAK-3221
 URL: https://issues.apache.org/jira/browse/OAK-3221
 Project: Jackrabbit Oak
  Issue Type: Bug
  Components: core, mongomk
Reporter: Marcel Reutegger
Assignee: Marcel Reutegger
Priority: Minor
 Fix For: 1.0.19


 [~reschke] reported a failure for 
 JournalTest.lastRevRecoveryJournalTestWithConcurrency() on a test machine 
 without an SSD.
 This test creates 200 threads running lastRev recovery concurrently. Each 
 thread will create a map using MapFactory. The default implementation in 1.0 
 is backed by MapDB and therefore creates quite a bit of I/O. Even on my 
 machine with an SSD the test takes 11 seconds to run.





[jira] [Updated] (OAK-3272) DocumentMK scalability improvements

2015-08-24 Thread Michael Marth (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Marth updated OAK-3272:
---
Fix Version/s: 1.3.8

 DocumentMK scalability improvements
 ---

 Key: OAK-3272
 URL: https://issues.apache.org/jira/browse/OAK-3272
 Project: Jackrabbit Oak
  Issue Type: Improvement
  Components: mongomk, rdbmk
Reporter: Michael Marth
  Labels: scalability
 Fix For: 1.3.8


 Collector issue for tracking DocMK issues concerning scalability





[jira] [Updated] (OAK-1769) Better cooperation for conflicting updates across cluster nodes

2015-08-24 Thread Michael Marth (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-1769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Marth updated OAK-1769:
---
Fix Version/s: (was: 1.3.6)
   1.3.8

 Better cooperation for conflicting updates across cluster nodes
 ---

 Key: OAK-1769
 URL: https://issues.apache.org/jira/browse/OAK-1769
 Project: Jackrabbit Oak
  Issue Type: Improvement
  Components: core, mongomk
Reporter: Marcel Reutegger
Assignee: Marcel Reutegger
  Labels: concurrency, scalability
 Fix For: 1.3.8


 Every now and then we see commit failures in a cluster when many sessions try 
 to update the same property or perform some other conflicting update.
 The current implementation will retry the merge after a delay, but chances 
 are that some session on another cluster node changed the property again in 
 the meantime. This leads to yet another retry until the limit is reached and 
 the commit fails. The conflict logic is quite unfair, because it favors the 
 winning session.
 The implementation should be improved to behave more fairly across 
 cluster nodes when there are conflicts caused by competing sessions.





[jira] [Updated] (OAK-2592) Commit does not ensure w:majority

2015-08-24 Thread Michael Marth (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-2592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Marth updated OAK-2592:
---
Fix Version/s: (was: 1.3.6)
   1.3.8

 Commit does not ensure w:majority
 -

 Key: OAK-2592
 URL: https://issues.apache.org/jira/browse/OAK-2592
 Project: Jackrabbit Oak
  Issue Type: Bug
  Components: core, mongomk
Reporter: Marcel Reutegger
Assignee: Marcel Reutegger
  Labels: resilience, scalability
 Fix For: 1.3.8


 The MongoDocumentStore uses {{findAndModify()}} to commit a transaction. This 
 operation does not allow an application specified write concern and always 
 uses the MongoDB default write concern {{Acknowledged}}. This means a commit 
 may not make it to a majority of a replica set when the primary fails. From a 
 MongoDocumentStore perspective it may appear as if a write was successful and 
 later reverted. See also the test in OAK-1641.
 To fix this, we'd probably have to change the MongoDocumentStore to avoid 
 {{findAndModify()}} and use {{update()}} instead.





[jira] [Updated] (OAK-2106) Optimize reads from secondaries

2015-08-24 Thread Michael Marth (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-2106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Marth updated OAK-2106:
---
Fix Version/s: (was: 1.3.6)
   1.3.8

 Optimize reads from secondaries
 ---

 Key: OAK-2106
 URL: https://issues.apache.org/jira/browse/OAK-2106
 Project: Jackrabbit Oak
  Issue Type: Improvement
  Components: core, mongomk
Reporter: Marcel Reutegger
Assignee: Marcel Reutegger
  Labels: performance, scalability
 Fix For: 1.3.8


 OAK-1645 introduced support for reads from secondaries under certain
 conditions. The current implementation checks the _lastRev on a potentially
 cached parent document and reads from a secondary if it has not been
 modified in the last 6 hours. This timespan is somewhat arbitrary but
 reflects the assumption that the replication lag of a secondary shouldn't
 be more than 6 hours.
 This logic should be optimized to take the actual replication lag into
 account. MongoDB provides information about the replication lag with
 the command rs.status().
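 The lag-aware check can be sketched as follows (the threshold policy and the
 safety margin are invented for illustration; the actual lag would be derived
 from the member optimes reported by rs.status()):

```python
# Hedged sketch: allow a secondary read when the document is older than the
# observed replication lag times a safety margin, instead of a fixed
# 6-hour assumption. All values are in seconds.
def can_read_from_secondary(last_modified, now, replication_lag, margin=2):
    return (now - last_modified) > replication_lag * margin

print(can_read_from_secondary(1_000_000, 1_000_400, replication_lag=100))  # -> True
print(can_read_from_secondary(1_000_000, 1_000_150, replication_lag=100))  # -> False
```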





[jira] [Updated] (OAK-2622) dynamic cache allocation

2015-08-24 Thread Michael Marth (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-2622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Marth updated OAK-2622:
---
Labels: performance resilience  (was: resilience)

 dynamic cache allocation
 

 Key: OAK-2622
 URL: https://issues.apache.org/jira/browse/OAK-2622
 Project: Jackrabbit Oak
  Issue Type: Improvement
  Components: mongomk
Affects Versions: 1.0.12
Reporter: Stefan Egli
  Labels: performance, resilience
 Fix For: 1.3.5


 At the moment mongoMk's various caches are configurable (OAK-2546), but 
 otherwise static in terms of size. Different use cases might require 
 different allocations across the sub-caches, though, and it might not always 
 be possible to find a good configuration upfront for all use cases. 
 We might be able to dynamically allocate the overall cache 
 size to the different sub-caches, based for example on how heavily loaded 
 or how well-performing each cache is.





[jira] [Updated] (OAK-3271) Improve DocumentMK performance

2015-08-24 Thread Michael Marth (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Marth updated OAK-3271:
---
Component/s: rdbmk
 mongomk

 Improve DocumentMK performance
 --

 Key: OAK-3271
 URL: https://issues.apache.org/jira/browse/OAK-3271
 Project: Jackrabbit Oak
  Issue Type: Improvement
  Components: mongomk, rdbmk
Reporter: Michael Marth
  Labels: performance
 Fix For: 1.3.7


 Collector issue for DocMK performance improvements





[jira] [Created] (OAK-3271) Improve DocumentMK performance

2015-08-24 Thread Michael Marth (JIRA)
Michael Marth created OAK-3271:
--

 Summary: Improve DocumentMK performance
 Key: OAK-3271
 URL: https://issues.apache.org/jira/browse/OAK-3271
 Project: Jackrabbit Oak
  Issue Type: Improvement
Reporter: Michael Marth
 Fix For: 1.3.7


Collector issue for DocMK performance improvements





[jira] [Assigned] (OAK-3273) ColdStandby make sync start and end timestamp updates atomic

2015-08-24 Thread Alex Parvulescu (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Parvulescu reassigned OAK-3273:


Assignee: Alex Parvulescu

 ColdStandby make sync start and end timestamp updates atomic
 

 Key: OAK-3273
 URL: https://issues.apache.org/jira/browse/OAK-3273
 Project: Jackrabbit Oak
  Issue Type: Improvement
  Components: tarmk-standby
Reporter: Valentin Olteanu
Assignee: Alex Parvulescu
Priority: Minor
 Attachments: OAK-3273.patch


 OAK-3113 introduced two fields in the ColdStandby MBean: SyncStartTimestamp 
 and SyncEndTimestamp. This is much more useful than the old 
 SecondsSinceLastSuccess, yet there are situations in which it's hard to 
 interpret them since they are updated independently:
  - it's impossible to correlate the start with the end
  - in case of failure, the start still reflects the failed cycle
 It would be even better if the two were updated atomically, to reflect 
 the start and end of the last successful cycle. 





[jira] [Commented] (OAK-2844) Introducing a simple document-based discovery-light service (to circumvent documentMk's eventual consistency delays)

2015-08-24 Thread Stefan Egli (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-2844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14709335#comment-14709335
 ] 

Stefan Egli commented on OAK-2844:
--

thx, just saw it on jenkins too and switched to memory store. can you pls try 
again? thx!

 Introducing a simple document-based discovery-light service (to circumvent 
 documentMk's eventual consistency delays)
 

 Key: OAK-2844
 URL: https://issues.apache.org/jira/browse/OAK-2844
 Project: Jackrabbit Oak
  Issue Type: New Feature
  Components: mongomk
Reporter: Stefan Egli
Assignee: Stefan Egli
  Labels: resilience
 Fix For: 1.3.5

 Attachments: InstanceStateChangeListener.java, OAK-2844.WIP-02.patch, 
 OAK-2844.patch, OAK-2844.v3.patch, OAK-2844.v4.patch


 When running discovery.impl on a mongoMk-backed jcr repository, there are 
 risks of hitting problems such as described in SLING-3432 
 pseudo-network-partitioning: this happens when a jcr-level heartbeat does 
 not reach peers within the configured heartbeat timeout - it then treats that 
 affected instance as dead, removes it from the topology, and continues with 
 the remaining instances, potentially electing a new leader, running the risk of 
 duplicate leaders. This happens when delays in mongoMk grow larger than the 
 (configured) heartbeat timeout. These problems ultimately are due to the 
 'eventual consistency' nature of, not only mongoDB, but more so of mongoMk. 
 The only alternative so far is to increase the heartbeat timeout to match the 
 expected or measured delays that mongoMk can produce (under say given 
 load/performance scenarios).
 Assuming that mongoMk will always carry a risk of certain delays, and that a 
 reasonable maximum (reasonable for the discovery.impl timeout, that is) cannot 
 be guaranteed, a better solution is to provide discovery with more 'real-time' 
 information and/or privileged access to mongoDb.
 Here's a summary of alternatives that have so far been floating around as a 
 solution to circumvent eventual consistency:
  # expose existing (jmx) information about active 'clusterIds' - this has 
 been proposed in SLING-4603. The pros: reuse of existing functionality. The 
 cons: going via jmx, binding of exposed functionality as 'to be maintained 
 API'
  # expose a plain mongo db/collection (via osgi injection) such that a higher 
 (sling) level discovery could directly write heartbeats there. The pros: 
 heartbeat latency would be minimal (assuming the collection is not sharded). 
 The cons: exposes a mongo db/collection potentially also to anyone else, with 
 the risk of opening up to unwanted possibilities
  # introduce a simple 'discovery-light' API to oak which solely provides 
 information about which instances are active in a cluster. The implementation 
 of this is not exposed. The pros: no need to expose a mongoDb/collection, 
 allows any other jmx-functionality to remain unchanged. The cons: a new API 
 that must be maintained
 This ticket is about the 3rd option, about a new mongo-based discovery-light 
 service that is introduced to oak. The functionality in short:
  * it defines a 'local instance id' that is non-persisted, ie can change at 
 each bundle activation.
  * it defines a 'view id' that uniquely identifies a particular incarnation 
 of a 'cluster view/state' (which is: a list of active instance ids)
  * and it defines a list of active instance ids
  * the above attributes are passed to interested components via a listener 
 that can be registered. that listener is called whenever the discovery-light 
 notices the cluster view has changed.
 While the actual implementation could in fact be based on the existing 
 {{getActiveClusterNodes()}} {{getClusterId()}} of the 
 {{DocumentNodeStoreMBean}}, the suggestion is to not fiddle with that part, 
 as that has dependencies to other logic. But instead, the suggestion is to 
 create a dedicated, other, collection ('discovery') where heartbeats as well 
 as the currentView are stored.
 Will attach a suggestion for an initial version of this for review.
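 The listener-based contract described above could be sketched roughly as follows (all class and method names are illustrative assumptions; the actual proposal is in the attached InstanceStateChangeListener.java, not this sketch):

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Illustrative sketch of the proposed discovery-light contract; names are
// hypothetical and do not reflect the attached patch's API.
class DiscoveryLight {

    interface ClusterViewListener {
        // viewId identifies one incarnation of the cluster view;
        // activeInstanceIds lists the instances currently considered alive
        void viewChanged(String viewId, int localInstanceId, List<Integer> activeInstanceIds);
    }

    private final List<ClusterViewListener> listeners = new CopyOnWriteArrayList<>();
    // non-persisted local instance id, may change at each bundle activation
    private final int localInstanceId;

    DiscoveryLight(int localInstanceId) {
        this.localInstanceId = localInstanceId;
    }

    void register(ClusterViewListener listener) {
        listeners.add(listener);
    }

    // In the proposal this would be driven by heartbeats stored in a
    // dedicated 'discovery' collection; here it is invoked directly.
    void publishView(String viewId, List<Integer> activeInstanceIds) {
        for (ClusterViewListener l : listeners) {
            l.viewChanged(viewId, localInstanceId, activeInstanceIds);
        }
    }
}
```

 A registered component would receive a callback whenever the discovery-light notices that the cluster view has changed.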



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (OAK-2844) Introducing a simple document-based discovery-light service (to circumvent documentMk's eventual consistency delays)

2015-08-24 Thread Alex Parvulescu (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-2844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14709345#comment-14709345
 ] 

Alex Parvulescu commented on OAK-2844:
--

nope, looks like the fiesta still cannot start!
{code}
testLargeStartStopFiesta(org.apache.jackrabbit.oak.plugins.document.DocumentDiscoveryLiteServiceTest)
  Time elapsed: 0 sec   ERROR!
java.lang.NullPointerException
at 
org.apache.jackrabbit.oak.plugins.document.DocumentDiscoveryLiteServiceTest.createMK(DocumentDiscoveryLiteServiceTest.java:1025)
at 
org.apache.jackrabbit.oak.plugins.document.DocumentDiscoveryLiteServiceTest.createNodeStore(DocumentDiscoveryLiteServiceTest.java:594)
at 
org.apache.jackrabbit.oak.plugins.document.DocumentDiscoveryLiteServiceTest.createInstance(DocumentDiscoveryLiteServiceTest.java:609)
at 
org.apache.jackrabbit.oak.plugins.document.DocumentDiscoveryLiteServiceTest.testLargeStartStopFiesta(DocumentDiscoveryLiteServiceTest.java:931)
{code}

 Introducing a simple document-based discovery-light service (to circumvent 
 documentMk's eventual consistency delays)
 

 Key: OAK-2844
 URL: https://issues.apache.org/jira/browse/OAK-2844
 Project: Jackrabbit Oak
  Issue Type: New Feature
  Components: mongomk
Reporter: Stefan Egli
Assignee: Stefan Egli
  Labels: resilience
 Fix For: 1.3.5

 Attachments: InstanceStateChangeListener.java, OAK-2844.WIP-02.patch, 
 OAK-2844.patch, OAK-2844.v3.patch, OAK-2844.v4.patch


 When running discovery.impl on a mongoMk-backed jcr repository, there are 
 risks of hitting problems such as described in SLING-3432 
 pseudo-network-partitioning: this happens when a jcr-level heartbeat does 
 not reach peers within the configured heartbeat timeout - it then treats that 
 affected instance as dead, removes it from the topology, and continues with 
 the remaining instances, potentially electing a new leader, running the risk of 
 duplicate leaders. This happens when delays in mongoMk grow larger than the 
 (configured) heartbeat timeout. These problems are ultimately due to the 
 'eventual consistency' nature not only of mongoDB, but even more so of mongoMk. 
 The only alternative so far is to increase the heartbeat timeout to match the 
 expected or measured delays that mongoMk can produce (under say given 
 load/performance scenarios).
 Assuming that mongoMk will always carry a risk of certain delays, and that a 
 reasonable maximum (reasonable for the discovery.impl timeout, that is) cannot 
 be guaranteed, a better solution is to provide discovery with more 'real-time' 
 information and/or privileged access to mongoDb.
 Here's a summary of alternatives that have so far been floating around as a 
 solution to circumvent eventual consistency:
  # expose existing (jmx) information about active 'clusterIds' - this has 
 been proposed in SLING-4603. The pros: reuse of existing functionality. The 
 cons: going via jmx, binding of exposed functionality as 'to be maintained 
 API'
  # expose a plain mongo db/collection (via osgi injection) such that a higher 
 (sling) level discovery could directly write heartbeats there. The pros: 
 heartbeat latency would be minimal (assuming the collection is not sharded). 
 The cons: exposes a mongo db/collection potentially also to anyone else, with 
 the risk of opening up to unwanted possibilities
  # introduce a simple 'discovery-light' API to oak which solely provides 
 information about which instances are active in a cluster. The implementation 
 of this is not exposed. The pros: no need to expose a mongoDb/collection, 
 allows any other jmx-functionality to remain unchanged. The cons: a new API 
 that must be maintained
 This ticket is about the 3rd option, about a new mongo-based discovery-light 
 service that is introduced to oak. The functionality in short:
  * it defines a 'local instance id' that is non-persisted, ie can change at 
 each bundle activation.
  * it defines a 'view id' that uniquely identifies a particular incarnation 
 of a 'cluster view/state' (which is: a list of active instance ids)
  * and it defines a list of active instance ids
  * the above attributes are passed to interested components via a listener 
 that can be registered. that listener is called whenever the discovery-light 
 notices the cluster view has changed.
 While the actual implementation could in fact be based on the existing 
 {{getActiveClusterNodes()}} {{getClusterId()}} of the 
 {{DocumentNodeStoreMBean}}, the suggestion is to not fiddle with that part, 
 as that has dependencies to other logic. But instead, the suggestion is to 
 create a dedicated, other, collection ('discovery') where heartbeats as well 
 as the currentView are stored.

[jira] [Commented] (OAK-3265) Test failures on trunk: NodeLocalNameTest, NodeNameTest

2015-08-24 Thread Marcel Reutegger (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-3265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14709143#comment-14709143
 ] 

Marcel Reutegger commented on OAK-3265:
---

I think this is caused by OAK-2634, but from what I can see, in some cases the 
tests are actually at fault.

 Test failures on trunk: NodeLocalNameTest, NodeNameTest
 ---

 Key: OAK-3265
 URL: https://issues.apache.org/jira/browse/OAK-3265
 Project: Jackrabbit Oak
  Issue Type: Bug
  Components: jcr
Reporter: Michael Dürig
 Fix For: 1.3.5


 Trunk's ITs (integration tests) fail for me:
 {noformat}
 testStringLiteralInvalidName(org.apache.jackrabbit.test.api.query.qom.NodeLocalNameTest)
   Time elapsed: 0.007 sec   ERROR!
 javax.jcr.query.InvalidQueryException: java.lang.IllegalArgumentException: 
 Not a valid JCR path: [node1
   at 
 org.apache.jackrabbit.oak.jcr.query.QueryManagerImpl.executeQuery(QueryManagerImpl.java:142)
   at 
 org.apache.jackrabbit.oak.jcr.query.qom.QueryObjectModelImpl.execute(QueryObjectModelImpl.java:131)
   at 
 org.apache.jackrabbit.test.api.query.qom.NodeLocalNameTest.testStringLiteralInvalidName(NodeLocalNameTest.java:68)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at junit.framework.TestCase.runTest(TestCase.java:168)
   at junit.framework.TestCase.runBare(TestCase.java:134)
   at junit.framework.TestResult$1.protect(TestResult.java:110)
   at junit.framework.TestResult.runProtected(TestResult.java:128)
   at junit.framework.TestResult.run(TestResult.java:113)
   at junit.framework.TestCase.run(TestCase.java:124)
   at 
 org.apache.jackrabbit.test.AbstractJCRTest.run(AbstractJCRTest.java:464)
   at junit.framework.TestSuite.runTest(TestSuite.java:243)
   at junit.framework.TestSuite.run(TestSuite.java:238)
   at junit.framework.TestSuite.runTest(TestSuite.java:243)
   at junit.framework.TestSuite.run(TestSuite.java:238)
   at junit.framework.TestSuite.runTest(TestSuite.java:243)
   at junit.framework.TestSuite.run(TestSuite.java:238)
   at 
 org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:83)
   at 
 org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:252)
   at 
 org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:141)
   at 
 org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:112)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at 
 org.apache.maven.surefire.util.ReflectionUtils.invokeMethodWithArray(ReflectionUtils.java:189)
   at 
 org.apache.maven.surefire.booter.ProviderFactory$ProviderProxy.invoke(ProviderFactory.java:165)
   at 
 org.apache.maven.surefire.booter.ProviderFactory.invokeProvider(ProviderFactory.java:85)
   at 
 org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:115)
   at 
 org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:75)
 Caused by: java.lang.IllegalArgumentException: Not a valid JCR path: [node1
   at 
 org.apache.jackrabbit.oak.spi.query.PropertyValues.getOakPath(PropertyValues.java:405)
   at 
 org.apache.jackrabbit.oak.query.ast.NodeNameImpl.getName(NodeNameImpl.java:131)
   at 
 org.apache.jackrabbit.oak.query.ast.NodeLocalNameImpl.restrict(NodeLocalNameImpl.java:89)
   at 
 org.apache.jackrabbit.oak.query.ast.ComparisonImpl.restrict(ComparisonImpl.java:184)
   at 
 org.apache.jackrabbit.oak.query.ast.AndImpl.restrict(AndImpl.java:153)
   at 
 org.apache.jackrabbit.oak.query.ast.SelectorImpl.createFilter(SelectorImpl.java:389)
   at 
 org.apache.jackrabbit.oak.query.ast.SelectorImpl.prepare(SelectorImpl.java:284)
   at org.apache.jackrabbit.oak.query.QueryImpl.prepare(QueryImpl.java:591)
   at 
 org.apache.jackrabbit.oak.query.QueryEngineImpl.executeQuery(QueryEngineImpl.java:193)
   at 
 org.apache.jackrabbit.oak.jcr.query.QueryManagerImpl.executeQuery(QueryManagerImpl.java:132)
   ... 32 more
 testURILiteral(org.apache.jackrabbit.test.api.query.qom.NodeLocalNameTest)  
 Time elapsed: 0.005 sec   ERROR!
 javax.jcr.query.InvalidQueryException: java.lang.IllegalArgumentException: 
 Not a valid JCR path: http://example.com

[jira] [Updated] (OAK-3247) DocumentNodeStore.retrieve() should not throw IllegalArgumentException

2015-08-24 Thread Michael Marth (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Marth updated OAK-3247:
---
Component/s: rdbmk
 mongomk

 DocumentNodeStore.retrieve() should not throw IllegalArgumentException
 --

 Key: OAK-3247
 URL: https://issues.apache.org/jira/browse/OAK-3247
 Project: Jackrabbit Oak
  Issue Type: Improvement
  Components: core, mongomk, rdbmk
Affects Versions: 1.3.3
Reporter: Julian Sedding
Priority: Minor
  Labels: resilience
 Fix For: 1.3.5


 {{DocumentNodeStore#retrieve(checkpoint)}} may throw an 
 {{IllegalArgumentException}} via {{Revision.fromString(checkpoint)}}.
 The javadocs say that it returns a {{NodeState}} or {{null}}. The exception 
 prevents recovery of {{AsyncIndexUpdate}} from a bad recorded checkpoint.
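 A sketch of the suggested behavior (hypothetical code, not the actual DocumentNodeStore; Long.parseLong() merely stands in for Revision.fromString(), and the checkpoint map stands in for the persisted checkpoint table): an unparseable checkpoint string is treated like an unknown checkpoint and yields null, as the javadoc promises.

```java
import java.util.Map;

// Hypothetical sketch of the suggested fix: swallow the parse error and
// return null instead of propagating IllegalArgumentException.
class CheckpointRetrieve {

    // 'checkpoints' stands in for the persisted checkpoint table
    static String retrieve(String checkpoint, Map<String, String> checkpoints) {
        final long revision;
        try {
            // stand-in for Revision.fromString(checkpoint); note that
            // NumberFormatException is a subclass of IllegalArgumentException
            revision = Long.parseLong(checkpoint);
        } catch (IllegalArgumentException e) {
            return null; // malformed checkpoint: behave like an unknown one
        }
        return checkpoints.get(Long.toString(revision));
    }
}
```

 With this shape, AsyncIndexUpdate sees null for a bad recorded checkpoint and can recover instead of failing.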



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (OAK-3247) DocumentNodeStore.retrieve() should not throw IllegalArgumentException

2015-08-24 Thread Michael Marth (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Marth updated OAK-3247:
---
Labels: resilience  (was: )

 DocumentNodeStore.retrieve() should not throw IllegalArgumentException
 --

 Key: OAK-3247
 URL: https://issues.apache.org/jira/browse/OAK-3247
 Project: Jackrabbit Oak
  Issue Type: Improvement
  Components: core, mongomk, rdbmk
Affects Versions: 1.3.3
Reporter: Julian Sedding
Priority: Minor
  Labels: resilience
 Fix For: 1.3.5


 {{DocumentNodeStore#retrieve(checkpoint)}} may throw an 
 {{IllegalArgumentException}} via {{Revision.fromString(checkpoint)}}.
 The javadocs say that it returns a {{NodeState}} or {{null}}. The exception 
 prevents recovery of {{AsyncIndexUpdate}} from a bad recorded checkpoint.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (OAK-2929) Parent of unseen children must not be removable

2015-08-24 Thread Michael Marth (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-2929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Marth updated OAK-2929:
---
Fix Version/s: (was: 1.3.5)
   1.3.6

 Parent of unseen children must not be removable
 ---

 Key: OAK-2929
 URL: https://issues.apache.org/jira/browse/OAK-2929
 Project: Jackrabbit Oak
  Issue Type: Bug
  Components: core, mongomk
Affects Versions: 1.0.13, 1.2
Reporter: Vikas Saurabh
Assignee: Marcel Reutegger
Priority: Minor
  Labels: concurrency, technical_debt
 Fix For: 1.3.6

 Attachments: IgnoredTestCase.patch


 With OAK-2673, it's now possible to have hidden intermediate nodes created 
 concurrently.
 So, a scenario like:
 {noformat}
 start -> /:hidden
 N1 creates /:hidden/parent/node1
 N2 creates /:hidden/parent/node2
 {noformat}
 is allowed.
 But, if N2's creation of {{parent}} got persisted later than that on N1, then 
 N2 is currently able to delete {{parent}} even though there's {{node1}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (OAK-2126) retry strategy for failed JDBC requests

2015-08-24 Thread Michael Marth (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-2126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Marth updated OAK-2126:
---
Fix Version/s: (was: 1.3.5)
   1.3.6

 retry strategy for failed JDBC requests
 ---

 Key: OAK-2126
 URL: https://issues.apache.org/jira/browse/OAK-2126
 Project: Jackrabbit Oak
  Issue Type: Sub-task
  Components: rdbmk
Reporter: Julian Reschke
  Labels: resilience
 Fix For: 1.3.6


 Discussion: should we have a retry strategy for failed commits?
 Things to consider:
 - does this potentially interfere with other retry strategies (either on a 
 lower layer or in the DocumentMK)?
 - what failure scenarios would it address?
 - how to test those?
 - how to configure it?
 - what would be good defaults? (number of retries, interval)
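 For the discussion, a generic retry wrapper might look like the following sketch (illustrative only, not tied to any actual Oak or JDBC API; the retry count and interval are exactly the open configuration questions listed above):

```java
import java.util.concurrent.Callable;

// Generic retry-on-failure sketch; a real implementation would also have to
// decide which exceptions are actually retryable (transient JDBC errors).
class Retry {

    static <T> T withRetry(Callable<T> op, int maxAttempts, long waitMillis)
            throws Exception {
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return op.call();
            } catch (Exception e) {
                last = e; // e.g. a transient JDBC failure
                if (attempt < maxAttempts) {
                    Thread.sleep(waitMillis);
                }
            }
        }
        throw last; // all attempts failed
    }
}
```

 Note that such a wrapper sits exactly where the first question above bites: it can interfere with retries on a lower layer or in the DocumentMK, so the layers would need to be coordinated.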



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (OAK-3079) LastRevRecoveryAgent can update _lastRev of children but not the root

2015-08-24 Thread Michael Marth (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Marth updated OAK-3079:
---
Fix Version/s: (was: 1.4)
   1.3.6

 LastRevRecoveryAgent can update _lastRev of children but not the root
 -

 Key: OAK-3079
 URL: https://issues.apache.org/jira/browse/OAK-3079
 Project: Jackrabbit Oak
  Issue Type: Bug
  Components: core, mongomk
Affects Versions: 1.3.2
Reporter: Stefan Egli
  Labels: resilience
 Fix For: 1.3.6

 Attachments: NonRootUpdatingLastRevRecoveryTest.java


 As mentioned in 
 [OAK-2131|https://issues.apache.org/jira/browse/OAK-2131?focusedCommentId=14616391page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14616391]
  there can be a situation wherein the LastRevRecoveryAgent updates some nodes 
 in the tree but not the root. This seems to happen due to OAK-2131's change 
 in the Commit.applyToCache (where paths to update are collected via 
 tracker.track): in that code, paths which are non-root and for which no 
 content has changed (and mind you, a content change includes adding _deleted, 
 which happens by default for nodes with children) are not 'tracked', i.e. for 
 those the _lastRev is not updated by subsequent backgroundUpdate operations - 
 leaving them 'old/out-of-date'. This seems correct as per 
 description/intention of OAK-2131 where the last revision can be determined 
 via the commitRoot of the parent. But it has the effect that the 
 LastRevRecoveryAgent then finds those intermediate nodes still to be updated, 
 whereas the root has already been updated (which is at first glance non-intuitive).
 I'll attach a test case to reproduce this.
 Perhaps this is a bug, perhaps it's ok. [~mreutegg] wdyt?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (OAK-3036) DocumentRootBuilder: revisit update.limit default

2015-08-24 Thread Michael Marth (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Marth updated OAK-3036:
---
Fix Version/s: (was: 1.3.5)
   1.3.6

 DocumentRootBuilder: revisit update.limit default
 -

 Key: OAK-3036
 URL: https://issues.apache.org/jira/browse/OAK-3036
 Project: Jackrabbit Oak
  Issue Type: Improvement
  Components: mongomk, rdbmk
Reporter: Julian Reschke
  Labels: resilience
 Fix For: 1.3.6


 update.limit decides whether a commit is persisted using a branch or not. The 
 default is 1 (and can be overridden using the system property).
 A typical call pattern in JCR is to persist batches of ~1024 nodes. These 
 translate to more than 1 changes (see PackageImportIT), due to JCR 
 properties, and also indexing commit hooks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (OAK-3070) Use a lower bound in VersionGC query to avoid checking unmodified once deleted docs

2015-08-24 Thread Michael Marth (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Marth updated OAK-3070:
---
Fix Version/s: (was: 1.3.5)
   1.3.6

 Use a lower bound in VersionGC query to avoid checking unmodified once 
 deleted docs
 ---

 Key: OAK-3070
 URL: https://issues.apache.org/jira/browse/OAK-3070
 Project: Jackrabbit Oak
  Issue Type: Improvement
  Components: mongomk, rdbmk
Reporter: Chetan Mehrotra
Assignee: Vikas Saurabh
  Labels: performance, resilience
 Fix For: 1.3.6

 Attachments: OAK-3070.patch


 As part of OAK-3062 [~mreutegg] suggested
 {quote}
 As a further optimization we could also limit the lower bound of the _modified
 range. The revision GC does not need to check documents with a _deletedOnce
 again if they were not modified after the last successful GC run. If they
 didn't change and were considered existing during the last run, then they
 must still exist in the current GC run. To make this work, we'd need to
 track the last successful revision GC run. 
 {quote}
 The lowest last-validated _modified value could be saved in the settings 
 collection and reused for the next run.
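 The suggested bound can be expressed as a simple predicate (a sketch with illustrative field and parameter names, not VersionGarbageCollector's actual code): a document with _deletedOnce only needs re-checking if it was modified since the last successful GC run.

```java
// Sketch of the proposed candidate check: with the last successful GC run
// persisted (e.g. in the settings collection), the _modified range gets a
// lower bound in addition to the existing age limit.
class VersionGcBounds {

    static boolean isCandidate(boolean deletedOnce, long modified,
                               long lastSuccessfulGc, long oldestAllowed) {
        return deletedOnce
                && modified >= lastSuccessfulGc // new lower bound
                && modified < oldestAllowed;    // existing age limit
    }
}
```

 Documents below the lower bound were already examined by the previous run and, if considered existing then, must still exist now, so they can be skipped entirely.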



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (OAK-2622) dynamic cache allocation

2015-08-24 Thread Michael Marth (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-2622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Marth updated OAK-2622:
---
Fix Version/s: (was: 1.3.5)
   1.3.6

 dynamic cache allocation
 

 Key: OAK-2622
 URL: https://issues.apache.org/jira/browse/OAK-2622
 Project: Jackrabbit Oak
  Issue Type: Improvement
  Components: mongomk
Affects Versions: 1.0.12
Reporter: Stefan Egli
  Labels: performance, resilience
 Fix For: 1.3.6


 At the moment mongoMk's various caches are configurable (OAK-2546) but other 
 than that static in terms of size. Different use-cases might require 
 different allocations of the sub caches though. And it might not always be 
 possible to find a good configuration upfront for all use cases. 
 We might be able to dynamically allocate the overall cache size to the 
 different sub-caches, based for example on how heavily loaded or how well 
 performing each cache is.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (OAK-1575) DocumentNS: Implement refined conflict resolution for addExistingNode conflicts

2015-08-24 Thread Michael Marth (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-1575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Marth updated OAK-1575:
---
Fix Version/s: (was: 1.4)
   1.3.6

 DocumentNS: Implement refined conflict resolution for addExistingNode 
 conflicts
 ---

 Key: OAK-1575
 URL: https://issues.apache.org/jira/browse/OAK-1575
 Project: Jackrabbit Oak
  Issue Type: Sub-task
  Components: mongomk
Reporter: Michael Dürig
Assignee: Marcel Reutegger
  Labels: resilience
 Fix For: 1.3.6


 Implement refined conflict resolution for addExistingNode conflicts as 
 defined in the parent issue for the document NS.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (OAK-2920) RDBDocumentStore: fail init when database config seems to be inadequate

2015-08-24 Thread Michael Marth (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-2920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Marth updated OAK-2920:
---
Fix Version/s: (was: 1.3.5)
   1.3.6

 RDBDocumentStore: fail init when database config seems to be inadequate
 ---

 Key: OAK-2920
 URL: https://issues.apache.org/jira/browse/OAK-2920
 Project: Jackrabbit Oak
  Issue Type: Sub-task
  Components: rdbmk
Reporter: Julian Reschke
Priority: Minor
  Labels: resilience
 Fix For: 1.3.6


 It has been suggested that the implementation should fail to start (rather 
 than warn) when it detects a DB configuration that is likely to cause 
 problems (such as with respect to character encoding or collation sequences).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (OAK-2066) DocumentStore API: batch create, but no batch update

2015-08-24 Thread Michael Marth (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-2066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Marth updated OAK-2066:
---
Fix Version/s: (was: 1.3.5)
   1.3.7

 DocumentStore API: batch create, but no batch update
 

 Key: OAK-2066
 URL: https://issues.apache.org/jira/browse/OAK-2066
 Project: Jackrabbit Oak
  Issue Type: Improvement
  Components: mongomk
Reporter: Julian Reschke
  Labels: performance
 Fix For: 1.3.7


 The DocumentStore API currently has a call for creating many nodes at once.
 However, this will sometimes fail for large save operations in JCR, because 
 in the DS persistence, JCR-deleted nodes are still present (with a deleted 
 flag). This causes two subsequent sequences of
 1) create test container
 2) create many child nodes
 3) remove test container
 to behave very differently, depending on whether the test container is 
 created for the first time or not.
 (see CreateManyChildNodesTest)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (OAK-1322) Reduce calls to MongoDB

2015-08-24 Thread Michael Marth (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-1322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Marth updated OAK-1322:
---
Fix Version/s: (was: 1.3.5)
   1.3.7

 Reduce calls to MongoDB
 ---

 Key: OAK-1322
 URL: https://issues.apache.org/jira/browse/OAK-1322
 Project: Jackrabbit Oak
  Issue Type: Improvement
  Components: core, mongomk
Reporter: Marcel Reutegger
Assignee: Marcel Reutegger
  Labels: performance
 Fix For: 1.3.7

 Attachments: OAK-1322-mreutegg.patch


 As discussed with Chetan offline we'd like to reduce the number of calls to 
 MongoDB when content is added to the repository with a filevault package 
 import.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (OAK-3259) Optimize NodeDocument.getNewestRevision()

2015-08-24 Thread Marcel Reutegger (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcel Reutegger resolved OAK-3259.
---
Resolution: Fixed

Introduced a new method {{NodeDocument.getAllChanges()}} which returns the 
{{_revisions}} and {{_commitRoot}} revisions in descending order. The 
implementation performs lazy loading of previous documents as needed.

Implemented in trunk: http://svn.apache.org/r1697373
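 The shape of such an API can be illustrated with a small stand-alone sketch (hypothetical code; the real method merges {{_revisions}} and {{_commitRoot}} entries of NodeDocument and lazily loads previous documents, both of which are elided here):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Illustrative sketch only: merge two revision sets and return them
// newest-first, mirroring the descending order of getAllChanges().
class Changes {

    static List<Long> allChangesDescending(Set<Long> revisions, Set<Long> commitRoots) {
        TreeSet<Long> merged = new TreeSet<>(Collections.reverseOrder());
        merged.addAll(revisions);
        merged.addAll(commitRoots);
        return new ArrayList<>(merged); // newest revision first
    }
}
```

 Because the order is descending, a caller looking for the newest revision can stop as soon as it finds a match, which is what makes the optimization effective.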

 Optimize NodeDocument.getNewestRevision()
 -

 Key: OAK-3259
 URL: https://issues.apache.org/jira/browse/OAK-3259
 Project: Jackrabbit Oak
  Issue Type: Improvement
  Components: core, mongomk
Reporter: Marcel Reutegger
Assignee: Marcel Reutegger
  Labels: performance
 Fix For: 1.3.5


 Most of the time NodeDocument.getNewestRevision() is able to quickly identify 
 the newest revision, but sometimes the code falls back to a more expensive 
 calculation, which attempts to read through available {{_revisions}} and 
 {{_commitRoot}} entries. If either of those maps are empty, the method will 
 go through the entire revision history.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (OAK-3273) ColdStandby JMX Status

2015-08-24 Thread Valentin Olteanu (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-3273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14709242#comment-14709242
 ] 

Valentin Olteanu commented on OAK-3273:
---

[~alex.parvulescu], could you please take a look and review the patch I've 
created for this issue?

 ColdStandby JMX Status 
 ---

 Key: OAK-3273
 URL: https://issues.apache.org/jira/browse/OAK-3273
 Project: Jackrabbit Oak
  Issue Type: Improvement
  Components: tarmk-standby
Reporter: Valentin Olteanu
Priority: Minor
 Attachments: OAK-3273.patch


 OAK-3113 introduced two fields in the ColdStandby MBean: SyncStartTimestamp 
 and SyncEndTimestamp. This is much more useful than the old 
 SecondsSinceLastSuccess, yet there are situations in which the two are hard 
 to interpret because they are updated independently:
  - it is impossible to correlate a start with its matching end
  - after a failed cycle, the start timestamp still reflects that failed cycle
 It would be even better if the two were updated atomically, reflecting the 
 start and end of the last successful cycle. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (OAK-2929) Parent of unseen children must not be removable

2015-08-24 Thread Michael Marth (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-2929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Marth updated OAK-2929:
---
Priority: Minor  (was: Major)

 Parent of unseen children must not be removable
 ---

 Key: OAK-2929
 URL: https://issues.apache.org/jira/browse/OAK-2929
 Project: Jackrabbit Oak
  Issue Type: Bug
  Components: core, mongomk
Affects Versions: 1.0.13, 1.2
Reporter: Vikas Saurabh
Assignee: Marcel Reutegger
Priority: Minor
  Labels: concurrency, technical_debt
 Fix For: 1.3.5

 Attachments: IgnoredTestCase.patch


 With OAK-2673, it's now possible to have hidden intermediate nodes created 
 concurrently.
 So, a scenario like:
 {noformat}
 start -> /:hidden
 N1 creates /:hidden/parent/node1
 N2 creates /:hidden/parent/node2
 {noformat}
 is allowed.
 But, if N2's creation of {{parent}} got persisted later than that on N1, then 
 N2 is currently able to delete {{parent}} even though there's {{node1}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (OAK-3018) Use batch-update in backgroundWrite

2015-08-24 Thread Michael Marth (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Marth updated OAK-3018:
---
Labels: performance  (was: )

 Use batch-update in backgroundWrite
 ---

 Key: OAK-3018
 URL: https://issues.apache.org/jira/browse/OAK-3018
 Project: Jackrabbit Oak
  Issue Type: Improvement
  Components: mongomk
Reporter: Stefan Egli
  Labels: performance
 Fix For: 1.3.5


 (From an earlier [post on the 
 list|http://markmail.org/thread/mkrvhkfabit4osli]) The 
 DocumentNodeStore.backgroundWrite goes through the heavy work of updating the 
 lastRev for all pending changes and does so in a hierarchical-depth-first 
 manner. Unfortunately, if the pending changes all come from separate commits 
 (which is not unlikely), the updates are sent as individual update 
 calls to mongo (whenever the lastRev differs), which, if there are many 
 changes, results in many calls to mongo.
 OAK-2066 is about extending the DocumentStore API with a batch-update method. 
 That one, once available, should thus be used in the {{backgroundWrite}} as 
 well.
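 The batching idea can be sketched as follows (illustrative only; the batch-update store method itself is what OAK-2066 would add, and the names here are hypothetical): group the pending lastRev updates by revision value, so one batch call per distinct revision replaces one store call per document.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch: paths sharing the same pending lastRev can be updated with a
// single batch call instead of one call per path.
class BackgroundWriteBatcher {

    static Map<String, List<String>> groupByLastRev(Map<String, String> pendingLastRev) {
        Map<String, List<String>> batches = new HashMap<>();
        for (Map.Entry<String, String> e : pendingLastRev.entrySet()) {
            // key: path of the changed document, value: its pending lastRev
            batches.computeIfAbsent(e.getValue(), k -> new ArrayList<>()).add(e.getKey());
        }
        return batches;
    }
}
```

 backgroundWrite would then issue one batch update per map entry rather than iterating the paths individually.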



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (OAK-3265) Test failures on trunk: NodeLocalNameTest, NodeNameTest

2015-08-24 Thread Marcel Reutegger (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-3265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14709162#comment-14709162
 ] 

Marcel Reutegger commented on OAK-3265:
---

Added the failing tests to the known issues list: http://svn.apache.org/r1697363

 Test failures on trunk: NodeLocalNameTest, NodeNameTest
 ---

 Key: OAK-3265
 URL: https://issues.apache.org/jira/browse/OAK-3265
 Project: Jackrabbit Oak
  Issue Type: Bug
  Components: jcr
Reporter: Michael Dürig
 Fix For: 1.3.5


 Trunk's ITs (integration tests) fail for me:
 {noformat}
 testStringLiteralInvalidName(org.apache.jackrabbit.test.api.query.qom.NodeLocalNameTest)
   Time elapsed: 0.007 sec   ERROR!
 javax.jcr.query.InvalidQueryException: java.lang.IllegalArgumentException: 
 Not a valid JCR path: [node1
   at 
 org.apache.jackrabbit.oak.jcr.query.QueryManagerImpl.executeQuery(QueryManagerImpl.java:142)
   at 
 org.apache.jackrabbit.oak.jcr.query.qom.QueryObjectModelImpl.execute(QueryObjectModelImpl.java:131)
   at 
 org.apache.jackrabbit.test.api.query.qom.NodeLocalNameTest.testStringLiteralInvalidName(NodeLocalNameTest.java:68)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at junit.framework.TestCase.runTest(TestCase.java:168)
   at junit.framework.TestCase.runBare(TestCase.java:134)
   at junit.framework.TestResult$1.protect(TestResult.java:110)
   at junit.framework.TestResult.runProtected(TestResult.java:128)
   at junit.framework.TestResult.run(TestResult.java:113)
   at junit.framework.TestCase.run(TestCase.java:124)
   at 
 org.apache.jackrabbit.test.AbstractJCRTest.run(AbstractJCRTest.java:464)
   at junit.framework.TestSuite.runTest(TestSuite.java:243)
   at junit.framework.TestSuite.run(TestSuite.java:238)
   at junit.framework.TestSuite.runTest(TestSuite.java:243)
   at junit.framework.TestSuite.run(TestSuite.java:238)
   at junit.framework.TestSuite.runTest(TestSuite.java:243)
   at junit.framework.TestSuite.run(TestSuite.java:238)
   at 
 org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:83)
   at 
 org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:252)
   at 
 org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:141)
   at 
 org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:112)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at 
 org.apache.maven.surefire.util.ReflectionUtils.invokeMethodWithArray(ReflectionUtils.java:189)
   at 
 org.apache.maven.surefire.booter.ProviderFactory$ProviderProxy.invoke(ProviderFactory.java:165)
   at 
 org.apache.maven.surefire.booter.ProviderFactory.invokeProvider(ProviderFactory.java:85)
   at 
 org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:115)
   at 
 org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:75)
 Caused by: java.lang.IllegalArgumentException: Not a valid JCR path: [node1
   at 
 org.apache.jackrabbit.oak.spi.query.PropertyValues.getOakPath(PropertyValues.java:405)
   at 
 org.apache.jackrabbit.oak.query.ast.NodeNameImpl.getName(NodeNameImpl.java:131)
   at 
 org.apache.jackrabbit.oak.query.ast.NodeLocalNameImpl.restrict(NodeLocalNameImpl.java:89)
   at 
 org.apache.jackrabbit.oak.query.ast.ComparisonImpl.restrict(ComparisonImpl.java:184)
   at 
 org.apache.jackrabbit.oak.query.ast.AndImpl.restrict(AndImpl.java:153)
   at 
 org.apache.jackrabbit.oak.query.ast.SelectorImpl.createFilter(SelectorImpl.java:389)
   at 
 org.apache.jackrabbit.oak.query.ast.SelectorImpl.prepare(SelectorImpl.java:284)
   at org.apache.jackrabbit.oak.query.QueryImpl.prepare(QueryImpl.java:591)
   at 
 org.apache.jackrabbit.oak.query.QueryEngineImpl.executeQuery(QueryEngineImpl.java:193)
   at 
 org.apache.jackrabbit.oak.jcr.query.QueryManagerImpl.executeQuery(QueryManagerImpl.java:132)
   ... 32 more
 testURILiteral(org.apache.jackrabbit.test.api.query.qom.NodeLocalNameTest)  
 Time elapsed: 0.005 sec  <<< ERROR!
 javax.jcr.query.InvalidQueryException: java.lang.IllegalArgumentException: 
 Not a valid JCR path: http://example.com
   at 
 

[jira] [Updated] (OAK-2492) Flag Document having many children

2015-08-24 Thread Michael Marth (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-2492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Marth updated OAK-2492:
---
Labels: performance  (was: )

 Flag Document having many children
 --

 Key: OAK-2492
 URL: https://issues.apache.org/jira/browse/OAK-2492
 Project: Jackrabbit Oak
  Issue Type: Improvement
  Components: mongomk
Reporter: Chetan Mehrotra
Assignee: Chetan Mehrotra
  Labels: performance
 Fix For: 1.4


 Current DocumentMK logic while performing a diff for child nodes works as
 below
 # Get children for the _before_ revision up to MANY_CHILDREN_THRESHOLD (which
 defaults to 50). Further note that the current logic of fetching child nodes
 also adds the children's {{NodeDocument}} instances to the {{Document}} cache
 and reads the complete Document for those children
 # Get children for the _after_ revision with the same limit
 # If the child list is complete, then it does a direct diff on the fetched
 children
 # If the list is not complete, i.e. the number of children is more than the
 threshold, then it opts for a query-based diff (also see OAK-1970)
 So in those cases where the number of children is large, all the work done in
 #1 above is wasted and should be avoided. To do that we can mark those parent
 nodes which have many children via a special flag like {{_manyChildren}}. Once
 such nodes are marked, the diff logic can check for the flag and skip the work
 done in #1.
 This is kind of similar to the way we mark nodes which have at least one child
 (OAK-1117)
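
A hedged sketch of the proposed check (illustrative names, not the actual DocumentMK code): with a {{_manyChildren}} flag on the parent document, the diff logic can go straight to the query-based path and skip the child fetch in #1.

```java
import java.util.List;

// Illustrative sketch of the proposed _manyChildren optimization; the real
// DocumentMK diff code is more involved.
public class ManyChildrenDiffSketch {

    public static final int MANY_CHILDREN_THRESHOLD = 50;

    /** Minimal stand-in for the parent's NodeDocument. */
    public static final class ParentDoc {
        public final boolean manyChildren;   // the proposed _manyChildren flag
        public final List<String> fetchedChildren;

        public ParentDoc(boolean manyChildren, List<String> fetchedChildren) {
            this.manyChildren = manyChildren;
            this.fetchedChildren = fetchedChildren;
        }
    }

    /** Returns "query" when the flag or the threshold forces a query-based
        diff, "direct" when the fetched child list is complete. */
    public static String chooseDiffStrategy(ParentDoc parent) {
        if (parent.manyChildren) {
            return "query";                  // skip the wasted fetch entirely
        }
        if (parent.fetchedChildren.size() > MANY_CHILDREN_THRESHOLD) {
            return "query";                  // list incomplete, fall back
        }
        return "direct";
    }
}
```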





[jira] [Created] (OAK-3272) DocumentMK scalability improvements

2015-08-24 Thread Michael Marth (JIRA)
Michael Marth created OAK-3272:
--

 Summary: DocumentMK scalability improvements
 Key: OAK-3272
 URL: https://issues.apache.org/jira/browse/OAK-3272
 Project: Jackrabbit Oak
  Issue Type: Improvement
  Components: mongomk, rdbmk
Reporter: Michael Marth


Collector issue for tracking DocMK issues concerning scalability





[jira] [Updated] (OAK-3259) Optimize NodeDocument.getNewestRevision()

2015-08-24 Thread Marcel Reutegger (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcel Reutegger updated OAK-3259:
--
Labels: performance  (was: resilience)

 Optimize NodeDocument.getNewestRevision()
 -

 Key: OAK-3259
 URL: https://issues.apache.org/jira/browse/OAK-3259
 Project: Jackrabbit Oak
  Issue Type: Improvement
  Components: core, mongomk
Reporter: Marcel Reutegger
Assignee: Marcel Reutegger
  Labels: performance
 Fix For: 1.3.5


 Most of the time NodeDocument.getNewestRevision() is able to quickly identify
 the newest revision, but sometimes the code falls back to a more expensive
 calculation, which attempts to read through the available {{_revisions}} and
 {{_commitRoot}} entries. If either of those maps is empty, the method will
 go through the entire revision history.





[jira] [Commented] (OAK-2844) Introducing a simple document-based discovery-light service (to circumvent documentMk's eventual consistency delays)

2015-08-24 Thread Alex Parvulescu (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-2844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14709299#comment-14709299
 ] 

Alex Parvulescu commented on OAK-2844:
--

FYI, this fails the trunk build on my machine for 
_DocumentDiscoveryLiteServiceTest_

 Introducing a simple document-based discovery-light service (to circumvent 
 documentMk's eventual consistency delays)
 

 Key: OAK-2844
 URL: https://issues.apache.org/jira/browse/OAK-2844
 Project: Jackrabbit Oak
  Issue Type: New Feature
  Components: mongomk
Reporter: Stefan Egli
Assignee: Stefan Egli
  Labels: resilience
 Fix For: 1.3.5

 Attachments: InstanceStateChangeListener.java, OAK-2844.WIP-02.patch, 
 OAK-2844.patch, OAK-2844.v3.patch, OAK-2844.v4.patch


 When running discovery.impl on a mongoMk-backed jcr repository, there are 
 risks of hitting problems such as described in SLING-3432 
 pseudo-network-partitioning: this happens when a jcr-level heartbeat does 
 not reach peers within the configured heartbeat timeout - it then treats that 
 affected instance as dead, removes it from the topology, and continues with 
 the remaining instances, potentially electing a new leader, running the risk of 
 duplicate leaders. This happens when delays in mongoMk grow larger than the 
 (configured) heartbeat timeout. These problems ultimately are due to the 
 'eventual consistency' nature of, not only mongoDB, but more so of mongoMk. 
 The only alternative so far is to increase the heartbeat timeout to match the 
 expected or measured delays that mongoMk can produce (under say given 
 load/performance scenarios).
 Assuming that mongoMk will always carry a risk of certain delays, and that a
 reasonable maximum (reasonable for the discovery.impl timeout, that is) cannot
 be guaranteed, a better solution is to provide discovery with more
 'real-time'-like information and/or privileged access to mongoDb.
 Here's a summary of alternatives that have so far been floating around as a 
 solution to circumvent eventual consistency:
  # expose existing (jmx) information about active 'clusterIds' - this has 
 been proposed in SLING-4603. The pros: reuse of existing functionality. The 
 cons: going via jmx, binding of exposed functionality as 'to be maintained 
 API'
  # expose a plain mongo db/collection (via osgi injection) such that a higher 
 (sling) level discovery could directly write heartbeats there. The pros: 
 heartbeat latency would be minimal (assuming the collection is not sharded). 
 The cons: exposes a mongo db/collection potentially also to anyone else, with 
 the risk of opening up to unwanted possibilities
  # introduce a simple 'discovery-light' API to oak which solely provides 
 information about which instances are active in a cluster. The implementation 
 of this is not exposed. The pros: no need to expose a mongoDb/collection, 
 allows any other jmx-functionality to remain unchanged. The cons: a new API 
 that must be maintained
 This ticket is about the 3rd option, about a new mongo-based discovery-light 
 service that is introduced to oak. The functionality in short:
  * it defines a 'local instance id' that is non-persisted, ie can change at 
 each bundle activation.
  * it defines a 'view id' that uniquely identifies a particular incarnation 
 of a 'cluster view/state' (which is: a list of active instance ids)
  * and it defines a list of active instance ids
  * the above attributes are passed to interested components via a listener 
 that can be registered. that listener is called whenever the discovery-light 
 notices the cluster view has changed.
 While the actual implementation could in fact be based on the existing 
 {{getActiveClusterNodes()}} {{getClusterId()}} of the 
 {{DocumentNodeStoreMBean}}, the suggestion is to not fiddle with that part, 
 as that has dependencies to other logic. But instead, the suggestion is to 
 create a dedicated, other, collection ('discovery') where heartbeats as well 
 as the currentView are stored.
 Will attach a suggestion for an initial version of this for review.
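
A minimal sketch of the listener contract described above (names are illustrative, not the actual Oak API): interested components register a listener, which is called with the local instance id, a per-incarnation view id, and the set of active instance ids whenever the view changes.

```java
import java.util.*;

// Illustrative sketch of the 'discovery-light' contract; the real
// implementation would derive the active ids from heartbeats stored in a
// dedicated 'discovery' collection.
public class DiscoveryLiteSketch {

    public interface ClusterViewListener {
        void viewChanged(String localInstanceId, String viewId, Set<String> activeIds);
    }

    // non-persisted, a new id per bundle activation
    private final String localInstanceId = UUID.randomUUID().toString();
    private final List<ClusterViewListener> listeners = new ArrayList<>();
    private Set<String> currentView = Collections.emptySet();
    private int viewSeq = 0;

    public void register(ClusterViewListener l) {
        listeners.add(l);
    }

    /** Called after a scan of the heartbeat store with the currently live ids. */
    public void onHeartbeatScan(Set<String> activeIds) {
        if (activeIds.equals(currentView)) {
            return;                          // view unchanged, no callback
        }
        currentView = new HashSet<>(activeIds);
        String viewId = "view-" + (++viewSeq); // unique id per view incarnation
        for (ClusterViewListener l : listeners) {
            l.viewChanged(localInstanceId, viewId, currentView);
        }
    }
}
```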





[jira] [Created] (OAK-3274) DefaultSyncConfigImpl: add information to user.membershipExpTime about minimum expiration time

2015-08-24 Thread Konrad Windszus (JIRA)
Konrad Windszus created OAK-3274:


 Summary: DefaultSyncConfigImpl: add information to 
user.membershipExpTime about minimum expiration time
 Key: OAK-3274
 URL: https://issues.apache.org/jira/browse/OAK-3274
 Project: Jackrabbit Oak
  Issue Type: Improvement
  Components: auth-external
Affects Versions: 1.3.5
Reporter: Konrad Windszus
Priority: Trivial


The {{user.membershipExpTime}} property cannot have a value which is less than 
the value of {{user.expirationTime}}. Please add this information to the OSGi 
property description; otherwise it is hard to debug issues here.

The reason why {{user.expirationTime}} must be less than or equal to 
{{user.membershipExpTime}} is in 
https://github.com/apache/jackrabbit-oak/blob/trunk/oak-auth-external/src/main/java/org/apache/jackrabbit/oak/spi/security/authentication/external/basic/DefaultSyncContext.java#L421.
Since {{syncMembership}} is only called after the {{user.expirationTime}} 
guard, it cannot be updated more often than the user itself.
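
To illustrate the constraint (a hypothetical simplification, not the actual {{DefaultSyncContext}} code): membership sync only runs once the user-level expiration guard has passed, so the effective membership expiry is bounded below by {{user.expirationTime}}.

```java
// Hypothetical simplification of the two guards: syncMembership is only
// reached after the user-level expiration check has passed.
public class SyncExpirySketch {

    public static boolean isExpired(long lastSynced, long now, long expiration) {
        return now - lastSynced >= expiration;
    }

    /** Membership re-sync requires BOTH guards to pass, which is why a
        membershipExpTime below expirationTime has no effect. */
    public static boolean membershipSyncRuns(long lastSynced, long now,
                                             long userExp, long membershipExp) {
        return isExpired(lastSynced, now, userExp)
                && isExpired(lastSynced, now, membershipExp);
    }
}
```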





[jira] [Updated] (OAK-3259) Optimize NodeDocument.getNewestRevision()

2015-08-24 Thread Michael Marth (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Marth updated OAK-3259:
---
Labels: resilience  (was: )

 Optimize NodeDocument.getNewestRevision()
 -

 Key: OAK-3259
 URL: https://issues.apache.org/jira/browse/OAK-3259
 Project: Jackrabbit Oak
  Issue Type: Improvement
  Components: core, mongomk
Reporter: Marcel Reutegger
Assignee: Marcel Reutegger
  Labels: resilience
 Fix For: 1.3.5


 Most of the time NodeDocument.getNewestRevision() is able to quickly identify
 the newest revision, but sometimes the code falls back to a more expensive
 calculation, which attempts to read through the available {{_revisions}} and
 {{_commitRoot}} entries. If either of those maps is empty, the method will
 go through the entire revision history.





[jira] [Updated] (OAK-2986) RDB: switch to tomcat datasource implementation

2015-08-24 Thread Michael Marth (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-2986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Marth updated OAK-2986:
---
Labels: resilience  (was: )

 RDB: switch to tomcat datasource implementation 
 

 Key: OAK-2986
 URL: https://issues.apache.org/jira/browse/OAK-2986
 Project: Jackrabbit Oak
  Issue Type: Sub-task
  Components: rdbmk
Affects Versions: 1.2.2, 1.0.15
Reporter: Julian Reschke
Assignee: Julian Reschke
  Labels: resilience
 Fix For: 1.3.5

 Attachments: OAK-2986.diff, OAK-2986.diff


 See https://people.apache.org/~fhanik/jdbc-pool/jdbc-pool.html.
 In addition, this is the datasource used in Sling's datasource service, so 
 it's closer to what people will use in practice.





[jira] [Resolved] (OAK-3273) ColdStandby make sync start and end timestamp updates atomic

2015-08-24 Thread Alex Parvulescu (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Parvulescu resolved OAK-3273.
--
   Resolution: Fixed
Fix Version/s: 1.3.5

thanks for the patch [~volteanu]! applied at http://svn.apache.org/r1697383

 ColdStandby make sync start and end timestamp updates atomic
 

 Key: OAK-3273
 URL: https://issues.apache.org/jira/browse/OAK-3273
 Project: Jackrabbit Oak
  Issue Type: Improvement
  Components: tarmk-standby
Reporter: Valentin Olteanu
Assignee: Alex Parvulescu
Priority: Minor
 Fix For: 1.3.5

 Attachments: OAK-3273.patch


 OAK-3113 introduced two fields in the ColdStandby MBean: SyncStartTimestamp 
 and SyncEndTimestamp. This is much more useful than the old 
 SecondsSinceLastSuccess, yet, there are situations in which it's hard to 
 interpret them since they are updated independently:
  - it's impossible to correlate the start with the end
  - in case of failure, the start still reflects the failed cycle
 It would be even better if the two were updated atomically, to reflect
 the start and end of the last successful cycle.
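
One way the atomic update can be sketched (illustrative only, not the actual ColdStandby code): publish both timestamps as a single immutable pair behind an atomic reference, written only when a cycle succeeds, so readers can never observe a start without its matching end.

```java
import java.util.concurrent.atomic.AtomicReference;

// Illustrative sketch: start and end of the last successful sync cycle are
// published together as one immutable value.
public class SyncCycleSketch {

    public static final class Cycle {
        public final long start;
        public final long end;

        public Cycle(long start, long end) {
            this.start = start;
            this.end = end;
        }
    }

    private final AtomicReference<Cycle> lastSuccess = new AtomicReference<>();

    /** Called only for a cycle that completed successfully; failed cycles
        leave the previous pair in place. */
    public void recordSuccess(long start, long end) {
        lastSuccess.set(new Cycle(start, end)); // both fields swap in atomically
    }

    public Cycle lastSuccessful() {
        return lastSuccess.get();
    }
}
```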





[jira] [Created] (OAK-3275) DefaultSyncConfig: User membership expiration time not working under some circumstances

2015-08-24 Thread Konrad Windszus (JIRA)
Konrad Windszus created OAK-3275:


 Summary: DefaultSyncConfig: User membership expiration time not 
working under some circumstances
 Key: OAK-3275
 URL: https://issues.apache.org/jira/browse/OAK-3275
 Project: Jackrabbit Oak
  Issue Type: Bug
  Components: auth-external
Affects Versions: 1.3.5
Reporter: Konrad Windszus


Currently the user expiration and the user membership expiration can be set 
independently of each other in the OSGi configuration for the 
{{DefaultSyncConfigImpl}}.

In reality this is not true though:
Not only can the membership not be updated more often than the other user 
properties (compare with OAK-3274); the property which is used to mark the 
last successful sync is also the same for both synchronisations 
(https://github.com/apache/jackrabbit-oak/blob/trunk/oak-auth-external/src/main/java/org/apache/jackrabbit/oak/spi/security/authentication/external/basic/DefaultSyncContext.java#L433
 and 
https://github.com/apache/jackrabbit-oak/blob/trunk/oak-auth-external/src/main/java/org/apache/jackrabbit/oak/spi/security/authentication/external/basic/DefaultSyncContext.java#L422).

That is a problem if e.g. the user expiration time is 10 minutes but the user 
membership expiration time is 1 hour. Then every 10 minutes the property 
{{rep:lastSynced}} would be updated to the current time and the expiration 
check for the membership expiration would never return true 
(https://github.com/apache/jackrabbit-oak/blob/trunk/oak-auth-external/src/main/java/org/apache/jackrabbit/oak/spi/security/authentication/external/basic/DefaultSyncContext.java#L433).
 Therefore memberships would never be updated!

I suggest completely getting rid of the user membership expiration time and 
having only one expiration time for both the user properties and the memberships.
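
A small simulation of the failure mode (simplified, with hypothetical time units in minutes): because the last-synced timestamp is shared, a 10-minute user expiry keeps resetting the clock, and a 60-minute membership expiry never fires.

```java
// Simplified simulation of the shared rep:lastSynced problem described above.
public class SharedLastSyncedSketch {

    /** Counts how often memberships would be re-synced over the given period. */
    public static int countMembershipSyncs(int minutes, long userExpMin, long membershipExpMin) {
        long lastSynced = 0;
        int membershipSyncs = 0;
        for (long now = 1; now <= minutes; now++) {
            if (now - lastSynced >= userExpMin) {
                if (now - lastSynced >= membershipExpMin) {
                    membershipSyncs++;    // memberships would be re-synced here
                }
                lastSynced = now;         // shared timestamp reset on EVERY user sync
            }
        }
        return membershipSyncs;
    }
}
```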





[jira] [Commented] (OAK-3235) Deadlock when closing a concurrently used FileStore

2015-08-24 Thread Alex Parvulescu (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-3235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14709147#comment-14709147
 ] 

Alex Parvulescu commented on OAK-3235:
--

patch looks good!

bq. I'd rather not remove the synchronized from writeMapBucket() for now 
though. (We can discuss doing so but lets move it out of this issue).
agreed, I would not introduce this change with this patch either. I would 
rather tackle this as a part of (a subtask of) OAK-1828

 Deadlock when closing a concurrently used FileStore
 ---

 Key: OAK-3235
 URL: https://issues.apache.org/jira/browse/OAK-3235
 Project: Jackrabbit Oak
  Issue Type: Bug
  Components: segmentmk
Affects Versions: 1.3.3
Reporter: Francesco Mari
Assignee: Michael Dürig
Priority: Critical
 Fix For: 1.3.5

 Attachments: OAK-3235-01.patch


 A deadlock was detected while stopping the {{SegmentCompactionIT}} using the 
 exposed MBean.
 {noformat}
 Found one Java-level deadlock:
 =
 pool-1-thread-23:
   waiting to lock monitor 0x7fa8cf1f0488 (object 0x0007a0081e48, a 
 org.apache.jackrabbit.oak.plugins.segment.file.FileStore),
   which is held by main
 main:
   waiting to lock monitor 0x7fa8cc015ff8 (object 0x0007a011f750, a 
 org.apache.jackrabbit.oak.plugins.segment.SegmentWriter),
   which is held by pool-1-thread-23
 Java stack information for the threads listed above:
 ===
 pool-1-thread-23:
   at 
 org.apache.jackrabbit.oak.plugins.segment.file.FileStore.writeSegment(FileStore.java:948)
   - waiting to lock 0x0007a0081e48 (a 
 org.apache.jackrabbit.oak.plugins.segment.file.FileStore)
   at 
 org.apache.jackrabbit.oak.plugins.segment.SegmentWriter.flush(SegmentWriter.java:228)
   - locked 0x0007a011f750 (a 
 org.apache.jackrabbit.oak.plugins.segment.SegmentWriter)
   at 
 org.apache.jackrabbit.oak.plugins.segment.SegmentWriter.prepare(SegmentWriter.java:329)
   at 
 org.apache.jackrabbit.oak.plugins.segment.SegmentWriter.writeListBucket(SegmentWriter.java:447)
   - locked 0x0007a011f750 (a 
 org.apache.jackrabbit.oak.plugins.segment.SegmentWriter)
   at 
 org.apache.jackrabbit.oak.plugins.segment.SegmentWriter.writeList(SegmentWriter.java:698)
   at 
 org.apache.jackrabbit.oak.plugins.segment.SegmentWriter.writeNode(SegmentWriter.java:1190)
   at 
 org.apache.jackrabbit.oak.plugins.segment.SegmentWriter$2.childNodeChanged(SegmentWriter.java:1135)
   at 
 org.apache.jackrabbit.oak.plugins.memory.ModifiedNodeState.compareAgainstBaseState(ModifiedNodeState.java:400)
   at 
 org.apache.jackrabbit.oak.plugins.segment.SegmentWriter.writeNode(SegmentWriter.java:1126)
   at 
 org.apache.jackrabbit.oak.plugins.segment.SegmentWriter$2.childNodeChanged(SegmentWriter.java:1135)
   at 
 org.apache.jackrabbit.oak.plugins.memory.ModifiedNodeState.compareAgainstBaseState(ModifiedNodeState.java:400)
   at 
 org.apache.jackrabbit.oak.plugins.segment.SegmentWriter.writeNode(SegmentWriter.java:1126)
   at 
 org.apache.jackrabbit.oak.plugins.segment.SegmentWriter$2.childNodeChanged(SegmentWriter.java:1135)
   at 
 org.apache.jackrabbit.oak.plugins.memory.ModifiedNodeState.compareAgainstBaseState(ModifiedNodeState.java:400)
   at 
 org.apache.jackrabbit.oak.plugins.segment.SegmentWriter.writeNode(SegmentWriter.java:1126)
   at 
 org.apache.jackrabbit.oak.plugins.segment.SegmentWriter$2.childNodeChanged(SegmentWriter.java:1135)
   at 
 org.apache.jackrabbit.oak.plugins.memory.ModifiedNodeState.compareAgainstBaseState(ModifiedNodeState.java:400)
   at 
 org.apache.jackrabbit.oak.plugins.segment.SegmentWriter.writeNode(SegmentWriter.java:1126)
   at 
 org.apache.jackrabbit.oak.plugins.segment.SegmentWriter$2.childNodeChanged(SegmentWriter.java:1135)
   at 
 org.apache.jackrabbit.oak.plugins.memory.ModifiedNodeState.compareAgainstBaseState(ModifiedNodeState.java:400)
   at 
 org.apache.jackrabbit.oak.plugins.segment.SegmentWriter.writeNode(SegmentWriter.java:1126)
   at 
 org.apache.jackrabbit.oak.plugins.segment.SegmentWriter$2.childNodeChanged(SegmentWriter.java:1135)
   at 
 org.apache.jackrabbit.oak.plugins.memory.ModifiedNodeState.compareAgainstBaseState(ModifiedNodeState.java:400)
   at 
 org.apache.jackrabbit.oak.plugins.segment.SegmentWriter.writeNode(SegmentWriter.java:1126)
   at 
 org.apache.jackrabbit.oak.plugins.segment.SegmentWriter$2.childNodeChanged(SegmentWriter.java:1135)
   at 
 org.apache.jackrabbit.oak.plugins.memory.ModifiedNodeState.compareAgainstBaseState(ModifiedNodeState.java:400)
   at 
 org.apache.jackrabbit.oak.plugins.segment.SegmentWriter.writeNode(SegmentWriter.java:1126)
   at 

[jira] [Updated] (OAK-3070) Use a lower bound in VersionGC query to avoid checking unmodified once deleted docs

2015-08-24 Thread Michael Marth (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Marth updated OAK-3070:
---
Labels: performance resilience  (was: )

 Use a lower bound in VersionGC query to avoid checking unmodified once 
 deleted docs
 ---

 Key: OAK-3070
 URL: https://issues.apache.org/jira/browse/OAK-3070
 Project: Jackrabbit Oak
  Issue Type: Improvement
  Components: mongomk, rdbmk
Reporter: Chetan Mehrotra
Assignee: Vikas Saurabh
  Labels: performance, resilience
 Fix For: 1.3.5

 Attachments: OAK-3070.patch


 As part of OAK-3062 [~mreutegg] suggested
 {quote}
 As a further optimization we could also limit the lower bound of the _modified
 range. The revision GC does not need to check documents with a _deletedOnce
 again if they were not modified after the last successful GC run. If they
 didn't change and were considered existing during the last run, then they
 must still exist in the current GC run. To make this work, we'd need to
 track the last successful revision GC run. 
 {quote}
 The lowest last-validated _modified value could be saved in the settings
 collection and reused for the next run.
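
A rough sketch of the windowed candidate selection (illustrative names; the real GC query runs against the document store): only {{_deletedOnce}} documents modified since the last successful run need to be re-examined.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of the proposed lower bound on _modified in the
// VersionGC candidate query.
public class VersionGCWindowSketch {

    /** Minimal stand-in for a NodeDocument. */
    public static final class Doc {
        public final String id;
        public final long modified;
        public final boolean deletedOnce;

        public Doc(String id, long modified, boolean deletedOnce) {
            this.id = id;
            this.modified = modified;
            this.deletedOnce = deletedOnce;
        }
    }

    /** Candidates: _deletedOnce docs with lastGcModified <= _modified < maxModified. */
    public static List<String> gcCandidates(List<Doc> docs, long lastGcModified, long maxModified) {
        List<String> out = new ArrayList<>();
        for (Doc d : docs) {
            if (d.deletedOnce && d.modified >= lastGcModified && d.modified < maxModified) {
                out.add(d.id);
            }
        }
        return out;
    }
}
```

Documents below the lower bound were already validated by the previous run and, since they have not been modified since, must still exist.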





[jira] [Updated] (OAK-3036) DocumentRootBuilder: revisit update.limit default

2015-08-24 Thread Michael Marth (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Marth updated OAK-3036:
---
Labels: resilience  (was: )

 DocumentRootBuilder: revisit update.limit default
 -

 Key: OAK-3036
 URL: https://issues.apache.org/jira/browse/OAK-3036
 Project: Jackrabbit Oak
  Issue Type: Improvement
  Components: mongomk, rdbmk
Reporter: Julian Reschke
  Labels: resilience
 Fix For: 1.3.5


 update.limit decides whether a commit is persisted using a branch or not. The 
 default is 10000 (and can be overridden using the system property).
 A typical call pattern in JCR is to persist batches of ~1024 nodes. These 
 translate to more than 10000 changes (see PackageImportIT), due to JCR 
 properties, and also indexing commit hooks.
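
The decision the limit controls can be sketched as follows (simplified and hypothetical; the real check lives inside the DocumentMK commit path, and the limit values below are only illustrative):

```java
// Simplified sketch of what update.limit decides: commits with more pending
// changes than the limit are persisted via a branch.
public class UpdateLimitSketch {

    public static boolean persistViaBranch(int pendingChanges, int updateLimit) {
        return pendingChanges > updateLimit;
    }
}
```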





[jira] [Updated] (OAK-3079) LastRevRecoveryAgent can update _lastRev of children but not the root

2015-08-24 Thread Michael Marth (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Marth updated OAK-3079:
---
Labels: resilience  (was: )

 LastRevRecoveryAgent can update _lastRev of children but not the root
 -

 Key: OAK-3079
 URL: https://issues.apache.org/jira/browse/OAK-3079
 Project: Jackrabbit Oak
  Issue Type: Bug
  Components: core, mongomk
Affects Versions: 1.3.2
Reporter: Stefan Egli
  Labels: resilience
 Fix For: 1.4

 Attachments: NonRootUpdatingLastRevRecoveryTest.java


 As mentioned in 
 [OAK-2131|https://issues.apache.org/jira/browse/OAK-2131?focusedCommentId=14616391&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14616391]
  there can be a situation wherein the LastRevRecoveryAgent updates some nodes 
 in the tree but not the root. This seems to happen due to OAK-2131's change 
 in the Commit.applyToCache (where paths to update are collected via 
 tracker.track): in that code, paths which are non-root and for which no 
 content has changed (and mind you, a content change includes adding _deleted, 
 which happens by default for nodes with children) are not 'tracked', ie for 
 those the _lastRev is not updated by subsequent backgroundUpdate operations - 
 leaving them 'old/out-of-date'. This seems correct as per 
 description/intention of OAK-2131 where the last revision can be determined 
 via the commitRoot of the parent. But it has the effect that the 
 LastRevRecoveryAgent then finds those intermediate nodes in need of updating, 
 even though the root has already been updated (which is at first glance non-intuitive).
 I'll attach a test case to reproduce this.
 Perhaps this is a bug, perhaps it's ok. [~mreutegg] wdyt?





[jira] [Updated] (OAK-3270) Improve DocumentMK resilience

2015-08-24 Thread Michael Marth (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Marth updated OAK-3270:
---
Fix Version/s: 1.3.6

 Improve DocumentMK resilience
 -

 Key: OAK-3270
 URL: https://issues.apache.org/jira/browse/OAK-3270
 Project: Jackrabbit Oak
  Issue Type: Improvement
  Components: mongomk, rdbmk
Reporter: Michael Marth
  Labels: resilience
 Fix For: 1.3.6


 Collection of DocMK resilience improvements





[jira] [Closed] (OAK-3256) Release Oak 1.0.19

2015-08-24 Thread Davide Giannella (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Davide Giannella closed OAK-3256.
-

 Release Oak 1.0.19
 --

 Key: OAK-3256
 URL: https://issues.apache.org/jira/browse/OAK-3256
 Project: Jackrabbit Oak
  Issue Type: Task
Reporter: Davide Giannella
Assignee: Davide Giannella







[jira] [Updated] (OAK-3273) ColdStandby JMX Status

2015-08-24 Thread Valentin Olteanu (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Valentin Olteanu updated OAK-3273:
--
Attachment: OAK-3273.patch

 ColdStandby JMX Status 
 ---

 Key: OAK-3273
 URL: https://issues.apache.org/jira/browse/OAK-3273
 Project: Jackrabbit Oak
  Issue Type: Improvement
  Components: tarmk-standby
Reporter: Valentin Olteanu
Priority: Minor
 Attachments: OAK-3273.patch


 OAK-3113 introduced two fields in the ColdStandby MBean: SyncStartTimestamp 
 and SyncEndTimestamp. This is much more useful than the old 
 SecondsSinceLastSuccess, yet, there are situations in which it's hard to 
 interpret them since they are updated independently:
  - it's impossible to correlate the start with the end
  - in case of failure, the start still reflects the failed cycle
 It would be even better if the two were updated atomically, to reflect
 the start and end of the last successful cycle.





[jira] [Created] (OAK-3273) ColdStandby JMX Status

2015-08-24 Thread Valentin Olteanu (JIRA)
Valentin Olteanu created OAK-3273:
-

 Summary: ColdStandby JMX Status 
 Key: OAK-3273
 URL: https://issues.apache.org/jira/browse/OAK-3273
 Project: Jackrabbit Oak
  Issue Type: Improvement
  Components: tarmk-standby
Reporter: Valentin Olteanu
Priority: Minor


OAK-3113 introduced two fields in the ColdStandby MBean: SyncStartTimestamp and 
SyncEndTimestamp. This is much more useful than the old 
SecondsSinceLastSuccess, yet, there are situations in which it's hard to 
interpret them since they are updated independently:
 - it's impossible to correlate the start with the end
 - in case of failure, the start still reflects the failed cycle

It would be even better if the two were updated atomically, to reflect the 
start and end of the last successful cycle.






[jira] [Resolved] (OAK-3256) Release Oak 1.0.19

2015-08-24 Thread Davide Giannella (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Davide Giannella resolved OAK-3256.
---
Resolution: Fixed

 Release Oak 1.0.19
 --

 Key: OAK-3256
 URL: https://issues.apache.org/jira/browse/OAK-3256
 Project: Jackrabbit Oak
  Issue Type: Task
Reporter: Davide Giannella
Assignee: Davide Giannella







[jira] [Commented] (OAK-2844) Introducing a simple document-based discovery-light service (to circumvent documentMk's eventual consistency delays)

2015-08-24 Thread Stefan Egli (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-2844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14709349#comment-14709349
 ] 

Stefan Egli commented on OAK-2844:
--

[~alex.parvulescu], aah, typo nr 2 .. :S. fixed. or so I hope ;)

 Introducing a simple document-based discovery-light service (to circumvent 
 documentMk's eventual consistency delays)
 

 Key: OAK-2844
 URL: https://issues.apache.org/jira/browse/OAK-2844
 Project: Jackrabbit Oak
  Issue Type: New Feature
  Components: mongomk
Reporter: Stefan Egli
Assignee: Stefan Egli
  Labels: resilience
 Fix For: 1.3.5

 Attachments: InstanceStateChangeListener.java, OAK-2844.WIP-02.patch, 
 OAK-2844.patch, OAK-2844.v3.patch, OAK-2844.v4.patch


 When running discovery.impl on a mongoMk-backed jcr repository, there are 
 risks of hitting problems such as described in SLING-3432 
 pseudo-network-partitioning: this happens when a jcr-level heartbeat does 
 not reach peers within the configured heartbeat timeout - it then treats that 
 affected instance as dead, removes it from the topology, and continues with 
 the remainings, potentially electing a new leader, running the risk of 
 duplicate leaders. This happens when delays in mongoMk grow larger than the 
 (configured) heartbeat timeout. These problems ultimately are due to the 
 'eventual consistency' nature of, not only mongoDB, but more so of mongoMk. 
 The only alternative so far is to increase the heartbeat timeout to match the 
 expected or measured delays that mongoMk can produce (under say given 
 load/performance scenarios).
 Assuming that mongoMk will always carry a risk of certain delays, and that a 
 reasonable maximum (reasonable for the discovery.impl timeout, that is) cannot 
 be guaranteed, a better solution is to provide discovery with more 
 'real-time'-like information and/or privileged access to mongoDb.
 Here's a summary of alternatives that have so far been floating around as a 
 solution to circumvent eventual consistency:
  # expose existing (jmx) information about active 'clusterIds' - this has 
 been proposed in SLING-4603. The pros: reuse of existing functionality. The 
 cons: going via jmx, binding of exposed functionality as 'to be maintained 
 API'
  # expose a plain mongo db/collection (via osgi injection) such that a higher 
 (sling) level discovery could directly write heartbeats there. The pros: 
 heartbeat latency would be minimal (assuming the collection is not sharded). 
 The cons: exposes a mongo db/collection potentially also to anyone else, with 
 the risk of opening up to unwanted possibilities
  # introduce a simple 'discovery-light' API to oak which solely provides 
 information about which instances are active in a cluster. The implementation 
 of this is not exposed. The pros: no need to expose a mongoDb/collection, 
 allows any other jmx-functionality to remain unchanged. The cons: a new API 
 that must be maintained
 This ticket is about the 3rd option, about a new mongo-based discovery-light 
 service that is introduced to oak. The functionality in short:
  * it defines a 'local instance id' that is non-persisted, i.e. it can change 
 at each bundle activation.
  * it defines a 'view id' that uniquely identifies a particular incarnation 
 of a 'cluster view/state' (which is: a list of active instance ids)
  * and it defines a list of active instance ids
  * the above attributes are passed to interested components via a listener 
 that can be registered. That listener is called whenever the discovery-light 
 service notices that the cluster view has changed.
 While the actual implementation could in fact be based on the existing 
 {{getActiveClusterNodes()}} and {{getClusterId()}} of the 
 {{DocumentNodeStoreMBean}}, the suggestion is not to fiddle with that part, 
 as it has dependencies on other logic, but instead to create a dedicated, 
 separate collection ('discovery') where the heartbeats as well as the 
 currentView are stored.
 Will attach a suggestion for an initial version of this for review.
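The contract sketched in the bullet points above could look roughly like the following. All names here are hypothetical illustrations, not the actual API from the attached patches:

```java
import java.util.List;

// Hypothetical sketch of the discovery-light contract described above:
// a non-persisted local instance id, a view id identifying one
// incarnation of the cluster view, the list of active instance ids,
// and a listener notified on view changes.
public class DiscoveryLiteSketch {

    /** Immutable snapshot of one cluster view incarnation. */
    public static final class ClusterView {
        public final String viewId;           // unique per view incarnation
        public final String localInstanceId;  // non-persisted, per bundle activation
        public final List<String> activeInstanceIds;

        public ClusterView(String viewId, String localInstanceId,
                           List<String> activeInstanceIds) {
            this.viewId = viewId;
            this.localInstanceId = localInstanceId;
            this.activeInstanceIds = activeInstanceIds;
        }
    }

    /** Registered by interested components; called on every view change. */
    public interface InstanceStateChangeListener {
        void viewChanged(ClusterView newView);
    }
}
```

Keeping the snapshot immutable means a listener can never observe a half-updated view, which matters precisely because the service exists to avoid eventual-consistency surprises.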





[jira] [Commented] (OAK-2844) Introducing a simple document-based discovery-light service (to circumvent documentMk's eventual consistency delays)

2015-08-24 Thread Marcel Reutegger (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-2844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14709356#comment-14709356
 ] 

Marcel Reutegger commented on OAK-2844:
---

It also looks like DocumentDiscoveryLiteServiceTest messes up the build because 
it changes the system property 'user.dir'.

 Introducing a simple document-based discovery-light service (to circumvent 
 documentMk's eventual consistency delays)
 

 Key: OAK-2844
 URL: https://issues.apache.org/jira/browse/OAK-2844
 Project: Jackrabbit Oak
  Issue Type: New Feature
  Components: mongomk
Reporter: Stefan Egli
Assignee: Stefan Egli
  Labels: resilience
 Fix For: 1.3.5

 Attachments: InstanceStateChangeListener.java, OAK-2844.WIP-02.patch, 
 OAK-2844.patch, OAK-2844.v3.patch, OAK-2844.v4.patch







[jira] [Commented] (OAK-2844) Introducing a simple document-based discovery-light service (to circumvent documentMk's eventual consistency delays)

2015-08-24 Thread Stefan Egli (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-2844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14709466#comment-14709466
 ] 

Stefan Egli commented on OAK-2844:
--

FYI: reactivated the test (http://svn.apache.org/r1697438) - the user.dir change 
was no longer needed and was a left-over from initial prototyping. 

 Introducing a simple document-based discovery-light service (to circumvent 
 documentMk's eventual consistency delays)
 

 Key: OAK-2844
 URL: https://issues.apache.org/jira/browse/OAK-2844
 Project: Jackrabbit Oak
  Issue Type: New Feature
  Components: mongomk
Reporter: Stefan Egli
Assignee: Stefan Egli
  Labels: resilience
 Fix For: 1.3.5

 Attachments: InstanceStateChangeListener.java, OAK-2844.WIP-02.patch, 
 OAK-2844.patch, OAK-2844.v3.patch, OAK-2844.v4.patch







[jira] [Commented] (OAK-3259) Optimize NodeDocument.getNewestRevision()

2015-08-24 Thread Marcel Reutegger (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-3259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14709386#comment-14709386
 ] 

Marcel Reutegger commented on OAK-3259:
---

Added a cluster test: http://svn.apache.org/r1697410

 Optimize NodeDocument.getNewestRevision()
 -

 Key: OAK-3259
 URL: https://issues.apache.org/jira/browse/OAK-3259
 Project: Jackrabbit Oak
  Issue Type: Improvement
  Components: core, mongomk
Reporter: Marcel Reutegger
Assignee: Marcel Reutegger
  Labels: performance
 Fix For: 1.3.5


 Most of the time NodeDocument.getNewestRevision() is able to quickly identify 
 the newest revision, but sometimes the code falls back to a more expensive 
 calculation, which attempts to read through the available {{_revisions}} and 
 {{_commitRoot}} entries. If either of those maps is empty, the method will 
 go through the entire revision history.
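The fast path vs. fall-back behaviour can be illustrated with a simplified model, with plain sorted maps standing in for the document's {{_revisions}} and {{_commitRoot}} entries; this is not the actual NodeDocument code:

```java
import java.util.SortedMap;

// Simplified model of the lookup: when the _revisions or _commitRoot
// maps carry entries, the newest revision is read directly from the
// sorted maps; when both are empty, the whole revision history must
// be scanned, which is the expensive case noted in the issue.
public class NewestRevisionSketch {

    static long newestRevision(SortedMap<Long, Boolean> revisions,
                               SortedMap<Long, Boolean> commitRoot,
                               long[] fullHistory) {
        // Fast path: the last key of either sorted map is its newest entry.
        if (!revisions.isEmpty() || !commitRoot.isEmpty()) {
            long a = revisions.isEmpty() ? Long.MIN_VALUE : revisions.lastKey();
            long b = commitRoot.isEmpty() ? Long.MIN_VALUE : commitRoot.lastKey();
            return Math.max(a, b);
        }
        // Slow path: O(history) scan through every revision.
        long newest = Long.MIN_VALUE;
        for (long r : fullHistory) {
            newest = Math.max(newest, r);
        }
        return newest;
    }
}
```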





[jira] [Resolved] (OAK-2875) Namespaces keep references to old node states

2015-08-24 Thread Alex Parvulescu (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-2875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Parvulescu resolved OAK-2875.
--
Resolution: Fixed

fixed with http://svn.apache.org/r1697423

 Namespaces keep references to old node states
 -

 Key: OAK-2875
 URL: https://issues.apache.org/jira/browse/OAK-2875
 Project: Jackrabbit Oak
  Issue Type: Sub-task
  Components: core, jcr
Reporter: Alex Parvulescu
Assignee: Alex Parvulescu
 Fix For: 1.3.5

 Attachments: OAK-2875-v1.patch, OAK-2875-v2.patch


 As described in the parent issue OAK-2849, the session namespaces keep a 
 reference to a Tree instance, which will make GC inefficient.





[jira] [Commented] (OAK-3235) Deadlock when closing a concurrently used FileStore

2015-08-24 Thread Francesco Mari (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-3235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14709347#comment-14709347
 ] 

Francesco Mari commented on OAK-3235:
-

{{SegmentWriter.flush()}} is not so different in the 1.0 and 1.2 branches, 
backporting shouldn't be a problem.

 Deadlock when closing a concurrently used FileStore
 ---

 Key: OAK-3235
 URL: https://issues.apache.org/jira/browse/OAK-3235
 Project: Jackrabbit Oak
  Issue Type: Bug
  Components: segmentmk
Affects Versions: 1.3.3
Reporter: Francesco Mari
Assignee: Michael Dürig
Priority: Critical
 Fix For: 1.3.5

 Attachments: OAK-3235-01.patch


 A deadlock was detected while stopping the {{SegmentCompactionIT}} using the 
 exposed MBean.
 {noformat}
 Found one Java-level deadlock:
 =
 pool-1-thread-23:
   waiting to lock monitor 0x7fa8cf1f0488 (object 0x0007a0081e48, a 
 org.apache.jackrabbit.oak.plugins.segment.file.FileStore),
   which is held by main
 main:
   waiting to lock monitor 0x7fa8cc015ff8 (object 0x0007a011f750, a 
 org.apache.jackrabbit.oak.plugins.segment.SegmentWriter),
   which is held by pool-1-thread-23
 Java stack information for the threads listed above:
 ===
 pool-1-thread-23:
   at 
 org.apache.jackrabbit.oak.plugins.segment.file.FileStore.writeSegment(FileStore.java:948)
   - waiting to lock 0x0007a0081e48 (a 
 org.apache.jackrabbit.oak.plugins.segment.file.FileStore)
   at 
 org.apache.jackrabbit.oak.plugins.segment.SegmentWriter.flush(SegmentWriter.java:228)
   - locked 0x0007a011f750 (a 
 org.apache.jackrabbit.oak.plugins.segment.SegmentWriter)
   at 
 org.apache.jackrabbit.oak.plugins.segment.SegmentWriter.prepare(SegmentWriter.java:329)
   at 
 org.apache.jackrabbit.oak.plugins.segment.SegmentWriter.writeListBucket(SegmentWriter.java:447)
   - locked 0x0007a011f750 (a 
 org.apache.jackrabbit.oak.plugins.segment.SegmentWriter)
   at 
 org.apache.jackrabbit.oak.plugins.segment.SegmentWriter.writeList(SegmentWriter.java:698)
   at 
 org.apache.jackrabbit.oak.plugins.segment.SegmentWriter.writeNode(SegmentWriter.java:1190)
   at 
 org.apache.jackrabbit.oak.plugins.segment.SegmentWriter$2.childNodeChanged(SegmentWriter.java:1135)
   at 
 org.apache.jackrabbit.oak.plugins.memory.ModifiedNodeState.compareAgainstBaseState(ModifiedNodeState.java:400)
   at 
 org.apache.jackrabbit.oak.plugins.segment.SegmentWriter.writeNode(SegmentWriter.java:1126)
   at 
 org.apache.jackrabbit.oak.plugins.segment.SegmentWriter$2.childNodeChanged(SegmentWriter.java:1135)
   at 
 org.apache.jackrabbit.oak.plugins.memory.ModifiedNodeState.compareAgainstBaseState(ModifiedNodeState.java:400)
   at 
 org.apache.jackrabbit.oak.plugins.segment.SegmentWriter.writeNode(SegmentWriter.java:1126)
   at 
 org.apache.jackrabbit.oak.plugins.segment.SegmentWriter$2.childNodeChanged(SegmentWriter.java:1135)
   at 
 org.apache.jackrabbit.oak.plugins.memory.ModifiedNodeState.compareAgainstBaseState(ModifiedNodeState.java:400)
   at 
 org.apache.jackrabbit.oak.plugins.segment.SegmentWriter.writeNode(SegmentWriter.java:1126)
   at 
 org.apache.jackrabbit.oak.plugins.segment.SegmentWriter$2.childNodeChanged(SegmentWriter.java:1135)
   at 
 org.apache.jackrabbit.oak.plugins.memory.ModifiedNodeState.compareAgainstBaseState(ModifiedNodeState.java:400)
   at 
 org.apache.jackrabbit.oak.plugins.segment.SegmentWriter.writeNode(SegmentWriter.java:1126)
   at 
 org.apache.jackrabbit.oak.plugins.segment.SegmentWriter$2.childNodeChanged(SegmentWriter.java:1135)
   at 
 org.apache.jackrabbit.oak.plugins.memory.ModifiedNodeState.compareAgainstBaseState(ModifiedNodeState.java:400)
   at 
 org.apache.jackrabbit.oak.plugins.segment.SegmentWriter.writeNode(SegmentWriter.java:1126)
   at 
 org.apache.jackrabbit.oak.plugins.segment.SegmentWriter$2.childNodeChanged(SegmentWriter.java:1135)
   at 
 org.apache.jackrabbit.oak.plugins.memory.ModifiedNodeState.compareAgainstBaseState(ModifiedNodeState.java:400)
   at 
 org.apache.jackrabbit.oak.plugins.segment.SegmentWriter.writeNode(SegmentWriter.java:1126)
   at 
 org.apache.jackrabbit.oak.plugins.segment.SegmentWriter$2.childNodeChanged(SegmentWriter.java:1135)
   at 
 org.apache.jackrabbit.oak.plugins.memory.ModifiedNodeState.compareAgainstBaseState(ModifiedNodeState.java:400)
   at 
 org.apache.jackrabbit.oak.plugins.segment.SegmentWriter.writeNode(SegmentWriter.java:1126)
   at 
 org.apache.jackrabbit.oak.plugins.segment.SegmentWriter.writeNode(SegmentWriter.java:1154)
   at 
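The wait-for cycle in the thread dump above (one thread holding the SegmentWriter monitor while waiting for the FileStore monitor, the other holding them in the opposite order) is a classic lock-ordering inversion. A generic sketch of the usual fix, acquiring both monitors in one global order on every path, with placeholder objects rather than the actual Oak classes:

```java
// Placeholder monitors standing in for FileStore and SegmentWriter.
// In the dump, one thread took WRITER then waited for STORE while the
// other took STORE then waited for WRITER; forcing a single global
// order (STORE before WRITER) on every path makes the cycle impossible.
public class LockOrderingSketch {
    private static final Object STORE = new Object();
    private static final Object WRITER = new Object();
    public static int flushes = 0;

    // Every code path acquires STORE first, then WRITER.
    static void flush() {
        synchronized (STORE) {
            synchronized (WRITER) {
                flushes++;
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Thread a = new Thread(LockOrderingSketch::flush);
        Thread b = new Thread(LockOrderingSketch::flush);
        a.start();
        b.start();
        a.join();  // both threads complete; no deadlock can form
        b.join();
        System.out.println("flushes=" + flushes);
    }
}
```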
 

[jira] [Commented] (OAK-2844) Introducing a simple document-based discovery-light service (to circumvent documentMk's eventual consistency delays)

2015-08-24 Thread Stefan Egli (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-2844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14709369#comment-14709369
 ] 

Stefan Egli commented on OAK-2844:
--

oops, thx for spotting [~mreutegg]!! Disabled the test for now and will find an 
alternative (http://svn.apache.org/r1697407)

 Introducing a simple document-based discovery-light service (to circumvent 
 documentMk's eventual consistency delays)
 

 Key: OAK-2844
 URL: https://issues.apache.org/jira/browse/OAK-2844
 Project: Jackrabbit Oak
  Issue Type: New Feature
  Components: mongomk
Reporter: Stefan Egli
Assignee: Stefan Egli
  Labels: resilience
 Fix For: 1.3.5

 Attachments: InstanceStateChangeListener.java, OAK-2844.WIP-02.patch, 
 OAK-2844.patch, OAK-2844.v3.patch, OAK-2844.v4.patch







[jira] [Updated] (OAK-3261) consider existing locks when creating new ones

2015-08-24 Thread Julian Reschke (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Reschke updated OAK-3261:

Attachment: OAK-3261.diff

proposed patch

 consider existing locks when creating new ones
 --

 Key: OAK-3261
 URL: https://issues.apache.org/jira/browse/OAK-3261
 Project: Jackrabbit Oak
  Issue Type: Sub-task
  Components: jcr
Affects Versions: 1.2.3, 1.3.3, 1.0.18
Reporter: Julian Reschke
Assignee: Julian Reschke
 Fix For: 1.4

 Attachments: OAK-3261.diff


 When creating new locks, existing locks need to be checked:
 - on ancestor nodes, when deep locks
 - on descendant nodes
 (Note that the check on descendant nodes might be costly as long as we have 
 to walk the whole subtree)
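A minimal sketch of the two checks (hypothetical names, not the Oak implementation), showing why the descendant check costs a full subtree walk:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model: before creating a lock on a node, reject if a
// deep lock is held on any ancestor, or if any lock (deep or not) is
// held on any descendant. The descendant check walks the whole
// subtree, which is the cost noted in the issue.
public class LockCheckSketch {
    final String name;
    boolean locked;
    boolean deep;
    final LockCheckSketch parent;
    final List<LockCheckSketch> children = new ArrayList<>();

    LockCheckSketch(String name, LockCheckSketch parent) {
        this.name = name;
        this.parent = parent;
        if (parent != null) {
            parent.children.add(this);
        }
    }

    boolean canLock() {
        // Ancestor check: a deep lock above covers this node.
        for (LockCheckSketch a = parent; a != null; a = a.parent) {
            if (a.locked && a.deep) {
                return false;
            }
        }
        // Descendant check: O(subtree) walk.
        return noLockedDescendant(this);
    }

    private static boolean noLockedDescendant(LockCheckSketch n) {
        for (LockCheckSketch c : n.children) {
            if (c.locked || !noLockedDescendant(c)) {
                return false;
            }
        }
        return true;
    }
}
```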





[jira] [Resolved] (OAK-2844) Introducing a simple document-based discovery-light service (to circumvent documentMk's eventual consistency delays)

2015-08-24 Thread Stefan Egli (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-2844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Egli resolved OAK-2844.
--
Resolution: Fixed

introduced in http://svn.apache.org/r1697355 (in trunk)
feedback incorporated, thx!

 Introducing a simple document-based discovery-light service (to circumvent 
 documentMk's eventual consistency delays)
 

 Key: OAK-2844
 URL: https://issues.apache.org/jira/browse/OAK-2844
 Project: Jackrabbit Oak
  Issue Type: New Feature
  Components: mongomk
Reporter: Stefan Egli
Assignee: Stefan Egli
  Labels: resilience
 Fix For: 1.3.5

 Attachments: InstanceStateChangeListener.java, OAK-2844.WIP-02.patch, 
 OAK-2844.patch, OAK-2844.v3.patch, OAK-2844.v4.patch







[jira] [Created] (OAK-3267) Add discovery-lite descriptor for segmentNodeStore

2015-08-24 Thread Stefan Egli (JIRA)
Stefan Egli created OAK-3267:


 Summary: Add discovery-lite descriptor for segmentNodeStore
 Key: OAK-3267
 URL: https://issues.apache.org/jira/browse/OAK-3267
 Project: Jackrabbit Oak
  Issue Type: Task
Affects Versions: 1.3.4
Reporter: Stefan Egli
Assignee: Stefan Egli
 Fix For: 1.3.5


With OAK-2844 the DocumentNodeStore now exposes a repository descriptor 
'oak.discoverylite.clusterview' - this should also be done for SegmentNodeStore. 
Although that one will be a trivial, static implementation, upper layers should 
not have to worry about whether they are on document or segment.





[jira] [Created] (OAK-3268) Improve datastore resilience

2015-08-24 Thread Michael Marth (JIRA)
Michael Marth created OAK-3268:
--

 Summary: Improve datastore resilience
 Key: OAK-3268
 URL: https://issues.apache.org/jira/browse/OAK-3268
 Project: Jackrabbit Oak
  Issue Type: Improvement
  Components: blob
Reporter: Michael Marth
 Fix For: 1.3.6


As discussed bilaterally, grouping the improvements for datastore resilience in 
this issue for easier tracking.





[jira] [Updated] (OAK-3090) Caching BlobStore implementation

2015-08-24 Thread Michael Marth (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Marth updated OAK-3090:
---
Fix Version/s: (was: 1.3.5)
   1.3.6

 Caching BlobStore implementation 
 -

 Key: OAK-3090
 URL: https://issues.apache.org/jira/browse/OAK-3090
 Project: Jackrabbit Oak
  Issue Type: New Feature
  Components: blob
Reporter: Chetan Mehrotra
  Labels: performance, resilience
 Fix For: 1.3.6


 Storing binaries in Mongo puts a lot of read pressure on MongoDB. To 
 reduce the read load it would be useful to have a filesystem-based cache of 
 frequently used binaries. 
 This would be similar to CachingFDS (OAK-3005) but would be implemented on 
 top of the BlobStore API. 
 Requirements
 * Specify the max binary size which can be cached on file system
 * Limit the size of all binary content present in the cache
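A minimal in-memory sketch of those two requirements (a hypothetical class, with byte arrays standing in for files on disk), assuming an LRU policy for the total-size limit:

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: binaries above maxEntrySize are never cached
// (requirement 1); once total cached bytes exceed maxTotalSize, the
// least-recently-used entries are evicted (requirement 2). A real
// implementation would store files, not byte arrays.
public class BinaryCacheSketch {
    private final long maxEntrySize;
    private final long maxTotalSize;
    private long totalSize = 0;

    // access-order LinkedHashMap iterates least-recently-used first
    private final LinkedHashMap<String, byte[]> cache =
            new LinkedHashMap<>(16, 0.75f, true);

    public BinaryCacheSketch(long maxEntrySize, long maxTotalSize) {
        this.maxEntrySize = maxEntrySize;
        this.maxTotalSize = maxTotalSize;
    }

    public void put(String blobId, byte[] data) {
        if (data.length > maxEntrySize) {
            return; // requirement 1: too large for the cache
        }
        byte[] old = cache.put(blobId, data);
        totalSize += data.length - (old == null ? 0 : old.length);
        // requirement 2: evict LRU entries until under the total cap
        Iterator<Map.Entry<String, byte[]>> it = cache.entrySet().iterator();
        while (totalSize > maxTotalSize && it.hasNext()) {
            totalSize -= it.next().getValue().length;
            it.remove();
        }
    }

    public byte[] get(String blobId) {
        return cache.get(blobId);
    }

    public long totalSize() {
        return totalSize;
    }
}
```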





[jira] [Updated] (OAK-3031) [Blob GC] Mbean for reporting shared repository GC stats

2015-08-24 Thread Michael Marth (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Marth updated OAK-3031:
---
Fix Version/s: (was: 1.3.5)
   1.3.6

 [Blob GC] Mbean for reporting shared repository GC stats
 

 Key: OAK-3031
 URL: https://issues.apache.org/jira/browse/OAK-3031
 Project: Jackrabbit Oak
  Issue Type: Sub-task
  Components: blob
Reporter: Amit Jain
Assignee: Amit Jain
  Labels: resilience, tooling
 Fix For: 1.3.6


 For GC on a shared repository (OAK-1849) it is beneficial to add a JMX Mbean 
 which can provide visibility on the state of GC. It could possibly show:
 * Various repositories registered in the DataStore
 * State of the blob reference collection for the registered repositories
 * Time of the reference files for each registered repository
 * Time interval for the earliest and the latest reference file of the 
 registered repositories. This could be used to possibly automate the sweep 
 phase if the time interval is less than a configured value.





[jira] [Updated] (OAK-3183) [Blob GC] Improvements/tools for blob garbage collection

2015-08-24 Thread Michael Marth (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Marth updated OAK-3183:
---
Fix Version/s: (was: 1.4)
   1.3.6

 [Blob GC] Improvements/tools for blob garbage collection
 

 Key: OAK-3183
 URL: https://issues.apache.org/jira/browse/OAK-3183
 Project: Jackrabbit Oak
  Issue Type: Improvement
  Components: blob
Reporter: Amit Jain
Assignee: Amit Jain
  Labels: resilience, tooling
 Fix For: 1.3.6


 Container issue for improvements and reporting tools for the blob garbage 
 collection.





[jira] [Updated] (OAK-3183) [Blob GC] Improvements/tools for blob garbage collection

2015-08-24 Thread Michael Marth (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Marth updated OAK-3183:
---
Labels: resilience tooling  (was: tooling)

 [Blob GC] Improvements/tools for blob garbage collection
 

 Key: OAK-3183
 URL: https://issues.apache.org/jira/browse/OAK-3183
 Project: Jackrabbit Oak
  Issue Type: Improvement
  Components: blob
Reporter: Amit Jain
Assignee: Amit Jain
  Labels: resilience, tooling
 Fix For: 1.3.6


 Container issue for improvements and reporting tools for the blob garbage 
 collection.





[jira] [Created] (OAK-3269) Improve Lucene indexer resilience

2015-08-24 Thread Michael Marth (JIRA)
Michael Marth created OAK-3269:
--

 Summary: Improve Lucene indexer resilience
 Key: OAK-3269
 URL: https://issues.apache.org/jira/browse/OAK-3269
 Project: Jackrabbit Oak
  Issue Type: Improvement
  Components: lucene
Reporter: Michael Marth


As discussed bilaterally, grouping the improvements for Lucene indexer 
resilience in this issue for easier tracking.





[jira] [Updated] (OAK-3269) Improve Lucene indexer resilience

2015-08-24 Thread Michael Marth (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Marth updated OAK-3269:
---
Fix Version/s: 1.3.6

 Improve Lucene indexer resilience
 -

 Key: OAK-3269
 URL: https://issues.apache.org/jira/browse/OAK-3269
 Project: Jackrabbit Oak
  Issue Type: Improvement
  Components: lucene
Reporter: Michael Marth
  Labels: resilience
 Fix For: 1.3.6


 As discussed bilaterally, grouping the improvements for Lucene indexer 
 resilience in this issue for easier tracking.





[jira] [Updated] (OAK-2556) do intermediate commit during async indexing

2015-08-24 Thread Michael Marth (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-2556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Marth updated OAK-2556:
---
Fix Version/s: (was: 1.3.5)
   1.3.6

 do intermediate commit during async indexing
 

 Key: OAK-2556
 URL: https://issues.apache.org/jira/browse/OAK-2556
 Project: Jackrabbit Oak
  Issue Type: Bug
  Components: lucene
Affects Versions: 1.0.11
Reporter: Stefan Egli
  Labels: resilience
 Fix For: 1.3.6


 A recent issue found at a customer revealed a potential problem with the 
 async indexer. Reading AsyncIndexUpdate.updateIndex, it appears to perform 
 the entire async index update *in one go*, i.e. in a single commit.
 When the async indexer has to process a huge diff, for whatever reason, that 
 one big commit can become gigantic; in fact there is no limit on the size of 
 the commit.
 The suggestion is therefore to do intermediate commits while the async 
 indexing is running. This is acceptable because an async index is by nature 
 not 100% up-to-date anyway, so it would make little difference whether it 
 commits after every 100 or 1000 changes.
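The batching idea can be sketched as a toy model (this is illustrative only, not the actual AsyncIndexUpdate code; the class, the callback, and the batch size of 1000 are all assumptions):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Sketch: instead of persisting the whole diff in one commit, flush an
// intermediate commit every BATCH_SIZE indexed changes.
public class BatchedIndexUpdate {
    static final int BATCH_SIZE = 1000;

    private final List<String> pending = new ArrayList<>();
    private final Consumer<List<String>> committer;
    int commits = 0;

    BatchedIndexUpdate(Consumer<List<String>> committer) {
        this.committer = committer;
    }

    void indexChange(String path) {
        pending.add(path);
        if (pending.size() >= BATCH_SIZE) {
            flush(); // intermediate commit keeps commit size bounded
        }
    }

    void flush() {
        if (!pending.isEmpty()) {
            committer.accept(new ArrayList<>(pending));
            pending.clear();
            commits++;
        }
    }

    public static void main(String[] args) {
        BatchedIndexUpdate update = new BatchedIndexUpdate(batch -> { });
        for (int i = 0; i < 2500; i++) {
            update.indexChange("/content/node-" + i);
        }
        update.flush(); // final commit for the 500-change remainder
        System.out.println(update.commits); // prints 3
    }
}
```

Since the index is async and therefore already slightly stale, readers see no semantic difference between one giant commit and several bounded ones.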





[jira] [Updated] (OAK-2722) IndexCopier fails to delete older index directory upon reindex

2015-08-24 Thread Michael Marth (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-2722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Marth updated OAK-2722:
---
Fix Version/s: (was: 1.3.5)
   1.3.6

 IndexCopier fails to delete older index directory upon reindex
 --

 Key: OAK-2722
 URL: https://issues.apache.org/jira/browse/OAK-2722
 Project: Jackrabbit Oak
  Issue Type: Bug
  Components: lucene
Reporter: Chetan Mehrotra
Assignee: Chetan Mehrotra
Priority: Minor
  Labels: resilience
 Fix For: 1.3.6


 {{IndexCopier}} tries to remove the older index directory in case of a 
 reindex. This can fail on platforms like Windows if the files are still 
 memory mapped or locked.
 For deleting directories we need to take an approach similar to the one used 
 for deleting old index files, i.e. retry the deletion later.
 Due to this, the following test fails on Windows (per [~julian.resc...@gmx.de]):
 {noformat}
 Tests run: 9, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 0.07 sec  
 FAILURE!
 deleteOldPostReindex(org.apache.jackrabbit.oak.plugins.index.lucene.IndexCopierTest)
   Time elapsed: 0.02 sec   FAILURE!
 java.lang.AssertionError: Old index directory should have been removed
 at org.junit.Assert.fail(Assert.java:93)
 at org.junit.Assert.assertTrue(Assert.java:43)
 at org.junit.Assert.assertFalse(Assert.java:68)
 at 
 org.apache.jackrabbit.oak.plugins.index.lucene.IndexCopierTest.deleteOldPostReindex(IndexCopierTest.java:160)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 {noformat}
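The retry approach could look roughly like this (an illustrative sketch under the stated assumptions, not IndexCopier's actual code; class and method names are made up):

```java
import java.io.File;
import java.util.concurrent.TimeUnit;

// Sketch: retry directory deletion a few times, since on Windows the files
// may still be memory mapped or locked when the first attempt is made.
public class RetryingDirDeleter {

    static boolean deleteWithRetry(File dir, int maxAttempts, long delayMillis)
            throws InterruptedException {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (deleteRecursive(dir)) {
                return true;
            }
            TimeUnit.MILLISECONDS.sleep(delayMillis); // wait for locks to clear
        }
        return false; // caller can queue the directory for later cleanup
    }

    static boolean deleteRecursive(File f) {
        File[] children = f.listFiles();
        if (children != null) {
            for (File c : children) {
                deleteRecursive(c);
            }
        }
        return !f.exists() || f.delete();
    }

    public static void main(String[] args) throws Exception {
        File tmp = new File(System.getProperty("java.io.tmpdir"), "old-index-dir");
        tmp.mkdirs();
        System.out.println(deleteWithRetry(tmp, 3, 10));
    }
}
```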





[jira] [Updated] (OAK-2556) do intermediate commit during async indexing

2015-08-24 Thread Michael Marth (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-2556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Marth updated OAK-2556:
---
Issue Type: Improvement  (was: Bug)

 do intermediate commit during async indexing
 

 Key: OAK-2556
 URL: https://issues.apache.org/jira/browse/OAK-2556
 Project: Jackrabbit Oak
  Issue Type: Improvement
  Components: lucene
Affects Versions: 1.0.11
Reporter: Stefan Egli
  Labels: resilience
 Fix For: 1.3.6


 A recent issue found at a customer revealed a potential problem with the 
 async indexer. Reading AsyncIndexUpdate.updateIndex, it appears to perform 
 the entire async index update *in one go*, i.e. in a single commit.
 When the async indexer has to process a huge diff, for whatever reason, that 
 one big commit can become gigantic; in fact there is no limit on the size of 
 the commit.
 The suggestion is therefore to do intermediate commits while the async 
 indexing is running. This is acceptable because an async index is by nature 
 not 100% up-to-date anyway, so it would make little difference whether it 
 commits after every 100 or 1000 changes.





[jira] [Created] (OAK-3270) Improve DocumentMK resilience

2015-08-24 Thread Michael Marth (JIRA)
Michael Marth created OAK-3270:
--

 Summary: Improve DocumentMK resilience
 Key: OAK-3270
 URL: https://issues.apache.org/jira/browse/OAK-3270
 Project: Jackrabbit Oak
  Issue Type: Improvement
  Components: mongomk, rdbmk
Reporter: Michael Marth


Collection of DocMK resilience improvements





[jira] [Comment Edited] (OAK-3263) Support including and excluding paths for PropertyIndex

2015-08-24 Thread Manfred Baedke (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-3263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14710467#comment-14710467
 ] 

Manfred Baedke edited comment on OAK-3263 at 8/25/15 2:37 AM:
--

Added the incomplete patch OAK-3263-prelimary.patch (based on branch 1.0) for 
reference purposes.
[~chetanm], would you take a look and tell me if this is going in the right 
direction?
I'm also unsure about the equivalent of the IndexPlanner patch from OAK-2599 
(the opt-out in case of a query path mismatch); any pointer is appreciated.


was (Author: baedke):
Added the incomplete patch OAK-3263-prelimary.patch for reference purposes.
[~chetanm], would you take a look and tell me if this is going in the right 
direction?
I'm also unsure about the equivalent of the IndexPlanner patch from OAK-2599 
(the opt-out in case of a query path mismatch); any pointer is appreciated.

 Support including and excluding paths for PropertyIndex
 ---

 Key: OAK-3263
 URL: https://issues.apache.org/jira/browse/OAK-3263
 Project: Jackrabbit Oak
  Issue Type: Improvement
  Components: query
Reporter: Chetan Mehrotra
 Fix For: 1.3.6

 Attachments: OAK-3263-prelimary.patch


 As part of OAK-2599, support for excluding and including paths was added to 
 the Lucene index. It would be good to have such support enabled for 
 PropertyIndex as well.
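A hypothetical property index definition illustrating the proposal (the index name and indexed property are made up; `includedPaths`/`excludedPaths` follow the naming introduced for the Lucene index in OAK-2599):

```
/oak:index/fooPropertyIndex
  - jcr:primaryType = "oak:QueryIndexDefinition"
  - type = "property"
  - propertyNames = ["foo"]
  - includedPaths = ["/content"]
  - excludedPaths = ["/content/archive"]
```

With such a definition the index would only cover nodes under /content, skipping /content/archive, which keeps the index smaller and lets the query engine opt out when the query path does not match.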





[jira] [Updated] (OAK-3263) Support including and excluding paths for PropertyIndex

2015-08-24 Thread Manfred Baedke (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manfred Baedke updated OAK-3263:

Attachment: OAK-3263-prelimary.patch

Added the incomplete patch OAK-3263-prelimary.patch for reference purposes.
[~chetanm], would you take a look and tell me if this is going in the right 
direction?
I'm also unsure about the equivalent of the IndexPlanner patch from OAK-2599 
(the opt-out in case of a query path mismatch); any pointer is appreciated.

 Support including and excluding paths for PropertyIndex
 ---

 Key: OAK-3263
 URL: https://issues.apache.org/jira/browse/OAK-3263
 Project: Jackrabbit Oak
  Issue Type: Improvement
  Components: query
Reporter: Chetan Mehrotra
 Fix For: 1.3.6

 Attachments: OAK-3263-prelimary.patch


 As part of OAK-2599, support for excluding and including paths was added to 
 the Lucene index. It would be good to have such support enabled for 
 PropertyIndex as well.


