[jira] [Commented] (OAK-3554) Use write concern of w:majority when connected to a replica set

2015-10-27 Thread Marcel Reutegger (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-3554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14975881#comment-14975881
 ] 

Marcel Reutegger commented on OAK-3554:
---

+1

> Use write concern of w:majority when connected to a replica set
> ---
>
> Key: OAK-3554
> URL: https://issues.apache.org/jira/browse/OAK-3554
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: mongomk
>Reporter: Chetan Mehrotra
> Fix For: 1.3.10
>
>
> Currently, while connecting to MongoDB, MongoDocumentStore relies on the 
> default write concern provided as part of the mongouri. 
> Recently some issues were seen where a Mongo-based Oak instance was connecting 
> to a 3-member replica set and there were frequent replica state changes due to 
> the use of VMs for Mongo. This caused data loss and corruption of data in Oak.
> To avoid such situations, Oak should default to a write concern of majority. 
> If a write concern is specified as part of the mongouri, then that should take 
> precedence. This allows a system admin to make the call on tweaking the write 
> concern if required, and at the same time lets Oak use the safe write concern 
> by default.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (OAK-3554) Use write concern of w:majority when connected to a replica set

2015-10-27 Thread Chetan Mehrotra (JIRA)
Chetan Mehrotra created OAK-3554:


 Summary: Use write concern of w:majority when connected to a 
replica set
 Key: OAK-3554
 URL: https://issues.apache.org/jira/browse/OAK-3554
 Project: Jackrabbit Oak
  Issue Type: Improvement
  Components: mongomk
Reporter: Chetan Mehrotra
 Fix For: 1.3.10


Currently, while connecting to MongoDB, MongoDocumentStore relies on the 
default write concern provided as part of the mongouri. 

Recently some issues were seen where a Mongo-based Oak instance was connecting 
to a 3-member replica set and there were frequent replica state changes due to 
the use of VMs for Mongo. This caused data loss and corruption of data in Oak.

To avoid such situations, Oak should default to a write concern of majority. If 
a write concern is specified as part of the mongouri, then that should take 
precedence. This allows a system admin to make the call on tweaking the write 
concern if required, and at the same time lets Oak use the safe write concern 
by default.
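The precedence rule described above can be sketched as follows. This is a 
minimal illustration only; `WriteConcernChooser`, its method, and the returned 
strings are invented for the example and are not the actual MongoDocumentStore 
API:

```java
// Sketch of the proposed precedence rule (all names are illustrative):
// an explicit "w=" option in the mongouri wins; otherwise a replica-set
// connection defaults to "majority".
public class WriteConcernChooser {

    /**
     * @param uri          the mongouri used to connect
     * @param isReplicaSet whether the client is connected to a replica set
     * @return the write concern to use, in string form
     */
    static String chooseWriteConcern(String uri, boolean isReplicaSet) {
        // An explicit write concern in the URI takes precedence.
        int q = uri.indexOf('?');
        if (q >= 0) {
            for (String option : uri.substring(q + 1).split("&")) {
                if (option.startsWith("w=")) {
                    return option.substring(2);
                }
            }
        }
        // Otherwise default to the safe choice on a replica set.
        return isReplicaSet ? "majority" : "acknowledged";
    }
}
```

With this rule, `mongodb://host/oak?w=2` keeps `w=2`, while a plain 
`mongodb://host/oak` on a replica set ends up with `majority`.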





[jira] [Created] (OAK-3555) Remove usage of deprecated mongo-java-driver methods

2015-10-27 Thread Marcel Reutegger (JIRA)
Marcel Reutegger created OAK-3555:
-

 Summary: Remove usage of deprecated mongo-java-driver methods
 Key: OAK-3555
 URL: https://issues.apache.org/jira/browse/OAK-3555
 Project: Jackrabbit Oak
  Issue Type: Improvement
  Components: core, mongomk
Reporter: Marcel Reutegger
Assignee: Marcel Reutegger
Priority: Minor
 Fix For: 1.3.10


This will allow for a smoother update of the driver to 3.x.





[jira] [Updated] (OAK-3555) Remove usage of deprecated mongo-java-driver methods

2015-10-27 Thread Marcel Reutegger (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcel Reutegger updated OAK-3555:
--
Attachment: OAK-3555.patch

Attached patch removes all usage of deprecated mongo-java-driver methods.

> Remove usage of deprecated mongo-java-driver methods
> 
>
> Key: OAK-3555
> URL: https://issues.apache.org/jira/browse/OAK-3555
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: core, mongomk
>Reporter: Marcel Reutegger
>Assignee: Marcel Reutegger
>Priority: Minor
> Fix For: 1.3.10
>
> Attachments: OAK-3555.patch
>
>
> This will allow for a smoother update of the driver to 3.x.





[jira] [Commented] (OAK-3554) Use write concern of w:majority when connected to a replica set

2015-10-27 Thread Chetan Mehrotra (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-3554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14975820#comment-14975820
 ] 

Chetan Mehrotra commented on OAK-3554:
--

Due to OAK-2592, even setting the write concern would not be effective in all 
cases, but the chances of corruption would be reduced. So it would be good to 
have this change done in parallel with OAK-2592.

> Use write concern of w:majority when connected to a replica set
> ---
>
> Key: OAK-3554
> URL: https://issues.apache.org/jira/browse/OAK-3554
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: mongomk
>Reporter: Chetan Mehrotra
> Fix For: 1.3.10
>
>
> Currently, while connecting to MongoDB, MongoDocumentStore relies on the 
> default write concern provided as part of the mongouri. 
> Recently some issues were seen where a Mongo-based Oak instance was connecting 
> to a 3-member replica set and there were frequent replica state changes due to 
> the use of VMs for Mongo. This caused data loss and corruption of data in Oak.
> To avoid such situations, Oak should default to a write concern of majority. 
> If a write concern is specified as part of the mongouri, then that should take 
> precedence. This allows a system admin to make the call on tweaking the write 
> concern if required, and at the same time lets Oak use the safe write concern 
> by default.





[jira] [Updated] (OAK-3489) DocumentStore: introduce a "NotEquals" condition

2015-10-27 Thread Marcel Reutegger (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcel Reutegger updated OAK-3489:
--
Attachment: OAK-3489-mreutegg.patch

The patch looks very good. I attached a slightly modified version with equals() 
and notEquals() methods that annotate the value parameter with Nullable.

> DocumentStore: introduce a "NotEquals" condition
> 
>
> Key: OAK-3489
> URL: https://issues.apache.org/jira/browse/OAK-3489
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: mongomk, rdbmk
>Affects Versions: 1.0.21, 1.2.6, 1.3.7
>Reporter: Julian Reschke
>Assignee: Julian Reschke
>Priority: Minor
> Attachments: OAK-3489-mreutegg.patch, OAK-3489.diff
>
>






[jira] [Commented] (OAK-2106) Optimize reads from secondaries

2015-10-27 Thread JIRA

[ 
https://issues.apache.org/jira/browse/OAK-2106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14976097#comment-14976097
 ] 

Tomek Rękawek commented on OAK-2106:


{quote}Let's say the estimator measures a lag of 2 seconds at time T. That is, 
secondaries have synced up to T-2s. At T+5s the secondaries still lag behind at 
T-2s.{quote}

Let's have S - secondary optime, P - primary optime, T - current time. The lag 
is measured as S-P, not S-T. This should allow us to avoid the case in which 
the lag is large but we happen to measure it right after some operation has 
been applied.

If we want to make it more reliable, we can measure e.g. the last 10 values and 
return the largest one.

{quote}I'm also a bit concerned about introducing a dependency from 
MongoDocumentStore to classes like UnmergedBranches and UnsavedModifications.
I would rather like to see a solution where the client of the DocumentStore can 
express how fresh the document needs to be when it reads from the store.{quote}

It concerns me as well (as this is some kind of circular dependency), but I 
wasn't able to find anything better. The access to unmerged branches is 
necessary so that we won't ask the secondary about a path belonging to a 
branch. It doesn't depend on time, as a user may modify many nodes (which will 
result in creating a branch) and keep the changes unmerged for a very long 
time.

The situation looks a bit different with the UnsavedModifications, as they are 
saved on a regular basis ({{asyncDelay}}) - we can add this value to the 
estimated lag to be sure that the background update thread has run and the 
changes have been replicated.

{quote}I would rather like to see a solution where the client of the 
DocumentStore can express how fresh the document needs to be when it reads from 
the store. I think this also means the decision whether a read can be directed 
to a secondary must not depend on the lag as a duration, but should rather 
calculate a time when it is safe to read from a secondary.{quote}

We can take the {{find(maxCacheAge)}} parameter into consideration in 
{{getMongoReadPreference}}; however, it doesn't solve the issue with the 
unmerged branches.

{quote}The tricky part here is how to handle time differences on the machines 
where the Oak cluster nodes are running and the MongoDB replica set. Each 
change on a document is associated with a revision, where the timestamp of the 
revision is tied to the local clock where the revision was created. The oplog 
timestamp on the other hand is derived from the primary replica set member 
clock, I assume.{quote}

The replica set status is taken from the primary. For each secondary member we 
have 3 times available:

* optime - secondary time of the last operation applied,
* lastHeartbeat - secondary time of the last heartbeat sent,
* lastHeartbeatRecv - primary time of the last heartbeat received.

The primary member provides:

* optime,
* current timestamp.

As stated above, I estimate the lag by subtracting the primary optime from the 
secondary optime. These two times come from different machines, and therefore 
clock differences will make the estimate less accurate.

The other way of measuring the lag would be to compare lastHeartbeatRecv with 
the current timestamp. These two times come from the same machine (the 
primary). It tells us how often the secondary asks for changes, but not how 
long it takes to apply them. Maybe the first thing is more important - if so, I 
can change the estimation method.
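The windowed estimation suggested above ("measure e.g. the last 10 values and 
return the largest one") could look like the following sketch. The class and 
method names are invented for the example; it is not Oak's actual estimator:

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Illustrative sketch of the lag estimation discussed above: each sample
// is P - S (primary optime minus secondary optime); to be conservative
// the estimator keeps the last N samples and reports the largest one.
public class LagEstimator {
    private final int window;
    private final Deque<Long> samples = new ArrayDeque<>();

    LagEstimator(int window) {
        this.window = window;
    }

    /** Record one measurement taken from rs.status(). */
    void sample(long primaryOptimeMillis, long secondaryOptimeMillis) {
        samples.addLast(primaryOptimeMillis - secondaryOptimeMillis);
        if (samples.size() > window) {
            samples.removeFirst(); // keep only the last N samples
        }
    }

    /** The largest lag seen in the window, 0 when no samples exist. */
    long estimatedLagMillis() {
        long max = 0;
        for (long s : samples) {
            max = Math.max(max, s);
        }
        return max;
    }
}
```

Returning the window maximum means one freshly-applied operation right before a 
measurement cannot hide a lag observed moments earlier.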

> Optimize reads from secondaries
> ---
>
> Key: OAK-2106
> URL: https://issues.apache.org/jira/browse/OAK-2106
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: core, mongomk
>Reporter: Marcel Reutegger
>Assignee: Marcel Reutegger
>  Labels: performance, scalability
>
> OAK-1645 introduced support for reads from secondaries under certain
> conditions. The current implementation checks the _lastRev on a potentially
> cached parent document and reads from a secondary if it has not been
> modified in the last 6 hours. This timespan is somewhat arbitrary but
> reflects the assumption that the replication lag of a secondary shouldn't
> be more than 6 hours.
> This logic should be optimized to take the actual replication lag into
> account. MongoDB provides information about the replication lag with
> the command rs.status().





[jira] [Updated] (OAK-3236) integration test that simulates influence of clock drift

2015-10-27 Thread Stefan Egli (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Egli updated OAK-3236:
-
Fix Version/s: (was: 1.3.9)
   1.3.10

> integration test that simulates influence of clock drift
> 
>
> Key: OAK-3236
> URL: https://issues.apache.org/jira/browse/OAK-3236
> Project: Jackrabbit Oak
>  Issue Type: Test
>  Components: core
>Affects Versions: 1.3.4
>Reporter: Stefan Egli
>Assignee: Stefan Egli
> Fix For: 1.3.10
>
>
> Spin-off of OAK-2739, from [this 
> comment|https://issues.apache.org/jira/browse/OAK-2739?focusedCommentId=14693398&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14693398],
>  i.e. there should be an integration test that showcases the issues with 
> clock drift and why it is a good idea to have a lease check (one that refuses 
> to let the document store be used any further once the lease times out 
> locally).





[jira] [Updated] (OAK-3001) Simplify JournalGarbageCollector using a dedicated timestamp property

2015-10-27 Thread Stefan Egli (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Egli updated OAK-3001:
-
Fix Version/s: (was: 1.3.9)
   1.3.10

> Simplify JournalGarbageCollector using a dedicated timestamp property
> -
>
> Key: OAK-3001
> URL: https://issues.apache.org/jira/browse/OAK-3001
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: core, mongomk
>Reporter: Stefan Egli
>Assignee: Stefan Egli
>Priority: Critical
>  Labels: scalability
> Fix For: 1.3.10, 1.2.8
>
>
> This subtask is about spawning out a 
> [comment|https://issues.apache.org/jira/browse/OAK-2829?focusedCommentId=14585733&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14585733]
>  from [~chetanm] re JournalGC:
> {quote}
> Further looking at JournalGarbageCollector ... it would be simpler if you 
> recorded the journal entry timestamp as an attribute in the JournalEntry 
> document; then you could delete all the entries which are older than some 
> time with a simple query. This would avoid fetching all the entries to be 
> deleted on the Oak side
> {quote}
> and a corresponding 
> [reply|https://issues.apache.org/jira/browse/OAK-2829?focusedCommentId=14585870&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14585870]
>  from myself:
> {quote}
> Re querying by timestamp: that would indeed be simpler. With the current set 
> of DocumentStore APIs, however, I believe this is not possible. But 
> [DocumentStore.query|https://github.com/apache/jackrabbit-oak/blob/trunk/oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/document/DocumentStore.java#L127]
>  comes quite close: it would probably just require the opposite of that 
> method too: 
> {code}
> public <T extends Document> List<T> query(Collection<T> collection,
>                                           String fromKey,
>                                           String toKey,
>                                           String indexedProperty,
>                                           long endValue,
>                                           int limit) {
> {code}
> .. or what about generalizing this method to have both a {{startValue}} and 
> an {{endValue}} - with {{-1}} indicating when one of them is not used?
> {quote}
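As a toy illustration of the generalized range query discussed in the quoted 
exchange, here is an in-memory stand-in (deliberately not the DocumentStore 
API; documents are plain maps) with {{-1}} marking an unused bound, which is 
exactly what a journal GC would need to delete entries below a timestamp cutoff 
in one query:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Toy in-memory illustration of a range query with both a startValue
// and an endValue, where -1 marks an unused bound.
public class RangeQuery {

    static List<Map<String, Long>> query(List<Map<String, Long>> docs,
                                         String indexedProperty,
                                         long startValue,
                                         long endValue) {
        List<Map<String, Long>> result = new ArrayList<>();
        for (Map<String, Long> doc : docs) {
            Long v = doc.get(indexedProperty);
            if (v == null) {
                continue; // property not indexed on this document
            }
            if (startValue != -1 && v < startValue) {
                continue; // below the lower bound
            }
            if (endValue != -1 && v >= endValue) {
                continue; // at or above the upper bound
            }
            result.add(doc);
        }
        return result;
    }
}
```

A journal GC built on this shape would call it with `startValue = -1` and 
`endValue = cutoff` and delete the returned entries server-side instead of 
fetching them all first.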





[jira] [Commented] (OAK-3494) MemoryDiffCache should also check parent paths before falling to Loader (or returning null)

2015-10-27 Thread Marcel Reutegger (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-3494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14976166#comment-14976166
 ] 

Marcel Reutegger commented on OAK-3494:
---

Thanks a lot Vikas. Patch looks good. I will apply it after the 1.3.9 release.

> MemoryDiffCache should also check parent paths before falling to Loader (or 
> returning null)
> ---
>
> Key: OAK-3494
> URL: https://issues.apache.org/jira/browse/OAK-3494
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: core, mongomk
>Reporter: Vikas Saurabh
>Assignee: Marcel Reutegger
>  Labels: performance
> Fix For: 1.3.10
>
> Attachments: OAK-3494-1.patch, OAK-3494-2.patch, 
> OAK-3494-TestCase.patch, OAK-3494.patch
>
>
> Each entry in {{MemoryDiffCache}} is keyed with {{(path, fromRev, toRev)}} 
> for the list of modified children at {{path}}. A diff calculated by 
> {{DocumentNodeStore.diffImpl}} at '/' (passively via the loader) or 
> {{JournalEntry.applyTo}} (actively) fills each path for which there are 
> modified children (including the hierarchy).
> But if an observer calls {{compareWithBaseState}} on an unmodified sub-tree, 
> the observer will still go down to {{diffImpl}}, although a cached parent 
> entry could be used to answer the query.
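The parent-path check described here can be sketched with a toy cache: a plain 
map from path to the set of modified child names stands in for the real 
MemoryDiffCache, and all names are illustrative. If a cached ancestor does not 
list our subtree among its modified children, the diff at the requested path 
must be empty:

```java
import java.util.Map;
import java.util.Set;

// Toy sketch (not the MemoryDiffCache API): before falling back to the
// loader for a path, walk up the cached parent entries; if a parent's
// entry does not list our ancestor as modified, nothing below changed.
public class ParentDiffLookup {

    /**
     * @param cache path -> names of children modified between fromRev/toRev
     * @return "" when the diff is provably empty, the cached diff when the
     *         path itself is cached, or null when the loader must be asked
     */
    static String diff(Map<String, Set<String>> cache, String path) {
        Set<String> direct = cache.get(path);
        if (direct != null) {
            return String.join(",", direct);
        }
        String child = path;
        int slash;
        while ((slash = child.lastIndexOf('/')) > 0
                || (slash == 0 && child.length() > 1)) {
            String parent = slash == 0 ? "/" : child.substring(0, slash);
            String name = child.substring(slash + 1);
            Set<String> entry = cache.get(parent);
            if (entry != null) {
                // Parent is cached: if our ancestor is not among its
                // modified children, nothing below it changed either.
                return entry.contains(name) ? null : "";
            }
            child = parent;
            if ("/".equals(child)) {
                break;
            }
        }
        return null; // unknown, must fall back to the loader
    }
}
```

E.g. with only {{"/" -> {"a"}}} cached, a lookup for `/b/c` answers "empty" 
from the root entry, while `/a/x` still has to go to the loader.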





[jira] [Updated] (OAK-3348) Cross gc sessions might introduce references to pre-compacted segments

2015-10-27 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/OAK-3348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Dürig updated OAK-3348:
---
Attachment: cross-gc-refs.pdf

Attaching a visualisation of the cross-gc references: the big circle shows the 
segments before compaction, the small circle shows the segments created by 
compaction, and the medium circle shows the segments written after compaction. 
Red edges show references out of a segment and blue ones references into a 
segment. The trouble spot here is the references from the medium circle to the 
big circle; those shouldn't be there.

> Cross gc sessions might introduce references to pre-compacted segments
> --
>
> Key: OAK-3348
> URL: https://issues.apache.org/jira/browse/OAK-3348
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: segmentmk
>Reporter: Michael Dürig
>Assignee: Michael Dürig
>  Labels: cleanup, compaction, gc
> Fix For: 1.4
>
> Attachments: OAK-3348-1.patch, OAK-3348-2.patch, OAK-3348.patch, 
> cross-gc-refs.pdf, image.png
>
>
> I suspect that certain write operations during compaction can cause 
> references from compacted segments to pre-compacted ones. This would 
> effectively prevent the pre-compacted segments from getting evicted in 
> subsequent cleanup phases. 
> The scenario is as follows:
> * A session is opened and a lot of content is written to it, such that the 
> update limit is exceeded. This causes the changes to be written to disk. 
> * Revision gc runs, causing a new, compacted root node state to be written to 
> disk.
> * The session saves its changes. This causes rebasing of its changes onto the 
> current root (the compacted one). At this point any node that has been added 
> will be added again in the sub-tree rooted at the current root. Such nodes, 
> however, might have been written to disk *before* revision gc ran and might 
> thus be contained in pre-compacted segments. As I suspect that the node-add 
> operation in the rebasing process does *not* create a deep copy of such nodes 
> but rather creates a *reference* to them, a reference to a pre-compacted 
> segment is introduced here. 
> Going forward we need to validate the above hypothesis, assess its impact 
> and, if necessary, come up with a solution.





[jira] [Created] (OAK-3556) MongoDocumentStore may create incomplete document

2015-10-27 Thread Marcel Reutegger (JIRA)
Marcel Reutegger created OAK-3556:
-

 Summary: MongoDocumentStore may create incomplete document
 Key: OAK-3556
 URL: https://issues.apache.org/jira/browse/OAK-3556
 Project: Jackrabbit Oak
  Issue Type: Bug
  Components: core, mongomk
Affects Versions: 1.2, 1.0
Reporter: Marcel Reutegger
Assignee: Marcel Reutegger
Priority: Minor
 Fix For: 1.3.10


The document is incomplete when there are multiple set-map-entry operations
for the same name but with different revisions.

Right now the DocumentNodeStore does not create such documents, which means 
this is not a problem in practice.





[jira] [Updated] (OAK-3494) MemoryDiffCache should also check parent paths before falling to Loader (or returning null)

2015-10-27 Thread Marcel Reutegger (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcel Reutegger updated OAK-3494:
--
Fix Version/s: (was: 1.3.9)
   1.3.10

> MemoryDiffCache should also check parent paths before falling to Loader (or 
> returning null)
> ---
>
> Key: OAK-3494
> URL: https://issues.apache.org/jira/browse/OAK-3494
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: core, mongomk
>Reporter: Vikas Saurabh
>Assignee: Marcel Reutegger
>  Labels: performance
> Fix For: 1.3.10
>
> Attachments: OAK-3494-1.patch, OAK-3494-2.patch, 
> OAK-3494-TestCase.patch, OAK-3494.patch
>
>
> Each entry in {{MemoryDiffCache}} is keyed with {{(path, fromRev, toRev)}} 
> for the list of modified children at {{path}}. A diff calculated by 
> {{DocumentNodeStore.diffImpl}} at '/' (passively via the loader) or 
> {{JournalEntry.applyTo}} (actively) fills each path for which there are 
> modified children (including the hierarchy).
> But if an observer calls {{compareWithBaseState}} on an unmodified sub-tree, 
> the observer will still go down to {{diffImpl}}, although a cached parent 
> entry could be used to answer the query.





[jira] [Updated] (OAK-3436) Prevent missing checkpoint due to unstable topology from causing complete reindexing

2015-10-27 Thread Davide Giannella (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Davide Giannella updated OAK-3436:
--
Fix Version/s: (was: 1.3.9)
   1.3.10

> Prevent missing checkpoint due to unstable topology from causing complete 
> reindexing
> 
>
> Key: OAK-3436
> URL: https://issues.apache.org/jira/browse/OAK-3436
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: query
>Reporter: Chetan Mehrotra
>Assignee: Chetan Mehrotra
> Fix For: 1.3.10, 1.0.23, 1.2.8
>
> Attachments: AsyncIndexUpdateClusterTest.java, OAK-3436-0.patch
>
>
> Async indexing logic relies on the embedding application to ensure that the 
> async indexing job is run as a singleton in a cluster. For Sling-based apps 
> it depends on Sling Discovery support. At times it has been seen that, if the 
> topology is not stable, different cluster nodes can consider themselves the 
> leader and execute the async indexing job concurrently.
> This can cause problems, as the two cluster nodes might not see the same 
> repository state (due to write skew and eventual consistency) and might 
> remove the checkpoint which the other cluster node is still relying upon. 
> E.g. consider a 2-node cluster N1 and N2 where both are performing async 
> indexing.
> # Base state - CP1 is the checkpoint for the "async" job
> # N2 starts indexing and removes changes CP1 to CP2. For Mongo the 
> checkpoints are saved in the {{settings}} collection
> # N1 also decides to execute indexing but has not yet seen the latest 
> repository state, so it still thinks that CP1 is the base checkpoint and 
> tries to read it. However, CP1 has already been removed from {{settings}}, 
> and this makes N1 think that the checkpoint is missing, so it decides to 
> reindex everything!
> To avoid this the topology must be stable, but at the Oak level we should 
> still handle such a case and avoid doing a full reindex. So we would need a 
> {{MissingCheckpointStrategy}} similar to the {{MissingIndexEditorStrategy}} 
> introduced in OAK-2203.
> Possible approaches
> # A1 - Fail the indexing run if the checkpoint is missing - A checkpoint can 
> go missing for both valid and invalid reasons. Need to see what the valid 
> scenarios are in which a checkpoint can go missing
> # A2 - When a checkpoint is created, also store the creation time. When a 
> checkpoint is found to be missing and it is a *recent* checkpoint, fail the 
> run. E.g. we would fail the run as long as the missing checkpoint is less 
> than an hour old (for a just-started instance, take the startup time into 
> account)
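Approach A2 boils down to a small policy decision. A hedged sketch follows; the 
class and method names are invented, and the one-hour threshold is the example 
value from the description, not a fixed Oak constant:

```java
// Sketch of approach A2 (names illustrative): when the referenced
// checkpoint is missing, fail the indexing run only if the checkpoint
// is recent; an old missing checkpoint falls back to full reindexing.
public class MissingCheckpointPolicy {
    static final long RECENT_MILLIS = 60 * 60 * 1000L; // one hour

    /**
     * @param checkpointCreatedAt creation time stored with the checkpoint
     * @param now                 current time
     * @return true if the run should fail (and be retried later)
     *         instead of triggering a full reindex
     */
    static boolean failRun(long checkpointCreatedAt, long now) {
        return now - checkpointCreatedAt < RECENT_MILLIS;
    }
}
```

A real implementation would additionally account for the startup time of a 
just-started instance, as the description notes.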





[jira] [Updated] (OAK-3335) RepositorySidegrade has runtime dependency on jackrabbit-core

2015-10-27 Thread Davide Giannella (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Davide Giannella updated OAK-3335:
--
Fix Version/s: (was: 1.3.9)
   1.3.10

> RepositorySidegrade has runtime dependency on jackrabbit-core
> -
>
> Key: OAK-3335
> URL: https://issues.apache.org/jira/browse/OAK-3335
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: upgrade
>Affects Versions: 1.3.5
>Reporter: Julian Sedding
>Assignee: Julian Sedding
> Fix For: 1.3.10
>
>
> It should be possible to run {{RepositorySidegrade}} from a runnable jar file 
> that does not embed {{jackrabbit-core}}. E.g. once {{RepositorySidegrade}} is 
> enabled in {{oak-run}}, it should work in the variant that does not embed 
> jackrabbit-core.
> OAK-3239 introduced a transitive runtime dependency from 
> {{RepositorySidegrade}} to {{RepositoryUpgrade}} (via static imports), which 
> has dependencies on classes in {{jackrabbit-core}}. This leads to a failure 
> with a {{ClassNotFoundException}}.
> Moving the constants and static methods to {{RepositorySidegrade}} and 
> importing them in {{RepositoryUpgrade}} instead should resolve the issue.





[jira] [Updated] (OAK-2477) Move suggester specific config to own configuration node

2015-10-27 Thread Davide Giannella (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-2477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Davide Giannella updated OAK-2477:
--
Fix Version/s: (was: 1.3.9)
   1.3.10

> Move suggester specific config to own configuration node
> 
>
> Key: OAK-2477
> URL: https://issues.apache.org/jira/browse/OAK-2477
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: lucene
>Reporter: Tommaso Teofili
>Assignee: Tommaso Teofili
>  Labels: technical_debt
> Fix For: 1.3.10
>
>
> Currently suggester configuration is controlled via properties defined on the 
> main config / props node, but it would be good to have its own place to 
> configure the whole suggest feature, so as not to mix it up with the 
> configuration of other features / parameters.





[jira] [Updated] (OAK-1819) oak-solr-core test failures on Java 8

2015-10-27 Thread Davide Giannella (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-1819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Davide Giannella updated OAK-1819:
--
Fix Version/s: (was: 1.3.9)
   1.3.10

> oak-solr-core test failures on Java 8
> -
>
> Key: OAK-1819
> URL: https://issues.apache.org/jira/browse/OAK-1819
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: solr
>Affects Versions: 1.0
> Environment: {noformat}
> Apache Maven 3.1.0 (893ca28a1da9d5f51ac03827af98bb730128f9f2; 2013-06-27 
> 22:15:32-0400)
> Maven home: c:\Program Files\apache-maven-3.1.0
> Java version: 1.8.0, vendor: Oracle Corporation
> Java home: c:\Program Files\Java\jdk1.8.0\jre
> Default locale: en_US, platform encoding: Cp1252
> OS name: "windows 7", version: "6.1", arch: "amd64", family: "dos"
> {noformat}
>Reporter: Jukka Zitting
>Assignee: Tommaso Teofili
>Priority: Minor
>  Labels: java8, test
> Fix For: 1.3.10
>
>
> The following {{oak-solr-core}} test failures occur when building Oak with 
> Java 8:
> {noformat}
> Failed tests:
>   
> testNativeMLTQuery(org.apache.jackrabbit.oak.plugins.index.solr.query.SolrIndexQueryTest):
>  expected: but was:
>   
> testNativeMLTQueryWithStream(org.apache.jackrabbit.oak.plugins.index.solr.query.SolrIndexQueryTest):
>  expected: but was:
> {noformat}
> The cause of this might well be something as simple as the test case 
> incorrectly expecting a specific ordering of search results.





[jira] [Commented] (OAK-3554) Use write concern of w:majority when connected to a replica set

2015-10-27 Thread Chetan Mehrotra (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-3554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14976109#comment-14976109
 ] 

Chetan Mehrotra commented on OAK-3554:
--

Some more thoughts - it might happen that setting w:majority globally for 
*every* write leads to a significant performance loss. We can probably look 
into using a higher write concern only for the more critical writes. For 
example, writes which mark a revision as committed on the commit root could be 
set with w:majority, while at other places we use the default write concern.

Per [1], write concern relies on the replication log, i.e. writes are not sent 
to individual servers directly but are instead pushed to the replication log; 
from there the call waits until the write has propagated to enough members to 
satisfy the requested concern.

[1] https://docs.mongodb.org/manual/core/replica-set-write-concern/

> Use write concern of w:majority when connected to a replica set
> ---
>
> Key: OAK-3554
> URL: https://issues.apache.org/jira/browse/OAK-3554
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: mongomk
>Reporter: Chetan Mehrotra
> Fix For: 1.3.10
>
>
> Currently, while connecting to MongoDB, MongoDocumentStore relies on the 
> default write concern provided as part of the mongouri. 
> Recently some issues were seen where a Mongo-based Oak instance was connecting 
> to a 3-member replica set and there were frequent replica state changes due to 
> the use of VMs for Mongo. This caused data loss and corruption of data in Oak.
> To avoid such situations, Oak should default to a write concern of majority. 
> If a write concern is specified as part of the mongouri, then that should take 
> precedence. This allows a system admin to make the call on tweaking the write 
> concern if required, and at the same time lets Oak use the safe write concern 
> by default.





[jira] [Updated] (OAK-3149) SuggestHelper should manage a suggestor per index definition

2015-10-27 Thread Davide Giannella (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Davide Giannella updated OAK-3149:
--
Fix Version/s: (was: 1.3.9)
   1.3.10

> SuggestHelper should manage a suggestor per index definition
> 
>
> Key: OAK-3149
> URL: https://issues.apache.org/jira/browse/OAK-3149
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: lucene
>Reporter: Chetan Mehrotra
>Assignee: Tommaso Teofili
> Fix For: 1.3.10
>
>
> {{SuggestHelper}} currently keeps a static reference to the suggestor and 
> thus has a singleton suggestor for the whole repo. Instead, it should be 
> implemented in such a way that a suggestor instance is associated with an 
> index definition. Logically the suggestor instance should be part of 
> IndexNode, similar to how {{IndexSearcher}} instances are managed.





[jira] [Updated] (OAK-2065) JMX stats for operations being performed in DocumentStore

2015-10-27 Thread Davide Giannella (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-2065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Davide Giannella updated OAK-2065:
--
Fix Version/s: (was: 1.3.9)
   1.3.10

> JMX stats for operations being performed in DocumentStore
> -
>
> Key: OAK-2065
> URL: https://issues.apache.org/jira/browse/OAK-2065
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: mongomk
>Reporter: Chetan Mehrotra
>Assignee: Chetan Mehrotra
>  Labels: tooling
> Fix For: 1.3.10
>
> Attachments: 
> 0001-OAK-2065-JMX-stats-for-operations-being-performed-in.patch, 
> OAK-2065-1.patch
>
>
> Currently DocumentStore performs various background operations like
> # Cache consistency checks
> # Pushing the lastRev updates
> # Synchronizing the root node version
> We should capture some stats, like the time taken by the various tasks, and 
> expose them over JMX to determine whether those background operations are 
> performing well. For example, it is important that all tasks performed in the 
> background complete in under 1 sec (the default polling interval). If the 
> time taken increases, it would be a cause for concern.
> See http://markmail.org/thread/57fax4nyabbubbef
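A minimal sketch of the kind of measurement proposed here (illustrative only, 
not the actual Oak implementation or its MBean interface): time each background 
task and keep the maximum observed duration, so a JMX attribute can report 
whether tasks stay under the 1 s polling interval:

```java
// Minimal sketch (names invented): time each background task and keep
// the maximum observed duration for exposure via JMX.
public class BackgroundTaskStats {
    private long maxMillis;

    /** Run one background task and record its duration. */
    void run(Runnable task) {
        long start = System.nanoTime();
        task.run();
        long elapsed = (System.nanoTime() - start) / 1_000_000;
        maxMillis = Math.max(maxMillis, elapsed);
    }

    /** Value a JMX MBean attribute could expose. */
    long getMaxTaskTimeMillis() {
        return maxMillis;
    }

    /** True while all observed tasks fit in the 1 s polling interval. */
    boolean withinPollingInterval() {
        return maxMillis < 1000;
    }
}
```

A real version would track per-task timings and percentiles rather than a 
single maximum, but the JMX wiring stays the same: register an MBean whose 
attributes read these counters.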





[jira] [Updated] (OAK-2722) IndexCopier fails to delete older index directory upon reindex

2015-10-27 Thread Davide Giannella (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-2722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Davide Giannella updated OAK-2722:
--
Fix Version/s: (was: 1.3.9)
   1.3.10

> IndexCopier fails to delete older index directory upon reindex
> --
>
> Key: OAK-2722
> URL: https://issues.apache.org/jira/browse/OAK-2722
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: lucene
>Reporter: Chetan Mehrotra
>Assignee: Chetan Mehrotra
>Priority: Minor
>  Labels: resilience
> Fix For: 1.3.10
>
>
> {{IndexCopier}} tries to remove the older index directory in case of reindex. 
> This might fail on platforms like Windows if the files are still memory 
> mapped or are locked.
> For deleting directories we would need to take a similar approach to the one 
> used for deleting old index files, i.e. retry later.
> Due to this following test fails on Windows (Per [~julian.resc...@gmx.de] )
> {noformat}
> Tests run: 9, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 0.07 sec <<< 
> FAILURE!
> deleteOldPostReindex(org.apache.jackrabbit.oak.plugins.index.lucene.IndexCopierTest)
>   Time elapsed: 0.02 sec  <<< FAILURE!
> java.lang.AssertionError: Old index directory should have been removed
> at org.junit.Assert.fail(Assert.java:93)
> at org.junit.Assert.assertTrue(Assert.java:43)
> at org.junit.Assert.assertFalse(Assert.java:68)
> at 
> org.apache.jackrabbit.oak.plugins.index.lucene.IndexCopierTest.deleteOldPostReindex(IndexCopierTest.java:160)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (OAK-2478) Move spellcheck config to own configuration node

2015-10-27 Thread Davide Giannella (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-2478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Davide Giannella updated OAK-2478:
--
Fix Version/s: (was: 1.3.9)
   1.3.10

> Move spellcheck config to own configuration node
> 
>
> Key: OAK-2478
> URL: https://issues.apache.org/jira/browse/OAK-2478
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: lucene
>Reporter: Tommaso Teofili
>  Labels: technical_debt
> Fix For: 1.3.10
>
>
> Currently spellcheck configuration is controlled via properties defined on the 
> main config / props node, but it would be good to have a dedicated place to 
> configure the whole spellcheck feature, so as not to mix it up with the 
> configuration of other features / parameters.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (OAK-3450) Configuration to have case insensitive suggestions

2015-10-27 Thread Davide Giannella (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Davide Giannella updated OAK-3450:
--
Fix Version/s: (was: 1.3.9)
   1.3.10

> Configuration to have case insensitive suggestions
> --
>
> Key: OAK-3450
> URL: https://issues.apache.org/jira/browse/OAK-3450
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: lucene
>Reporter: Vikas Saurabh
>Assignee: Vikas Saurabh
>Priority: Minor
> Fix For: 1.3.10
>
>
> Currently suggestions follow the same case as requested in the query parameter. 
> It makes sense to allow them to be case insensitive, e.g. asking for 
> suggestions for {{cat}} should give {{Cat is an animal}} as well as 
> {{category needs to be assigned}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (OAK-2891) Use more efficient approach to manage in memory map in LengthCachingDataStore

2015-10-27 Thread Davide Giannella (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-2891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Davide Giannella updated OAK-2891:
--
Fix Version/s: (was: 1.3.9)
   1.3.10

> Use more efficient approach to manage in memory map in LengthCachingDataStore
> -
>
> Key: OAK-2891
> URL: https://issues.apache.org/jira/browse/OAK-2891
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: upgrade
>Reporter: Chetan Mehrotra
>Priority: Minor
> Fix For: 1.3.10
>
>
> LengthCachingDataStore, introduced in OAK-2882, has an in-memory map for 
> keeping the mapping between blobId and length. This poses an issue when the 
> number of binaries is very large.
> Instead of an in-memory map we should use some off-heap store like MVStore or 
> MapDB.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (OAK-3092) Cache recently extracted text to avoid duplicate extraction

2015-10-27 Thread Davide Giannella (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Davide Giannella updated OAK-3092:
--
Fix Version/s: (was: 1.3.9)
   1.3.10

> Cache recently extracted text to avoid duplicate extraction
> ---
>
> Key: OAK-3092
> URL: https://issues.apache.org/jira/browse/OAK-3092
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: lucene
>Reporter: Chetan Mehrotra
>Assignee: Chetan Mehrotra
>  Labels: performance
> Fix For: 1.3.10
>
>
> Text can be extracted from the same binary multiple times in a given indexing 
> cycle. This can happen for 2 reasons
> # Multiple Lucene indexes indexing the same node - A system might have multiple 
> Lucene indexes e.g. a global Lucene index and an index for a specific nodeType. 
> In a given indexing cycle the same file would be picked up by both index 
> definitions and both would extract the same text
> # Aggregation - With index time aggregation the same file gets picked up 
> multiple times due to aggregation rules
> To avoid the wasted effort of duplicate text extraction from the same file in a 
> given indexing cycle it would be better to have an expiring cache which can 
> hold on to extracted text content for some time. The cache should have the 
> following features
> # Limit on total size
> # Way to expire the content using [Timed 
> Eviction|https://code.google.com/p/guava-libraries/wiki/CachesExplained#Timed_Eviction]
>  - As the chances of the same file getting picked up are high only within a 
> given indexing cycle it would be better to expire the cache entries after some 
> time to avoid hogging memory unnecessarily 
> Such a cache would provide the following benefits
> # Avoid duplicate text extraction - Text extraction is costly and has to be 
> minimized on the critical path of the {{indexEditor}}
> # Avoid expensive IO, especially if binary content has to be fetched from a 
> remote {{BlobStore}}
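A minimal sketch of such a size- and time-bounded cache, using only the JDK; class and method names are illustrative, and in practice Guava's CacheBuilder (with maximumSize plus timed eviction, as linked above) would give the same behaviour more robustly.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;

// Sketch: cache extracted text keyed by blob id, bounded both by entry
// count (LRU) and by a time-to-live. Illustrative only, not Oak's code.
public class ExtractedTextCache {
    private static final class Entry {
        final String text;
        final long expiresAt;
        Entry(String text, long expiresAt) { this.text = text; this.expiresAt = expiresAt; }
    }

    private final int maxEntries;
    private final long ttlMillis;
    // access-order LinkedHashMap gives simple LRU iteration for the size bound
    private final LinkedHashMap<String, Entry> map = new LinkedHashMap<>(16, 0.75f, true);

    public ExtractedTextCache(int maxEntries, long ttlMillis) {
        this.maxEntries = maxEntries;
        this.ttlMillis = ttlMillis;
    }

    public synchronized void put(String blobId, String text) {
        evictExpired();
        map.put(blobId, new Entry(text, System.currentTimeMillis() + ttlMillis));
        // drop least-recently-used entries once over the size limit
        Iterator<String> it = map.keySet().iterator();
        while (map.size() > maxEntries && it.hasNext()) {
            it.next();
            it.remove();
        }
    }

    public synchronized String get(String blobId) {
        evictExpired();
        Entry e = map.get(blobId);
        return e == null ? null : e.text;
    }

    private void evictExpired() {
        long now = System.currentTimeMillis();
        map.values().removeIf(e -> e.expiresAt <= now);
    }
}
```

The indexing path would consult the cache before invoking Tika-style extraction, so the second index definition (or an aggregation rule) hitting the same binary reuses the already extracted text.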



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (OAK-2797) Closeable aspect of Analyzer should be accounted for

2015-10-27 Thread Davide Giannella (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-2797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Davide Giannella updated OAK-2797:
--
Fix Version/s: (was: 1.3.9)
   1.3.10

> Closeable aspect of Analyzer should be accounted for
> 
>
> Key: OAK-2797
> URL: https://issues.apache.org/jira/browse/OAK-2797
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: lucene
>Reporter: Chetan Mehrotra
>  Labels: technical_debt
> Fix For: 1.3.10
>
>
> Lucene {{Analyzer}} implements the {{Closeable}} [1] interface and internally 
> holds ThreadLocal storage of some persistent resources.
> So far in oak-lucene we do not take care of closing any analyzer; in fact we 
> use a singleton Analyzer in all cases. Opening this bug to think about this 
> aspect and see if our usage of Analyzer follows best practices.
> [1] 
> http://lucene.apache.org/core/4_7_0/core/org/apache/lucene/analysis/Analyzer.html#close%28%29
> /cc [~teofili] [~alex.parvulescu]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (OAK-3176) Provide an option to include a configured boost query while searching

2015-10-27 Thread Davide Giannella (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Davide Giannella updated OAK-3176:
--
Fix Version/s: (was: 1.3.9)
   1.3.10

> Provide an option to include a configured boost query while searching
> -
>
> Key: OAK-3176
> URL: https://issues.apache.org/jira/browse/OAK-3176
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: solr
>Reporter: Tommaso Teofili
>Assignee: Tommaso Teofili
> Fix For: 1.3.10, 1.2.8
>
>
> For tweaking relevancy it's sometimes useful to include a boost query that 
> gets applied at query time and modifies the ranking accordingly.
> This can also be done by hand, by setting it as a default parameter on the 
> /select request handler, but for convenience it would be good if the Solr 
> instance configuration files didn't have to be touched.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (OAK-1610) Improved default indexing by JCR type in SolrIndexEditor

2015-10-27 Thread Davide Giannella (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-1610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Davide Giannella updated OAK-1610:
--
Fix Version/s: (was: 1.3.9)
   1.3.10

> Improved default indexing by JCR type in SolrIndexEditor
> 
>
> Key: OAK-1610
> URL: https://issues.apache.org/jira/browse/OAK-1610
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: solr
>Reporter: Tommaso Teofili
> Fix For: 1.3.10
>
>
> It'd be good to provide a typed indexing default so that properties of a 
> certain type can be mapped to certain Solr dynamic fields with dedicated 
> types. The infrastructure for doing that is already in place as per 
> OakSolrConfiguration#getFieldNameFor(Type) but the default configuration is 
> not properly set with a good mapping.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (OAK-3499) Test failures when there is no network interface

2015-10-27 Thread Marcel Reutegger (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-3499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14976114#comment-14976114
 ] 

Marcel Reutegger commented on OAK-3499:
---

I ran CollisionTest again with ClusterNodeInfo set to DEBUG and no network 
connectivity. The log output is:

{noformat}
10:51:13.049 DEBUG [main] ClusterNodeInfo.java:822  getMachineId(): 
discovered addresses: [] []
10:51:13.143 INFO  [main] DocumentNodeStore.java:540Initialized 
DocumentNodeStore with clusterNodeId: 1 (id: 1, startTime: 1445939473064, 
machineId: random:297454f8-8691-48f8-bcf1-81831206d87d, instanceId: 
/Users/mreutegg/devel/apache/oak/trunk/oak-core, pid: 24190, uuid: 
4b364641-cbf0-4212-a701-f63c8ebd886a, readWriteMode: null, state: NONE, 
revLock: NONE, oakVersion: SNAPSHOT)
10:51:13.145 INFO  [main] DocumentNodeStore.java:540Initialized 
DocumentNodeStore with clusterNodeId: 2 (id: 2, startTime: 1445939473143, 
machineId: random:297454f8-8691-48f8-bcf1-81831206d87d, instanceId: 
/Users/mreutegg/devel/apache/oak/trunk/oak-core, pid: 24190, uuid: 
911fb138-9077-46eb-873c-f11783970c4d, readWriteMode: null, state: NONE, 
revLock: NONE, oakVersion: SNAPSHOT)
10:51:13.163 INFO  [main] DocumentNodeStore.java:552Starting disposal 
of DocumentNodeStore with clusterNodeId: 1 (id: 1, startTime: 1445939473064, 
machineId: random:297454f8-8691-48f8-bcf1-81831206d87d, instanceId: 
/Users/mreutegg/devel/apache/oak/trunk/oak-core, pid: 24190, uuid: 
4b364641-cbf0-4212-a701-f63c8ebd886a, readWriteMode: null, state: NONE, 
revLock: NONE, oakVersion: SNAPSHOT)
10:51:13.164 INFO  [main] DocumentNodeStore.java:609Disposed 
DocumentNodeStore with clusterNodeId: 1
10:51:13.165 DEBUG [main] ClusterNodeInfo.java:463  Cleaned up cluster 
node info for clusterNodeId 1 [machineId: 
random:297454f8-8691-48f8-bcf1-81831206d87d, leaseEnd: n/a]
10:51:13.165 INFO  [main] DocumentNodeStore.java:552Starting disposal 
of DocumentNodeStore with clusterNodeId: 1 (id: 1, startTime: 1445939473064, 
machineId: random:297454f8-8691-48f8-bcf1-81831206d87d, instanceId: 
/Users/mreutegg/devel/apache/oak/trunk/oak-core, pid: 24190, uuid: 
4b364641-cbf0-4212-a701-f63c8ebd886a, readWriteMode: null, state: NONE, 
revLock: NONE, oakVersion: SNAPSHOT)
10:51:13.165 INFO  [main] DocumentNodeStore.java:552Starting disposal 
of DocumentNodeStore with clusterNodeId: 2 (id: 2, startTime: 1445939473143, 
machineId: random:297454f8-8691-48f8-bcf1-81831206d87d, instanceId: 
/Users/mreutegg/devel/apache/oak/trunk/oak-core, pid: 24190, uuid: 
911fb138-9077-46eb-873c-f11783970c4d, readWriteMode: null, state: NONE, 
revLock: NONE, oakVersion: SNAPSHOT)
10:51:13.166 INFO  [main] DocumentNodeStore.java:609Disposed 
DocumentNodeStore with clusterNodeId: 2
10:51:13.167 INFO  [main] UnmergedBranches.java:92  Purged [1] 
uncommitted branch revision entries
10:51:13.167 INFO  [main] UnmergedBranches.java:96  Purged [1] 
collision markers
10:51:13.168 INFO  [main] DocumentNodeStore.java:540Initialized 
DocumentNodeStore with clusterNodeId: 1 (id: 1, startTime: 1445939473166, 
machineId: random:297454f8-8691-48f8-bcf1-81831206d87d, instanceId: 
/Users/mreutegg/devel/apache/oak/trunk/oak-core, pid: 24190, uuid: 
ec8b1c17-5be6-49ca-bc2c-5eaa34f5df5b, readWriteMode: null, state: NONE, 
revLock: NONE, oakVersion: SNAPSHOT)
10:51:13.168 INFO  [main] DocumentNodeStore.java:552Starting disposal 
of DocumentNodeStore with clusterNodeId: 1 (id: 1, startTime: 1445939473166, 
machineId: random:297454f8-8691-48f8-bcf1-81831206d87d, instanceId: 
/Users/mreutegg/devel/apache/oak/trunk/oak-core, pid: 24190, uuid: 
ec8b1c17-5be6-49ca-bc2c-5eaa34f5df5b, readWriteMode: null, state: NONE, 
revLock: NONE, oakVersion: SNAPSHOT)
10:51:13.169 INFO  [main] DocumentNodeStore.java:609Disposed 
DocumentNodeStore with clusterNodeId: 1
{noformat}

The exception from the test:
{noformat}
purge(org.apache.jackrabbit.oak.plugins.document.CollisionTest)  Time elapsed: 
0.419 sec  <<< ERROR!
org.apache.jackrabbit.oak.plugins.document.DocumentStoreException: Configured 
cluster node id 1 already in use: 
at 
org.apache.jackrabbit.oak.plugins.document.ClusterNodeInfo.createInstance(ClusterNodeInfo.java:490)
at 
org.apache.jackrabbit.oak.plugins.document.ClusterNodeInfo.getInstance(ClusterNodeInfo.java:377)
at 
org.apache.jackrabbit.oak.plugins.document.ClusterNodeInfo.getInstance(ClusterNodeInfo.java:338)
at 
org.apache.jackrabbit.oak.plugins.document.DocumentNodeStore.(DocumentNodeStore.java:431)
at 
org.apache.jackrabbit.oak.plugins.document.DocumentMK$Builder.getNodeStore(DocumentMK.java:659)
at 
org.apache.jackrabbit.oak.plugins.document.DocumentMKBuilderProvider$DisposingDocumentMKBuilder.getNodeStore(DocumentMKBuilderProvider.java:73)
at 

[jira] [Commented] (OAK-3554) Use write concern of w:majority when connected to a replica set

2015-10-27 Thread Marcel Reutegger (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-3554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14976125#comment-14976125
 ] 

Marcel Reutegger commented on OAK-3554:
---

I was thinking about this as well in context of OAK-2592, but came to the 
conclusion that it doesn't work reliably.

Let's say non-commit writes use 'ack' and only the final commit write uses 
'majority'. In case of a replica failover during the non-commit writes, some of 
those changes may be rolled back without Oak noticing it. The commit would 
continue and happily update the commit root with 'majority' on the new primary. 
The transaction will be incomplete.

> Use write concern of w:majority when connected to a replica set
> ---
>
> Key: OAK-3554
> URL: https://issues.apache.org/jira/browse/OAK-3554
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: mongomk
>Reporter: Chetan Mehrotra
> Fix For: 1.3.10
>
>
> Currently, while connecting to Mongo, MongoDocumentStore relies on the default 
> write concern provided as part of the mongouri. 
> Recently some issues were seen where a Mongo-based Oak was connecting to a 
> 3-member replica set and there were frequent replica state changes due to the 
> use of VMs for Mongo. This caused data loss and corruption of data in Oak.
> To avoid such situations Oak should default to a write concern of majority. If 
> some write concern is specified as part of the mongouri then that should take 
> precedence. This would allow the system admin to decide on tweaking the write 
> concern if required, and at the same time allows Oak to use the safe write 
> concern.
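The proposed precedence rule - an explicit setting in the mongouri wins, otherwise fall back to majority - can be sketched as below. The parsing is purely illustrative (a hypothetical helper, not the MongoDB driver's own URI handling, which would normally be used via MongoClientURI).

```java
// Sketch of the proposed default: honour an explicit "w" option in the
// mongouri if present, otherwise default to w:majority for replica sets.
public class WriteConcernDefaulting {
    public static String effectiveWriteConcern(String mongoUri) {
        int q = mongoUri.indexOf('?');
        if (q >= 0) {
            for (String param : mongoUri.substring(q + 1).split("&")) {
                if (param.startsWith("w=")) {
                    return param.substring(2); // explicit setting takes precedence
                }
            }
        }
        return "majority"; // safe default when nothing is specified
    }
}
```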



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (OAK-2719) Warn about local copy size different than remote copy in oak-lucene with copyOnRead enabled

2015-10-27 Thread Davide Giannella (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-2719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Davide Giannella updated OAK-2719:
--
Fix Version/s: (was: 1.3.9)
   1.3.10

> Warn about local copy size different than remote copy in oak-lucene with 
> copyOnRead enabled
> ---
>
> Key: OAK-2719
> URL: https://issues.apache.org/jira/browse/OAK-2719
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: lucene
>Reporter: Chetan Mehrotra
>Assignee: Chetan Mehrotra
>Priority: Minor
>  Labels: resilience
> Fix For: 1.3.10
>
>
> At times following warning is seen in logs
> {noformat}
> 31.03.2015 14:04:57.610 *WARN* [pool-6-thread-7] 
> org.apache.jackrabbit.oak.plugins.index.lucene.IndexCopier Found local copy 
> for _0.cfs in 
> NIOFSDirectory@/path/to/index/e5a943cdec3000bd8ce54924fd2070ab5d1d35b9ecf530963a3583d43bf28293/1
>  
> lockFactory=NativeFSLockFactory@/path/to/index/e5a943cdec3000bd8ce54924fd2070ab5d1d35b9ecf530963a3583d43bf28293/1
>  but size of local 1040384 differs from remote 1958385. Content would be read 
> from remote file only
> {noformat}
> The file length check provides only a weak consistency check for index files. 
> In some cases this warning is misleading, e.g.
> # Index version Rev1 - Task submitted to copy index file F1 
> # Index updated to Rev2 - Directory bound to Rev1 is closed
> # Read is performed with Rev2 for F1 - As the file has been created locally 
> but copying is still in progress, the local size differs from the remote one
> In such a case the logic should ensure that once the copy is done the local 
> file gets used.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (OAK-3151) Lucene Version should be based on IndexFormatVersion

2015-10-27 Thread Davide Giannella (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Davide Giannella updated OAK-3151:
--
Fix Version/s: (was: 1.3.9)
   1.3.10

> Lucene Version should be based on IndexFormatVersion
> 
>
> Key: OAK-3151
> URL: https://issues.apache.org/jira/browse/OAK-3151
> Project: Jackrabbit Oak
>  Issue Type: Sub-task
>  Components: lucene
>Reporter: Chetan Mehrotra
>Assignee: Chetan Mehrotra
>  Labels: technical_debt
> Fix For: 1.3.10
>
>
> Currently in oak-lucene, wherever a call is made to Lucene it passes the 
> hardcoded version Version.LUCENE_47. To enable easier upgrades of Lucene, and 
> hence a change of defaults for fresh setups, this version should instead be 
> based on {{IndexFormatVersion}}.
> Say
> * For IndexFormatVersion set to V2 (current default) - Lucene version used is 
> LUCENE_47
> * For IndexFormatVersion set to V3 (proposed) - Lucene version used would be 
> per the Lucene library version
> If the index is reindexed then it would automatically be updated to the 
> latest version.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (OAK-2835) TARMK Cold Standby inefficient cleanup

2015-10-27 Thread Davide Giannella (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-2835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Davide Giannella updated OAK-2835:
--
Fix Version/s: (was: 1.3.9)
   1.3.10

> TARMK Cold Standby inefficient cleanup
> --
>
> Key: OAK-2835
> URL: https://issues.apache.org/jira/browse/OAK-2835
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: segmentmk, tarmk-standby
>Reporter: Alex Parvulescu
>Assignee: Alex Parvulescu
>Priority: Critical
>  Labels: compaction, gc, production, resilience
> Fix For: 1.3.10
>
> Attachments: OAK-2835.patch
>
>
> Following OAK-2817, it turns out that patching the data corruption issue 
> revealed an inefficiency in the cleanup method. Similar to the online 
> compaction situation, the standby has issues clearing some of the in-memory 
> references to old revisions.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (OAK-2556) do intermediate commit during async indexing

2015-10-27 Thread Davide Giannella (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-2556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Davide Giannella updated OAK-2556:
--
Fix Version/s: (was: 1.3.9)
   1.3.10

> do intermediate commit during async indexing
> 
>
> Key: OAK-2556
> URL: https://issues.apache.org/jira/browse/OAK-2556
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: lucene
>Affects Versions: 1.0.11
>Reporter: Stefan Egli
>  Labels: resilience
> Fix For: 1.3.10
>
>
> A recent problem found at a customer reveals a potential issue with the async 
> indexer. Reading AsyncIndexUpdate.updateIndex, it looks like it is doing the 
> entire update of the async indexer *in one go*, i.e. in one commit.
> When there is - for some reason - a huge diff that the async indexer has to 
> process, the 'one big commit' can become gigantic. In fact there is no limit 
> to the size of the commit.
> So the suggestion is to do intermediate commits while the async indexing is 
> going on. This is acceptable because with async indexing the index is anyway 
> not 100% up-to-date - so it would not make much of a difference if it 
> committed after every 100 or 1000 changes.
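The batching idea can be sketched as follows; the Consumer stands in for the real commit call and all names are hypothetical, not the actual AsyncIndexUpdate code.

```java
import java.util.List;
import java.util.function.Consumer;

// Sketch: apply a large change set in fixed-size batches, committing after
// each batch instead of once at the end. Names are illustrative only.
public class BatchedIndexUpdate {
    public static int applyInBatches(List<String> changes, int batchSize,
                                     Consumer<List<String>> commit) {
        int commits = 0;
        for (int i = 0; i < changes.size(); i += batchSize) {
            List<String> batch =
                changes.subList(i, Math.min(i + batchSize, changes.size()));
            commit.accept(batch); // intermediate commit
            commits++;
        }
        return commits;
    }
}
```

With a batch size of, say, 1000 the worst case after a crash is re-indexing one partial batch, rather than redoing a single gigantic commit.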



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (OAK-2911) Analyze inter package dependency in oak-core

2015-10-27 Thread Davide Giannella (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-2911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Davide Giannella updated OAK-2911:
--
Fix Version/s: (was: 1.3.9)
   1.3.10

> Analyze inter package dependency in oak-core
> 
>
> Key: OAK-2911
> URL: https://issues.apache.org/jira/browse/OAK-2911
> Project: Jackrabbit Oak
>  Issue Type: Task
>  Components: core
>Reporter: Chetan Mehrotra
>Assignee: Chetan Mehrotra
>Priority: Minor
>  Labels: modularization, technical_debt
> Fix For: 1.3.10
>
> Attachments: oak-core-jdepend-report.html
>
>
> For better code health the packages should have proper inter-package 
> dependencies. It's preferable that the various {{plugin}} packages within 
> oak-core have minimal inter-dependencies and are able to exist independently.
> The following work needs to be performed
> # Check what the current state is
> # Look into ways to ensure that such dependencies are minimal and, at a 
> minimum, have no cycles
> See 
> * 
> http://stackoverflow.com/questions/3416547/maven-jdepend-fail-build-with-cycles
> * https://github.com/andrena/no-package-cycles-enforcer-rule



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (OAK-937) Query engine index selection tweaks: shortcut and hint

2015-10-27 Thread Davide Giannella (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Davide Giannella updated OAK-937:
-
Fix Version/s: (was: 1.3.9)
   1.3.10

> Query engine index selection tweaks: shortcut and hint
> --
>
> Key: OAK-937
> URL: https://issues.apache.org/jira/browse/OAK-937
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: core, query
>Reporter: Alex Parvulescu
>Priority: Minor
>  Labels: performance
> Fix For: 1.3.10
>
>
> This issue covers 2 different changes related to the way the QueryEngine 
> selects a query index:
>  Firstly there could be a way to end the index selection process early via a 
> known constant value: if an index returns a known value token (like -1000) 
> then the query engine would effectively stop iterating through the existing 
> index impls and use that index directly.
>  Secondly it would be nice to be able to specify a desired index (if one is 
> known to perform better) thus skipping the existing selection mechanism (cost 
> calculation and comparison). This could be done via certain query hints [0].
> [0] http://en.wikipedia.org/wiki/Hint_(SQL)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (OAK-3286) Persistent Cache improvements

2015-10-27 Thread Davide Giannella (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Davide Giannella updated OAK-3286:
--
Fix Version/s: (was: 1.3.9)
   1.3.10

> Persistent Cache improvements
> -
>
> Key: OAK-3286
> URL: https://issues.apache.org/jira/browse/OAK-3286
> Project: Jackrabbit Oak
>  Issue Type: Epic
>  Components: cache
>Reporter: Michael Marth
>Priority: Minor
> Fix For: 1.3.10
>
>
> Issue for collecting various improvements to the persistent cache



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (OAK-2847) Dependency cleanup

2015-10-27 Thread Davide Giannella (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-2847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Davide Giannella updated OAK-2847:
--
Fix Version/s: (was: 1.3.9)
   1.3.10

> Dependency cleanup 
> ---
>
> Key: OAK-2847
> URL: https://issues.apache.org/jira/browse/OAK-2847
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>Reporter: Michael Dürig
>Assignee: Vikas Saurabh
>  Labels: technical_debt
> Fix For: 1.3.10
>
>
> Early in the next release cycle we should go through the list of Oak's 
> dependencies and decide whether we have candidates we want to upgrade and 
> remove orphaned dependencies. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (OAK-3071) Add a compound index for _modified + _id

2015-10-27 Thread Davide Giannella (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Davide Giannella updated OAK-3071:
--
Fix Version/s: (was: 1.3.9)
   1.3.10

> Add a compound index for _modified + _id
> 
>
> Key: OAK-3071
> URL: https://issues.apache.org/jira/browse/OAK-3071
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: mongomk
>Reporter: Chetan Mehrotra
>Assignee: Chetan Mehrotra
>  Labels: performance, resilience
> Fix For: 1.3.10
>
>
> As explained in OAK-1966 diff logic makes a call like
> bq. db.nodes.find({ _id: { $gt: "3:/content/foo/01/", $lt: 
> "3:/content/foo010" }, _modified: { $gte: 1405085300 } }).sort({_id:1})
> For better and deterministic query performance we would need to create a 
> compound index like \{_modified:1, _id:1\}. This index would ensure that 
> Mongo does not have to perform an object scan while evaluating such a query.
> Care must be taken that the index is only created by default for fresh setups. 
> For existing setups we should expose a JMX operation which can be invoked by a 
> system admin to create the required index during a maintenance window.
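From the mongo shell the proposed compound index could be created along these lines (an illustrative fragment; the collection name `nodes` follows the query shown above, and building in the background avoids blocking a live instance):

```javascript
// Create the proposed compound index without blocking other operations
db.nodes.createIndex({ _modified: 1, _id: 1 }, { background: true });
```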



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (OAK-3319) Disabling IndexRule inheritence is not working in all cases

2015-10-27 Thread Davide Giannella (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Davide Giannella updated OAK-3319:
--
Fix Version/s: (was: 1.3.9)
   1.3.10

> Disabling IndexRule inheritence is not working in all cases
> ---
>
> Key: OAK-3319
> URL: https://issues.apache.org/jira/browse/OAK-3319
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: lucene
>Reporter: Chetan Mehrotra
>Assignee: Chetan Mehrotra
>Priority: Minor
> Fix For: 1.3.10
>
>
> IndexRules are inherited by default, i.e. a rule defined for nt:hierarchyNode 
> is also applicable to nt:folder (nt:folder extends nt:hierarchyNode). Lucene 
> indexing supports an {{inherited}} flag (defaults to true). If this is set to 
> false then inheritance is disabled.
> As per the current implementation, disabling works fine on the indexing side: 
> a node which does not have an explicit indexRule defined is not indexed. 
> However the same is not working on the query side, i.e. IndexPlanner would 
> still opt in for a given query, ignoring the fact that inheritance is disabled.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (OAK-3185) Port and refactor jackrabbit-webapp module to Oak

2015-10-27 Thread Davide Giannella (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Davide Giannella updated OAK-3185:
--
Fix Version/s: (was: 1.3.9)
   1.3.10

> Port and refactor jackrabbit-webapp module to Oak 
> --
>
> Key: OAK-3185
> URL: https://issues.apache.org/jira/browse/OAK-3185
> Project: Jackrabbit Oak
>  Issue Type: New Feature
>  Components: webapp
>Reporter: Chetan Mehrotra
>Assignee: Chetan Mehrotra
> Fix For: 1.3.10
>
>
> As mentioned at [1] we should port the jackrabbit-webapp [2] module to Oak 
> and refactor it to run the complete Oak stack. The purpose of this module 
> would be to demonstrate
> # How to embed Oak in standalone web applications which are not based on OSGi
> # How to configure various aspects of Oak via config
> h3. Proposed Approach
> # Copy jackrabbit-webapp to the Oak repo under oak-webapp
> # Refactor the repository initialization logic to use Oak PojoSR to configure 
> the Repository [3]
> # Bonus: configure the Felix WebConsole to let users see which OSGi services 
> are exposed and which config options are supported
> This would also enable us to document which third-party dependencies are 
> required for getting Oak to work in such environments
> [1] 
> http://mail-archives.apache.org/mod_mbox/jackrabbit-oak-dev/201508.mbox/%3CCAHCW-mkbpS6qSkgFe1h1anFcD-dYWFrcr9xBWx9dpKaxr91Q3Q%40mail.gmail.com%3E
> [2] 
> https://jackrabbit.apache.org/jcr/components/jackrabbit-web-application.html
> [3] https://github.com/apache/jackrabbit-oak/tree/trunk/oak-pojosr



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (OAK-3362) Estimate compaction based on diff to previous compacted head state

2015-10-27 Thread Davide Giannella (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Davide Giannella updated OAK-3362:
--
Fix Version/s: (was: 1.3.9)
   1.3.10

> Estimate compaction based on diff to previous compacted head state
> --
>
> Key: OAK-3362
> URL: https://issues.apache.org/jira/browse/OAK-3362
> Project: Jackrabbit Oak
>  Issue Type: New Feature
>  Components: segmentmk
>Reporter: Alex Parvulescu
>Assignee: Alex Parvulescu
>Priority: Minor
>  Labels: compaction, gc
> Fix For: 1.3.10
>
>
> Food for thought: try to base the compaction estimation on a diff between the 
> latest compacted state and the current state.
> Pros
> * estimation duration would be proportional to number of changes on the 
> current head state
> * using the size on disk as a reference, we could actually stop the 
> estimation early when we go over the gc threshold.
> * data collected during this diff could in theory be passed as input to the 
> compactor so it could focus on compacting a specific subtree
> Cons
> * need to keep a reference to a previous compacted state. post-startup and 
> pre-compaction this might prove difficult (except maybe if we only persist 
> the revision similar to what the async indexer is doing currently)
> * coming up with a threshold for running compaction might prove difficult
> * diff might be costly, but still cheaper than the current full diff



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (OAK-2910) oak-jcr bundle should be usable as a standalone bundle

2015-10-27 Thread Davide Giannella (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-2910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Davide Giannella updated OAK-2910:
--
Fix Version/s: (was: 1.3.9)
   1.3.10

> oak-jcr bundle should be usable as a standalone bundle
> --
>
> Key: OAK-2910
> URL: https://issues.apache.org/jira/browse/OAK-2910
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: jcr
>Reporter: Chetan Mehrotra
>Assignee: Chetan Mehrotra
>  Labels: modularization, osgi, technical_debt
> Fix For: 1.3.10
>
>
> Currently the oak-jcr bundle needs to be embedded within some other bundle 
> if Oak is to be properly configured in an OSGi environment. We need to 
> revisit this aspect and see what needs to be done to enable Oak to be 
> properly configured without requiring the oak-jcr bundle to be embedded in 
> the repo.





[jira] [Updated] (OAK-3090) Caching BlobStore implementation

2015-10-27 Thread Davide Giannella (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Davide Giannella updated OAK-3090:
--
Fix Version/s: (was: 1.3.9)
   1.3.10

> Caching BlobStore implementation 
> -
>
> Key: OAK-3090
> URL: https://issues.apache.org/jira/browse/OAK-3090
> Project: Jackrabbit Oak
>  Issue Type: New Feature
>  Components: blob
>Reporter: Chetan Mehrotra
>  Labels: performance, resilience
> Fix For: 1.3.10
>
>
> Storing binaries in Mongo puts a lot of read pressure on MongoDB. To reduce 
> the read load it would be useful to have a filesystem-based cache of 
> frequently used binaries. 
> This would be similar to CachingFDS (OAK-3005) but would be implemented on 
> top of the BlobStore API. 
> Requirements
> * Specify the max binary size which can be cached on file system
> * Limit the size of all binary content present in the cache
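The two requirements above can be illustrated with a minimal size-bounded cache sketch. This is plain Python with invented names, not Oak code; the real implementation would sit on top of the BlobStore API and cache on the filesystem rather than in memory:

```python
from collections import OrderedDict

class BoundedBlobCache:
    """Cache binaries up to max_entry_size bytes each, evicting the
    least-recently-used entries once the total cached size exceeds
    max_total_size (the two limits named in the requirements)."""

    def __init__(self, max_entry_size, max_total_size):
        self.max_entry_size = max_entry_size
        self.max_total_size = max_total_size
        self.total = 0
        self.cache = OrderedDict()  # blob_id -> bytes, in LRU order

    def put(self, blob_id, data):
        if len(data) > self.max_entry_size:
            return False  # too large to cache; serve from the backend
        if blob_id in self.cache:
            self.total -= len(self.cache[blob_id])  # replacing an entry
        self.cache[blob_id] = data
        self.cache.move_to_end(blob_id)
        self.total += len(data)
        while self.total > self.max_total_size:
            _, evicted = self.cache.popitem(last=False)  # drop LRU entry
            self.total -= len(evicted)
        return True

    def get(self, blob_id):
        if blob_id in self.cache:
            self.cache.move_to_end(blob_id)  # mark as recently used
            return self.cache[blob_id]
        return None  # caller falls back to SAN/NAS or Mongo
```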





[jira] [Updated] (OAK-318) Excerpt support

2015-10-27 Thread Davide Giannella (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Davide Giannella updated OAK-318:
-
Fix Version/s: (was: 1.3.9)
   1.3.10

> Excerpt support
> ---
>
> Key: OAK-318
> URL: https://issues.apache.org/jira/browse/OAK-318
> Project: Jackrabbit Oak
>  Issue Type: Sub-task
>  Components: core, query
>Reporter: Alex Parvulescu
> Fix For: 1.3.10
>
>
> Test class: ExcerptTest.
> Right now I only see parse errors:
> Caused by: java.text.ParseException: Query:
> {noformat}
> testroot/*[jcr:contains(., 'jackrabbit')]/rep:excerpt((*).); expected: 
> {noformat}





[jira] [Updated] (OAK-3532) create RDB export tool

2015-10-27 Thread Davide Giannella (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Davide Giannella updated OAK-3532:
--
Fix Version/s: (was: 1.3.9)
   1.3.10

> create RDB export tool
> --
>
> Key: OAK-3532
> URL: https://issues.apache.org/jira/browse/OAK-3532
> Project: Jackrabbit Oak
>  Issue Type: Technical task
>  Components: core
>Reporter: Julian Reschke
>Assignee: Julian Reschke
> Fix For: 1.3.10
>
>
> Create a super-simple utility that can export JSON representations of 
> selected rows.





[jira] [Updated] (OAK-1828) Improved SegmentWriter

2015-10-27 Thread Davide Giannella (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-1828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Davide Giannella updated OAK-1828:
--
Fix Version/s: (was: 1.3.9)
   1.3.10

> Improved SegmentWriter
> --
>
> Key: OAK-1828
> URL: https://issues.apache.org/jira/browse/OAK-1828
> Project: Jackrabbit Oak
>  Issue Type: Sub-task
>  Components: segmentmk
>Reporter: Jukka Zitting
>Assignee: Alex Parvulescu
>Priority: Minor
>  Labels: technical_debt
> Fix For: 1.3.10
>
>
> At about 1kLOC and dozens of methods, the SegmentWriter class is currently a 
> bit too complex for one of the key components of the TarMK. It also uses a 
> somewhat non-obvious mix of synchronized and unsynchronized code to 
> coordinate multiple concurrent threads that may be writing content at the 
> same time. The synchronization blocks are also broader than what really would 
> be needed, which in some cases causes unnecessary lock contention in 
> concurrent write loads.
> To improve the readability and maintainability of the code, and to increase 
> performance of concurrent writes, it would be useful to split part of the 
> SegmentWriter functionality to a separate RecordWriter class that would be 
> responsible for writing individual records into a segment. The 
> SegmentWriter.prepare() method would return a new RecordWriter instance, and 
> the higher-level SegmentWriter methods would use the returned instance for 
> all the work that's currently guarded in synchronization blocks.
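The proposed split might look roughly like this. It is a hedged Python sketch of the locking structure only; the class and method names follow the proposal above, everything else is invented and not the actual TarMK code:

```python
import threading

class RecordWriter:
    """Writes individual records into one segment. One instance is handed
    out per prepare() call, so its lock guards only this segment's buffer
    instead of the whole SegmentWriter."""

    def __init__(self, segment_id):
        self.segment_id = segment_id
        self.records = []
        self.lock = threading.Lock()

    def write_record(self, record):
        with self.lock:  # narrow critical section: one segment only
            self.records.append(record)
            return self.segment_id, len(self.records) - 1

class SegmentWriter:
    def __init__(self):
        self.counter = 0
        self.lock = threading.Lock()

    def prepare(self):
        # Hand out a fresh RecordWriter instead of funnelling every record
        # write through one broad synchronized block.
        with self.lock:
            self.counter += 1
            return RecordWriter(segment_id=self.counter)
```

Concurrent writers then contend only on the short prepare() call, not on every record write.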





[jira] [Updated] (OAK-2618) Improve performance of queries with ORDER BY and multiple OR filters

2015-10-27 Thread Davide Giannella (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-2618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Davide Giannella updated OAK-2618:
--
Fix Version/s: (was: 1.3.9)
   1.3.10

> Improve performance of queries with ORDER BY and multiple OR filters
> 
>
> Key: OAK-2618
> URL: https://issues.apache.org/jira/browse/OAK-2618
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: query
>Reporter: Amit Jain
>Assignee: Amit Jain
>  Labels: performance
> Fix For: 1.3.10
>
>
> When multiple OR constraints are specified in an XPath query, it is broken up 
> into a union of multiple clauses. If the query includes an order by clause, 
> the sorting in this case is done by traversing the result set in memory, 
> leading to slow query performance.
> Possible improvements could include:
> * For indexes which can support multiple filters (like lucene, solr) such 
> queries should be efficient and the query engine can pass-thru the query as 
> is.
> ** Possibly needed for other cases as well. So, we could have some sort of 
> capability advertiser for indexes which can hint the query engine 
> and/or
> * Batched merging of the sorted iterators returned for the multiple union 
> queries (possible externally).
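The second improvement, merging the sorted iterators of the union sub-queries, could in principle be done lazily with a k-way merge. An illustrative Python sketch (not Oak code); each sub-query is assumed to already yield rows sorted by the ORDER BY property:

```python
import heapq

def merge_sorted_results(*iterators, key=None):
    """Lazily merge already-sorted result iterators into one sorted stream,
    keeping only one pending row per iterator in memory at a time instead
    of materializing the whole result set."""
    return heapq.merge(*iterators, key=key)

# Each union clause yields rows sorted by the ORDER BY property:
clause_a = iter([1, 4, 7])
clause_b = iter([2, 3, 8])
merged = list(merge_sorted_results(clause_a, clause_b))  # [1, 2, 3, 4, 7, 8]
```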





[jira] [Updated] (OAK-3368) Speed up ExternalPrivateStoreIT and ExternalSharedStoreIT

2015-10-27 Thread Davide Giannella (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Davide Giannella updated OAK-3368:
--
Fix Version/s: (was: 1.3.9)
   1.3.10

> Speed up ExternalPrivateStoreIT and ExternalSharedStoreIT
> -
>
> Key: OAK-3368
> URL: https://issues.apache.org/jira/browse/OAK-3368
> Project: Jackrabbit Oak
>  Issue Type: Sub-task
>  Components: tarmk-standby
>Reporter: Marcel Reutegger
>Assignee: Manfred Baedke
> Fix For: 1.3.10
>
>
> Both tests run for more than 5 minutes. Most of the time the tests are 
> somehow stuck in shutting down the server.





[jira] [Updated] (OAK-3215) Solr test often fail with No such core: oak

2015-10-27 Thread Davide Giannella (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Davide Giannella updated OAK-3215:
--
Fix Version/s: (was: 1.3.9)
   1.3.10

> Solr test often fail with  No such core: oak
> 
>
> Key: OAK-3215
> URL: https://issues.apache.org/jira/browse/OAK-3215
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: solr
>Reporter: Chetan Mehrotra
>Assignee: Tommaso Teofili
>Priority: Minor
>  Labels: CI
> Fix For: 1.3.10
>
>
> Often all tests from the oak-solr module fail, and in each such failure the 
> following error is reported: 
> {noformat}
> org.apache.solr.common.SolrException: No such core: oak
>   at 
> org.apache.solr.client.solrj.embedded.EmbeddedSolrServer.request(EmbeddedSolrServer.java:112)
>   at 
> org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:118)
>   at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:116)
>   at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:102)
>   at 
> org.apache.jackrabbit.oak.plugins.index.solr.query.SolrQueryIndexTest.testQueryOnIgnoredExistingProperty(SolrQueryIndexTest.java:330)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> {noformat}
> Most recent failure in 
> https://builds.apache.org/job/Apache%20Jackrabbit%20Oak%20matrix/325/





[jira] [Updated] (OAK-3193) Integrate with Felix WebConsole

2015-10-27 Thread Davide Giannella (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Davide Giannella updated OAK-3193:
--
Fix Version/s: (was: 1.3.9)
   1.3.10

> Integrate with Felix WebConsole
> ---
>
> Key: OAK-3193
> URL: https://issues.apache.org/jira/browse/OAK-3193
> Project: Jackrabbit Oak
>  Issue Type: Sub-task
>  Components: webapp
>Reporter: Chetan Mehrotra
>Assignee: Chetan Mehrotra
> Fix For: 1.3.10
>
>
> To allow better debugging of the repository setup, it would be useful if the 
> Felix WebConsole were configured with the webapp. This would allow easier 
> access to the OSGi runtime state.





[jira] [Updated] (OAK-3509) Lucene suggestion results should have 1 row per suggestion with appropriate column names

2015-10-27 Thread Davide Giannella (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Davide Giannella updated OAK-3509:
--
Fix Version/s: (was: 1.3.9)
   1.3.10

> Lucene suggestion results should have 1 row per suggestion with appropriate 
> column names
> 
>
> Key: OAK-3509
> URL: https://issues.apache.org/jira/browse/OAK-3509
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: lucene
>Reporter: Vikas Saurabh
>Assignee: Tommaso Teofili
>Priority: Minor
> Fix For: 1.3.10
>
>
> Currently the suggest query returns just one row, with the {{rep:suggest()}} 
> column containing a string that needs to be parsed.
> It'd be better if each suggestion were returned as an individual row with 
> column names such as {{suggestion}}, {{weight}}(???), etc.
> (cc [~teofili])





[jira] [Updated] (OAK-3159) Extend documentation for SegmentNodeStoreService in http://jackrabbit.apache.org/oak/docs/osgi_config.html#SegmentNodeStore

2015-10-27 Thread Davide Giannella (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Davide Giannella updated OAK-3159:
--
Fix Version/s: (was: 1.3.9)
   1.3.10

> Extend documentation for SegmentNodeStoreService in 
> http://jackrabbit.apache.org/oak/docs/osgi_config.html#SegmentNodeStore
> ---
>
> Key: OAK-3159
> URL: https://issues.apache.org/jira/browse/OAK-3159
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: doc
>Reporter: Konrad Windszus
> Fix For: 1.3.10
>
>
> Currently the documentation at 
> http://jackrabbit.apache.org/oak/docs/osgi_config.html#SegmentNodeStore only 
> documents the properties
> # repository.home and
> # tarmk.size
> All the other properties, like customBlobStore and tarmk.mode, are not 
> documented. Please extend that. It would also be good if the table could be 
> extended with the type supported for each individual property.





[jira] [Updated] (OAK-3303) FileStore flush thread can get stuck

2015-10-27 Thread Davide Giannella (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Davide Giannella updated OAK-3303:
--
Fix Version/s: (was: 1.3.9)
   1.3.10

> FileStore flush thread can get stuck
> 
>
> Key: OAK-3303
> URL: https://issues.apache.org/jira/browse/OAK-3303
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: segmentmk
>Reporter: Alex Parvulescu
>Assignee: Alex Parvulescu
> Fix For: 1.3.10
>
>
> In some very rare circumstances the flush thread was seen as possibly stuck 
> for a while following a restart of the system. This results in data loss on 
> restart (the system will roll back to the latest persisted revision), and 
> worse, there is no way of extracting the latest head revision from the tar 
> files, so recovery is not (yet) possible.





[jira] [Updated] (OAK-2629) Cleanup Oak Travis jobs

2015-10-27 Thread Davide Giannella (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-2629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Davide Giannella updated OAK-2629:
--
Fix Version/s: (was: 1.3.9)
   1.3.10

> Cleanup Oak Travis jobs
> ---
>
> Key: OAK-2629
> URL: https://issues.apache.org/jira/browse/OAK-2629
> Project: Jackrabbit Oak
>  Issue Type: Sub-task
>  Components: it
>Reporter: Tommaso Teofili
>Assignee: Tommaso Teofili
>  Labels: CI
> Fix For: 1.3.10
>
>
> Since we're moving toward Jenkins, let's remove the Travis jobs for Oak. 





[jira] [Updated] (OAK-2675) Include change type information in perf logs for diff logic

2015-10-27 Thread Davide Giannella (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-2675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Davide Giannella updated OAK-2675:
--
Fix Version/s: (was: 1.3.9)
   1.3.10

> Include change type information in perf logs for diff logic
> ---
>
> Key: OAK-2675
> URL: https://issues.apache.org/jira/browse/OAK-2675
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: core
>Reporter: Chetan Mehrotra
>Priority: Minor
>  Labels: observation, performance, resilience, tooling
> Fix For: 1.3.10
>
>
> Currently the diff perf logs in {{NodeObserver}} do not indicate what type 
> of change was processed, i.e. whether the change was an internal one or an 
> external one. 
> Having this information would allow us to determine how the cache is being 
> used. For example, if we see slower numbers even for local changes, that 
> would indicate that there is some issue with the diff cache and it is not 
> being utilized effectively.





[jira] [Updated] (OAK-3253) Support caching in FileDataStoreService

2015-10-27 Thread Davide Giannella (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Davide Giannella updated OAK-3253:
--
Fix Version/s: (was: 1.3.9)
   1.3.10

> Support caching in FileDataStoreService
> ---
>
> Key: OAK-3253
> URL: https://issues.apache.org/jira/browse/OAK-3253
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: blob
>Affects Versions: 1.3.3
>Reporter: Shashank Gupta
>Assignee: Shashank Gupta
>  Labels: candidate_oak_1_0, candidate_oak_1_2, docs-impacting, 
> features, performance
> Fix For: 1.3.10
>
>
> FDS on SAN/NAS storage is not efficient, as every access involves a network 
> call. In Oak, indexes are stored on SAN/NAS, and even an idle system does a 
> lot of reads of system-generated data. 
> Enable caching in FDS so that reads are served locally, with asynchronous 
> uploads to SAN/NAS.
> See [previous 
> discussions|https://issues.apache.org/jira/browse/OAK-3005?focusedCommentId=14700801=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14700801]





[jira] [Updated] (OAK-3406) Configuration to rank exact match suggestions over partial match suggestions

2015-10-27 Thread Davide Giannella (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Davide Giannella updated OAK-3406:
--
Fix Version/s: (was: 1.3.9)
   1.3.10

> Configuration to rank exact match suggestions over partial match suggestions
> 
>
> Key: OAK-3406
> URL: https://issues.apache.org/jira/browse/OAK-3406
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: lucene
>Reporter: Vikas Saurabh
>Assignee: Vikas Saurabh
>Priority: Minor
> Fix For: 1.3.10
>
>
> Currently, a suggestion query ranks the results according to popularity. But, 
> at times, it's desirable to have suggested phrases based on exact matches 
> ranked above a more popular suggestion based on a partial match. E.g. a 
> repository might have 1000 docs with {{windows is a very popular OS}} and 
> say 4 with {{win over them}} - it's a useful case to configure suggestions 
> such that for a suggestion query for {{win}}, we'd get {{win over them}} as 
> a higher-ranked suggestion.
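The desired ranking can be sketched as follows. This is an illustrative Python sketch, not the Lucene suggester API; the function name and the (phrase, popularity) shape are invented. A phrase counts as an exact match when one of its words equals the query, and as a partial match when a word merely starts with it:

```python
def rank_suggestions(query, candidates):
    """Rank (phrase, popularity) candidates: exact word matches of the
    query come first, then partial (prefix) matches, each group ordered
    by descending popularity."""
    q = query.lower()

    def sort_key(item):
        phrase, popularity = item
        exact = q in phrase.lower().split()  # a whole word equals the query
        return (not exact, -popularity)      # exact group first, then popular

    return [phrase for phrase, _ in sorted(candidates, key=sort_key)]
```

With the example from the issue, "win over them" (4 docs, exact match on "win") would rank above "windows is a very popular OS" (1000 docs, partial match).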





[jira] [Updated] (OAK-1695) Document Solr index

2015-10-27 Thread Davide Giannella (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-1695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Davide Giannella updated OAK-1695:
--
Fix Version/s: (was: 1.3.9)
   1.3.10

> Document Solr index
> ---
>
> Key: OAK-1695
> URL: https://issues.apache.org/jira/browse/OAK-1695
> Project: Jackrabbit Oak
>  Issue Type: Task
>  Components: doc, solr
>Reporter: Tommaso Teofili
>Assignee: Tommaso Teofili
>  Labels: documentation
> Fix For: 1.3.10
>
>
> Provide documentation about the Oak Solr index. That should contain 
> information about the design and how to configure it.





[jira] [Comment Edited] (OAK-3532) create RDB export tool

2015-10-27 Thread Julian Reschke (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-3532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14976326#comment-14976326
 ] 

Julian Reschke edited comment on OAK-3532 at 10/27/15 12:58 PM:


trunk: several commits
1.2: http://svn.apache.org/r1710798
1.0: http://svn.apache.org/r1710804


was (Author: reschke):
trunk: several commits
1.2: svn.apache.org/r1710798
1.0: svn.apache.org/r1710804

> create RDB export tool
> --
>
> Key: OAK-3532
> URL: https://issues.apache.org/jira/browse/OAK-3532
> Project: Jackrabbit Oak
>  Issue Type: Technical task
>  Components: core
>Affects Versions: 1.3.8, 1.2.7, 1.0.22
>Reporter: Julian Reschke
>Assignee: Julian Reschke
> Fix For: 1.3.9, 1.0.23, 1.2.8
>
>
> Create a super-simple utility that can export JSON representations of 
> selected rows.





[jira] [Resolved] (OAK-3494) MemoryDiffCache should also check parent paths before falling to Loader (or returning null)

2015-10-27 Thread Marcel Reutegger (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcel Reutegger resolved OAK-3494.
---
Resolution: Fixed

Applied Vikas' patch: http://svn.apache.org/r1710800

> MemoryDiffCache should also check parent paths before falling to Loader (or 
> returning null)
> ---
>
> Key: OAK-3494
> URL: https://issues.apache.org/jira/browse/OAK-3494
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: core, mongomk
>Reporter: Vikas Saurabh
>Assignee: Marcel Reutegger
>  Labels: performance
> Fix For: 1.3.10
>
> Attachments: OAK-3494-1.patch, OAK-3494-2.patch, 
> OAK-3494-TestCase.patch, OAK-3494.patch
>
>
> Each entry in {{MemoryDiffCache}} is keyed with {{(path, fromRev, toRev)}} 
> for the list of modified children at {{path}}. A diff calculated by 
> {{DocumentNodeStore.diffImpl}} at '/' (passively via the loader) or 
> {{JournalEntry.applyTo}} (actively) fills each path for which there are 
> modified children (including the hierarchy).
> But, if an observer calls {{compareWithBaseState}} on an unmodified sub-tree, 
> the observer will still go down to {{diffImpl}}, although the cached parent 
> entry could be used to answer the query.
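The idea can be sketched as follows (illustrative Python; the class, method names, and data layout are invented, not the actual MemoryDiffCache API): if a cached parent entry exists and the queried path is not among the parent's modified children, the child diff is known to be empty without invoking the loader.

```python
class DiffCacheSketch:
    def __init__(self):
        # (path, from_rev, to_rev) -> set of names of modified children
        self.entries = {}

    def put(self, path, from_rev, to_rev, modified_children):
        self.entries[(path, from_rev, to_rev)] = set(modified_children)

    def get(self, path, from_rev, to_rev, loader):
        key = (path, from_rev, to_rev)
        if key in self.entries:
            return self.entries[key]
        # Parent check: if this path is not a modified child of its parent
        # for the same revision range, the sub-tree is unchanged.
        parent, _, name = path.rpartition("/")
        parent_key = (parent or "/", from_rev, to_rev)
        if parent_key in self.entries and name not in self.entries[parent_key]:
            return set()  # empty diff, answered from the parent entry
        return loader(path, from_rev, to_rev)  # fall back to the costly diff
```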





[jira] [Commented] (OAK-3532) create RDB export tool

2015-10-27 Thread Julian Reschke (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-3532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14976326#comment-14976326
 ] 

Julian Reschke commented on OAK-3532:
-

trunk: several commits
1.2: svn.apache.org/r1710798
1.0: svn.apache.org/r1710804

> create RDB export tool
> --
>
> Key: OAK-3532
> URL: https://issues.apache.org/jira/browse/OAK-3532
> Project: Jackrabbit Oak
>  Issue Type: Technical task
>  Components: core
>Affects Versions: 1.3.8, 1.2.7, 1.0.22
>Reporter: Julian Reschke
>Assignee: Julian Reschke
> Fix For: 1.3.9, 1.0.23, 1.2.8
>
>
> Create a super-simple utility that can export JSON representations of 
> selected rows.





[jira] [Updated] (OAK-3532) create RDB export tool

2015-10-27 Thread Julian Reschke (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Reschke updated OAK-3532:

Fix Version/s: 1.0.23

> create RDB export tool
> --
>
> Key: OAK-3532
> URL: https://issues.apache.org/jira/browse/OAK-3532
> Project: Jackrabbit Oak
>  Issue Type: Technical task
>  Components: core
>Affects Versions: 1.3.8, 1.2.7, 1.0.22
>Reporter: Julian Reschke
>Assignee: Julian Reschke
> Fix For: 1.3.9, 1.0.23, 1.2.8
>
>
> Create a super-simple utility that can export JSON representations of 
> selected rows.





[jira] [Updated] (OAK-3532) create RDB export tool

2015-10-27 Thread Julian Reschke (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Reschke updated OAK-3532:

Affects Version/s: 1.0.22

> create RDB export tool
> --
>
> Key: OAK-3532
> URL: https://issues.apache.org/jira/browse/OAK-3532
> Project: Jackrabbit Oak
>  Issue Type: Technical task
>  Components: core
>Affects Versions: 1.3.8, 1.2.7, 1.0.22
>Reporter: Julian Reschke
>Assignee: Julian Reschke
> Fix For: 1.3.9, 1.0.23, 1.2.8
>
>
> Create a super-simple utility that can export JSON representations of 
> selected rows.





[jira] [Created] (OAK-3558) oak-core imports every package with the optional resolution policy

2015-10-27 Thread Francesco Mari (JIRA)
Francesco Mari created OAK-3558:
---

 Summary: oak-core imports every package with the optional 
resolution policy
 Key: OAK-3558
 URL: https://issues.apache.org/jira/browse/OAK-3558
 Project: Jackrabbit Oak
  Issue Type: Bug
  Components: core
Reporter: Francesco Mari


The oak-core bundle declares that every imported package has an optional 
resolution policy. Because of this, the OSGi framework will not attempt to 
resolve any imported package, but will happily resolve the bundle even if some 
dependencies are missing. This may generate {{NoClassDefFoundError}} at runtime.





[jira] [Commented] (OAK-3536) Indexing with Lucene and copy-on-read generate too much garbage in the BlobStore

2015-10-27 Thread Thomas Mueller (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-3536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14976338#comment-14976338
 ] 

Thomas Mueller commented on OAK-3536:
-

Lucene files all start with the same bytes; maybe we can find out what type of 
file causes the problem. The first 6 bytes should be enough to find out, and 
the first 4 bytes always seem to be "3f d7 6c 17", except for del files, which 
start with "ff ff ff fe":

{noformat}
hexdump -n 6 -C 
3f d7 6c 17 19 43 : cfe / cfs
3f d7 6c 17 19 4c : doc
3f d7 6c 17 18 4c : fdt
3f d7 6c 17 08 73 : segments
3f d7 6c 17 13 4c : si
ff ff ff fe 3f d7 : del
{noformat}
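The mapping above can be turned into a small classifier. A hedged Python sketch: the byte-to-extension table is taken from the hexdump in this comment, while the helper itself is invented for illustration and is not part of Oak:

```python
# Classify a Lucene index file by its leading bytes, per the table above.
MAGIC = {
    bytes.fromhex("3fd76c171943"): "cfe/cfs",
    bytes.fromhex("3fd76c17194c"): "doc",
    bytes.fromhex("3fd76c17184c"): "fdt",
    bytes.fromhex("3fd76c170873"): "segments",
    bytes.fromhex("3fd76c17134c"): "si",
    bytes.fromhex("fffffffe3fd7"): "del",
}

def classify(header: bytes) -> str:
    """Return the likely Lucene file type for the first 6 bytes of a blob."""
    return MAGIC.get(header[:6], "unknown")
```

Running this over the first 6 bytes of each garbage blob would show which index file types dominate the churn.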


> Indexing with Lucene and copy-on-read generate too much garbage in the 
> BlobStore
> 
>
> Key: OAK-3536
> URL: https://issues.apache.org/jira/browse/OAK-3536
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: lucene
>Affects Versions: 1.3.9
>Reporter: Francesco Mari
>Priority: Critical
> Fix For: 1.4
>
>
> The copy-on-read strategy when using Lucene indexing performs too many copies 
> of the index files from the filesystem to the repository. Every copy discards 
> the previously stored binary, which sits there as garbage until the binary 
> garbage collection kicks in. When the load on the system is particularly 
> intense, this behaviour makes the repository grow at an unreasonably high 
> pace. 
> I spotted this on a system where some content is generated every day at a 
> specific time. The content generation process creates approx. 6 million new 
> nodes, where each node contains 5 properties with small string, random 
> values. Nodes were saved in batches of 1000 nodes each. At the end of the 
> content generation process, the nodes are deleted to deliberately generate 
> garbage in the Segment Store. This is part of a testing effort to assess the 
> efficiency of the online compaction.
> I was never able to complete the tests because the system ran out of disk 
> space due to a lot of unused binary values. When debugging the system, on a 
> 400 GB (full) disk, the segments containing nodes and property values 
> occupied approx. 3 GB. The rest of the space was occupied by binary values in 
> form of bulk segments.





[jira] [Comment Edited] (OAK-3554) Use write concern of w:majority when connected to a replica set

2015-10-27 Thread Chetan Mehrotra (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-3554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14976312#comment-14976312
 ] 

Chetan Mehrotra edited comment on OAK-3554 at 10/27/15 1:18 PM:


Had a discussion with Marcel and Vikas on this and below is the revised 
proposal. 

To avoid using w:majority for *all* operations and still be able to guarantee 
that a commit does not end up in a partial state due to a rollback on replica 
state change, the following approach can be taken

# Ensure that all operations made as part of a given commit are routed to the 
same primary server
# Ensure that the final operation of updating the commit root is done with 
w:majority

Caveats/Assumptions
* This is only proposed for a non-sharded environment where the Oak instance 
communicates directly with the Mongo replica set and not via mongos
* It assumes that for a sequence of operations \[o1->w:1, o2->w:1, 
o3->w:majority\], if o3 went fine (with the above conditions) then o1 and o2 
would also have made it to the majority of the replica set nodes, i.e. the 
order of operations is maintained in the replication log
This would ensure that in case of any partial rollback a commit is not marked 
as valid. How it is implemented would need to be worked out, as it depends on 
# the Mongo Java Driver providing information about where the writes ended up
# the ability to manage transaction state in DocumentNodeStore
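Under the stated assumptions, the write-concern policy boils down to: every intermediate operation of a commit may use w:1, and only the final commit-root update uses w:majority. A hedged Python sketch of that policy (the function name and string labels are invented for illustration; this is not the DocumentNodeStore or Mongo driver API):

```python
def write_concerns(num_ops: int):
    """Assign a write concern to each operation of a commit: w:1 for all
    intermediate operations, w:majority only for the final commit-root
    update, relying on replication-log ordering for the earlier ones."""
    if num_ops < 1:
        raise ValueError("a commit needs at least one operation")
    return ["w:1"] * (num_ops - 1) + ["w:majority"]
```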




was (Author: chetanm):
Had a discussion with Marcel and Vikas on this and below is the revised 
proposal. 

To avoid use w:majority for *all* operations and still be able to guarantee 
that a Commit does not end up in partial state due to rollback in case of 
replica state change following approach can be taken

# Ensure that all operation made as part of given commit are routed to same 
primary server
# Ensure that final operation of updating the commit root is done with 
w:majority

This would ensure that in case of any partial rollback a commit is not marked 
as valid. How it is implemented would need to be worked out as it depends on 
# Mongo Java Driver providing information around where the writes ended up
# Ability to manage transaction state in DocumentNodeStore



> Use write concern of w:majority when connected to a replica set
> ---
>
> Key: OAK-3554
> URL: https://issues.apache.org/jira/browse/OAK-3554
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: mongomk
>Reporter: Chetan Mehrotra
> Fix For: 1.3.10
>
>
> Currently, while connecting to Mongo, MongoDocumentStore relies on the 
> default write concern provided as part of the mongouri. 
> Recently some issues were seen where a Mongo-based Oak was connecting to a 
> 3-member replica set and there were frequent replica state changes due to 
> the use of VMs for Mongo. This caused data loss and corruption of data in Oak.
> To avoid such situations Oak should default to a write concern of majority. 
> If a write concern is specified as part of the mongouri then that should take 
> precedence. This would allow the system admin to make the call on tweaking 
> the write concern if required, and at the same time allows Oak to use the 
> safe write concern by default.





[jira] [Resolved] (OAK-3555) Remove usage of deprecated mongo-java-driver methods

2015-10-27 Thread Marcel Reutegger (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcel Reutegger resolved OAK-3555.
---
Resolution: Fixed

Applied patch to trunk: http://svn.apache.org/r1710806

> Remove usage of deprecated mongo-java-driver methods
> 
>
> Key: OAK-3555
> URL: https://issues.apache.org/jira/browse/OAK-3555
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: core, mongomk
>Reporter: Marcel Reutegger
>Assignee: Marcel Reutegger
>Priority: Minor
> Fix For: 1.3.10
>
> Attachments: OAK-3555.patch
>
>
> This will allow for a smoother update of the driver to 3.x.





[jira] [Created] (OAK-3557) NodeDocument.isConflicting() reads complete revision history for changed property

2015-10-27 Thread Marcel Reutegger (JIRA)
Marcel Reutegger created OAK-3557:
-

 Summary: NodeDocument.isConflicting() reads complete revision 
history for changed property
 Key: OAK-3557
 URL: https://issues.apache.org/jira/browse/OAK-3557
 Project: Jackrabbit Oak
  Issue Type: Bug
  Components: core, mongomk
Affects Versions: 1.2, 1.0
Reporter: Marcel Reutegger
Assignee: Marcel Reutegger
 Fix For: 1.3.10


The method is only called when there is a concurrent change, but it will be 
rather expensive when there are many changes for a given property.





[jira] [Updated] (OAK-3489) DocumentStore: introduce a "NotEquals" condition

2015-10-27 Thread Julian Reschke (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Reschke updated OAK-3489:

Fix Version/s: 1.2.8

> DocumentStore: introduce a "NotEquals" condition
> 
>
> Key: OAK-3489
> URL: https://issues.apache.org/jira/browse/OAK-3489
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: mongomk, rdbmk
>Affects Versions: 1.3.8
>Reporter: Julian Reschke
>Assignee: Julian Reschke
>Priority: Minor
> Fix For: 1.3.9
>
> Attachments: OAK-3489-mreutegg.patch, OAK-3489.diff
>
>






[jira] [Updated] (OAK-3489) DocumentStore: introduce a "NotEquals" condition

2015-10-27 Thread Julian Reschke (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Reschke updated OAK-3489:

Affects Version/s: (was: 1.0.21)
   (was: 1.2.6)

> DocumentStore: introduce a "NotEquals" condition
> 
>
> Key: OAK-3489
> URL: https://issues.apache.org/jira/browse/OAK-3489
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: mongomk, rdbmk
>Affects Versions: 1.3.8
>Reporter: Julian Reschke
>Assignee: Julian Reschke
>Priority: Minor
> Fix For: 1.3.9
>
> Attachments: OAK-3489-mreutegg.patch, OAK-3489.diff
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (OAK-3536) Indexing with Lucene and copy-on-read generate too much garbage in the BlobStore

2015-10-27 Thread Francesco Mari (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-3536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14976281#comment-14976281
 ] 

Francesco Mari commented on OAK-3536:
-

I used both the default segment store and the FileDataStore. I observed the 
same behaviour with both.

> Indexing with Lucene and copy-on-read generate too much garbage in the 
> BlobStore
> 
>
> Key: OAK-3536
> URL: https://issues.apache.org/jira/browse/OAK-3536
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: lucene
>Affects Versions: 1.3.9
>Reporter: Francesco Mari
>Priority: Critical
> Fix For: 1.4
>
>
> The copy-on-read strategy when using Lucene indexing performs too many copies 
> of the index files from the filesystem to the repository. Every copy discards 
> the previously stored binary, that sits there as garbage until the binary 
> garbage collection kicks in. When the load on the system is particularly 
> intense, this behaviour makes the repository grow at an unreasonable high 
> pace. 
> I spotted this on a system where some content is generated every day at a 
> specific time. The content generation process creates approx. 6 millions new 
> nodes, where each node contains 5 properties with small string, random 
> values. Nodes were saved in batches of 1000 nodes each. At the end of the 
> content generation process, the nodes are deleted to deliberately generate 
> garbage in the Segment Store. This is part of a testing effort to assess the 
> efficiency of the online compaction.
> I was never able to complete the tests because the system run out of disk 
> space due to a lot of unused binary values. When debugging the system, on a 
> 400 GB (full) disk, the segments containing nodes and property values 
> occupied approx. 3 GB. The rest of the space was occupied by binary values in 
> form of bulk segments.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (OAK-3489) DocumentStore: introduce a "NotEquals" condition

2015-10-27 Thread Julian Reschke (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Reschke updated OAK-3489:

Affects Version/s: 1.2.7
   1.0.22

> DocumentStore: introduce a "NotEquals" condition
> 
>
> Key: OAK-3489
> URL: https://issues.apache.org/jira/browse/OAK-3489
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: mongomk, rdbmk
>Affects Versions: 1.3.8, 1.2.7, 1.0.22
>Reporter: Julian Reschke
>Assignee: Julian Reschke
>Priority: Minor
> Fix For: 1.3.9
>
> Attachments: OAK-3489-mreutegg.patch, OAK-3489.diff
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (OAK-3550) Add meta data to segments

2015-10-27 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/OAK-3550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Dürig resolved OAK-3550.

   Resolution: Fixed
Fix Version/s: 1.3.10

Fixed at http://svn.apache.org/viewvc?rev=1710780&view=rev

[~alex.parvulescu], please have a look when you can spare some time

> Add meta data to segments
> -
>
> Key: OAK-3550
> URL: https://issues.apache.org/jira/browse/OAK-3550
> Project: Jackrabbit Oak
>  Issue Type: New Feature
>  Components: segmentmk
>Reporter: Michael Dürig
>Assignee: Michael Dürig
> Fix For: 1.3.10
>
>
> For various kinds of tooling it would be good to have additional meta data 
> available within segments. Like which segment writer wrote (i.e. system or 
> compaction) the segment, a time stamp, the revision gc generation etc. 
> Such meta data could live in a regular String record right at the beginning 
> of the segment. As such it would always occupy the first slot of the root 
> records and would be very easy to get to in tooling. Additionally it would be 
> written as regular content and thus wouldn't require any changes in the 
> segment format. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (OAK-3489) DocumentStore: introduce a "NotEquals" condition

2015-10-27 Thread Julian Reschke (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Reschke resolved OAK-3489.
-
Resolution: Fixed

trunk : svn.apache.org/r1710789

> DocumentStore: introduce a "NotEquals" condition
> 
>
> Key: OAK-3489
> URL: https://issues.apache.org/jira/browse/OAK-3489
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: mongomk, rdbmk
>Affects Versions: 1.0.21, 1.2.6, 1.3.8
>Reporter: Julian Reschke
>Assignee: Julian Reschke
>Priority: Minor
> Fix For: 1.3.9
>
> Attachments: OAK-3489-mreutegg.patch, OAK-3489.diff
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (OAK-3489) DocumentStore: introduce a "NotEquals" condition

2015-10-27 Thread Julian Reschke (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Reschke updated OAK-3489:

Affects Version/s: (was: 1.3.7)
   1.3.8

> DocumentStore: introduce a "NotEquals" condition
> 
>
> Key: OAK-3489
> URL: https://issues.apache.org/jira/browse/OAK-3489
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: mongomk, rdbmk
>Affects Versions: 1.0.21, 1.2.6, 1.3.8
>Reporter: Julian Reschke
>Assignee: Julian Reschke
>Priority: Minor
> Fix For: 1.3.9
>
> Attachments: OAK-3489-mreutegg.patch, OAK-3489.diff
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (OAK-3532) create RDB export tool

2015-10-27 Thread Julian Reschke (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Reschke updated OAK-3532:

Fix Version/s: 1.2.8

> create RDB export tool
> --
>
> Key: OAK-3532
> URL: https://issues.apache.org/jira/browse/OAK-3532
> Project: Jackrabbit Oak
>  Issue Type: Technical task
>  Components: core
>Affects Versions: 1.3.8, 1.2.7
>Reporter: Julian Reschke
>Assignee: Julian Reschke
> Fix For: 1.3.9, 1.2.8
>
>
> Create a super-simple utility that can export JSON representations of 
> selected rows.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (OAK-3532) create RDB export tool

2015-10-27 Thread Julian Reschke (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Reschke updated OAK-3532:

Affects Version/s: 1.3.8
   1.2.7

> create RDB export tool
> --
>
> Key: OAK-3532
> URL: https://issues.apache.org/jira/browse/OAK-3532
> Project: Jackrabbit Oak
>  Issue Type: Technical task
>  Components: core
>Affects Versions: 1.3.8, 1.2.7
>Reporter: Julian Reschke
>Assignee: Julian Reschke
> Fix For: 1.3.9, 1.2.8
>
>
> Create a super-simple utility that can export JSON representations of 
> selected rows.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (OAK-3554) Use write concern of w:majority when connected to a replica set

2015-10-27 Thread Chetan Mehrotra (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-3554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14976312#comment-14976312
 ] 

Chetan Mehrotra commented on OAK-3554:
--

Had a discussion with Marcel and Vikas on this; below is the revised 
proposal. 

To avoid using w:majority for *all* operations and still be able to guarantee 
that a commit does not end up in a partial state due to a rollback on a 
replica state change, the following approach can be taken:

# Ensure that all operations made as part of a given commit are routed to the 
same primary server
# Ensure that the final operation of updating the commit root is done with 
w:majority

This would ensure that in case of any partial rollback a commit is not marked 
as valid. How this is implemented still needs to be worked out, as it depends 
on:
# The Mongo Java driver providing information about where the writes ended up
# The ability to manage transaction state in DocumentNodeStore
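The precedence rule described in the issue (an explicit write concern in the 
mongouri wins, otherwise default to majority when connected to a replica set) 
can be sketched roughly as follows. This is an illustrative sketch, not Oak's 
actual MongoDocumentStore code; the helper name and URI parsing are 
assumptions:

```python
from urllib.parse import urlparse, parse_qs

def effective_write_concern(mongo_uri, is_replica_set):
    """Pick the write concern to use.

    An explicit w=... option in the mongouri takes precedence; otherwise
    default to 'majority' when connected to a replica set, else to '1'
    (acknowledged by the primary only).
    """
    params = parse_qs(urlparse(mongo_uri).query)
    if 'w' in params:
        return params['w'][0]
    return 'majority' if is_replica_set else '1'
```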



> Use write concern of w:majority when connected to a replica set
> ---
>
> Key: OAK-3554
> URL: https://issues.apache.org/jira/browse/OAK-3554
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: mongomk
>Reporter: Chetan Mehrotra
> Fix For: 1.3.10
>
>
> Currently while connecting to Mongo MongoDocumentStore relies on default 
> write concern provided as part of mongouri. 
> Recently some issues were seen where Mongo based Oak was connecting to 3 
> member replica set and there were frequent replica state changes due to use 
> of VM for Mongo. This caused data loss and corruption of data in Oak.
> To avoid such situation Oak should default to write concern of majority by 
> default. If some write concern is specified as part of mongouri then that 
> should take precedence. This would allow system admin to take the call of 
> tweaking write concern if required and at same time allows Oak to use the 
> safe write concern.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (OAK-3532) create RDB export tool

2015-10-27 Thread Julian Reschke (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Reschke resolved OAK-3532.
-
Resolution: Fixed

> create RDB export tool
> --
>
> Key: OAK-3532
> URL: https://issues.apache.org/jira/browse/OAK-3532
> Project: Jackrabbit Oak
>  Issue Type: Technical task
>  Components: core
>Reporter: Julian Reschke
>Assignee: Julian Reschke
> Fix For: 1.3.9
>
>
> Create a super-simple utility that can export JSON representations of 
> selected rows.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (OAK-3532) create RDB export tool

2015-10-27 Thread Julian Reschke (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Reschke updated OAK-3532:

Fix Version/s: (was: 1.3.10)
   1.3.9

> create RDB export tool
> --
>
> Key: OAK-3532
> URL: https://issues.apache.org/jira/browse/OAK-3532
> Project: Jackrabbit Oak
>  Issue Type: Technical task
>  Components: core
>Reporter: Julian Reschke
>Assignee: Julian Reschke
> Fix For: 1.3.9
>
>
> Create a super-simple utility that can export JSON representations of 
> selected rows.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (OAK-3489) DocumentStore: introduce a "NotEquals" condition

2015-10-27 Thread Julian Reschke (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Reschke updated OAK-3489:

Fix Version/s: 1.3.9

> DocumentStore: introduce a "NotEquals" condition
> 
>
> Key: OAK-3489
> URL: https://issues.apache.org/jira/browse/OAK-3489
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: mongomk, rdbmk
>Affects Versions: 1.0.21, 1.2.6, 1.3.8
>Reporter: Julian Reschke
>Assignee: Julian Reschke
>Priority: Minor
> Fix For: 1.3.9
>
> Attachments: OAK-3489-mreutegg.patch, OAK-3489.diff
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (OAK-3536) Indexing with Lucene and copy-on-read generate too much garbage in the BlobStore

2015-10-27 Thread Thomas Mueller (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-3536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14976261#comment-14976261
 ] 

Thomas Mueller commented on OAK-3536:
-

You wrote "in the BlobStore", did you configure a blob store (FileDataStore or 
other), or did you use the default segment store.

> Indexing with Lucene and copy-on-read generate too much garbage in the 
> BlobStore
> 
>
> Key: OAK-3536
> URL: https://issues.apache.org/jira/browse/OAK-3536
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: lucene
>Affects Versions: 1.3.9
>Reporter: Francesco Mari
>Priority: Critical
> Fix For: 1.4
>
>
> The copy-on-read strategy when using Lucene indexing performs too many copies 
> of the index files from the filesystem to the repository. Every copy discards 
> the previously stored binary, that sits there as garbage until the binary 
> garbage collection kicks in. When the load on the system is particularly 
> intense, this behaviour makes the repository grow at an unreasonable high 
> pace. 
> I spotted this on a system where some content is generated every day at a 
> specific time. The content generation process creates approx. 6 millions new 
> nodes, where each node contains 5 properties with small string, random 
> values. Nodes were saved in batches of 1000 nodes each. At the end of the 
> content generation process, the nodes are deleted to deliberately generate 
> garbage in the Segment Store. This is part of a testing effort to assess the 
> efficiency of the online compaction.
> I was never able to complete the tests because the system run out of disk 
> space due to a lot of unused binary values. When debugging the system, on a 
> 400 GB (full) disk, the segments containing nodes and property values 
> occupied approx. 3 GB. The rest of the space was occupied by binary values in 
> form of bulk segments.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (OAK-3499) Test failures when there is no network interface

2015-10-27 Thread Julian Reschke (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-3499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14976470#comment-14976470
 ] 

Julian Reschke edited comment on OAK-3499 at 10/27/15 2:58 PM:
---

Yup, can be reproduced by commenting out the code that finds hardware 
interfaces.

What happens here is that in that case, when the DocumentNodeStore 
(setClusterId(1)) is initialized the second time, it won't be able to re-use 
the existing ClusterNodeInfo, as the machineId (set to a random UUID) does not 
match. That was indeed intentional.

We probably need to tune the test case in some way. 


was (Author: reschke):
Yup, can be reproduced by commenting out the code that finds hardware 
interfaces. Will investigate.

> Test failures when there is no network interface
> 
>
> Key: OAK-3499
> URL: https://issues.apache.org/jira/browse/OAK-3499
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: core, mongomk
>Reporter: Marcel Reutegger
>Assignee: Julian Reschke
>Priority: Minor
>
> There are test failures when no network interface is available.
> {noformat}
> Tests in error: 
>   purge(org.apache.jackrabbit.oak.plugins.document.CollisionTest): Configured 
> cluster node id 1 already in use: 
>   
> inactiveClusterId(org.apache.jackrabbit.oak.plugins.document.DocumentNodeStoreTest):
>  Configured cluster node id 2 already in use: 
>   
> purgeUnmergedBranch(org.apache.jackrabbit.oak.plugins.document.UnmergedBranchTest):
>  Configured cluster node id 1 already in use: 
> {noformat}
> I'm quite confident these used to work before.
> [~reschke], could this be caused by recent changes for OAK-3449?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (OAK-3499) Test failures when there is no network interface

2015-10-27 Thread Marcel Reutegger (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-3499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14976566#comment-14976566
 ] 

Marcel Reutegger commented on OAK-3499:
---

But isn't this rather strange behaviour? The DocumentNodeStore is disposed, 
which means the clusterId is not in use anymore. There is no lease on the 
clusterId. AFAICS, the entry is even cleaned up in 
{{ClusterNodeInfo.createInstance()}}.

> Test failures when there is no network interface
> 
>
> Key: OAK-3499
> URL: https://issues.apache.org/jira/browse/OAK-3499
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: core, mongomk
>Reporter: Marcel Reutegger
>Assignee: Julian Reschke
>Priority: Minor
>
> There are test failures when no network interface is available.
> {noformat}
> Tests in error: 
>   purge(org.apache.jackrabbit.oak.plugins.document.CollisionTest): Configured 
> cluster node id 1 already in use: 
>   
> inactiveClusterId(org.apache.jackrabbit.oak.plugins.document.DocumentNodeStoreTest):
>  Configured cluster node id 2 already in use: 
>   
> purgeUnmergedBranch(org.apache.jackrabbit.oak.plugins.document.UnmergedBranchTest):
>  Configured cluster node id 1 already in use: 
> {noformat}
> I'm quite confident these used to work before.
> [~reschke], could this be caused by recent changes for OAK-3449?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (OAK-3372) Collapsing external events in BackgroundObserver even before queue is full leads to JournalEntry not getting used

2015-10-27 Thread Vikas Saurabh (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-3372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14976607#comment-14976607
 ] 

Vikas Saurabh commented on OAK-3372:


Discussed a few options for detecting segment vs. document store with 
[~chetanm] and [~mreutegg]:
* The background observer can be passed configuration, via descriptors, on 
whether to collapse external events early
* Since an external event can't have a non-null commit info, we might hack 
around this and add some extra data into local events - that extra info could 
then configure the background observer. Apart from being very hacky, this also 
doesn't solve the case where no local events are being generated

It seems that differentiating between the segment and document store inside 
the background observer is non-trivial. Added to that, there is no concept of 
an external event for the segment store anyway. So it might make sense to 
resolve this issue with the {{collapse only when queue is full}} strategy and 
open a new one to track the {{detect mode and decide collapse behavior}} 
strategy. [~mduerig], wdyt?
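The {{collapse only when queue is full}} strategy can be sketched as below. 
This is a simplified illustration, not Oak's actual BackgroundObserver code; 
the class and method names are assumptions:

```python
from collections import deque

class ObserverQueue:
    """Queue of (root_revision, is_external) content-change events.

    External events are only collapsed into the newest queued external
    event once the queue has reached capacity, so the observer normally
    sees every external revision pair (and can reuse JournalEntry diffs).
    """

    def __init__(self, capacity):
        self.capacity = capacity
        self.queue = deque()

    def add(self, root_revision, external):
        full = len(self.queue) >= self.capacity
        tail_external = bool(self.queue) and self.queue[-1][1]
        if external and full and tail_external:
            # Collapse: replace the tail's revision instead of growing
            # the queue beyond its capacity.
            self.queue[-1] = (root_revision, True)
        else:
            self.queue.append((root_revision, external))
```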

> Collapsing external events in BackgroundObserver even before queue is full 
> leads to JournalEntry not getting used
> -
>
> Key: OAK-3372
> URL: https://issues.apache.org/jira/browse/OAK-3372
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 1.3.5
>Reporter: Vikas Saurabh
>  Labels: resilience
>
> BackgroundObserver currently merges external events if the last one in queue 
> is also an external event. This leads to diff being done for a revision pair 
> which is different from the ones pushed actively into cache during backgroud 
> read (using JournalEntry) i.e. diff queries for {{diff("/a/b", rA, rC)}} 
> while background read had pushed results of {{diff("/a/b", rA, rB)}} and 
> {{diff("/a/b", rB, rC)}}.
> (cc [~mreutegg], [~egli], [~chetanm], [~mduerig])



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (OAK-3499) Test failures when there is no network interface

2015-10-27 Thread Marcel Reutegger (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-3499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14976583#comment-14976583
 ] 

Marcel Reutegger commented on OAK-3499:
---

Yes, just those with random IDs.

> Test failures when there is no network interface
> 
>
> Key: OAK-3499
> URL: https://issues.apache.org/jira/browse/OAK-3499
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: core, mongomk
>Reporter: Marcel Reutegger
>Assignee: Julian Reschke
>Priority: Minor
>
> There are test failures when no network interface is available.
> {noformat}
> Tests in error: 
>   purge(org.apache.jackrabbit.oak.plugins.document.CollisionTest): Configured 
> cluster node id 1 already in use: 
>   
> inactiveClusterId(org.apache.jackrabbit.oak.plugins.document.DocumentNodeStoreTest):
>  Configured cluster node id 2 already in use: 
>   
> purgeUnmergedBranch(org.apache.jackrabbit.oak.plugins.document.UnmergedBranchTest):
>  Configured cluster node id 1 already in use: 
> {noformat}
> I'm quite confident these used to work before.
> [~reschke], could this be caused by recent changes for OAK-3449?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (OAK-3499) Test failures when there is no network interface

2015-10-27 Thread Julian Reschke (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-3499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14976572#comment-14976572
 ] 

Julian Reschke commented on OAK-3499:
-

With random IDs? Might be a plan. Good idea.

> Test failures when there is no network interface
> 
>
> Key: OAK-3499
> URL: https://issues.apache.org/jira/browse/OAK-3499
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: core, mongomk
>Reporter: Marcel Reutegger
>Assignee: Julian Reschke
>Priority: Minor
>
> There are test failures when no network interface is available.
> {noformat}
> Tests in error: 
>   purge(org.apache.jackrabbit.oak.plugins.document.CollisionTest): Configured 
> cluster node id 1 already in use: 
>   
> inactiveClusterId(org.apache.jackrabbit.oak.plugins.document.DocumentNodeStoreTest):
>  Configured cluster node id 2 already in use: 
>   
> purgeUnmergedBranch(org.apache.jackrabbit.oak.plugins.document.UnmergedBranchTest):
>  Configured cluster node id 1 already in use: 
> {noformat}
> I'm quite confident these used to work before.
> [~reschke], could this be caused by recent changes for OAK-3449?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (OAK-3499) Test failures when there is no network interface

2015-10-27 Thread Marcel Reutegger (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-3499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14976568#comment-14976568
 ] 

Marcel Reutegger commented on OAK-3499:
---

Maybe we should remove such entries on DocumentNodeStore.dispose()?

> Test failures when there is no network interface
> 
>
> Key: OAK-3499
> URL: https://issues.apache.org/jira/browse/OAK-3499
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: core, mongomk
>Reporter: Marcel Reutegger
>Assignee: Julian Reschke
>Priority: Minor
>
> There are test failures when no network interface is available.
> {noformat}
> Tests in error: 
>   purge(org.apache.jackrabbit.oak.plugins.document.CollisionTest): Configured 
> cluster node id 1 already in use: 
>   
> inactiveClusterId(org.apache.jackrabbit.oak.plugins.document.DocumentNodeStoreTest):
>  Configured cluster node id 2 already in use: 
>   
> purgeUnmergedBranch(org.apache.jackrabbit.oak.plugins.document.UnmergedBranchTest):
>  Configured cluster node id 1 already in use: 
> {noformat}
> I'm quite confident these used to work before.
> [~reschke], could this be caused by recent changes for OAK-3449?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (OAK-3560) Tooling for writing segment graphs to a file

2015-10-27 Thread JIRA
Michael Dürig created OAK-3560:
--

 Summary: Tooling for writing segment graphs to a file
 Key: OAK-3560
 URL: https://issues.apache.org/jira/browse/OAK-3560
 Project: Jackrabbit Oak
  Issue Type: New Feature
  Components: run
Reporter: Michael Dürig
Assignee: Michael Dürig


[Gephi|https://gephi.org/] turned out to be very valuable for examining segment 
graphs. I would like to add some tooling so we could dump the segment graph of 
a {{FileStore}} to a file. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (OAK-3560) Tooling for writing segment graphs to a file

2015-10-27 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/OAK-3560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Dürig resolved OAK-3560.

   Resolution: Fixed
Fix Version/s: 1.3.10

Fixed at http://svn.apache.org/viewvc?rev=1710862&view=rev.

The 'graph' run mode exports the segment graph of a file store to a text 
file in the [Guess GDF 
format|https://gephi.github.io/users/supported-graph-formats/gdf-format/], 
which is easily imported into [Gephi|https://gephi.github.io].

As the GDF format only supports integer values, while the segment time stamps 
are encoded as long values, an optional 'epoch' argument can be specified. It 
acts as a negative offset translating all timestamps into a valid int range. 
If no epoch is given on the command line, the start of the day of the last 
modified date of 'journal.log' is used.

{noformat}
$ java -jar oak-run-*.jar graph [File] 

[File] -- Path to segment store (required)

Option   Description
--   ---
--epochEpoch of the segment time stamps
   (derived from journal.log if not
   given)
--output   Output file (default: segments.gdf)
{noformat}

Sample output:
{noformat}
nodedef>name VARCHAR, label VARCHAR, type VARCHAR, wid VARCHAR, gc INT, t INT, 
head BOOLEAN
aa871660-786d-4af5-a6e1-df7b8fa4b0ef,74,data,c-01,0,44051986,true
6afcaed2-0545-4d9e-a440-aac08038b5ac,72,data,c-01,0,44051892,true
dba85de3-19ac-488d-a99b-f608dfc811ff,79,data,c-01,0,44052183,true
edgedef>node1 VARCHAR, node2 VARCHAR, head BOOLEAN
dba85de3-19ac-488d-a99b-f608dfc811ff,aa871660-786d-4af5-a6e1-df7b8fa4b0ef,true
aa871660-786d-4af5-a6e1-df7b8fa4b0ef,6afcaed2-0545-4d9e-a440-aac08038b5ac,false
{noformat}
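The epoch handling described above (mapping long millisecond timestamps into 
GDF's int range by subtracting an offset) amounts to something like the 
following sketch; the function name and the concrete epoch value in the usage 
example are illustrative, not the oak-run code:

```python
def to_gdf_timestamp(timestamp_ms, epoch_ms):
    """Translate a long millisecond timestamp into GDF's signed 32-bit
    int range by subtracting the epoch offset."""
    t = timestamp_ms - epoch_ms
    if not (-2**31 <= t < 2**31):
        raise ValueError("timestamp out of int range for the chosen epoch")
    return t

# e.g. an epoch at the start of the day keeps same-day timestamps small:
# to_gdf_timestamp(1445904051986, 1445860000000) == 44051986
```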

> Tooling for writing segment graphs to a file
> 
>
> Key: OAK-3560
> URL: https://issues.apache.org/jira/browse/OAK-3560
> Project: Jackrabbit Oak
>  Issue Type: New Feature
>  Components: run
>Reporter: Michael Dürig
>Assignee: Michael Dürig
>  Labels: tooling
> Fix For: 1.3.10
>
>
> [Gephi|https://gephi.org/] turned out to be very valuable for examining 
> segment graphs. I would like to add some tooling so we could dump the segment 
> graph of a {{FileStore}} to a file. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (OAK-3561) Query Constraints of "reference" property

2015-10-27 Thread Jessie (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jessie updated OAK-3561:

Description: 
When working with Oak 1.2.7, I find the following query shows 0 result 
(Jackrabbit 2.10 outputs the correct rows).

"SELECT case.[jcr:uuid] as uuid FROM [xx.Patient] as patient INNER JOIN 
[xx:Case] as case ON patient.[jcr:uuid] = case.[patient] WHERE patient.[name] 
LIKE 'someone%'

case.[patient] is defined as weakreference type, representing patient

I just wonder whether it is a bug or such query should be expressed in another 
way in OAK?

Thank you very much!

  was:
When working with Oak 1.2.7, I find the following query shows 0 result 
(Jackrabbit 2.10 outputs the correct rows).

"SELECT case.[jcr:uuid] as uuid FROM [xx.Patient] as patient INNER JOIN 
[xx:Case] as case ON patient.[jcr:uuid] = case.[patient] WHERE patient.[name] = 
'someone%'

case.[patient] is defined as weakreference type, representing patient

I just wonder whether it is a bug or such query should be expressed in another 
way in OAK?

Thank you very much!


> Query Constraints of "reference" property
> -
>
> Key: OAK-3561
> URL: https://issues.apache.org/jira/browse/OAK-3561
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: query
>Affects Versions: 1.2.7
> Environment: Java 1.8.0_11 with Mongodb 3.0.6 on Windows 7
>Reporter: Jessie
>
> When working with Oak 1.2.7, I find the following query shows 0 result 
> (Jackrabbit 2.10 outputs the correct rows).
> "SELECT case.[jcr:uuid] as uuid FROM [xx.Patient] as patient INNER JOIN 
> [xx:Case] as case ON patient.[jcr:uuid] = case.[patient] WHERE patient.[name] 
> LIKE 'someone%'
> case.[patient] is defined as weakreference type, representing patient
> I just wonder whether it is a bug or such query should be expressed in 
> another way in OAK?
> Thank you very much!



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (OAK-3556) MongoDocumentStore may create incomplete document

2015-10-27 Thread Marcel Reutegger (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcel Reutegger updated OAK-3556:
--
Fix Version/s: 1.2.8
   1.0.23

Merged into 1.2 branch: http://svn.apache.org/r1710826

and 1.0 branch: http://svn.apache.org/r1710831

> MongoDocumentStore may create incomplete document
> -
>
> Key: OAK-3556
> URL: https://issues.apache.org/jira/browse/OAK-3556
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: core, mongomk
>Affects Versions: 1.0, 1.2
>Reporter: Marcel Reutegger
>Assignee: Marcel Reutegger
>Priority: Minor
> Fix For: 1.3.10, 1.0.23, 1.2.8
>
>
> The document is incomplete when there are multiple set-map-entry operations
> for the same name but with different revisions.
> Right now the DocumentNodeStore does not create such documents, which means 
> this is not a problem in practice.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (OAK-3335) RepositorySidegrade has runtime dependency on jackrabbit-core

2015-10-27 Thread Julian Sedding (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Sedding resolved OAK-3335.
-
Resolution: Won't Fix

Since {{oak-upgrade}} is a runnable jar file now, and it embeds JR2 classes, 
this is no longer a problem.

> RepositorySidegrade has runtime dependency on jackrabbit-core
> -
>
> Key: OAK-3335
> URL: https://issues.apache.org/jira/browse/OAK-3335
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: upgrade
>Affects Versions: 1.3.5
>Reporter: Julian Sedding
>Assignee: Julian Sedding
> Fix For: 1.3.10
>
>
> It should be possible to run {{RepositorySidegrade}} from a runnable jar file 
> that does not embed {{jackrabbit-core}}. E.g. once {{RepositorySidegrade}} is 
> enabled in {{oak-run}}, it should work in the variant that does not embed 
> jackrabbit-core.
> OAK-3239 introduced a transitive runtime dependency from 
> {{RepositorySidegrade}} to {{RepositoryUpgrade}} (via static imports), which 
> has dependencies to classes in {{jackrabbit-core}}. This leads to failure 
> with a {{ClassNotFoundException}}.
> Moving the constants and static method to {{RepositorySidegrade}} and 
> importing them in {{RepositoryUpgrade}} instead should resolve the issue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (OAK-3556) MongoDocumentStore may create incomplete document

2015-10-27 Thread Marcel Reutegger (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcel Reutegger resolved OAK-3556.
---
Resolution: Fixed

Fixed in trunk: http://svn.apache.org/r1710816

> MongoDocumentStore may create incomplete document
> -
>
> Key: OAK-3556
> URL: https://issues.apache.org/jira/browse/OAK-3556
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: core, mongomk
>Affects Versions: 1.0, 1.2
>Reporter: Marcel Reutegger
>Assignee: Marcel Reutegger
>Priority: Minor
> Fix For: 1.3.10
>
>
> The document is incomplete when there are multiple set-map-entry operations
> for the same name but with different revisions.
> Right now the DocumentNodeStore does not create such documents, which means 
> this is not a problem in practice.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (OAK-3554) Use write concern of w:majority when connected to a replica set

2015-10-27 Thread Norberto Leite (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-3554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14976408#comment-14976408
 ] 

Norberto Leite commented on OAK-3554:
-

Hi gentlemen, 

Actually you should wait for version 3.2 to enable this:
https://jira.mongodb.org/browse/SERVER-18022

We are going to introduce a "*read committed*" read preference that will avoid 
_stale_ reads from a replica set. 
There are some considerations to be made in terms of what *the last* insert is 
in a distributed cluster, but for this particular exercise one can guarantee 
that when reading from a replica set, if we want to avoid reading information 
that can potentially be rolled back, this read preference will provide that 
guarantee. 
If you wish to follow the implementation details in the Java driver, just 
follow this ticket -> https://jira.mongodb.org/browse/JAVA-2002

Please ping me if you need further information.

> Use write concern of w:majority when connected to a replica set
> ---
>
> Key: OAK-3554
> URL: https://issues.apache.org/jira/browse/OAK-3554
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: mongomk
>Reporter: Chetan Mehrotra
> Fix For: 1.3.10
>
>
> Currently while connecting to Mongo MongoDocumentStore relies on default 
> write concern provided as part of mongouri. 
> Recently some issues were seen where Mongo based Oak was connecting to 3 
> member replica set and there were frequent replica state changes due to use 
> of VM for Mongo. This caused data loss and corruption of data in Oak.
> To avoid such situation Oak should default to write concern of majority by 
> default. If some write concern is specified as part of mongouri then that 
> should take precedence. This would allow system admin to take the call of 
> tweaking write concern if required and at same time allows Oak to use the 
> safe write concern.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)