[jira] [Created] (OAK-7094) Log cli arguments and vm arguments passed to indexer command

2017-12-20 Thread Chetan Mehrotra (JIRA)
Chetan Mehrotra created OAK-7094:


 Summary: Log cli arguments and vm arguments passed to indexer 
command
 Key: OAK-7094
 URL: https://issues.apache.org/jira/browse/OAK-7094
 Project: Jackrabbit Oak
  Issue Type: Improvement
  Components: run
Reporter: Chetan Mehrotra
Assignee: Chetan Mehrotra
Priority: Minor
 Fix For: 1.7.14, 1.8


It would be useful to also log the CLI arguments and VM arguments to 
indexing.log, as that would help in analysing any customer-reported issue.
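
As a rough illustration of the request, the sketch below gathers both argument lists at startup and formats them into a single line that could be written to a log. The class name StartupArgsLogger and the describe helper are hypothetical and are not taken from oak-run; the only library call assumed is the standard JDK RuntimeMXBean.getInputArguments().

```java
import java.lang.management.ManagementFactory;
import java.util.Arrays;
import java.util.List;

class StartupArgsLogger {
    // Hypothetical helper: formats CLI and VM arguments into one log line.
    static String describe(String[] cliArgs, List<String> vmArgs) {
        return "CLI arguments: " + Arrays.toString(cliArgs) + ", VM arguments: " + vmArgs;
    }

    public static void main(String[] args) {
        // JVM options (for example -Xmx settings) as seen by the running process.
        List<String> vmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        // A real command would send this to its logger instead of stdout.
        System.out.println(describe(args, vmArgs));
    }
}
```

Emitting this line once, before any work starts, keeps the invocation details at the top of the log file.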



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (OAK-7094) Log cli arguments and vm arguments passed to indexer command

2017-12-20 Thread Chetan Mehrotra (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-7094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Mehrotra resolved OAK-7094.
--
Resolution: Fixed

Done with 1818746

> Log cli arguments and vm arguments passed to indexer command
> 
>
> Key: OAK-7094
> URL: https://issues.apache.org/jira/browse/OAK-7094
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: run
>Reporter: Chetan Mehrotra
>Assignee: Chetan Mehrotra
>Priority: Minor
> Fix For: 1.8, 1.7.14
>
>
> It would be useful to also log the CLI arguments and VM arguments to 
> indexing.log, as that would help in analysing any customer-reported issue.





[jira] [Created] (OAK-7095) NodeStoreFixtureProvider should use BlobStore from DocumentNodeStore if no DataStore configured

2017-12-20 Thread Chetan Mehrotra (JIRA)
Chetan Mehrotra created OAK-7095:


 Summary: NodeStoreFixtureProvider should use BlobStore from 
DocumentNodeStore if no DataStore configured
 Key: OAK-7095
 URL: https://issues.apache.org/jira/browse/OAK-7095
 Project: Jackrabbit Oak
  Issue Type: Bug
  Components: run
Reporter: Chetan Mehrotra
Assignee: Chetan Mehrotra
 Fix For: 1.8, 1.7.14


NodeStoreFixtureProvider currently works fine for an explicitly configured 
BlobStore. However, for setups like Mongo, where no external DataStore is 
configured and an implicit BlobStore is created, that BlobStore is not exposed.

So NodeStoreFixtureProvider should expose such a BlobStore





[jira] [Resolved] (OAK-7095) NodeStoreFixtureProvider should use BlobStore from DocumentNodeStore if no DataStore configured

2017-12-20 Thread Chetan Mehrotra (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-7095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Mehrotra resolved OAK-7095.
--
Resolution: Fixed

Done with 1818751

> NodeStoreFixtureProvider should use BlobStore from DocumentNodeStore if no 
> DataStore configured
> ---
>
> Key: OAK-7095
> URL: https://issues.apache.org/jira/browse/OAK-7095
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: run
>Reporter: Chetan Mehrotra
>Assignee: Chetan Mehrotra
> Fix For: 1.8, 1.7.14
>
>
> NodeStoreFixtureProvider currently works fine for an explicitly configured 
> BlobStore. However, for setups like Mongo, where no external DataStore is 
> configured and an implicit BlobStore is created, that BlobStore is not exposed.
> So NodeStoreFixtureProvider should expose such a BlobStore





[jira] [Updated] (OAK-6973) Define public/internal packages

2017-12-20 Thread Marcel Reutegger (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-6973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcel Reutegger updated OAK-6973:
--
Fix Version/s: (was: 1.8)
   1.10

> Define public/internal packages
> ---
>
> Key: OAK-6973
> URL: https://issues.apache.org/jira/browse/OAK-6973
> Project: Jackrabbit Oak
>  Issue Type: Task
>Reporter: Marcel Reutegger
> Fix For: 1.10
>
>
> As part of the Oak modularization, packages previously exported without a 
> version will at some point have to adhere to proper semantic versioning. See 
> also OAK-3919 and its sub-tasks.
> Since some of those packages are not meant to be used outside of Oak, there 
> should be a mechanism to define which exported packages are public and which 
> are considered internal. While semantic versioning rules apply to both 
> categories, we may want to provide different guarantees/guidance to consumers 
> of those packages. E.g. increasing the major version of a package used only 
> by Oak has less impact compared to a major version increase of a 'public' 
> package used by many applications.





[jira] [Updated] (OAK-7066) Active deletion blob list files can grow too large due to inlined blobs

2017-12-20 Thread Marcel Reutegger (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-7066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcel Reutegger updated OAK-7066:
--
Fix Version/s: (was: 1.8.)
   1.8

> Active deletion blob list files can grow too large due to inlined blobs
> ---
>
> Key: OAK-7066
> URL: https://issues.apache.org/jira/browse/OAK-7066
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: lucene
>Reporter: Vikas Saurabh
>Assignee: Vikas Saurabh
>Priority: Blocker
> Fix For: 1.7.13, 1.8
>
>
> This is a follow-up to OAK-7052, where we noticed that deleted blob list files 
> collected by the active deletion logic can grow very large due to inlined blobs.
> One potential fix (not sure how yet, though) is to not actively delete inlined 
> blobs.
> Here are some stats which might help us take a call (based on raw numbers 
> collected at \[0])
> ||file-name||large_lines||large_size||small_lines||small_size||small_lines/total_lines||small_size/total_size||
> |blobs-1512664032264.txt|245301|3310224358|173096|35473656|0.413712335413495|0.010602766852107|
> |blobs-1512698405656.txt|370373|4443957885|256775|52997864|0.409432861142824|0.011785275852845|
> |blobs-1512987450004.txt|660669|6214740439|461168|92017554|0.411082893504137|0.014590309966251|
> |blobs-1513130410963.txt|569083|5490965583|406756|80124598|0.416826956085994|0.014382211631264|
> |blobs-1513216819447.txt|69876|1413561892|46238|9221956|0.398212101899857|0.006481628262061|
> \[0]:
> file sizes
> {noformat}
> repository/index/deleted-blobs$ ls -l blobs-151*
> -rw-r--r-- 1 root root 3369065620 Dec  8 01:59 blobs-1512664032264.txt
> -rw-r--r-- 1 root root 4532250073 Dec  9 01:59 blobs-1512698405656.txt
> -rw-r--r-- 1 root root 6370201955 Dec 13 01:59 blobs-1512987450004.txt
> -rw-r--r-- 1 root root 1916223582 Dec 13 11:52 blobs-1513130410963.txt
> {noformat}
> number of entries
> {noformat}
> repository/index/deleted-blobs$ wc -l blobs-151*
>  418397 blobs-1512664032264.txt
>  627148 blobs-1512698405656.txt
> 1121837 blobs-1512987450004.txt
>  308292 blobs-1513130410963.txt
> 2475674 total
> {noformat}
> number of entries and sizes split on threshold of 500 bytes of blob ids
> {noformat}
> repository/index/deleted-blobs$ for i in blobs-151*;do echo $i;awk 'BEGIN {FS="|"} {len = length($1); if (len > 500) {large++; largeSize+=len} else {small++; smallSize+=len}} END {print large, largeSize, small, smallSize}' $i;done
> blobs-1512664032264.txt
> 245301 3310224358 173096 35473656
> blobs-1512698405656.txt
> 370373 4443957885 256775 52997864
> blobs-1512987450004.txt
> 660669 6214740439 461168 92017554
> blobs-1513130410963.txt
> 569083 5490965583 406756 80124598
> blobs-1513216819447.txt
> 69876 1413561892 46238 9221956
> {noformat}
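
The threshold split done by the awk one-liner above, and the two ratio columns of the stats table, can be sketched in Java roughly as follows. This is only an illustration: the class name BlobIdSplit and its fields are hypothetical, and it assumes ASCII blob ids so that String length matches the length awk reports.

```java
class BlobIdSplit {
    long large, largeSize, small, smallSize;

    // Same rule as the awk script: the blob id is the first '|'-separated field.
    void add(String line, int threshold) {
        int sep = line.indexOf('|');
        int len = (sep < 0) ? line.length() : sep;
        if (len > threshold) {
            large++;
            largeSize += len;
        } else {
            small++;
            smallSize += len;
        }
    }

    // small_lines/total_lines column of the table.
    double smallLineRatio() { return (double) small / (large + small); }

    // small_size/total_size column of the table.
    double smallSizeRatio() { return (double) smallSize / (largeSize + smallSize); }

    public static void main(String[] args) {
        // Counts reported above for blobs-1512664032264.txt.
        BlobIdSplit row = new BlobIdSplit();
        row.large = 245301; row.largeSize = 3310224358L;
        row.small = 173096; row.smallSize = 35473656L;
        System.out.println(row.smallLineRatio() + " " + row.smallSizeRatio());
    }
}
```

With the reported counts, roughly 41% of the entries are small ids, yet they account for only about 1% of the bytes, which is the point the table makes.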





[jira] [Created] (OAK-7096) Compation should log generation info

2017-12-20 Thread JIRA
Michael Dürig created OAK-7096:
--

 Summary: Compation should log generation info
 Key: OAK-7096
 URL: https://issues.apache.org/jira/browse/OAK-7096
 Project: Jackrabbit Oak
  Issue Type: Improvement
  Components: segment-tar
Reporter: Michael Dürig
Assignee: Michael Dürig
Priority: Minor
 Fix For: 1.8


When compaction starts it should also log the current gc generation and the new 
gc generation it is going to create. 





[jira] [Updated] (OAK-7080) Segment-Tar-Cold fixture should have options for secure communication and one shot runs

2017-12-20 Thread Marcel Reutegger (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-7080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcel Reutegger updated OAK-7080:
--
Fix Version/s: (was: 1.8.)
   1.8

> Segment-Tar-Cold fixture should have options for secure communication and one 
> shot runs
> ---
>
> Key: OAK-7080
> URL: https://issues.apache.org/jira/browse/OAK-7080
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: benchmarks, segment-tar, tarmk-standby
>Reporter: Andrei Dulceanu
>Assignee: Andrei Dulceanu
>Priority: Minor
>  Labels: cold-standby
> Fix For: 1.7.13, 1.8
>
> Attachments: OAK-7080.patch
>
>
> The newly introduced {{Segment-Tar-Cold}} fixture should support secure 
> communication between primary and standby via a {{--secure}} option. 
> Moreover, the current implementation allows only for continuous sync between 
> primary and standby. It should be possible to allow a "one-shot run" of the 
> sync to easily measure and compare specific metrics ({{--oneShotRun}} option).





[jira] [Created] (OAK-7097) DocumentStoreIndexer should clear the index state prior to indexing

2017-12-20 Thread Chetan Mehrotra (JIRA)
Chetan Mehrotra created OAK-7097:


 Summary: DocumentStoreIndexer should clear the index state prior 
to indexing
 Key: OAK-7097
 URL: https://issues.apache.org/jira/browse/OAK-7097
 Project: Jackrabbit Oak
  Issue Type: Bug
  Components: run
Reporter: Chetan Mehrotra
Assignee: Chetan Mehrotra
 Fix For: 1.8, 1.7.14


DocumentStoreIndexer currently implements part of the logic which is present 
in IndexUpdate. However, it misses two things:

# Removing the hidden index state
# Resetting the reindexing flag

Those should be implemented





[jira] [Updated] (OAK-7097) DocumentStoreIndexer should clear the index state prior to indexing

2017-12-20 Thread Chetan Mehrotra (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-7097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Mehrotra updated OAK-7097:
-
Affects Version/s: 1.7.13

> DocumentStoreIndexer should clear the index state prior to indexing
> ---
>
> Key: OAK-7097
> URL: https://issues.apache.org/jira/browse/OAK-7097
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: run
>Affects Versions: 1.7.13
>Reporter: Chetan Mehrotra
>Assignee: Chetan Mehrotra
> Fix For: 1.8, 1.7.14
>
>
> DocumentStoreIndexer currently implements part of the logic which is present 
> in IndexUpdate. However, it misses two things:
> # Removing the hidden index state
> # Resetting the reindexing flag
> Those should be implemented





[jira] [Created] (OAK-7098) Refcator common logic between IndexUpdate and DocumentStoreIndexer

2017-12-20 Thread Chetan Mehrotra (JIRA)
Chetan Mehrotra created OAK-7098:


 Summary: Refcator common logic between IndexUpdate and 
DocumentStoreIndexer
 Key: OAK-7098
 URL: https://issues.apache.org/jira/browse/OAK-7098
 Project: Jackrabbit Oak
  Issue Type: Task
  Components: indexing, run
Reporter: Chetan Mehrotra
 Fix For: 1.10


DocumentStoreIndexer implements an alternative way of indexing which differs 
from the diff-based indexing done by IndexUpdate. However, some of the logic 
is common.

We should refactor and abstract out that logic so both can share it.





[jira] [Updated] (OAK-7098) Refactor common logic between IndexUpdate and DocumentStoreIndexer

2017-12-20 Thread Chetan Mehrotra (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-7098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Mehrotra updated OAK-7098:
-
Summary: Refactor common logic between IndexUpdate and DocumentStoreIndexer 
 (was: Refcator common logic between IndexUpdate and DocumentStoreIndexer)

> Refactor common logic between IndexUpdate and DocumentStoreIndexer
> --
>
> Key: OAK-7098
> URL: https://issues.apache.org/jira/browse/OAK-7098
> Project: Jackrabbit Oak
>  Issue Type: Task
>  Components: indexing, run
>Reporter: Chetan Mehrotra
> Fix For: 1.10
>
>
> DocumentStoreIndexer implements an alternative way of indexing which differs 
> from the diff-based indexing done by IndexUpdate. However, some of the logic 
> is common.
> We should refactor and abstract out that logic so both can share it.





[jira] [Updated] (OAK-7096) Compaction should log generation info

2017-12-20 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/OAK-7096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Dürig updated OAK-7096:
---
Summary: Compaction should log generation info  (was: Compation should log 
generation info)

> Compaction should log generation info
> -
>
> Key: OAK-7096
> URL: https://issues.apache.org/jira/browse/OAK-7096
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: segment-tar
>Reporter: Michael Dürig
>Assignee: Michael Dürig
>Priority: Minor
>  Labels: compaction, gc
> Fix For: 1.8
>
>
> When compaction starts it should also log the current gc generation and the 
> new gc generation it is going to create. 





[jira] [Resolved] (OAK-7096) Compaction should log generation info

2017-12-20 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/OAK-7096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Dürig resolved OAK-7096.

Resolution: Fixed

Fixed at http://svn.apache.org/viewvc?rev=1818759&view=rev

> Compaction should log generation info
> -
>
> Key: OAK-7096
> URL: https://issues.apache.org/jira/browse/OAK-7096
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: segment-tar
>Reporter: Michael Dürig
>Assignee: Michael Dürig
>Priority: Minor
>  Labels: compaction, gc
> Fix For: 1.8
>
>
> When compaction starts it should also log the current gc generation and the 
> new gc generation it is going to create. 





[jira] [Created] (OAK-7099) DocumentStoreIndexer should log estimate of ETA for dumping phase

2017-12-20 Thread Chetan Mehrotra (JIRA)
Chetan Mehrotra created OAK-7099:


 Summary: DocumentStoreIndexer should log estimate of ETA for 
dumping phase
 Key: OAK-7099
 URL: https://issues.apache.org/jira/browse/OAK-7099
 Project: Jackrabbit Oak
  Issue Type: Improvement
  Components: run
Reporter: Chetan Mehrotra
Assignee: Chetan Mehrotra
Priority: Minor
 Fix For: 1.8, 1.7.14


DocumentStoreIndexer currently does not log ETA for dumping phase





[jira] [Comment Edited] (OAK-4112) Replace the query exclusive lock with a cache tracker

2017-12-20 Thread Julian Reschke (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-4112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15264015#comment-15264015
 ] 

Julian Reschke edited comment on OAK-4112 at 12/20/17 9:36 AM:
---

I notice that the RDB variant had some DEBUG level logic so we can inspect the 
Bloom filter's performance when done.

{noformat}
if (LOG.isDebugEnabled()) {
    if (filter != null) {
        LOG.debug("Disposing QueryContext for range " + fromKey + "..." + toKey
                + " - filter fpp was: " + filter.expectedFpp());
    } else {
        LOG.debug("Disposing QueryContext for range " + fromKey + "..." + toKey
                + " - no filter was needed");
    }
}
{noformat}

Should we re-add that?


was (Author: reschke):
I notice that the RDB variant had some DEBUG level logic so we can onpsect the 
Bloom filter's performance when done.

{noformat}
if (LOG.isDebugEnabled()) {
    if (filter != null) {
        LOG.debug("Disposing QueryContext for range " + fromKey + "..." + toKey
                + " - filter fpp was: " + filter.expectedFpp());
    } else {
        LOG.debug("Disposing QueryContext for range " + fromKey + "..." + toKey
                + " - no filter was needed");
    }
}
{noformat}

Should we re-add that?

> Replace the query exclusive lock with a cache tracker
> -
>
> Key: OAK-4112
> URL: https://issues.apache.org/jira/browse/OAK-4112
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: documentmk, mongomk
>Reporter: Tomek Rękawek
>Assignee: Tomek Rękawek
>  Labels: performance
> Fix For: 1.4.6, 1.5.2, 1.6.0
>
> Attachments: OAK-4112-1.patch, OAK-4112-2.patch, OAK-4112-3.patch, 
> OAK-4112-4.patch, OAK-4112-putifnewer.patch, OAK-4112.patch
>
>
> The {{MongoDocumentStore#query()}} method uses an expensive 
> {{TreeLock#acquireExclusive}} method, introduced in OAK-1897 to avoid caching 
> outdated documents.
> It should be possible to avoid acquiring the exclusive lock, by tracking the 
> cache changes that occur during the Mongo find() operation. When the find() 
> is done, we can update the cache with the received documents if they haven't 
> been invalidated in the meantime.
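
The idea described above can be sketched as follows. This is a simplified illustration and not the code from the attached patches: the class TrackedCache and its methods are hypothetical, documents are reduced to strings, and a plain HashSet stands in for the Bloom filter mentioned in the comments on this issue. Coarse synchronization keeps the sketch correct at the cost of concurrency.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class TrackedCache {
    private final Map<String, String> cache = new HashMap<>();
    // Trackers of queries that are currently in flight.
    private final Set<Tracker> trackers = new HashSet<>();

    // Records every key invalidated while the owning query is running.
    final class Tracker implements AutoCloseable {
        private final Set<String> invalidated = new HashSet<>();

        @Override
        public void close() {
            synchronized (TrackedCache.this) {
                trackers.remove(this);
            }
        }
    }

    // Call before issuing the (unlocked) query.
    synchronized Tracker startTracking() {
        Tracker t = new Tracker();
        trackers.add(t);
        return t;
    }

    synchronized void invalidate(String key) {
        cache.remove(key);
        for (Tracker t : trackers) {
            t.invalidated.add(key);
        }
    }

    // Call after the query returns: cache only documents that were not
    // invalidated while the query was running.
    synchronized void putQueryResults(Map<String, String> docs, Tracker t) {
        for (Map.Entry<String, String> e : docs.entrySet()) {
            if (!t.invalidated.contains(e.getKey())) {
                cache.put(e.getKey(), e.getValue());
            }
        }
    }

    synchronized String get(String key) {
        return cache.get(key);
    }
}
```

A caller would open a tracker, run the query without holding any lock, pass the results to putQueryResults, and then close the tracker. Replacing the set with a Bloom filter bounds memory, at the price of occasionally skipping a document that was not actually invalidated.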





[jira] [Comment Edited] (OAK-4112) Replace the query exclusive lock with a cache tracker

2017-12-20 Thread Julian Reschke (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-4112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15306268#comment-15306268
 ] 

Julian Reschke edited comment on OAK-4112 at 12/20/17 9:36 AM:
---

Just for record - Current BloomFilter usage involves synchronized access. If it 
turns out that it had quite adverse impact on concurrency we might need to 
revisit this. Couple of related links around this
* http://stackoverflow.com/questions/11720111/thread-safe-bloomfilter
* https://github.com/google/guava/issues/1090 - Guava issue for having a 
concurrent BloomFilter
* A possible concurrent BloomFilter implementation 
https://github.com/ifesdjeen/blomstre adapted from Cassandra implementation

Note it impacts reads also but only those key ranges for which some update is 
in progress. Implementation wise it would have similar behaviour as TreeLock 
used by MongoDocumentStore earlier


was (Author: chetanm):
Just for record - Current BloomFilter usage involves synchornized access. If it 
turns out that it had quite adverse impact on concurrency we might need to 
revisit this. Couple of related links around this
* http://stackoverflow.com/questions/11720111/thread-safe-bloomfilter
* https://github.com/google/guava/issues/1090 - Guava issue for having a 
concurrent BloomFilter
* A possible concurrent BloomFilter implementation 
https://github.com/ifesdjeen/blomstre adapted from Cassandra implementation

Note it impacts reads also but only those key ranges for which some update is 
in progress. Implementation wise it would have similar behaviour as TreeLock 
used by MongoDocumentStore earlier

> Replace the query exclusive lock with a cache tracker
> -
>
> Key: OAK-4112
> URL: https://issues.apache.org/jira/browse/OAK-4112
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: documentmk, mongomk
>Reporter: Tomek Rękawek
>Assignee: Tomek Rękawek
>  Labels: performance
> Fix For: 1.4.6, 1.5.2, 1.6.0
>
> Attachments: OAK-4112-1.patch, OAK-4112-2.patch, OAK-4112-3.patch, 
> OAK-4112-4.patch, OAK-4112-putifnewer.patch, OAK-4112.patch
>
>
> The {{MongoDocumentStore#query()}} method uses an expensive 
> {{TreeLock#acquireExclusive}} method, introduced in OAK-1897 to avoid caching 
> outdated documents.
> It should be possible to avoid acquiring the exclusive lock, by tracking the 
> cache changes that occur during the Mongo find() operation. When the find() 
> is done, we can update the cache with the received documents if they haven't 
> been invalidated in the meantime.





[jira] [Resolved] (OAK-7097) DocumentStoreIndexer should clear the index state prior to indexing

2017-12-20 Thread Chetan Mehrotra (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-7097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Mehrotra resolved OAK-7097.
--
Resolution: Fixed

Done with 1818758

> DocumentStoreIndexer should clear the index state prior to indexing
> ---
>
> Key: OAK-7097
> URL: https://issues.apache.org/jira/browse/OAK-7097
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: run
>Affects Versions: 1.7.13
>Reporter: Chetan Mehrotra
>Assignee: Chetan Mehrotra
> Fix For: 1.8, 1.7.14
>
>
> DocumentStoreIndexer currently implements part of the logic which is present 
> in IndexUpdate. However, it misses two things:
> # Removing the hidden index state
> # Resetting the reindexing flag
> Those should be implemented





[jira] [Resolved] (OAK-7099) DocumentStoreIndexer should log estimate of ETA for dumping phase

2017-12-20 Thread Chetan Mehrotra (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-7099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Mehrotra resolved OAK-7099.
--
Resolution: Fixed

Done with 1818768

> DocumentStoreIndexer should log estimate of ETA for dumping phase
> -
>
> Key: OAK-7099
> URL: https://issues.apache.org/jira/browse/OAK-7099
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: run
>Reporter: Chetan Mehrotra
>Assignee: Chetan Mehrotra
>Priority: Minor
> Fix For: 1.8, 1.7.14
>
>
> DocumentStoreIndexer currently does not log ETA for dumping phase





[jira] [Updated] (OAK-7093) ActiveDelete synchronization with BlobTracker leaves temp files

2017-12-20 Thread Amit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-7093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amit Jain updated OAK-7093:
---
Fix Version/s: 1.7.14

> ActiveDelete synchronization with BlobTracker leaves temp files
> ---
>
> Key: OAK-7093
> URL: https://issues.apache.org/jira/browse/OAK-7093
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: blob-plugins
>Reporter: Amit Jain
>Assignee: Amit Jain
> Fix For: 1.8, 1.7.14
>
>
> When synchronizing active deleted files with blob tracker a temp file is 
> created and not removed.





[jira] [Updated] (OAK-6881) indirect test dependencies through tika-parser to vulnerable version of xmpcore

2017-12-20 Thread Julian Reschke (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-6881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Reschke updated OAK-6881:

Labels: candidate_oak_1_6  (was: )

> indirect test dependencies through tika-parser to vulnerable version of 
> xmpcore
> ---
>
> Key: OAK-6881
> URL: https://issues.apache.org/jira/browse/OAK-6881
> Project: Jackrabbit Oak
>  Issue Type: Task
>  Components: parent
>Reporter: Julian Reschke
>Assignee: Julian Reschke
>  Labels: candidate_oak_1_6
> Fix For: 1.8
>
>
> We should upgrade to a version of tika-parser that fixes 
> https://issues.apache.org/jira/browse/TIKA-2486





[jira] [Updated] (OAK-6881) indirect test dependencies through tika-parser to vulnerable version of xmpcore

2017-12-20 Thread Julian Reschke (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-6881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Reschke updated OAK-6881:

Component/s: parent

> indirect test dependencies through tika-parser to vulnerable version of 
> xmpcore
> ---
>
> Key: OAK-6881
> URL: https://issues.apache.org/jira/browse/OAK-6881
> Project: Jackrabbit Oak
>  Issue Type: Task
>  Components: parent
>Reporter: Julian Reschke
>Assignee: Julian Reschke
>  Labels: candidate_oak_1_6
> Fix For: 1.8
>
>
> We should upgrade to a version of tika-parser that fixes 
> https://issues.apache.org/jira/browse/TIKA-2486





[jira] [Updated] (OAK-6501) Support adding or updating index definitions via oak-run: JSON data format

2017-12-20 Thread Thomas Mueller (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-6501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Mueller updated OAK-6501:

Fix Version/s: (was: 1.8)
   1.10

> Support adding or updating index definitions via oak-run: JSON data format
> --
>
> Key: OAK-6501
> URL: https://issues.apache.org/jira/browse/OAK-6501
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>Reporter: Thomas Mueller
>Assignee: Thomas Mueller
> Fix For: 1.10
>
>
> In OAK-6471 we have support for index definitions via JSON.
> I'm not happy with the escaping (OAK-6476) ("If the string starts with 
> namespace..."), I think it's a bit dangerous. Need to investigate whether 
> this prevents importing index definitions exported via JSON 
> (localhost:/oak:index/lucene.tidy.-1.json).





[jira] [Updated] (OAK-6892) Query: ability to "nicely" traverse

2017-12-20 Thread Thomas Mueller (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-6892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Mueller updated OAK-6892:

Fix Version/s: (was: 1.9.0)
   1.10

> Query: ability to "nicely" traverse
> ---
>
> Key: OAK-6892
> URL: https://issues.apache.org/jira/browse/OAK-6892
> Project: Jackrabbit Oak
>  Issue Type: New Feature
>  Components: query
>Reporter: Thomas Mueller
>Assignee: Thomas Mueller
> Fix For: 1.10
>
>
> Currently, queries that traverse many nodes log a warning, or can even fail 
> (if configured). This is to ensure system resources are not blocked (CPU, 
> I/O, memory).
> But there are cases where it doesn't make sense to create an index, but 
> traverse (a certain path structure, or sometimes even the whole repository). 
> For example, finding a text with "like '%xxx%'". The problem isn't that it's 
> slow; the problem is that it's blocking / slowing down other users. Another 
> example is during migration, where the alternative is to create an index 
> (which also traverses the repository).
> One option is to allow such queries to run, but throttle them. We could add 
> the hint {{option(traversal throttle)}} to do that. Throttle means: don't use 
> up all I/O, but yield to other tasks depending on config settings (during 
> migration, yield is not needed). As a rule of thumb, the longer the query 
> runs, the more it should yield (up to some value).
> It would be good to allow stopping such queries and getting progress 
> information. The easiest solution might be over JMX; a more advanced 
> solution is a new API (for example, an interface QueryTraversalObserver 
> that our QueryResult implements).
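
One possible reading of the rule of thumb quoted above (yield more the longer the query runs, up to a cap) is sketched below. The class TraversalThrottle, the method name, and both parameters are assumptions made for illustration; the issue does not specify a concrete policy.

```java
class TraversalThrottle {
    // Hypothetical policy: after each batch of traversed nodes, pause for
    // elapsed / divisor milliseconds, but never longer than maxPauseMillis.
    static long pauseMillis(long elapsedMillis, long divisor, long maxPauseMillis) {
        if (elapsedMillis <= 0 || divisor <= 0) {
            return 0; // nothing to yield yet, or throttling disabled
        }
        return Math.min(elapsedMillis / divisor, maxPauseMillis);
    }

    public static void main(String[] args) throws InterruptedException {
        long start = System.currentTimeMillis();
        for (int batch = 0; batch < 3; batch++) {
            // ... traverse one batch of nodes here ...
            long elapsed = System.currentTimeMillis() - start;
            Thread.sleep(pauseMillis(elapsed, 100, 500));
        }
    }
}
```

With a divisor of 100 and a cap of 500 ms, a query that has run for 10 seconds pauses 100 ms per batch, and one that has run for two minutes pauses the full 500 ms. During migration the cap could be set to 0 so that no yielding happens.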





[jira] [Updated] (OAK-6897) XPath query: option to _not_ convert "or" to "union"

2017-12-20 Thread Thomas Mueller (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-6897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Mueller updated OAK-6897:

Fix Version/s: (was: 1.9.0)
   1.10

> XPath query: option to _not_ convert "or" to "union"
> 
>
> Key: OAK-6897
> URL: https://issues.apache.org/jira/browse/OAK-6897
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: query
>Reporter: Thomas Mueller
>Assignee: Thomas Mueller
>Priority: Trivial
> Fix For: 1.10
>
>
> Right now, all XPath queries that contain "or" of the form "@a=1 or @b=2" are 
> converted to SQL-2 "union". In some cases, this is a problem, especially in 
> combination with "order by @jcr:score desc".
> Now that SQL-2 "or" conditions can be converted to union (depending on whether 
> union has a lower cost), it is no longer strictly needed to do the union 
> conversion in the XPath conversion. Alternatively, the XPath conversion could 
> at least emit different SQL-2 queries and take the one with the lowest cost.





[jira] [Updated] (OAK-6515) Decouple indexing and upload to datastore

2017-12-20 Thread Thomas Mueller (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-6515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Mueller updated OAK-6515:

Fix Version/s: (was: 1.9.0)
   1.10

> Decouple indexing and upload to datastore
> -
>
> Key: OAK-6515
> URL: https://issues.apache.org/jira/browse/OAK-6515
> Project: Jackrabbit Oak
>  Issue Type: New Feature
>  Components: indexing, lucene, query
>Reporter: Thomas Mueller
>Assignee: Thomas Mueller
>Priority: Minor
> Fix For: 1.10
>
>
> Currently the default async index delay is 5 seconds. Using a larger delay 
> (e.g. 15 seconds) reduces index related growth, however diffing is delayed 15 
> seconds, which can reduce indexing performance. 
> One option (which might require bigger changes) is to index every 5 seconds, 
> and store the index every 5 seconds in the local directory, but only write to 
> the datastore / nodestore every 3rd time (that is, every 15 seconds), 
> so that other cluster nodes only see the index update every 15 seconds. 
> The diffing is done every 5 seconds, and the local index could be used every 
> 5 or every 15 seconds.
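
The scheduling part of this proposal can be sketched as a simple cycle counter: every cycle indexes locally, and only every Nth cycle also writes to the shared store. The class DecoupledIndexCycle and its method are hypothetical names used for illustration only; no Oak API is involved.

```java
class DecoupledIndexCycle {
    private final int persistEvery;
    private int cycles;

    // persistEvery = 3 with a 5 second cycle gives a 15 second upload interval.
    DecoupledIndexCycle(int persistEvery) {
        if (persistEvery < 1) {
            throw new IllegalArgumentException("persistEvery must be >= 1");
        }
        this.persistEvery = persistEvery;
    }

    // Called once per indexing cycle, after the local index has been updated.
    // Returns true when this cycle should also write to the datastore / nodestore.
    boolean onIndexCycle() {
        cycles++;
        if (cycles == persistEvery) {
            cycles = 0;
            return true;
        }
        return false;
    }
}
```

Resetting the counter instead of letting it grow avoids any overflow concern for a long-running process.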





[jira] [Updated] (OAK-5923) Document S3 datastore

2017-12-20 Thread Marcel Reutegger (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-5923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcel Reutegger updated OAK-5923:
--
Component/s: doc

> Document S3 datastore
> -
>
> Key: OAK-5923
> URL: https://issues.apache.org/jira/browse/OAK-5923
> Project: Jackrabbit Oak
>  Issue Type: Documentation
>  Components: blob, doc
>Reporter: Alexander Klimetschek
>Assignee: Amit Jain
> Fix For: 1.8
>
>
> The S3 datastore is currently hardly documented.
> The [generic blobstore 
> documentation|http://jackrabbit.apache.org/oak/docs/plugins/blobstore.html] 
> is very much focused on the internal class structures, but quite confusing 
> for someone who wants to configure a specific datastore such as file and s3 
> (the only ones right now). S3 settings are not documented at all, the [config 
> page|http://jackrabbit.apache.org/oak/docs/osgi_config.html#config-blobstore] 
> only mentions the generic maxCachedBinarySize and cacheSizeInMB.
> The best bet is the [Adobe AEM product 
> documentation|https://docs.adobe.com/docs/en/aem/6-2/deploy/platform/data-store-config.html],
>  but that is for an older version and a few things changed since then.
> Specific items below. Some have been confusing people using oak-blob-cloud 
> 1.5.15:
> - "secret" property unclear (new)
> - secretKey & accessKey can be omitted to leverage IAM roles (new)
> - drop of proactiveCaching property (new)
> - aws bucket/region/etc. settings
> - config options (timeout, retries, threads)
> - understanding caching behavior and performance optimization
> - shared vs. non-shared options
> - migrating from a previous version, how to update the config
> - requirements on the AWS (account) side





[jira] [Updated] (OAK-6373) oak-run check should also check checkpoints

2017-12-20 Thread Marcel Reutegger (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-6373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcel Reutegger updated OAK-6373:
--
Component/s: run

> oak-run check should also check checkpoints 
> 
>
> Key: OAK-6373
> URL: https://issues.apache.org/jira/browse/OAK-6373
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: run, segment-tar
>Reporter: Michael Dürig
>Assignee: Andrei Dulceanu
>  Labels: tooling
> Fix For: 1.8
>
>
> {{oak-run check}} does currently *not* traverse and check the items in the 
> checkpoint. I think we should change this and add an option to traverse all, 
> some or none of the checkpoints. When doing this we need to keep in mind the 
> interaction of this new feature with the {{filter}} option: the paths passed 
> through this option then need to be prefixed with {{/root}}. 





[jira] [Updated] (OAK-5792) TarMK: Implement tooling to repair broken nodes

2017-12-20 Thread Marcel Reutegger (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-5792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcel Reutegger updated OAK-5792:
--
Component/s: run

> TarMK: Implement tooling to repair broken nodes
> ---
>
> Key: OAK-5792
> URL: https://issues.apache.org/jira/browse/OAK-5792
> Project: Jackrabbit Oak
>  Issue Type: New Feature
>  Components: run, segment-tar
>Reporter: Michael Dürig
>Assignee: Andrei Dulceanu
>  Labels: production, tooling
> Fix For: 1.8
>
>
> With {{oak-run check}} we can determine the last good revision of a 
> repository and use it to manually roll back a corrupted segment store. 
> Complementary to this we should implement a tool to roll forward a broken 
> revision to a fixed new revision. Such a tool needs to detect which items are 
> affected by a corruption and replace these items with markers. With this the 
> repository could brought back online and the markers could be used to 
> identify the locations in the tree where further manual action might be 
> needed. 





[jira] [Commented] (OAK-6737) Standby server should send timely responses to all client requests

2017-12-20 Thread Marcel Reutegger (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-6737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16298249#comment-16298249
 ] 

Marcel Reutegger commented on OAK-6737:
---

[~dulceanu], we are approaching the 1.8 release, do you still want to include 
this in the release? Otherwise please re-schedule. 

> Standby server should send timely responses to all client requests
> --
>
> Key: OAK-6737
> URL: https://issues.apache.org/jira/browse/OAK-6737
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: segment-tar, tarmk-standby
>Reporter: Andrei Dulceanu
>Assignee: Andrei Dulceanu
>Priority: Minor
>  Labels: cold-standby
> Fix For: 1.8
>
>
> Currently all the {{GetXXXRequestHandler}} classes (where XXX stands for Blob, 
> Head, References and Segment) on the server discard client requests which 
> cannot be satisfied (i.e. the requested object does not exist (yet) on the 
> server). 
> A more transparent approach would be to respond promptly to all client 
> requests, clearly stating that the object was not found. This would greatly 
> improve debugging, for example, because all requests and their responses could 
> be easily followed from the client log, without needing to know what actually 
> happened on the server.
> Below, a possible implementation for {{GetHeadRequestHandler}}, suggested by 
> [~frm] in a comment on OAK-6678:
> {noformat}
> String id = reader.readHeadRecordId();
> if (id == null) {
>     ctx.writeAndFlush(new NotFoundGetHeadResponse(msg.getClientId(), id));
>     return;
> }
> ctx.writeAndFlush(new GetHeadResponse(msg.getClientId(), id));
> {noformat}





[jira] [Commented] (OAK-6674) Create a more complex IT for cold standby

2017-12-20 Thread Marcel Reutegger (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-6674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16298251#comment-16298251
 ] 

Marcel Reutegger commented on OAK-6674:
---

[~dulceanu], we are approaching the 1.8 release, do you still want to include 
this in the release? Otherwise please re-schedule. 

> Create a more complex IT for cold standby
> -
>
> Key: OAK-6674
> URL: https://issues.apache.org/jira/browse/OAK-6674
> Project: Jackrabbit Oak
>  Issue Type: Task
>  Components: segment-tar, tarmk-standby
>Reporter: Andrei Dulceanu
>Assignee: Andrei Dulceanu
>  Labels: cold-standby
> Fix For: 1.8
>
>
> At the moment all integration tests for cold standby are using the same 
> scenario in their tests: some content is created on the server (including 
> binaries), a standby sync cycle is started and then the content is checked on 
> the client. The only twist here is using/not using a data store for storing 
> binaries.
> Although good, this model could be extended to cover many more cases. For 
> example, {{StandbyDiff}} covers the following 6 cases: node/property 
> added/changed/deleted. Of these, with the scenario described, the removal 
> part is never tested (and the change part is covered in only one test). 
> It would be nice to have an IT which would add content on the server, do a 
> sync, remove some of the content, do a sync and then call OnRC. This way all 
> cases will be covered, including if cleanup works as expected on the client.
> /cc [~frm]
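
The extended scenario described above (add content, sync, remove some content, sync again, verify) can be sketched without any Oak classes. This is only a toy model under stated assumptions: the server and client are plain maps, and `StandbySyncScenarioSketch` and its `sync` method are hypothetical names, not part of the cold standby implementation. It does not model OnRC or cleanup.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: the "repositories" are maps from path to value.
// None of these names exist in Oak; they are assumptions for illustration.
public class StandbySyncScenarioSketch {

    /**
     * Makes the client content identical to the server content.
     * Returns the number of entries removed from the client, so that a test
     * can assert that the removal case was actually exercised.
     */
    public static int sync(Map<String, String> server, Map<String, String> client) {
        int before = client.size();
        // Removal case: drop everything the server no longer has.
        client.keySet().retainAll(server.keySet());
        int removed = before - client.size();
        // Added and changed cases: copy the current server state.
        client.putAll(server);
        return removed;
    }

    public static void main(String[] args) {
        Map<String, String> server = new HashMap<>();
        Map<String, String> client = new HashMap<>();

        server.put("/content/a", "v1");
        server.put("/content/b", "v1");
        sync(server, client);                 // first cycle: additions only

        server.remove("/content/b");          // removal on the server
        server.put("/content/a", "v2");       // change on the server
        int removed = sync(server, client);   // second cycle

        System.out.println("removed=" + removed + ", inSync=" + client.equals(server));
    }
}
```

The point of returning the removal count is that a test can fail if the removal path was never taken, which is the gap the issue describes.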





[jira] [Commented] (OAK-6674) Create a more complex IT for cold standby

2017-12-20 Thread JIRA

[ 
https://issues.apache.org/jira/browse/OAK-6674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16298253#comment-16298253
 ] 

Michael Dürig commented on OAK-6674:


This affects tests only. 

> Create a more complex IT for cold standby
> -
>
> Key: OAK-6674
> URL: https://issues.apache.org/jira/browse/OAK-6674
> Project: Jackrabbit Oak
>  Issue Type: Task
>  Components: segment-tar, tarmk-standby
>Reporter: Andrei Dulceanu
>Assignee: Andrei Dulceanu
>  Labels: cold-standby, test
> Fix For: 1.8
>
>
> At the moment all integration tests for cold standby are using the same 
> scenario in their tests: some content is created on the server (including 
> binaries), a standby sync cycle is started and then the content is checked on 
> the client. The only twist here is using/not using a data store for storing 
> binaries.
> Although good, this model could be extended to cover many more cases. For 
> example, {{StandbyDiff}} covers the following 6 cases: node/property 
> added/changed/deleted. Of these, with the scenario described, the removal 
> part is never tested (and the change part is covered in only one test). 
> It would be nice to have an IT which would add content on the server, do a 
> sync, remove some of the content, do a sync and then call OnRC. This way all 
> cases will be covered, including if cleanup works as expected on the client.
> /cc [~frm]





[jira] [Updated] (OAK-6674) Create a more complex IT for cold standby

2017-12-20 Thread Marcel Reutegger (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-6674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcel Reutegger updated OAK-6674:
--
Labels: cold-standby test  (was: cold-standby)

> Create a more complex IT for cold standby
> -
>
> Key: OAK-6674
> URL: https://issues.apache.org/jira/browse/OAK-6674
> Project: Jackrabbit Oak
>  Issue Type: Task
>  Components: segment-tar, tarmk-standby
>Reporter: Andrei Dulceanu
>Assignee: Andrei Dulceanu
>  Labels: cold-standby, test
> Fix For: 1.8
>
>
> At the moment all integration tests for cold standby are using the same 
> scenario in their tests: some content is created on the server (including 
> binaries), a standby sync cycle is started and then the content is checked on 
> the client. The only twist here is using/not using a data store for storing 
> binaries.
> Although good, this model could be extended to cover many more cases. For 
> example, {{StandbyDiff}} covers the following 6 cases: node/property 
> added/changed/deleted. Of these, with the scenario described, the removal 
> part is never tested (and the change part is covered in only one test). 
> It would be nice to have an IT which would add content on the server, do a 
> sync, remove some of the content, do a sync and then call OnRC. This way all 
> cases will be covered, including if cleanup works as expected on the client.
> /cc [~frm]





[jira] [Commented] (OAK-5884) Evaluate utility of RepositoryGrowthTest benchmark

2017-12-20 Thread Marcel Reutegger (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-5884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16298254#comment-16298254
 ] 

Marcel Reutegger commented on OAK-5884:
---

[~dulceanu], we are approaching the 1.8 release, do you still want to include 
this in the release? Otherwise please re-schedule. 

> Evaluate utility of RepositoryGrowthTest benchmark
> --
>
> Key: OAK-5884
> URL: https://issues.apache.org/jira/browse/OAK-5884
> Project: Jackrabbit Oak
>  Issue Type: Task
>  Components: run, segment-tar
>Reporter: Andrei Dulceanu
>Assignee: Andrei Dulceanu
>Priority: Minor
>  Labels: benchmark
> Fix For: 1.8
>
>
> {{RepositoryGrowthTest}} is a benchmark which makes use of the deprecated 
> {{SegmentFixture}}. Since OAK-5834 removes the old {{oak-segment}} module and 
> the code associated with it, {{RepositoryGrowthTest}} was also removed. If 
> there's value in it, we can adapt it to work with the new 
> {{SegmentTarFixture}}.
> /cc [~chetanm]





[jira] [Updated] (OAK-4177) Tests on Mongo should fail if mongo is not available

2017-12-20 Thread Marcel Reutegger (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-4177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcel Reutegger updated OAK-4177:
--
Fix Version/s: (was: 1.8)
   1.10

> Tests on Mongo should fail if mongo is not available
> 
>
> Key: OAK-4177
> URL: https://issues.apache.org/jira/browse/OAK-4177
> Project: Jackrabbit Oak
>  Issue Type: Test
>Reporter: Davide Giannella
>Assignee: Davide Giannella
> Fix For: 1.10
>
>
> Most if not all of the IT/UT that run against mongodb have an
> assumption at class level that if mongodb is not available the tests
> are skipped.
> The tests should fail instead if mongodb is not available and we
> explicitly said that, via the {{nsfixtures}} flags, we want to run the
> tests against mongodb.
> We currently have 4 fixtures/flags: DOCUMENT_NS, SEGMENT_MK,
> DOCUMENT_RDB, MEMORY_NS.
> https://github.com/apache/jackrabbit-oak/blob/f957b6787eb7a70eba454ceb1cae90bd4d47f15c/oak-commons/src/test/java/org/apache/jackrabbit/oak/commons/FixturesHelper.java#L46
> We may need to introduce a new Fixture/Flag that indicates
> that we want to run the tests against Document using the in-memory
> implementation. For example: DOCUMENT_NS_IM.
> This will be useful on the Apache Jenkins as we don't have mongo there
> but we still want to run all the possible Document NS tests against
> the in-memory implementation when this is possible.
> /cc [~mreutegg]
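
The behaviour requested above boils down to a three-way decision per test class: run, skip, or fail. A minimal sketch of that decision follows, assuming a hypothetical helper class; only the four fixture names come from the issue, and the `explicitlyRequested` flag is an assumed stand-in for "the user set {{nsfixtures}} on purpose". It is not the logic of the real FixturesHelper.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical helper for illustration; not the FixturesHelper from oak-commons.
public class FixtureAvailabilitySketch {

    // Fixture names as listed in the issue description.
    public enum Fixture { DOCUMENT_NS, SEGMENT_MK, DOCUMENT_RDB, MEMORY_NS }

    public enum Outcome { RUN, SKIP, FAIL }

    /**
     * Decides what a MongoDB-backed test should do.
     *
     * @param requested           fixtures selected for this build
     * @param explicitlyRequested true if the selection was made explicitly
     *                            (assumption: via the nsfixtures flag)
     * @param mongoAvailable      true if a MongoDB instance could be reached
     */
    public static Outcome decide(Set<Fixture> requested,
                                 boolean explicitlyRequested,
                                 boolean mongoAvailable) {
        if (!requested.contains(Fixture.DOCUMENT_NS)) {
            return Outcome.SKIP;   // MongoDB tests were not asked for
        }
        if (mongoAvailable) {
            return Outcome.RUN;
        }
        // MongoDB was asked for but is missing: fail loudly only when the
        // request was explicit, otherwise keep today's skip behaviour.
        return explicitlyRequested ? Outcome.FAIL : Outcome.SKIP;
    }

    public static void main(String[] args) {
        Set<Fixture> requested = EnumSet.of(Fixture.DOCUMENT_NS);
        System.out.println(decide(requested, true, false));
    }
}
```

In JUnit terms, SKIP would map to a failed assumption and FAIL to a failed assertion; that mapping is left out here to keep the sketch free of test-framework dependencies.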





[jira] [Commented] (OAK-6460) Index related tooling

2017-12-20 Thread Marcel Reutegger (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-6460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16298256#comment-16298256
 ] 

Marcel Reutegger commented on OAK-6460:
---

[~chetanm], can we resolve this epic for 1.8 and move the remaining issues to a 
new epic for 1.10?

> Index related tooling
> -
>
> Key: OAK-6460
> URL: https://issues.apache.org/jira/browse/OAK-6460
> Project: Jackrabbit Oak
>  Issue Type: Epic
>  Components: indexing, run
>Reporter: Chetan Mehrotra
>Assignee: Chetan Mehrotra
> Fix For: 1.8
>
>
> To enable better management of indexing-related operations, especially around 
> reindexing on large repository setups, we should implement some tooling 
> as part of oak-run. This epic is meant to track all work done in this area.





[jira] [Commented] (OAK-5884) Evaluate utility of RepositoryGrowthTest benchmark

2017-12-20 Thread JIRA

[ 
https://issues.apache.org/jira/browse/OAK-5884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16298260#comment-16298260
 ] 

Michael Dürig commented on OAK-5884:


This is test / benchmark code only

> Evaluate utility of RepositoryGrowthTest benchmark
> --
>
> Key: OAK-5884
> URL: https://issues.apache.org/jira/browse/OAK-5884
> Project: Jackrabbit Oak
>  Issue Type: Task
>  Components: run, segment-tar
>Reporter: Andrei Dulceanu
>Assignee: Andrei Dulceanu
>Priority: Minor
>  Labels: benchmark
> Fix For: 1.8
>
>
> {{RepositoryGrowthTest}} is a benchmark which makes use of the deprecated 
> {{SegmentFixture}}. Since OAK-5834 removes the old {{oak-segment}} module and 
> the code associated with it, {{RepositoryGrowthTest}} was also removed. If 
> there's value in it, we can adapt it to work with the new 
> {{SegmentTarFixture}}.
> /cc [~chetanm]





[jira] [Updated] (OAK-4857) Support space chars common in CJK inside node names

2017-12-20 Thread Marcel Reutegger (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-4857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcel Reutegger updated OAK-4857:
--
Fix Version/s: (was: 1.8)
   1.10

Rescheduling to 1.10 because there was no progress for over a year.

> Support space chars common in CJK inside node names
> ---
>
> Key: OAK-4857
> URL: https://issues.apache.org/jira/browse/OAK-4857
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 1.4.7, 1.5.10
>Reporter: Alexander Klimetschek
>Assignee: Julian Reschke
> Fix For: 1.10
>
> Attachments: OAK-4857-tests.patch
>
>
> Oak (like Jackrabbit) does not allow spaces commonly used in CJK like 
> {{u3000}} (ideographic space) or {{u00A0}} (no-break space) _inside_ a node 
> name, while allowing some of them (the non breaking spaces) at the _beginning 
> or end_.
> They should be supported for better globalization readiness, and filesystems 
> allow them, making common filesystem to JCR mappings unnecessarily hard. 
> Escaping would be an option for applications, but there is currently no 
> utility method for it 
> ([Text.escapeIllegalJcrChars|https://jackrabbit.apache.org/api/2.8/org/apache/jackrabbit/util/Text.html#escapeIllegalJcrChars(java.lang.String)]
>  will not escape these spaces), nor is it documented for applications how to 
> do so.
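
Since the issue notes that no utility method exists for escaping these spaces, here is one way an application could do it on its own. This is a sketch under loud assumptions: the class name and the `%uXXXX` encoding are invented for illustration and are not a Jackrabbit or Oak convention, and the result is not the same as what Text.escapeIllegalJcrChars produces.

```java
import java.util.Locale;

// Hypothetical application-side helper; the %uXXXX format is an assumption,
// not an encoding defined by Jackrabbit or Oak.
public class SpaceEscapeSketch {

    // The two space characters named in the issue: U+3000 and U+00A0.
    static boolean isSpecialSpace(char c) {
        return c == '\u3000' || c == '\u00A0';
    }

    /** Replaces the special spaces (and '%' itself) with %uXXXX sequences. */
    public static String escape(String name) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < name.length(); i++) {
            char c = name.charAt(i);
            if (c == '%' || isSpecialSpace(c)) {
                // '%' is escaped too, so that unescape() stays unambiguous.
                sb.append(String.format(Locale.ROOT, "%%u%04X", (int) c));
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    /** Reverses escape(); expects input that escape() produced. */
    public static String unescape(String escaped) {
        StringBuilder sb = new StringBuilder();
        int i = 0;
        while (i < escaped.length()) {
            if (escaped.startsWith("%u", i) && i + 6 <= escaped.length()) {
                sb.append((char) Integer.parseInt(escaped.substring(i + 2, i + 6), 16));
                i += 6;
            } else {
                sb.append(escaped.charAt(i));
                i++;
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String name = "report\u30002017";
        String escaped = escape(name);
        System.out.println(escaped + " round-trips: " + unescape(escaped).equals(name));
    }
}
```

Escaping the percent sign as well is a design choice that keeps the mapping reversible for names that already contain a literal `%u`.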





[jira] [Updated] (OAK-7075) Document oak-run compact arguments and system properties

2017-12-20 Thread Marcel Reutegger (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-7075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcel Reutegger updated OAK-7075:
--
Component/s: doc

> Document oak-run compact arguments and system properties
> 
>
> Key: OAK-7075
> URL: https://issues.apache.org/jira/browse/OAK-7075
> Project: Jackrabbit Oak
>  Issue Type: Task
>  Components: doc, segment-tar
>Reporter: Michael Dürig
>Assignee: Michael Dürig
>  Labels: documentation
> Fix For: 1.8
>
>
> Ensure {{oak-doc}} is up to date with the current version of {{oak-run 
> compact}}, its current command line arguments and system properties. 





[jira] [Updated] (OAK-7024) java.security.acl deprecated in Java 10, marked for removal in Java 11

2017-12-20 Thread Julian Reschke (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-7024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Reschke updated OAK-7024:

Description: 
See  and 
.

Need to understand how this affects public Oak APIs, and what to do with them 
on Java 11 (which will be an LTS release we probably need to support with Oak 
1.10).


  was:
See  and 
.



> java.security.acl deprecated in Java 10, marked for removal in Java 11
> --
>
> Key: OAK-7024
> URL: https://issues.apache.org/jira/browse/OAK-7024
> Project: Jackrabbit Oak
>  Issue Type: Task
>  Components: security
>Reporter: Julian Reschke
> Fix For: 1.10
>
>
> See  and 
> .
> Need to understand how this affects public Oak APIs, and what to do with them 
> on Java 11 (which will be an LTS release we probably need to support with Oak 
> 1.10).





[jira] [Commented] (OAK-6710) CommitMitigated merge policy should not reduce performance significantly

2017-12-20 Thread Marcel Reutegger (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-6710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16298270#comment-16298270
 ] 

Marcel Reutegger commented on OAK-6710:
---

[~teofili], you committed some changes for this issue. Is there still some work 
pending or can this issue be resolved?

> CommitMitigated merge policy should not reduce performance significantly
> 
>
> Key: OAK-6710
> URL: https://issues.apache.org/jira/browse/OAK-6710
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: lucene
>Affects Versions: 1.7.7
>Reporter: Vikas Saurabh
>Assignee: Tommaso Teofili
>  Labels: performance, scalability
> Fix For: 1.8
>
>
> While running performance tests (internally) using the latest unstable 
> releases with {{CommitMitigatedTieredMergePolicy}}, we observed a significant 
> drop in performance (OAK-6704).
> The policy, although bad for performance, showed a dramatic drop in the churn 
> created in data store size. So, clearly, OAK-5192 did what it intended to do.
> Opening this issue to factor in the drop in performance and then set the 
> default back to {{CommitMitigatedTieredMergePolicy}}.





[jira] [Updated] (OAK-3355) Test failure: SpellcheckTest.testSpellcheckMultipleWords

2017-12-20 Thread Marcel Reutegger (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcel Reutegger updated OAK-3355:
--
Labels: ci jenkins test test-failure  (was: ci jenkins test-failure)

> Test failure: SpellcheckTest.testSpellcheckMultipleWords
> 
>
> Key: OAK-3355
> URL: https://issues.apache.org/jira/browse/OAK-3355
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: solr
>Affects Versions: 1.0.24
> Environment: 
> https://builds.apache.org/job/Apache%20Jackrabbit%20Oak%20matrix/
>Reporter: Michael Dürig
>Assignee: Tommaso Teofili
>  Labels: ci, jenkins, test, test-failure
> Fix For: 1.8
>
>
> {{org.apache.jackrabbit.oak.jcr.query.SpellcheckTest.testSpellcheckMultipleWords}}
>  fails on Jenkins.
> Failure seen at builds: 389, 392, 395, 396, 562
> https://builds.apache.org/job/Apache%20Jackrabbit%20Oak%20matrix/396/jdk=jdk-1.6u45,label=Ubuntu,nsfixtures=DOCUMENT_RDB,profile=unittesting/console
> {noformat}
> testSpellcheckMultipleWords(org.apache.jackrabbit.oak.jcr.query.SpellcheckTest)
>   Time elapsed: 0.907 sec  <<< FAILURE!
> junit.framework.ComparisonFailure: expected:<[voting[ in] ontario]> but 
> was:<[voting[, voted,] ontario]>
>   at junit.framework.Assert.assertEquals(Assert.java:85)
>   at junit.framework.Assert.assertEquals(Assert.java:91)
>   at 
> org.apache.jackrabbit.oak.jcr.query.SpellcheckTest.testSpellcheckMultipleWords(SpellcheckTest.java:86)
> {noformat}





[jira] [Commented] (OAK-7024) java.security.acl deprecated in Java 10, marked for removal in Java 11

2017-12-20 Thread Julian Reschke (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-7024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16298271#comment-16298271
 ] 

Julian Reschke commented on OAK-7024:
-

[~anchela], [~stillalex] - can one of you two have a look at this?

> java.security.acl deprecated in Java 10, marked for removal in Java 11
> --
>
> Key: OAK-7024
> URL: https://issues.apache.org/jira/browse/OAK-7024
> Project: Jackrabbit Oak
>  Issue Type: Task
>  Components: security
>Reporter: Julian Reschke
> Fix For: 1.10
>
>
> See  and 
> .
> Need to understand how this affects public Oak APIs, and what to do with them 
> on Java 11 (which will be an LTS release we probably need to support with Oak 
> 1.10).





[jira] [Updated] (OAK-2727) NodeStateSolrServersObserver should be filtering path selectively

2017-12-20 Thread Marcel Reutegger (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-2727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcel Reutegger updated OAK-2727:
--
Fix Version/s: (was: 1.8)
   1.10

> NodeStateSolrServersObserver should be filtering path selectively
> -
>
> Key: OAK-2727
> URL: https://issues.apache.org/jira/browse/OAK-2727
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: solr
>Affects Versions: 1.1.8
>Reporter: Tommaso Teofili
>Assignee: Tommaso Teofili
>  Labels: performance
> Fix For: 1.10
>
>
> As discussed in OAK-2718 it'd be good to be able to selectively find Solr 
> indexes by path, as done in Lucene index, see also OAK-2570.
> This would avoid having to do full diffs.





[jira] [Updated] (OAK-2538) Support index time aggregation in Solr index

2017-12-20 Thread Marcel Reutegger (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-2538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcel Reutegger updated OAK-2538:
--
Fix Version/s: (was: 1.8)
   1.10

Re-scheduling to 1.10 because there was no progress for quite some time.

> Support index time aggregation in Solr index
> 
>
> Key: OAK-2538
> URL: https://issues.apache.org/jira/browse/OAK-2538
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: solr
>Reporter: Tommaso Teofili
>Assignee: Tommaso Teofili
>  Labels: performance
> Fix For: 1.10
>
>
> Solr index is only able to do query time aggregation while that "would not 
> perform well for multi term searches as each term involves a separate call 
> and with intersection cursor being used the operation might result in reading 
> up all match terms even when user accesses only first page", therefore it'd 
> be good to implement index time aggregation like in Lucene index. (/cc 
> [~chetanm])





[jira] [Updated] (OAK-3809) Test failure: FacetTest

2017-12-20 Thread Marcel Reutegger (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcel Reutegger updated OAK-3809:
--
Labels: ci jenkins test test-failure  (was: ci jenkins test-failure)

> Test failure: FacetTest
> ---
>
> Key: OAK-3809
> URL: https://issues.apache.org/jira/browse/OAK-3809
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: solr
> Environment: 
> https://builds.apache.org/job/Apache%20Jackrabbit%20Oak%20matrix/
>Reporter: Michael Dürig
>Assignee: Tommaso Teofili
>  Labels: ci, jenkins, test, test-failure
> Fix For: 1.8
>
>
> {{org.apache.jackrabbit.oak.jcr.query.FacetTest}} keeps failing on Jenkins:
> {noformat}
> testFacetRetrievalMV(org.apache.jackrabbit.oak.jcr.query.FacetTest)  Time 
> elapsed: 5.927 sec  <<< FAILURE!
> junit.framework.ComparisonFailure: expected: (2), aem (1), apache (1), cosmetics (1), furniture (1)], tags:[repository 
> (2), software (2), aem (1), apache (1), cosmetics (1), furniture (1)], 
> tags:[repository (2), software (2), aem (1), apache (1), cosmetics (1), 
> furniture (1)], tags:[repository (2), software (2), aem (1), apache (1), 
> cosmetics (1), furniture (1)]]> but was:
>   at junit.framework.Assert.assertEquals(Assert.java:100)
>   at junit.framework.Assert.assertEquals(Assert.java:107)
>   at junit.framework.TestCase.assertEquals(TestCase.java:269)
>   at 
> org.apache.jackrabbit.oak.jcr.query.FacetTest.testFacetRetrievalMV(FacetTest.java:80)
> {noformat}
> Failure seen at builds: 628, 629, 630, 633, 634, 636, 642, 643, 644, 645, 
> 648, 651, 656, 659, 660, 663, 666
> See e.g. 
> https://builds.apache.org/job/Apache%20Jackrabbit%20Oak%20matrix/634/#showFailuresLink





[jira] [Updated] (OAK-5927) Load excerpt lazily

2017-12-20 Thread Marcel Reutegger (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-5927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcel Reutegger updated OAK-5927:
--
Fix Version/s: (was: 1.8)
   1.10

Re-scheduled for 1.10 because we are approaching 1.8.

> Load excerpt lazily
> ---
>
> Key: OAK-5927
> URL: https://issues.apache.org/jira/browse/OAK-5927
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: lucene
>Reporter: Chetan Mehrotra
>  Labels: performance
> Fix For: 1.10
>
>
> Currently LucenePropertyIndex loads the excerpt eagerly in batch as part of 
> loadDocs call. The load docs batch size doubles starting from 50 (max 100k) 
> as more data is read. 
> We should look into ways to load the excerpt lazily, as and when the caller 
> asks for it.
> Note that currently excerpts are only loaded when the query requests an 
> excerpt, i.e. there is a not-null property restriction for {{rep:excerpt}}. 
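
The two ideas in this description, a batch size that doubles up to a cap and an excerpt that is computed only on first access, can be illustrated in isolation. The following is a sketch, not the LucenePropertyIndex code: the class, the `Row` type and the loader are hypothetical, and "max 100k" is read here as 100,000, which is an assumption.

```java
import java.util.function.Supplier;

// Illustration only; names are hypothetical and not taken from oak-lucene.
public class LazyExcerptSketch {

    public static final int INITIAL_BATCH = 50;   // starting batch size per the issue
    public static final int MAX_BATCH = 100_000;  // assumption: "100k" means 100,000

    /** Doubles the batch size, never exceeding the cap. */
    public static int nextBatchSize(int current) {
        return (int) Math.min((long) current * 2, MAX_BATCH);
    }

    /** A result row whose excerpt is computed on first access only. */
    public static final class Row {
        private final Supplier<String> excerptLoader;
        private String excerpt;
        private boolean loaded;

        public Row(Supplier<String> excerptLoader) {
            this.excerptLoader = excerptLoader;
        }

        public String getExcerpt() {
            if (!loaded) {
                excerpt = excerptLoader.get();  // the expensive call happens here
                loaded = true;
            }
            return excerpt;
        }
    }

    public static void main(String[] args) {
        int size = INITIAL_BATCH;
        for (int i = 0; i < 4; i++) {
            System.out.println("batch size: " + size);
            size = nextBatchSize(size);
        }
        Row row = new Row(() -> "...matching text...");
        System.out.println(row.getExcerpt());
    }
}
```

With this shape, rows that are read but never asked for an excerpt do not pay the highlighting cost, which is the saving the issue is after.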





[jira] [Commented] (OAK-6710) CommitMitigated merge policy should not reduce performance significantly

2017-12-20 Thread Tommaso Teofili (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-6710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16298285#comment-16298285
 ] 

Tommaso Teofili commented on OAK-6710:
--

I think this can be resolved. There's a bit more work to be done to address the 
reindexing case better, but I'll track that in another issue.

> CommitMitigated merge policy should not reduce performance significantly
> 
>
> Key: OAK-6710
> URL: https://issues.apache.org/jira/browse/OAK-6710
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: lucene
>Affects Versions: 1.7.7
>Reporter: Vikas Saurabh
>Assignee: Tommaso Teofili
>  Labels: performance, scalability
> Fix For: 1.8
>
>
> While running performance tests (internally) using the latest unstable 
> releases with {{CommitMitigatedTieredMergePolicy}}, we observed a significant 
> drop in performance (OAK-6704).
> The policy, although bad for performance, showed a dramatic drop in the churn 
> created in data store size. So, clearly, OAK-5192 did what it intended to do.
> Opening this issue to factor in the drop in performance and then set the 
> default back to {{CommitMitigatedTieredMergePolicy}}.





[jira] [Updated] (OAK-6710) CommitMitigated merge policy should not reduce performance significantly

2017-12-20 Thread Tommaso Teofili (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-6710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tommaso Teofili updated OAK-6710:
-
Fix Version/s: (was: 1.8)
   1.7.13

> CommitMitigated merge policy should not reduce performance significantly
> 
>
> Key: OAK-6710
> URL: https://issues.apache.org/jira/browse/OAK-6710
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: lucene
>Affects Versions: 1.7.7
>Reporter: Vikas Saurabh
>Assignee: Tommaso Teofili
>  Labels: performance, scalability
> Fix For: 1.7.13
>
>
> While running performance tests (internally) using the latest unstable 
> releases with {{CommitMitigatedTieredMergePolicy}}, we observed a significant 
> drop in performance (OAK-6704).
> The policy, although bad for performance, showed a dramatic drop in the churn 
> created in data store size. So, clearly, OAK-5192 did what it intended to do.
> Opening this issue to factor in the drop in performance and then set the 
> default back to {{CommitMitigatedTieredMergePolicy}}.





[jira] [Resolved] (OAK-6881) indirect test dependencies through tika-parser to vulnerable version of xmpcore

2017-12-20 Thread Julian Reschke (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-6881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Reschke resolved OAK-6881.
-
   Resolution: Fixed
Fix Version/s: 1.7.13

> indirect test dependencies through tika-parser to vulnerable version of 
> xmpcore
> ---
>
> Key: OAK-6881
> URL: https://issues.apache.org/jira/browse/OAK-6881
> Project: Jackrabbit Oak
>  Issue Type: Task
>  Components: parent
>Reporter: Julian Reschke
>Assignee: Julian Reschke
>  Labels: candidate_oak_1_6
> Fix For: 1.8, 1.7.13
>
>
> We should upgrade to a version of tika-parser that fixes 
> https://issues.apache.org/jira/browse/TIKA-2486





[jira] [Reopened] (OAK-6881) indirect test dependencies through tika-parser to vulnerable version of xmpcore

2017-12-20 Thread Julian Reschke (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-6881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Reschke reopened OAK-6881:
-

> indirect test dependencies through tika-parser to vulnerable version of 
> xmpcore
> ---
>
> Key: OAK-6881
> URL: https://issues.apache.org/jira/browse/OAK-6881
> Project: Jackrabbit Oak
>  Issue Type: Task
>  Components: parent
>Reporter: Julian Reschke
>Assignee: Julian Reschke
>  Labels: candidate_oak_1_6
> Fix For: 1.8
>
>
> We should upgrade to a version of tika-parser that fixes 
> https://issues.apache.org/jira/browse/TIKA-2486



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (OAK-6881) indirect test dependencies through tika-parser to vulnerable version of xmpcore

2017-12-20 Thread Julian Reschke (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-6881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Reschke updated OAK-6881:

Fix Version/s: (was: 1.7.13)

> indirect test dependencies through tika-parser to vulnerable version of 
> xmpcore
> ---
>
> Key: OAK-6881
> URL: https://issues.apache.org/jira/browse/OAK-6881
> Project: Jackrabbit Oak
>  Issue Type: Task
>  Components: parent
>Reporter: Julian Reschke
>Assignee: Julian Reschke
>  Labels: candidate_oak_1_6
> Fix For: 1.8
>
>
> We should upgrade to a version of tika-parser that fixes 
> https://issues.apache.org/jira/browse/TIKA-2486





[jira] [Resolved] (OAK-6710) CommitMitigated merge policy should not reduce performance significantly

2017-12-20 Thread Marcel Reutegger (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-6710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcel Reutegger resolved OAK-6710.
---
   Resolution: Fixed
Fix Version/s: 1.8

> CommitMitigated merge policy should not reduce performance significantly
> 
>
> Key: OAK-6710
> URL: https://issues.apache.org/jira/browse/OAK-6710
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: lucene
>Affects Versions: 1.7.7
>Reporter: Vikas Saurabh
>Assignee: Tommaso Teofili
>  Labels: performance, scalability
> Fix For: 1.8, 1.7.13
>
>
> While running performance tests (internally) using the latest unstable 
> releases with {{CommitMitigatedTieredMergePolicy}}, we observed a significant 
> drop in performance (OAK-6704).
> The policy, although bad for performance, showed a dramatic drop in the churn 
> created in data store size. So, clearly, OAK-5192 did what it intended to do.
> Opening this issue to factor in the drop in performance and then set the 
> default back to {{CommitMitigatedTieredMergePolicy}}.





[jira] [Resolved] (OAK-5877) Oak upgrade usage note refers to oak-run

2017-12-20 Thread Marcel Reutegger (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-5877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcel Reutegger resolved OAK-5877.
---
   Resolution: Fixed
Fix Version/s: 1.7.13

Fixed in trunk: http://svn.apache.org/r1818784

> Oak upgrade usage note refers to oak-run
> 
>
> Key: OAK-5877
> URL: https://issues.apache.org/jira/browse/OAK-5877
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: upgrade
>Reporter: Michael Dürig
>Priority: Minor
>  Labels: production, tooling, usability
> Fix For: 1.8, 1.7.13
>
>
> Running {{java -jar oak-upgrade*.jar}} prints 
> {noformat}
> Usage: java -jar oak-run-*-jr2.jar upgrade [options] jcr2_source [destination]
>(to upgrade a JCR 2 repository)
>java -jar oak-run-*-jr2.jar upgrade [options] source destination
>(to migrate an Oak repository)
> {noformat}
> Which incorrectly refers to {{oak-run upgrade}}. The latter will send me back 
> to {{oak-run}}: "This command was moved to the oak-upgrade module". 





[jira] [Assigned] (OAK-5877) Oak upgrade usage note refers to oak-run

2017-12-20 Thread Marcel Reutegger (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-5877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcel Reutegger reassigned OAK-5877:
-

Assignee: Marcel Reutegger

> Oak upgrade usage note refers to oak-run
> 
>
> Key: OAK-5877
> URL: https://issues.apache.org/jira/browse/OAK-5877
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: upgrade
>Reporter: Michael Dürig
>Assignee: Marcel Reutegger
>Priority: Minor
>  Labels: production, tooling, usability
> Fix For: 1.7.13, 1.8
>
>
> Running {{java -jar oak-upgrade*.jar}} prints 
> {noformat}
> Usage: java -jar oak-run-*-jr2.jar upgrade [options] jcr2_source [destination]
>(to upgrade a JCR 2 repository)
>java -jar oak-run-*-jr2.jar upgrade [options] source destination
>(to migrate an Oak repository)
> {noformat}
> Which incorrectly refers to {{oak-run upgrade}}. The latter will send me back 
> to {{oak-run}}: "This command was moved to the oak-upgrade module". 





[jira] [Commented] (OAK-6956) RepositoryUpgrade hardcodes SecurityProvider

2017-12-20 Thread Marcel Reutegger (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-6956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16298374#comment-16298374
 ] 

Marcel Reutegger commented on OAK-6956:
---

[~tomek.rekawek], we are approaching the 1.8 release, do you still want to 
include this in the release? Otherwise please re-schedule.

> RepositoryUpgrade hardcodes SecurityProvider
> 
>
> Key: OAK-6956
> URL: https://issues.apache.org/jira/browse/OAK-6956
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: upgrade
>Reporter: angela
>Assignee: Tomek Rękawek
>Priority: Critical
> Fix For: 1.8
>
>
> [~tomek.rekawek] Looking at non-test usage of the (to be deprecated) 
> {{SecurityProviderImpl}} I noticed one usage in the {{RepositoryUpgrade}} 
> that looks troublesome to me.
> I would strongly recommend fixing that, given that we no longer use 
> {{SecurityProviderImpl}} in production-ready setup scenarios.
> cc: [~stillalex]





[jira] [Updated] (OAK-6508) Make it possible to exclude subtrees by configuration in Solr index

2017-12-20 Thread Marcel Reutegger (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-6508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcel Reutegger updated OAK-6508:
--
Fix Version/s: (was: 1.8)
   1.10

Re-scheduled to 1.10. This doesn't look feasible to me for 1.8.

> Make it possible to exclude subtrees by configuration in Solr index
> ---
>
> Key: OAK-6508
> URL: https://issues.apache.org/jira/browse/OAK-6508
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: solr
>Reporter: Tommaso Teofili
>Assignee: Tommaso Teofili
> Fix For: 1.10
>
>
> While it's possible to configure per subtree Solr indexes (via persisted 
> configuration), sometimes it's also useful to define an index on a larger tree 
> but still exclude some subtrees (e.g. /jcr:system, /var, etc.) for 
> performance reasons (e.g. unneeded content in the index).





[jira] [Updated] (OAK-2976) Oak percolator

2017-12-20 Thread Marcel Reutegger (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-2976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcel Reutegger updated OAK-2976:
--
Fix Version/s: (was: 1.8)
   1.10

Re-scheduled to 1.10. This doesn't look feasible to me for 1.8.

> Oak percolator
> --
>
> Key: OAK-2976
> URL: https://issues.apache.org/jira/browse/OAK-2976
> Project: Jackrabbit Oak
>  Issue Type: Task
>  Components: query
>Reporter: Tommaso Teofili
>Assignee: Tommaso Teofili
> Fix For: 1.10
>
>
> Inspired by [Elasticsearch 
> percolator|https://www.elastic.co/guide/en/elasticsearch/reference/current/search-percolate.html]
>  we may implement an Oak percolator that would basically store queries and 
> perform specific tasks upon indexing of documents matching those queries.
> The reasons for possibly having that are that such a mechanism could be used 
> to run common but slow queries automatically whenever batches of matching 
> documents get indexed, to eventually warm up the underlying index caches.
> Also such a percolator could be used as a notification mechanism (alerting, 
> monitoring).





[jira] [Updated] (OAK-3336) Abstract a full text index implementation to be extended by Lucene and Solr

2017-12-20 Thread Marcel Reutegger (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcel Reutegger updated OAK-3336:
--
Fix Version/s: (was: 1.8)
   1.10

Re-scheduled to 1.10. This doesn't look feasible to me for 1.8.

> Abstract a full text index implementation to be extended by Lucene and Solr
> ---
>
> Key: OAK-3336
> URL: https://issues.apache.org/jira/browse/OAK-3336
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: lucene, query, solr
>Reporter: Tommaso Teofili
>Assignee: Tommaso Teofili
> Fix For: 1.10
>
>
> Current Lucene and Solr indexes implement quite a number of features according 
> to their specific APIs, design and implementation. However in the long run, 
> while differences in APIs and implementations will / can of course stay, the 
> difference in design can make it hard to keep those features on par.
> It'd be therefore nice to make it possible to abstract as much of design and 
> implementation bits as possible in an abstract full text implementation which 
> Lucene and Solr would extend according to their specifics.
> An example advantage of this is that index time aggregation will be 
> implemented only once and therefore any bugfixes and improvements in that 
> area will be done in the abstract implementation rather than having to do 
> that in two places.





[jira] [Updated] (OAK-2182) Specify collection to be used by Solr index

2017-12-20 Thread Marcel Reutegger (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-2182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcel Reutegger updated OAK-2182:
--
Fix Version/s: (was: 1.8)
   1.10

Re-scheduled to 1.10. This doesn't look feasible to me for 1.8.

> Specify collection to be used by Solr index
> ---
>
> Key: OAK-2182
> URL: https://issues.apache.org/jira/browse/OAK-2182
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: solr
>Affects Versions: 1.1.0
>Reporter: Tommaso Teofili
>Assignee: Tommaso Teofili
> Fix For: 1.10
>
>
> Currently all the information to hit a Solr server is held by the singleton 
> SolrServerProvider while there are some use cases where more than one query 
> index definition for a Solr index may be done, targeting different content, 
> and therefore it'd be good to be able to specify which collection should be 
> used by each of these indexes.





[jira] [Updated] (OAK-3866) Sorting on relative properties doesn't work in Solr

2017-12-20 Thread Marcel Reutegger (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcel Reutegger updated OAK-3866:
--
Fix Version/s: (was: 1.8)
   1.10

Re-scheduled to 1.10. This doesn't look feasible to me for 1.8.

> Sorting on relative properties doesn't work in Solr
> ---
>
> Key: OAK-3866
> URL: https://issues.apache.org/jira/browse/OAK-3866
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: solr
>Affects Versions: 1.0.22, 1.2.9, 1.3.13
>Reporter: Tommaso Teofili
>Assignee: Tommaso Teofili
> Fix For: 1.10
>
>
> Executing a query like 
> {noformat}
> /jcr:root/content/foo//*[(@sling:resourceType = 'x' or @sling:resourceType = 
> 'y') and jcr:contains(., 'bar*~')] order by jcr:content/@jcr:primaryType 
> descending
> {noformat}
> would assume sorting on the _jcr:primaryType_ property of resulting nodes' 
> _jcr:content_ children.
> That is currently not supported in Solr, while it is in Lucene as the latter 
> supports index time aggregation.
> We should inspect if it's possible to extend support for Solr too, most 
> probably via index time aggregation.
> The query should not fail but at least log a warning about that limitation 
> for the time being.





[jira] [Commented] (OAK-3437) Regression in org.apache.jackrabbit.core.query.JoinTest#testJoinWithOR5 when enabling OAK-1617

2017-12-20 Thread Marcel Reutegger (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-3437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16298386#comment-16298386
 ] 

Marcel Reutegger commented on OAK-3437:
---

[~teofili], do you consider this a blocker for the 1.8 release? Otherwise 
please reschedule to e.g. 1.10.

> Regression in org.apache.jackrabbit.core.query.JoinTest#testJoinWithOR5 when 
> enabling OAK-1617
> --
>
> Key: OAK-3437
> URL: https://issues.apache.org/jira/browse/OAK-3437
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: solr
>Reporter: Davide Giannella
>Assignee: Tommaso Teofili
> Fix For: 1.8
>
>
> When enabling OAK-1617 (still to be committed) there's a regression in the 
> {{oak-solr-core}} unit tests 
> - {{org.apache.jackrabbit.core.query.JoinTest#testJoinWithOR3}} 
> - {{org.apache.jackrabbit.core.query.JoinTest#testJoinWithOR4}} 
> - {{org.apache.jackrabbit.core.query.JoinTest#testJoinWithOR5}} 
> The WIP of the feature can be found in 
> https://github.com/davidegiannella/jackrabbit-oak/tree/OAK-1617 and a full 
> patch will be attached shortly for review in OAK-1617 itself.
> The feature is currently disabled, in order to enable it for unit testing an 
> approach like this can be taken 
> https://github.com/davidegiannella/jackrabbit-oak/blob/177df1a8073b1237857267e23d12a433e3d890a4/oak-core/src/test/java/org/apache/jackrabbit/oak/query/SQL2OptimiseQueryTest.java#L142
>  or setting the system property {{-Doak.query.sql2optimisation}}.





[jira] [Updated] (OAK-3717) Make it possible to declare SynonymFilter within Analyzer with WN dictionary

2017-12-20 Thread Marcel Reutegger (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcel Reutegger updated OAK-3717:
--
Fix Version/s: (was: 1.8)
   1.10

Re-scheduled to 1.10. This doesn't look feasible to me for 1.8.

> Make it possible to declare SynonymFilter within Analyzer with WN dictionary
> 
>
> Key: OAK-3717
> URL: https://issues.apache.org/jira/browse/OAK-3717
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: lucene
>Reporter: Tommaso Teofili
>Assignee: Tommaso Teofili
> Fix For: 1.10
>
>
> Currently one can compose Lucene Analyzers via 
> [composition|http://jackrabbit.apache.org/oak/docs/query/lucene.html#Create_analyzer_via_composition]
>  within an index definition. It'd be good to be able to also use 
> {{SynonymFilter}} in there, possibly decorated with 
> {{WordNetSynonymParser}} to leverage WordNet synonym files.





[jira] [Commented] (OAK-4524) LucenePropertyIndexTest#longRepExcerpt sometimes failing

2017-12-20 Thread Marcel Reutegger (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-4524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16298391#comment-16298391
 ] 

Marcel Reutegger commented on OAK-4524:
---

[~teofili], do you consider this a blocker for the 1.8 release? Otherwise 
please reschedule to e.g. 1.10.

> LucenePropertyIndexTest#longRepExcerpt sometimes failing
> 
>
> Key: OAK-4524
> URL: https://issues.apache.org/jira/browse/OAK-4524
> Project: Jackrabbit Oak
>  Issue Type: Test
>  Components: lucene
>Reporter: Tommaso Teofili
>Assignee: Tommaso Teofili
> Fix For: 1.8
>
>
> As reported by Julian on oak-dev@ it seems _longRepExcerpt_ is still failing 
> sometimes when query takes more than 10s e.g. see this [Jenkins 
> failure|https://builds.apache.org/job/Apache%20Jackrabbit%20Oak%20matrix/1000/jdk=jdk1.8.0_11,label=Ubuntu,nsfixtures=DOCUMENT_NS,profile=unittesting/testReport/junit/org.apache.jackrabbit.oak.plugins.index.lucene/LucenePropertyIndexTest/longRepExcerpt/].





[jira] [Updated] (OAK-6412) Consider upgrading to newer Lucene versions

2017-12-20 Thread Marcel Reutegger (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-6412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcel Reutegger updated OAK-6412:
--
Fix Version/s: (was: 1.8)
   1.10

Re-scheduled to 1.10. This doesn't look feasible to me for 1.8.

> Consider upgrading to newer Lucene versions
> ---
>
> Key: OAK-6412
> URL: https://issues.apache.org/jira/browse/OAK-6412
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: lucene
>Reporter: Tommaso Teofili
>Assignee: Tommaso Teofili
> Fix For: 1.10
>
>
> A year ago I started prototyping the upgrade to Lucene 5 [1]; in the 
> meantime version 6 (and soon 7) has come out.
> I think it'd be very nice to upgrade Lucene to the latest version, as this would 
> give us improvements in space consumption and runtime performance.
> If we want to upgrade to 6.0 or later we need to consider upgrade 
> scenarios, because Lucene Codecs are backward compatible only with the previous 
> major release: Lucene 6 can read Lucene 5 but not Lucene 4.x (4.7 in our 
> case). We would therefore need to detect that when reading an index and 
> trigger reindexing using the new format.
> Related to that there's also a patch to upgrade Solr index to version 5 (see 
> OAK-4318).
> [1] : https://github.com/tteofili/jackrabbit-oak/tree/lucene5
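
The compatibility rule in the description (a reader can open indexes written by its own or the previous major release) could be sketched roughly as follows. This is only an illustration: {{IndexFormatCheck}} and its methods are hypothetical names, not Oak or Lucene API, and it assumes the index format is identified by major version alone.

```java
// Hypothetical helper, not part of Oak or Lucene: decides whether an
// existing index must be rebuilt after a Lucene upgrade.
final class IndexFormatCheck {
    private IndexFormatCheck() {}

    // Assumption from the issue text: a reader supports its own major
    // version and the one directly before it, nothing older or newer.
    static boolean canRead(int readerMajor, int indexMajor) {
        return indexMajor == readerMajor || indexMajor == readerMajor - 1;
    }

    // An index that cannot be read has to be reindexed in the new format.
    static boolean needsReindex(int readerMajor, int indexMajor) {
        return !canRead(readerMajor, indexMajor);
    }
}
```

With these assumptions, a Lucene 6 reader would accept a Lucene 5 index, while a Lucene 4.x index would be flagged for reindexing.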





[jira] [Commented] (OAK-3355) Test failure: SpellcheckTest.testSpellcheckMultipleWords

2017-12-20 Thread Marcel Reutegger (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-3355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16298394#comment-16298394
 ] 

Marcel Reutegger commented on OAK-3355:
---

[~teofili], do you consider this a blocker for the 1.8 release? Otherwise 
please reschedule to e.g. 1.10.

> Test failure: SpellcheckTest.testSpellcheckMultipleWords
> 
>
> Key: OAK-3355
> URL: https://issues.apache.org/jira/browse/OAK-3355
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: solr
>Affects Versions: 1.0.24
> Environment: 
> https://builds.apache.org/job/Apache%20Jackrabbit%20Oak%20matrix/
>Reporter: Michael Dürig
>Assignee: Tommaso Teofili
>  Labels: ci, jenkins, test, test-failure
> Fix For: 1.8
>
>
> {{org.apache.jackrabbit.oak.jcr.query.SpellcheckTest.testSpellcheckMultipleWords}}
>  fails on Jenkins.
> Failure seen at builds: 389, 392, 395, 396, 562
> https://builds.apache.org/job/Apache%20Jackrabbit%20Oak%20matrix/396/jdk=jdk-1.6u45,label=Ubuntu,nsfixtures=DOCUMENT_RDB,profile=unittesting/console
> {noformat}
> testSpellcheckMultipleWords(org.apache.jackrabbit.oak.jcr.query.SpellcheckTest)
>   Time elapsed: 0.907 sec  <<< FAILURE!
> junit.framework.ComparisonFailure: expected:<[voting[ in] ontario]> but 
> was:<[voting[, voted,] ontario]>
>   at junit.framework.Assert.assertEquals(Assert.java:85)
>   at junit.framework.Assert.assertEquals(Assert.java:91)
>   at 
> org.apache.jackrabbit.oak.jcr.query.SpellcheckTest.testSpellcheckMultipleWords(SpellcheckTest.java:86)
> {noformat}





[jira] [Commented] (OAK-3809) Test failure: FacetTest

2017-12-20 Thread Marcel Reutegger (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-3809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16298396#comment-16298396
 ] 

Marcel Reutegger commented on OAK-3809:
---

[~teofili], do you consider this a blocker for the 1.8 release? Otherwise 
please reschedule to e.g. 1.10.

> Test failure: FacetTest
> ---
>
> Key: OAK-3809
> URL: https://issues.apache.org/jira/browse/OAK-3809
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: solr
> Environment: 
> https://builds.apache.org/job/Apache%20Jackrabbit%20Oak%20matrix/
>Reporter: Michael Dürig
>Assignee: Tommaso Teofili
>  Labels: ci, jenkins, test, test-failure
> Fix For: 1.8
>
>
> {{org.apache.jackrabbit.oak.jcr.query.FacetTest}} keeps failing on Jenkins:
> {noformat}
> testFacetRetrievalMV(org.apache.jackrabbit.oak.jcr.query.FacetTest)  Time 
> elapsed: 5.927 sec  <<< FAILURE!
> junit.framework.ComparisonFailure: expected: (2), aem (1), apache (1), cosmetics (1), furniture (1)], tags:[repository 
> (2), software (2), aem (1), apache (1), cosmetics (1), furniture (1)], 
> tags:[repository (2), software (2), aem (1), apache (1), cosmetics (1), 
> furniture (1)], tags:[repository (2), software (2), aem (1), apache (1), 
> cosmetics (1), furniture (1)]]> but was:
>   at junit.framework.Assert.assertEquals(Assert.java:100)
>   at junit.framework.Assert.assertEquals(Assert.java:107)
>   at junit.framework.TestCase.assertEquals(TestCase.java:269)
>   at 
> org.apache.jackrabbit.oak.jcr.query.FacetTest.testFacetRetrievalMV(FacetTest.java:80)
> {noformat}
> Failure seen at builds: 628, 629, 630, 633, 634, 636, 642, 643, 644, 645, 
> 648, 651, 656, 659, 660, 663, 666
> See e.g. 
> https://builds.apache.org/job/Apache%20Jackrabbit%20Oak%20matrix/634/#showFailuresLink





[jira] [Commented] (OAK-1819) oak-solr-core test failures on Java 8 and later

2017-12-20 Thread Marcel Reutegger (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-1819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16298398#comment-16298398
 ] 

Marcel Reutegger commented on OAK-1819:
---

[~teofili], do you consider this a blocker for the 1.8 release? Otherwise 
please reschedule to e.g. 1.10.

> oak-solr-core test failures on Java 8 and later
> ---
>
> Key: OAK-1819
> URL: https://issues.apache.org/jira/browse/OAK-1819
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: solr
>Affects Versions: 1.0
> Environment: {noformat}
> Apache Maven 3.1.0 (893ca28a1da9d5f51ac03827af98bb730128f9f2; 2013-06-27 
> 22:15:32-0400)
> Maven home: c:\Program Files\apache-maven-3.1.0
> Java version: 1.8.0, vendor: Oracle Corporation
> Java home: c:\Program Files\Java\jdk1.8.0\jre
> Default locale: en_US, platform encoding: Cp1252
> OS name: "windows 7", version: "6.1", arch: "amd64", family: "dos"
> {noformat}
>Reporter: Jukka Zitting
>Assignee: Tommaso Teofili
>Priority: Minor
>  Labels: java8, java9, jenkins, test
> Fix For: 1.8
>
>
> The following {{oak-solr-core}} test failures occur when building Oak with 
> Java 8:
> {noformat}
> Failed tests:
>   
> testNativeMLTQuery(org.apache.jackrabbit.oak.plugins.index.solr.query.SolrIndexQueryTest):
>  expected: but was:
>   
> testNativeMLTQueryWithStream(org.apache.jackrabbit.oak.plugins.index.solr.query.SolrIndexQueryTest):
>  expected: but was:
> {noformat}
> The cause of this might well be something as simple as the test case 
> incorrectly expecting a specific ordering of search results.
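
If the cause is indeed an ordering assumption, one possible way to make such a test robust is to compare results without regard to order. The sketch below is an assumption about how that could look, not the fix that was applied; {{UnorderedResults}} is a hypothetical helper name.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical test helper: compares two result lists ignoring order
// but still respecting duplicates.
final class UnorderedResults {
    private UnorderedResults() {}

    static boolean sameElements(List<String> expected, List<String> actual) {
        // Sort copies so the callers' lists are left untouched.
        List<String> e = new ArrayList<>(expected);
        List<String> a = new ArrayList<>(actual);
        Collections.sort(e);
        Collections.sort(a);
        return e.equals(a);
    }
}
```

A test would then assert {{sameElements(expectedPaths, actualPaths)}} instead of comparing the string form of the ordered result.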





[jira] [Commented] (OAK-6833) LuceneIndex*Test failures

2017-12-20 Thread Marcel Reutegger (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-6833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16298400#comment-16298400
 ] 

Marcel Reutegger commented on OAK-6833:
---

[~catholicon], do you consider this a blocker for the 1.8 release? Otherwise 
please reschedule to e.g. 1.10.

> LuceneIndex*Test failures
> -
>
> Key: OAK-6833
> URL: https://issues.apache.org/jira/browse/OAK-6833
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: lucene
>Reporter: Julian Reschke
>Assignee: Vikas Saurabh
> Fix For: 1.8
>
> Attachments: 
> TEST-org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexAugmentTest.xml,
>  
> TEST-org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexAugmentTest.xml,
>  
> TEST-org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexAugmentTest.xml,
>  
> TEST-org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexAugmentTest.xml,
>  
> TEST-org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexAugmentTest.xml,
>  
> TEST-org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexEditorTest.xml,
>  
> TEST-org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexEditorTest.xml,
>  
> TEST-org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexEditorTest.xml,
>  
> TEST-org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexEditorTest.xml,
>  
> TEST-org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexEditorTest.xml,
>  
> TEST-org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexEditorTest.xml,
>  
> TEST-org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexEditorTest.xml,
>  org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexAugmentTest.txt, 
> org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexAugmentTest.txt, 
> org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexAugmentTest.txt, 
> org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexAugmentTest.txt, 
> org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexAugmentTest.txt, 
> org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexEditorTest.txt, 
> org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexEditorTest.txt, 
> org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexEditorTest.txt, 
> org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexEditorTest.txt, 
> org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexEditorTest.txt, 
> org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexEditorTest.txt, 
> org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexEditorTest.txt, 
> unit-tests.log, unit-tests.log, unit-tests.log
>
>
> {noformat}
> [ERROR] testLuceneWithRelativeProperty[1: useBlobStore 
> (false)](org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexEditorTest)
>   Time elapsed: 0.063 s  <<< FAILURE!
> java.lang.AssertionError: expected: but was:
> at 
> org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexEditorTest.testLuceneWithRelativeProperty(LuceneIndexEditorTest.java:341)
> {noformat}





[jira] [Updated] (OAK-6408) Review package exports for o.a.j.oak.plugins.index.*

2017-12-20 Thread Marcel Reutegger (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-6408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcel Reutegger updated OAK-6408:
--
Fix Version/s: (was: 1.8)
   1.10

Re-scheduled to 1.10. This doesn't look feasible to me for 1.8. The oak-core 
module has various other exported packages that are not properly versioned. 
This needs more work anyway.

> Review package exports for o.a.j.oak.plugins.index.*
> 
>
> Key: OAK-6408
> URL: https://issues.apache.org/jira/browse/OAK-6408
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: core, indexing
>Reporter: angela
> Fix For: 1.10
>
>
> While working on OAK-6304 and OAK-6355, I noticed that the 
> _o.a.j.oak.plugins.index.*_ packages contain both internal API/utilities and 
> implementation details which get equally exported (though without having any 
> package export version set).
> In the light of the modularization effort, I would like to suggest that we 
> try to sort that out and separate the _public_ parts from implementation 
> details. 





[jira] [Updated] (OAK-6501) Support adding or updating index definitions via oak-run: JSON data format

2017-12-20 Thread Thomas Mueller (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-6501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Mueller updated OAK-6501:

Sprint:   (was: L16)

> Support adding or updating index definitions via oak-run: JSON data format
> --
>
> Key: OAK-6501
> URL: https://issues.apache.org/jira/browse/OAK-6501
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>Reporter: Thomas Mueller
>Assignee: Thomas Mueller
> Fix For: 1.10
>
>
> In OAK-6471 we have support for index definitions via JSON.
> I'm not happy with the escaping (OAK-6476) ("If the string starts with 
> namespace..."), I think it's a bit dangerous. Need to investigate whether 
> this prevents importing index definitions exported via JSON 
> (localhost:/oak:index/lucene.tidy.-1.json).





[jira] [Updated] (OAK-7074) Ensure that all Documents are read with document order traversal indexing

2017-12-20 Thread Thomas Mueller (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-7074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Mueller updated OAK-7074:

Sprint: L16

> Ensure that all Documents are read with document order traversal indexing
> -
>
> Key: OAK-7074
> URL: https://issues.apache.org/jira/browse/OAK-7074
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: mongomk, run
>Reporter: Chetan Mehrotra
>Assignee: Chetan Mehrotra
> Fix For: 1.8
>
>
> With OAK-6353 support was added for document order traversal indexing. In 
> this mode we open a DB cursor and try to read all documents from it using 
> document order traversal. Such a cursor may remain open for a long time (2-4 
> hrs) and it's possible that documents get reordered by the Mongo storage 
> engine. This raises two aspects to think about: 
> # Duplicate documents - Same document may appear more than once in result set 
> # Possibly missed document - It may be a possibility that a document got 
> moved and missed becoming part of cursor. 
> Both these aspects would need to be handled
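
For the first aspect (duplicates), a minimal sketch of filtering repeated document ids while consuming the cursor is shown below. It is an illustration only: {{TraversalDedup}} is a hypothetical name, ids are assumed to be strings, and keeping every id in memory may not be acceptable for very large repositories. It does not address the second aspect (missed documents).

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper: keeps the first occurrence of each document id
// seen during a traversal and drops later repeats.
final class TraversalDedup {
    private TraversalDedup() {}

    static List<String> uniqueIds(Iterable<String> cursorIds) {
        Set<String> seen = new HashSet<>();
        List<String> result = new ArrayList<>();
        for (String id : cursorIds) {
            // Set.add returns false when the id was already present.
            if (seen.add(id)) {
                result.add(id);
            }
        }
        return result;
    }
}
```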





[jira] [Created] (OAK-7100) update htmlunit test dependency

2017-12-20 Thread Julian Reschke (JIRA)
Julian Reschke created OAK-7100:
---

 Summary: update htmlunit test dependency
 Key: OAK-7100
 URL: https://issues.apache.org/jira/browse/OAK-7100
 Project: Jackrabbit Oak
  Issue Type: Task
  Components: examples
Reporter: Julian Reschke
Priority: Minor
 Fix For: 1.8








[jira] [Assigned] (OAK-7013) Replace usage in oak-auth-external

2017-12-20 Thread Thomas Mueller (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-7013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Mueller reassigned OAK-7013:
---

Assignee: Thomas Mueller  (was: Alex Deparvu)

> Replace usage in oak-auth-external
> --
>
> Key: OAK-7013
> URL: https://issues.apache.org/jira/browse/OAK-7013
> Project: Jackrabbit Oak
>  Issue Type: Sub-task
>  Components: auth-external
>Reporter: angela
>Assignee: Thomas Mueller
> Fix For: 1.7.13, 1.8
>
> Attachments: oak-auth-external.patch
>
>






[jira] [Assigned] (OAK-7013) Replace usage in oak-auth-external

2017-12-20 Thread Thomas Mueller (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-7013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Mueller reassigned OAK-7013:
---

Assignee: Alex Deparvu  (was: Thomas Mueller)

> Replace usage in oak-auth-external
> --
>
> Key: OAK-7013
> URL: https://issues.apache.org/jira/browse/OAK-7013
> Project: Jackrabbit Oak
>  Issue Type: Sub-task
>  Components: auth-external
>Reporter: angela
>Assignee: Alex Deparvu
> Fix For: 1.7.13, 1.8
>
> Attachments: oak-auth-external.patch
>
>






[jira] [Commented] (OAK-7013) Replace usage in oak-auth-external

2017-12-20 Thread Thomas Mueller (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-7013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16298505#comment-16298505
 ] 

Thomas Mueller commented on OAK-7013:
-

Sorry... clicked somewhere

> Replace usage in oak-auth-external
> --
>
> Key: OAK-7013
> URL: https://issues.apache.org/jira/browse/OAK-7013
> Project: Jackrabbit Oak
>  Issue Type: Sub-task
>  Components: auth-external
>Reporter: angela
>Assignee: Alex Deparvu
> Fix For: 1.7.13, 1.8
>
> Attachments: oak-auth-external.patch
>
>






[jira] [Reopened] (OAK-7093) ActiveDelete synchronization with BlobTracker leaves temp files

2017-12-20 Thread Marcel Reutegger (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-7093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcel Reutegger reopened OAK-7093:
---

This causes a test failure in oak-it when there is a local MongoDB running:

{noformat}
[ERROR] Tests run: 2, Failures: 0, Errors: 2, Skipped: 0, Time elapsed: 0.144 s 
<<< FAILURE! - in 
org.apache.jackrabbit.oak.plugins.blob.migration.DocumentToExternalMigrationTest
[ERROR] 
blobsCanBeReadAfterSwitchingBlobStore(org.apache.jackrabbit.oak.plugins.blob.migration.DocumentToExternalMigrationTest)
  Time elapsed: 0.018 s  <<< ERROR!
java.lang.ExceptionInInitializerError
at 
org.apache.jackrabbit.oak.plugins.blob.migration.DocumentToExternalMigrationTest.setup(DocumentToExternalMigrationTest.java:50)
Caused by: java.lang.NullPointerException
at 
org.apache.jackrabbit.oak.plugins.blob.migration.DocumentToExternalMigrationTest.setup(DocumentToExternalMigrationTest.java:50)

[ERROR] 
blobsExistsOnTheNewBlobStore(org.apache.jackrabbit.oak.plugins.blob.migration.DocumentToExternalMigrationTest)
  Time elapsed: 0.121 s  <<< ERROR!
java.lang.NoClassDefFoundError: Could not initialize class 
java.nio.file.TempFileHelper
at 
org.apache.jackrabbit.oak.plugins.blob.migration.DocumentToExternalMigrationTest.setup(DocumentToExternalMigrationTest.java:50)
{noformat}

> ActiveDelete synchronization with BlobTracker leaves temp files
> ---
>
> Key: OAK-7093
> URL: https://issues.apache.org/jira/browse/OAK-7093
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: blob-plugins
>Reporter: Amit Jain
>Assignee: Amit Jain
> Fix For: 1.8, 1.7.14
>
>
> When synchronizing active deleted files with blob tracker a temp file is 
> created and not removed.





[jira] [Commented] (OAK-7093) ActiveDelete synchronization with BlobTracker leaves temp files

2017-12-20 Thread Marcel Reutegger (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-7093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16298561#comment-16298561
 ] 

Marcel Reutegger commented on OAK-7093:
---

I assume the problem is clearing the {{java.io.tmpdir}} system property in the 
finally clause.
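
A minimal sketch of the pattern this points at, not the actual Oak test code: if a test overrides {{java.io.tmpdir}} and then clears it in a finally block, later code in the same JVM that reads the property sees null, which would be consistent with the NullPointerException and the failed initialization of java.nio.file.TempFileHelper in the reported trace. Remembering the previous value and putting it back avoids that. The class name, the helper withTmpDir and the path /tmp/oak-sketch are hypothetical.

```java
import java.util.function.Supplier;

public class TmpDirPropertySketch {

    /**
     * Runs the action with java.io.tmpdir pointing at tempDir, then restores
     * the previous value instead of unconditionally clearing the property.
     */
    static <T> T withTmpDir(String tempDir, Supplier<T> action) {
        String previous = System.getProperty("java.io.tmpdir");
        System.setProperty("java.io.tmpdir", tempDir);
        try {
            return action.get();
        } finally {
            if (previous != null) {
                // Put back whatever the JVM was started with.
                System.setProperty("java.io.tmpdir", previous);
            } else {
                // Only clear when there was nothing set before.
                System.clearProperty("java.io.tmpdir");
            }
        }
    }

    public static void main(String[] args) {
        String before = System.getProperty("java.io.tmpdir");
        String seen = withTmpDir("/tmp/oak-sketch",
                () -> System.getProperty("java.io.tmpdir"));
        System.out.println("inside the action: " + seen);
        System.out.println("restored afterwards: "
                + before.equals(System.getProperty("java.io.tmpdir")));
    }
}
```

In a JUnit test the same save and restore would typically sit in the setup and teardown methods, so the property is reset even when the test body fails.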

> ActiveDelete synchronization with BlobTracker leaves temp files
> ---
>
> Key: OAK-7093
> URL: https://issues.apache.org/jira/browse/OAK-7093
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: blob-plugins
>Reporter: Amit Jain
>Assignee: Amit Jain
> Fix For: 1.8, 1.7.14
>
>
> When synchronizing active deleted files with blob tracker a temp file is 
> created and not removed.





[jira] [Assigned] (OAK-7093) ActiveDelete synchronization with BlobTracker leaves temp files

2017-12-20 Thread Marcel Reutegger (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-7093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcel Reutegger reassigned OAK-7093:
-

Assignee: Marcel Reutegger  (was: Amit Jain)

> ActiveDelete synchronization with BlobTracker leaves temp files
> ---
>
> Key: OAK-7093
> URL: https://issues.apache.org/jira/browse/OAK-7093
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: blob-plugins
>Reporter: Amit Jain
>Assignee: Marcel Reutegger
> Fix For: 1.8, 1.7.14
>
>
> When synchronizing active deleted files with blob tracker a temp file is 
> created and not removed.





[jira] [Updated] (OAK-7093) ActiveDelete synchronization with BlobTracker leaves temp files

2017-12-20 Thread Amit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-7093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amit Jain updated OAK-7093:
---
Attachment: OAK_7093.patch

[~mreutegg] Yes, that looks to be the case here. Attached is the proposed patch.

> ActiveDelete synchronization with BlobTracker leaves temp files
> ---
>
> Key: OAK-7093
> URL: https://issues.apache.org/jira/browse/OAK-7093
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: blob-plugins
>Reporter: Amit Jain
>Assignee: Marcel Reutegger
> Fix For: 1.8, 1.7.14
>
> Attachments: OAK_7093.patch
>
>
> When synchronizing active deleted files with blob tracker a temp file is 
> created and not removed.





[jira] [Resolved] (OAK-7093) ActiveDelete synchronization with BlobTracker leaves temp files

2017-12-20 Thread Marcel Reutegger (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-7093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcel Reutegger resolved OAK-7093.
---
Resolution: Fixed

Fixed the test to reset the system property at the end of the test: 
http://svn.apache.org/r1818800

> ActiveDelete synchronization with BlobTracker leaves temp files
> ---
>
> Key: OAK-7093
> URL: https://issues.apache.org/jira/browse/OAK-7093
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: blob-plugins
>Reporter: Amit Jain
>Assignee: Marcel Reutegger
> Fix For: 1.8, 1.7.14
>
> Attachments: OAK_7093.patch
>
>
> When synchronizing active deleted files with blob tracker a temp file is 
> created and not removed.





[jira] [Updated] (OAK-7100) update htmlunit test dependency

2017-12-20 Thread Julian Reschke (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-7100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Reschke updated OAK-7100:

Fix Version/s: 1.7.14

> update htmlunit test dependency
> ---
>
> Key: OAK-7100
> URL: https://issues.apache.org/jira/browse/OAK-7100
> Project: Jackrabbit Oak
>  Issue Type: Task
>  Components: examples
>Reporter: Julian Reschke
>Priority: Minor
> Fix For: 1.8, 1.7.14
>
>






[jira] [Commented] (OAK-7100) update htmlunit test dependency

2017-12-20 Thread Julian Reschke (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-7100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16298656#comment-16298656
 ] 

Julian Reschke commented on OAK-7100:
-

trunk: [r1818808|http://svn.apache.org/r1818808]


> update htmlunit test dependency
> ---
>
> Key: OAK-7100
> URL: https://issues.apache.org/jira/browse/OAK-7100
> Project: Jackrabbit Oak
>  Issue Type: Task
>  Components: examples
>Reporter: Julian Reschke
>Priority: Minor
> Fix For: 1.8, 1.7.14
>
>






[jira] [Resolved] (OAK-7100) update htmlunit test dependency

2017-12-20 Thread Julian Reschke (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-7100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Reschke resolved OAK-7100.
-
Resolution: Fixed

> update htmlunit test dependency
> ---
>
> Key: OAK-7100
> URL: https://issues.apache.org/jira/browse/OAK-7100
> Project: Jackrabbit Oak
>  Issue Type: Task
>  Components: examples
>Reporter: Julian Reschke
>Priority: Minor
>  Labels: candidate_oak_1_6
> Fix For: 1.8, 1.7.14
>
>






[jira] [Updated] (OAK-7100) update htmlunit test dependency

2017-12-20 Thread Julian Reschke (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-7100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Reschke updated OAK-7100:

Labels: candidate_oak_1_6  (was: )

> update htmlunit test dependency
> ---
>
> Key: OAK-7100
> URL: https://issues.apache.org/jira/browse/OAK-7100
> Project: Jackrabbit Oak
>  Issue Type: Task
>  Components: examples
>Reporter: Julian Reschke
>Priority: Minor
>  Labels: candidate_oak_1_6
> Fix For: 1.8, 1.7.14
>
>






[jira] [Created] (OAK-7101) Stale documents in RDBDocumentStore cache

2017-12-20 Thread Marcel Reutegger (JIRA)
Marcel Reutegger created OAK-7101:
-

 Summary: Stale documents in RDBDocumentStore cache
 Key: OAK-7101
 URL: https://issues.apache.org/jira/browse/OAK-7101
 Project: Jackrabbit Oak
  Issue Type: Bug
  Components: rdbmk
Affects Versions: 1.6.0
Reporter: Marcel Reutegger
Assignee: Marcel Reutegger
 Fix For: 1.8


Concurrent query and update operations on RDBDocumentStore may result in stale 
entries in the document cache.
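
As a generic illustration of how such a race can leave a stale entry behind (this is not the RDBDocumentStore code, and not necessarily how the issue gets fixed): a query reads a document from the database, an update then writes a newer revision and caches it, and finally the query caches the older copy it read earlier. The sketch below replays that interleaving sequentially and contrasts a naive put with a put guarded by a version counter. The Doc class, the modCount field as used here and both put methods are assumptions made for the example.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class StaleCacheSketch {

    /** Hypothetical cached document carrying a monotonically increasing counter. */
    static final class Doc {
        final String id;
        final long modCount;

        Doc(String id, long modCount) {
            this.id = id;
            this.modCount = modCount;
        }
    }

    private final Map<String, Doc> cache = new ConcurrentHashMap<>();

    /** Naive variant: the last writer wins, even if it carries older data. */
    void putNaive(Doc doc) {
        cache.put(doc.id, doc);
    }

    /** Guarded variant: keep whichever copy has the higher modCount. */
    void putIfNewer(Doc doc) {
        cache.merge(doc.id, doc,
                (cached, incoming) -> incoming.modCount > cached.modCount ? incoming : cached);
    }

    Doc get(String id) {
        return cache.get(id);
    }

    public static void main(String[] args) {
        Doc readByQuery = new Doc("1:/a", 1);      // query read this before the update
        Doc writtenByUpdate = new Doc("1:/a", 2);  // update stored this afterwards

        StaleCacheSketch naive = new StaleCacheSketch();
        naive.putNaive(writtenByUpdate);  // update caches the new revision
        naive.putNaive(readByQuery);      // query result arrives late and overwrites it
        System.out.println("naive cache holds modCount " + naive.get("1:/a").modCount);   // prints 1

        StaleCacheSketch guarded = new StaleCacheSketch();
        guarded.putIfNewer(writtenByUpdate);
        guarded.putIfNewer(readByQuery);  // rejected, the cached copy is newer
        System.out.println("guarded cache holds modCount " + guarded.get("1:/a").modCount); // prints 2
    }
}
```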

Potentially related issues are OAK-5387 and OAK-6062.





[jira] [Commented] (OAK-7101) Stale documents in RDBDocumentStore cache

2017-12-20 Thread Marcel Reutegger (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-7101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16298673#comment-16298673
 ] 

Marcel Reutegger commented on OAK-7101:
---

Added an ignored test: http://svn.apache.org/r1818814

> Stale documents in RDBDocumentStore cache
> -
>
> Key: OAK-7101
> URL: https://issues.apache.org/jira/browse/OAK-7101
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: rdbmk
>Affects Versions: 1.6.0
>Reporter: Marcel Reutegger
>Assignee: Marcel Reutegger
> Fix For: 1.8
>
>
> Concurrent query and update operations on RDBDocumentStore may result in 
> stale entries in the document cache.
> Potentially related issues are OAK-5387 and OAK-6062.





[jira] [Created] (OAK-7102) Refactor DocumentIndexer logic to enable different sort approaches

2017-12-20 Thread Chetan Mehrotra (JIRA)
Chetan Mehrotra created OAK-7102:


 Summary: Refactor DocumentIndexer logic to enable different sort 
approaches
 Key: OAK-7102
 URL: https://issues.apache.org/jira/browse/OAK-7102
 Project: Jackrabbit Oak
  Issue Type: Task
  Components: run
Reporter: Chetan Mehrotra
Assignee: Chetan Mehrotra
 Fix For: 1.7.14, 1.8


The DocumentStoreIndexer logic needs to be refactored to support plugging in 
different sort approaches.





[jira] [Created] (OAK-7103) Enable compression by default on DocumentStoreIndexer logic

2017-12-20 Thread Chetan Mehrotra (JIRA)
Chetan Mehrotra created OAK-7103:


 Summary: Enable compression by default on DocumentStoreIndexer 
logic
 Key: OAK-7103
 URL: https://issues.apache.org/jira/browse/OAK-7103
 Project: Jackrabbit Oak
  Issue Type: Improvement
  Components: run
Reporter: Chetan Mehrotra
Assignee: Chetan Mehrotra
 Fix For: 1.7.14, 1.8


While performing tests it appears that enabling end-to-end compression reduces 
the sorting time by 14 mins (39.87 min to 26.44 min) and disk consumption by 
65GB (87GB to 12.5GB). Based on that, we should enable compression by default for:

# The base store.json written by traversal
# The intermediate files created while sorting
# The final sorted json file





[jira] [Created] (OAK-7104) Support read and writing to compressed file in ExternalSort

2017-12-20 Thread Chetan Mehrotra (JIRA)
Chetan Mehrotra created OAK-7104:


 Summary: Support read and writing to compressed file in 
ExternalSort
 Key: OAK-7104
 URL: https://issues.apache.org/jira/browse/OAK-7104
 Project: Jackrabbit Oak
  Issue Type: Improvement
  Components: commons
Reporter: Chetan Mehrotra
Assignee: Chetan Mehrotra
 Fix For: 1.7.14, 1.8


Currently ExternalSort only supports compression for the intermediate files 
created in the merge phase. It would be good to also support reading from and 
writing to compressed files.
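
A minimal sketch of what reading and writing a line-oriented file with optional gzip compression can look like using only java.util.zip. It is not the ExternalSort API; the class name, the method names and the useZip flag are assumptions made for illustration.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class CompressedLinesSketch {

    /** Opens a UTF-8 writer; the bytes are gzipped when useZip is true. */
    static BufferedWriter openWriter(File file, boolean useZip) throws IOException {
        OutputStream out = new FileOutputStream(file);
        if (useZip) {
            out = new GZIPOutputStream(out);
        }
        return new BufferedWriter(new OutputStreamWriter(out, StandardCharsets.UTF_8));
    }

    /** Opens a UTF-8 reader; the bytes are gunzipped when useZip is true. */
    static BufferedReader openReader(File file, boolean useZip) throws IOException {
        InputStream in = new FileInputStream(file);
        if (useZip) {
            in = new GZIPInputStream(in);
        }
        return new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
    }

    static void writeLines(File file, List<String> lines, boolean useZip) throws IOException {
        // Closing the writer also finishes the gzip stream and writes its trailer.
        try (BufferedWriter writer = openWriter(file, useZip)) {
            for (String line : lines) {
                writer.write(line);
                writer.newLine();
            }
        }
    }

    static List<String> readLines(File file, boolean useZip) throws IOException {
        List<String> result = new ArrayList<>();
        try (BufferedReader reader = openReader(file, useZip)) {
            String line;
            while ((line = reader.readLine()) != null) {
                result.add(line);
            }
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        File file = File.createTempFile("sorted-sketch", ".gz");
        file.deleteOnExit();
        writeLines(file, Arrays.asList("/a|{}", "/a/b|{}"), true);
        System.out.println(readLines(file, true));
    }
}
```

Because only the stream wrapping changes, the same sort and merge code can work on plain or compressed files by passing the flag through.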





[jira] [Updated] (OAK-7104) Support read and write to compressed file in ExternalSort

2017-12-20 Thread Chetan Mehrotra (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-7104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Mehrotra updated OAK-7104:
-
Summary: Support read and write to compressed file in ExternalSort  (was: 
Support read and writing to compressed file in ExternalSort)

> Support read and write to compressed file in ExternalSort
> -
>
> Key: OAK-7104
> URL: https://issues.apache.org/jira/browse/OAK-7104
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: commons
>Reporter: Chetan Mehrotra
>Assignee: Chetan Mehrotra
> Fix For: 1.7.14, 1.8
>
>
> Currently ExternalSort only supports compression for the intermediate files 
> created in the merge phase. It would be good to also support reading from and 
> writing to compressed files.





[jira] [Resolved] (OAK-7104) Support read and write to compressed file in ExternalSort

2017-12-20 Thread Chetan Mehrotra (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-7104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Mehrotra resolved OAK-7104.
--
   Resolution: Fixed
Fix Version/s: (was: 1.7.14)
   1.7.15

Done with http://svn.apache.org/viewvc?rev=1818878&view=rev

[~amjain] Please review the commit once

> Support read and write to compressed file in ExternalSort
> -
>
> Key: OAK-7104
> URL: https://issues.apache.org/jira/browse/OAK-7104
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: commons
>Reporter: Chetan Mehrotra
>Assignee: Chetan Mehrotra
> Fix For: 1.7.15, 1.8
>
>
> Currently ExternalSort only supports compression for the intermediate files 
> created in the merge phase. It would be good to also support reading from and 
> writing to compressed files.





[jira] [Resolved] (OAK-7103) Enable compression by default on DocumentStoreIndexer logic

2017-12-20 Thread Chetan Mehrotra (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-7103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Mehrotra resolved OAK-7103.
--
   Resolution: Fixed
Fix Version/s: (was: 1.7.14)
   1.7.15

Done with 1818879

> Enable compression by default on DocumentStoreIndexer logic
> ---
>
> Key: OAK-7103
> URL: https://issues.apache.org/jira/browse/OAK-7103
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: run
>Reporter: Chetan Mehrotra
>Assignee: Chetan Mehrotra
> Fix For: 1.8, 1.7.15
>
>
> While performing tests it appears that enabling end-to-end compression 
> reduces the sorting time by 14 mins (39.87 min to 26.44 min) and disk 
> consumption by 65GB (87GB to 12.5GB). Based on that, we should enable 
> compression by default for:
> # The base store.json written by traversal
> # The intermediate files created while sorting
> # The final sorted json file





[jira] [Commented] (OAK-7104) Support read and write to compressed file in ExternalSort

2017-12-20 Thread Amit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-7104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16299594#comment-16299594
 ] 

Amit Jain commented on OAK-7104:


[~chetanm] Looks good to me.

> Support read and write to compressed file in ExternalSort
> -
>
> Key: OAK-7104
> URL: https://issues.apache.org/jira/browse/OAK-7104
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: commons
>Reporter: Chetan Mehrotra
>Assignee: Chetan Mehrotra
> Fix For: 1.8, 1.7.15
>
>
> Currently ExternalSort only supports compression for the intermediate files 
> created in the merge phase. It would be good to also support reading from and 
> writing to compressed files.





[jira] [Created] (OAK-7105) Implement a traverse with sort strategy for DocumentStoreIndexer

2017-12-20 Thread Chetan Mehrotra (JIRA)
Chetan Mehrotra created OAK-7105:


 Summary: Implement a traverse with sort strategy for 
DocumentStoreIndexer
 Key: OAK-7105
 URL: https://issues.apache.org/jira/browse/OAK-7105
 Project: Jackrabbit Oak
  Issue Type: Improvement
  Components: run
Reporter: Chetan Mehrotra
Assignee: Chetan Mehrotra
 Fix For: 1.8, 1.7.15


Currently the DocumentStoreIndexer logic uses a StoreAndSortStrategy in which 
it first dumps all nodestates to a json file, then sorts them in batches, and 
finally merges the sorted files. Within the whole indexing run the sorting 
phase takes a significant amount of time (40 mins out of a 3 hr run).

Further, this approach suffers from potential OOM when ExternalSort creates 
in-memory batches whose actual size exceeds the estimated size considerably. 
So we need to constantly tweak "oak.indexer.maxSortMemoryInGB" (currently 
set to 2 GB).

As an improvement we can make the following changes:

# Implement a traverse with sort strategy - Here, instead of first dumping all 
nodestates into a single big json file, we add them to an in-memory buffer and 
then at some stage sort the batch and save it to a file
# Use better memory checks - Use the approach implemented in GCBarrier, i.e. 
monitor the current memory usage and, if available memory goes below a certain 
threshold, trigger the batch sort
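
The two changes above can be sketched roughly as follows. This is an illustration under assumptions, not the planned implementation: entries are plain strings sorted in natural order, the memory check is a simple estimate based on java.lang.Runtime rather than the GCBarrier logic, and the class name, minFreeBytes and maxBatchEntries are hypothetical. For brevity the merge step returns a list instead of writing the final sorted file.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

public class TraverseWithSortSketch {

    private final List<String> buffer = new ArrayList<>();
    private final List<File> sortedBatches = new ArrayList<>();
    private final long minFreeBytes;   // hypothetical low-memory threshold
    private final int maxBatchEntries; // hard cap, keeps the sketch deterministic

    TraverseWithSortSketch(long minFreeBytes, int maxBatchEntries) {
        this.minFreeBytes = minFreeBytes;
        this.maxBatchEntries = maxBatchEntries;
    }

    /** Rough estimate of the heap the JVM can still hand out. */
    static long availableMemory() {
        Runtime rt = Runtime.getRuntime();
        return rt.maxMemory() - (rt.totalMemory() - rt.freeMemory());
    }

    /** Called once per traversed entry; flushes a sorted batch when memory runs low. */
    void add(String entry) throws IOException {
        buffer.add(entry);
        if (buffer.size() >= maxBatchEntries || availableMemory() < minFreeBytes) {
            flushBatch();
        }
    }

    /** Sorts the in-memory buffer and saves it as one batch file. */
    private void flushBatch() throws IOException {
        if (buffer.isEmpty()) {
            return;
        }
        Collections.sort(buffer);
        File batch = File.createTempFile("sort-batch", ".txt");
        batch.deleteOnExit();
        Files.write(batch.toPath(), buffer, StandardCharsets.UTF_8);
        sortedBatches.add(batch);
        buffer.clear();
    }

    /** Flushes the remaining entries and merges all sorted batches (k-way merge). */
    List<String> finish() throws IOException {
        flushBatch();
        List<BufferedReader> readers = new ArrayList<>();
        PriorityQueue<Cursor> heads = new PriorityQueue<>();
        List<String> merged = new ArrayList<>();
        try {
            for (File batch : sortedBatches) {
                BufferedReader reader = Files.newBufferedReader(batch.toPath(), StandardCharsets.UTF_8);
                readers.add(reader);
                Cursor cursor = new Cursor(reader);
                if (cursor.current != null) {
                    heads.add(cursor);
                }
            }
            while (!heads.isEmpty()) {
                Cursor smallest = heads.poll();
                merged.add(smallest.current);
                if (smallest.advance()) {
                    heads.add(smallest);
                }
            }
        } finally {
            for (BufferedReader reader : readers) {
                reader.close();
            }
        }
        return merged;
    }

    /** Tracks the current line of one batch file; ordered by that line. */
    private static final class Cursor implements Comparable<Cursor> {
        private final BufferedReader reader;
        String current;

        Cursor(BufferedReader reader) throws IOException {
            this.reader = reader;
            this.current = reader.readLine();
        }

        boolean advance() throws IOException {
            current = reader.readLine();
            return current != null;
        }

        @Override
        public int compareTo(Cursor other) {
            return current.compareTo(other.current);
        }
    }
}
```

Compared with dumping everything first and sorting afterwards, the traversal output is written only once, already in sorted batches, and the batch size follows the memory that is actually available instead of an up-front estimate.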


