RE: Slow full text query performance and Lucene Index handling in Oak

2014-04-09 Thread Marcel Reutegger
Hi,

 Since the Lucene index is in any case updated asynchronously, it
 should be fine for us to ignore the base NodeState of the current
 session and instead use an IndexSearcher based on the last state as
 updated by the async indexer. This would allow us to reuse the
 IndexSearcher over multiple queries.

I was also wondering if it makes sense to share it across multiple
sessions performing a query, to reduce the number of index readers
that may be open at the same time. However, this will likely also
reduce concurrency because we synchronize access to a single
session.

We should also try to re-open the existing reader, which is less
costly than creating a new reader. I'm no longer familiar with
the most recent Lucene version, but with the version used in
Jackrabbit 2.x this was possible and helped a lot.

Regards
 Marcel
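
Note: the reuse idea in this message can be sketched in plain Java, without any Lucene or Oak dependency. CachedSearcher, the state supplier and the opener function are hypothetical names introduced only for illustration; in a real setup the state would be something like the last revision written by the async indexer and the opener would create or re-open the reader. This is a sketch under those assumptions, not the Oak implementation.

```java
import java.util.function.Function;
import java.util.function.Supplier;

/**
 * Caches one "searcher" per index state and rebuilds it only when the state changes.
 * S stands in for the index state (for example a revision id) and R for a searcher;
 * both are placeholders, not Oak or Lucene types.
 */
final class CachedSearcher<S, R> {
    private final Supplier<S> currentState; // reads the last state written by the indexer
    private final Function<S, R> opener;    // expensive: opens a searcher for that state
    private S openedFor;                    // state the cached searcher was opened for
    private R searcher;                     // cached searcher, null until first use
    private int opens;                      // how often the expensive open ran

    CachedSearcher(Supplier<S> currentState, Function<S, R> opener) {
        this.currentState = currentState;
        this.opener = opener;
    }

    /** Returns the cached searcher, opening a new one only if the index state changed. */
    synchronized R get() {
        S state = currentState.get();
        if (searcher == null || !state.equals(openedFor)) {
            searcher = opener.apply(state); // a real version would also close the old one
            openedFor = state;
            opens++;
        }
        return searcher;
    }

    synchronized int opens() {
        return opens;
    }
}
```

With this shape, many queries against an unchanged index share one open, and a new open happens only after the indexer has moved the state forward.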


Re: Slow full text query performance and Lucene Index handling in Oak

2014-04-09 Thread Thomas Mueller
Hi,

Do we still have the option to store the Lucene files in the file system?
If we have, maybe we could run the test with that option and see if it
improves performance? I'm not suggesting this is a solution, it's just one
step to better analyze things. And it might be easy to do.

Regards,
Thomas



On 08/04/14 17:51, Chetan Mehrotra chetan.mehro...@gmail.com wrote:

Hi,

As part of OAK-1702 [1] I have added a benchmark to compare the
performance of full text queries with JR2.

Based on the approach taken (which might be wrong), I get the following numbers:

Apache Jackrabbit Oak 0.21.0-SNAPSHOT
# FullTextSearchTest       C     min     10%     50%     90%     max       N
Oak-Mongo                  1      58      71     101     119     287     610
Oak-Mongo-FDS              1      50      51      52      58     184    1106
Oak-Tar                    1      39      40      40      44      64    1459
Oak-Tar-FDS                1      53      54      55      64     197    1030
Jackrabbit                 1       4       4       5       6     231   11385

This shows that JR2 performs a lot better for full text queries, and that
subsequent queries are considerably faster once Lucene has warmed up.

Looking at the current usage of Lucene in Oak and the way we store and
access the Lucene indexes [2], I have a couple of doubts:

1. Multiple IndexSearcher instances - The current implementation creates a
new IndexSearcher for every Lucene query, because the OakDirectory used is
bound to the NodeState of the executing JCR session. In JR2, by comparison,
we probably had a singleton IndexSearcher that was shared across all query
execution paths. This could cause performance issues, as Lucene is
effectively used in a stateless way and has to perform its initialization
on every call. As per [3], the IndexSearcher must be shared.

2. Index access - Currently we have a custom OakDirectory which provides
access to Lucene indexes stored in the NodeStore. Even with the SegmentStore,
which uses memory-mapped files, the random access used by Lucene would
probably be a lot slower with OakDirectory than with the default Lucene
MMapDirectory. For small setups where the Lucene index can be accommodated
on each node, I think it would be better if the index were accessed from
the file system.

Are the above concerns valid, and should we revisit how we are using
Lucene in Oak?

Chetan Mehrotra
[1] https://issues.apache.org/jira/browse/OAK-1702
[2] https://github.com/apache/jackrabbit-oak/blob/trunk/oak-lucene/src/main/java/org/apache/jackrabbit/oak/plugins/index/lucene/OakDirectory.java
[3] http://wiki.apache.org/lucene-java/ImproveSearchingSpeed



Re: Slow full text query performance and Lucene Index handling in Oak

2014-04-09 Thread Chetan Mehrotra
On Wed, Apr 9, 2014 at 12:25 PM, Marcel Reutegger mreut...@adobe.com wrote:
 Since the Lucene index is in any case updated asynchronously, it
 should be fine for us to ignore the base NodeState of the current
 session and instead use an IndexSearcher based on the last state as
 updated by the async indexer. This would allow us to reuse the
 IndexSearcher over multiple queries.

 I was also wondering if it makes sense to share it across multiple
 sessions performing a query to reduce the number of index readers
 that may be open at the same time. However, this will likely also
 reduce concurrency because we synchronize access to a single
 session.

I tried one approach where I used a custom SearcherManager [1] based
on the Lucene SearcherManager [2]. It obtains the root NodeState directly
from the NodeStore. As the NodeStore can be accessed concurrently, it should
not have any impact on session concurrency.

With this change there is a slight improvement:

# FullTextSearchTest       C     min     10%     50%     90%     max       N
Oak-Tar                    1      39      40      40      44      64    1459
Oak-Tar(Shared)            1      32      33      34      36      61    1738

So it did not give much of a boost (at least with the approach taken). As I
do not have much understanding of Lucene internals, can someone review the
approach taken and see if there are any major issues with it?


Chetan Mehrotra
[1] https://issues.apache.org/jira/secure/attachment/12639366/OAK-1702-shared-indexer.patch
[2] https://lucene.apache.org/core/3_6_0/api/all/org/apache/lucene/search/SearcherManager.html
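
Note: the acquire/release pattern that a shared searcher manager relies on can be sketched with reference counting in plain Java. SharedSearcherSketch and Ref are hypothetical stand-ins and not the code from the attached patch; the real Lucene SearcherManager has its own API. The sketch only shows why concurrent queries can borrow one searcher safely while a newer one is published.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class SharedSearcherSketch {

    /** Stand-in for a searcher plus its reader; "closed" once nobody references it. */
    static final class Ref {
        final String name;
        // Starts at 1: that reference belongs to the manager itself.
        private final AtomicInteger refCount = new AtomicInteger(1);
        private volatile boolean closed;

        Ref(String name) {
            this.name = name;
        }

        /** Takes a reference unless the count has already dropped to zero. */
        boolean tryIncRef() {
            for (;;) {
                int c = refCount.get();
                if (c <= 0) {
                    return false; // already closed, the caller must retry on a newer Ref
                }
                if (refCount.compareAndSet(c, c + 1)) {
                    return true;
                }
            }
        }

        void decRef() {
            if (refCount.decrementAndGet() == 0) {
                closed = true; // real code would close the underlying reader here
            }
        }

        boolean isClosed() {
            return closed;
        }
    }

    private volatile Ref current;

    SharedSearcherSketch(Ref initial) {
        current = initial;
    }

    /** Borrows the current searcher; always pair with release() in a finally block. */
    Ref acquire() {
        for (;;) {
            Ref r = current;
            if (r.tryIncRef()) {
                return r;
            }
        }
    }

    void release(Ref r) {
        r.decRef();
    }

    /** Publishes a new searcher, for example after the async indexer has committed. */
    synchronized void swap(Ref fresh) {
        Ref old = current;
        current = fresh;
        old.decRef(); // gives up the manager's reference; in-flight queries keep theirs
    }
}
```

A query would call acquire(), run the search, and call release() in a finally block; the old searcher is only closed once the last in-flight query has released it.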


Re: Slow full text query performance and Lucene Index handling in Oak

2014-04-09 Thread Chetan Mehrotra
On Wed, Apr 9, 2014 at 3:00 PM, Alex Parvulescu
alex.parvule...@gmail.com wrote:
  - the patch assumes that there is and will be a single lucene index
 directly under the root node, which may not necessarily be the case. I
 agree this assumption holds now, but I would not introduce any changes that
 take away this flexibility.

That is not a problem per se, as IndexReader starts with a count of 1,
so it would never go to zero.

The problem appears to be somewhere else: I modified the code to
use a shared IndexSearcher and a native FSDirectory, and the
performance improvement was still marginal.

The problem occurs because
org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndex#query [1]
currently does an eager initialization of the cursor, while the test case
only fetches the first result. The JR2 version, by comparison, does a
lazy evaluation. If I put a break in the loop (exit after the first
result), the results are much better:

# FullTextSearchTest                 C     min     10%     50%     90%     max       N
Oak-Tar(break.shared searcher,fs)    1       2       2       3       3     170   23204
Oak-Tar(break)                       1       5       5       5       6      90   10593
Jackrabbit                           1       4       4       5       6     231   11385

Now I am not sure whether this is a problem with the use case chosen, or
whether the Lucene index cursor management should be improved, as in many
cases there will be multiple results but the client code only makes use
of the first few.

Chetan Mehrotra
[1] https://github.com/apache/jackrabbit-oak/blob/trunk/oak-lucene/src/main/java/org/apache/jackrabbit/oak/plugins/index/lucene/LuceneIndex.java#L381-L409
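
Note: the difference between eager and lazy cursor initialization described in this message can be sketched in plain Java. CursorSketch and the loadPath function are hypothetical; loadPath stands in for the per-hit document lookup, and the sketch does not reproduce the actual LuceneIndex code.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.IntFunction;

final class CursorSketch {

    /** Eager: loads the path of every hit before the caller sees the first row. */
    static Iterator<String> eager(int hitCount, IntFunction<String> loadPath) {
        List<String> paths = new ArrayList<>();
        for (int doc = 0; doc < hitCount; doc++) {
            paths.add(loadPath.apply(doc));
        }
        return paths.iterator();
    }

    /** Lazy: loads a path only when the caller asks for the next row. */
    static Iterator<String> lazy(int hitCount, IntFunction<String> loadPath) {
        return new Iterator<String>() {
            private int doc = 0;

            @Override
            public boolean hasNext() {
                return doc < hitCount;
            }

            @Override
            public String next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return loadPath.apply(doc++);
            }
        };
    }
}
```

For a query with 1000 hits where the client reads only the first row, the eager variant performs 1000 lookups and the lazy variant performs one, which matches the effect of the break experiment above.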


Re: Slow full text query performance and Lucene Index handling in Oak

2014-04-09 Thread Jukka Zitting
Hi,

On Wed, Apr 9, 2014 at 7:24 AM, Chetan Mehrotra
chetan.mehro...@gmail.com wrote:
 ... the testcase only fetches the first result.

Is that a common use case? To better simulate a normal usage scenario
I'd make the benchmark fetch up to N results (where N is configurable,
with default something like 20) and access the path and the title
property of the matching nodes.

BR,

Jukka Zitting
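
Note: the benchmark loop suggested in this message could look roughly like the following plain Java sketch. FetchLimitSketch and Row are hypothetical and only model the two values mentioned (path and title); the default of 20 is taken from the suggestion above. The actual benchmark change is not shown here.

```java
import java.util.Iterator;
import java.util.Objects;

final class FetchLimitSketch {

    /** Default number of results to read, as suggested in the message above. */
    static final int DEFAULT_LIMIT = 20;

    /** Hypothetical result row holding only the values the benchmark should touch. */
    static final class Row {
        final String path;
        final String title;

        Row(String path, String title) {
            this.path = path;
            this.title = title;
        }
    }

    /** Reads path and title of up to 'limit' rows and returns how many rows were read. */
    static int fetch(Iterator<Row> rows, int limit) {
        int read = 0;
        while (read < limit && rows.hasNext()) {
            Row row = rows.next();
            // Touch both values so that lazily loaded data is really fetched.
            Objects.requireNonNull(row.path, "path");
            Objects.requireNonNull(row.title, "title");
            read++;
        }
        return read;
    }
}
```

Stopping at the limit keeps the test closer to a result page than either reading one row or draining the whole result set.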


Re: Slow full text query performance and Lucene Index handling in Oak

2014-04-09 Thread Thomas Mueller
Hi,

We have results from a different test case with multiple threads (internal
id GRANITE-5572). We have 50 full thread dumps, and there I count:

* 259 cases of LuceneIndex.java line 365:
  IndexReader reader = DirectoryReader.open(directory);

* 43 cases of LuceneIndex.java line 379:
  TopDocs docs = searcher.search(query, Integer.MAX_VALUE);

* 13 cases of LuceneIndex.java line 382:
  String path = reader.document(doc.doc, PATH_SELECTOR).get(PATH);


So, running the Lucene query and getting the paths is slow, but opening
the Lucene index is even slower in this test case.

Regards,
Thomas
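
Note: the counts above can be turned into shares of the listed samples with a few lines of plain Java. ThreadDumpShare is a hypothetical helper; it only considers the three locations listed in this message (315 samples in total), so the percentages are relative to those and not to every thread in the dumps.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class ThreadDumpShare {

    /** Converts per-location sample counts into percentages of all listed samples. */
    static Map<String, Double> shares(Map<String, Integer> counts) {
        int total = 0;
        for (int c : counts.values()) {
            total += c;
        }
        Map<String, Double> result = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            result.put(e.getKey(), 100.0 * e.getValue() / total);
        }
        return result;
    }

    public static void main(String[] args) {
        // Counts as reported in the message above.
        Map<String, Integer> counts = new LinkedHashMap<>();
        counts.put("DirectoryReader.open", 259);
        counts.put("searcher.search", 43);
        counts.put("reader.document", 13);
        shares(counts).forEach((k, v) -> System.out.printf("%-22s %5.1f%%%n", k, v));
    }
}
```

On these numbers, opening the reader accounts for roughly 82% of the listed samples, the search for about 14%, and fetching the path for about 4%.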




Re: Slow full text query performance and Lucene Index handling in Oak

2014-04-09 Thread Tommaso Teofili
2014-04-09 13:44 GMT+02:00 Jukka Zitting jukka.zitt...@gmail.com:

 Hi,

 On Wed, Apr 9, 2014 at 7:24 AM, Chetan Mehrotra
 chetan.mehro...@gmail.com wrote:
  ... the testcase only fetches the first result.

 Is that a common use case? To better simulate a normal usage scenario
 I'd make the benchmark fetch up to N results (where N is configurable,
 with default something like 20) and access the path and the title
 property of the matching nodes.


+1

Also, I wonder whether we should profile the stack of underlying calls
in the QueryEngine to measure how much time is spent there and how much
time is spent in the specific QueryIndex implementation.

Regards,
Tommaso







Re: Slow full text query performance and Lucene Index handling in Oak

2014-04-09 Thread Thomas Mueller

 also, I wonder if we shouldn't also profile the stack of underlying calls
 in the QueryEngine to measure how much time is spent there and how much
 time is spent in the specific QueryIndex implementation.

Analyzing full thread dumps will give you the statistical distribution,
which is quite accurate if you have enough data. In the full thread dumps
I have seen so far, I didn't see a single thread running within the query
engine itself. All (~300) threads were in the LuceneIndex for this case.
So I expect the query engine part to be negligible (less than 1%).

Regards,
Thomas





Re: Slow full text query performance and Lucene Index handling in Oak

2014-04-09 Thread Chetan Mehrotra
On Wed, Apr 9, 2014 at 5:14 PM, Jukka Zitting jukka.zitt...@gmail.com wrote:
 Is that a common use case? To better simulate a normal usage scenario
 I'd make the benchmark fetch up to N results (where N is configurable,
 with default something like 20) and access the path and the title
 property of the matching nodes.

I changed the logic of the benchmark in http://svn.apache.org/r1585962.
With that, JR2 slows down a bit:

# FullTextSearchTest       C     min     10%     50%     90%     max       N
Oak-Tar                    1      34      35      36      39      60    1639
Jackrabbit                 1       5       5       6       7      68   10038

Profiling the result shows that quite a bit of time (40%) is spent in
org.apache.lucene.codecs.compressing.LZ4.decompress(). I think this
is part of Lucene 4.x and not present in 3.x. Any idea if I can
disable compression?

Chetan Mehrotra


Re: Slow full text query performance and Lucene Index handling in Oak

2014-04-09 Thread Alex Parvulescu
 Profiling the result shows that quite a bit of time goes in
 org.apache.lucene.codecs.compressing.LZ4.decompress() (40%). This I
 think is part of Lucene 4.x and not present in 3.x. Any idea if I can
 disable compression?

+1, I noticed that too. We should try to disable compression and compare
results.

alex



Re: Slow full text query performance and Lucene Index handling in Oak

2014-04-09 Thread Tommaso Teofili
I'm looking into the Lucene codecs right now.

Tommaso



Re: Slow full text query performance and Lucene Index handling in Oak

2014-04-09 Thread Alex Parvulescu
Aside from the compression issue, there was another one related to the
'order by' clause. I saw Collections.sort taking up as much as 23% of the
profiled time.

I removed the 'order by' temporarily so it doesn't get in the way of the
Lucene work, but I think the QueryEngine should skip ordering results in
this case.





Re: Slow full text query performance and Lucene Index handling in Oak

2014-04-09 Thread Chetan Mehrotra
Current update:

1. Tommaso provided a patch (OAK-1702) to disable compression, and that
also helps quite a bit.
2. Currently we are storing the full tokenized text in the Lucene index
[1]. This would make fetching of doc fields slower. On disabling the
storage, the numbers improve quite a bit. This storage was added as part
of OAK-319 to support MLT.

# FullTextSearchTest       C     min     10%     50%     90%     max       N
Oak-Tar (codec)            1       9       9      10      12      41    5664
Oak-Tar (codec,mlt off)    1       7       8       8      10      21    6921

Will look into this further.

Chetan Mehrotra
[1] https://github.com/apache/jackrabbit-oak/blob/trunk/oak-lucene/src/main/java/org/apache/jackrabbit/oak/plugins/index/lucene/FieldFactory.java#L44
