RE: Slow full text query performance and Lucene Index handling in Oak
Hi,

Since the Lucene index is updated asynchronously in any case, it should be fine for us to ignore the base NodeState of the current session and instead use an IndexSearcher based on the last state as updated by the async indexer. This would allow us to reuse the IndexSearcher over multiple queries.

I was also wondering if it makes sense to share it across multiple sessions performing a query, to reduce the number of index readers that may be open at the same time. However, this will likely also reduce concurrency because we synchronize access to a single session.

We should also try to re-open the existing reader, which is less costly than creating a new reader. I'm no longer familiar with the most recent Lucene version, but with the version used in Jackrabbit 2.x this was possible and helped a lot.

Regards
Marcel
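The reuse idea above can be sketched without any Lucene dependency. This is a minimal model, not Oak code: `SharedSearcherHolder`, the generation counter, and the `opener` callback are hypothetical names, and a `String` stands in for the searcher. It keeps one searcher for all callers and only pays the expensive open step when the async indexer has published a new index state.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.LongFunction;
import java.util.function.LongSupplier;

/**
 * Hypothetical holder (not an Oak or Lucene class): hands out one shared
 * searcher for as long as the index generation stays the same.
 */
class SharedSearcherHolder<S> {
    private final LongSupplier indexGeneration; // assumed: bumped by the async indexer
    private final LongFunction<S> opener;       // assumed: the expensive "open searcher" step
    private long cachedGeneration;
    private S cached;

    SharedSearcherHolder(LongSupplier indexGeneration, LongFunction<S> opener) {
        this.indexGeneration = indexGeneration;
        this.opener = opener;
    }

    /** Returns the cached searcher, reopening only when the generation changed. */
    synchronized S get() {
        long generation = indexGeneration.getAsLong();
        if (cached == null || generation != cachedGeneration) {
            cached = opener.apply(generation);
            cachedGeneration = generation;
        }
        return cached;
    }
}

class SharedSearcherDemo {
    /** Runs four queries across one index update and counts the opens. */
    static int countOpens() {
        long[] generation = {1L};
        AtomicInteger opens = new AtomicInteger();
        SharedSearcherHolder<String> holder = new SharedSearcherHolder<>(
                () -> generation[0],
                g -> { opens.incrementAndGet(); return "searcher@" + g; });
        holder.get();
        holder.get();
        holder.get();          // three queries, one open
        generation[0] = 2L;    // the async indexer publishes a new state
        holder.get();          // one more open
        return opens.get();
    }

    public static void main(String[] args) {
        System.out.println("opens = " + countOpens());
    }
}
```

With real Lucene 4.x readers, the refresh step would probably go through `DirectoryReader.openIfChanged`, and the old reader would have to be closed once no query uses it any more; neither is modelled here. The `synchronized` method is also only a stand-in and does not address the concurrency concern raised above.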
Re: Slow full text query performance and Lucene Index handling in Oak
Hi,

Do we still have the option to store the Lucene files in the file system? If we do, maybe we could run the test with that option and see if it improves performance? I'm not suggesting this is a solution; it's just one step to better analyze things. And it might be easy to do.

Regards,
Thomas

On 08/04/14 17:51, Chetan Mehrotra chetan.mehro...@gmail.com wrote:

Hi,

As part of OAK-1702 I have added a benchmark to compare the performance of full text query search with JR2. Based on the approach taken (which might be wrong) I get the following numbers:

Apache Jackrabbit Oak 0.21.0-SNAPSHOT
# FullTextSearchTest       C     min     10%     50%     90%     max       N
Oak-Mongo                  1      58      71     101     119     287     610
Oak-Mongo-FDS              1      50      51      52      58     184    1106
Oak-Tar                    1      39      40      40      44      64    1459
Oak-Tar-FDS                1      53      54      55      64     197    1030
Jackrabbit                 1       4       4       5       6     231   11385

This shows that JR2 performs a lot better for full text queries, and that subsequent queries are much faster once Lucene has warmed up. Looking at the current usage of Lucene in Oak and the way we store and access the Lucene indexes [2], I have a couple of doubts:

1. Multiple IndexSearcher instances - The current implementation creates a new IndexSearcher for every Lucene query, as the OakDirectory used is bound to the NodeState of the executing JCR session. In JR2, by comparison, we probably had a singleton IndexSearcher which was shared across all query execution paths. This could cause performance issues, as Lucene is effectively used in a stateless way and has to perform initialization for every call. As per [3], the IndexSearcher must be shared.

2. Index access - Currently we have a custom OakDirectory which provides access to Lucene indexes stored in the NodeStore. Even with the SegmentStore, which uses memory-mapped files, the random access used by Lucene would probably be a lot slower with OakDirectory than with the default Lucene MMapDirectory. For small setups where the Lucene index can be accommodated on each node, I think it would be better if the index were accessed from the file system.

Are the above concerns valid, and should we relook into how we are using Lucene in Oak?

Chetan Mehrotra

[1] https://issues.apache.org/jira/browse/OAK-1702
[2] https://github.com/apache/jackrabbit-oak/blob/trunk/oak-lucene/src/main/java/org/apache/jackrabbit/oak/plugins/index/lucene/OakDirectory.java
[3] http://wiki.apache.org/lucene-java/ImproveSearchingSpeed
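For readers unfamiliar with the benchmark columns, the sketch below shows one way such a row could be derived from per-iteration timings. It assumes that min, 10%, 50%, 90% and max are percentiles of the iteration time in milliseconds and that N is the number of iterations completed. `LatencyStats` is a hypothetical helper, and the nearest-rank percentile used here may differ from the definition the Oak benchmark runner applies.

```java
import java.util.Arrays;

/** Hypothetical helper: summarizes iteration timings into one benchmark row. */
class LatencyStats {

    /** Nearest-rank percentile (p in 1..100) over an ascending-sorted array. */
    static long percentile(long[] sorted, int p) {
        int rank = (p * sorted.length + 99) / 100; // integer ceil(p * n / 100)
        return sorted[Math.max(rank, 1) - 1];
    }

    /** Formats name, concurrency, min, 10%, 50%, 90%, max and N as one row. */
    static String summarize(String name, int concurrency, long[] millis) {
        long[] sorted = millis.clone();
        Arrays.sort(sorted);
        return String.format("%-24s %3d %7d %7d %7d %7d %7d %7d",
                name, concurrency,
                sorted[0],
                percentile(sorted, 10),
                percentile(sorted, 50),
                percentile(sorted, 90),
                sorted[sorted.length - 1],
                sorted.length);
    }

    public static void main(String[] args) {
        long[] sample = {5, 1, 4, 2, 3, 10, 9, 8, 7, 6}; // made-up timings in ms
        System.out.println(summarize("Demo", 1, sample));
    }
}
```

Read this way, a row such as the Jackrabbit one describes a run whose median iteration was far shorter than the Oak ones, which is also why it completed many more iterations.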
Re: Slow full text query performance and Lucene Index handling in Oak
On Wed, Apr 9, 2014 at 12:25 PM, Marcel Reutegger mreut...@adobe.com wrote:

Since the Lucene index is in any case updated asynchronously, it should be fine for us to ignore the base NodeState of the current session and instead use an IndexSearcher based on the last state as updated by the async indexer. This would allow us to reuse the IndexSearcher over multiple queries. I was also wondering if it makes sense to share it across multiple sessions performing a query to reduce the number of index readers that may be open at the same time. however, this will likely also reduce concurrency because we synchronize access to a single session.

I tried one approach where I used a custom SearcherManager based on the Lucene SearcherManager. It obtains the root NodeState directly from the NodeStore. As the NodeStore can be accessed concurrently, it should not have any impact on session concurrency.

With this change there is a slight improvement:

Oak-Tar                    1      39      40      40      44      64    1459
Oak-Tar(Shared)            1      32      33      34      36      61    1738

So it did not give much of a boost (at least with the approach taken). As I do not have much understanding of Lucene internals, can someone review the approach taken and see if there are any major issues with it?

Chetan Mehrotra

[1] https://issues.apache.org/jira/secure/attachment/12639366/OAK-1702-shared-indexer.patch
[2] https://lucene.apache.org/core/3_6_0/api/all/org/apache/lucene/search/SearcherManager.html
Re: Slow full text query performance and Lucene Index handling in Oak
On Wed, Apr 9, 2014 at 3:00 PM, Alex Parvulescu alex.parvule...@gmail.com wrote:

- the patch assumes that there is and will be a single lucene index directly under the root node, which may not necessarily be the case. I agree this assumption holds now, but I would not introduce any changes that take away this flexibility.

That is not a problem per se, as IndexReader starts with a count of 1. So it would never go to zero.

The problem appears to be somewhere else, as I modified the code to use a shared IndexSearcher and a native FSDirectory and the performance improvement was still marginal. The problem occurs because org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndex#query [1] currently does an eager initialization of the cursor, while the test case only fetches the first result. In comparison, the JR2 version does a lazy evaluation. If I put a break in the loop (exit after the first result) the results are much better:

Oak-Tar(break.shared searcher,fs)    1       2       2       3       3     170   23204
Oak-Tar(break)                       1       5       5       5       6      90   10593
Jackrabbit                           1       4       4       5       6     231   11385

Now I am not sure if this is a problem with the use case taken, or whether the Lucene index cursor management should be improved, as in many cases there would be multiple results but the client code only makes use of the first few.

Chetan Mehrotra

[1] https://github.com/apache/jackrabbit-oak/blob/trunk/oak-lucene/src/main/java/org/apache/jackrabbit/oak/plugins/index/lucene/LuceneIndex.java#L381-L409
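The difference between eager and lazy cursor initialization can be illustrated with a small, Lucene-free sketch. `LazyPathCursor` and its page loader are hypothetical names and are not the code in LuceneIndex.java; the loader stands in for whatever fetches the next batch of matching paths from the index.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.BiFunction;

/** Hypothetical lazy cursor: loads matching paths one page at a time. */
class LazyPathCursor implements Iterator<String> {
    private final BiFunction<Integer, Integer, List<String>> pageLoader; // (offset, limit) -> paths
    private final int pageSize;
    private List<String> page = new ArrayList<>();
    private int posInPage = 0;
    private int offset = 0;
    private boolean exhausted = false;

    LazyPathCursor(BiFunction<Integer, Integer, List<String>> pageLoader, int pageSize) {
        this.pageLoader = pageLoader;
        this.pageSize = pageSize;
    }

    @Override
    public boolean hasNext() {
        if (posInPage < page.size()) {
            return true;                  // still inside the current page
        }
        if (exhausted) {
            return false;
        }
        page = pageLoader.apply(offset, pageSize); // fetch only when needed
        offset += page.size();
        posInPage = 0;
        if (page.size() < pageSize) {
            exhausted = true;             // a short page means no more hits
        }
        return !page.isEmpty();
    }

    @Override
    public String next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return page.get(posInPage++);
    }
}

class LazyCursorDemo {
    /** Counts how many documents get loaded when only the first hit is read. */
    static int documentsLoadedForFirstResult(int totalHits, int pageSize) {
        int[] loaded = {0};
        LazyPathCursor cursor = new LazyPathCursor((offset, limit) -> {
            List<String> paths = new ArrayList<>();
            for (int i = offset; i < Math.min(offset + limit, totalHits); i++) {
                paths.add("/content/node" + i); // made-up paths
                loaded[0]++;
            }
            return paths;
        }, pageSize);
        cursor.next();
        return loaded[0];
    }

    public static void main(String[] args) {
        System.out.println(documentsLoadedForFirstResult(1000, 50));
    }
}
```

An eager cursor would load all 1000 paths in this demo before returning the first one, while the lazy one stops after a single page. In Lucene itself, paging of this kind could probably be built on `IndexSearcher.searchAfter`, but that is an assumption about a possible fix, not something the thread settles.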
Re: Slow full text query performance and Lucene Index handling in Oak
Hi,

On Wed, Apr 9, 2014 at 7:24 AM, Chetan Mehrotra chetan.mehro...@gmail.com wrote:

... the testcase only fetches the first result.

Is that a common use case? To better simulate a normal usage scenario I'd make the benchmark fetch up to N results (where N is configurable, with default something like 20) and access the path and the title property of the matching nodes.

BR,
Jukka Zitting
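A benchmark loop along the lines suggested above might look like the following sketch. `ResultRow`, `BoundedFetch`, and the `fulltext.maxResults` system property are hypothetical names chosen for illustration; the actual benchmark would read rows from a JCR query result instead.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Hypothetical stand-in for one query result row. */
class ResultRow {
    final String path;
    final String title;

    ResultRow(String path, String title) {
        this.path = path;
        this.title = title;
    }
}

class BoundedFetch {
    static final int DEFAULT_MAX_RESULTS = 20;

    /** Reads the limit from a made-up system property, defaulting to 20. */
    static int configuredMaxResults() {
        return Integer.getInteger("fulltext.maxResults", DEFAULT_MAX_RESULTS);
    }

    /** Reads at most maxResults rows and touches the path and title of each. */
    static List<String> readFirst(Iterator<ResultRow> rows, int maxResults) {
        List<String> seen = new ArrayList<>();
        // Check the limit first so the iterator is not asked for a row we will not use.
        while (seen.size() < maxResults && rows.hasNext()) {
            ResultRow row = rows.next();
            seen.add(row.path + " -> " + row.title);
        }
        return seen;
    }

    public static void main(String[] args) {
        List<ResultRow> rows = new ArrayList<>();
        for (int i = 0; i < 100; i++) {
            rows.add(new ResultRow("/content/node" + i, "Title " + i));
        }
        System.out.println(readFirst(rows.iterator(), configuredMaxResults()).size());
    }
}
```

Checking the limit before calling `hasNext()` matters when the underlying cursor is lazy, because `hasNext()` may trigger the load of another batch.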
Re: Slow full text query performance and Lucene Index handling in Oak
Hi,

We have results from a different test case with multiple threads (internal id GRANITE-5572). We have 50 full thread dumps, and there I count:

* 259 cases of LuceneIndex.java line 365: IndexReader reader = DirectoryReader.open(directory);
* 43 cases of LuceneIndex.java line 379: TopDocs docs = searcher.search(query, Integer.MAX_VALUE);
* 13 cases of LuceneIndex.java line 382: String path = reader.document(doc.doc, PATH_SELECTOR).get(PATH);

So, running the Lucene query and getting the paths is slow, but opening the Lucene index is even slower in this test case.

Regards,
Thomas
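The counts above can be turned into rough time shares, treating each thread found at a given line as one sample. This is a minimal sketch: `ThreadDumpSamples` is a hypothetical helper, the labels are shortened for readability, and the percentages are relative to the 315 samples listed in the message only, not to every thread in the dumps.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: converts sample counts into percentage shares. */
class ThreadDumpSamples {

    /** The three counts reported in the message, keyed by a short label. */
    static Map<String, Integer> reportedCounts() {
        Map<String, Integer> counts = new LinkedHashMap<>();
        counts.put("DirectoryReader.open", 259);   // LuceneIndex.java line 365
        counts.put("searcher.search", 43);         // LuceneIndex.java line 379
        counts.put("reader.document", 13);         // LuceneIndex.java line 382
        return counts;
    }

    /** Returns each entry's share of the total, in percent. */
    static Map<String, Double> shares(Map<String, Integer> counts) {
        int total = 0;
        for (int count : counts.values()) {
            total += count;
        }
        Map<String, Double> result = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            result.put(entry.getKey(), 100.0 * entry.getValue() / total);
        }
        return result;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, Double> entry : shares(reportedCounts()).entrySet()) {
            System.out.printf("%-22s %5.1f%%%n", entry.getKey(), entry.getValue());
        }
    }
}
```

On these numbers, opening the reader accounts for roughly four fifths of the sampled threads, which matches the conclusion that opening the index dominates in this test case.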
Re: Slow full text query performance and Lucene Index handling in Oak
2014-04-09 13:44 GMT+02:00 Jukka Zitting jukka.zitt...@gmail.com:

Hi, On Wed, Apr 9, 2014 at 7:24 AM, Chetan Mehrotra chetan.mehro...@gmail.com wrote: ... the testcase only fetches the first result. Is that a common use case? To better simulate a normal usage scenario I'd make the benchmark fetch up to N results (where N is configurable, with default something like 20) and access the path and the title property of the matching nodes.

+1

Also, I wonder if we shouldn't also profile the stack of underlying calls in the QueryEngine to measure how much time is spent there and how much time is spent in the specific QueryIndex implementation.

Regards,
Tommaso
Re: Slow full text query performance and Lucene Index handling in Oak
also, I wonder if we shouldn't also profile the stack of underlying calls in the QueryEngine to measure how much time is spent there and how much time is spent in the specific QueryIndex implementation.

Analyzing full thread dumps will give you the statistical distribution, which is quite accurate if you have enough data. In the full thread dumps I have seen so far, I didn't see a thread running within the query engine itself. All (~300) threads were in the LuceneIndex for this case. So I expect the query engine part is negligible (less than 1%).

Regards,
Thomas
Re: Slow full text query performance and Lucene Index handling in Oak
On Wed, Apr 9, 2014 at 5:14 PM, Jukka Zitting jukka.zitt...@gmail.com wrote:

Is that a common use case? To better simulate a normal usage scenario I'd make the benchmark fetch up to N results (where N is configurable, with default something like 20) and access the path and the title property of the matching nodes.

I changed the logic of the benchmark in http://svn.apache.org/r1585962. With that, JR2 slows down a bit:

# FullTextSearchTest       C     min     10%     50%     90%     max       N
Oak-Tar                    1      34      35      36      39      60    1639
Jackrabbit                 1       5       5       6       7      68   10038

Profiling the result shows that quite a bit of time (40%) goes into org.apache.lucene.codecs.compressing.LZ4.decompress(). I think this is part of Lucene 4.x and not present in 3.x. Any idea if I can disable compression?

Chetan Mehrotra
Re: Slow full text query performance and Lucene Index handling in Oak
Profiling the result shows that quite a bit of time goes in org.apache.lucene.codecs.compressing.LZ4.decompress() (40%). This I think is part of Lucene 4.x and not present in 3.x. Any idea if I can disable compression?

+1 I noticed that too, we should try to disable compression and compare results.

alex
Re: Slow full text query performance and Lucene Index handling in Oak
I'm looking into the Lucene codecs right now.

Tommaso
Re: Slow full text query performance and Lucene Index handling in Oak
Aside from the compression issue, there was another one related to the 'order by' clause. I saw Collections.sort taking up as much as 23% of the time. I removed the order by temporarily so it doesn't get in the way of the Lucene stuff, but I think the QueryEngine should skip ordering results in this case.
Re: Slow full text query performance and Lucene Index handling in Oak
Current update:

1. Tommaso provided a patch (OAK-1702) to disable compression, and that also helps quite a bit.

2. Currently we are storing the full tokenized text in the Lucene index [1]. This would cause fetching of doc fields to be slower. On disabling the storage the numbers improve quite a bit. This was added as part of OAK-319 for supporting MLT.

# FullTextSearchTest       C     min     10%     50%     90%     max       N
Oak-Tar (codec)            1       9       9      10      12      41    5664
Oak-Tar (codec,mlt off)    1       7       8       8      10      21    6921

Will look further.

Chetan Mehrotra

[1] https://github.com/apache/jackrabbit-oak/blob/trunk/oak-lucene/src/main/java/org/apache/jackrabbit/oak/plugins/index/lucene/FieldFactory.java#L44