Re: Slow full text query performance and Lucene Index handling in Oak
Hi,

I'm not sure Chetan's test case matches real-world usage if Collections.sort takes up 23% of the time. I have not seen Collections.sort in other profiling results at all (so I guess it was less than 1%). Also, in other tests I have seen opening the Lucene index take much more time than it does in Chetan's test case.

Regards,
Thomas

On 09/04/14 15:45, "Alex Parvulescu" wrote:
>Aside from the compression issue, there was another one related to the
>'order by' clause. I saw Collections.sort taking up as far as 23% of the
>perf.
>
>I removed the order by temporarily so it doesn't get in the way of the
>Lucene stuff, but I think the QueryEngine should skip ordering results in
>this case.
Re: Slow full text query performance and Lucene Index handling in Oak
Current update:

1. Tommaso provided a patch (OAK-1702) to disable compression, and that also helps quite a bit.
2. Currently we are storing the full tokenized text in the Lucene index [1]. This makes fetching of doc fields slower. On disabling the storage the numbers improve quite a bit. The storage was added as part of OAK-319 for supporting MLT (MoreLikeThis).

# FullTextSearchTest                 C     min     10%     50%     90%     max       N
Oak-Tar (codec)                      1       9       9      10      12      41    5664
Oak-Tar (codec,mlt off)              1       7       8       8      10      21    6921

Will look further.

Chetan Mehrotra
[1] https://github.com/apache/jackrabbit-oak/blob/trunk/oak-lucene/src/main/java/org/apache/jackrabbit/oak/plugins/index/lucene/FieldFactory.java#L44

On Wed, Apr 9, 2014 at 7:15 PM, Alex Parvulescu wrote:
> Aside from the compression issue, there was another one related to the
> 'order by' clause. I saw Collections.sort taking up as far as 23% of the
> perf.
>
> I removed the order by temporarily so it doesn't get in the way of the
> Lucene stuff, but I think the QueryEngine should skip ordering results in
> this case.
Re: Slow full text query performance and Lucene Index handling in Oak
Aside from the compression issue, there was another one related to the 'order by' clause. I saw Collections.sort taking up as much as 23% of the time.

I removed the order by temporarily so it doesn't get in the way of the Lucene stuff, but I think the QueryEngine should skip ordering results in this case.

On Wed, Apr 9, 2014 at 3:31 PM, Tommaso Teofili wrote:
> I'm looking into the Lucene codecs right now.
>
> Tommaso
Re: Slow full text query performance and Lucene Index handling in Oak
I'm looking into the Lucene codecs right now.

Tommaso

2014-04-09 15:20 GMT+02:00 Alex Parvulescu:
> Profiling the result shows that quite a bit of time goes in
> org.apache.lucene.codecs.compressing.LZ4.decompress() (40%). This I
> think is part of Lucene 4.x and not present in 3.x. Any idea if I can
> disable compression?
>
> +1 I noticed that too, we should try to disable compression and compare
> results.
>
> alex
Re: Slow full text query performance and Lucene Index handling in Oak
On Wed, Apr 9, 2014 at 3:16 PM, Chetan Mehrotra wrote:
> Profiling the result shows that quite a bit of time goes in
> org.apache.lucene.codecs.compressing.LZ4.decompress() (40%). This I
> think is part of Lucene 4.x and not present in 3.x. Any idea if I can
> disable compression?

+1, I noticed that too. We should try to disable compression and compare results.

alex
Re: Slow full text query performance and Lucene Index handling in Oak
On Wed, Apr 9, 2014 at 5:14 PM, Jukka Zitting wrote:
> Is that a common use case? To better simulate a normal usage scenario
> I'd make the benchmark fetch up to N results (where N is configurable,
> with default something like 20) and access the path and the title
> property of the matching nodes.

I changed the logic of the benchmark in http://svn.apache.org/r1585962. With that, JR2 slows down a bit:

# FullTextSearchTest                 C     min     10%     50%     90%     max       N
Oak-Tar                              1      34      35      36      39      60    1639
Jackrabbit                           1       5       5       6       7      68   10038

Profiling the result shows that quite a bit of time goes into org.apache.lucene.codecs.compressing.LZ4.decompress() (40%). I think this is part of Lucene 4.x and not present in 3.x. Any idea if I can disable compression?

Chetan Mehrotra
Re: Slow full text query performance and Lucene Index handling in Oak
>also, I wonder if we shouldn't also profile the stack of underlying calls
>in the QueryEngine to measure how much time is spent there and how much
>time is spent in the specific QueryIndex implementation.

Analyzing full thread dumps will give you the statistical distribution, which is quite accurate if you have enough data. In the full thread dumps I have seen so far, I didn't see a thread running within the query engine itself. All (~300) threads were in the LuceneIndex for this case. So I expect the query engine part is negligible (less than 1%).

Regards,
Thomas
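The thread-dump approach described above can be sketched as a small tally: for each thread in a dump, take the top-most frame that belongs to the code of interest and count how often each frame shows up. This is only a minimal sketch. It assumes jstack-style dumps (threads separated by blank lines, frames starting with "at "), and the class name, the method name and the sample dump in main are made up for illustration.

```java
import java.util.Map;
import java.util.TreeMap;

public class ThreadDumpTally {
    /**
     * For each thread in a jstack-style dump, finds the top-most frame whose
     * text contains the marker and counts how often each such frame occurs.
     */
    static Map<String, Integer> tally(String dump, String marker) {
        Map<String, Integer> counts = new TreeMap<>();
        // Assumption: threads in the dump are separated by blank lines.
        for (String thread : dump.split("\\R\\s*\\R")) {
            for (String line : thread.split("\\R")) {
                String frame = line.trim();
                if (frame.startsWith("at ") && frame.contains(marker)) {
                    counts.merge(frame.substring(3), 1, Integer::sum);
                    break; // count each thread once, at its top-most matching frame
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Made-up sample data, not taken from a real dump.
        String dump =
              "\"worker-1\" prio=5 RUNNABLE\n"
            + "\tat org.example.Store.read(Store.java:10)\n"
            + "\tat org.example.LuceneIndex.query(LuceneIndex.java:365)\n"
            + "\n"
            + "\"worker-2\" prio=5 RUNNABLE\n"
            + "\tat org.example.LuceneIndex.query(LuceneIndex.java:379)\n";
        for (Map.Entry<String, Integer> e : tally(dump, "LuceneIndex").entrySet()) {
            System.out.println(e.getValue() + "  " + e.getKey());
        }
    }
}
```

With enough dumps, the relative counts approximate where the threads spend their time, which is the statistical distribution mentioned above.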
Re: Slow full text query performance and Lucene Index handling in Oak
2014-04-09 13:44 GMT+02:00 Jukka Zitting:

> Hi,
>
> On Wed, Apr 9, 2014 at 7:24 AM, Chetan Mehrotra wrote:
> > ... the testcase only fetches the first result.
>
> Is that a common use case? To better simulate a normal usage scenario
> I'd make the benchmark fetch up to N results (where N is configurable,
> with default something like 20) and access the path and the title
> property of the matching nodes.

+1

Also, I wonder if we shouldn't profile the stack of underlying calls in the QueryEngine to measure how much time is spent there and how much time is spent in the specific QueryIndex implementation.

Regards,
Tommaso

> BR,
>
> Jukka Zitting
Re: Slow full text query performance and Lucene Index handling in Oak
Hi,

We have results from a different test case with multiple threads (internal id GRANITE-5572). We have 50 full thread dumps, and there I count:

* 259 cases of LuceneIndex.java line 365:
  IndexReader reader = DirectoryReader.open(directory);
* 43 cases of LuceneIndex.java line 379:
  TopDocs docs = searcher.search(query, Integer.MAX_VALUE);
* 13 cases of LuceneIndex.java line 382:
  String path = reader.document(doc.doc, PATH_SELECTOR).get(PATH);

So, running the Lucene query and getting the paths is slow, but opening the Lucene index is even slower in this test case.

Regards,
Thomas

On 09/04/14 13:44, "Jukka Zitting" wrote:
>Hi,
>
>On Wed, Apr 9, 2014 at 7:24 AM, Chetan Mehrotra wrote:
>> ... the testcase only fetches the first result.
>
>Is that a common use case? To better simulate a normal usage scenario
>I'd make the benchmark fetch up to N results (where N is configurable,
>with default something like 20) and access the path and the title
>property of the matching nodes.
>
>BR,
>
>Jukka Zitting
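Read as a sampling profile, the counts above translate into rough time shares. A minimal sketch of that arithmetic follows, assuming each of the 315 counted stack frames is an equally weighted sample. The class and method names are only for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class SampleShares {
    /** Converts raw sample counts into percentage shares of the total. */
    static Map<String, Double> shares(Map<String, Integer> counts) {
        int total = 0;
        for (int c : counts.values()) {
            total += c;
        }
        Map<String, Double> result = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            result.put(e.getKey(), 100.0 * e.getValue() / total);
        }
        return result;
    }

    public static void main(String[] args) {
        // Counts as reported in the message above.
        Map<String, Integer> counts = new LinkedHashMap<>();
        counts.put("DirectoryReader.open (line 365)", 259);
        counts.put("searcher.search (line 379)", 43);
        counts.put("reader.document (line 382)", 13);
        for (Map.Entry<String, Double> e : shares(counts).entrySet()) {
            System.out.printf("%-32s %5.1f%%%n", e.getKey(), e.getValue());
        }
    }
}
```

Under that assumption, opening the index accounts for roughly 82% of the samples, running the search for about 14%, and loading the paths for about 4%.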
Re: Slow full text query performance and Lucene Index handling in Oak
Hi,

On Wed, Apr 9, 2014 at 7:24 AM, Chetan Mehrotra wrote:
> ... the testcase only fetches the first result.

Is that a common use case? To better simulate a normal usage scenario I'd make the benchmark fetch up to N results (where N is configurable, with default something like 20) and access the path and the title property of the matching nodes.

BR,

Jukka Zitting
Re: Slow full text query performance and Lucene Index handling in Oak
On Wed, Apr 9, 2014 at 3:00 PM, Alex Parvulescu wrote:
> - the patch assumes that there is and will be a single lucene index
> directly under the root node, which may not necessarily be the case. I
> agree this assumption holds now, but I would not introduce any changes that
> take away this flexibility.

That is not a problem per se, as the IndexReader starts with a reference count of 1, so it would never go to zero.

The problem appears to be somewhere else. I modified the code to use a shared IndexSearcher and a native FSDirectory, and still the performance improvement was marginal.

The problem occurs because org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndex#query [1] currently does an eager initialization of the cursor, while the testcase only fetches the first result. Compared to this, the JR2 version does a lazy evaluation. If I put a break in the loop (exit after the first result) the results are much better:

# FullTextSearchTest                 C     min     10%     50%     90%     max       N
Oak-Tar(break.shared searcher,fs)    1       2       2       3       3     170   23204
Oak-Tar(break)                       1       5       5       5       6      90   10593
Jackrabbit                           1       4       4       5       6     231   11385

Now I am not sure whether this is a problem with the use case taken, or whether the Lucene index cursor management should be improved, as in many cases there would be multiple results but the client code only makes use of the initial few.

Chetan Mehrotra
[1] https://github.com/apache/jackrabbit-oak/blob/trunk/oak-lucene/src/main/java/org/apache/jackrabbit/oak/plugins/index/lucene/LuceneIndex.java#L381-L409
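The difference between eager and lazy cursor initialization described above can be illustrated without Lucene. In this minimal sketch, PathSource is a hypothetical stand-in for the index lookup (it is not an Oak or Lucene type), and the batch size is an arbitrary assumption. The eager variant loads every matching path up front, while the lazy variant only loads a batch when the caller asks for more.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

public class LazyCursorSketch {
    /** Hypothetical stand-in for the index: returns paths for hits [offset, offset + limit). */
    interface PathSource {
        List<String> fetch(int offset, int limit);
    }

    /** Eager variant: loads every path before the caller sees the first one. */
    static Iterator<String> eager(PathSource source, int totalHits) {
        return source.fetch(0, totalHits).iterator();
    }

    /** Lazy variant: loads paths in batches, only when the caller asks for more. */
    static Iterator<String> lazy(final PathSource source, final int batchSize) {
        return new Iterator<String>() {
            private List<String> batch = new ArrayList<>();
            private int pos = 0;               // position inside the current batch
            private int offset = 0;            // hits requested from the source so far
            private boolean exhausted = false; // true once a short batch was returned

            @Override
            public boolean hasNext() {
                if (pos < batch.size()) {
                    return true;
                }
                if (exhausted) {
                    return false;
                }
                batch = source.fetch(offset, batchSize);
                offset += batchSize;
                pos = 0;
                exhausted = batch.size() < batchSize;
                return !batch.isEmpty();
            }

            @Override
            public String next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return batch.get(pos++);
            }

            @Override
            public void remove() {
                throw new UnsupportedOperationException();
            }
        };
    }

    public static void main(String[] args) {
        final int[] loaded = {0};
        PathSource source = (offset, limit) -> {
            List<String> paths = new ArrayList<>();
            for (int i = offset; i < Math.min(offset + limit, 1000); i++) {
                paths.add("/content/node" + i);
                loaded[0]++;
            }
            return paths;
        };
        eager(source, 1000).next();
        System.out.println("eager loaded " + loaded[0] + " paths");
        loaded[0] = 0;
        lazy(source, 50).next();
        System.out.println("lazy loaded " + loaded[0] + " paths");
    }
}
```

For a client that reads only the first result, the eager variant in this sketch loads all 1000 paths, the lazy one a single batch of 50. How such batching would map onto the actual Lucene calls in LuceneIndex#query is left open here.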
Re: Slow full text query performance and Lucene Index handling in Oak
Hi,

I agree with the idea to find a way to share the readers across threads. Looking at the proposed patch I see a few problems:

- The patch assumes that there is and will be a single Lucene index directly under the root node, which may not necessarily be the case. I agree this assumption holds now, but I would not introduce any changes that take away this flexibility.
- Browsing through, I notice that this only helps with concurrent threads: the call to searcherManager.release translates into a decRef, which means the readers will be closed, if I'm not mistaken. This might explain the only marginal gain in perf.

We should be looking for a more general optimization where we might leverage the fact that the index can be updated only every 5 seconds. I was thinking that we can use the initial NodeState from the index content node as a way to tell if it changed or not (using equals calls).

It would work in the following way: on the first call there is no state in the searcherManager, so take the provided NodeState (again, a random node state; the index could be on any node of the repo), build an index reader based on it, and reuse it from as many threads as you need. Cache this under path/NodeState/IndexReader. On each subsequent call we can use the provided NodeState to check if the cache is stale or not: path + NodeState.equals.

The biggest problem I see here is resource cleanup: as we'll not call decRef on each search call, we need a way to get notified when the application shuts down. Similar to Chetan's patch we can use a combo of 'Closeable' and '@Deactivate', but I'm not sure that will be enough outside OSGi.

Take this with a grain of salt, I probably missed some aspects of the problem along the way.

best,
alex

On Wed, Apr 9, 2014 at 10:43 AM, Chetan Mehrotra wrote:
> I tried with one approach where I used a custom SerahcerManager based
> on Lucene SearcherManager. It obtains the root NodeState directly from
> NodeStore. As NodeStore can be accessed concurrently it should not
> have any impact on session concurrency
>
> With this change there is a slight improvement
>
> Oak-Tar                              1      39      40      40      44      64    1459
> Oak-Tar(Shared)                      1      32      33      34      36      61    1738
>
> So did not gave much boost (at least with approach taken). As I do not
> have much understanding of Lucene internal can someone review the
> approach taken and see if there are some major issues with it
>
> Chetan Mehrotra
> [1] https://issues.apache.org/jira/secure/attachment/12639366/OAK-1702-shared-indexer.patch
> [2] https://lucene.apache.org/core/3_6_0/api/all/org/apache/lucene/search/SearcherManager.html
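The caching idea outlined above (path + NodeState.equals) might look roughly like the following. This is a minimal sketch under stated assumptions, not the proposed patch: ReaderCacheSketch is a hypothetical class, S stands in for the index NodeState, R for the IndexReader, and the opener and closer functions are placeholders for whatever opens and closes a reader.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;
import java.util.function.Function;

public class ReaderCacheSketch<S, R> {
    /** Pairs a reader with the state it was opened from. */
    private static final class Entry<S, R> {
        final S state;
        final R reader;

        Entry(S state, R reader) {
            this.state = state;
            this.reader = reader;
        }
    }

    private final Map<String, Entry<S, R>> byPath = new HashMap<>();
    private final Function<S, R> opener; // hypothetical: opens a reader for a state
    private final Consumer<R> closer;    // hypothetical: releases a reader

    public ReaderCacheSketch(Function<S, R> opener, Consumer<R> closer) {
        this.opener = opener;
        this.closer = closer;
    }

    /** Returns the cached reader for this index path, reopening only if the state changed. */
    public synchronized R get(String path, S state) {
        Entry<S, R> cached = byPath.get(path);
        if (cached != null && cached.state.equals(state)) {
            return cached.reader; // index unchanged: share the existing reader
        }
        R fresh = opener.apply(state);
        byPath.put(path, new Entry<>(state, fresh));
        if (cached != null) {
            // Simplification: a real version would have to wait for in-flight searches.
            closer.accept(cached.reader);
        }
        return fresh;
    }

    /** Releases all cached readers, e.g. from a shutdown or deactivate hook. */
    public synchronized void closeAll() {
        for (Entry<S, R> e : byPath.values()) {
            closer.accept(e.reader);
        }
        byPath.clear();
    }

    public static void main(String[] args) {
        int[] opened = {0};
        ReaderCacheSketch<String, String> cache = new ReaderCacheSketch<String, String>(
                state -> "reader@" + state + "#" + (++opened[0]),
                reader -> System.out.println("closing " + reader));
        System.out.println(cache.get("/oak:index/lucene", "rev-1"));
        System.out.println(cache.get("/oak:index/lucene", "rev-1")); // same reader reused
        System.out.println(cache.get("/oak:index/lucene", "rev-2")); // state changed: reopened
        cache.closeAll();
    }
}
```

The sketch closes the stale reader immediately on replacement, which sidesteps rather than solves the cleanup problem raised above. Reference counting or a deferred close would be needed if searches can still be running against the old reader.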
Re: Slow full text query performance and Lucene Index handling in Oak
On Wed, Apr 9, 2014 at 12:25 PM, Marcel Reutegger wrote:
>> Since the Lucene index is in any case updated asynchronously, it
>> should be fine for us to ignore the base NodeState of the current
>> session and instead use an IndexSearcher based on the last state as
>> updated by the async indexer. This would allow us to reuse the
>> IndexSearcher over multiple queries.
>
> I was also wondering if it makes sense to share it across multiple
> sessions performing a query to reduce the number of index readers
> that may be open at the same time. however, this will likely also
> reduce concurrency because we synchronize access to a single
> session.

I tried one approach where I used a custom SearcherManager [1] based on the Lucene SearcherManager [2]. It obtains the root NodeState directly from the NodeStore. As the NodeStore can be accessed concurrently, it should not have any impact on session concurrency.

With this change there is a slight improvement:

# FullTextSearchTest                 C     min     10%     50%     90%     max       N
Oak-Tar                              1      39      40      40      44      64    1459
Oak-Tar(Shared)                      1      32      33      34      36      61    1738

So it did not give much of a boost (at least with the approach taken). As I do not have much understanding of Lucene internals, can someone review the approach taken and see if there are any major issues with it?

Chetan Mehrotra
[1] https://issues.apache.org/jira/secure/attachment/12639366/OAK-1702-shared-indexer.patch
[2] https://lucene.apache.org/core/3_6_0/api/all/org/apache/lucene/search/SearcherManager.html
Re: Slow full text query performance and Lucene Index handling in Oak
Hi,

Do we still have the option to store the Lucene files in the file system? If we have, maybe we could run the test with that option and see if it improves performance? I'm not suggesting this is a solution; it's just one step to better analyze things. And it might be easy to do.

Regards,
Thomas

On 08/04/14 17:51, "Chetan Mehrotra" wrote:
>Hi,
>
>As part of OAK-1702 [1] I have added a benchmark to compare the
>performance of full text query search with JR2.
>
>Based on the approach taken (which might be wrong) I get the following numbers:
>
>Apache Jackrabbit Oak 0.21.0-SNAPSHOT
># FullTextSearchTest                 C     min     10%     50%     90%     max       N
>Oak-Mongo                            1      58      71     101     119     287     610
>Oak-Mongo-FDS                        1      50      51      52      58     184    1106
>Oak-Tar                              1      39      40      40      44      64    1459
>Oak-Tar-FDS                          1      53      54      55      64     197    1030
>Jackrabbit                           1       4       4       5       6     231   11385
>
>This shows that JR2 performs a lot better for full text queries, and
>subsequent queries are quite a bit faster once Lucene has warmed up.
>
>Looking at the current usage of Lucene in Oak and the way we store and
>access the Lucene indexes [2], I have a couple of doubts:
>
>1. Multiple IndexSearcher instances - The current impl would create a new
>IndexSearcher for every Lucene query, as the OakDirectory used is bound
>to the NodeState of the executing JCR session. Compared to this, in JR2 we
>probably had a singleton IndexSearcher which was shared across all the
>query execution paths. This would potentially cause performance issues,
>as Lucene is effectively used in a stateless way and it has to
>perform initialization for every call. As per [3] the IndexSearcher must
>be shared.
>
>2. Index Access - Currently we have a custom OakDirectory which provides
>access to Lucene indexes stored in the NodeStore. Even with SegmentStore,
>which has memory mapped files, the random access used by Lucene would
>probably be a lot slower with OakDirectory in comparison to the default
>Lucene MMapDirectory. For small setups where the Lucene index can be
>accommodated on each node I think it would be better if the index is
>accessed from the file system.
>
>Are the above concerns valid, and should we relook into how we are
>using Lucene in Oak?
>
>Chetan Mehrotra
>[1] https://issues.apache.org/jira/browse/OAK-1702
>[2] https://github.com/apache/jackrabbit-oak/blob/trunk/oak-lucene/src/main/java/org/apache/jackrabbit/oak/plugins/index/lucene/OakDirectory.java
>[3] http://wiki.apache.org/lucene-java/ImproveSearchingSpeed
RE: Slow full text query performance and Lucene Index handling in Oak
Hi,

> Since the Lucene index is in any case updated asynchronously, it
> should be fine for us to ignore the base NodeState of the current
> session and instead use an IndexSearcher based on the last state as
> updated by the async indexer. This would allow us to reuse the
> IndexSearcher over multiple queries.

I was also wondering if it makes sense to share it across multiple sessions performing a query, to reduce the number of index readers that may be open at the same time. However, this will likely also reduce concurrency because we synchronize access to a single session.

We should also try to re-open the existing reader, which is less costly than creating a new reader. I'm not familiar anymore with the most recent Lucene version, but with the version used in Jackrabbit 2.x this was possible and helped a lot.

Regards
Marcel
Re: Slow full text query performance and Lucene Index handling in Oak
Hi,

On Tue, Apr 8, 2014 at 11:51 AM, Chetan Mehrotra wrote:
> 1. Multiple IndexSearcher instances - Current impl would create a new
> IndexSearcher for every Lucene query as the OakDirectory uses is bound
> to NodeState of executing JCR session.

Since the Lucene index is in any case updated asynchronously, it should be fine for us to ignore the base NodeState of the current session and instead use an IndexSearcher based on the last state as updated by the async indexer. This would allow us to reuse the IndexSearcher over multiple queries.

> 2. Index Access - Currently we have custom OakDirectory which provides
> access to Lucene indexes stored in NodeStore. Even with SegmentStore
> which has memory mapped file the random access used by Lucene would
> probably be lot slower with OakDirectory in comparison to default
> Lucene MMapDirectory.

There's of course some extra overhead in going through Oak's Blob interface, but I would be surprised if this turned out to be significant and impossible to optimize, as the frequently accessed parts of the index would in either case be cached in memory. So I'd go with approach 1 first and see where we are then before jumping to conclusions on this one.

BR,

Jukka Zitting