Hi Adrien.

"Can you confirm that the response times that you are reporting both happen on Lucene 5.2" - no: I upgraded Lucene from 5.0 to 5.2, and only tried CachingWrapperQuery on 5.2.
" on a "hot cache" " - answer is no, because I have a lot of different of queries, and therefore different filters. I use LRU cache, so when filter is not in cache, I create new one. But nevertheless, my timings contain both heating CachingWrapperFilter and CachingWrapperQuery Yesterday I reverted sources to use filters, and added some statistics that you had asked for. I will provide stat only for one query - index size 18M docs - response time with caching: reponseTimeWithCachingMs.samples | 59487 reponseTimeWithCachingMs.min | 0 reponseTimeWithCachingMs.avg | 53.68003093112781 reponseTimeWithCachingMs.max | 2530 reponseTimeWithCachingMs[-INF; 0) | 0 (0.0000%; 0.0000%) reponseTimeWithCachingMs[0; 10) | 216 (0.3631%; 0.3631%) reponseTimeWithCachingMs[10; 20) | 42 (0.0706%; 0.4337%) reponseTimeWithCachingMs[20; 30) | 12133 (20.3961%; 20.8298%) reponseTimeWithCachingMs[30; 40) | 25749 (43.2851%; 64.1148%) reponseTimeWithCachingMs[40; 50) | 9111 (15.3160%; 79.4308%) reponseTimeWithCachingMs[50; 60) | 3572 (6.0047%; 85.4355%) reponseTimeWithCachingMs[60; 70) | 1565 (2.6308%; 88.0663%) reponseTimeWithCachingMs[70; 80) | 1062 (1.7853%; 89.8516%) reponseTimeWithCachingMs[80; 90) | 535 (0.8994%; 90.7509%) reponseTimeWithCachingMs[90; 100) | 390 (0.6556%; 91.4065%) reponseTimeWithCachingMs[100; 110) | 446 (0.7497%; 92.1563%) reponseTimeWithCachingMs[110; 120) | 476 (0.8002%; 92.9564%) reponseTimeWithCachingMs[120; 130) | 380 (0.6388%; 93.5952%) reponseTimeWithCachingMs[130; 140) | 189 (0.3177%; 93.9130%) reponseTimeWithCachingMs[140; 150) | 157 (0.2639%; 94.1769%) reponseTimeWithCachingMs[150; 160) | 106 (0.1782%; 94.3551%) reponseTimeWithCachingMs[160; 170) | 94 (0.1580%; 94.5131%) reponseTimeWithCachingMs[170; 180) | 82 (0.1378%; 94.6509%) reponseTimeWithCachingMs[180; 190) | 126 (0.2118%; 94.8627%) reponseTimeWithCachingMs[190; 200) | 129 (0.2169%; 95.0796%) reponseTimeWithCachingMs[200; 210) | 140 (0.2353%; 95.3149%) reponseTimeWithCachingMs[210; 220) | 143 (0.2404%; 
95.5553%) reponseTimeWithCachingMs[220; 230) | 177 (0.2975%; 95.8529%) reponseTimeWithCachingMs[230; 240) | 224 (0.3766%; 96.2294%) reponseTimeWithCachingMs[240; 250) | 222 (0.3732%; 96.6026%) reponseTimeWithCachingMs[250; 260) | 220 (0.3698%; 96.9725%) reponseTimeWithCachingMs[260; 270) | 212 (0.3564%; 97.3288%) reponseTimeWithCachingMs[270; 280) | 175 (0.2942%; 97.6230%) reponseTimeWithCachingMs[280; 290) | 213 (0.3581%; 97.9811%) reponseTimeWithCachingMs[290; 300) | 268 (0.4505%; 98.4316%) reponseTimeWithCachingMs[300; 310) | 225 (0.3782%; 98.8098%) reponseTimeWithCachingMs[310; 320) | 155 (0.2606%; 99.0704%) reponseTimeWithCachingMs[320; 330) | 116 (0.1950%; 99.2654%) reponseTimeWithCachingMs[330; 340) | 72 (0.1210%; 99.3864%) reponseTimeWithCachingMs[340; 350) | 49 (0.0824%; 99.4688%) reponseTimeWithCachingMs[350; 360) | 46 (0.0773%; 99.5461%) reponseTimeWithCachingMs[360; 370) | 43 (0.0723%; 99.6184%) ... - response time without caching: (I disabled caching on live app, and gathered some amount of samples) reponseTimeWithoutCachingMs.samples | 1525 reponseTimeWithoutCachingMs.min | 8 reponseTimeWithoutCachingMs.avg | 115.27213114754099 reponseTimeWithoutCachingMs.max | 893 reponseTimeWithoutCachingMs[-INF; 0) | 0 (0.0000%; 0.0000%) reponseTimeWithoutCachingMs[0; 10) | 2 (0.1311%; 0.1311%) reponseTimeWithoutCachingMs[10; 20) | 0 (0.0000%; 0.1311%) reponseTimeWithoutCachingMs[20; 30) | 0 (0.0000%; 0.1311%) reponseTimeWithoutCachingMs[30; 40) | 29 (1.9016%; 2.0328%) reponseTimeWithoutCachingMs[40; 50) | 193 (12.6557%; 14.6885%) reponseTimeWithoutCachingMs[50; 60) | 136 (8.9180%; 23.6066%) reponseTimeWithoutCachingMs[60; 70) | 110 (7.2131%; 30.8197%) reponseTimeWithoutCachingMs[70; 80) | 107 (7.0164%; 37.8361%) reponseTimeWithoutCachingMs[80; 90) | 134 (8.7869%; 46.6230%) reponseTimeWithoutCachingMs[90; 100) | 111 (7.2787%; 53.9016%) reponseTimeWithoutCachingMs[100; 110) | 36 (2.3607%; 56.2623%) reponseTimeWithoutCachingMs[110; 120) | 48 (3.1475%; 59.4098%) 
reponseTimeWithoutCachingMs[120; 130) | 37 (2.4262%; 61.8361%)
reponseTimeWithoutCachingMs[130; 140) | 39 (2.5574%; 64.3934%)
reponseTimeWithoutCachingMs[140; 150) | 56 (3.6721%; 68.0656%)
reponseTimeWithoutCachingMs[150; 160) | 50 (3.2787%; 71.3443%)
reponseTimeWithoutCachingMs[160; 170) | 58 (3.8033%; 75.1475%)
reponseTimeWithoutCachingMs[170; 180) | 71 (4.6557%; 79.8033%)
reponseTimeWithoutCachingMs[180; 190) | 69 (4.5246%; 84.3279%)
reponseTimeWithoutCachingMs[190; 200) | 73 (4.7869%; 89.1148%)
reponseTimeWithoutCachingMs[200; 210) | 50 (3.2787%; 92.3934%)
reponseTimeWithoutCachingMs[210; 220) | 44 (2.8852%; 95.2787%)
reponseTimeWithoutCachingMs[220; 230) | 15 (0.9836%; 96.2623%)
reponseTimeWithoutCachingMs[230; 240) | 11 (0.7213%; 96.9836%)
reponseTimeWithoutCachingMs[240; 250) | 12 (0.7869%; 97.7705%)
reponseTimeWithoutCachingMs[250; 260) | 13 (0.8525%; 98.6229%)
reponseTimeWithoutCachingMs[260; 270) | 10 (0.6557%; 99.2787%)
reponseTimeWithoutCachingMs[270; 280) | 6 (0.3934%; 99.6721%)
reponseTimeWithoutCachingMs[280; 290) | 1 (0.0656%; 99.7377%)
reponseTimeWithoutCachingMs[290; 300) | 2 (0.1311%; 99.8689%)
reponseTimeWithoutCachingMs[300; 310) | 1 (0.0656%; 99.9344%)

- how many documents the cached filter matches:

filterMatchCount.samples | 24071
filterMatchCount.min | 0
filterMatchCount.avg | 17765.553736861784
filterMatchCount.max | 438929
filterMatchCount[-INF; 0) | 0 (0.0000%; 0.0000%)
filterMatchCount[0; 1000) | 10233 (42.5117%; 42.5117%)
filterMatchCount[1000; 2000) | 2338 (9.7129%; 52.2247%)
filterMatchCount[2000; 3000) | 1330 (5.5253%; 57.7500%)
filterMatchCount[3000; 4000) | 950 (3.9467%; 61.6966%)
filterMatchCount[4000; 5000) | 642 (2.6671%; 64.3638%)
filterMatchCount[5000; 6000) | 582 (2.4178%; 66.7816%)
filterMatchCount[6000; 7000) | 481 (1.9983%; 68.7799%)
filterMatchCount[7000; 8000) | 494 (2.0523%; 70.8321%)
filterMatchCount[8000; 9000) | 351 (1.4582%; 72.2903%)
filterMatchCount[9000; 10000) | 298 (1.2380%; 73.5283%)
filterMatchCount[10000; 11000) | 255 (1.0594%; 74.5877%)
filterMatchCount[11000; 12000) | 194 (0.8059%; 75.3936%)
filterMatchCount[12000; 13000) | 209 (0.8683%; 76.2619%)
filterMatchCount[13000; 14000) | 191 (0.7935%; 77.0554%)
filterMatchCount[14000; 15000) | 183 (0.7603%; 77.8156%)
filterMatchCount[15000; 16000) | 216 (0.8973%; 78.7130%)
filterMatchCount[16000; 17000) | 146 (0.6065%; 79.3195%)
filterMatchCount[17000; 18000) | 190 (0.7893%; 80.1088%)
filterMatchCount[18000; 19000) | 134 (0.5567%; 80.6655%)
filterMatchCount[19000; 20000) | 174 (0.7229%; 81.3884%)
filterMatchCount[20000; +INF) | 4480 (18.6116%; 100.0000%)

- how many documents the query matches:

queryMatchCount.samples | 24198
queryMatchCount.min | 0
queryMatchCount.avg | 2489.612240681048
queryMatchCount.max | 75347
queryMatchCount[-INF; 0) | 0 (0.0000%; 0.0000%)
queryMatchCount[0; 1000) | 16339 (67.5221%; 67.5221%)
queryMatchCount[1000; 2000) | 2210 (9.1330%; 76.6551%)
queryMatchCount[2000; 3000) | 1242 (5.1327%; 81.7878%)
queryMatchCount[3000; 4000) | 792 (3.2730%; 85.0607%)
queryMatchCount[4000; 5000) | 482 (1.9919%; 87.0527%)
queryMatchCount[5000; 6000) | 545 (2.2523%; 89.3049%)
queryMatchCount[6000; 7000) | 444 (1.8349%; 91.1398%)
queryMatchCount[7000; 8000) | 248 (1.0249%; 92.1646%)
queryMatchCount[8000; 9000) | 166 (0.6860%; 92.8506%)
queryMatchCount[9000; 10000) | 172 (0.7108%; 93.5615%)
queryMatchCount[10000; 11000) | 91 (0.3761%; 93.9375%)
queryMatchCount[11000; 12000) | 85 (0.3513%; 94.2888%)
queryMatchCount[12000; 13000) | 59 (0.2438%; 94.5326%)
queryMatchCount[13000; 14000) | 58 (0.2397%; 94.7723%)
queryMatchCount[14000; 15000) | 37 (0.1529%; 94.9252%)
queryMatchCount[15000; 16000) | 164 (0.6777%; 95.6029%)
queryMatchCount[16000; 17000) | 36 (0.1488%; 95.7517%)
queryMatchCount[17000; 18000) | 25 (0.1033%; 95.8550%)
queryMatchCount[18000; 19000) | 109 (0.4505%; 96.3055%)
queryMatchCount[19000; 20000) | 25 (0.1033%; 96.4088%)
queryMatchCount[20000; +INF) | 869 (3.5912%; 100.0000%)

I should add a few more details.
First, there is a difference between the sample counts of queryMatchCount/filterMatchCount and reponseTimeWithCachingMs. This is because I need to execute one more query to get filterMatchCount ( searcher.search(new MatchAllDocsQuery(), searchFilter, totalHitCountCollector); ), and I do this only for the first search, not when executing searchAfter. But reponseTimeWithCachingMs also includes the time of searchAfter.

Second, about the stats: I treat queries like "doc.fieldA=1 and doc.fieldB=2" and "doc.fieldA=3 and doc.fieldB=4" as queries of the same type and aggregate them into one stat. That is why there is a big variance in filterMatchCount and queryMatchCount.

And finally, you may notice that the query with the filter takes 53 ms here, while in my previous letter I reported a much smaller number (22 ms for this query). One reason may be that I was too lazy while reverting the sources: instead of reverting to BooleanFilter, TermFilter and NumericRangeFilter (as it was before), I just created the chain CachingWrapperFilter(QueryWrapperFilter( BooleanQuery (TermQueries... NumericRangeQueries) )). A second possible reason: the 22 ms was measured on Lucene 5.0. Nevertheless, on Lucene 5.2 the query takes 95 ms with CachingWrapperQuery, 53 ms with CachingWrapperFilter, and 115 ms without any caching.

Huh, I think that's all. I would appreciate any advice.

Regards,
Anton

2015-06-16 0:35 GMT+03:00 Adrien Grand <jpou...@gmail.com>:
> Hi Anton,
>
> Thanks for reporting this. It is indeed a bit surprising given that
> both classes work in a very similar way. Can you confirm that the
> response times that you are reporting both happen on Lucene 5.2 (even
> with CachingWrapperFilter) and on a "hot cache" (so that they don't
> include the generation of the cache entries)?
>
> Could you also provide us with the following information:
> - what are the response times without caching?
> - how many docs do you have in the index?
> - how many documents does the cached filter match?
> - how many documents do your two popular queries match?
>
> If you have the opportunity to perform some profiling of your query
> execution, this might be useful as well. Thanks!
>
> On Mon, Jun 15, 2015 at 10:56 AM, Anton Lyska <profondome...@gmail.com> wrote:
> > Hi,
> >
> > I have performance issues with CachingWrapperQuery with Lucene 5.2 and
> > don't know how to solve them.
> >
> > Prehistory:
> > I have a search with different parameters, where some parameters are used
> > more frequently than others. For these params I used filters (and cached
> > them), and my search looked like this:
> >
> >     BooleanFilter bf = new BooleanFilter();
> >     bf.add(.., BooleanClause.Occur.MUST); // frequently used params here
> >     bf.add(.., BooleanClause.Occur.MUST);
> >
> >     Filter searchFilter = new CachingWrapperFilter(bf); // this filter object is reused between queries
> >
> >     BooleanQuery searchQuery = new BooleanQuery();
> >     searchQuery.add(.., BooleanClause.Occur.MUST);
> >     searchQuery.add(.., BooleanClause.Occur.MUST);
> >
> >     hits = searcher.search(searchQuery, searchFilter, count, sort);
> >
> > And everything was great. I looked at two popular queries, and they took
> > 32 ms and 22 ms to execute.
> >
> > Then I upgraded Lucene to 5.2 (from 5.0) and saw that filters are
> > deprecated and it is advisable to use queries and CachingWrapperQuery.
> > So I changed the sources to this:
> >
> >     BooleanQuery bq = new BooleanQuery(); // now it becomes a BooleanQuery
> >     bq.add(.., BooleanClause.Occur.MUST); // frequently used params here, same terms as before
> >     bq.add(.., BooleanClause.Occur.MUST);
> >
> >     Query cachedQuery = new CachingWrapperQuery(bq); // this object is reused between queries
> >
> >     BooleanQuery searchQuery = new BooleanQuery();
> >     searchQuery.add(.., BooleanClause.Occur.MUST);
> >     searchQuery.add(.., BooleanClause.Occur.MUST);
> >     searchQuery.add(cachedQuery, BooleanClause.Occur.FILTER); // here I add the cached part of the query
> >
> >     hits = searcher.search(searchQuery, count, sort);
> >
> > Later I looked at the same queries, and they took 145 ms and 95 ms to execute!
> >
> > Moving to CachingWrapperQuery was the only change, so the question is:
> > how do I use CachingWrapperQuery correctly in my situation, or should I
> > switch back to filters?
>
> --
> Adrien
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.apache.org
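P.S. Since the LRU cache of filters came up above: it works roughly like this. This is a simplified, self-contained sketch, not my actual production code; the class name FilterLruCache is illustrative. It uses java.util.LinkedHashMap in access-order mode, so the least-recently-used filter is evicted once capacity is exceeded, and a cache miss (null) means the caller builds and caches a new wrapped filter.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch of an LRU cache keyed by the cached part of a query
// (e.g. a string form of the frequently used params). Values would
// be the reusable CachingWrapperFilter/CachingWrapperQuery objects.
public class FilterLruCache<K, V> {

    private final Map<K, V> map;

    public FilterLruCache(final int capacity) {
        // accessOrder=true: iteration order is least- to most-recently
        // accessed, so removeEldestEntry drops the LRU entry.
        this.map = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > capacity;
            }
        };
    }

    // null means a miss: the caller creates a new filter and put()s it.
    public synchronized V get(K key) {
        return map.get(key);
    }

    public synchronized void put(K key, V value) {
        map.put(key, value);
    }

    public synchronized int size() {
        return map.size();
    }
}
```

With a capacity of, say, a few hundred entries, the frequently used filter combinations stay cached while one-off combinations age out.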