Re: Solr Performance Improvement and degradation Help
I will run some queries today with lazy field loading both on and off, against both the 2010 build and the 2012 build we're using, and get you some of the debug data.
Re: Solr Performance Improvement and degradation Help
I've run some tests on both versions of Solr we are testing: one is the 2010.12.10 build and the other is the 2012.02.16 build. The latter is where we initially saw poor response performance. I've attached 4 text files with the results of a few runs against each build, with and without LazyFieldLoading enabled (plus some on the later build with wildcard fl parameters enabled). From what I see, the timings don't seem to be too telling (though, not really knowing the ins and outs of it, you may see something different). Where we see the performance hit is in the response time for getting the information back. Hopefully this helps some.

http://lucene.472066.n3.nabble.com/file/n3780995/2010-12-10build_lazyfieldloading_false.txt
http://lucene.472066.n3.nabble.com/file/n3780995/2010-12-10build_lazyfieldloading_true.txt
http://lucene.472066.n3.nabble.com/file/n3780995/2012-02-16build_lazyfieldloading_false.txt
http://lucene.472066.n3.nabble.com/file/n3780995/2012-02-16build_lazyfieldloading_true.txt
Re: Solr Performance Improvement and degradation Help
On Sun, Feb 26, 2012 at 3:32 PM, Erick Erickson wrote:
> Would you hypothesize that lazy field loading could be that much slower if a large fraction of fields were selected?

If you actually use the lazy field later, it will cause an extra read for each field. If you don't have enough free RAM for the OS to cache the entire index it could be even worse... the first time reading the document you take a hit from a real disk seek, then when you go and access those fields (assuming they have already been evicted from the OS cache) you take the hit of another disk seek. Those could really add up.

So if we're actually seeing much worse performance for lazy loading now than in the past, one guess would be it's due to that scenario in conjunction with something that is actually accessing the lazy fields.

-Yonik
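The "extra read for each field" point can be made concrete with a toy cost model. This is only a sketch under stated assumptions: the function name and the one-read-per-lazy-field accounting are invented for illustration and do not describe Lucene's actual I/O pattern. The 1200 documents echo the result-set size mentioned elsewhere in the thread; the 20 lazy fields are hypothetical.

```python
def stored_field_reads(num_docs, lazy_fields_accessed, lazy_loading):
    """Count stored-field reads under a deliberately simple model.

    Assumption: without lazy loading, one read fetches a whole document.
    With lazy loading, the first read fetches only the requested fields and
    every lazy field touched afterwards costs one more read.
    """
    if not lazy_loading:
        return num_docs
    return num_docs * (1 + lazy_fields_accessed)


# 1200 documents; 20 lazily loaded fields touched per document (hypothetical).
eager = stored_field_reads(1200, lazy_fields_accessed=0, lazy_loading=False)
lazy_untouched = stored_field_reads(1200, lazy_fields_accessed=0, lazy_loading=True)
lazy_touched = stored_field_reads(1200, lazy_fields_accessed=20, lazy_loading=True)

print(eager, lazy_untouched, lazy_touched)
```

In this model lazy loading costs nothing extra as long as the lazy fields are never touched, and the read count grows linearly once something does touch them, which is the scenario described above.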
Re: Solr Performance Improvement and degradation Help
On Fri, Feb 24, 2012 at 10:25 AM, naptowndev wrote:
> Our current config for that is as follows:
> <documentCache class=*solr.LRUCache* size=*15000* initialSize=*15000* autowarmCount=*0* />
> It's the same for both instances

I assume the asterisks are for emphasis and are not actually present in your config?

> And lazyfieldloading is enabled for both instances

Could you try disabling lazy field loading for both instances to see what the difference is? I think both the Lucene and Solr lazy field stuff has changed. The other big change was pseudo-fields (field augmenters, transformers, etc.), and there could possibly be an issue there (like maybe accessing the value of lazy-loaded fields that it shouldn't need to).

-Yonik
Re: Solr Performance Improvement and degradation Help
Yonik - Thanks, we'll give that a try (re: lazyfieldloading). And no, the * is not in our config; that must have come over from pasting it in from the file. Odd. Another question I have is regarding solr.LRUCache vs. solr.FastLRUCache. Would there be a reason to implement (or not implement) FastLRUCache on the documentCache? Thanks!
Re: Solr Performance Improvement and degradation Help
On Fri, Feb 24, 2012 at 11:24 AM, naptowndev wrote:
> Another question I have is regarding solr.LRUCache vs. solr.FastLRUCache. Would there be reason to implement (or not implement) fastLRU on the documentcache?

LRUCache can be faster if the hit rate is really low (i.e. the eviction rate is high).

-Yonik
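To get a feel for what a low hit rate with a high eviction rate looks like, here is a minimal sketch of an LRU cache that counts lookups, hits and evictions. It is not Solr's LRUCache; the class, the replay function and the access pattern are all invented for illustration. Only the cache size of 15000 comes from the config quoted in this thread.

```python
from collections import OrderedDict


class CountingLRU:
    """Tiny LRU cache that tracks lookups, hits and evictions."""

    def __init__(self, size):
        self.size = size
        self.entries = OrderedDict()
        self.lookups = 0
        self.hits = 0
        self.evictions = 0

    def get(self, key, load):
        self.lookups += 1
        if key in self.entries:
            self.hits += 1
            self.entries.move_to_end(key)  # mark as most recently used
            return self.entries[key]
        value = load(key)
        self.entries[key] = value
        if len(self.entries) > self.size:
            self.entries.popitem(last=False)  # drop the least recently used
            self.evictions += 1
        return value

    def hit_ratio(self):
        return self.hits / self.lookups if self.lookups else 0.0


def replay(cache_size, distinct_docs, passes):
    """Request doc ids 0..distinct_docs-1 in order, `passes` times."""
    cache = CountingLRU(cache_size)
    for _ in range(passes):
        for doc_id in range(distinct_docs):
            cache.get(doc_id, load=lambda k: "doc-%d" % k)
    return cache


fits = replay(cache_size=15000, distinct_docs=10000, passes=3)
thrash = replay(cache_size=15000, distinct_docs=20000, passes=3)
print(round(fits.hit_ratio(), 2), fits.evictions)
print(round(thrash.hit_ratio(), 2), thrash.evictions)
```

When the working set fits in the cache, every pass after the first is served from it and nothing is evicted. When the working set is larger than the cache and is scanned in order, each entry is evicted just before it is needed again, so the hit ratio collapses while evictions climb. Comparing the lookups, hits and evictions that Solr reports for the documentCache against these two extremes is one way to judge which side a real workload is on.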
Re: Solr Performance Improvement and degradation Help
these inconsistencies as well. The last set of tests I ran consistently showed the older build of Solr bringing back a result set of 13.1MB with 1200 records in 2.3 seconds, whereas the newer build was bringing back the same result set in about 17.4 seconds. The catch is that the QTime and highlighting component time in the newer version are faster than in the older version. Again, if you have any more ideas, let me know. Thanks! Brian
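A faster QTime alongside a slower overall response suggests the time is going somewhere QTime does not cover, since QTime does not include writing the response out. The sketch below separates the two numbers. It assumes a hypothetical `fetch` callable that performs the request, reads the full body, and returns the reported QTime in milliseconds; `fake_fetch` is a stand-in so the example runs without a Solr server.

```python
import time


def timed_request(fetch):
    """Run `fetch` and return (wall_clock_ms, qtime_ms).

    `fetch` is a hypothetical callable that performs the HTTP request, reads
    the whole response body, and returns the QTime reported in its header.
    """
    start = time.perf_counter()
    qtime_ms = fetch()
    wall_ms = (time.perf_counter() - start) * 1000.0
    return wall_ms, qtime_ms


def overhead_outside_qtime(fetch, runs=5):
    """Average time per request that QTime does not account for."""
    gaps = []
    for _ in range(runs):
        wall_ms, qtime_ms = timed_request(fetch)
        gaps.append(wall_ms - qtime_ms)
    return sum(gaps) / len(gaps)


def fake_fetch():
    # Stand-in for a real request: pretends the search itself took 5 ms while
    # building and transferring the response took much longer.
    time.sleep(0.05)
    return 5


print(overhead_outside_qtime(fake_fetch) > 0)
```

Running the same measurement against both builds with an identical query would show whether the regression sits inside the search components or in producing and returning the documents.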
Re: Solr Performance Improvement and degradation Help
I'm not sure what would constitute a low vs. high hit rate (and eviction rate), so we've kept the setting at LRUCache instead of FastLRUCache for now. But I will say we did turn the LazyFieldLoading option off and wow - a huge increase in performance on the newer nightly build we are using (the one from Feb 2, 2012). The payload of 13.7 MB that was taking anywhere from 15 to 17 seconds (with fastvectorhighlighter on) and 33+ seconds with FVH off is now taking just about 3.2 seconds with FVH on. When we implement the wildcards for the field list, thereby reducing the payload down to 1.9MB, our average return time is around 875ms, down from around 6-8 seconds before. Granted, I've only run about 20 tests (manually) at this point, so I'm going to keep hitting the server for a while with different queries to see if anything gives, but at least at this point, it does appear that setting lazyfieldloading to false has improved performance. It'd be ideal to figure out why that's the case, but that's a little beyond my skill set at the moment. I'll let you guys know how results look as I proceed throughout the day. (I've yet to run these tests against the 2010 build we were comparing against, so I need to do that too.) Please also let me know if you have any further suggestions. Thanks!
Re: Solr Performance Improvement and degradation Help
Let me echo this back to see if I have it right, because it's *extremely* weird if I'm reading it correctly. In your solrconfig.xml file, you changed this line: <enableLazyFieldLoading>true</enableLazyFieldLoading> to this: <enableLazyFieldLoading>false</enableLazyFieldLoading> and your response time DECREASED? If you can confirm that I'm reading it right, I'll open up a JIRA.

Best
Erick
Re: Solr Performance Improvement and degradation Help
Erick - That is exactly what we are seeing. This is in our solrconfig.xml: <enableLazyFieldLoading>false</enableLazyFieldLoading> and our response times have decreased drastically. I'm on my 40th-ish test today and the response times are still 10+ seconds faster on the larger payload than they were when it was set to true. Smaller payloads are also about 2.5 seconds faster.
Re: Solr Performance Improvement and degradation Help
Obviously it'd be great if someone else was able to confirm this in their setup as well. But with different environments, payload sizes, etc., I'm not sure how easily it can be tested in other environments.
Re: Solr Performance Improvement and degradation Help
It's pretty hard to say, even with the data you've provided. But try adding debugQuery=on and look down near the bottom of the response; there'll be a <lst name="timing"> section. That section lists the time taken by each component of a search, not just the QTime. Things like highlighting can often give a clue about where the time is spent. What sort of wildcards are you using? Did you have to bump maxBooleanClauses? This is a bit puzzling, though.

Best
Erick

On Wed, Feb 22, 2012 at 3:16 PM, naptowndev wrote:
> As an update to this... I tried running a query against the 4.0.0.2010.12.10.08.54.56 version and the newer 4.0.0.2012.02.16 (both on the same box). So the query params were the same, returned results were the same, but the 4.0.0.2010.12.10.08.54.56 returned the results in about 1.6 seconds and the newer (4.0.0.2012.02.16) version returned the results in about 4 seconds. If I add the wildcard field list to the newer version, the time increases anywhere from .5-1 second. These are all averages after running the queries several times over a 30 minute period (allowing for warming and cache). Anybody have any insight into why the newer versions are performing a bit slower?
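Reading the timing section by eye gets tedious across many runs, so here is a minimal sketch that pulls per-component times out of an XML response. The sample document is made up: its nesting follows the general shape of a debugQuery timing block, but the component names and numbers are illustrative assumptions and the exact names differ between Solr versions.

```python
import xml.etree.ElementTree as ET

# Illustrative sample only; names and values are not from a real response.
SAMPLE = """
<response>
  <lst name="debug">
    <lst name="timing">
      <double name="time">42.0</double>
      <lst name="prepare">
        <double name="time">2.0</double>
        <lst name="query"><double name="time">1.0</double></lst>
        <lst name="highlight"><double name="time">1.0</double></lst>
      </lst>
      <lst name="process">
        <double name="time">40.0</double>
        <lst name="query"><double name="time">12.0</double></lst>
        <lst name="highlight"><double name="time">28.0</double></lst>
      </lst>
    </lst>
  </lst>
</response>
"""


def component_times(xml_text, phase="process"):
    """Map component name -> milliseconds for one phase of the timing block."""
    root = ET.fromstring(xml_text)
    timing = root.find(".//lst[@name='timing']")
    if timing is None:
        return {}
    phase_node = timing.find("lst[@name='%s']" % phase)
    if phase_node is None:
        return {}
    times = {}
    for component in phase_node.findall("lst"):
        value = component.find("double[@name='time']")
        if value is not None:
            times[component.get("name")] = float(value.text)
    return times


times = component_times(SAMPLE)
for name, ms in sorted(times.items(), key=lambda item: item[1], reverse=True):
    print(name, ms)
```

Sorting by time puts the most expensive component first, which makes it easier to compare two builds side by side.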
Re: Solr Performance Improvement and degradation Help
Erick - Agreed, it is puzzling. What I've found is that it doesn't matter if I pass in wildcards for the field list or not...but that the overall response time from the newer builds of Solr that we've tested (e.g. 4.0.0.2012.02.16) is slower than the older (4.0.0.2010.12.10.08.54.56) build. If I run the exact same query against those two cores, bringing back a payload of just over 13MB (xml), the older build brings it back in about 1.6 seconds and the newer build brings it back in about 8.4 seconds. Implementing the field list wildcard allows us to reduce the payload in the newer build (not an option in the older build). They payload is reduced to 1.8MB but takes over 3.5 seconds to come back as compared to the full payload (13MB) in the older build at about 1.6 seconds. With everything else remaining the same (machine/processors/memory/network and the code base calling Solr) it seems to point to something in the newer builds that's causing the slowdown, but I'm not intimate enough with Solr to be able to figure that out. We are using the debugQuery=on in our test to see timings and they aren't showing any anomalies, so that makes it even more confusing. From a wildcard perspective, it's on the fl parameter... here's a 'snippet' of part of our fl parameter for the query fl=id, CategoryGroupTypeID, MedicalSpecialtyDescription, TermsMisspelled, DictionarySource, timestamp, Category_*_MemberReports, Category_*_MemberReportRange, Category_*_NonMemberReports, Category_*_Grade, Category_*_GradeDisplay, Category_*_GradeTier, Category_*_ReportLocations, Category_*_ReportLocationCoordinates, Category_*_coordinate, score Please note that that fl param is greatly reduced from our full query, we have over 100 static files and a slew of dynamic fields - but that should give you an idea of how we are using wildcards. I'm not sure about the maxBooleanClauses...not being all that familiar with Solr, does that apply to wildcards used in the fl list? Thanks! 
-- View this message in context: http://lucene.472066.n3.nabble.com/Solr-Performance-Improvement-and-degradation-Help-tp3767015p3769995.html Sent from the Solr - User mailing list archive at Nabble.com.
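One way to narrow down where the time goes is to compare the wall-clock time of a request with the QTime that Solr reports in the response header. The sketch below is only an illustration of that comparison, not something taken from the thread: the core URL, the `*:*` query, the use of `wt=json` instead of XML, and the helper names `build_select_url`, `time_request`, and `main` are all assumptions, and the field list is a shortened copy of the one in the post.

```python
import json
import time
import urllib.parse
import urllib.request

# Hypothetical core URL; replace with your own host, port, and core path.
SOLR_SELECT_URL = "http://localhost:8983/solr/select"

# A shortened copy of the field list from the post, including wildcards.
FIELDS = [
    "id",
    "CategoryGroupTypeID",
    "timestamp",
    "Category_*_MemberReports",
    "Category_*_Grade",
    "score",
]


def build_select_url(base_url, query, fields, rows=20):
    """Return a select URL with a comma-separated fl list and JSON output."""
    params = {
        "q": query,
        "fl": ",".join(fields),
        "rows": rows,
        "wt": "json",  # assumption: JSON is used here only for easy parsing
    }
    return base_url + "?" + urllib.parse.urlencode(params)


def time_request(url):
    """Fetch url and return (wall_clock_ms, qtime_ms, payload_bytes)."""
    start = time.perf_counter()
    with urllib.request.urlopen(url) as response:
        body = response.read()
    wall_ms = (time.perf_counter() - start) * 1000.0
    qtime_ms = json.loads(body)["responseHeader"]["QTime"]
    return wall_ms, qtime_ms, len(body)


def main():
    """Run one timed request against the (hypothetical) core and print it."""
    url = build_select_url(SOLR_SELECT_URL, "*:*", FIELDS)
    wall_ms, qtime_ms, size = time_request(url)
    print(f"wall={wall_ms:.0f} ms  QTime={qtime_ms} ms  bytes={size}")
```

Calling `main()` against each build with the same query gives three numbers per build. A large gap between wall-clock time and QTime would suggest the time is being spent after the search itself, for example in loading stored fields or writing and transferring the response, rather than in query execution.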
Re: Solr Performance Improvement and degradation Help
Ah, no, my mistake. The wildcards for the fl list won't matter re: maxBooleanClauses; I didn't read carefully enough. I assume that just returning a field or two doesn't slow down.

But one possible culprit, especially since you say this kicks in after a while, is garbage collection. Here's an excellent intro: http://www.lucidimagination.com/blog/2011/03/27/garbage-collection-bootcamp-1-0/

Especially look at the "getting a view into garbage collection" section and try specifying those options. The result should be that your Solr log gets stats dumped every time GC kicks in. If this is a problem, look at the times in the logfile after your system slows down. You'll see a bunch of GC dumps that collect very little unused memory.

You can also connect to the process using jConsole (should be in the Java distro) and watch the memory tab, especially after your server has slowed down. You can also connect jConsole remotely...

This is just an experiment, but any time I see "and it slows down after ### minutes", GC is the first thing I think of.

Best
Erick
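As a rough illustration of "specifying those options": the exact flags recommended in the linked article are not reproduced in this thread, so the ones below are an assumption. They are the standard HotSpot GC-logging flags for the Java 6/7 JVMs of that period (newer JVMs replace them with -Xlog:gc*). The jar name start.jar, the log file name gc.log, and the helper `solr_start_command` are likewise assumed for the sketch.

```python
import shlex

# Assumed GC-logging flags (HotSpot, Java 6/7 era); not quoted from the thread.
GC_LOGGING_FLAGS = [
    "-verbose:gc",             # report each collection
    "-XX:+PrintGCDetails",     # per-generation sizes before and after
    "-XX:+PrintGCTimeStamps",  # seconds since JVM start for each event
    "-Xloggc:gc.log",          # write GC output to this file (assumed name)
]


def solr_start_command(extra_jvm_flags, jar="start.jar"):
    """Build the argument list for launching the JVM with extra flags."""
    return ["java", *extra_jvm_flags, "-jar", jar]


# Print the command line so it can be pasted into a shell or start script.
print(shlex.join(solr_start_command(GC_LOGGING_FLAGS)))
```

With logging enabled, the entries written around the time the responses jump from milliseconds to seconds are the ones to inspect: frequent collections that reclaim very little memory would support the GC hypothesis.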
Solr Performance Improvement and degradation Help
As I've mentioned before, I'm very new to Solr. I'm not a Java guy or an Apache guy. I'm a .Net guy.

We have a rather large schema - some 100+ fields plus a large number of dynamic fields. We've been trying to improve performance and finally got around to implementing fast vector highlighting, which gave us an immediate improvement in the QTime (nearly 70%) and also improved the overall response time by over 20%.

With that, we also bring back an extraordinarily large amount of data in the XML. Some results (20 records) come back with a payload between 3MB and even 17MB. We have a lot of report text that is used for searching and highlighting. We recently implemented field list wildcards on two versions of Solr to test it out. This allowed us to leave the report text off the return and decreased the payload significantly - by nearly 85% in the large cases.

So we'd expect a performance boost there; however, we are seeing greatly increased response times on these builds of Solr even though the QTime is incredibly fast.

To put it in perspective: our original Solr core is 4.0, I believe the 4.0.0.2010.12.10.08.54.56 version. On our test boxes, we have one running 4.0.0.2011.11.17 and one running 4.0.0.2012.02.16.

With the older version (not having the wildcard field list), it returns a payload of approximately 13MB in an average of 1.5 seconds. With the new version (2012.02.16), which is on the same machines as the older version (so network traffic/latency/hardware/etc. are all the same), it's returning the reduced payload (approximately 1.5MB) in an average of 3.5-4 seconds. I will say that we reloaded the core once and briefly saw the 1.5MB payload come back in 150-200 milliseconds, but within minutes we were back to the 3.5-4 seconds.

We also noticed the CPU was being pegged for seconds when running the queries on the new build with the wildcard field list.

We have a lower-scale box running the 2011.11.17 version and had more success for a while. We were getting the 150-200 ms response time on the reduced payload for probably 30 minutes or so, and then it did the same thing - bumped up to 3-4 seconds in response time.

Does anyone have any experience with this type of random yet consistent performance degradation, or insight as to what might be causing the issues and how to fix them? We'd love to have not only the performance boost from fast vector highlighting, but also the decreased payload size.

Thanks in advance!

-- View this message in context: http://lucene.472066.n3.nabble.com/Solr-Performance-Improvement-and-degradation-Help-tp3767015p3767015.html Sent from the Solr - User mailing list archive at Nabble.com.
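To pin down when the jump from 150-200 ms to several seconds happens, the response times of a repeated query can be recorded and scanned for the first sample that is far above an early baseline. The sketch below is one hypothetical way to do that; the function name `first_degradation`, the threshold factor of 3, and the baseline window of 5 samples are arbitrary choices, and the sample list is made up to mimic the shape described in the post, not measured data.

```python
import statistics


def first_degradation(samples_ms, baseline_count=5, factor=3.0):
    """Return the index of the first sample slower than factor x baseline.

    The baseline is the median of the first baseline_count samples.
    Returns None when there are too few samples or no sample crosses
    the threshold.
    """
    if len(samples_ms) <= baseline_count:
        return None
    baseline = statistics.median(samples_ms[:baseline_count])
    for index in range(baseline_count, len(samples_ms)):
        if samples_ms[index] > factor * baseline:
            return index
    return None


# Illustrative values only: fast at first, then several seconds per request.
samples = [180, 160, 200, 175, 190, 185, 170, 3600, 3900, 3500]
print(first_degradation(samples))  # prints 7
```

If each sample is stored with a timestamp, the index returned here can be lined up against the server's logs to see what else was happening at the moment the slowdown began.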
Re: Solr Performance Improvement and degradation Help
As an update to this... I tried running a query against the 4.0.0.2010.12.10.08.54.56 version and the newer 4.0.0.2012.02.16 version (both on the same box). The query params were the same and the returned results were the same, but the 4.0.0.2010.12.10.08.54.56 version returned the results in about 1.6 seconds and the newer (4.0.0.2012.02.16) version returned the results in about 4 seconds.

If I add the wildcard field list to the newer version, the time increases by anywhere from 0.5 to 1 second.

These are all averages after running the queries several times over a 30-minute period (allowing for warming and caching).

Does anybody have any insight into why the newer versions are performing a bit slower?

-- View this message in context: http://lucene.472066.n3.nabble.com/Solr-Performance-Improvement-and-degradation-Help-tp3767015p3767725.html Sent from the Solr - User mailing list archive at Nabble.com.