Re: Performance troubles with solr
Thank you all for your fast replies.

Changing photo_id:* to a boolean has_photo field, populated via a transformer when importing data, *fixed my problems*, reducing query times to about *30 ms*. I'll try to optimize further using your advice on filter query usage and the int to tint change (I'll read up on it first).

On Thu, Sep 15, 2011 at 1:31 AM, Chris Hostetter hossman_luc...@fucit.org wrote:

> : q=photo_id:* AND gender:true AND country:MALAWI AND online:false
>
> photo_id:* does not mean what you probably think it means. you most likely want photo_id:[* TO *] given your current schema, but i would recommend adding a new has_photo boolean field and using that instead. that alone should explain a big part of why those queries would be slow.
>
> you didn't describe how your q param varies in your test queries (just your fq). I'm assuming gender and online can vary, and that you sometimes don't use the photo_id clauses, and that the country clause can vary, but that these clauses are always all mandatory. in which case i would suggest using fq for all of them individually, and leaving your q param as *:* (unless you sometimes sort on the actual solr score, in which case leave it as whatever part of the query you actually want to contribute to the score)
>
> Lastly: I don't remember off the top of my head how int and tint are defined in the example solrconfig files, but you should consider your usage of them carefully -- particularly with the precisionStep and which fields you do range queries on.
>
> -Hoss
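For readers following along, here is a minimal Python sketch of the request layout Hoss suggests: q left as *:* and each mandatory clause sent as its own fq. The function name `build_params` and its arguments are hypothetical, and the has_photo field is the one proposed in the reply, not part of the schema originally posted. The sketch only builds the parameter list; it does not contact a Solr server.

```python
from urllib.parse import urlencode


def build_params(gender, country, online, require_photo, extra_filters=()):
    """Return Solr request parameters with every mandatory clause as its own fq."""
    filters = [
        f"gender:{str(gender).lower()}",
        f"country:{country}",
        f"online:{str(online).lower()}",
    ]
    if require_photo:
        # has_photo is the boolean field suggested in the reply (an assumption
        # about the final schema, which the thread does not show).
        filters.append("has_photo:true")
    filters.extend(extra_filters)

    # q matches everything; the result is still sorted by userscore.
    params = [("q", "*:*"), ("sort", "userscore desc"), ("start", "0"), ("rows", "150")]
    # Repeating the fq key lets each clause be cached independently.
    params.extend(("fq", f) for f in filters)
    return params


params = build_params(True, "MALAWI", False, True, ["userscore:[500 TO *]"])
query_string = urlencode(params)
print(query_string)
```

Keeping the clauses in separate fq parameters, rather than one combined fq, means a filter such as country:MALAWI can be reused by later requests that differ only in gender or online status.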
Performance troubles with solr
Hi,

I'm having performance troubles with Solr. I don't know if I'm expecting too much from Solr or if I misconfigured it. When I run a single query, its QTime is ~500-1000 ms (without any use of caches). When I run my test script (with use of caches), QTime increases sharply, reaching ~8,000 to ~60,000 ms, and CPU usage rises to ~550%.

My Solr start script:

    java -Duser.timezone=EET -Xmx6000m -jar ./start.jar

The index holds ~2,000,000 documents. There are currently no commits, but in the future there will be ~5,000 updates/additions to documents every 3-5 minutes via delta import.

Search Query

    sort=userscore+desc
    start=0
    q=photo_id:* AND gender:true AND country:MALAWI AND online:false
    fq=birth:[NOW-31YEARS/DAY TO NOW-17YEARS/DAY]   (random age ranges)
    fq=lastlogin:[* TO NOW-6MONTHS/DAY]   (only 2 options: [* TO NOW-6MONTHS/DAY] or [NOW-6MONTHS/DAY TO *])
    fq=userscore:[500 TO *]   (only 2 options: [500 TO *] or [* TO 500])
    rows=150

Schema

    <field name="id" type="long" indexed="true" stored="true" required="true"/>
    <field name="username" type="string" indexed="true" stored="false" required="true"/>
    <field name="namesurname" type="string" indexed="true" stored="false"/>
    <field name="network" type="int" indexed="true" stored="false"/>
    <field name="photo_id" type="int" indexed="true" stored="false"/>
    <field name="gender" type="boolean" indexed="true" stored="false"/>
    <field name="country" type="string" indexed="true" stored="false"/>
    <field name="birth" type="tdate" indexed="true" stored="false"/>
    <field name="lastlogin" type="tdate" indexed="true" stored="false"/>
    <field name="online" type="boolean" indexed="true" stored="false"/>
    <field name="userscore" type="int" indexed="true" stored="false"/>

Cache Sizes and Lazy Load

    <filterCache class="solr.FastLRUCache" size="16384" initialSize="4096" autowarmCount="4096"/>
    <queryResultCache class="solr.LRUCache" size="16384" initialSize="4096" autowarmCount="4096"/>
    <documentCache class="solr.LRUCache" size="16384" initialSize="4096" autowarmCount="4096"/>
    <enableLazyFieldLoading>true</enableLazyFieldLoading>
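As an illustration of the "random age ranges" filter in the query above, the following Python sketch builds the birth filter string and counts how many distinct filters a test script could generate. It assumes that [NOW-31YEARS/DAY TO NOW-17YEARS/DAY] is meant to select users aged 17 to 30; the function names and the 17-60 age span are hypothetical, since the thread does not show the test script.

```python
def birth_filter(min_age, max_age):
    """Build a birth-date fq for users aged min_age..max_age inclusive."""
    if min_age > max_age:
        raise ValueError("min_age must not exceed max_age")
    # Someone aged max_age was born less than max_age + 1 years ago.
    return f"birth:[NOW-{max_age + 1}YEARS/DAY TO NOW-{min_age}YEARS/DAY]"


def count_distinct_filters(lowest_age, highest_age):
    """Number of (min_age, max_age) pairs with lowest <= min <= max <= highest."""
    n = highest_age - lowest_age + 1
    return n * (n + 1) // 2


example = birth_filter(17, 30)
distinct = count_distinct_filters(17, 60)  # hypothetical age span
print(example)
print(distinct)
```

Because both bounds are rounded with /DAY, a given age pair resolves to the same date range for the whole day, so the set of distinct birth filters is bounded by the number of age pairs rather than growing with every request.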
Re: Performance troubles with solr
Thank you for your reply. I tried to give as much information as I could, but obviously I missed some.

> 1. Just what does your test script do? Is it doing updates, or just queries of the sort you mentioned below?

The test script only sends random queries.

> 2. If the test script is doing updates, how are those updates being fed to Solr?

There are no updates right now, since I got stuck on performance.

> 3. What version of Solr are you running?

I'm using Solr 3.3.0.

> 4. Why did you increase the default for jetty (around 384m) to 6000m, particularly given your relatively modest number of documents (2,000,000).

I was trying everything before asking here.

> 5. Machine characteristics, particularly operating system and physical memory on the machine.

OS = Debian 6.0, physical memory = 32 GB, CPU = 2x Intel quad core.

On Wed, Sep 14, 2011 at 5:38 PM, Jaeger, Jay - DOT jay.jae...@dot.wi.gov wrote:

> I think folks are going to need a *lot* more information. Particularly:
>
> 1. Just what does your test script do? Is it doing updates, or just queries of the sort you mentioned below?
> 2. If the test script is doing updates, how are those updates being fed to Solr?
> 3. What version of Solr are you running?
> 4. Why did you increase the default for jetty (around 384m) to 6000m, particularly given your relatively modest number of documents (2,000,000).
> 5. Machine characteristics, particularly operating system and physical memory on the machine.
>
> Please refer to http://wiki.apache.org/solr/UsingMailingLists for additional guidance in using the mailing list to get help.
Re: Performance troubles with solr
I tried moving the age query from a filter query into the main query, but nothing really changed. When I moved everything into the query itself (removing all filter queries), QTimes got much slower.

I don't have a problem with memory or CPU usage; my problem is query response times. When I send only one query, response times vary from 500 ms to 1000 ms (non-cached), which is too much. When I send a set of random queries (10-20 queries per second), response times go crazy (8 seconds to 60+ seconds).

On Wed, Sep 14, 2011 at 6:07 PM, Jaeger, Jay - DOT jay.jae...@dot.wi.gov wrote:

> I don't have enough experience with filter queries to advise well on when to use fq vs. putting it in the query itself, but I do know that we are not using filter queries, and with index sizes ranging from 7 Million to 27+ Million we have not seen this kind of issue.
>
> Maybe keeping 16,384 filter queries around, particularly caching the ones with random age ranges is eating your memory up -- so perhaps try moving just that particular fq into q instead (since it is random) and just cache the ones where the number of options is limited?
>
> What happens if you try your test without the filter queries? What happens if you put the additional criteria that are in your filter query into the query itself?
>
> JRJ
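To put a rough number on Jay's concern about keeping 16,384 filter queries around, here is a back-of-the-envelope Python estimate. It assumes each cached filter is held as a bitset with one bit per document, which is a worst case: Solr can store small result sets more compactly, and per-object overhead is ignored. The function names are hypothetical; the document count and cache size are the figures posted in the thread.

```python
def bitset_bytes(max_doc):
    """Bytes for a bitset with one bit per document, in whole 64-bit words."""
    words = (max_doc + 63) // 64
    return words * 8


def worst_case_cache_bytes(max_doc, cache_size):
    """Upper bound: every cache entry is a full bitset over the index."""
    return bitset_bytes(max_doc) * cache_size


per_entry = bitset_bytes(2_000_000)                 # bytes per cached filter
total = worst_case_cache_bytes(2_000_000, 16384)    # bytes for a full cache
print(per_entry, total, round(total / 2**30, 2))
```

Under these assumptions a completely full filterCache could approach 4 GB, a large share of the 6000m heap configured in the start script, which is consistent with the suggestion to cache only the filters that have a small number of possible values.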