Last week we put our Solr in production. It was a very smooth start. Solr really works great and without any problems so far. It's a huge improvement over our old intranet search.
However, I wonder whether we can increase the search performance of our Solr installation, just to make the search experience even better. I know that performance depends on many different things and parameters, so a general answer is hard to give. Here are some figures:

- at the moment we have about 20,000 search queries a day
- median query time is about 400ms
- about 80% run under 500ms
- about 90% run under 1s
- about 10% take over 1s, 3% over 2s
- there are even some queries which take far too long, over 6s and up to 18s; some of them are simple one-word queries

Maybe there is one special thing to mention. We send a kind of user filter with each query. These parameters differ for each user group, so I think at least one of the caches won't work very well: even if the query (foobar) is the same, fq and bq can (and will) differ from user to user. For example, the filter query

fq=__intern:0+OR+__intern:344

goes together with a boost query

bq=__lokal:0^6+OR+__lokal:344^2

Our query looks like this:

INFO: [core] webapp=/solr path=/select params={spellcheck=true&facet=on&facet.limit=500&initSearch=1&hl=on&version=1.2&bq=__lokal:0^6+OR+__lokal:344^2&fl=score,+id,+title,+visiblePath,+__doctype,+_erstelldatum,+_dienststelle,+_dokumententyp,+__source,+__intern,+objClass,+jurislinkUrl,+destinationUrl,+_aktenzeichen,+_stelle,+_zielgruppen,+_stichwort,+_kurzbeschreibung,+_autor,+_hauptthema,+_unterthema&facet.field=__source&facet.field=__dst&facet.field=__cyear&facet.field=_dokumententyp&facet.field=__mikronav&facet.field=_zielgruppen&facet.field=__doctype&spellcheck.count=2&qt=dismax&fq=__intern:0+OR+__intern:344&hl.fragsize=640&facet.mincount=1&spellcheck.extendedResults=true&json.nl=map&hl.fl=body,+_kurzbeschreibung,+_stichwort&wt=json&spellcheck.collate=true&hl.maxAnalyzedChars=99999&rows=20&spellcheck.onlyMorePopular=false&start=0&facet.sort=index&q=foobar} hits=93 status=0 QTime=113

- we have indexed 115,000 documents, our index size is about 720 MB

Any hints
where to look? What does ramBufferSizeMB in mainIndex in solrconfig.xml do? Does it make sense to increase this value? Should we increase one of the caches?

- we're using Jetty and Java JDK 1.6.0_21; the Java settings are
  -D64 -server -Xms892m -Xmx2048m -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:-HeapDumpOnOutOfMemoryError -XX:+CMSClassUnloadingEnabled -XX:+CMSPermGenSweepingEnabled
- our machine has 4GB of memory and 4 CPUs, load is about 0.6%, the Java process seems to use only one CPU, and no other services are running on this machine
- we have had a master/slave setup from the beginning, but at the moment we are only working with the master. Yesterday I included the slave in our search application, so that half the queries were handled by the master and the other half by the slave. The query times didn't change, so it is not a bottleneck of our machine, I/O, or memory.
- cache stats from the admin panel:

queryResultCache, LRU Cache(maxSize=65536, initialSize=65536)
  lookups : 1159
  hits : 498
  hitratio : 0.42 (<=== seems a bit low compared to the others)
  inserts : 697
  evictions : 0
  size : 661
  warmupTime : 0
  cumulative_lookups : 91470
  cumulative_hits : 41370
  cumulative_hitratio : 0.45
  cumulative_inserts : 52835
  cumulative_evictions : 0

documentCache, LRU Cache(maxSize=32768, initialSize=32768)
  lookups : 53099
  hits : 45429
  hitratio : 0.85
  inserts : 7670
  evictions : 0
  size : 7670
  warmupTime : 0
  cumulative_lookups : 4254335
  cumulative_hits : 3760521
  cumulative_hitratio : 0.88
  cumulative_inserts : 493814
  cumulative_evictions : 0

fieldValueCache, Concurrent LRU Cache(maxSize=10000, initialSize=10, minSize=9000, acceptableSize=9500, cleanupThread=false)
  lookups : 3312
  hits : 3306
  hitratio : 0.99
  inserts : 3
  evictions : 0
  size : 3
  warmupTime : 0
  cumulative_lookups : 261969
  cumulative_hits : 261351
  cumulative_hitratio : 0.99
  cumulative_inserts : 306
  cumulative_evictions : 0
  item__zielgruppen : {field=_zielgruppen,memSize=491861,tindexSize=46,time=10,phase1=10,nTerms=46,bigTerms=13,termInstances=53913,uses=1187}
  item___mikronav : {field=__mikronav,memSize=464524,tindexSize=82,time=5,phase1=5,nTerms=39,bigTerms=4,termInstances=18817,uses=1187}
  item___dst : {field=__dst,memSize=464640,tindexSize=66,time=10,phase1=9,nTerms=160,bigTerms=5,termInstances=86516,uses=1187}
  (these are a few of our facet fields)

filterCache, Concurrent LRU Cache(maxSize=16384, initialSize=16384, minSize=14745, acceptableSize=15564, cleanupThread=false)
  lookups : 26851
  hits : 26434
  hitratio : 0.98
  inserts : 417
  evictions : 0
  size : 417
  warmupTime : 0
  cumulative_lookups : 1985851
  cumulative_hits : 1959304
  cumulative_hitratio : 0.98
  cumulative_inserts : 26547
  cumulative_evictions : 0

Markus Rietzler <rietzler_software/>
Rechenzentrum der Finanzverwaltung
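
PS: As a quick sanity check on the cache stats in this mail, the hit ratios can be recomputed from the raw lookups and hits counters. This is only a small Python sketch and not part of our Solr setup. The counter values are copied from the admin panel output quoted here; the names CACHE_STATS, hit_ratio and rank_by_hit_ratio are made up for illustration.

```python
# Raw counters copied from the admin-panel cache stats quoted in this mail:
# cache name -> (lookups, hits) for the current searcher.
# The dictionary name and layout are illustrative, not a Solr API.
CACHE_STATS = {
    "queryResultCache": (1159, 498),
    "documentCache": (53099, 45429),
    "fieldValueCache": (3312, 3306),
    "filterCache": (26851, 26434),
}


def hit_ratio(lookups, hits):
    """Return hits / lookups, or 0.0 if the cache has not been queried yet."""
    return hits / lookups if lookups else 0.0


def rank_by_hit_ratio(stats):
    """Return (name, ratio) pairs, lowest hit ratio first."""
    ratios = [
        (name, hit_ratio(lookups, hits))
        for name, (lookups, hits) in stats.items()
    ]
    return sorted(ratios, key=lambda pair: pair[1])


if __name__ == "__main__":
    # Print the caches from worst to best hit ratio.
    for name, ratio in rank_by_hit_ratio(CACHE_STATS):
        print(f"{name:18s} {ratio:.2f}")
```

Sorted this way, queryResultCache comes out first, which fits the impression that it is the weakest of the four caches, presumably because fq and bq vary per user group.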