Re: Solr performance issues
Thanks all. I have the same index with a slightly different schema and 200M documents, installed on 3 r3.xlarge instances (30GB RAM and 600GB General Purpose SSD each). The size of that index is about 1.5TB, with many updates every 5 minutes and complex queries with faceting, at a response time of 100ms, which is acceptable for us.

Toke Eskildsen,

Is the index updated while you are searching? *No*

Do you do any faceting or other heavy processing as part of a search? *No*

How many hits does a search typically have and how many documents are returned? *The test measured QTime only, with no documents returned and the number of hits varying from 50,000 to 50,000,000.*

How many concurrent searches do you need to support? How fast should the response time be? *Maybe 100 concurrent searches at 100ms, with facets.*

Would splitting each shard into two shards on the same node, so that every shard sits on a single EBS volume, be better than using LVM?

Thanks

On Mon, Dec 29, 2014 at 2:00 AM, Toke Eskildsen t...@statsbiblioteket.dk wrote:

Mahmoud Almokadem [prog.mahm...@gmail.com] wrote: We've installed a cluster of one collection of 350M documents on 3 r3.2xlarge (60GB RAM) Amazon servers. The size of the index on each shard is about 1.1TB and the maximum volume size on Amazon is 1TB, so we added two General Purpose SSD EBS volumes (1x1TB + 1x500GB) to each instance and then created a 1.5TB logical volume using LVM to fit our index.

Your search speed will be limited by the slowest storage in your group, which would be your 500GB EBS volume. The General Purpose SSD option means (as far as I can read at http://aws.amazon.com/ebs/details/#piops) that your baseline of 3 IOPS/GB = 1500 IOPS, with bursts of 3000 IOPS. Unfortunately they do not say anything about latency.

For comparison, I checked the system logs from a local test with our 21TB / 7 billion document index. It used ~27,000 IOPS during the test, with a mean search time a bit below 1 second. That was with ~100GB RAM for disk cache, which is about ½% of the index size. The test was with simple term queries (1-3 terms) and some faceting.

Back of the envelope: 27,000 IOPS for 21TB is ~1300 IOPS/TB. Your indexes are 1.1TB, so 1.1 * 1300 IOPS ~= 1400 IOPS. All else being equal (which is never the case), getting 1-3 second response times for a 1.1TB index, when one link in the storage chain is capped at a few thousand IOPS, you are using networked storage and you have little RAM for caching, does not seem unrealistic. If possible, you could try temporarily boosting the performance of the EBS volumes to see whether raw IO is the bottleneck.

The response time is between 1 and 3 seconds for simple queries (1 token).

Is the index updated while you are searching? Do you do any faceting or other heavy processing as part of a search? How many hits does a search typically have and how many documents are returned? How many concurrent searches do you need to support? How fast should the response time be?

- Toke Eskildsen
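Toke's back-of-the-envelope numbers can be reproduced with a few lines of arithmetic (a rough sketch; the 3 IOPS/GB baseline and 3000 IOPS burst cap are taken from the AWS gp2 description cited above, and `gp2_baseline_iops` is just an illustrative helper):

```python
def gp2_baseline_iops(volume_gb, iops_per_gb=3, burst_cap=3000):
    # AWS gp2 baseline scales with volume size; bursts are capped at 3000.
    return min(volume_gb * iops_per_gb, burst_cap)

# The 500GB volume in the LVM stripe sets the floor for the whole group.
print(gp2_baseline_iops(500))              # 1500 IOPS baseline

# Scale the observed IOPS of the 21TB reference index down to a 1.1TB shard.
reference_iops_per_tb = 27_000 / 21        # ~1286 IOPS per TB of index
print(round(1.1 * reference_iops_per_tb))  # ~1414 IOPS wanted per shard
```

The estimated need (~1400 IOPS) sits right at the 500GB volume's 1500 IOPS baseline, which is why temporarily boosting EBS performance is a useful experiment.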
Re: Solr performance issues
On 12/29/2014 2:36 AM, Mahmoud Almokadem wrote: I have the same index with a slightly different schema and 200M documents, installed on 3 r3.xlarge instances (30GB RAM and 600GB General Purpose SSD each). The size of that index is about 1.5TB, with many updates every 5 minutes and complex queries with faceting, at a response time of 100ms, which is acceptable for us. [...] Would splitting each shard into two shards on the same node, so that every shard sits on a single EBS volume, be better than using LVM?

The basic problem is simply that the system has so little memory that it must read large amounts of data from the disk when it does a query. There is not enough RAM to cache the important parts of the index.

RAM is much faster than disk, even SSD. Typical consumer-grade DDR3-1600 memory has a data transfer rate of about 12800 megabytes per second. If it's ECC memory (which I would call a requirement), the transfer rate is probably a little slower than that: figuring 9 bits for every byte gets us about 11377 MB/s. That's only an estimate, and it could be wrong in either direction, but I'll go ahead and use it.

http://en.wikipedia.org/wiki/DDR3_SDRAM#JEDEC_standard_modules

If your SSD is SATA, the transfer rate will be limited to approximately 600MB/s -- the 6 gigabit per second transfer rate of the newest SATA standard. That makes memory about 18 times as fast as a SATA SSD. I saw one PCI Express SSD (http://ocz.com/enterprise/z-drive-4500/specifications) that claimed a transfer rate of 2900 MB/s. Even that is only about one fourth of the estimated speed of DDR3-1600 with ECC.

I don't know what interface technology Amazon uses for their SSD volumes, but I would bet on it being the cheaper option, which would mean SATA. The networking between the EC2 instance and the EBS storage is unknown to me and may be a further bottleneck.

Bottom line -- you need a lot more memory. Speeding up the disk may *help* ... but it will not replace that simple requirement. With EC2 as the platform, you may need more instances and more shards.

Your 200 million document index that works well with only 90GB of total memory ... that's surprising to me. It means that the important parts of that index *do* fit in memory ... but if the index gets much larger, performance is likely to drop off sharply.

Thanks,
Shawn
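Shawn's bandwidth comparison, as plain arithmetic (a sketch of the estimate above, nothing more):

```python
ddr3_1600 = 12800        # MB/s, from the JEDEC table linked above
ecc = ddr3_1600 * 8 / 9  # ECC stores 9 bits for every usable byte
sata3 = 600              # MB/s, roughly the SATA 6Gb/s ceiling
pcie_ssd = 2900          # MB/s, the PCI Express drive mentioned above

print(round(ecc))        # ~11378 MB/s usable memory bandwidth
print(ecc / sata3)       # ~18.96: memory vs. a SATA SSD
print(ecc / pcie_ssd)    # ~3.92: memory vs. the fast PCIe drive
```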
Re: Solr performance issues
Thanks Shawn. What do you mean by the "important parts" of the index, and how do I calculate their size?

Thanks,
Mahmoud

Sent from my iPhone

On Dec 29, 2014, at 8:19 PM, Shawn Heisey apa...@elyograg.org wrote: The basic problem is simply that the system has so little memory that it must read large amounts of data from the disk when it does a query. There is not enough RAM to cache the important parts of the index. [...]
Re: Solr performance issues
On 12/29/2014 12:07 PM, Mahmoud Almokadem wrote: What do you mean by the important parts of the index, and how do I calculate their size?

I have no formal education in what's important when it comes to doing a query, but I can make some educated guesses. Starting with this as a reference:

http://lucene.apache.org/core/4_10_0/core/org/apache/lucene/codecs/lucene410/package-summary.html#file-names

I would guess that the segment info (.si) files and the term index (*.tip) files are supremely important to *always* have in memory, and they are fairly small. Next would be the term dictionary (*.tim) files. The term dictionary is pretty big, and it is very important for fast queries.

Frequencies, positions, and norms may also be important, depending on exactly what kind of queries you run. Frequencies and positions are quite large. Frequencies are critical for relevance ranking (the default sort by score), and positions are important for phrase queries. Position data may also be used by relevance ranking, but I am not familiar enough with it to say for sure.

If you have docValues defined, then the *.dvm and *.dvd files are used for facets and for sorting on those specific fields. The *.dvd files can be very big, depending on your schema.

The *.fdx and *.fdt files become important when actually retrieving results, after the matching documents have been determined. The stored data is compressed, so additional CPU power is required to uncompress it before it is sent to the client. Stored data may be large or small, depending on your schema. Stored data does not directly affect search speed, but if memory is limited, every block of stored data that gets retrieved will push some other part of the index out of the OS disk cache, which means that part may need to be re-read from disk on the next query.

Thanks,
Shawn
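One practical way to see how this breaks down for a given index is to total the on-disk size per file extension (a sketch; `index_dir` would be a core's data/index directory, and the extensions follow the codec documentation linked above):

```python
import os
from collections import defaultdict

def index_size_by_extension(index_dir):
    """Total file sizes per Lucene extension (.si, .tip, .tim, .dvd, .fdt, ...)."""
    totals = defaultdict(int)
    for name in os.listdir(index_dir):
        path = os.path.join(index_dir, name)
        if os.path.isfile(path):
            ext = os.path.splitext(name)[1] or name
            totals[ext] += os.path.getsize(path)
    return dict(totals)

# Example usage, largest file types first:
# for ext, size in sorted(index_size_by_extension("data/index").items(),
#                         key=lambda kv: -kv[1]):
#     print(f"{ext:6} {size / 2**20:10.1f} MB")
```

Comparing the .si + .tip + .tim totals against free RAM gives a first approximation of whether the "important parts" can stay cached.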
RE: Solr performance issues
Mahmoud Almokadem [prog.mahm...@gmail.com] wrote: I have the same index with a slightly different schema and 200M documents, installed on 3 r3.xlarge instances (30GB RAM and 600GB General Purpose SSD each). The size of that index is about 1.5TB, with many updates every 5 minutes and complex queries with faceting, at a response time of 100ms, which is acceptable for us.

So you have:

Setup 1: 3 * (30GB RAM + 600GB SSD) for a total of 1.5TB index, 200M docs. Acceptable performance.

Setup 2: 3 * (60GB RAM + 1TB SSD + 500GB SSD) for a total of 3.3TB index, 350M docs. Poor performance.

The only real difference, besides doubling everything, is the LVM? I understand why you find that to be the culprit, but from what I can read, the overhead should not be anywhere near enough to cause the performance drop you are describing. Could it be that some snapshotting or backup was running when you tested?

Splitting your shards and doubling the number of machines, as you suggest, would result in

Setup 3: 6 * (60GB RAM + 600GB SSD) for a total of 3.3TB index, 350M docs.

which would be remarkably similar to your setup 1. I think that would be the next logical step, unless you can easily do a temporary boost of your IOPS.

BTW: You are getting dangerously close to your storage limits here - it seems that a single large merge could make you run out of space.

- Toke Eskildsen
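The comparison can be made concrete with the per-node numbers (a sketch restating the figures above; the labels are just the three setups Toke lists):

```python
setups = {
    "setup 1 (acceptable)": dict(nodes=3, ram_gb=30, index_tb=1.5),
    "setup 2 (poor)":       dict(nodes=3, ram_gb=60, index_tb=3.3),
    "setup 3 (proposed)":   dict(nodes=6, ram_gb=60, index_tb=3.3),
}
for name, s in setups.items():
    index_per_node_gb = s["index_tb"] * 1000 / s["nodes"]
    ram_fraction = s["ram_gb"] / index_per_node_gb
    print(f"{name}: {index_per_node_gb:.0f}GB index per node, "
          f"RAM covers {ram_fraction:.0%} of it")
```

Setup 3's per-node index (~550GB) is close to setup 1's 500GB, with twice the RAM coverage, which is why it should behave at least as well.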
Re: Solr performance issues
On 12/26/2014 7:17 AM, Mahmoud Almokadem wrote: We've installed a cluster of one collection of 350M documents on 3 r3.2xlarge (60GB RAM) Amazon servers. The size of the index on each shard is about 1.1TB and the maximum volume size on Amazon is 1TB, so we added two General Purpose SSD EBS volumes (1x1TB + 1x500GB) to each instance and then created a 1.5TB logical volume using LVM to fit our index. The response time is between 1 and 3 seconds for simple queries (1 token). Has the LVM become a bottleneck for our index?

SSD is very fast, but its speed is still very slow compared to RAM. The problem here is that Solr must read data off the disk in order to do a query, and even at SSD speeds, that is slow. LVM is not the problem here, though it's possible that it is a contributing factor.

You need more RAM. For Solr to be fast, a large percentage (ideally 100%, but smaller fractions can often be enough) of the index must be held in unused RAM by the operating system. Your information seems to indicate that the index is about 3 terabytes. If that's the index size, I would guess that you need somewhere between 1 and 2 terabytes of total RAM for speed to be acceptable. Because RAM is *very* expensive on Amazon and is not available in sizes like 256GB-1TB, that typically means a lot of their virtual machines, with a lot of shards in SolrCloud. You may find that real hardware is less expensive in the long term than cloud hardware for very large Solr indexes.

Thanks,
Shawn
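The sizing guess above, as a proportion (a sketch; the one-third to two-thirds cache fractions are just my reading of Shawn's "1 to 2 terabytes for a ~3TB index" estimate, not a measured rule):

```python
index_tb = 3.3                       # total index size across the cluster
low_fraction, high_fraction = 1/3, 2/3   # assumed cache-coverage range
low, high = index_tb * low_fraction, index_tb * high_fraction
print(f"~{low:.1f}TB to ~{high:.1f}TB of RAM for the OS disk cache")
```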
RE: Solr performance issues
Mahmoud Almokadem [prog.mahm...@gmail.com] wrote: We've installed a cluster of one collection of 350M documents on 3 r3.2xlarge (60GB RAM) Amazon servers. The size of the index on each shard is about 1.1TB and the maximum volume size on Amazon is 1TB, so we added two General Purpose SSD EBS volumes (1x1TB + 1x500GB) to each instance and then created a 1.5TB logical volume using LVM to fit our index.

Your search speed will be limited by the slowest storage in your group, which would be your 500GB EBS volume. The General Purpose SSD option means (as far as I can read at http://aws.amazon.com/ebs/details/#piops) that your baseline of 3 IOPS/GB = 1500 IOPS, with bursts of 3000 IOPS. Unfortunately they do not say anything about latency.

For comparison, I checked the system logs from a local test with our 21TB / 7 billion document index. It used ~27,000 IOPS during the test, with a mean search time a bit below 1 second. That was with ~100GB RAM for disk cache, which is about ½% of the index size. The test was with simple term queries (1-3 terms) and some faceting.

Back of the envelope: 27,000 IOPS for 21TB is ~1300 IOPS/TB. Your indexes are 1.1TB, so 1.1 * 1300 IOPS ~= 1400 IOPS. All else being equal (which is never the case), getting 1-3 second response times for a 1.1TB index, when one link in the storage chain is capped at a few thousand IOPS, you are using networked storage and you have little RAM for caching, does not seem unrealistic. If possible, you could try temporarily boosting the performance of the EBS volumes to see whether raw IO is the bottleneck.

The response time is between 1 and 3 seconds for simple queries (1 token).

Is the index updated while you are searching? Do you do any faceting or other heavy processing as part of a search? How many hits does a search typically have and how many documents are returned? How many concurrent searches do you need to support? How fast should the response time be?

- Toke Eskildsen
Re: Solr performance issues
Likely lots of disk + network IO, yes. Put SPM for Solr on your nodes to double-check.

Otis

On Dec 26, 2014, at 09:17, Mahmoud Almokadem prog.mahm...@gmail.com wrote: Dears, We've installed a cluster of one collection of 350M documents on 3 r3.2xlarge (60GB RAM) Amazon servers. The size of the index on each shard is about 1.1TB and the maximum volume size on Amazon is 1TB, so we added two General Purpose SSD EBS volumes (1x1TB + 1x500GB) to each instance and then created a 1.5TB logical volume using LVM to fit our index. The response time is between 1 and 3 seconds for simple queries (1 token). Has the LVM become a bottleneck for our index? Thanks for help.
Re: Solr performance issues for simple query - q=*:* with start and rows
Hi,

How many shards do you have? This is a known issue with deep paging with multiple shards, see https://issues.apache.org/jira/browse/SOLR-1726

You may be more successful in going to each shard, one at a time (with distrib=false), to avoid this issue.

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com
Solr Training - www.solrtraining.com

29. apr. 2013 kl. 09:17 skrev Abhishek Sanoujam abhi.sanou...@gmail.com:

We have a Solr core with about 115 million documents. We are trying to migrate data by running a simple *:* query with start and rows parameters. Performance has become very slow in Solr: it's taking almost 2 minutes to get 4000 rows, and the migration is just too slow. Logs snippet below:

INFO: [coreName] webapp=/solr path=/select params={start=55438000&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=168308
INFO: [coreName] webapp=/solr path=/select params={start=55446000&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=122771
INFO: [coreName] webapp=/solr path=/select params={start=55454000&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=137615
INFO: [coreName] webapp=/solr path=/select params={start=55450000&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=141223
INFO: [coreName] webapp=/solr path=/select params={start=55462000&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=97474
INFO: [coreName] webapp=/solr path=/select params={start=55458000&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=98115
INFO: [coreName] webapp=/solr path=/select params={start=55466000&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=143822
INFO: [coreName] webapp=/solr path=/select params={start=55474000&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=118066
INFO: [coreName] webapp=/solr path=/select params={start=55470000&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=121498
INFO: [coreName] webapp=/solr path=/select params={start=55482000&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=164062
INFO: [coreName] webapp=/solr path=/select params={start=55478000&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=165518
INFO: [coreName] webapp=/solr path=/select params={start=55486000&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=118163
INFO: [coreName] webapp=/solr path=/select params={start=55494000&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=141642
INFO: [coreName] webapp=/solr path=/select params={start=55490000&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=145037

I've taken some thread dumps on the Solr server, and most of the time the threads seem to be busy in the following stacks. Is there anything that can be done to improve the performance? Is it a known issue? It is very surprising that just querying for some rows starting at certain points takes on the order of minutes.

395883378@qtp-162198005-7 prio=10 tid=0x7f4aa0636000 nid=0x295a runnable [0x7f42865dd000]
   java.lang.Thread.State: RUNNABLE
        at org.apache.lucene.util.PriorityQueue.downHeap(PriorityQueue.java:252)
        at org.apache.lucene.util.PriorityQueue.pop(PriorityQueue.java:184)
        at org.apache.lucene.search.TopDocsCollector.populateResults(TopDocsCollector.java:61)
        at org.apache.lucene.search.TopDocsCollector.topDocs(TopDocsCollector.java:156)
        at org.apache.solr.search.SolrIndexSearcher.getDocListNC(SolrIndexSearcher.java:1499)
        at org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:1366)
        at org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:457)
        at org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:410)
        at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:208)
        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
        at org.apache.solr.core.SolrCore.execute(SolrCore.java:1817)
        at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:639)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:345)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:141)

1154127582@qtp-162198005-3 prio=10 tid=0x7f4aa0613800 nid=0x2956 runnable [0x7f42869e1000]
   java.lang.Thread.State: RUNNABLE
        at org.apache.lucene.util.PriorityQueue.downHeap(PriorityQueue.java:252)
        at org.apache.lucene.util.PriorityQueue.updateTop(PriorityQueue.java:210)
        at org.apache.lucene.search.TopScoreDocCollector$InOrderTopScoreDocCollector.collect(TopScoreDocCollector.java:62)
        at org.apache.lucene.search.Scorer.score(Scorer.java:64)
Re: Solr performance issues for simple query - q=*:* with start and rows
Jan,

Would the same distrib=false help for distributed faceting? We are running into a similar issue with facet paging.

Dmitry

On Mon, Apr 29, 2013 at 11:58 AM, Jan Høydahl jan@cominvent.com wrote: Hi, How many shards do you have? This is a known issue with deep paging with multiple shards, see https://issues.apache.org/jira/browse/SOLR-1726 You may be more successful in going to each shard, one at a time (with distrib=false), to avoid this issue. -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com Solr Training - www.solrtraining.com 29. apr. 2013 kl. 09:17 skrev Abhishek Sanoujam abhi.sanou...@gmail.com: We have a Solr core with about 115 million documents. We are trying to migrate data by running a simple *:* query with start and rows parameters. Performance has become very slow in Solr: it's taking almost 2 minutes to get 4000 rows, and the migration is just too slow. [...]
Re: Solr performance issues for simple query - q=*:* with start and rows
We have a single shard, and all the data is in a single box. It definitely looks like deep paging is the problem. Just to understand: is the searcher looping over the result set every time and skipping the first "start" count? That will definitely take a toll when we reach higher start values.

On 4/29/13 2:28 PM, Jan Høydahl wrote: Hi, How many shards do you have? This is a known issue with deep paging with multiple shards, see https://issues.apache.org/jira/browse/SOLR-1726 You may be more successful in going to each shard, one at a time (with distrib=false), to avoid this issue. -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com Solr Training - www.solrtraining.com 29. apr. 2013 kl. 09:17 skrev Abhishek Sanoujam abhi.sanou...@gmail.com: We have a Solr core with about 115 million documents. We are trying to migrate data by running a simple *:* query with start and rows parameters. Performance has become very slow in Solr: it's taking almost 2 minutes to get 4000 rows, and the migration is just too slow. [...]
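Roughly, yes. For start=N and rows=M the collector keeps the top N+M scored documents and then throws the first N away, so the work grows with start, which is exactly what the thread dumps earlier in the thread (PriorityQueue inside TopDocsCollector) show. A simplified sketch of the idea, not Lucene's actual code:

```python
import heapq

def page(scored_docs, start, rows):
    # Keep the best (start + rows) hits by score, like TopDocsCollector does,
    # then discard the first 'start' of them. Cost scales with start, which
    # is why start=55,000,000 is slow even when rows is only 4000.
    top = heapq.nlargest(start + rows, scored_docs, key=lambda d: d[1])
    return top[start:start + rows]

docs = [("doc0", 0.2), ("doc1", 0.9), ("doc2", 0.5), ("doc3", 0.7)]
print(page(docs, start=1, rows=2))   # skips the best hit, returns the next two
```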
Re: Solr performance issues for simple query - q=*:* with start and rows
Abhishek,

There is a wiki page about this: http://wiki.apache.org/solr/CommonQueryParameters - search for pageDoc and pageScore.

On Mon, Apr 29, 2013 at 1:17 PM, Abhishek Sanoujam abhi.sanou...@gmail.com wrote: We have a single shard, and all the data is in a single box. It definitely looks like deep paging is the problem. Just to understand: is the searcher looping over the result set every time and skipping the first "start" count? That will definitely take a toll when we reach higher start values. On 4/29/13 2:28 PM, Jan Høydahl wrote: Hi, How many shards do you have? This is a known issue with deep paging with multiple shards, see https://issues.apache.org/jira/browse/SOLR-1726 You may be more successful in going to each shard, one at a time (with distrib=false), to avoid this issue. [...]
Re: Solr performance issues for simple query - q=*:* with start and rows
We've found that you can do a lot for yourself by using a filter query to page through your data, if it has a natural range to do so, instead of start and rows.

Michael Della Bitta
Appinions
18 East 41st Street, 2nd Floor
New York, NY 10017-6271
www.appinions.com
Where Influence Isn’t a Game

On Mon, Apr 29, 2013 at 6:44 AM, Dmitry Kan solrexp...@gmail.com wrote:

Abhishek, there is a wiki page regarding this: http://wiki.apache.org/solr/CommonQueryParameters (search for pageDoc and pageScore).

On Mon, Apr 29, 2013 at 1:17 PM, Abhishek Sanoujam abhi.sanou...@gmail.com wrote:

We have a single shard, and all the data is on a single box only. It definitely looks like deep paging is having problems. Just to understand: is the searcher looping over the result set every time and skipping the first "start" documents? That will definitely take a toll as we reach higher start values.

On 4/29/13 2:28 PM, Jan Høydahl wrote:

Hi, how many shards do you have? This is a known issue with deep paging on multiple shards; see https://issues.apache.org/jira/browse/SOLR-1726. You may be more successful going to each shard, one at a time (with distrib=false), to avoid this issue.

-- Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com
Solr Training - www.solrtraining.com

29. apr. 2013 kl. 09:17 skrev Abhishek Sanoujam abhi.sanou...@gmail.com:

We have a Solr core with about 115 million documents. We are trying to migrate data, running a simple *:* query with start and rows parameters. The performance has become too slow in Solr: it is taking almost 2 minutes to fetch 4000 rows, so the migration is far too slow.
Logs snippet below:

INFO: [coreName] webapp=/solr path=/select params={start=55438000&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=168308
INFO: [coreName] webapp=/solr path=/select params={start=55446000&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=122771
INFO: [coreName] webapp=/solr path=/select params={start=55454000&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=137615
INFO: [coreName] webapp=/solr path=/select params={start=5545&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=141223
INFO: [coreName] webapp=/solr path=/select params={start=55462000&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=97474
INFO: [coreName] webapp=/solr path=/select params={start=55458000&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=98115
INFO: [coreName] webapp=/solr path=/select params={start=55466000&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=143822
INFO: [coreName] webapp=/solr path=/select params={start=55474000&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=118066
INFO: [coreName] webapp=/solr path=/select params={start=5547&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=121498
INFO: [coreName] webapp=/solr path=/select params={start=55482000&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=164062
INFO: [coreName] webapp=/solr path=/select params={start=55478000&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=165518
INFO: [coreName] webapp=/solr path=/select params={start=55486000&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=118163
INFO: [coreName] webapp=/solr path=/select params={start=55494000&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=141642
INFO: [coreName] webapp=/solr path=/select params={start=5549&q=*:*&wt=javabin&version=2&rows=4000} hits=115760479 status=0 QTime=145037

I've taken some thread dumps on the Solr server, and most of the time the threads seem to be busy in the following stack. Is there anything that can be done to improve the performance? Is it a known issue? It is very surprising that querying for just a few rows starting at certain offsets takes on the order of minutes.

395883378@qtp-162198005-7 prio=10 tid=0x7f4aa0636000 nid=0x295a runnable [0x7f42865dd000]
java.lang.Thread.State: RUNNABLE
at org.apache.lucene.util.PriorityQueue.downHeap(PriorityQueue.java:252)
at org.apache.lucene.util.PriorityQueue.pop(PriorityQueue.java:184)
at org.apache.lucene.search.TopDocsCollector.populateResults(TopDocsCollector.java:61)
at org.apache.lucene.search.TopDocsCollector.topDocs(TopDocsCollector.java:156)
at org.apache.solr.search.SolrIndexSearcher.getDocListNC(SolrIndexSearcher.java:1499)
at org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:1366)
at org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:457)
at org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:410)
at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:208)
at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1817)
at org.apache.solr.servlet.SolrDispatchFilter.execute(
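Michael's filter-query suggestion can be sketched as follows. This is an editorial illustration, not code from the thread: the unique, sortable numeric `id` field is a hypothetical stand-in for whatever natural range the schema offers.

```python
# Sketch of paging by filter query instead of deep start offsets: sort
# on a unique field and ask for values above the last one seen, keeping
# start=0 on every page. The field name "id" is a hypothetical
# stand-in for whatever natural range the schema offers.
from urllib.parse import urlencode

def page_params(last_id=None, rows=4000, field="id"):
    """Query params for the page after last_id (None = first page)."""
    if last_id is None:
        fq = f"{field}:[* TO *]"            # first page: no lower bound
    else:
        fq = f"{field}:{{{last_id} TO *]"   # exclusive lower bound
    return urlencode({
        "q": "*:*",
        "fq": fq,
        "sort": f"{field} asc",
        "rows": rows,
        "start": 0,  # never grows, unlike deep paging
    })

first_page = page_params()
next_page = page_params(last_id=55438000)
```

Each response's last id seeds the next call, so Solr never has to collect and skip millions of leading hits.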
Re: Solr performance issues for simple query - q=*:* with start and rows
I guess so; you'd have to use a filter query to page through the set of documents you were faceting against, and sum the counts at the end. It's not quite the same operation as paging through results, because facets are aggregate statistics, but if you're willing to go to the trouble, I bet it would also help performance.

Michael Della Bitta
Appinions
18 East 41st Street, 2nd Floor
New York, NY 10017-6271
www.appinions.com
Where Influence Isn’t a Game

On Mon, Apr 29, 2013 at 9:06 AM, Dmitry Kan solrexp...@gmail.com wrote:

Michael, interesting! Can you apply this to facet searches as well?

Dmitry

On Mon, Apr 29, 2013 at 4:02 PM, Michael Della Bitta michael.della.bi...@appinions.com wrote:

We've found that you can do a lot for yourself by using a filter query to page through your data, if it has a natural range to do so, instead of start and rows.
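The stack trace in this thread shows where the time goes: TopDocsCollector keeps a priority queue sized start+rows and pops every entry, so the work grows with the offset even though only rows documents are returned. A small editorial model of that cost (fake scores, not Lucene's actual code):

```python
# Rough model of why deep paging is slow: to serve start=S&rows=R the
# collector must track and then pop S+R entries, discarding everything
# before S. The heap work scales with the offset, not the page size.
import heapq

def top_docs(scores, start, rows):
    """Return the `rows` hits after skipping `start`, collector-style."""
    queue_size = start + rows              # heap must hold start+rows entries
    best = heapq.nlargest(queue_size, scores)
    return best[start:start + rows]        # everything before `start` is wasted work

docs = [(i * 2654435761) % 1000003 for i in range(1000)]  # fake relevance scores
page = top_docs(docs, start=5, rows=3)
```

With start in the tens of millions, as in the logs above, the queue holds tens of millions of entries per request, which matches the multi-minute QTimes reported.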
Re: Solr performance issues for simple query - q=*:* with start and rows
Thanks. The only question is how to smoothly transition to this model. Our facet (string) fields contain timestamp prefixes that are reverse-ordered, starting from the freshest value. In theory we could try computing the filter queries for those, but before doing so we would need the matched ids from Solr, so it becomes at least a two-pass algorithm? The biggest concern we have with paging in general is that the system seems to pass far more data back and forth than is needed to compute the values.

On Mon, Apr 29, 2013 at 4:14 PM, Michael Della Bitta michael.della.bi...@appinions.com wrote:

I guess so; you'd have to use a filter query to page through the set of documents you were faceting against, and sum the counts at the end. It's not quite the same operation as paging through results, because facets are aggregate statistics, but if you're willing to go to the trouble, I bet it would also help performance.
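The summing step Michael describes is a straightforward merge of per-slice facet counts. A minimal editorial sketch (the facet values shown are hypothetical, echoing the timestamp-prefixed strings Dmitry mentions):

```python
# Sketch of summing facet counts across filter-query slices: run the
# same facet over each fq slice of the data, then merge the per-value
# counts. Counter.update adds counts for keys seen in multiple slices.
from collections import Counter

def merge_facets(per_slice_counts):
    """Merge a list of {facet_value: count} dicts, one per fq slice."""
    total = Counter()
    for counts in per_slice_counts:
        total.update(counts)
    return total

# Hypothetical facet responses from two slices:
slice_a = {"2013-04-29/foo": 10, "2013-04-28/bar": 4}
slice_b = {"2013-04-29/foo": 7, "2013-04-27/baz": 1}
merged = merge_facets([slice_a, slice_b])
```

This only gives exact totals when the slices partition the result set without overlap, which the filter-query ranges guarantee by construction.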
Re: Solr Performance Issues
Try cutting back Solr's memory; the OS knows how to manage disk caches better than Solr does. Another approach is to raise and lower the queryResultCache size and see whether the hit ratio changes.

On Wed, Mar 17, 2010 at 9:44 AM, Siddhant Goel siddhantg...@gmail.com wrote:

Hi, apparently the bottleneck seems to be the periods when the CPU is waiting on I/O. Of all the numbers I can see, the CPU I/O wait times are the highest. I've allotted 4GB to Solr out of the total 8GB available. There's only 47MB free on the machine, so I assume the rest of the memory is being used for OS disk caches. In addition, the hit ratio for the queryResultCache isn't going beyond 20%. So the problem, I think, is not at Solr's end. Are there any pointers on how to resolve such disk I/O issues? Does this mean I need more overall memory? Or would reducing the amount of memory allocated to Solr, so that the disk cache has more memory, help?

Thanks,

On Fri, Mar 12, 2010 at 11:21 PM, Erick Erickson erickerick...@gmail.com wrote:

Sounds like you're pretty well on your way then. This is pretty typical of multi-threaded situations. Threads 1 through n wait around on I/O, and increasing the number of threads increases throughput without changing (much) the individual response time. Threads n+1 through p don't change throughput much, but increase the response time for each request; on aggregate, though, the throughput doesn't change (much). Adding threads after p+1 *decreases* throughput while *increasing* individual response time, as your processors start spending way too much time context-switching and/or memory swapping. The trick is finding out what n and p are.

Best,
Erick
Re: Solr Performance Issues
I've allocated 4GB to Solr, so the remaining 4GB is free for the OS disk cache. I think that at any point in time there can be at most as many concurrent requests as there are threads, which happens to make sense btw (does it?). As I increase the number of threads, the load average shown by top goes up to as high as 80%, but if I keep the number of threads low (~10), the load average never goes beyond ~8. So that is probably the number of requests I can expect Solr to serve concurrently at this index size on this hardware.

Can anyone give a general opinion as to how much hardware should be sufficient for a Solr deployment with an index size of ~43GB, containing around 2.5 million documents? I'm expecting it to serve at least 20 requests per second. Any experiences?

Thanks

On Fri, Mar 12, 2010 at 12:47 AM, Tom Burton-West tburtonw...@gmail.com wrote:

How much of your memory are you allocating to the JVM, and how much are you leaving free? If you don't leave enough free memory for the OS, the OS won't have a large enough disk cache, and you will be hitting the disk for lots of queries. You might want to monitor your disk I/O using iostat and look at the iowait. If you are doing phrase queries and your *prx file is significantly larger than the available memory, then when a slow phrase query hits Solr, the contention for disk I/O with other queries could be slowing everything down.

You might also want to look at the 90th and 99th percentile query times in addition to the average. For our large indexes, we found at least an order of magnitude difference between the average and the 99th percentile. Again, if Solr gets hit with a few of those 99th-percentile slow queries and you're not hitting your caches, chances are you will see serious contention for disk I/O. Of course, if you don't see any waiting on I/O, then your bottleneck is probably somewhere else :)

See http://www.hathitrust.org/blogs/large-scale-search/slow-queries-and-common-words-part-1 for more background on our experience.

Tom Burton-West
University of Michigan Library
www.hathitrust.org

On Thu, Mar 11, 2010 at 9:39 AM, Siddhant Goel siddhantg...@gmail.com wrote:

Hi everyone, I have an index corresponding to ~2.5 million documents; the index size is 43GB. The machine running Solr has dual quad-core Xeon 5430 processors (2.66GHz, Harpertown, 2 x 12MB cache), 8GB RAM, and a 250GB HDD. I'm observing a strange trend in the queries I send to Solr: query times for queries sent earlier are much lower than for queries sent later. For instance, if I write a script to query Solr 5000 times (with 5000 distinct queries, most of them containing no more than 3-5 words) with 10 threads running in parallel, the average query time goes from ~50ms in the beginning to ~6000ms. Is this expected, or is there something wrong with my configuration? Currently I've configured the queryResultCache and the documentCache to hold 2048 entries (hit ratios for both are close to 50%). Apart from this, a general question: is such hardware enough for this scenario? I'm aiming at around 20 queries per second with the hardware mentioned above.

Thanks,
Regards,
- Siddhant

-- View this message in context: http://old.nabble.com/Solr-Performance-Issues-tp27864278p27868456.html
Sent from the Solr - User mailing list archive at Nabble.com.
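Tom's point about averages hiding slow queries is easy to act on: compute the 90th and 99th percentiles alongside the mean. A minimal nearest-rank percentile sketch over QTime samples (the sample values are made up for illustration):

```python
# Nearest-rank percentile over a list of query-time samples (ms).
# A mean can look healthy while the 99th percentile is an order of
# magnitude worse, exactly as Tom describes for large indexes.
def percentile(samples, p):
    """p-th percentile (nearest-rank) of a non-empty sample list."""
    ordered = sorted(samples)
    rank = max(1, -(-p * len(ordered) // 100))  # ceil(p/100 * n)
    return ordered[rank - 1]

qtimes = [50] * 98 + [900, 6000]        # mostly fast, two slow outliers
mean = sum(qtimes) / len(qtimes)        # 118 ms: looks fine
p99 = percentile(qtimes, 99)            # 900 ms: tells a different story
```

The gap between `mean` and `p99` here is the "order of magnitude difference" Tom reports seeing on large indexes.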
Re: Solr Performance Issues
You've probably already looked at this, but here goes anyway. The first question probably should have been: what are you measuring? I've been fooled before by looking at, say, average response time and extrapolating. You're getting 20 qps if your response time is 1 second and you have 20 threads running simultaneously; ditto if you're getting 2-second response times with 40 threads. So... what is response time?

It would clarify things a lot if you broke out which parts of the operation are taking the time. Going from memory, debugQuery=on will let you know how much time was spent in various operations in Solr. It's important to know whether it was the searching, assembling the response, or transmitting the data back to the client. If your timings are just how long it takes the response to get back to the client, you could even be hammered by network latency.

How many threads does it take to peg the CPU? And what response times are you getting when your number of threads is around 10?

Erick
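Erick's observation is Little's law: throughput equals concurrency divided by response time, so a qps figure means nothing without the thread count. A two-line check of his own examples:

```python
# Little's law for a closed-loop load test: qps = threads / response_time.
# Both of Erick's scenarios work out to the same 20 qps, which is why
# "response time" alone does not tell you what you are measuring.
def throughput_qps(threads, response_time_s):
    """Steady-state throughput of `threads` workers at a given latency."""
    return threads / response_time_s

twenty_at_1s = throughput_qps(20, 1.0)   # 20 threads, 1 s responses
forty_at_2s = throughput_qps(40, 2.0)    # 40 threads, 2 s responses
```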
Re: Solr Performance Issues
Hi, Thanks for your responses. It actually feels good to be able to locate where the bottlenecks are. I've created two sets of data - in the first one I'm measuring the time took purely on Solr's end, and in the other one I'm including network latency (just for reference). The data that I'm posting below contains the time took purely by Solr. I'm running 10 threads simultaneously and the average response time (for each query in each thread) remains close to 40 to 50 ms. But as soon as I increase the number of threads to something like 100, the response time goes up to ~600ms, and further up when the number of threads is close to 500. Yes the average time definitely depends on the number of concurrent requests. Going from memory, debugQuery=on will let you know how much time was spent in various operations in SOLR. It's important to know whether it was the searching, assembling the response, or transmitting the data back to the client. I just tried this. The information that it gives me for a query that took 7165ms is - http://pastebin.ca/1835644 So out of the total time 7165ms, QueryComponent took most of the time. Plus I can see the load average going up when the number of threads is really high. So it actually makes sense. (I didn't add any other component while searching; it was a plain /select?q=query call). Like I mentioned earlier in this mail, I'm maintaining separate sets for data with/without network latency, and I don't think its the bottleneck. How many threads does it take to peg the CPU? And what response times are you getting when your number of threads is around 10? If the number of threads is greater than 100, that really takes its toll on the CPU. So probably thats the number. When the number of threads is around 10, the response times average to something like 60ms (and 95% of the queries fall within 100ms of that value). 
Thanks, Erick On Fri, Mar 12, 2010 at 3:39 AM, Siddhant Goel siddhantg...@gmail.com wrote: I've allocated 4GB to Solr, so the rest of the 4GB is free for the OS disk caching. I think that at any point of time, there can be a maximum of number of threads concurrent requests, which happens to make sense btw (does it?). As I increase the number of threads, the load average shown by top goes up to as high as 80%. But if I keep the number of threads low (~10), the load average never goes beyond ~8). So probably thats the number of requests I can expect Solr to serve concurrently on this index size with this hardware. Can anyone give a general opinion as to how much hardware should be sufficient for a Solr deployment with an index size of ~43GB, containing around 2.5 million documents? I'm expecting it to serve at least 20 requests per second. Any experiences? Thanks On Fri, Mar 12, 2010 at 12:47 AM, Tom Burton-West tburtonw...@gmail.com wrote: How much of your memory are you allocating to the JVM and how much are you leaving free? If you don't leave enough free memory for the OS, the OS won't have a large enough disk cache, and you will be hitting the disk for lots of queries. You might want to monitor your Disk I/O using iostat and look at the iowait. If you are doing phrase queries and your *prx file is significantly larger than the available memory then when a slow phrase query hits Solr, the contention for disk I/O with other queries could be slowing everything down. You might also want to look at the 90th and 99th percentile query times in addition to the average. For our large indexes, we found at least an order of magnitude difference between the average and 99th percentile queries. Again, if Solr gets hit with a few of those 99th percentile slow queries and your not hitting your caches, chances are you will see serious contention for disk I/O.. 
Of course if you don't see any waiting on I/O, then your bottleneck is probably somewhere else. :) See http://www.hathitrust.org/blogs/large-scale-search/slow-queries-and-common-words-part-1 for more background on our experience. Tom Burton-West University of Michigan Library www.hathitrust.org On Thu, Mar 11, 2010 at 9:39 AM, Siddhant Goel siddhantg...@gmail.com wrote: Hi everyone, I have an index corresponding to ~2.5 million documents. The index size is 43GB. The configuration of the machine which is running Solr is - Dual Processor Quad Core Xeon 5430 - 2.66GHz (Harpertown) - 2 x 12MB cache, 8GB RAM, and 250 GB HDD. I'm observing a strange trend in the queries that I send to Solr. The query times for the queries that I send earlier are much lower than for the queries I send afterwards. For instance, if I write a script to query Solr 5000 times (with 5000 distinct queries, most of them containing not more than 3-5 words)
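The debugQuery=on advice earlier in the thread produces a per-component timing breakdown (under debug/timing in the response), which is how you tell a QueryComponent bottleneck from a faceting or response-writing one. A minimal sketch of inspecting such a breakdown in Ruby; the hash structure mirrors Solr's debug/timing output, and the numbers are invented for illustration:

```ruby
# Sketch: given the "timing" section of a Solr debugQuery=on response
# (already parsed into a Ruby hash), list components by time spent.
# Component names and timings below are illustrative, not real data.

def slowest_components(timing)
  process = timing["process"] or return []
  process.reject { |k, _| k == "time" }          # drop the aggregate entry
         .map { |name, info| [name, info["time"]] }
         .sort_by { |_, t| -t }                  # slowest first
end

timing = {
  "time" => 7165.0,
  "process" => {
    "time" => 7165.0,
    "org.apache.solr.handler.component.QueryComponent" => { "time" => 7020.0 },
    "org.apache.solr.handler.component.FacetComponent" => { "time" => 10.0 },
    "org.apache.solr.handler.component.DebugComponent" => { "time" => 135.0 },
  },
}

slowest_components(timing).each do |name, t|
  puts "#{name.split('.').last}: #{t} ms"
end
```

If the top entry dominates the total, the time is going into the search itself rather than into assembling or transmitting the response.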
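Tom's point about percentiles is easy to check from the same logs the load test already produces: a handful of very slow queries can hide behind a healthy-looking average. A small sketch, with invented sample latencies (nearest-rank style percentile):

```ruby
# Sketch: average vs. 90th/99th percentile of response times.
# The latencies below are made up; one slow outlier skews the average.

def percentile(times, pct)
  sorted = times.sort
  rank = ((pct / 100.0) * (sorted.length - 1)).round
  sorted[rank]
end

latencies = [40, 45, 50, 48, 52, 60, 55, 47, 51, 6000]

avg = latencies.sum / latencies.length.to_f
puts "avg: #{avg.round} ms"                   # dragged up by the outlier
puts "p90: #{percentile(latencies, 90)} ms"   # still looks healthy
puts "p99: #{percentile(latencies, 99)} ms"   # exposes the slow tail
```

Here the average is ~645 ms while p90 is 60 ms; only the 99th percentile reveals the 6-second query, which matches the order-of-magnitude gap Tom describes.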
Re: Solr Performance Issues
How many outstanding queries do you have at a time? Is it possible that when you start, you have only a few queries executing concurrently, but as your test runs you have hundreds? This really is a question of how your load test is structured. You might get a better sense of how it works if your tester had a limited number of threads running, so the max concurrent requests SOLR was serving at once were capped (30, 50, whatever). But no, I wouldn't expect SOLR to bog down the way you're describing just because it was running for a while. HTH Erick On Thu, Mar 11, 2010 at 9:39 AM, Siddhant Goel siddhantg...@gmail.com wrote: Hi everyone, I have an index corresponding to ~2.5 million documents. The index size is 43GB. The configuration of the machine which is running Solr is - Dual Processor Quad Core Xeon 5430 - 2.66GHz (Harpertown) - 2 x 12MB cache, 8GB RAM, and 250 GB HDD. I'm observing a strange trend in the queries that I send to Solr. The query times for the queries that I send earlier are much lower than for the queries I send afterwards. For instance, if I write a script to query Solr 5000 times (with 5000 distinct queries, most of them containing not more than 3-5 words) with 10 threads running in parallel, the average time for queries goes from ~50ms in the beginning to ~6000ms. Is this expected, or is there something wrong with my configuration? Currently I've configured the queryResultCache and the documentCache to contain 2048 entries (hit ratios for both are close to 50%). Apart from this, a general question: is such hardware enough for this scenario? I'm aiming at achieving around 20 queries per second with the hardware mentioned above. Thanks, Regards, -- - Siddhant
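The capped tester Erick suggests can be sketched in Ruby (the language of the load script discussed in this thread): a fixed pool of worker threads drains a shared queue of queries, so in-flight requests never exceed the pool size. The search call below is a stand-in; a real test would issue an HTTP request to /select.

```ruby
# Sketch of a capped-concurrency load tester: at most pool_size
# requests are in flight at any moment, regardless of query count.

def fake_search(query)
  sleep 0.001  # stand-in for the real Solr HTTP call (~1 ms here)
end

def run_load_test(queries, pool_size = 10)
  queue = Queue.new
  queries.each { |q| queue << q }
  results = Queue.new

  workers = pool_size.times.map do
    Thread.new do
      until queue.empty?
        q = queue.pop(true) rescue break   # non-blocking pop; quit when drained
        start = Time.now
        fake_search(q)
        results << (Time.now - start) * 1000.0  # latency in ms
      end
    end
  end
  workers.each(&:join)

  times = []
  times << results.pop until results.empty?
  times
end

times = run_load_test((1..100).map { |i| "query #{i}" })
puts "ran #{times.size} queries, avg #{(times.sum / times.size).round(2)} ms"
```

With the cap in place, per-query latency as a function of pool size directly answers the "how many concurrent requests can this box sustain" question, instead of letting outstanding requests pile up unboundedly.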
Re: Solr Performance Issues
Hi Erick, The way the load test works is that it picks up 5000 queries and splits them according to the number of threads (so if we have 10 threads, it schedules 10 threads, each one sending 500 queries). So it might be possible that the number of queries at a later point in time is greater than the number of queries earlier; I'm not very sure about that though. It's a simple Ruby script that starts up threads, calls the search function in each thread, and then waits for each of them to exit. How many queries per second can we expect Solr to serve, given this kind of hardware? If what you suggest is true, then is it possible that while Solr is serving a query, another query hits it, which increases the response time even further? I'm not sure about it, but yes, I can observe the query times going up as I increase the number of threads. Thanks, Regards, -- - Siddhant
Re: Solr Performance Issues
I don't mean to turn this into a sales pitch, but there is a tool for Java app performance management that you may find helpful. It's called New Relic (www.newrelic.com) and the tool can be installed in 2 minutes. It can give you very deep visibility inside Solr and other Java apps. (Full disclosure: I work at New Relic.) Mike -- View this message in context: http://old.nabble.com/Solr-Performance-Issues-tp27864278p27872139.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Solr performance issues
On Jun 19, 2008, at 6:28 PM, Yonik Seeley wrote: 2. I use acts_as_solr, and by default it only makes POST requests, even for /select. With that setup the response time for most queries, simple or complex ones, was ranging from 150ms to 600ms, with an average of 250ms. I changed the select request to use GET requests instead, and now the response time is down to 10ms to 60ms. Has anyone seen that before? Why is it doing that? Are the GET requests being cached by the Ruby stuff? No, I'm sure that the results aren't being cached by Ruby's library, solr-ruby, or acts_as_solr. But even with no caching, I've seen differences with GET/POST on Linux with the Python client when persistent HTTP connections were in use. I tracked it down to the POST being written in two parts, triggering Nagle's algorithm in the networking stack. There was another post I found that mentioned this a couple of years ago: http://markmail.org/message/45qflvwnakhripqp I would welcome patches with tests that allow solr-ruby to send most requests with GET, and the ones that actually send a body beyond just parameters (delete, update, commit) as POST. Erik
Re: Solr performance issues
On Fri, Jun 20, 2008 at 8:32 AM, Erik Hatcher [EMAIL PROTECTED] wrote: No, I'm sure that the results aren't being cached by Ruby's library, solr-ruby, or acts_as_solr. I confirm that the results are not cached by Ruby's library. I would welcome patches with tests that allow solr-ruby to send most requests with GET, and the ones that actually send a body beyond just parameters (delete, update, commit) as POST. Erik I made a few modifications, but it still needs more testing... Sebastien
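The GET-for-/select change discussed in this thread amounts to putting the parameters in the query string instead of a request body. A minimal sketch of building such a request in Ruby; the host, port, and core path are made-up placeholders, not the actual solr-ruby patch:

```ruby
require "uri"

# Sketch: read-only searches go out as a single GET (parameters in
# the query string); only requests with a real body (update, commit,
# delete) would remain POST. Host/port/path here are assumptions.

SELECT_PATH = "/solr/select"

def select_uri(params)
  URI::HTTP.build(
    host:  "localhost",
    port:  8983,
    path:  SELECT_PATH,
    query: URI.encode_www_form(params)   # e.g. q=...&rows=...&wt=...
  )
end

uri = select_uri(q: "title:solr", rows: 10, wt: "ruby")
puts uri
# A real client would then issue: Net::HTTP.get_response(uri)
```

Because the whole request is a single write, it also sidesteps the two-part POST write that triggered Nagle's algorithm in the case Erik describes.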