Re: solr benchmarks
I would shard the index so that each shard is no larger than the memory of the machine it sits on, that way your entire index will be in memory all the time.

When I was at Feedster (I wrote the search engine), the rule of thumb I had was to have 14GB of index on a 16GB machine.

François

On Dec 31, 2010, at 9:06 PM, Tri Nguyen wrote:

> Hi,
>
> I remember going through some page that had graphs of response times based on
> index size for solr.
>
> Anyone know of such pages?
>
> Internally, we have some requirements for response times and I'm trying to
> figure out when to shard the index.
>
> Thanks,
>
> Tri
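The rule of thumb above (14GB of index on a 16GB machine) can be turned into a rough shard-count estimate. This is only a sketch: the function name and the default values are assumptions taken from the numbers in the message, and it ignores everything else that competes for memory on the box.

```python
import math


def shards_needed(index_gb, machine_ram_gb=16.0, index_fraction=14.0 / 16.0):
    """Estimate how many shards keep each shard within one machine's RAM budget.

    index_fraction is the share of RAM given to the index; 14/16 mirrors the
    "14GB of index on a 16GB machine" rule of thumb and is an assumption.
    """
    if index_gb <= 0:
        return 0
    budget_gb = machine_ram_gb * index_fraction  # 14 GB on a 16 GB machine
    return math.ceil(index_gb / budget_gb)


if __name__ == "__main__":
    for size in (10, 14, 100):
        print(size, "GB index ->", shards_needed(size), "shard(s)")
```

With the defaults, a 100GB index comes out at 8 shards, each on its own 16GB machine.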
Re: solr benchmarks
On Sat, 2011-01-01 at 03:06 +0100, Tri Nguyen wrote:
> I remember going through some page that had graphs of response times based on
> index size for solr.
>
> Anyone know of such pages?

Sorry, no. Some small scale tests with our corpus showed that response times suffered less than proportionally to index size, with regard to the raw searches: Doubling the index size did not double the response time. On the other hand, faceting time was proportional to the index size. As always, your mileage will vary.

> Internally, we have some requirements for response times and I'm trying to
> figure out when to shard the index.

If you discover that your searches are primarily IO-bound, which is often the case, and if you're still using spinning disks, I highly recommend that you upgrade to SSDs. They are very cheap compared to RAM, you don't need to change your code or workflow and they work beautifully with Lucene/SOLR: They gave us a 2-4 times speedup compared to 2 * 15,000 RPM hard disks in RAID 1. Compared to holding the index fully in RAM (with a 14GB index) they gave us 80% of the performance on a dual core machine; more CPU cores might benefit more from the RAM solution.
Re: solr benchmarks
Hi,

You can find benchmark results but these are not directly based on "index size vs. response time"

http://wiki.apache.org/solr/SolrPerformanceData

On Sat, Jan 1, 2011 at 4:06 AM, Tri Nguyen wrote:

> Hi,
>
> I remember going through some page that had graphs of response times based
> on index size for solr.
>
> Anyone know of such pages?
>
> Internally, we have some requirements for response times and I'm trying to
> figure out when to shard the index.
>
> Thanks,
>
> Tri
Re: solr benchmarks
Tri:

What is the volume of content (# of documents) and index size you are expecting? What about the document complexity in terms of # of fields, what you are storing in the index, complexity of the queries, etc.?

We have used SOLR with 10m documents with 1-3 second response times on the front end. This is with minimal tuning, 4-5 facet fields, large blobs of content in the index, jRuby on Rails, complex queries, and low load conditions (hence caches are probably not warmed much).

We have an external search application almost fully powered by SOLR (except for web crawl) and the response time is typically less than 1 second with about 100k documents. Solr time is probably 100-200 ms of this.

My sense is that SOLR is as fast as it gets and scales very, very well. On the user group, I have seen references to people using SOLR for 100m documents or more. It would be useful to get your use case(s).

On Mon, Jan 3, 2011 at 10:44 AM, Jak Akdemir wrote:

> Hi,
> You can find benchmark results but these are not directly based on "index
> size vs. response time"
> http://wiki.apache.org/solr/SolrPerformanceData
>
> On Sat, Jan 1, 2011 at 4:06 AM, Tri Nguyen wrote:
>
> > Hi,
> >
> > I remember going through some page that had graphs of response times based
> > on index size for solr.
> >
> > Anyone know of such pages?
> >
> > Internally, we have some requirements for response times and I'm trying to
> > figure out when to shard the index.
> >
> > Thanks,
> >
> > Tri
Re: Solr Benchmarks
The performance data on the wiki (http://wiki.apache.org/solr/SolrPerformanceData) are a little too short to give a good idea.

On 06-11-06 at 09:28, Nicolas St-Laurent wrote:

> Hello,
>
> Is there any Solr benchmarks available somewhere ? I would like to
> know how well it performs. I understand that it depends on the
> hardware config and on the application server used. Just to got an
> idea...
>
> Thank you,
>
> Nicolas
Re: Solr Benchmarks
I've been using Solr for keyword search on Discogs.com for a few months with great results.

As of today Solr is running under Tomcat on a single dedicated box. It's a 2.66GHz P4, with 1 gig ram. The index has about 1.2 million documents and is 1.2 gigs in size. This machine handles 250,000 queries per day with no problem. CPU load stays around 0.15 most of the time.

I hope that is helpful to you.

Kevin

On 11/6/06, Nicolas St-Laurent <[EMAIL PROTECTED]> wrote:

> Hello,
>
> Is there any Solr benchmarks available somewhere ? I would like to
> know how well it performs. I understand that it depends on the
> hardware config and on the application server used. Just to got an
> idea...
>
> Thank you,
>
> Nicolas
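For comparison with figures quoted in queries per second, the 250,000 queries per day above can be converted to an average rate. This is a small worked example, not a measurement: it assumes load is spread evenly over the day, which real traffic rarely is, so peak rates will be higher.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def average_qps(queries_per_day):
    """Average queries per second, assuming an even spread over 24 hours."""
    return queries_per_day / SECONDS_PER_DAY


if __name__ == "__main__":
    print(round(average_qps(250_000), 2), "queries per second on average")
```

That works out to a little under 3 queries per second on average.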
Re: Solr Benchmarks
On 11/6/06 6:28 AM, "Nicolas St-Laurent" <[EMAIL PROTECTED]> wrote:
>
> Is there any Solr benchmarks available somewhere ? I would like to
> know how well it performs. I understand that it depends on the
> hardware config and on the application server used. Just to got an
> idea...

With search engines, you really need to test with your documents, your queries, and your settings. Performance might vary by a factor of ten or more.

I've done some testing using JMeter. I followed the instructions in the JMeter FAQ for "How do I use external data files in my test scripts?"

http://wiki.apache.org/jakarta-jmeter/JMeterFAQ

I'm attaching the script I built with this. A few notes:

* The queries should be one per line in a file named "query.txt" in the JMeter bin directory.

* This test will use HTTP 1.1 persistent connections, so it is faster than a bunch of different clients. It should be fairly accurate if search is front-ended by another app.

* It helps to have a lot of queries, maybe 50K or more. I've seen other search engines run entirely from cache with a 1000 query test set.

* JMeter can use a lot of CPU, so it might hit the limit before Solr does. Watch the CPU usage on both systems (JMeter and Solr) to see which one is the bottleneck.

* The display graphs can slow down JMeter on long tests. I was seeing spots of low CPU usage on the Solr server and those went away when I cleared the graph.

I was very pleased with the Solr performance in my testing. With our small corpus (65K docs) I was seeing over 240 qps on my dev box (dual 3 GHz Xeon). I expect that it didn't touch the disk at all, since the index is only 50 Meg.

wunder
--
Walter Underwood
Search Guru, Netflix
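For readers without JMeter, the same idea (queries one per line in "query.txt", replayed over a persistent HTTP 1.1 connection) can be sketched in Python. This is not the attached JMeter script, which did not survive the list. The host, port, and "/solr/select" handler path are assumptions to adjust for your install, and a single-threaded loop like this measures sequential throughput only.

```python
import http.client
import time
from urllib.parse import urlencode


def load_queries(path="query.txt"):
    """Read one query per line, skipping blank lines."""
    with open(path, encoding="utf-8") as f:
        return [line.strip() for line in f if line.strip()]


def select_path(query, handler="/solr/select"):
    """Build the request path for one query (handler path is an assumption)."""
    return handler + "?" + urlencode({"q": query})


def make_fetch(host="localhost", port=8983):
    """Return a fetch(path) function that reuses one HTTP 1.1 connection."""
    conn = http.client.HTTPConnection(host, port)  # connects on first request

    def fetch(path):
        conn.request("GET", path)
        conn.getresponse().read()  # drain the body so the connection is reusable

    return fetch


def measure_qps(paths, fetch):
    """Call fetch(path) for every path in order and return queries per second."""
    start = time.perf_counter()
    for path in paths:
        fetch(path)
    elapsed = time.perf_counter() - start
    return len(paths) / elapsed if elapsed > 0 else float("inf")
```

A run against a local server would look like `measure_qps([select_path(q) for q in load_queries()], make_fetch())`. As with JMeter, watch CPU on the client as well as on the Solr box, since the client can become the bottleneck first.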
Re: Solr Benchmarks
On 06-11-06 at 12:50, Walter Underwood wrote:

> http://wiki.apache.org/jakarta-jmeter/JMeterFAQ
>
> I'm attaching the script I built with this. A few notes:

Well, I didn't get the script...

> I was very pleased with the Solr performance in my testing. With
> our small corpus (65K docs) I was seeing over 240 qps on my dev box
> (dual 3 GHz Xeon). I expect that it didn't touch the disk at all,
> since the index is only 50 Meg.
>
> wunder

Thank you wunder. It gives me a good idea of what to expect from Solr. I understand that performance changes a lot depending on the context of execution. It's a good idea to use JMeter to get a performance report. I will try this.

Nicolas
Re: Solr Benchmarks
On 06-11-06 at 12:21, Kevin Lewandowski wrote:

> As of today Solr is running under Tomcat on a single dedicated box.
> It's a 2.66Ghz P4, with 1 gig ram. The index has about 1.2 million
> documents and is 1.2 gigs in size. This machine handles 250,000
> queries per day with no problem. CPU load stays around 0.15 most of
> the time.
>
> I hope that is helpful to you.
>
> Kevin

Thank you Kevin. It gives me a good idea. I use a simple socket server right now in front of Lucene. I will give Solr a try.
Re: Solr Benchmarks
Hi Walter,

Thunderbird shows that there is an attachment to this message in the message list, but when I view the message, no attachment is available. Could you try sending this attachment again?

Thanks --Joachim

Walter Underwood wrote:

> I've done some testing using JMeter. I followed the instructions
> in the JMeter FAQ for "How do I use external data files in my
> test scripts?"
>
> http://wiki.apache.org/jakarta-jmeter/JMeterFAQ
>
> I'm attaching the script I built with this. A few notes:
Re: Solr Benchmarks
Here it is again, but the mailing list might strip attachments. It is very easy to build your own using the instructions in the FAQ.

wunder

On 11/9/06 11:02 AM, "Joachim Martin" <[EMAIL PROTECTED]> wrote:

> Hi Walter,
>
> Thunderbird shows that there is an attachment to this message in the
> message list, but when I view
> the message, no attachment is available. Could you try sending this
> attachment again?
>
> Thanks --Joachim
>
> Walter Underwood wrote:
>
>> I've done some testing using JMeter. I followed the instructions
>> in the JMeter FAQ for "How do I use external data files in my
>> test scripts?"
>>
>> http://wiki.apache.org/jakarta-jmeter/JMeterFAQ
>>
>> I'm attaching the script I built with this. A few notes:
Re: Solr Benchmarks
: Here it is again, but the mailing list might strip attachments.
: It is very easy to build your own using the instructions in the FAQ.

In general, the Apache mailing lists strip attachments. In my experience plain text attachments seem to be okay, as long as they aren't too big and have the MIME type set properly by your mail sender.

In practice: it's usually better to just cut/paste in the body of your message, or send a URL to an external resource.

:
: wunder
:
: On 11/9/06 11:02 AM, "Joachim Martin" <[EMAIL PROTECTED]> wrote:
:
: > Hi Walter,
: >
: > Thunderbird shows that there is an attachment to this message in the
: > message list, but when I view
: > the message, no attachment is available. Could you try sending this
: > attachment again?
: >
: > Thanks --Joachim
: >
: > Walter Underwood wrote:
: >
: >> I've done some testing using JMeter. I followed the instructions
: >> in the JMeter FAQ for "How do I use external data files in my
: >> test scripts?"
: >>
: >> http://wiki.apache.org/jakarta-jmeter/JMeterFAQ
: >>
: >> I'm attaching the script I built with this. A few notes:

-Hoss