RE: Determining the Number of Solr Shards

2015-01-12 Thread Andrew Butkus
[mailto:jack.krupan...@gmail.com]
Sent: 08 January 2015 22:17
To: solr-user@lucene.apache.org
Subject: Re: Determining the Number of Solr Shards
My final advice would be my standard proof-of-concept implementation advice: test a configuration with 10% (or 5%) of the target data size and 10% (or 5%) of the

Re: Determining the Number of Solr Shards

2015-01-09 Thread Toke Eskildsen
On Thu, 2015-01-08 at 22:55 +0100, Nishanth S wrote:
> Thanks, guys, for your input. I would be looking at around 100 TB of total index size with 5100 million documents [...]
That is a large corpus when coupled with your high indexing & QPS requirements. Are the queries complex too? Will you be

Re: Determining the Number of Solr Shards

2015-01-08 Thread Jack Krupansky
My final advice would be my standard proof-of-concept implementation advice: test a configuration with 10% (or 5%) of the target data size and 10% (or 5%) of the estimated resource requirements (maybe 25% of the estimated RAM) and see how well it performs. Take the actual index size and multiply

Re: Determining the Number of Solr Shards

2015-01-08 Thread Nishanth S
Thanks, guys, for your input. I would be looking at around 100 TB of total index size with 5100 million documents for a period of 30 days before we purge the indexes. I had estimated it slightly on the higher side of things, but that's where I feel we would be.
Thanks, Nishanth
On Wed, Jan 7,

Re: Determining the Number of Solr Shards

2015-01-07 Thread Shawn Heisey
On 1/7/2015 7:14 PM, Nishanth S wrote:
> Thanks, Shawn and Walter. Yes, those are 12,000 writes/second. Reads for the moment would be around 1,000 reads/second. Guess finding out the right number of shards would be my starting point.
I don't think indexing 12000 docs per second would be too much

Re: Determining the Number of Solr Shards

2015-01-07 Thread Jack Krupansky
Anybody on the list have a feel for how many simultaneous queries Solr can handle in parallel? Will it be linear WRT the number of CPU cores? Or are there other bottlenecks or locks in Lucene or Solr such that even with more CPU cores the Solr server will be saturated with fewer queries than the nu

Re: Determining the Number of Solr Shards

2015-01-07 Thread Erick Erickson
1,000 queries/second is not trivial either. My starting point for QPS is about 50. But that's entirely a "straw man" and (as the link Shawn provided indicates) only testing will determine if that's realistic. So going for 1,000 queries/second, you're talking 20 replicas for each shard (1,000 / 50). And
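The replica estimate in the message above is plain division, rounded up. A minimal sketch of that arithmetic, assuming the 50 QPS figure is a per-replica starting guess as described; the function name and parameters are hypothetical and are not part of Solr:

```python
import math


def replicas_per_shard(target_qps: float, qps_per_replica: float) -> int:
    """Estimate how many replicas each shard needs to serve target_qps.

    qps_per_replica is a "straw man" guess; only load testing can
    confirm whether it is realistic for a given index and query mix.
    """
    if qps_per_replica <= 0:
        raise ValueError("qps_per_replica must be positive")
    # Round up: a fractional replica still requires a whole extra copy.
    return math.ceil(target_qps / qps_per_replica)


# Figures from the thread: 1,000 queries/second at roughly 50 QPS each.
print(replicas_per_shard(1000, 50))  # 20
```

Rounding up rather than truncating keeps the estimate conservative whenever the target is not an exact multiple of the per-replica figure.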

Re: Determining the Number of Solr Shards

2015-01-07 Thread Nishanth S
Thanks, Shawn and Walter. Yes, those are 12,000 writes/second. Reads for the moment would be around 1,000 reads/second. Guess finding out the right number of shards would be my starting point.
Thanks, Nishanth
On Wed, Jan 7, 2015 at 6:28 PM, Walter Underwood wrote:
> This is described as “write

Re: Determining the Number of Solr Shards

2015-01-07 Thread Walter Underwood
This is described as “write heavy”, so I think that is 12,000 writes/second, not queries.
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/
On Jan 7, 2015, at 5:16 PM, Shawn Heisey wrote:
> On 1/7/2015 3:29 PM, Nishanth S wrote:
>> I am working on coming up with a solr a

Re: Determining the Number of Solr Shards

2015-01-07 Thread Shawn Heisey
On 1/7/2015 3:29 PM, Nishanth S wrote:
> I am working on coming up with a Solr architecture layout for my use case. We are a very write-heavy application with no downtime tolerance and have low SLAs on reads when compared with writes. I am looking at around 12K tps with average index size