We decided to go back down to 20 shards, as we kept seeing the query-time 
spikes. If it were a memory issue, I would expect the same performance 
problems with 20 shards, so I think this may be a problem in Solr rather than 
in our configuration or amount of RAM.


In any case, we have thought about adding some more servers to our SolrCloud 
setup. Is there an easy way to add servers to a quorum without having to 
reshard and re-index? I have looked at the Collections API and have not found 
one yet...

Thanks

Andy


-----Original Message-----
From: Jack Krupansky [mailto:jack.krupan...@gmail.com] 
Sent: 08 January 2015 22:17
To: solr-user@lucene.apache.org
Subject: Re: Determining the Number of Solr Shards

My final advice would be my standard proof of concept implementation advice
- test a configuration with 10% (or 5%) of the target data size and 10% (or
5%) of the estimated resource requirements (maybe 25% of the estimated RAM) and 
see how well it performs.

Take the actual index size and multiply by 10 (or 20 for a 5% load) to get a 
closer estimate of total storage required.

If a 10% load fails to perform well with 25% of the total estimated RAM, then 
you can be sure that you'll have problems with 10x the data and only 4x the 
RAM. Increase the RAM for that 10% load until you get acceptable performance 
for both indexing and a full range of queries, and then use 10x that RAM for 
the 100% load. That's the OS system memory for file caching, not the total 
system RAM.
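
As a rough sketch of the scaling arithmetic above (the function name, 
parameter names, and sample figures are hypothetical, chosen only for 
illustration), the extrapolation from a 10% or 5% test to the full load 
could be written as:

```python
def scale_estimate(sample_index_gb, sample_cache_ram_gb, sample_fraction=0.10):
    """Extrapolate proof-of-concept measurements to the full data size.

    sample_index_gb     -- actual on-disk index size of the test load
    sample_cache_ram_gb -- OS file-cache memory that gave acceptable
                           performance for the test load
    sample_fraction     -- fraction of the target data that was loaded
                           (0.10 for a 10% test, 0.05 for a 5% test)
    """
    factor = round(1.0 / sample_fraction, 6)  # 10 for 10%, 20 for 5%
    return {
        "scale_factor": factor,
        "index_gb": sample_index_gb * factor,
        "cache_ram_gb": sample_cache_ram_gb * factor,
    }


if __name__ == "__main__":
    # Hypothetical numbers: a 10% test produced a 50 GB index and needed
    # 16 GB of OS file cache to perform acceptably.
    print(scale_estimate(50, 16, 0.10))
```

This is only linear extrapolation, so the result is a starting estimate to 
be checked against real measurements, not a guarantee.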

-- Jack Krupansky

On Thu, Jan 8, 2015 at 4:55 PM, Nishanth S <nishanth.2...@gmail.com> wrote:

> Thanks, guys, for your input. I would be looking at around 100 TB of 
> total index size with 5100 million documents for a period of 30 
> days before we purge the indexes. I had estimated it slightly on the 
> higher side of things, but that's where I feel we would be.
>
> Thanks,
> Nishanth
>
> On Wed, Jan 7, 2015 at 7:50 PM, Shawn Heisey <apa...@elyograg.org> wrote:
>
> > On 1/7/2015 7:14 PM, Nishanth S wrote:
> > > Thanks Shawn and Walter. Yes, those are 12,000 writes/second. Reads for
> > > the moment would be in the 1000 reads/second range. Guess finding out
> > > the right number of shards would be my starting point.
> >
> > I don't think indexing 12000 docs per second would be too much for 
> > Solr to handle, as long as you architect the indexing application properly.
> > You would likely need to have several indexing threads or processes 
> > that index in parallel.  Solr is fully thread-safe and can handle 
> > several indexing requests at the same time.  If the indexing 
> > application is single-threaded, indexing speed will not reach its full 
> > potential.
> >
> > Be aware that indexing at the same time as querying will reduce the 
> > number of queries per second that you can handle.  In an environment 
> > where both reads and writes are heavy like you have described, more 
> > shards and/or more replicas might be required.
> >
> > For the query side ... even 1000 queries per second is a fairly 
> > heavy query rate.  You're likely to need at least a few replicas, 
> > possibly several, to handle that.  The type and complexity of the 
> > queries you do will make a big difference as well.  To handle that 
> > query level, I would still recommend only running one shard replica 
> > on each server.  If you have three shards and three replicas, that means 9 
> > Solr servers.
> >
> > How many documents will you have in total?  You said they are about 
> > 6KB each ... but depending on the fieldType definitions (and the 
> > analysis chain for TextField types), 6KB might be very large or fairly 
> > small.
> >
> > Do you have any idea how large the Solr index will be with all your 
> > documents?  Estimating that will require indexing a significant 
> > percentage of your documents with the actual schema and config that 
> > you will use in production.
> >
> > If I know how many documents you have, how large the full index will 
> > be, and can see an example of the more complex queries you will do, 
> > I can make *preliminary* guesses about the number of shards you 
> > might need.  I do have to warn you that it will only be a guess.  
> > You'll have to experiment to see what works best.
> >
> > Thanks,
> > Shawn
> >
> >
>
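
The parallel indexing that Shawn describes in the quoted message (several 
threads sending index requests at the same time) could be sketched roughly 
as follows. This is a minimal illustration, not a tuned indexer: the 
function names, batch size, worker count, URL, and collection name are all 
assumptions, and `post_to_solr` assumes a Solr instance accepting JSON 
documents on its `/update` handler.

```python
import json
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def chunked(docs, size):
    """Yield consecutive slices of docs, each at most `size` long."""
    for i in range(0, len(docs), size):
        yield docs[i:i + size]


def post_to_solr(batch,
                 url="http://localhost:8983/solr/mycollection/update"):
    """Send one batch of documents as a JSON array (hypothetical URL)."""
    req = urllib.request.Request(
        url,
        data=json.dumps(batch).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        resp.read()
    return len(batch)


def index_in_parallel(docs, send_batch, batch_size=1000, workers=8):
    """Send batches from several threads; return the number of docs sent.

    send_batch is any callable that takes a list of documents and returns
    how many it handled, e.g. post_to_solr above.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(send_batch, chunked(docs, batch_size)))
```

Passing the sender in as a callable keeps the threading logic separate from 
the HTTP call, so it can be exercised without a running Solr, for example 
`index_in_parallel(docs, len)`. The right number of workers and batch size 
would have to be found by experiment.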
