If you’re using Solr 8.2 or newer, there’s a built-in index analysis tool that shows which kinds of data in your index occupy the most disk space, so that you can tweak your schema accordingly: https://lucene.apache.org/solr/guide/8_2/collection-management.html#colstatus
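For what it's worth, here is a minimal sketch of calling that tool from Python. The base URL and the collection name `mycollection` are placeholders, and the `rawSize=true` flag is how I read the 8.2 reference guide for enabling the raw index analysis, so check the linked page before relying on it. The `fetch_colstatus` helper is just an illustration and needs a running Solr to be useful.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def colstatus_url(base_url, collection, raw_size=True):
    """Build a Collections API COLSTATUS request URL.

    base_url and collection are placeholders for your own setup.
    rawSize=true is assumed (per the 8.2 guide) to turn on the
    raw index data analysis.
    """
    params = {
        "action": "COLSTATUS",
        "collection": collection,
        "wt": "json",
    }
    if raw_size:
        params["rawSize"] = "true"
    return f"{base_url.rstrip('/')}/admin/collections?{urlencode(params)}"


def fetch_colstatus(base_url, collection):
    """Hypothetical helper: fetch and decode the COLSTATUS response."""
    with urlopen(colstatus_url(base_url, collection)) as resp:
        return json.load(resp)


# Only builds and prints the URL; no request is sent here.
print(colstatus_url("http://localhost:8983/solr", "mycollection"))
```

The raw analysis reads index data, so it is probably wiser to try it on a test collection first rather than on a busy production node.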
Which is another way of saying that you have to try and see ;)

> On 3 Feb 2020, at 18:02, David Hastings <hastings.recurs...@gmail.com> wrote:
>
> Yup, I find the right calculation to be as much ram as the server can take,
> and as much SSD space as it will hold, when you run out, buy another server
> and repeat. machines/ram/SSD's are cheap. just get as much as you can.
>
> On Mon, Feb 3, 2020 at 11:59 AM Walter Underwood <wun...@wunderwood.org>
> wrote:
>
>> What he said.
>>
>> But if you must have a number, assume that the index will be as big as
>> your (text) data. It might be 2X bigger or 2X smaller. Or 3X or 4X, but
>> that is a starting point. Once you start updating, the index might get as
>> much as 2X bigger before merges.
>>
>> Do NOT try to get by with the smallest possible RAM or disk.
>>
>> wunder
>> Walter Underwood
>> wun...@wunderwood.org
>> http://observer.wunderwood.org/ (my blog)
>>
>>> On Feb 3, 2020, at 5:28 AM, Erick Erickson <erickerick...@gmail.com> wrote:
>>>
>>> I’ve always had trouble with that advice, that RAM size should be JVM +
>>> index size. I’ve seen 300G indexes (as measured by the size of the
>>> data/index directory) run in 128G of memory.
>>>
>>> Here’s the long form:
>>> https://lucidworks.com/post/sizing-hardware-in-the-abstract-why-we-dont-have-a-definitive-answer/
>>>
>>> But the short form is “stress test and see”.
>>>
>>> To answer your question, though, when people say “index size” they’re
>>> usually referring to the size on disk as I mentioned above.
>>>
>>> Best,
>>> Erick
>>>
>>>> On Feb 3, 2020, at 4:24 AM, Mohammed Farhan Ejaz <farhan.e...@gmail.com> wrote:
>>>>
>>>> Hello All,
>>>>
>>>> I want to size the RAM for my Solr cloud instance. The thumb rule is your
>>>> total RAM size should be = (JVM size + index size)
>>>>
>>>> Now I have a simple question, How do I know my index size? A simple method,
>>>> perhaps from the Solr cloud admin UI or an API?
>>>>
>>>> My assumption so far is the total segment info size is the same as the
>>>> index size.
>>>>
>>>> Thanks & Regards
>>>> Farhan