We're probably going to be building a Solr service to handle a dataset of ~60TB, which for our data and schema typically produces a Solr index of about one tenth that size - i.e., 6TB. Given the general rule of thumb that available RAM should exceed the size of the Solr index (exceed it, to leave room for the operating system etc.), how have people handled this situation? Do I really need, for example, 12 servers with 512GB RAM each, or are there other techniques for handling this?
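For reference, here's the back-of-envelope arithmetic behind that 12-server figure (a rough Python sketch, assuming the 1:10 index-to-data ratio holds for us and that total RAM must cover the whole index):

    dataset_tb = 60
    index_ratio = 0.10                     # observed index:data ratio for our schema
    index_tb = dataset_tb * index_ratio    # ~6 TB of Solr index
    ram_per_server_gb = 512
    servers = index_tb * 1024 / ram_per_server_gb
    print(f"index ~{index_tb:.0f} TB -> {servers:.0f} servers @ {ram_per_server_gb} GB RAM")

which comes out at 12 servers - hence the question of whether that's really necessary.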
Many thanks in advance for any general/conceptual/specific ideas/comments/answers!

Gil

Gil Hoggarth
Web Archiving Technical Services Engineer
The British Library, Boston Spa, West Yorkshire, LS23 7BQ