Curious how many documents per shard you were planning? The number of documents per shard and the field types will drive the amount of RAM needed to sort and facet.
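For a rough sense of why document count and field type matter, here is a back-of-the-envelope sketch. This is not Solr code, and the per-value byte costs are illustrative assumptions (modeled on typical un-inverted field behavior for numeric vs. string fields), not exact Solr numbers:

    # Back-of-envelope estimate of per-field RAM for sorting/faceting.
    # Byte costs are illustrative assumptions, not exact Solr figures:
    # a numeric field costs roughly one fixed-width value per document,
    # while a string field pays for an ord per document plus the bytes
    # of the unique terms themselves.

    def sort_field_ram_bytes(num_docs, field_type,
                             unique_terms=0, avg_term_len=0):
        """Rough RAM to un-invert one field for sorting/faceting."""
        if field_type == "numeric":   # e.g. long/double: ~8 bytes per doc
            return num_docs * 8
        if field_type == "string":    # ord per doc + term dictionary bytes
            return num_docs * 4 + unique_terms * avg_term_len
        raise ValueError("unknown field type")

    # Example: 200M docs in a shard, sorting on a string field with
    # 50M unique 20-byte terms -> ~1.7 GiB for that single field,
    # before any query-time work is counted.
    estimate = sort_field_ram_bytes(200_000_000, "string",
                                    unique_terms=50_000_000,
                                    avg_term_len=20)
    print(f"{estimate / 2**30:.1f} GiB")

Multiply by every field you sort or facet on, per shard, and the memory budget adds up quickly.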
On Wed, Dec 11, 2013 at 7:02 AM, Toke Eskildsen <t...@statsbiblioteket.dk> wrote:

> On Tue, 2013-12-10 at 17:51 +0100, Hoggarth, Gil wrote:
> > We're probably going to be building a Solr service to handle a dataset
> > of ~60TB, which for our data and schema typically gives a Solr index
> > size of 1/10th - i.e., 6TB. Given there's a general rule that the
> > amount of hardware memory should exceed the size of the Solr index
> > (exceed it to also allow for the operating system etc.), how have
> > people handled this situation?
>
> By acknowledging that it is cheaper to buy SSDs than to try to
> compensate for slow spinning drives with excessive amounts of RAM.
>
> Our plan for an estimated 20TB of indexes out of 372TB of raw web data
> is to use SSDs controlled by a single machine with 512GB of RAM (or was
> it 256GB? I'll have to ask the hardware guys):
> https://sbdevel.wordpress.com/2013/12/06/danish-webscale/
>
> As always YMMV, and the numbers you quote elsewhere indicate that your
> queries are quite complex. You might want to do a bit of profiling to
> see if they are heavy enough to make the CPU the bottleneck.
>
> Regards,
> Toke Eskildsen, State and University Library, Denmark

-- 
Joel Bernstein
Search Engineer at Heliosearch