On 12/26/2014 7:17 AM, Mahmoud Almokadem wrote:
> We've installed a cluster of one collection of 350M documents on 3
> r3.2xlarge (60GB RAM) Amazon servers. The size of index on each shard is
> about 1.1TB and maximum storage on Amazon is 1 TB so we add 2 SSD EBS
> General purpose (1x1TB + 1x500GB) on each instance. Then we create logical
> volume using LVM of 1.5TB to fit our index.
>
> The response time is about 1 and 3 seconds for simple queries (1 token).
>
> Is the LVM become a bottleneck for our index?

SSD is very fast, but it is still very slow when compared to RAM. The problem here is that Solr must read data off the disk in order to do a query, and even at SSD speeds, that is slow. LVM is not the problem here, though it's possible that it may be a contributing factor.

You need more RAM. For Solr to be fast, a large percentage (ideally 100%, but smaller fractions can often be enough) of the index must be held in the operating system's disk cache, which uses whatever RAM is not allocated to programs.

Your information seems to indicate that the total index is a little over 3 terabytes (about 1.1TB per shard, assuming one shard on each of the three servers). If that's the index size, I would guess that you would need somewhere between 1 and 2 terabytes of total RAM for speed to be acceptable. Because RAM is *very* expensive on Amazon and is not available in sizes like 256GB-1TB, that typically means a lot of their virtual machines, with a lot of shards in SolrCloud. You may find that real hardware is less expensive for very large Solr indexes in the long term than cloud hardware.

Thanks,
Shawn
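
As a rough illustration of the sizing reasoning in the reply above, the arithmetic can be sketched in Python. This is only a back-of-the-envelope sketch: the function names are hypothetical, the 8GB reserved per node for the Solr heap and the OS is an assumed figure that does not come from the thread, and 1TB is treated as 1000GB.

```python
import math


def cache_coverage(index_gb_per_node, ram_gb_per_node, reserved_gb_per_node, nodes):
    """Fraction of the total index that could fit in the OS disk cache.

    reserved_gb_per_node is RAM assumed to be taken by the Solr heap and
    the OS itself (an assumption, not a figure from the thread).
    """
    total_index_gb = index_gb_per_node * nodes
    cache_gb = max(ram_gb_per_node - reserved_gb_per_node, 0) * nodes
    return cache_gb / total_index_gb


def nodes_for_cache(target_cache_gb, ram_gb_per_node, reserved_gb_per_node):
    """Number of identical nodes needed to reach a target amount of cache RAM."""
    usable_gb = ram_gb_per_node - reserved_gb_per_node
    if usable_gb <= 0:
        raise ValueError("no RAM left for the disk cache on each node")
    return math.ceil(target_cache_gb / usable_gb)


# Figures from the thread: 3 nodes, 60GB RAM each, about 1.1TB of index per node.
# The 8GB reserved per node is a hypothetical value.
coverage = cache_coverage(1100, 60, 8, nodes=3)
low = nodes_for_cache(1000, 60, 8)   # 1TB of cache RAM
high = nodes_for_cache(2000, 60, 8)  # 2TB of cache RAM
print(f"cacheable fraction: {coverage:.1%}")
print(f"60GB nodes needed for 1-2TB of cache: {low} to {high}")
```

Under these assumptions, less than 5% of the index can be cached on the current three machines, and reaching 1 to 2 terabytes of cache with 60GB instances would take somewhere between 20 and 39 of them, which is why the reply points to many virtual machines or to real hardware.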