On 12/26/2014 7:17 AM, Mahmoud Almokadem wrote:
> We've installed a cluster of one collection of 350M documents on 3
> r3.2xlarge (60GB RAM) Amazon servers. The size of index on each shard is
> about 1.1TB and maximum storage on Amazon is 1 TB so we add 2 SSD EBS
> General purpose (1x1TB + 1x500GB) on each instance. Then we create logical
> volume using LVM of 1.5TB to fit our index.
> 
> The response time is about 1 and 3 seconds for simple queries (1 token).
> 
> Is the LVM become a bottleneck for our index?

SSD is fast, but it is still far slower than RAM.  The
problem here is that Solr must read data off the disk in order to do a
query, and even at SSD speeds, that is slow.  LVM is not the main problem
here, though it's possible that it may be a contributing factor.  You
need more RAM.

For Solr to be fast, a large percentage (ideally 100%, but smaller
fractions can often be enough) of the index must be cached by the
operating system in RAM that is not otherwise in use (the OS disk
cache).  Your numbers (three shards of about 1.1TB each) seem to
indicate that the whole index is a little over 3 terabytes.  If that's
the index size, I would guess that you would need somewhere between 1
and 2 terabytes of total RAM for speed to be acceptable.  Because RAM is
*very* expensive on Amazon and instances are not available with
256GB-1TB of memory, that typically means a lot of their virtual
machines, with a lot of shards in SolrCloud.  You may find that real
hardware is less expensive for very large Solr indexes in the long term
than cloud hardware.
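
To put rough numbers on that, here is a back-of-the-envelope sketch in
Python.  The 60GB per instance and the 1.1TB per shard come from your
message; the 8GB Java heap per instance is purely my assumption for
illustration, and the function names are made up.  Treat the output as
an order-of-magnitude estimate, not a sizing recommendation.

```python
import math

def instances_needed(cache_target_gb, ram_gb_per_instance, heap_gb):
    """Number of instances whose leftover RAM adds up to cache_target_gb.

    Leftover RAM is what remains for the OS disk cache after the Java
    heap is subtracted.  heap_gb is an assumed value, not a measurement.
    """
    usable_gb = ram_gb_per_instance - heap_gb
    if usable_gb <= 0:
        raise ValueError("heap leaves no RAM for the OS disk cache")
    return math.ceil(cache_target_gb / usable_gb)

def cached_fraction(instances, ram_gb_per_instance, heap_gb, index_gb):
    """Fraction of the index that could fit in the OS disk cache."""
    usable_gb = instances * (ram_gb_per_instance - heap_gb)
    return min(1.0, usable_gb / index_gb)

RAM_GB = 60             # per instance, from the original message
HEAP_GB = 8             # assumed Java heap per instance
INDEX_GB = 3 * 1.1 * 1024   # three shards of about 1.1TB each

# Current setup: how much of the index can be cached on 3 instances?
print(f"cached now: {cached_fraction(3, RAM_GB, HEAP_GB, INDEX_GB):.1%}")

# Instances of this size needed for 1TB and 2TB of disk cache.
for target_tb in (1, 2):
    n = instances_needed(target_tb * 1024, RAM_GB, HEAP_GB)
    print(f"{target_tb}TB of cache: {n} instances")
```

Under those assumptions, the three current instances can cache only
about 5% of the index, and reaching 1 to 2 terabytes of disk cache would
take on the order of 20 to 40 instances of this size.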

Thanks,
Shawn
