Thanks Shawn.

What do you mean by "important parts of the index", and how can I calculate
their size?

Thanks,
Mahmoud

Sent from my iPhone

> On Dec 29, 2014, at 8:19 PM, Shawn Heisey <apa...@elyograg.org> wrote:
> 
>> On 12/29/2014 2:36 AM, Mahmoud Almokadem wrote:
>> I have the same index with a slightly different schema and 200M documents,
>> installed on 3 r3.xlarge instances (30GB RAM and 600GB General Purpose SSD
>> each). The index is about 1.5TB, with many updates every 5 minutes and
>> complex queries and faceting, and a response time of 100ms, which is
>> acceptable for us.
>> 
>> Toke Eskildsen,
>> 
>> Is the index updated while you are searching? *No*
>> Do you do any faceting or other heavy processing as part of a search? *No*
>> How many hits does a search typically have, and how many documents are
>> returned? *The test measures QTime only, with no documents returned; the
>> number of hits varies from 50,000 to 50,000,000.*
>> How many concurrent searches do you need to support? How fast should the
>> response time be? *Maybe 100 concurrent searches at 100ms, with facets.*
>> 
>> Would splitting the shard into two shards on the same node, so that each
>> shard sits on its own EBS volume, work better than using LVM?
> 
> The basic problem is simply that the system has so little memory that it
> must read large amounts of data from the disk when it does a query.
> There is not enough RAM to cache the important parts of the index.  RAM
> is much faster than disk, even SSD.
> 
> Typical consumer-grade DDR3-1600 memory has a data transfer rate of
> about 12800 megabytes per second.  If it's ECC memory (which I would say
> is a requirement) then the transfer rate is probably a little bit slower
> than that.  Figuring 9 bits for every byte gets us about 11377 MB/s.
> That's only an estimate, and it could be wrong in either direction, but
> I'll go ahead and use it.
> 
> http://en.wikipedia.org/wiki/DDR3_SDRAM#JEDEC_standard_modules
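
A quick sanity check of the figure above (a sketch; the 9-bits-per-byte ECC
model is the one the email uses, not an exact hardware spec):

```python
# Back-of-the-envelope check of the DDR3-1600 numbers above.
# Assumption (from the email): ECC transfers 9 bits for every 8-bit
# data byte, so only 8/9 of the raw rate carries data.

raw_rate_mb_s = 12800                   # PC3-12800 peak rate, MB/s
ecc_rate_mb_s = raw_rate_mb_s * 8 // 9  # data-only rate with ECC overhead

print(ecc_rate_mb_s)  # 11377 MB/s, matching the estimate in the email
```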
> 
> If your SSD is SATA, the transfer rate will be limited to approximately
> 600MB/s -- the 6 gigabit per second transfer rate of the newest SATA
> standard.  That makes memory about 19 times as fast as SATA SSD.  I saw
> one PCI express SSD that claimed a transfer rate of 2900 MB/s.  Even
> that is only about one fourth of the estimated speed of DDR3-1600 with
> ECC.  I don't know what interface technology Amazon uses for their SSD
> volumes, but I would bet on it being the cheaper version, which would
> mean SATA.  The networking between the EC2 instance and the EBS storage
> is unknown to me and may be a further bottleneck.
> 
> http://ocz.com/enterprise/z-drive-4500/specifications
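
The ratios above can be checked the same way (a sketch; the 10-line-bits-per-
data-byte figure assumes SATA's 8b/10b encoding, and the RAM number is the
ECC-adjusted estimate from earlier in the email):

```python
# Rough ratios behind the memory-vs-SSD comparisons above.
# Assumptions: SATA 3 signals at 6 Gbit/s with 8b/10b line encoding
# (10 line bits per data byte); DDR3+ECC figure is from the email.

ram_mb_s = 11377            # DDR3-1600 with ECC overhead, MB/s
sata_mb_s = 6000 // 10      # 6 Gbit/s -> ~600 MB/s usable
pcie_ssd_mb_s = 2900        # the PCIe SSD spec cited above

print(sata_mb_s)                        # 600
print(round(ram_mb_s / sata_mb_s))      # ~19: RAM vs SATA SSD
print(round(ram_mb_s / pcie_ssd_mb_s))  # ~4: RAM vs that PCIe SSD
```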
> 
> Bottom line -- you need a lot more memory.  Speeding up the disk may
> *help* ... but it will not replace that simple requirement.  With EC2 as
> the platform, you may need more instances and more shards.
> 
> Your 200 million document index that works well with only 90GB of total
> memory ... that's surprising to me.  That means that the important parts
> of that index *do* fit in memory ... but if the index gets much larger,
> performance is likely to drop off sharply.
> 
> Thanks,
> Shawn
> 