Hi,

It is impossible to guess the required hardware size without knowing more about 
the data and usage. 80 mill docs is a fair amount.

Here's how I would approach sizing the setup:
1) Get your schema in shape, removing unnecessary stored/indexed fields
2) Do a test index locally of a part of the dataset, e.g. 10 mill docs, and 
perform an Optimize
3) Measure the size of the index folder and multiply by 8 (80 mill / 10 mill) 
to get a clue of total index size
4) Do some benchmarking with realistic types of queries to identify performance 
bottlenecks on query side
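
A rough sketch of step 3 in Python. It assumes index size grows linearly with 
document count, which is only an approximation, and the function names and the 
2 GB sample size are made up for illustration:

```python
import os

def dir_size_bytes(path):
    """Sum the sizes of all files under path (e.g. the index folder)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total

def estimate_full_index_bytes(sample_bytes, sample_docs, total_docs):
    """Linear extrapolation; assumes size scales with doc count."""
    return sample_bytes * total_docs / sample_docs

# Hypothetical numbers: a 10 mill doc sample that optimizes down to 2 GB.
# In practice sample_bytes would come from dir_size_bytes(<your index folder>).
sample_bytes = 2 * 1024**3
estimate = estimate_full_index_bytes(sample_bytes, 10_000_000, 80_000_000)
print(round(estimate / 1024**3, 1), "GB")  # 16.0 GB
```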

Depending on your requirements for search performance, you can either beef up 
your RAM so that the whole index fits in memory, or accept slow disks as the 
bottleneck. If you find that the total index size is 16GB, you should leave 
>16GB free for OS disk caching, e.g. allocate 8GB to Tomcat/Solr and leave the 
rest for the OS. If I had to guess, you will probably find that one server gets 
overloaded or too slow with your number of docs, and that you end up sharding 
across 2-4 servers.
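
The RAM arithmetic can be sketched like this. The function name is made up, 
and the sketch ignores the memory the OS and other processes need for 
themselves, so treat the result as an upper bound:

```python
def os_cache_headroom_gb(total_ram_gb, jvm_heap_gb, index_size_gb):
    """RAM left for the OS disk cache after the JVM heap, minus the index size.

    A negative result means the index cannot be fully cached in RAM.
    """
    return (total_ram_gb - jvm_heap_gb) - index_size_gb

# 24 GB server (as quoted below), 8 GB heap for Tomcat/Solr, 16 GB index
print(os_cache_headroom_gb(24, 8, 16))  # 0: the index barely fits
# Same server with a 32 GB index: 16 GB of it will not fit in the cache
print(os_cache_headroom_gb(24, 8, 32))  # -16
```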

PS: Do you always need to search all data? A trick may be to partition your 
data such that say 80% of searches go to a "fresh" index with 10% of the 
content, while the remaining searches include everything.
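
To see why this helps, here is a small back-of-the-envelope sketch. It only 
counts how many docs sit in the index a search hits, which is a crude proxy 
for query cost, and the function name is made up:

```python
def avg_docs_per_search(total_docs, fresh_pct, fresh_query_pct):
    """Average number of docs in the index that a search runs against.

    fresh_pct:       percent of all docs held in the "fresh" index
    fresh_query_pct: percent of searches that only hit the "fresh" index
    """
    fresh_docs = total_docs * fresh_pct // 100
    weighted = fresh_query_pct * fresh_docs + (100 - fresh_query_pct) * total_docs
    return weighted / 100

# 80% of searches against 10% of 80 mill docs, the rest against everything
print(avg_docs_per_search(80_000_000, 10, 80))  # 22400000.0
```

So the average search would run against about 22 mill docs instead of 80 mill.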

--
Jan Høydahl, search solution architect
Cominvent AS - www.facebook.com/Cominvent
Solr Training - www.solrtraining.com

On 22 May 2012, at 11:06, Bruno Mannina wrote:

> My choice: http://www.ovh.com/fr/serveurs_dedies/eg_best_of.xml
> 
> 24 GB DDR3
> 
> On 22/05/2012 10:26, findbestopensource wrote:
>> Dedicated Server may not be required. If you want to cut down cost, then
>> prefer shared server.
>> 
>> How much RAM?
>> 
>> Regards
>> Aditya
>> www.findbestopensource.com
>> 
>> 
>> On Tue, May 22, 2012 at 12:36 PM, Bruno Mannina<bmann...@free.fr>  wrote:
>> 
>>> Dear Solr users,
>>> 
>>> My company would like to use solr to index around 80 000 000 documents
>>> (xml files of around 5-10 KB each).
>>> My program (robot) will connect to this solr with boolean requests.
>>> 
>>> Number of users: around 1000
>>> Number of requests by user and by day: 300
>>> Number of users by day: 30
>>> 
>>> I would like to subscribe to a host provider with this configuration:
>>> - Dedicated Server
>>> - Ubuntu
>>> - Intel Xeon i7 2x 266+ GHz 12 GB 2 * 1500 GB
>>> - Unlimited bandwidth
>>> - Fixed IP
>>> 
>>> Do you think this configuration is enough?
>>> 
>>> Thanks for your info,
>>> Sincerely
>>> Bruno
>>> 
> 