Hi Jan,

Thanks for all these details!

Answers are below.

Sincerely,
Bruno


On 22/05/2012 13:58, Jan Høydahl wrote:
Hi,

It is impossible to guess the required HW size without more knowledge about 
data and usage. 80 mill docs is a fair amount.

Here's how I would approach sizing the setup:
1) Get your schema in shape, removing unnecessary stored/indexed fields
OK, good idea!
2) Do a test index locally of a part of the dataset, e.g. 10 mill docs, and 
perform an Optimize
For testing, I currently have only a sample of 12,000 docs, no more :'(
3) Measure the size of the index folder, multiply by 8 to get a clue of total 
index size
With 12,000 docs my index folder size is 33 MB.
PS: I use "solr.clustering.enabled=true"

4) Do some benchmarking with realistic types of queries to identify performance 
bottlenecks on query side
Yep, this point is for later.
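
As a rough illustration of steps 2 and 3 with the numbers reported above (33 MB for 12,000 docs, 80,000,000 docs in total), the extrapolation can be sketched as below. It assumes the index grows linearly with document count and that the small sample is representative, neither of which is guaranteed; the function name is made up for this sketch.

```python
def estimate_index_size_gb(sample_docs, sample_size_mb, total_docs):
    """Linearly extrapolate total index size from a small test index.

    Assumption: index size scales linearly with the number of documents
    and the sample is representative of the full dataset.
    """
    mb_per_doc = sample_size_mb / sample_docs
    return mb_per_doc * total_docs / 1000.0  # decimal units: 1 GB = 1000 MB


if __name__ == "__main__":
    # Numbers from this thread: 12,000 docs -> 33 MB, target 80,000,000 docs.
    size_gb = estimate_index_size_gb(12_000, 33, 80_000_000)
    print(f"Estimated index size: about {size_gb:.0f} GB")
```

A 12,000-doc sample is very small, so the result is best read as an order of magnitude until a larger test index is available.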

Depending on your requirements for search performance, you can beef up your RAM to 
hold the whole index or depend on slow disks as a bottleneck. If you find that the 
total size of the index is 16 GB, you should leave >16 GB free for OS disk caching, 
e.g. allocate 8 GB to Tomcat/Solr and leave the rest for the OS. If I should guess, 
you will probably find that one server gets overloaded or too slow with your amount 
of docs, and that you end up sharding across 2-4 servers.
I will take a look to see if I can easily increase the RAM on the server (currently 24 GB).
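
The RAM rule of thumb above can be checked with a small sketch. It uses the example figures from this message (16 GB index, 8 GB for Tomcat/Solr) and the 24 GB server mentioned in the reply; the function names are made up, and it ignores memory used by anything other than the JVM.

```python
def os_cache_gb(total_ram_gb, jvm_heap_gb):
    """RAM left for the OS disk cache once the Solr JVM heap is allocated.

    Assumption: nothing else on the machine uses a significant amount of RAM.
    """
    return total_ram_gb - jvm_heap_gb


def cached_fraction(index_size_gb, total_ram_gb, jvm_heap_gb):
    """Fraction of the index that could fit in the OS disk cache (capped at 1)."""
    cache = os_cache_gb(total_ram_gb, jvm_heap_gb)
    return max(0.0, min(1.0, cache / index_size_gb))


if __name__ == "__main__":
    # 24 GB server, 8 GB heap, 16 GB index: 16 GB remain for the disk cache.
    print(os_cache_gb(24, 8))
    print(cached_fraction(16, 24, 8))
```

A fraction well below 1 means most queries will hit disk, which is the point where more RAM or sharding becomes worth considering.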

Another question, about how to run Solr: do I just have to run java -jar start.jar,
or do you think I should run it another way?


PS: Do you always need to search all data? A trick may be to partition your data such 
that, say, 80% of searches go to a "fresh" index with 10% of the content, while 
the remaining searches include everything.
Yes, I need to search the whole index; even old documents must be searchable.


--
Jan Høydahl, search solution architect
Cominvent AS - www.facebook.com/Cominvent
Solr Training - www.solrtraining.com

On 22 May 2012, at 11:06, Bruno Mannina wrote:

My choice: http://www.ovh.com/fr/serveurs_dedies/eg_best_of.xml

24 GB DDR3

On 22/05/2012 10:26, findbestopensource wrote:
A dedicated server may not be required. If you want to cut down costs, then
prefer a shared server.

How much RAM?

Regards
Aditya
www.findbestopensource.com


On Tue, May 22, 2012 at 12:36 PM, Bruno Mannina <bmann...@free.fr> wrote:

Dear Solr users,

My company would like to use Solr to index around 80,000,000 documents
(XML files of around 5-10 KB each).
My program (robot) will query this Solr instance with boolean requests.

Number of users: around 1000
Number of requests per user per day: 300
Number of users per day: 30
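
For a sense of scale, the figures above translate into a daily query volume as sketched below. The function names and the idea of spreading requests over a window of "active hours" are assumptions made for this sketch; real traffic is bursty, so peaks will be higher than the average.

```python
def daily_requests(users_per_day, requests_per_user_per_day):
    """Total requests per day from the stated usage figures."""
    return users_per_day * requests_per_user_per_day


def average_qps(requests_per_day, active_hours=24.0):
    """Average queries per second if requests were spread evenly.

    Assumption: an even spread over `active_hours`; actual peaks are unknown.
    """
    return requests_per_day / (active_hours * 3600.0)


if __name__ == "__main__":
    total = daily_requests(30, 300)  # 30 users per day x 300 requests each
    print(f"{total} requests/day")
    print(f"{average_qps(total):.2f} queries/second on average over 24 hours")
```

With 30 users per day this comes to 9,000 requests per day, roughly 0.1 queries per second on average.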

I would like to subscribe to a host provider with this configuration:
- Dedicated Server
- Ubuntu
- Intel Xeon i7 2x 266+ GHz, 12 GB, 2 x 1500 GB
- Unlimited bandwidth
- Static IP

Do you think this configuration is enough?

Thanks for your info,
Sincerely
Bruno



