Prediction About Index Sizes of Solr
This may not be a very detailed question, but I will try to make it clear. I am crawling web pages and will index them in SolrCloud 4.2. What I want to predict is the index size. I will have approximately 2 billion web pages, and I assume each of them will be about 100 KB. I know the result depends on whether documents are stored, on stop words, and so on. If you need more details about my setup, I can give more explanation. Still, there should be some kind of analysis that can help me, because I need to predict roughly what the index size will be. My other important question is how SolrCloud creates replicas of an index, and whether I can change how many replicas there will be, because I need to multiply the total index size by the number of replicas. I found an article related to this analysis: http://juanggrande.wordpress.com/2010/12/20/solr-index-size-analysis/ I know this question may be short on details, but any ideas are welcome.
Re: Prediction About Index Sizes of Solr
Hello! Let me answer the first part of your question. Please have a look at https://svn.apache.org/repos/asf/lucene/dev/trunk/dev-tools/size-estimator-lucene-solr.xls It should help you estimate your index size. -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - ElasticSearch
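For a rough sense of scale, the figures from the question (2 billion pages at about 100 KB each, multiplied by the number of replicas) can be put into a back-of-the-envelope calculation. This is only a sketch: the function name, the index-to-raw ratio of 0.3, and the replication factor of 2 are placeholder assumptions, not values from the thread or from Solr. The ratio in particular depends heavily on stored fields, analysis, and stop words, so it should be replaced with a number derived from the estimator spreadsheet or from indexing a sample of real pages.

```python
def estimate_index_bytes(num_docs, avg_doc_bytes, index_ratio, replication_factor):
    """Return (raw_bytes, single_copy_index_bytes, total_cluster_bytes).

    index_ratio is the assumed index size divided by raw input size;
    replication_factor is the assumed number of copies of each shard.
    """
    raw = num_docs * avg_doc_bytes
    single_copy = raw * index_ratio
    total = single_copy * replication_factor
    return raw, single_copy, total


# Figures taken from the question.
NUM_DOCS = 2_000_000_000
AVG_DOC_BYTES = 100 * 1000      # "100 KB", read here as decimal kilobytes

# Placeholder assumptions, to be replaced with measured values.
INDEX_RATIO = 0.3               # hypothetical index/raw ratio
REPLICATION_FACTOR = 2          # hypothetical number of copies per shard

raw, single_copy, total = estimate_index_bytes(
    NUM_DOCS, AVG_DOC_BYTES, INDEX_RATIO, REPLICATION_FACTOR
)

TB = 1000 ** 4  # decimal terabyte
print(f"raw corpus:        {raw / TB:.1f} TB")
print(f"index, one copy:   {single_copy / TB:.1f} TB")
print(f"index, all copies: {total / TB:.1f} TB")
```

The raw corpus alone comes to about 200 TB under these figures, so even a small error in the assumed ratio shifts the estimate by many terabytes. Measuring the ratio on a representative sample before sizing the cluster is probably worth the effort.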
Re: Prediction About Index Sizes of Solr
Interesting bit, thanks Rafał! On Mon, Apr 8, 2013 at 12:54 PM, Rafał Kuć r@solr.pl wrote: Hello! Let me answer the first part of your question. Please have a look at https://svn.apache.org/repos/asf/lucene/dev/trunk/dev-tools/size-estimator-lucene-solr.xls It should help you make an estimation about your index size.