Prediction About Index Sizes of Solr

2013-04-08 Thread Furkan KAMACI
This may not be a very detailed question, but I will try to make it clear.

I am crawling web pages and will index them in SolrCloud 4.2. What I want
to predict is the index size.

I will have approximately 2 billion web pages, and I assume each of them
will be about 100 KB.
I know the result depends on whether documents are stored, on stop words,
and so on. If you need more detail about my question, I can explain further.
However, there should be some analysis that can help me, because I need to
estimate what the index size will be.

My other important question is how SolrCloud creates replicas of an index,
and whether I can change how many replicas there will be, because I need to
multiply the total index size by the number of replicas.
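
The arithmetic described above can be sketched roughly as follows. This is only an illustration, not a Solr tool: the function names are hypothetical, "100 KB" is read as 100,000 bytes, and the index-to-raw ratio (0.3) and the number of copies per shard (2) are placeholder assumptions that must be replaced with values measured on a sample of real documents.

```python
KB = 1000          # assumption: decimal kilobytes
TB = 1000 ** 4     # decimal terabytes


def raw_corpus_bytes(num_docs, avg_doc_bytes):
    """Total size of the crawled pages before indexing."""
    return num_docs * avg_doc_bytes


def total_index_bytes(raw_bytes, index_ratio, copies):
    """Disk needed across the cluster.

    index_ratio: index size divided by raw size (hypothetical; measure it
                 by indexing a sample with the real schema).
    copies:      total copies of each shard, counting the original.
    """
    return raw_bytes * index_ratio * copies


raw = raw_corpus_bytes(2_000_000_000, 100 * KB)
print(f"raw corpus: {raw / TB:.1f} TB")   # raw corpus: 200.0 TB

# Placeholder values, not measurements.
total = total_index_bytes(raw, index_ratio=0.3, copies=2)
print(f"estimated index across all copies: {total / TB:.1f} TB")
```

The raw figure follows directly from the numbers in the question. Everything after that depends on the ratio, which varies widely with stored fields, term vectors, and analysis settings.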

I found an article related to this analysis:
http://juanggrande.wordpress.com/2010/12/20/solr-index-size-analysis/

I know this question may not be detailed enough, but any ideas are
welcome.


Re: Prediction About Index Sizes of Solr

2013-04-08 Thread Rafał Kuć
Hello!

Let me answer the first part of your question. Please have a look at
https://svn.apache.org/repos/asf/lucene/dev/trunk/dev-tools/size-estimator-lucene-solr.xls
It should help you estimate your index size.

-- 
Regards,
 Rafał Kuć
 Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - ElasticSearch




Re: Prediction About Index Sizes of Solr

2013-04-08 Thread Dmitry Kan
Interesting bit, thanks Rafał!


