Re: Size of index to use shard

2012-01-26 Thread Dmitry Kan
@Erick: Thanks for the detailed explanation. On that note, *.fdt and *.fdx account for 75GB of our 99GB index. Search is still not that fast when the cache is small, but giving it more cache led to OOMs. Partitioning into shards is not an option either, as at the moment we try to run as less

Re: Size of index to use shard

2012-01-24 Thread Vadim Kisselmann
Hi, it depends on your hardware. Read this: http://www.derivante.com/2009/05/05/solr-performance-benchmarks-single-vs-multi-core-index-shards/ Think about your cache config (few updates, big caches) and a good hardware infrastructure. In my case I can handle a 250GB index with 100 million docs on an i7

Re: Size of index to use shard

2012-01-24 Thread Dmitry Kan
Hi, the article you linked mentions an index size of 13GB, which is quite a small index from our perspective. We have noticed that at least Solr 3.4 has some sort of choking point with respect to growing index size: it just becomes substantially slower than what we need (a query on average taking more than

Re: Size of index to use shard

2012-01-24 Thread Anderson vasconcelos
Apparently it is not so easy to determine when to break the content into pieces. I'll investigate further the number of documents, the size of each document, and what kind of search is being used. It seems I will have to do a load test to identify the cutoff point to begin using the strategy of

Re: Size of index to use shard

2012-01-24 Thread Erick Erickson
Talking about index size can be very misleading. Take a look at http://lucene.apache.org/java/3_5_0/fileformats.html#file-names. Note that the *.fdt and *.fdx files are used for stored fields, i.e. the verbatim copy of data put in the index when you specify stored=true. These files have
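To make the point about stored fields concrete, here is a minimal sketch (not from the thread) that breaks an index directory down by file extension and reports how much of it is *.fdt and *.fdx. The function names and the idea of passing an index directory path are assumptions for illustration. It also assumes the index is not using compound files (.cfs), in which case the stored-field files would not show up separately.

```python
import os
from collections import defaultdict

# Extensions that hold stored fields in the Lucene 3.x file format.
STORED_FIELD_EXTENSIONS = {".fdt", ".fdx"}


def size_by_extension(index_dir):
    """Return {extension: total bytes} for the files directly in index_dir."""
    totals = defaultdict(int)
    for name in os.listdir(index_dir):
        path = os.path.join(index_dir, name)
        if os.path.isfile(path):
            ext = os.path.splitext(name)[1]  # "" for files such as segments_N
            totals[ext] += os.path.getsize(path)
    return dict(totals)


def stored_field_share(totals):
    """Fraction of all bytes taken by stored-field files (0.0 if empty)."""
    total = sum(totals.values())
    stored = sum(size for ext, size in totals.items()
                 if ext in STORED_FIELD_EXTENSIONS)
    return stored / total if total else 0.0
```

Calling stored_field_share(size_by_extension(path)) on a hypothetical index directory gives the fraction of disk usage that is verbatim stored data rather than the searchable part of the index.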

Re: Size of index to use shard

2012-01-24 Thread Anderson vasconcelos
Thanks for the explanation, Erick :) 2012/1/24, Erick Erickson erickerick...@gmail.com: Talking about index size can be very misleading. Take a look at http://lucene.apache.org/java/3_5_0/fileformats.html#file-names. Note that the *.fdt and *.fdx files are used for stored fields, i.e. the

Re: Size of index to use shard

2012-01-24 Thread Vadim Kisselmann
@Erick thanks :) I agree with you; my load tests show the same. @Dmitry my docs are small too, I think about 3-15KB per doc. I update my index all the time and have an average of 20-50 requests per minute (20% facet queries, 80% large boolean queries with wildcard/fuzzy). How much

Size of index to use shard

2012-01-23 Thread Anderson vasconcelos
Hi, is there some index size (or number of docs) at which it becomes necessary to break the index into shards? I have an index of 100GB. This index grows by 10GB per year (I don't have information on how many docs it has), and the docs will never be deleted. Thinking 30 years ahead, the index will be with
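The growth estimate in this question can be sketched as a simple linear projection. This is only an illustration: the function name is made up, and it assumes the growth stays constant at the stated yearly rate and that nothing is ever deleted, as the poster describes.

```python
def projected_index_size_gb(current_gb, growth_gb_per_year, years):
    """Linear projection of index size: constant yearly growth, no deletes."""
    return current_gb + growth_gb_per_year * years


# Figures from the question: 100GB today, growing 10GB per year.
for years in (5, 10, 20, 30):
    size = projected_index_size_gb(100, 10, years)
    print(f"after {years} years: {size} GB")
```

As the replies in the thread point out, the raw size alone does not decide when to shard, so a projection like this is a starting point for load testing rather than a cutoff.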