I agree with Erick: there is no cutoff point (no particular index
size, for that matter) above which you start sharding.
What you can do is create a scheduled job in your system that runs
a select list of queries and monitors their performance. Once their
performance degrades, shard the index, for example by splitting it.
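Something along these lines is what I have in mind; a minimal
sketch assuming a recent Lucene version, where the index path,
field name, canary queries, latency threshold, and check interval
are all made up and need tuning for your workload:

import java.nio.file.Paths;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.store.FSDirectory;

public class QueryLatencyMonitor {
    // Made-up threshold; calibrate it against what latency you can accept.
    private static final long THRESHOLD_MS = 200;

    public static void main(String[] args) throws Exception {
        IndexSearcher searcher = new IndexSearcher(
            DirectoryReader.open(FSDirectory.open(Paths.get("/path/to/index"))));

        // A fixed "canary" list of queries representative of real traffic.
        List<Query> canaries = List.of(
            new TermQuery(new Term("body", "lucene")),
            new TermQuery(new Term("body", "shard")));

        ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(() -> {
            for (Query q : canaries) {
                long start = System.nanoTime();
                try {
                    searcher.search(q, 10); // top-10 is enough for timing
                } catch (Exception e) {
                    e.printStackTrace();
                    continue;
                }
                long ms = (System.nanoTime() - start) / 1_000_000;
                if (ms > THRESHOLD_MS) {
                    // In a real system: alert, or kick off your split/reshard job.
                    System.err.printf("%s took %d ms (threshold %d ms)%n",
                        q, ms, THRESHOLD_MS);
                }
            }
        }, 0, 1, TimeUnit.HOURS); // the hourly interval is also arbitrary
    }
}

Once the alert fires, you trigger whatever resharding process fits
your setup.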
Hmmm, then it's pretty hopeless, I think. The problem is that
anything you say about running on a machine with 2 GB of available
memory and a single processor is completely incomparable to running
on a machine with 64 GB of memory available for Lucene and 16
processors. There's really no such thing as an "average"
configuration to benchmark against.
Hi all,
I know Lucene indexes are at their optimum up to a certain size,
said to be around several GB. I haven't found a good discussion of
this, but it's my understanding that at some point it's better to
split an index into parts (a la sharding) than to continue
searching on a single huge index.
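To be concrete about what I mean by searching on parts: a minimal
sketch, assuming each shard is just an ordinary standalone Lucene
index on local disk (the paths and field name here are invented):

import java.nio.file.Paths;

import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.MultiReader;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.FSDirectory;

public class ShardedSearch {
    public static void main(String[] args) throws Exception {
        // Invented paths; each shard is a complete index on its own.
        DirectoryReader shard1 =
            DirectoryReader.open(FSDirectory.open(Paths.get("/indexes/shard1")));
        DirectoryReader shard2 =
            DirectoryReader.open(FSDirectory.open(Paths.get("/indexes/shard2")));

        // MultiReader presents the shards as one logical index to search.
        MultiReader all = new MultiReader(shard1, shard2);
        IndexSearcher searcher = new IndexSearcher(all);

        TopDocs hits = searcher.search(new TermQuery(new Term("body", "lucene")), 10);
        System.out.println("hits across both shards: " + hits.totalHits);

        all.close(); // also closes the two sub-readers
    }
}

My question is at what index size this kind of split starts to pay
off, or whether there is any such number at all.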