On Thu, Oct 22, 2009 at 10:29 PM, Hrishikesh Agashe <
hrishikesh_aga...@persistent.co.in> wrote:

> Can I create an index file with very large size, like 1 TB or so? Is there
> any limit on how large index file one can create? Also, will I be able to
> search on this 1 TB index file at all?
>

Leaving aside the question of hardware or JVM limits on monstrous files,
this question (can you search this file) is easier: if you've got say, a ten
billion documents in one index, and you have a query which is going to hit
maybe even just 0.1% of the documents, you'll need to do scoring of 10
million hits in the course of that query.  To do this in under a second
means you only have 100 nanoseconds to look at each document.  If your query
hits 1% of your documents, you're down to 10 ns per document.  I've never
tried searching a 1TB index, but I'd say that's pushing it.

Is there a reason you can't shard your index, and instead put maybe 20
shards of 50GB (or better - 100 shards of 10GB) each on a variety of
machines, and just merge results?

  -jake

Reply via email to