> Out of curiosity, what is the size of your corpus? How much and how > quickly do you expect it to grow?
in terms of lucene documents, we tend to have in the 10M-100M range. Currently we use merging to make larger indices from smaller ones, so a single index can have a lot of documents in it, but merging takes a long time so I'm trying to test out just using a ton of tiny indices to see if the search penalty from doing that is worth the time savings from not having to build and optimize large indices. > I'm just trying to make sure that we are all on the same page here ^^ > > I can see the benefits of doing what you are describing with a very > large corpus that is expected to grow at quick rate, but if that's not > really your use case, then perhaps it might be worth investigating if a > simpler solution would serve you just as well. The indices also grow pretty quickly. We have some cases where we get nearly a million new documents per day. I haven't looked at those machines for quite a while, but I guess they'd probably have well over a hundred million documents, and still are growing. We also don't have a lot of simultaneous searches yet, but that's changing, so I'm getting concerned about how well that's being handled. We expect that we will soon be dealing with tens to hundreds of searches being executed simultaneously. > In the example you provided, you are only talking about searching > against 1M documents, which I can guarantee will search with VERY good > performance in a single properly setup lucene index. > > Now if we are talking more on the order of... 100M or more documents you > may be onto something. But in any case, there isn't currently any framework for making multiple searches simultaneously use an index in a coordinated fashion. I was pretty much just planning on adding it to my tests if such a thing existed. Since it doesn't, I guess I'll stuck to searching in parallel and hoping that the Linux VFS layer is smart enough to keep things fast until I have time to try putting something together myself. --------------------------------------------------------------------- To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org For additional commands, e-mail: java-user-h...@lucene.apache.org