Unfortunately, there's not much anyone can say. If I can paraphrase what you're asking, it's "Will a Lucene search be fast enough?" The answer is "it depends". Asking the question "is part of the index stored in RAM" isn't really relevant.
Yes, some parts of the index are cached in RAM. Yes, that is handled transparently under the covers. That information is entirely useless for answering the question "is the search fast enough". All you can do is create an index and test. This is especially true since you've supplied no information that helps. 6M documents is the only hard figure you've provided. Are you indexing 1 field per doc? 100 fields per doc? Do you expect to have to do lots of sorting? Complex queries? What is acceptable performance? How fast is "fast"? How many requests do you anticipate per second/minute/hour? And on and on. But even with those pieces of information, we still can't help much because there are just so many variables. My suggestion: Create a simple Lucene index of your documenta and measure. Read the FAQ and other documentation on the Lucene website and on the Lucene Wiki to get some pointers on how to structure your index for speed. And experiment. Nothing else will give you a feel for the speed. Creating the index should take you about a day. It'll be time well spent. Best Erick On Nov 26, 2007 9:27 AM, Haroldo Nascimento <[EMAIL PROTECTED]> wrote: > Hi, > > I have a very great volume of data (6.000.000 of documents) and I > need to have a very fast search. I am thinking about using Terracotta > (with Lucene) for clustering the solution. > > One of the advantages of the Terracotta is that part of the index is > stored in memory and part is persisted em disk. If not to find in > memory the application searchs in disk. This is transparent for the > user. The problem would be in the process of updat of index. > > Another solution would be to persist the index using only Lucene, > but I believe that the reply time very using disk either bigger that > the solution in memory. > > You know some document comparative about the solution em memory > (RAMDirectory) and solution em disk using Lucene? > > Tip: In another application I am using the solution index in memory > (RAMDirectory), to initiate the process of load of the indice I > serialized the RAMDirectory object. For it I need insert "implements > Serializable" in some classrooms of Lucene. > > > > On Nov 25, 2007 5:55 PM, Erick Erickson <[EMAIL PROTECTED]> wrote: > > As I understand, Lucene does a fair amount of caching of terms in > > memory without you having to specify anything. > > > > But it's hard to see how your question relates. Remember that Lucene is > > finding *all* matching docs. So searching in a RAMdirectory and then > > searching in the file doesn't really seem possible since Lucene has to > > search > > the entire index every time to score the docs. It doesn't stop after > > the first hit, since the next hit may score higher. > > > > But I'm sure Lucene *does* cache portions of the index in RAM when > > possible, but I've never had occasion to dig into the details. > > > > Which leads me to ask "Why do you care?". Is there a specific > > situation you're trying to get better performance from or is this more > > of a background question? If you have a specific situation, please > describe > > it in some detail so better minds than mine can give you a better > > response <G>.... > > > > Best > > Erick > > > > On Nov 24, 2007 10:26 AM, Haroldo Nascimento <[EMAIL PROTECTED]> > > wrote: > > > > > > > Hi, > > > > > > I have a question ? > > > > > > Lucene offers a mixing structure of storage of index, that is, first > > > do search in memoria (ARMDirectory) and in case of not found do search > > > in index file automatically ? For example: Load part of index in > > > memory for do the search fastest. > > > > > > Thnaks > > > > > > --------------------------------------------------------------------- > > > To unsubscribe, e-mail: [EMAIL PROTECTED] > > > For additional commands, e-mail: [EMAIL PROTECTED] > > > > > > > > > > --------------------------------------------------------------------- > To unsubscribe, e-mail: [EMAIL PROTECTED] > For additional commands, e-mail: [EMAIL PROTECTED] > >