Hi,

>> would the memory usage go through the roof?

Yup. In my past experience, that got me into a pickle.

With regards,
Karthik

On Mon, Dec 5, 2011 at 11:28 PM, Rui Wang <rw...@ebi.ac.uk> wrote:

> Hi All,
>
> We are planning to use Lucene in our project, but are not entirely sure
> about some of the design decisions we made. The details are below; any
> comments/suggestions are more than welcome.
>
> The requirements of the project are:
>
> 1. We have tens of thousands of files, their size ranging from 500M to a
> few terabytes, and the majority of the contents in these files will not
> be accessed frequently.
>
> 2. We are planning to keep less accessed contents outside of our database
> and store them on the file system.
>
> 3. We also have code to get the binary position of these contents in the
> files. Using these binary positions, we can quickly retrieve the contents
> and convert them into our domain objects.
>
> We think Lucene provides a scalable solution for storing and indexing
> these binary positions, so the idea is that each piece of content in the
> files will be a document, and each document will have at least an ID field
> to identify the content and a binary position field containing the start
> and stop positions of the content. Having done some performance testing,
> it seems to us that Lucene is well capable of doing this.
>
> At the moment, we are planning to create one Lucene index per file, so if
> new files are added to the system, we can simply generate a new index.
> The problem is to do with searching: this approach means that we need to
> create a new IndexSearcher every time a file is accessed through our web
> service. We know that it is rather expensive to open a new IndexSearcher,
> and are thinking of using some kind of pooling mechanism.
>
> Our questions are:
>
> 1. Is this one-index-per-file approach a viable solution? What do you
> think about pooling IndexSearchers?
>
> 2. If we have many IndexSearchers open at the same time, would the
> memory usage go through the roof? I couldn't find any documentation on
> how Lucene allocates memory.
>
> Thank you very much for your help.
>
> Many thanks,
> Rui Wang

--
N.S.KARTHIK
R.M.S.COLONY
BEHIND BANK OF INDIA
R.M.V 2ND STAGE
BANGALORE 560094
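
On the pooling idea raised in the quoted message: one way to keep memory bounded is to cap how many searchers stay open and close the least recently used one when the cap is exceeded. Below is a minimal sketch in plain Java, not Lucene code. `LruResourcePool` and its `opener` function are hypothetical names introduced here for illustration; in a real setup the opener would presumably open an IndexSearcher for a given index directory.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

/**
 * Bounded least-recently-used pool of closeable resources.
 * Hypothetical helper: K could be an index path, V a searcher-like object.
 */
class LruResourcePool<K, V extends AutoCloseable> {
    private final Function<K, V> opener;      // assumed factory that opens a resource for a key
    private final LinkedHashMap<K, V> cache;
    private int closedCount = 0;

    LruResourcePool(int maxOpen, Function<K, V> opener) {
        this.opener = opener;
        // accessOrder = true keeps entries ordered from least to most recently used.
        this.cache = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                if (size() <= maxOpen) {
                    return false;
                }
                // Caveat: this closes the resource even if a caller still holds it.
                // A real pool would need reference counting before closing.
                closeQuietly(eldest.getValue());
                return true;
            }
        };
    }

    /** Returns the cached resource for the key, opening it on a miss. */
    synchronized V get(K key) {
        V value = cache.get(key);
        if (value == null) {
            value = opener.apply(key);
            cache.put(key, value);   // may evict and close the eldest entry
        }
        return value;
    }

    synchronized int openCount() {
        return cache.size();
    }

    synchronized int closedCount() {
        return closedCount;
    }

    private void closeQuietly(V value) {
        try {
            value.close();
            closedCount++;
        } catch (Exception e) {
            // Sketch only: a real pool should log the failure.
        }
    }
}
```

With `maxOpen` set to 2, requesting keys "a", "b", "a", "c" in that order would leave "a" and "c" open and close "b". The cap is the knob that trades reopen cost against memory; the right value depends on how much memory each open searcher holds, which would have to be measured.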