deleting/updating/identifying a document

2007-07-20 Thread Samuel LEMOINE
Hi everybody! I'm wondering about the way Lucene deals with deleting documents. As far as I know, a document is identified by a document number, but this document number is not reliable in the long term, as it may change on segment merging. The way Lucene deletes documents' data from th

Re: checking existing docs before indexing

2007-07-12 Thread Samuel LEMOINE
Neeraj Gupta wrote: Hi, You can use the updateDocument() method of IndexWriter to update any existing document. It searches for a document matching the Term; if the document exists, it deletes that document. After that, it adds the provided document to the index in both cases, whether documen

Re: Scaling up to several machines with Lucene

2007-06-28 Thread Samuel LEMOINE
Chun Wei Ho wrote: Hi, We are currently running a Tomcat web application serving searches over our Lucene index (10GB) on a single server machine (Dual 3GHz CPU, 4GB RAM). Due to performance issues and to scale up to handle more traffic/search requests, we are getting another server machine.

Re: several existential issues about Lucene's filesystem

2007-06-28 Thread Samuel LEMOINE
Grant Ingersoll wrote: On Jun 28, 2007, at 5:29 AM, Samuel LEMOINE wrote: Thanks for the resources about payloads, I'll have a look at them. About the positions/offsets in .tvf, please tell me if I've understood correctly: The .tvd provides the needed information concerning the occur

Re: several existential issues about Lucene's filesystem

2007-06-28 Thread Samuel LEMOINE
Grant Ingersoll wrote: On Jun 27, 2007, at 8:51 AM, Samuel LEMOINE wrote: Hi everyone! I'm doing bibliographic research on Lucene as an intern at Lingway (which uses Lucene in its main product), and I'm currently studying Lucene's file system. There are several

several existential issues about Lucene's filesystem

2007-06-27 Thread Samuel LEMOINE
Hi everyone! I'm doing bibliographic research on Lucene as an intern at Lingway (which uses Lucene in its main product), and I'm currently studying Lucene's file system. There are several things I don't understand in Lucene's file system, and I thought this was the right place to ask abou