several existential issues about Lucene's filesystem

2007-06-27 Thread Samuel LEMOINE
Hi everyone! I'm doing bibliographical research on Lucene as an intern at Lingway (which uses Lucene in its main product), and I'm currently studying Lucene's file system. There are several things I don't understand in Lucene's file system, and I thought here was the right place to ask abou

Re: several existential issues about Lucene's filesystem

2007-06-28 Thread Samuel LEMOINE
Grant Ingersoll wrote: On Jun 27, 2007, at 8:51 AM, Samuel LEMOINE wrote: Hi everyone! I'm doing bibliographical research on Lucene as an intern at Lingway (which uses Lucene in its main product), and I'm currently studying Lucene's file system. There are several

Re: several existential issues about Lucene's filesystem

2007-06-28 Thread Samuel LEMOINE
Grant Ingersoll wrote: On Jun 28, 2007, at 5:29 AM, Samuel LEMOINE wrote: Thanks for the resources about payloads, I'll have a look at them. About the positions/offsets in .tvf, please tell me if I've understood correctly: The .tvd provides the needed information concerning the occur

Re: Scaling up to several machines with Lucene

2007-06-28 Thread Samuel LEMOINE
Chun Wei Ho wrote: Hi, We are currently running a Tomcat web application serving searches over our Lucene index (10GB) on a single server machine (Dual 3GHz CPU, 4GB RAM). Due to performance issues and to scale up to handle more traffic/search requests, we are getting another server machine.

Re: checking existing docs before indexing

2007-07-12 Thread Samuel LEMOINE
Neeraj Gupta wrote: Hi, You can use the updateDocument() method of IndexWriter to update any existing document. It searches for a document matching the Term; if the document exists, it deletes that document. After that it adds the provided document to the index in both cases, whether documen

deleting/updating/identifying a document

2007-07-20 Thread Samuel LEMOINE
Hi everybody! I'm wondering about the way Lucene deals with deleting documents. As far as I know, a document is identified by a document number, but this document number is not reliable in the long term, as it may change when segments are merged. The way Lucene deletes documents' data from th