Uh, I hate to market it, but... it's in the book. You don't have to wait for it, though, as there is already a Lucene demo that does what you described. I am not sure whether the demo always recreates the index or whether it deletes and re-adds only the new and modified files. If it's the former, you would only need to modify the demo a little bit to check the timestamps of the File objects and compare them to those stored in the index (if they are being stored; if not, you should add a field to hold that data).
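To make the timestamp comparison concrete, here is a rough sketch of the decision logic only. It is not code from the Lucene demo or the book. The class IncrementalIndexPlan and its methods are hypothetical names, and the inIndex map is a stand-in for whatever "last modified" field you store per document in the index; reading that field back out and doing the actual deletes and adds with Lucene is left out.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper (not part of Lucene): decides which files need to be
// added, re-indexed, or removed by comparing timestamps.
class IncrementalIndexPlan {
    final List<String> toAdd = new ArrayList<>();
    final List<String> toUpdate = new ArrayList<>();
    final List<String> toDelete = new ArrayList<>();

    // Recursively records path -> File.lastModified() for every file under dir.
    static void scan(File dir, Map<String, Long> out) {
        File[] children = dir.listFiles();
        if (children == null) {
            return; // not a directory, or not readable
        }
        for (File f : children) {
            if (f.isDirectory()) {
                scan(f, out);
            } else {
                out.put(f.getPath(), f.lastModified());
            }
        }
    }

    // onDisk:  path -> current timestamp of the file
    // inIndex: path -> timestamp stored in the index (assumed to be kept
    //          in a field when the file was last indexed)
    static IncrementalIndexPlan compare(Map<String, Long> onDisk,
                                        Map<String, Long> inIndex) {
        IncrementalIndexPlan plan = new IncrementalIndexPlan();
        // TreeMap only gives a stable, sorted order for the result lists.
        for (Map.Entry<String, Long> e : new TreeMap<>(onDisk).entrySet()) {
            Long indexed = inIndex.get(e.getKey());
            if (indexed == null) {
                plan.toAdd.add(e.getKey());      // new file
            } else if (!indexed.equals(e.getValue())) {
                plan.toUpdate.add(e.getKey());   // modified: delete, then re-add
            }
        }
        for (String path : new TreeMap<>(inIndex).keySet()) {
            if (!onDisk.containsKey(path)) {
                plan.toDelete.add(path);         // file no longer exists
            }
        }
        return plan;
    }

    public static void main(String[] args) {
        Map<String, Long> onDisk = new HashMap<>();
        scan(new File(args.length > 0 ? args[0] : "."), onDisk);
        // An empty map plays the role of a brand new index here.
        IncrementalIndexPlan plan = compare(onDisk, new HashMap<>());
        System.out.println("add: " + plan.toAdd.size()
                + ", update: " + plan.toUpdate.size()
                + ", delete: " + plan.toDelete.size());
    }
}
```

With something like this, every file in toUpdate and toDelete would be removed from the index by its path, and every file in toAdd and toUpdate would be indexed again, so unchanged files are never touched.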
Otis

--- Luke Shannon <[EMAIL PROTECTED]> wrote:

> I am working on debugging an existing Lucene implementation.
>
> Before I started, I built a demo to understand Lucene. In my demo I
> indexed the entire content hierarchy all at once, then optimized this
> index and used it for queries. It was time consuming but very simple.
>
> The code I am currently trying to fix indexes the content hierarchy by
> folder, creating a separate index for each one. Thus it ends up with a
> bunch of indexes. I still don't understand how this works (I am
> assuming they get merged somewhere that I haven't tracked down yet),
> but I have noticed it doesn't always index the right folder. This
> results in the users reporting "inconsistent" behavior in searching
> after they make a change to a document.
>
> To keep things simple I would like to remove all the logic that figures
> out which folder to index and just do them all (usually less than 1000
> files), so I end up with one index.
>
> Would indexing time be the only area I would be losing out in, or is
> there something more to the approach of creating multiple indexes and
> merging them?
>
> What is a good approach I can take to indexing a content hierarchy
> composed primarily of pdf, xsl, doc and xml, where any of these
> documents can be changed several times a day?
>
> Thanks,
>
> Luke

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]