I am working on debugging an existing Lucene implementation. Before I started, I built a demo to understand Lucene. In my demo I indexed the entire content hierarchy all at once, then optimized the index and used it for queries. It was time consuming but very simple.
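For reference, here is roughly what my demo does (a minimal sketch assuming a Lucene 2.x-era API; the paths and the extractText() helper are placeholders for whatever extraction is actually used):

import java.io.File;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;

public class DemoIndexer {
    public static void main(String[] args) throws Exception {
        // one index over the whole hierarchy; both paths are placeholders
        IndexWriter writer = new IndexWriter("/tmp/demo-index", new StandardAnalyzer(), true);
        addTree(writer, new File("/content"));
        writer.optimize();   // merge segments once the full hierarchy is indexed
        writer.close();
    }

    private static void addTree(IndexWriter writer, File file) throws Exception {
        if (file.isDirectory()) {
            for (File child : file.listFiles()) {
                addTree(writer, child);
            }
            return;
        }
        Document doc = new Document();
        doc.add(new Field("path", file.getPath(), Field.Store.YES, Field.Index.UN_TOKENIZED));
        // extractText() stands in for the pdf/xsl/doc/xml text extraction step
        doc.add(new Field("contents", extractText(file), Field.Store.NO, Field.Index.TOKENIZED));
        writer.addDocument(doc);
    }

    private static String extractText(File file) throws Exception {
        return ""; // placeholder for real extraction
    }
}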
The code I am currently trying to fix indexes the content hierarchy by folder, creating a separate index for each one, so it ends up with a bunch of indexes. I still don't understand how this works (I am assuming they get merged somewhere that I haven't tracked down yet), but I have noticed it doesn't always index the right folder. This results in users reporting "inconsistent" behavior in searching after they make a change to a document.

To keep things simple, I would like to remove all the logic that figures out which folder to index and just index them all (usually fewer than 1000 files) so I end up with one index. Would indexing time be the only area I would be losing out on, or is there something more to the approach of creating multiple indexes and merging them? What is a good approach I can take to indexing a content hierarchy composed primarily of pdf, xsl, doc and xml, where any of these documents can be changed several times a day?

Thanks,
Luke