> This is a good approach if the number of total documents doesn't grow
> too much. There's obviously a limit to full index runs at some point.

Well I was actually going to go with incremental indexing, since a full reindex will probably take ~1 hour. We have a relatively fixed size of data, but the data is updated very frequently - almost 100% turnover/day.

> You need to find out where you lose most of the time:

Fair enough, I haven't tried much in the way of profiling yet. I just thought you might have found some Lucene settings that made a big difference for you, or you'd found indexing into a RAMDirectory then dumping it to disk was faster, etc. But it sounds like you're pretty happy with near default settings.

> However, what I wonder: if you have your data in a database anyway, why
> not use the database's indexing features? It seems like Lucene is an
> additional layer on top of your data, which you don't really need.

Our current DB server (running SQL Server) is under enormous strain, partly due to the complex searches that are being performed against it. We've got it pretty heavily tweaked already, so I don't think there's too much room to improve on that front. The idea is to use Lucene to take the searching load off it so it can get on with all the other tasks it has to perform.

The Lucene implementation I'm working on here is just a proof of concept - it may be that we stay with SQL Server in the long run anyway, but Lucene definitely seems to be worth investigating - it has certainly worked well for us on smaller projects.