: And our final queries sent to Lucene are quite complicated. This is because
: we need to conform to a lot of criteria (some are set by users and some are
: internal logic). I don't think we can simplify our queries.
In my experience, your queries can always be made simpler by making your indexing more complex -- that's not necessarily a good tradeoff to make in every situation, but it's usually possible to "denormalize" your index to help decrease query complexity (there's a rough sketch of what I mean at the end of this message).

: We are not sure about implementing using Solr because we crawl only specific
: types of sites and our crawling mechanism has proven to be quite stable. We

Whether or not you want to use Solr is really independent of what kind of crawler you currently have -- you'd still use the same crawler. Depending on how you wanted to leverage Solr, either you'd change your crawler to POST your new documents to Solr instead of writing to the index directly (sketched below as well), or you could keep writing to the index directly and just use Solr to serve the searches (assuming you configure the Solr schema.xml to match up with the fields/analyzers your crawler uses when writing to the index, so it can query on the right fields).

Even if you have no interest in using Solr to manage your index or to search it over HTTP, you might want to take a look at the distribution scripts that come with Solr to provide replication. At their core they are incredibly simple, just taking advantage of rsync, hard links, and properties of the Lucene file format to help minimize the amount of data that needs to go over the wire when you want to index on one box and then replicate that index to ten other boxes to distribute the search load.
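To make the "denormalize your index" idea concrete, here's a minimal sketch -- the "eligible" field name and the boolean that stands in for your internal rules are made up for illustration, but the shape is the point: rules evaluated once at index time collapse into a single required term at query time.

    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.Field;
    import org.apache.lucene.index.Term;
    import org.apache.lucene.search.BooleanClause;
    import org.apache.lucene.search.BooleanQuery;
    import org.apache.lucene.search.Query;
    import org.apache.lucene.search.TermQuery;

    public class DenormalizedFlag {

        // Index time: evaluate your internal rules once and store the
        // verdict as a single untokenized flag on the document.
        static void addEligibilityFlag(Document doc, boolean passesInternalRules) {
            doc.add(new Field("eligible", passesInternalRules ? "yes" : "no",
                    Field.Store.NO, Field.Index.UN_TOKENIZED));
        }

        // Search time: the whole nest of internal clauses collapses into
        // one required TermQuery ANDed with whatever the user asked for.
        static Query withInternalRules(Query userQuery) {
            BooleanQuery combined = new BooleanQuery();
            combined.add(userQuery, BooleanClause.Occur.MUST);
            combined.add(new TermQuery(new Term("eligible", "yes")),
                    BooleanClause.Occur.MUST);
            return combined;
        }
    }

The user-set criteria still get built into the query as before; it's only the fixed internal logic that moves to index time.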
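If you went the POST route, the crawler-side change could be as small as this sketch -- the URL assumes a stock example install on localhost:8983, and the field names are placeholders for whatever your schema.xml defines:

    import java.io.OutputStream;
    import java.net.HttpURLConnection;
    import java.net.URL;

    public class PostToSolr {
        public static void main(String[] args) throws Exception {
            // Solr's XML update message; field names must match schema.xml.
            String doc = "<add><doc>"
                    + "<field name=\"url\">http://example.com/page</field>"
                    + "<field name=\"title\">Example page</field>"
                    + "</doc></add>";

            URL update = new URL("http://localhost:8983/solr/update");
            HttpURLConnection conn = (HttpURLConnection) update.openConnection();
            conn.setRequestMethod("POST");
            conn.setDoOutput(true);
            conn.setRequestProperty("Content-Type", "text/xml; charset=utf-8");
            OutputStream out = conn.getOutputStream();
            out.write(doc.getBytes("UTF-8"));
            out.close();
            System.out.println("Solr responded: " + conn.getResponseCode());
            // Follow up with a <commit/> POST when you want the new
            // documents to become searchable.
        }
    }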
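And to give a feel for what the distribution scripts do under the hood, here's the gist translated into code -- the real scripts are shell, and the paths and slave hostname here are made up:

    import java.nio.file.DirectoryStream;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.Paths;

    public class SnapshotAndRsync {
        public static void main(String[] args) throws Exception {
            Path index = Paths.get("/var/data/index");       // master's live index
            Path snap = Paths.get("/var/data/snapshot.001"); // snapshot to publish

            // A snapshot is just a directory of hard links: cheap to make,
            // and consistent because Lucene never modifies a segment file
            // once it has been written.
            Files.createDirectory(snap);
            DirectoryStream<Path> files = Files.newDirectoryStream(index);
            for (Path f : files) {
                Files.createLink(snap.resolve(f.getFileName()), f);
            }
            files.close();

            // rsync then only transfers the files the slave is missing.
            new ProcessBuilder("rsync", "-a", snap.toString() + "/",
                    "slave1:/var/data/index/").inheritIO().start().waitFor();
        }
    }

-Hoss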