Greetings,
I'm creating an application that requires the indexing of millions of documents on behalf of a large group of users, and was hoping to get an opinion on whether I should use one index per user or one index per day. My application will have to handle the following: - the indexing of about 1 million 5K documents per day, with each document containing about 5 fields - expiration of documents, since after a while, my hard drive would run out of room - queries that consist of boolean expressions (e.g., the body field contains "a" AND "b", and the title field contains "c"), as well as ranges (e.g., the document needs to have been indexed between 2/25/07 10:00 am and 2/28/07 9:00 pm) - permissions; in other words, user A might be able to search on documents X and Y, but user B might be able to search on documents Y and Z. - up to 1,000 users So, I was considering the following: 1) Using one index per user This would entail creating and using up to 1,000 indices. Document Y in the example above would have to be duplicated. Expiration is performed via IndexWriter.deleteDocuments. The advantage here is that querying should be reasonably quick, because each index would only contain tens of thousands of documents, instead of millions. The disadvantages: I'm concerned about the "too many open files" error, and I'm also concerned about the performance of deleteDocuments. 2) Using one index per day Each day, I create a new index. Again, document Y in the example above would have to be duplicated (is there any way around this?) The advantage here is that expiring documents means simply deleting the index corresponding to a particular day. The disadvantage is the query performance, since the queries, which are already very complex, would have to be performed using MultiSearcher (if expiration is after 10 days, that's 10 indices to search across). Tough to know for sure which option is better without testing, but does anyone have a gut reaction? Any advice would be greatly appreciated! Thanks, Ariel ____________________________________________________________________________________ Need Mail bonding? Go to the Yahoo! Mail Q&A for great tips from Yahoo! Answers users. http://answers.yahoo.com/dir/?link=list&sid=396546091 --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]