Re: Index Concurrency
On 5/10/07, joestelmach [EMAIL PROTECTED] wrote:
> > Yes, coordination between the main index searcher, the index writer,
> > and the index reader needed to delete other documents.
>
> Can you point me to any documentation/code that describes this
> implementation?

Look at SolrCore.getSearcher() and DirectUpdateHandler2.

-Yonik
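For readers who want the general shape of that coordination before diving into SolrCore.getSearcher() and DirectUpdateHandler2, here is a minimal, hypothetical Python sketch (not Solr's actual code, and every name in it is invented for illustration): updates and deletes are serialized under a single write lock, while searches are served from an immutable snapshot that is swapped in atomically on commit, so readers never observe a half-applied update.

```python
import threading

class SearcherManager:
    """Toy model of searcher/writer coordination (illustrative only)."""

    def __init__(self):
        self._write_lock = threading.Lock()  # one writer at a time
        self._docs = {}                      # live index state: id -> doc
        self._searcher = {}                  # immutable snapshot served to readers

    def update(self, doc_id, doc):
        with self._write_lock:               # serialize writers, like one IndexWriter
            self._docs[doc_id] = doc

    def delete(self, doc_id):
        with self._write_lock:               # deletes go through the same lock
            self._docs.pop(doc_id, None)

    def commit(self):
        with self._write_lock:               # publish a fresh snapshot atomically
            self._searcher = dict(self._docs)

    def search(self, doc_id):
        return self._searcher.get(doc_id)    # readers hit the snapshot, no lock needed
```

In this toy model, an update is invisible to `search()` until `commit()` swaps in a new snapshot, which mirrors the way a Solr commit opens a new searcher.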
Re: Index Concurrency
Though, isn't there a recent patch in JIRA to allow multiple indices under a single Solr instance?

Otis
--
Simpy -- http://www.simpy.com/ - Tag - Search - Share

----- Original Message -----
From: Yonik Seeley [EMAIL PROTECTED]
To: solr-user@lucene.apache.org
Sent: Wednesday, May 9, 2007 6:32:33 PM
Subject: Re: Index Concurrency

On 5/9/07, joestelmach [EMAIL PROTECTED] wrote:
> My first intuition is to give each user their own index. My thinking
> here is that querying would be faster (since each user's index would be
> much smaller than one big index), and, more importantly, that I would
> dodge any concurrency issues stemming from multiple threads trying to
> update the same index simultaneously. I realize that Lucene implements
> a locking mechanism to protect against concurrent access, but I seem to
> hit the lock access timeout quite easily with only a couple of threads.
> After looking at Solr, I would really like to take advantage of the many
> features it adds to Lucene, but it doesn't look like I'll be able to
> achieve multiple indexes.

No, not currently. Start your implementation with just a single index... unless it is very large, it will likely be fast enough. Solr also handles all the concurrency issues, and you should never hit a lock access timeout when updating from multiple threads.

-Yonik
Re: Index Concurrency
> > Yes, coordination between the main index searcher, the index writer,
> > and the index reader needed to delete other documents.

Can you point me to any documentation/code that describes this implementation?

> That's weird... I've never seen that. The Lucene write lock is only
> obtained when the IndexWriter is created. Can you post the relevant
> part of the log file where the exception happens?

After doing some more testing, I believe it was a stale lock file that was causing my lock issues yesterday - sorry for the false alarm :)

> Also, unless you have at least 6 CPU cores or so, you are unlikely to
> see greater throughput with 10 threads. If you add multiple documents
> per HTTP POST (such that HTTP latency is minimized), the best setting
> would probably be nThreads == nCores. For a single doc per POST, more
> threads will serve to cover the latency and keep Solr busy.

I agree with your thinking here. My requirement for a large number of threads is somewhat of an artifact of my current system design. I'm trying not to serialize the system's processing at the point of indexing.

--
View this message in context: http://www.nabble.com/Index-Concurrency-tf3718634.html#a10424207
Sent from the Solr - User mailing list archive at Nabble.com.
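A stale lock file like the one described above can be cleared by hand once you are certain no writer process is still running. Below is a hypothetical helper (not part of Lucene or Solr; the function name is invented), relying only on the fact that "write.lock" is Lucene's conventional lock file name:

```python
import os
import tempfile

def clear_stale_write_lock(index_dir, lock_name="write.lock"):
    """Hypothetical helper: remove a leftover Lucene write lock file.
    Only safe when no writer process is still running against the index."""
    lock_path = os.path.join(index_dir, lock_name)
    if os.path.exists(lock_path):
        os.remove(lock_path)   # a live writer would still need this file
        return True
    return False

# Demo against a throwaway directory standing in for an index directory.
index_dir = tempfile.mkdtemp()
open(os.path.join(index_dir, "write.lock"), "w").close()
removed = clear_stale_write_lock(index_dir)        # True: lock file deleted
removed_again = clear_stale_write_lock(index_dir)  # False: nothing left to remove
```

Deleting the lock while a writer is actually alive would invite index corruption, which is why the check-then-remove is only appropriate after a crash.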
Re: Index Concurrency
On 5/9/07, joestelmach [EMAIL PROTECTED] wrote:
> My first intuition is to give each user their own index. My thinking
> here is that querying would be faster (since each user's index would be
> much smaller than one big index), and, more importantly, that I would
> dodge any concurrency issues stemming from multiple threads trying to
> update the same index simultaneously. I realize that Lucene implements
> a locking mechanism to protect against concurrent access, but I seem to
> hit the lock access timeout quite easily with only a couple of threads.
> After looking at Solr, I would really like to take advantage of the many
> features it adds to Lucene, but it doesn't look like I'll be able to
> achieve multiple indexes.

No, not currently. Start your implementation with just a single index... unless it is very large, it will likely be fast enough. Solr also handles all the concurrency issues, and you should never hit a lock access timeout when updating from multiple threads.

-Yonik
Re: Index Concurrency
Yonik,

Thanks for your fast reply.

> No, not currently. Start your implementation with just a single
> index... unless it is very large, it will likely be fast enough.

My index will get quite large.

> Solr also handles all the concurrency issues, and you should never
> hit a lock access timeout when updating from multiple threads.

Does Solr provide any additional concurrency control over what Lucene provides? In my simple testing of indexing 2,000 messages, Solr would issue lock access timeouts with as few as 10 threads. Running all 2,000 messages through sequentially yields no problems at all. Actually, I'm able to churn through over 100,000 messages when no threads are involved. Am I missing some concurrency settings?

Thanks,
Joe

--
View this message in context: http://www.nabble.com/Index-Concurrency-tf3718634.html#a10406382
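The kind of test Joe describes, fanning a fixed set of documents across a small pool of worker threads, can be sketched as follows. This is an illustrative harness only: `index_one` is a hypothetical stand-in for the real per-document indexing call (in Joe's setup, an HTTP POST to Solr's update handler), not anything from the Solr client API.

```python
from concurrent.futures import ThreadPoolExecutor
import threading

# Hypothetical stand-in for the real per-document indexing call.
indexed = []
indexed_lock = threading.Lock()

def index_one(doc):
    with indexed_lock:          # keep the shared list consistent across threads
        indexed.append(doc)

def index_all(docs, n_threads=10):
    # Fan the documents out across a fixed-size thread pool and block
    # until every one has been processed.
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        list(pool.map(index_one, docs))

index_all(["msg-%d" % i for i in range(2000)])
```

With 10 threads over 2,000 documents, every document should arrive exactly once; whether it arrives any faster than the sequential run depends on how much of the per-document time is network latency rather than CPU, which is the point Yonik makes in his reply.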
Re: Index Concurrency
On 5/9/07, joestelmach [EMAIL PROTECTED] wrote:
> Does Solr provide any additional concurrency control over what Lucene
> provides?

Yes, coordination between the main index searcher, the index writer, and the index reader needed to delete other documents.

> In my simple testing of indexing 2,000 messages, Solr would issue lock
> access timeouts with as few as 10 threads.

That's weird... I've never seen that. The Lucene write lock is only obtained when the IndexWriter is created. Can you post the relevant part of the log file where the exception happens?

Also, unless you have at least 6 CPU cores or so, you are unlikely to see greater throughput with 10 threads. If you add multiple documents per HTTP POST (such that HTTP latency is minimized), the best setting would probably be nThreads == nCores. For a single doc per POST, more threads will serve to cover the latency and keep Solr busy.

-Yonik
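The batching advice above can be sketched in client code. The `<add>`/`<doc>`/`<field>` XML shape is Solr's standard update message format of this era; the helper function name and the field names are hypothetical, and the thread-count line just restates the nThreads == nCores rule of thumb for the batched case:

```python
import os
from xml.sax.saxutils import escape

def build_add_payload(docs):
    """Build one <add> body carrying many documents, so a single HTTP
    POST amortizes its latency over the whole batch (sketch only)."""
    parts = ["<add>"]
    for doc in docs:
        parts.append("<doc>")
        for name, value in doc.items():
            # escape() handles &, <, > in field names and values
            parts.append('<field name="%s">%s</field>'
                         % (escape(name), escape(str(value))))
        parts.append("</doc>")
    parts.append("</add>")
    return "".join(parts)

# With batched POSTs, size the client thread pool to the CPU count,
# per the advice above; with one doc per POST, use more threads to
# cover the HTTP latency.
n_threads = os.cpu_count() or 1
```

Posting, say, 100 documents per request instead of one turns 100 round trips into a single round trip, which is why the optimal thread count drops toward the number of cores.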