Re: Index Concurrency

2007-05-11 Thread Yonik Seeley

On 5/10/07, joestelmach [EMAIL PROTECTED] wrote:

>> Yes, coordination between the main index searcher, the index writer,
>> and the index reader needed to delete other documents.

> Can you point me to any documentation/code that describes this
> implementation?


Look at SolrCore.getSearcher() and DirectUpdateHandler2.

-Yonik


Re: Index Concurrency

2007-05-10 Thread Otis Gospodnetic
Though, isn't there a recent patch in JIRA that allows multiple indices
under a single Solr instance?

Otis

----- Original Message -----
From: Yonik Seeley [EMAIL PROTECTED]
To: solr-user@lucene.apache.org
Sent: Wednesday, May 9, 2007 6:32:33 PM
Subject: Re: Index Concurrency

On 5/9/07, joestelmach [EMAIL PROTECTED] wrote:
> My first intuition is to give each user their own index. My thinking here is
> that querying would be faster (since each user's index would be much smaller
> than one big index) and, more importantly, that I would dodge any
> concurrency issues stemming from multiple threads trying to update the same
> index simultaneously.  I realize that Lucene implements a locking mechanism
> to protect against concurrent access, but I seem to hit the lock access
> timeout quite easily with only a couple of threads.
>
> After looking at Solr, I would really like to take advantage of the many
> features it adds to Lucene, but it doesn't look like I'll be able to achieve
> multiple indexes.

No, not currently.  Start your implementation with just a single
index... unless it is very large, it will likely be fast enough.

Solr also handles all the concurrency issues, and you should never hit
lock access timeout when updating from multiple threads.

-Yonik





Re: Index Concurrency

2007-05-10 Thread joestelmach


> Yes, coordination between the main index searcher, the index writer,
> and the index reader needed to delete other documents.

Can you point me to any documentation/code that describes this
implementation?

> That's weird... I've never seen that.
> The Lucene write lock is only obtained when the IndexWriter is created.
> Can you post the relevant part of the log file where the exception
> happens?

After doing some more testing, I believe it was a stale lock file that was
causing me to have these lock issues yesterday - sorry for the false alarm
:)
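
For anyone who lands here with the same symptom: a lock file left behind by a crashed or killed process can keep producing lock timeouts until it is removed. Below is a minimal Python sketch, not anything that ships with Solr or Lucene, that lists lock files which have not been modified for a while. It assumes the lock file name ends in `write.lock`; the actual name and location depend on the Lucene version and configuration. `find_old_lock_files` and `min_age_seconds` are hypothetical names chosen for this example.

```python
import os
import time


def find_old_lock_files(directory, min_age_seconds=3600, now=None):
    """Return paths of files whose name ends in 'write.lock' and whose
    last modification is at least min_age_seconds in the past.

    The file-name pattern is an assumption; check your own index or temp
    directory to see what your Lucene version actually creates.
    """
    now = time.time() if now is None else now
    old_locks = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if name.endswith("write.lock") and os.path.isfile(path):
            if now - os.path.getmtime(path) >= min_age_seconds:
                old_locks.append(path)
    return old_locks
```

Age is only a hint: a long-running, healthy process may hold an old lock file too, so it is probably safest to delete one only after confirming that no process is still using the index.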

> Also, unless you have at least 6 CPU cores or so, you are unlikely to
> see greater throughput with 10 threads.  If you add multiple documents
> per HTTP-POST (such that HTTP latency is minimized), the best setting
> would probably be nThreads == nCores.  For a single doc per POST, more
> threads will serve to cover the latency and keep Solr busy.

I agree with your thinking here.  My requirement for a large number of
threads is somewhat of an artifact of my current system design.  I'm trying
not to serialize the system's processing at the point of indexing.
-- 
View this message in context: 
http://www.nabble.com/Index-Concurrency-tf3718634.html#a10424207
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Index Concurrency

2007-05-09 Thread Yonik Seeley

On 5/9/07, joestelmach [EMAIL PROTECTED] wrote:

> My first intuition is to give each user their own index. My thinking here is
> that querying would be faster (since each user's index would be much smaller
> than one big index) and, more importantly, that I would dodge any
> concurrency issues stemming from multiple threads trying to update the same
> index simultaneously.  I realize that Lucene implements a locking mechanism
> to protect against concurrent access, but I seem to hit the lock access
> timeout quite easily with only a couple of threads.
>
> After looking at Solr, I would really like to take advantage of the many
> features it adds to Lucene, but it doesn't look like I'll be able to achieve
> multiple indexes.


No, not currently.  Start your implementation with just a single
index... unless it is very large, it will likely be fast enough.

Solr also handles all the concurrency issues, and you should never hit
lock access timeout when updating from multiple threads.

-Yonik


Re: Index Concurrency

2007-05-09 Thread joestelmach

Yonik,

Thanks for your fast reply.

> No, not currently.  Start your implementation with just a single
> index... unless it is very large, it will likely be fast enough.

My index will get quite large.

> Solr also handles all the concurrency issues, and you should never hit
> lock access timeout when updating from multiple threads.

Does Solr provide any additional concurrency control over what Lucene
provides?  In my simple testing of indexing 2,000 messages, Solr would issue
lock access timeouts with as few as 10 threads.  Running all 2,000
messages through sequentially yields no problems at all.  Actually, I'm
able to churn through over 100,000 messages when no threads are involved.  Am I
missing some concurrency settings?

Thanks,
Joe





Re: Index Concurrency

2007-05-09 Thread Yonik Seeley

On 5/9/07, joestelmach [EMAIL PROTECTED] wrote:

> Does Solr provide any additional concurrency control over what Lucene
> provides?


Yes, coordination between the main index searcher, the index writer,
and the index reader needed to delete other documents.


> In my simple testing of indexing 2,000 messages, Solr would issue
> lock access timeouts with as few as 10 threads.


That's weird... I've never seen that.
The Lucene write lock is only obtained when the IndexWriter is created.
Can you post the relevant part of the log file where the exception happens?
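
To make the point about when the write lock is taken more concrete, here is a toy Python model. It is not Solr's or Lucene's code, and every name in it (`write_lock`, `index_with_own_writer`, `SharedWriter`, `run_threads`) is made up for illustration. It contrasts threads that each "open a writer" per document, and so compete for the write lock with a timeout, against threads that share one long-lived writer which took the lock once at creation.

```python
import threading
import time

# Stands in for the index write lock (an assumption of this toy model).
write_lock = threading.Lock()


def index_with_own_writer(doc, index, timeout=0.05):
    """Each call 'opens a writer': it must win the write lock or time out."""
    if not write_lock.acquire(timeout=timeout):
        raise TimeoutError("lock obtain timed out")
    try:
        time.sleep(0.02)  # simulated cost of opening the writer and adding
        index.append(doc)
    finally:
        write_lock.release()


class SharedWriter:
    """One long-lived writer: the write lock is taken once, at creation."""

    def __init__(self, index):
        if not write_lock.acquire(timeout=1.0):
            raise TimeoutError("lock obtain timed out")
        self._index = index
        self._mutex = threading.Lock()  # in-process coordination only

    def add(self, doc):
        with self._mutex:  # waits its turn; there is no timeout to hit
            self._index.append(doc)

    def close(self):
        write_lock.release()


def run_threads(target, n):
    """Call target(i) from n threads; return the timeouts that were raised."""
    errors = []

    def worker(i):
        try:
            target(i)
        except TimeoutError as exc:
            errors.append(exc)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return errors
```

Calling `run_threads(writer.add, 10)` on a `SharedWriter` should finish without any timeouts, while `run_threads(lambda i: index_with_own_writer(i, index), 10)` will typically report some, depending on timing. That is roughly the difference between driving Lucene directly from several threads that each open their own writer and letting a single coordinating layer own the writer.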

Also, unless you have at least 6 CPU cores or so, you are unlikely to
see greater throughput with 10 threads.  If you add multiple documents
per HTTP-POST (such that HTTP latency is minimized), the best setting
would probably be nThreads == nCores.  For a single doc per POST, more
threads will serve to cover the latency and keep Solr busy.
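
The batching advice above could be sketched in Python roughly as follows. The `<add><doc><field name="...">` layout is Solr's XML update message format; everything else is an assumption made for this example. In particular, `post` is a hypothetical callable that sends one payload to your Solr update URL, and `batch_size` and `n_threads` are illustrative parameters, with the thread count defaulting to the number of cores as suggested.

```python
import os
from concurrent.futures import ThreadPoolExecutor
from xml.sax.saxutils import escape, quoteattr


def build_add_payload(docs):
    """Render a list of {field: value} dicts as one Solr <add> XML message."""
    parts = ["<add>"]
    for doc in docs:
        parts.append("<doc>")
        for name, value in doc.items():
            parts.append(
                "<field name=%s>%s</field>" % (quoteattr(name), escape(str(value)))
            )
        parts.append("</doc>")
    parts.append("</add>")
    return "".join(parts)


def chunked(items, size):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def index_all(docs, post, batch_size=100, n_threads=None):
    """Send docs in batches, one POST per batch, from a small thread pool.

    `post` is a hypothetical function (payload -> result) supplied by the
    caller; this sketch does no network I/O itself.
    """
    n_threads = n_threads or os.cpu_count() or 1  # nThreads == nCores
    payloads = [build_add_payload(batch) for batch in chunked(docs, batch_size)]
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        return list(pool.map(post, payloads))
```

With one document per POST instead (`batch_size=1`), a larger `n_threads` than the core count may help hide HTTP latency, as noted above.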

-Yonik