[jira] Commented: (LUCENE-2840) Multi-Threading in IndexSearcher (after removal of MultiSearcher and ParallelMultiSearcher)

2011-01-09 Thread Earwin Burrfoot (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12979276#action_12979276
 ] 

Earwin Burrfoot commented on LUCENE-2840:
-

bq. But doesn't that mean that an app w/ rare queries but each query is massive 
fails to use all available concurrency?
Yes. But that's not my case. And likely not someone else's.

I think if you want to be super-generic, it's better to defer exact threading 
to the user, instead of doing a one-size-fits-all solution. Else you risk 
conjuring another ConcurrentMergeScheduler.
While we're at it, we can throw in some sample implementation, which can 
satisfy some of the users, but not everyone.

 Multi-Threading in IndexSearcher (after removal of MultiSearcher and 
 ParallelMultiSearcher)
 ---

 Key: LUCENE-2840
 URL: https://issues.apache.org/jira/browse/LUCENE-2840
 Project: Lucene - Java
  Issue Type: Sub-task
  Components: Search
Reporter: Uwe Schindler
Priority: Minor
 Fix For: 4.0


 Spin-off from parent issue:
 {quote}
 We should discuss how many threads should be spawned. If you have an 
 index with many segments, even small ones, I think only the larger segments 
 should get separate threads; all others should be handled sequentially. So 
 maybe add a maxThreads count, then sort the IndexReaders by maxDoc and then 
 only spawn maxThreads-1 threads for the bigger readers and then one 
 additional thread for the rest?
 {quote}
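
 The grouping proposed in the quote above could be sketched roughly as follows. 
 This is only an illustration under assumptions: segments are represented by 
 their maxDoc values alone, and the class and method names (SegmentThreadPlan, 
 plan) are hypothetical, not part of Lucene.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not a Lucene class: decides which segments share a thread.
class SegmentThreadPlan {

    // Each inner list is the work for one thread; segments are given by maxDoc only.
    static List<List<Integer>> plan(List<Integer> maxDocs, int maxThreads) {
        if (maxThreads < 1) {
            throw new IllegalArgumentException("maxThreads must be >= 1");
        }
        // Sort the readers by maxDoc, biggest first.
        List<Integer> sorted = new ArrayList<>(maxDocs);
        sorted.sort(Collections.reverseOrder());

        List<List<Integer>> groups = new ArrayList<>();
        // The biggest maxThreads-1 segments each get a dedicated thread.
        int dedicated = Math.min(maxThreads - 1, sorted.size());
        for (int i = 0; i < dedicated; i++) {
            groups.add(Collections.singletonList(sorted.get(i)));
        }
        // Everything that is left is handled sequentially by one additional thread.
        if (dedicated < sorted.size()) {
            groups.add(new ArrayList<>(sorted.subList(dedicated, sorted.size())));
        }
        return groups;
    }

    public static void main(String[] args) {
        // prints [[1000], [500], [50, 20, 10]]
        System.out.println(plan(Arrays.asList(50, 1000, 20, 500, 10), 3));
    }
}
```

 With maxThreads = 1 the plan degenerates to a single sequential group, which 
 matches a plain single-threaded search.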

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Commented: (LUCENE-2840) Multi-Threading in IndexSearcher (after removal of MultiSearcher and ParallelMultiSearcher)

2011-01-09 Thread Doron Cohen (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12979284#action_12979284
 ] 

Doron Cohen commented on LUCENE-2840:
-

Is it possible that with this, searching a large optimized index (single 
segment) might be slower than searching an unoptimized index of the same size, 
since the latter enjoys concurrency? If so, is it too wild for more than one 
thread to handle that single segment?
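
One way to picture several threads sharing a single segment is to split its 
doc ID range [0, maxDoc) into contiguous slices, one per thread. The sketch 
below only shows that partitioning arithmetic; it is an assumption about what 
such a scheme might look like, and DocRangeSlices is a hypothetical name, not 
something the issue or Lucene defines.

```java
// Hypothetical helper: splits one segment's doc ID space into per-thread slices.
class DocRangeSlices {

    // Returns up to `parts` half-open ranges {start, end} that cover [0, maxDoc).
    static int[][] slices(int maxDoc, int parts) {
        if (maxDoc < 0 || parts < 1) {
            throw new IllegalArgumentException("maxDoc must be >= 0 and parts >= 1");
        }
        // Never create more slices than there are documents (but at least one).
        int n = Math.min(parts, Math.max(1, maxDoc));
        int[][] out = new int[n][2];
        int base = maxDoc / n;   // minimum slice length
        int extra = maxDoc % n;  // the first `extra` slices get one more document
        int start = 0;
        for (int i = 0; i < n; i++) {
            int len = base + (i < extra ? 1 : 0);
            out[i][0] = start;
            out[i][1] = start + len;
            start += len;
        }
        return out;
    }

    public static void main(String[] args) {
        // prints [[0, 4], [4, 7], [7, 10]]
        System.out.println(java.util.Arrays.deepToString(slices(10, 3)));
    }
}
```

Each slice could then be searched by its own thread and the per-slice results 
merged, at the cost of repeating per-segment setup work in every slice.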




[jira] Commented: (LUCENE-2840) Multi-Threading in IndexSearcher (after removal of MultiSearcher and ParallelMultiSearcher)

2011-01-09 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12979293#action_12979293
 ] 

Michael McCandless commented on LUCENE-2840:


bq. I think if you want to be super-generic, it's better to defer exact 
threading to the user, instead of doing a one-size-fits-all solution. Else you 
risk conjuring another ConcurrentMergeScheduler.

I think something like CMS (basically a custom ES w/ proper thread 
prio/scheduling) will be necessary here.

Until Java can schedule threads the way an OS schedules processes we'll need to 
emulate it ourselves.

You want long-running queries (or merges) to be gracefully down-prioritized so 
that new/fast queries (merges) finish quickly.

And you want searches (merges) to use the allowed concurrency fully.




[jira] Commented: (LUCENE-2840) Multi-Threading in IndexSearcher (after removal of MultiSearcher and ParallelMultiSearcher)

2011-01-09 Thread Earwin Burrfoot (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12979306#action_12979306
 ] 

Earwin Burrfoot commented on LUCENE-2840:
-

A lot of fork-join type frameworks don't even care. Even though scheduling 
threads is something people supposedly use them for.
Why? I guess that's due to low yield/cost ratio.
You frequently quote "progress, not perfection" in relation to the code, but 
why don't we apply this same principle to our threading guarantees?
I don't want to use allowed concurrency fully. That's not realistic. I want 85% 
of it. That's already a huge leap ahead of single-threaded searches.





[jira] Commented: (LUCENE-2840) Multi-Threading in IndexSearcher (after removal of MultiSearcher and ParallelMultiSearcher)

2011-01-09 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12979337#action_12979337
 ] 

Michael McCandless commented on LUCENE-2840:


bq. You frequently quote progress, not perfection in relation to the code, 
but why don't we apply this same principle to our threading guarantees?

Oh we should definitely apply progress not perfection here -- in fact we 
already are: for starters (today), we bind concurrency to segments (so eg an 
optimized index has no concurrency), and we just use an ES (punt this thread 
scheduling problem to the caller).  This is better than nothing, but not good 
enough -- we can do better.

There's another quote that applies here: big dreams, small steps.  My comment 
above is dreaming but when it comes time to actually get the real work done / 
making progress towards that dream, of course we take baby steps / progress not 
perfection.

Design discussions should start w/ the big dreams but then once you've got a 
rough sense of where you want to get to in the future you shift back to the 
baby steps you do today, in the direction of that future goal.

Maybe I should wrap my comments in /dream tags and /babysteps tags!




[jira] Commented: (LUCENE-2840) Multi-Threading in IndexSearcher (after removal of MultiSearcher and ParallelMultiSearcher)

2011-01-03 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12976928#action_12976928
 ] 

Michael McCandless commented on LUCENE-2840:


bq. Using fewer threads per-search than total available is a precaution against 
biggie searches blocking fast ones.

But doesn't that mean that an app w/ rare queries but each query is massive 
fails to use all available concurrency?




[jira] Commented: (LUCENE-2840) Multi-Threading in IndexSearcher (after removal of MultiSearcher and ParallelMultiSearcher)

2010-12-30 Thread Earwin Burrfoot (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12976027#action_12976027
 ] 

Earwin Burrfoot commented on LUCENE-2840:
-

I use the following scheme:
* There is a fixed pool of threads shared by all searches, which limits total 
concurrency.
* Each new search takes at most a fixed number of threads from this pool 
(say, 2-3 of 8 in my setup),
* and these threads churn through segments as through a queue (in maxDoc order, 
but I think even that is unnecessary).

There is no special smart binding between threads and segments (e.g. 1 thread 
for each biggie, 1 thread for all of the small ones). That means simpler code 
and zero possibility of stalling when there are threads to run and segments to 
search, but the binding policy does not connect them.
Using fewer threads per search than the total available is a precaution against 
biggie searches blocking fast ones.
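
A minimal sketch of the scheme described above, using only java.util.concurrent. 
It is an illustration under assumptions, not the actual implementation: Segment 
and searchSegment are hypothetical stand-ins for a per-segment reader and the 
real per-segment search, and the hit count is fake.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class SegmentQueueSearch {

    // Hypothetical stand-in for a segment reader; only maxDoc matters here.
    static final class Segment {
        final String name;
        final int maxDoc;
        Segment(String name, int maxDoc) { this.name = name; this.maxDoc = maxDoc; }
    }

    // Placeholder for the real per-segment search: returns a fake hit count.
    static long searchSegment(Segment s) {
        return s.maxDoc / 10;
    }

    // Runs one search with at most threadsPerSearch workers taken from the shared pool.
    static long search(ExecutorService sharedPool, List<Segment> segments, int threadsPerSearch)
            throws InterruptedException, ExecutionException {
        // Queue the segments in maxDoc order, biggest first (the order is optional).
        List<Segment> ordered = new ArrayList<>(segments);
        ordered.sort(Comparator.comparingInt((Segment s) -> s.maxDoc).reversed());
        final Queue<Segment> queue = new ConcurrentLinkedQueue<>(ordered);

        // No binding between threads and segments: every worker polls until the queue is empty.
        Callable<Long> worker = () -> {
            long hits = 0;
            Segment s;
            while ((s = queue.poll()) != null) {
                hits += searchSegment(s);
            }
            return hits;
        };

        int workers = Math.max(1, Math.min(threadsPerSearch, ordered.size()));
        List<Future<Long>> futures = new ArrayList<>();
        for (int i = 0; i < workers; i++) {
            futures.add(sharedPool.submit(worker));
        }
        long total = 0;
        for (Future<Long> f : futures) {
            total += f.get();
        }
        return total;
    }

    public static void main(String[] args) throws Exception {
        // One fixed pool shared by all searches limits total concurrency.
        ExecutorService pool = Executors.newFixedThreadPool(8);
        try {
            List<Segment> segments = Arrays.asList(
                new Segment("_0", 1000), new Segment("_1", 500),
                new Segment("_2", 50), new Segment("_3", 20));
            System.out.println(search(pool, segments, 3)); // prints 157
        } finally {
            pool.shutdown();
        }
    }
}
```

Because workers never wait on each other, a search still completes if the pool 
is busy and only one of its workers gets to run; it just runs with less 
parallelism.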
