[
https://issues.apache.org/jira/browse/LUCENE-5299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14185751#comment-14185751
]
Shikhar Bhushan commented on LUCENE-5299:
-----------------------------------------
Just an update that the code rebased against recent trunk lives at
https://github.com/shikhar/lucene-solr/tree/LUCENE-5299. I've made various
tweaks, like being able to throttle per-request parallelism in
{{ParallelSearchStrategy}}.
Here are luceneutil bench numbers for that branch, run with:
+ a hacked IndexSearcher constructor that uses {{ParallelSearchStrategy(new ForkJoinPool(128), 8)}}
+ luceneutil constants.py SEARCH_NUM_THREADS = 16
The comparison is against trunk, on a 32-core (with HT) Sandy Bridge server, with source
{{wikimedium500k}}:
{noformat}
Report after iter 19:
Task                QPS baseline   StdDev    QPS parcol   StdDev  Pct diff
Fuzzy1                     81.91  (43.2%)         52.96  (39.7%)    -35.3% ( -82% -   83%)
LowTerm                  2550.11  (11.9%)       1927.28   (5.6%)    -24.4% ( -37% -   -7%)
Respell                    43.02  (39.4%)         35.23  (31.5%)    -18.1% ( -63% -   87%)
Fuzzy2                     19.32  (25.1%)         16.40  (34.8%)    -15.1% ( -59% -   59%)
MedTerm                  1679.37  (12.2%)       1743.27   (8.6%)      3.8% ( -15% -   28%)
PKLookup                  221.58   (8.3%)        257.36  (13.2%)     16.1% (  -4% -   41%)
AndHighLow               1027.99  (11.6%)       1278.39  (15.9%)     24.4% (  -2% -   58%)
AndHighMed                741.50  (10.0%)       1198.04  (27.5%)     61.6% (  21% -  110%)
MedPhrase                 709.04  (11.6%)       1203.02  (24.3%)     69.7% (  30% -  119%)
LowSpanNear               601.13  (16.9%)       1127.30  (16.7%)     87.5% (  46% -  145%)
LowSloppyPhrase           554.87  (10.8%)       1130.25  (30.5%)    103.7% (  56% -  162%)
OrHighMed                 408.55  (10.4%)        977.56  (20.1%)    139.3% (  98% -  189%)
LowPhrase                 364.36  (10.8%)        893.27  (41.0%)    145.2% (  84% -  220%)
OrHighLow                 355.78  (12.7%)        893.63  (19.6%)    151.2% ( 105% -  210%)
AndHighHigh               390.73  (10.3%)       1004.70  (24.3%)    157.1% ( 111% -  213%)
HighTerm                  399.01  (11.8%)       1067.67  (12.1%)    167.6% ( 128% -  217%)
Wildcard                  754.76  (11.6%)       2067.96  (28.0%)    174.0% ( 120% -  241%)
HighSpanNear              153.57  (14.8%)        463.54  (24.3%)    201.8% ( 141% -  282%)
OrHighHigh                212.16  (12.4%)        665.56  (28.2%)    213.7% ( 154% -  290%)
HighPhrase                170.49  (13.1%)        547.72  (17.3%)    221.3% ( 168% -  289%)
HighSloppyPhrase           66.91  (10.1%)        219.59  (12.0%)    228.2% ( 187% -  278%)
MedSloppyPhrase           128.73  (12.5%)        425.67  (20.3%)    230.7% ( 175% -  300%)
MedSpanNear               130.31  (10.7%)        436.12  (18.2%)    234.7% ( 185% -  295%)
Prefix3                   166.91  (14.9%)        652.64  (26.7%)    291.0% ( 217% -  390%)
IntNRQ                    110.73  (15.0%)        467.72  (33.6%)    322.4% ( 238% -  436%)
{noformat}
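For anyone wondering what throttling per-request parallelism on a shared pool might look like, here is a minimal Java sketch. It is not the {{ParallelSearchStrategy}} code from the branch: the class name {{ThrottledRunner}}, its constructor arguments, and {{runAll}} are all hypothetical, and the per-segment work is modelled as plain {{Supplier}} tasks. The idea shown is one possible approach: a request submits at most N worker tasks, and each worker claims per-segment tasks from a shared counter, so no request ever occupies more than N pool threads and nothing blocks inside the pool.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReferenceArray;
import java.util.function.Supplier;

/** Hypothetical helper; an illustration only, not the ParallelSearchStrategy from the branch. */
final class ThrottledRunner {
    private final ExecutorService pool;  // shared by all requests
    private final int maxPerRequest;     // cap on concurrently running tasks for one request

    ThrottledRunner(ExecutorService pool, int maxPerRequest) {
        if (maxPerRequest < 1) throw new IllegalArgumentException("maxPerRequest must be >= 1");
        this.pool = pool;
        this.maxPerRequest = maxPerRequest;
    }

    /** Runs every per-segment task, using at most maxPerRequest pool threads at a time. */
    <T> List<T> runAll(List<Supplier<T>> segmentTasks) {
        final int n = segmentTasks.size();
        final AtomicInteger next = new AtomicInteger();
        final AtomicReferenceArray<T> results = new AtomicReferenceArray<>(n);

        // Each worker keeps claiming the next unclaimed task index until none remain.
        Runnable worker = () -> {
            int i;
            while ((i = next.getAndIncrement()) < n) {
                results.set(i, segmentTasks.get(i).get());
            }
        };

        // The number of workers, not the number of segments, bounds the parallelism.
        List<Future<?>> workers = new ArrayList<>();
        for (int w = 0; w < Math.min(maxPerRequest, n); w++) {
            workers.add(pool.submit(worker));
        }
        try {
            for (Future<?> f : workers) f.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        }

        // Results come back in task order, regardless of which worker ran which task.
        List<T> out = new ArrayList<>(n);
        for (int i = 0; i < n; i++) out.add(results.get(i));
        return out;
    }
}
```

Under these assumptions, {{new ThrottledRunner(new ForkJoinPool(128), 8)}} would mirror the shape of the benchmark configuration: a large shared pool, with each request limited to 8 concurrent per-segment tasks.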
> Refactor Collector API for parallelism
> --------------------------------------
>
> Key: LUCENE-5299
> URL: https://issues.apache.org/jira/browse/LUCENE-5299
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Shikhar Bhushan
> Attachments: LUCENE-5299.patch, LUCENE-5299.patch, LUCENE-5299.patch,
> LUCENE-5299.patch, LUCENE-5299.patch, benchmarks.txt
>
>
> h2. Motivation
> We should be able to scale up better with Solr/Lucene by utilizing multiple
> CPU cores, and not have to resort to scaling out by sharding (with all the
> associated distributed system pitfalls) when the index size does not warrant
> it.
> Presently, IndexSearcher has an optional constructor arg for an
> ExecutorService, which gets used for searching in parallel for call paths
> where one of the TopDocCollectors is created internally. The
> per-atomic-reader search happens in parallel, and then the
> TopDocs/TopFieldDocs results are merged with locking around the merge step.
> However there are some problems with this approach:
> * If arbitrary Collector args come into play, we can't parallelize. Note that
> even if results are ultimately going to a TopDocCollector, it may be wrapped
> inside e.g. an EarlyTerminatingCollector or TimeLimitingCollector or both.
> * The special-casing with parallelism baked on top does not scale: there are
> many Collectors that could potentially lend themselves to parallelism, and
> special-casing means the parallelization has to be re-implemented if a
> different permutation of collectors is to be used.
> h2. Proposal
> A refactoring of collectors that allows for parallelization at the level of
> the collection protocol.
> Some requirements that should guide the implementation:
> * easy migration path for collectors that need to remain serial
> * the parallelization should be composable (when collectors wrap other
> collectors)
> * allow collectors to pick the optimal solution (e.g. there might be memory
> tradeoffs to be made) by advising the collector about whether a search will
> be parallelized, so that the serial use-case is not penalized.
> * encourage use of non-blocking constructs and lock-free parallelism;
> blocking is not advisable for the hot-spot of a search, besides wasting
> pooled threads.
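
To make the requirements listed in the proposal more concrete, here is a rough Java sketch of what a collection protocol along those lines could look like. Every type name in it ({{Hit}}, {{SegmentCollector}}, {{ParallelizableCollector}}, {{TopNCollector}}, {{MinScoreCollector}}, {{SegmentSearch}}) is hypothetical and not taken from the attached patches, and a segment is modelled as a bare array of scores. Under those assumptions it shows a collector being advised whether the search is parallel (so the serial path can keep a single heap), a wrapper that composes by delegation, and a merge that runs on one thread after the per-segment work, so no lock is needed.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;

// All type names here are hypothetical; none are taken from the LUCENE-5299 patch.
final class Hit {
    final int doc; final float score;
    Hit(int doc, float score) { this.doc = doc; this.score = score; }
}

/** Collects hits for one segment; only ever used by one thread. */
interface SegmentCollector<P> {
    void collect(int doc, float score);
    P partial();
}

/** P is the per-segment partial result, R the final merged result. */
interface ParallelizableCollector<P, R> {
    void setParallelized(boolean parallelized); // advice, given before the search starts
    SegmentCollector<P> forSegment();
    R merge(List<P> partials);                  // called once, after all segments finish
}

/** Single-use top-N collector. */
final class TopNCollector implements ParallelizableCollector<PriorityQueue<Hit>, List<Hit>> {
    static final Comparator<Hit> WORST_FIRST = (a, b) -> {
        int c = Float.compare(a.score, b.score);
        return c != 0 ? c : Integer.compare(b.doc, a.doc); // ties: higher doc id ranks lower
    };
    private final int n;
    private boolean parallelized;
    private SegmentCollector<PriorityQueue<Hit>> shared; // used in serial mode only

    TopNCollector(int n) { this.n = n; }
    public void setParallelized(boolean parallelized) { this.parallelized = parallelized; }
    public SegmentCollector<PriorityQueue<Hit>> forSegment() {
        if (parallelized) return newHeapCollector();     // independent heap per segment
        if (shared == null) shared = newHeapCollector(); // serial: one heap for all segments
        return shared;
    }
    private SegmentCollector<PriorityQueue<Hit>> newHeapCollector() {
        final PriorityQueue<Hit> heap = new PriorityQueue<>(WORST_FIRST);
        return new SegmentCollector<PriorityQueue<Hit>>() {
            public void collect(int doc, float score) {
                heap.add(new Hit(doc, score));
                if (heap.size() > n) heap.poll(); // evict the current worst hit
            }
            public PriorityQueue<Hit> partial() { return heap; }
        };
    }
    public List<Hit> merge(List<PriorityQueue<Hit>> partials) {
        List<Hit> all = new ArrayList<>();
        if (parallelized) { for (PriorityQueue<Hit> p : partials) all.addAll(p); }
        else if (shared != null) { all.addAll(shared.partial()); }
        all.sort(WORST_FIRST.reversed()); // best hit first
        return new ArrayList<>(all.subList(0, Math.min(n, all.size())));
    }
}

/** Wrapper that drops low-scoring hits; it composes by delegating each protocol step. */
final class MinScoreCollector<P, R> implements ParallelizableCollector<P, R> {
    private final ParallelizableCollector<P, R> in;
    private final float minScore;
    MinScoreCollector(ParallelizableCollector<P, R> in, float minScore) {
        this.in = in; this.minScore = minScore;
    }
    public void setParallelized(boolean parallelized) { in.setParallelized(parallelized); }
    public SegmentCollector<P> forSegment() {
        final SegmentCollector<P> seg = in.forSegment();
        return new SegmentCollector<P>() {
            public void collect(int doc, float score) { if (score >= minScore) seg.collect(doc, score); }
            public P partial() { return seg.partial(); }
        };
    }
    public R merge(List<P> partials) { return in.merge(partials); }
}

final class SegmentSearch {
    /** segments.get(s)[i] is the score of local doc i in segment s; a null pool means serial. */
    static <P, R> R search(List<float[]> segments, ParallelizableCollector<P, R> c, ExecutorService pool) {
        c.setParallelized(pool != null);
        List<Callable<P>> tasks = new ArrayList<>();
        int docBase = 0;
        for (float[] seg : segments) {
            final int base = docBase;
            final SegmentCollector<P> sc = c.forSegment(); // created on the calling thread
            tasks.add(() -> {
                for (int i = 0; i < seg.length; i++) sc.collect(base + i, seg[i]);
                return sc.partial();
            });
            docBase += seg.length;
        }
        List<P> partials = new ArrayList<>();
        try {
            if (pool == null) { for (Callable<P> t : tasks) partials.add(t.call()); }
            else { for (Future<P> f : pool.invokeAll(tasks)) partials.add(f.get()); }
        } catch (Exception e) { throw new IllegalStateException(e); }
        return c.merge(partials); // runs on one thread, so no lock is needed
    }
}
```

In this sketch a collector that must stay serial would simply be driven with a null pool, and a wrapper such as {{new MinScoreCollector<>(new TopNCollector(10), 0.5f)}} works in either mode without knowing how its delegate merges.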