[ https://issues.apache.org/jira/browse/LUCENE-5299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14185751#comment-14185751 ]
Shikhar Bhushan commented on LUCENE-5299:
-----------------------------------------

Just an update: the code, rebased against recent trunk, lives at https://github.com/shikhar/lucene-solr/tree/LUCENE-5299. I've made various tweaks, such as being able to throttle per-request parallelism in {{ParallelSearchStrategy}}.

luceneutil bench numbers when running with:
* the branch above
* a hacked IndexSearcher constructor that uses {{ParallelSearchStrategy(new ForkJoinPool(128), 8)}}
* luceneutil constants.py {{SEARCH_NUM_THREADS = 16}}

Against trunk, on a 32-core (with HT) Sandy Bridge server, with source {{wikimedium500k}}:

{noformat}
Report after iter 19:
Task               QPS baseline   StdDev   QPS parcol   StdDev   Pct diff
Fuzzy1                    81.91  (43.2%)        52.96  (39.7%)   -35.3% ( -82% -   83%)
LowTerm                 2550.11  (11.9%)      1927.28   (5.6%)   -24.4% ( -37% -   -7%)
Respell                   43.02  (39.4%)        35.23  (31.5%)   -18.1% ( -63% -   87%)
Fuzzy2                    19.32  (25.1%)        16.40  (34.8%)   -15.1% ( -59% -   59%)
MedTerm                 1679.37  (12.2%)      1743.27   (8.6%)     3.8% ( -15% -   28%)
PKLookup                 221.58   (8.3%)       257.36  (13.2%)    16.1% (  -4% -   41%)
AndHighLow              1027.99  (11.6%)      1278.39  (15.9%)    24.4% (  -2% -   58%)
AndHighMed               741.50  (10.0%)      1198.04  (27.5%)    61.6% (  21% -  110%)
MedPhrase                709.04  (11.6%)      1203.02  (24.3%)    69.7% (  30% -  119%)
LowSpanNear              601.13  (16.9%)      1127.30  (16.7%)    87.5% (  46% -  145%)
LowSloppyPhrase          554.87  (10.8%)      1130.25  (30.5%)   103.7% (  56% -  162%)
OrHighMed                408.55  (10.4%)       977.56  (20.1%)   139.3% (  98% -  189%)
LowPhrase                364.36  (10.8%)       893.27  (41.0%)   145.2% (  84% -  220%)
OrHighLow                355.78  (12.7%)       893.63  (19.6%)   151.2% ( 105% -  210%)
AndHighHigh              390.73  (10.3%)      1004.70  (24.3%)   157.1% ( 111% -  213%)
HighTerm                 399.01  (11.8%)      1067.67  (12.1%)   167.6% ( 128% -  217%)
Wildcard                 754.76  (11.6%)      2067.96  (28.0%)   174.0% ( 120% -  241%)
HighSpanNear             153.57  (14.8%)       463.54  (24.3%)   201.8% ( 141% -  282%)
OrHighHigh               212.16  (12.4%)       665.56  (28.2%)   213.7% ( 154% -  290%)
HighPhrase               170.49  (13.1%)       547.72  (17.3%)   221.3% ( 168% -  289%)
HighSloppyPhrase          66.91  (10.1%)       219.59  (12.0%)   228.2% ( 187% -  278%)
MedSloppyPhrase          128.73  (12.5%)       425.67  (20.3%)   230.7% ( 175% -  300%)
MedSpanNear              130.31  (10.7%)       436.12  (18.2%)   234.7% ( 185% -  295%)
Prefix3                  166.91  (14.9%)       652.64  (26.7%)   291.0% ( 217% -  390%)
IntNRQ                   110.73  (15.0%)       467.72  (33.6%)   322.4% ( 238% -  436%)
{noformat}

> Refactor Collector API for parallelism
> --------------------------------------
>
>                 Key: LUCENE-5299
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5299
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Shikhar Bhushan
>         Attachments: LUCENE-5299.patch, LUCENE-5299.patch, LUCENE-5299.patch,
> LUCENE-5299.patch, LUCENE-5299.patch, benchmarks.txt
>
>
> h2. Motivation
> We should be able to scale up better with Solr/Lucene by utilizing multiple
> CPU cores, and not have to resort to scaling out by sharding (with all the
> associated distributed system pitfalls) when the index size does not warrant
> it.
> Presently, IndexSearcher has an optional constructor arg for an
> ExecutorService, which gets used for searching in parallel for call paths
> where one of the TopDocCollectors is created internally. The
> per-atomic-reader search happens in parallel and then the
> TopDocs/TopFieldDocs results are merged with locking around the merge bit.
> However, there are some problems with this approach:
> * If arbitrary Collector args come into play, we can't parallelize. Note that
> even if results are ultimately going to a TopDocCollector, it may be wrapped
> inside e.g. an EarlyTerminatingCollector or TimeLimitingCollector or both.
> * The special-casing with parallelism baked on top does not scale: there are
> many Collectors that could potentially lend themselves to parallelism, and
> special-casing means the parallelization has to be re-implemented if a
> different permutation of collectors is to be used.
> h2. Proposal
> A refactoring of collectors that allows for parallelization at the level of
> the collection protocol.
> Some requirements that should guide the implementation:
> * easy migration path for collectors that need to remain serial
> * the parallelization should be composable (when collectors wrap other
> collectors)
> * allow collectors to pick the optimal solution (e.g. there might be memory
> tradeoffs to be made) by advising the collector about whether a search will
> be parallelized, so that the serial use case is not penalized
> * encourage use of non-blocking constructs and lock-free parallelism;
> blocking is not advisable in the hot spot of a search, and it also wastes
> pooled threads



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
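
For illustration only, the requirements listed in the issue description could be sketched roughly as below. This is not the API from the attached patches or the linked branch, and it does not use any Lucene classes: {{SegmentCollector}}, {{ParallelCollector}}, {{CountingCollector}}, {{FilteringCollector}} and {{search}} are all hypothetical names, and segments are modelled as plain arrays of doc IDs. The sketch assumes the collector is advised up front whether the search is parallel, that wrappers simply forward that advice, and that per-request parallelism is capped by striding segments across a fixed number of tasks rather than by blocking.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.atomic.LongAdder;
import java.util.function.IntPredicate;

public class ParallelCollectSketch {

    /** Hypothetical: collects hits for one segment; used by a single thread. */
    interface SegmentCollector {
        void collect(int doc);
    }

    /** Hypothetical: top-level collector that hands out per-segment collectors. */
    interface ParallelCollector {
        /** Advice given once, before any segment is searched. */
        void setParallelized(boolean parallel);

        SegmentCollector forSegment(int segmentOrd);
    }

    /** Counts hits; only pays for a LongAdder when told the search is parallel. */
    static final class CountingCollector implements ParallelCollector {
        private LongAdder parallelCount; // non-null only in parallel mode
        private long serialCount;

        @Override
        public void setParallelized(boolean parallel) {
            parallelCount = parallel ? new LongAdder() : null;
        }

        @Override
        public SegmentCollector forSegment(int segmentOrd) {
            if (parallelCount != null) {
                return doc -> parallelCount.increment(); // lock-free
            }
            return doc -> serialCount++; // plain field, serial path only
        }

        long count() {
            return parallelCount != null ? parallelCount.sum() : serialCount;
        }
    }

    /** Wrapper that forwards matching docs; the parallel advice passes through it. */
    static final class FilteringCollector implements ParallelCollector {
        private final ParallelCollector delegate;
        private final IntPredicate accept;

        FilteringCollector(ParallelCollector delegate, IntPredicate accept) {
            this.delegate = delegate;
            this.accept = accept;
        }

        @Override
        public void setParallelized(boolean parallel) {
            delegate.setParallelized(parallel);
        }

        @Override
        public SegmentCollector forSegment(int segmentOrd) {
            SegmentCollector inner = delegate.forSegment(segmentOrd);
            return doc -> {
                if (accept.test(doc)) {
                    inner.collect(doc);
                }
            };
        }
    }

    /** Searches all segments, using at most maxParallelism pool tasks for this request. */
    static void search(int[][] segments, ParallelCollector collector,
                       ForkJoinPool pool, int maxParallelism) {
        int tasks = Math.min(maxParallelism, segments.length);
        boolean parallel = pool != null && tasks > 1;
        collector.setParallelized(parallel);
        if (!parallel) {
            for (int ord = 0; ord < segments.length; ord++) {
                collectSegment(segments, ord, collector);
            }
            return;
        }
        List<CompletableFuture<Void>> futures = new ArrayList<>();
        for (int t = 0; t < tasks; t++) {
            final int first = t;
            // Each task walks its own stride of segments, so the cap needs no blocking.
            futures.add(CompletableFuture.runAsync(() -> {
                for (int ord = first; ord < segments.length; ord += tasks) {
                    collectSegment(segments, ord, collector);
                }
            }, pool));
        }
        // Only the calling thread waits here; pool threads never block on each other.
        CompletableFuture.allOf(futures.toArray(new CompletableFuture[0])).join();
    }

    private static void collectSegment(int[][] segments, int ord, ParallelCollector collector) {
        SegmentCollector segmentCollector = collector.forSegment(ord);
        for (int doc : segments[ord]) {
            segmentCollector.collect(doc);
        }
    }

    public static void main(String[] args) {
        int[][] segments = { {0, 1, 2, 3}, {4, 5, 6}, {7, 8, 9, 10, 11} };
        CountingCollector counter = new CountingCollector();
        ParallelCollector evens = new FilteringCollector(counter, doc -> doc % 2 == 0);
        ForkJoinPool pool = new ForkJoinPool(4);
        search(segments, evens, pool, 2);
        System.out.println(counter.count());
        pool.shutdown();
    }
}
```

With the three toy segments in {{main}}, the even doc IDs are 0, 2, 4, 6, 8 and 10, so the program prints 6. Passing a null pool (or a limit of 1) takes the serial path, where the counter is a plain field and no concurrent structure is allocated.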