I just looked at IndexSearcherCloseableSecureBase.java. I guess if we want to cap each search request at a maximum of "n" threads, we can plug the logic from my earlier proposal (quoted below) directly into this class instead of BlurIndexSimpleWriter.java.
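For illustration, here is a minimal sketch of the borrow-and-return pattern that the proposal implies, written without any Blur or Lucene dependency. The class name SearchExecutorPool and its methods borrow(), giveBack() and shutdown() are hypothetical and not part of Blur; the figures 128 and 4 come from this thread. The quoted proposal only polls the queue, so handing the executor back is an assumption added here, since otherwise a pool would never become available again.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical helper: hands out small fixed-size pools, one per search request. */
class SearchExecutorPool {
    private final LinkedBlockingQueue<ExecutorService> idle;

    SearchExecutorPool(int totalThreads, int threadsPerRequest) {
        int pools = totalThreads / threadsPerRequest; // e.g. 128 / 4 = 32
        idle = new LinkedBlockingQueue<>(pools);
        for (int i = 0; i < pools; i++) {
            idle.add(Executors.newFixedThreadPool(threadsPerRequest));
        }
    }

    /** Returns an idle pool, or null when every pool is busy. */
    ExecutorService borrow() {
        // poll() never blocks; a null result means "search on the calling thread".
        return idle.poll();
    }

    /** Assumed to be called when the searcher is closed, so the pool can be reused. */
    void giveBack(ExecutorService executor) {
        if (executor != null) {
            idle.offer(executor);
        }
    }

    /** Shuts down the pools that are idle at the time of the call. */
    void shutdown() {
        for (ExecutorService executor : idle) {
            executor.shutdown();
        }
    }
}
```

In IndexSearcherCloseableSecureBase the idea would be to call borrow() when the searcher is built, pass the result (possibly null) to the searcher, and call giveBack() from close(). That wiring is an assumption and has not been checked against the Blur code.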
On Wed, Jun 29, 2016 at 6:04 PM, Ravikumar Govindarajan <[email protected]> wrote:

> This is really nice Aaron. You've done the bulk of the work already!!!
>
> I think parallelism can be provided too for searching a single shard....
>
> Just as a quick proposal, we can do a static initialization in
> BlurIndexSimpleWriter
>
> static LinkedBlockingQueue<ExecutorService> executorQueue =
>     new LinkedBlockingQueue<ExecutorService>(128/4);
>
> static {
>   for (int i = 0; i < 128/4; i++) {
>     executorQueue.add(Executors.newFixedThreadPool(4));
>   }
> }
> ----
>
> Incoming search request per-shard...
>
> public IndexSearcher getIndexSearcher() {
>   .....
>   Executor current = executorQueue.poll();
>
>   if (current == null) {
>     // All thread-pools are busy or the user has explicitly switched
>     // this off via config. Search proceeds in single-threaded fashion
>     // on the calling thread itself.
>   }
>
>   return new IndexSearcherCloseable(indexReader, current);
> }
> ---
>
> Btw, we can do this by overriding a single method,
> IndexSearcher.slices(...), in Lucene 5.x & above!!!
>
>
> On Tue, Jun 28, 2016 at 8:01 PM, Aaron McCurry <[email protected]> wrote:
>
>> Some time ago I created something similar, it's kinda a backport into
>> Lucene 4.3:
>>
>> https://github.com/apache/incubator-blur/blob/65640200a8e7dd539c1dd4d920255c717102b9b2/blur-query/src/main/java/org/apache/blur/lucene/search/CloneableCollector.java#L25
>>
>> It handles the execution of searching the segments in parallel but
>> doesn't provide any limitations on parallelism.
>>
>> Aaron
>>
>>
>> On Tue, Jun 28, 2016 at 6:37 AM, Ravikumar Govindarajan
>> <[email protected]> wrote:
>>
>> > Aaron,
>> >
>> > Just an update..
>> >
>> > https://issues.apache.org/jira/browse/LUCENE-5299
>> >
>> > You can now use any collector & get guaranteed parallel execution.
>> > They have also provided a "parallelism" hint that will limit the
>> > number of search threads at request level...
>> >
>> > i.e., we can fix the blur executor thread-count at 128 & limit
>> > "parallelism" at a max of 4 threads per request..
>> >
>> > On Fri, Feb 6, 2015 at 5:25 PM, Ravikumar Govindarajan
>> > <[email protected]> wrote:
>> >
>> > > Thanks for the clarifications.
>> > >
>> > > Another point I thought about is the disk efficiency of serving
>> > > random IO. Many parallel threads could end up hitting just one or
>> > > two disks in the cluster…
>> > >
>> > > Think I can skip it safely for my workloads.
>> > >
>> > > --
>> > > Ravi
>> > >
>> > > On Fri, Feb 6, 2015 at 3:09 PM, Aaron McCurry <[email protected]>
>> > > wrote:
>> > >
>> > >> The ServiceExecutor (thread pool) put inside the IndexSearcher
>> > >> was an attempt at making the segments search in parallel when
>> > >> available. However there is a limitation in Lucene that does not
>> > >> allow segment parallel searches when you are using Collectors.
>> > >>
>> > >> https://github.com/apache/lucene-solr/blob/lucene_solr_4_3_0/lucene/core/src/java/org/apache/lucene/search/IndexSearcher.java#L595
>> > >>
>> > >> We override this method to allow for Tracing:
>> > >>
>> > >> https://github.com/apache/incubator-blur/blob/master/blur-core/src/main/java/org/apache/blur/server/IndexSearcherCloseableBase.java#L46
>> > >>
>> > >> and here:
>> > >>
>> > >> https://github.com/apache/incubator-blur/blob/master/blur-core/src/main/java/org/apache/blur/server/IndexSearcherCloseableSecureBase.java#L51
>> > >>
>> > >> I agree that if you are already running a lot of shards per
>> > >> server, enhancing Lucene to allow parallel searching of segments
>> > >> could become counterproductive. I have seen underutilized systems
>> > >> that could take advantage of the parallel segment search, so as
>> > >> with any feature like this, it depends. :-)
>> > >>
>> > >> Aaron
>> > >>
>> > >> On Fri, Feb 6, 2015 at 2:39 AM, Ravikumar Govindarajan
>> > >> <[email protected]> wrote:
>> > >>
>> > >> > Blur by default uses a SearchExecutor for IndexSearcher. I
>> > >> > believe Lucene helps searching segments of a single shard in
>> > >> > parallel.
>> > >> >
>> > >> > Our previous index was built on a lower version of Lucene
>> > >> > where such a feature was absent and we ran sequential search
>> > >> > per shard only…
>> > >> >
>> > >> > What is the general recommendation for blur? Is it advisable
>> > >> > to use the SearchExecutor? What will happen when there are
>> > >> > many parallel queries for different shards? Will
>> > >> > SearchExecutor become a bottleneck?
>> > >> >
>> > >> > Any help is much appreciated...
>> > >> >
>> > >> > --
>> > >> > Ravi
