[jira] [Commented] (LUCENE-7588) A parallel DrillSideways implementation

Michael McCandless (JIRA) Sun, 18 Dec 2016 15:37:16 -0800

    [ 
https://issues.apache.org/jira/browse/LUCENE-7588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15759696#comment-15759696
 ]


Michael McCandless commented on LUCENE-7588:
--------------------------------------------

bq. I am working on providing a benchmark. What is the good practice for Lucene? 
Is it okay to provide a benchmark as a test case?

We don't usually do benchmarks as test cases; we could e.g. push the benchmark 
sources to https://github.com/mikemccand/luceneutil which holds various random 
Lucene benchmarking utilities.  Or if you just have some simple results to 
share w/o having the full source code to share, that's better than nothing too 
:)

Hmm it looks like nothing is testing the new {{ParallelDrillSideways.search}}?

{quote}
bq. I wonder if we could absorb ParallelDrillSideways under DrillSideways such 
that if you pass an executor it uses the concurrent implementation? It's really 
an implementation/execution detail I think? Similar to how IndexSearcher takes 
an optional executor.

I agree. I think that is the way it should be. I didn't want to be too 
intrusive.
{quote}

Maybe we could just add another ctor to {{DrillSideways}} taking all
the current arguments, plus an executor?  I.e.:

{noformat}
  public DrillSideways(IndexSearcher searcher, FacetsConfig config,
                       TaxonomyReader taxoReader, SortedSetDocValuesReaderState state,
                       ExecutorService executor) {
    ...
  }
{noformat}

Then, in the {{DrillSideways.search}} method, if executor is non-null,
we invoke the concurrent version ({{ParallelDrillSideways.search}}
from your patch) internally, as a private method?
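
To make the shape of that dispatch concrete, here is a rough, self-contained sketch. It is not the patch and it does not use any Lucene types: {{DrillSidewaysSketch}}, {{searchSequentially}} and {{searchConcurrently}} are hypothetical names, and the string results are placeholders for the real search result. The only point is that the executor is an optional ctor argument and the choice of implementation stays private.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical stand-in for DrillSideways; this is NOT the Lucene class. */
public class DrillSidewaysSketch {
  // null means "no executor was given": run on the caller's thread.
  private final ExecutorService executor;

  /** Stands in for the existing ctors: no executor, so searches stay sequential. */
  public DrillSidewaysSketch() {
    this(null);
  }

  /** Stands in for the proposed extra ctor: same arguments plus an executor. */
  public DrillSidewaysSketch(ExecutorService executor) {
    this.executor = executor;
  }

  /** Single public entry point; callers never pick an implementation themselves. */
  public String search(String query) throws InterruptedException, ExecutionException {
    if (executor != null) {
      return searchConcurrently(query);
    }
    return searchSequentially(query);
  }

  // Placeholder for the current single-threaded code path.
  private String searchSequentially(String query) {
    return "sequential:" + query;
  }

  // Placeholder for the concurrent code path, kept private to the class.
  private String searchConcurrently(String query)
      throws InterruptedException, ExecutionException {
    Future<String> result = executor.submit(() -> "concurrent:" + query);
    return result.get();
  }

  public static void main(String[] args) throws Exception {
    ExecutorService pool = Executors.newFixedThreadPool(2);
    try {
      System.out.println(new DrillSidewaysSketch().search("color:red"));
      System.out.println(new DrillSidewaysSketch(pool).search("color:red"));
    } finally {
      pool.shutdown();
    }
  }
}
```

With this shape, existing callers that never pass an executor keep the sequential behavior unchanged, and the caller who supplies the executor stays responsible for shutting it down.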


> A parallel DrillSideways implementation
> ---------------------------------------
>
>                 Key: LUCENE-7588
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7588
>             Project: Lucene - Core
>          Issue Type: Improvement
>    Affects Versions: master (7.0), 6.3.1
>            Reporter: Emmanuel Keller
>            Priority: Minor
>              Labels: facet, faceting
>             Fix For: master (7.0), 6.3.1
>
>         Attachments: LUCENE-7588.patch
>
>
> Currently the DrillSideways implementation is based on the single-threaded 
> IndexSearcher.search(Query query, Collector results).
> On a large document set, single-threaded collection can be really slow.
> The ParallelDrillSideways implementation could:
> 1. Use the CollectorManager-based method IndexSearcher.search(Query query, 
> CollectorManager collectorManager) to get the benefits of multithreading on 
> index segments,
> 2. Compute each DrillSideways subquery on a single thread.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

