Re: Custom SliceExecutor and slices computation in IndexSearcher

Luca Cavanna Thu, 18 May 2023 23:39:54 -0700

Hi Sorabh,
You'll want to override the protected slices method to include your custom
logic for creating the leaf slices. Your IndexSearcher extension can also
retrieve the slices through the public getSlices method. I don't understand
what makes the additional constructor necessary. Could you clarify that for
me?


One thing that may make sense is making the SliceExecutor extensible.
Currently it is package private, and I can see how users may want to
provide their own implementation to control how rejections are handled, or
to execute on the caller thread in certain scenarios. Possibly even the task
creation and the coordination of task execution could be moved to the
SliceExecutor too.
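
To make the idea concrete, here is a rough sketch of what a user-provided
executor could look like. It is not Lucene's SliceExecutor API: the class
name CallerRunsSliceExecutor and its invokeAll signature are hypothetical,
and it only assumes that the underlying java.util.concurrent.Executor
signals overload by throwing RejectedExecutionException.

```java
import java.util.List;
import java.util.concurrent.Executor;
import java.util.concurrent.RejectedExecutionException;

// Hypothetical extension point, NOT Lucene's actual SliceExecutor.
class CallerRunsSliceExecutor {
    private final Executor executor;

    CallerRunsSliceExecutor(Executor executor) {
        this.executor = executor;
    }

    // Submits each slice task to the executor; a task that the executor
    // rejects is run on the calling thread instead of failing the search.
    void invokeAll(List<Runnable> tasks) {
        for (Runnable task : tasks) {
            try {
                executor.execute(task);
            } catch (RejectedExecutionException e) {
                task.run(); // fall back to the caller thread
            }
        }
    }
}
```

A different implementation could instead propagate the rejection, or decide
up front to run some tasks on the caller thread. That choice is the part
that would become pluggable.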

Cheers
Luca

On Fri, May 19, 2023, 03:27 SorabhApache <[email protected]> wrote:

> Hi All,
>
> For concurrent segment search, Lucene uses the *slices* method to compute
> the number of work units which can be processed concurrently.
>
> a) It calculates *slices* in the constructor of *IndexSearcher*
> <https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/search/IndexSearcher.java#L239>
> with default thresholds for document count and segment counts.
> b) Provides an implementation of *SliceExecutor* (i.e.
> QueueSizeBasedExecutor)
> <https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/search/IndexSearcher.java#L1008>
> based on executor type which applies the backpressure in concurrent
> execution based on a limiting factor of 1.5 times the passed in threadpool
> maxPoolSize.
>
> In OpenSearch, we have a search threadpool which serves the search requests
> to all the Lucene indices (or OpenSearch shards) assigned to a node. Each
> node can get requests to some or all of the indices on that node.
> I am exploring a mechanism such that I can dynamically control the max
> slices for each Lucene index search request. For example: search requests
> to some indices on that node would have max 4 slices each and others 2
> slices each. The threadpool shared to execute these slices would then not
> have any limiting factor. In this model the top level search threadpool
> will limit the number of active search requests, which will in turn limit
> the number of work units in the SliceExecutor threadpool.
>
> For this, the derived implementation of IndexSearcher can get an input
> value in its constructor to control the slice count computation. Even
> though the slices method is protected, it gets called from the constructor
> of the base IndexSearcher class, which prevents the derived class from
> using the passed in input.
>
> To achieve this I can think of the following ways (in order of preference)
> and would like to submit a pull request for it. But first I wanted to get
> some feedback on whether option 1 looks fine or whether I should take some
> other approach.
>
> 1. Provide another constructor in IndexSearcher which takes in 4 input
> parameters:
>   protected IndexSearcher(IndexReaderContext context, Executor executor,
> SliceExecutor sliceExecutor, Function<List<LeafReaderContext>, LeafSlice[]>
> sliceProvider)
>
> 2. Make the `leafSlices` member protected and non final. After it is
> initialized by the IndexSearcher (using the default mechanism in Lucene),
> the derived implementation can update it again if need be (for example
> based on some input parameter to its own constructor). Also make the
> constructor with the SliceExecutor input protected so that the derived
> implementation can provide its own implementation of SliceExecutor. This
> mechanism will have redundant computation of leafSlices.
>
>
> Thanks,
> Sorabh
>
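
The constructor-ordering problem described in the quoted message can be
reproduced without Lucene. The sketch below uses stand-in classes only:
BaseSearcher, CappedSearcher and the String "leaves" are hypothetical and
merely mimic the shape of IndexSearcher calling its overridable slices
method from the base constructor.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for IndexSearcher (hypothetical, not the Lucene class).
class BaseSearcher {
    private final List<List<String>> leafSlices;

    BaseSearcher(List<String> leaves) {
        // The overridable hook runs here, before any subclass field is set.
        this.leafSlices = slices(leaves);
    }

    // Default strategy for this sketch: one slice per leaf.
    protected List<List<String>> slices(List<String> leaves) {
        List<List<String>> result = new ArrayList<>();
        for (String leaf : leaves) {
            result.add(List.of(leaf));
        }
        return result;
    }

    public List<List<String>> getSlices() {
        return leafSlices;
    }
}

// Subclass that wants to cap the slice count with a constructor argument.
class CappedSearcher extends BaseSearcher {
    private final int maxSlices;

    CappedSearcher(List<String> leaves, int maxSlices) {
        super(leaves);              // slices() is invoked inside this call...
        this.maxSlices = maxSlices; // ...before maxSlices is assigned
    }

    @Override
    protected List<List<String>> slices(List<String> leaves) {
        // When called from the base constructor, maxSlices still holds 0.
        int target = Math.max(1, Math.min(maxSlices, leaves.size()));
        List<List<String>> result = new ArrayList<>();
        for (int i = 0; i < target; i++) {
            result.add(new ArrayList<>());
        }
        for (int i = 0; i < leaves.size(); i++) {
            result.get(i % target).add(leaves.get(i)); // round-robin
        }
        return result;
    }
}
```

With eight leaves and maxSlices set to 4, the slices stored by the base
constructor are computed while maxSlices is still 0, so the requested cap is
not applied. Calling slices again after construction does see the value 4.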
