[ https://issues.apache.org/jira/browse/LUCENE-8675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16758862#comment-16758862 ]
Atri Sharma commented on LUCENE-8675:
-------------------------------------
{quote}If some segments are getting large enough that intra-segment parallelism
becomes appealing, then maybe an easier and more efficient way to increase
parallelism is to instead reduce the maximum segment size so that inter-segment
parallelism has more potential for parallelizing query execution.
{quote}
Would that not lead to a much higher number of segments than required? That
could lead to issues such as a lot of open file handles and too many threads
required for scanning (although we would assign multiple small segments to a
single thread).

Thanks for the point about range queries, that is an important consideration.
I will follow up with a separate patch on top of this one, which will do the
first phase of BKD iteration and share the generated bitset across N parallel
threads, where N is the number of remaining clauses and each thread intersects
one clause with the bitset.
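
To make the shared-bitset idea concrete, here is a minimal sketch of the second
step only. It is not the patch itself and uses no Lucene APIs: java.util.BitSet
stands in for whatever bitset the BKD phase would produce, each remaining clause
is assumed to be already materialized as a bitset of matching doc IDs, and the
names SharedBitsetSketch, intersectAll and bits are hypothetical.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class SharedBitsetSketch {

    /**
     * Intersects each remaining clause with the shared phase-one bitset on its
     * own thread (N threads for N clauses), then ANDs the partial results.
     */
    static BitSet intersectAll(BitSet rangeMatches, List<BitSet> clauses) {
        if (clauses.isEmpty()) {
            return (BitSet) rangeMatches.clone();
        }
        ExecutorService pool = Executors.newFixedThreadPool(clauses.size());
        try {
            List<Future<BitSet>> futures = new ArrayList<>();
            for (BitSet clause : clauses) {
                futures.add(pool.submit(() -> {
                    // rangeMatches is only read here; each task mutates its own copy.
                    BitSet partial = (BitSet) clause.clone();
                    partial.and(rangeMatches);
                    return partial;
                }));
            }
            // Combine the per-clause results on the calling thread.
            BitSet result = (BitSet) rangeMatches.clone();
            for (Future<BitSet> future : futures) {
                result.and(future.get());
            }
            return result;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while intersecting clauses", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("clause intersection failed", e);
        } finally {
            pool.shutdown();
        }
    }

    /** Builds a bitset with the given doc IDs set. */
    static BitSet bits(int... docIds) {
        BitSet b = new BitSet();
        for (int docId : docIds) {
            b.set(docId);
        }
        return b;
    }

    public static void main(String[] args) {
        // Stand-in for the bitset produced by the first phase of BKD iteration.
        BitSet rangeMatches = bits(1, 2, 3, 5, 8);
        List<BitSet> clauses = List.of(bits(2, 3, 5, 7), bits(3, 5, 8, 9));
        System.out.println(intersectAll(rangeMatches, clauses)); // prints {3, 5}
    }
}
```

The shared bitset is never written after it is built, so the threads need no
locking; each one works on a private copy of its clause. A real implementation
would presumably reuse an existing executor instead of creating a pool per
query.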
> Divide Segment Search Amongst Multiple Threads
> ----------------------------------------------
>
> Key: LUCENE-8675
> URL: https://issues.apache.org/jira/browse/LUCENE-8675
> Project: Lucene - Core
> Issue Type: Improvement
> Components: core/search
> Reporter: Atri Sharma
> Priority: Major
>
> Segment search is single-threaded today, which can be a bottleneck for
> large analytical workloads that index a lot of data and run complex queries
> touching multiple segments (imagine a composite query with a range query and
> filters on top). This ticket is for discussing the idea of splitting the
> search of a single segment across multiple threads based on mutually
> exclusive document ID ranges.
>
> This will be a two-phase effort, the first phase targeting queries returning
> all matching documents (collectors not terminating early). The second phase
> patch will introduce staged execution and will build on top of this patch.