[jira] [Commented] (SOLR-13350) Explore collector managers for multi-threaded search

2024-07-24 Thread David Smiley (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17868449#comment-17868449
 ] 

David Smiley commented on SOLR-13350:
-

There is no dedicated test for this functionality, not that there needs to be 
one.  But nonetheless, can someone recommend a particular test I might use that 
exploits the CollectorManager functionality, especially the DocSet construction 
aspect?  [~cpoerschke] maybe you can recommend one as you were working on that.

> Explore collector managers for multi-threaded search
> 
>
> Key: SOLR-13350
> URL: https://issues.apache.org/jira/browse/SOLR-13350
> Project: Solr
>  Issue Type: New Feature
>Reporter: Ishan Chattopadhyaya
>Assignee: Ishan Chattopadhyaya
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 9.7
>
> Attachments: SOLR-13350-pre-PR-2508.patch, SOLR-13350.patch, 
> SOLR-13350.patch, SOLR-13350.patch
>
>  Time Spent: 15h 20m
>  Remaining Estimate: 0h
>
> AFAICT, SolrIndexSearcher can be used only to search all the segments of an 
> index in series. However, using CollectorManagers, segments can be searched 
> concurrently and result in reduced latency. Opening this issue to explore the 
> effectiveness of using CollectorManagers in SolrIndexSearcher from latency 
> and throughput perspective.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



[jira] [Commented] (SOLR-13350) Explore collector managers for multi-threaded search

2024-07-24 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17868369#comment-17868369
 ] 

ASF subversion and git services commented on SOLR-13350:


Commit 8b3d5b0f57160d61b5b7510049a7235413df03ae in solr's branch 
refs/heads/branch_9x from Christine Poerschke
[ https://gitbox.apache.org/repos/asf?p=solr.git;h=8b3d5b0f571 ]

SOLR-13350: multi-threaded search: default to 'available processors' thread 
count (#2569)

Co-authored-by: Ishan Chattopadhyaya 
Co-authored-by: Gus Heck <46900717+gus-...@users.noreply.github.com>
(cherry picked from commit ffde4199a094136bde4f63ca4037b05f6388e488)


> Explore collector managers for multi-threaded search
> 
>
> Key: SOLR-13350
> URL: https://issues.apache.org/jira/browse/SOLR-13350
> Project: Solr
>  Issue Type: New Feature
>Reporter: Ishan Chattopadhyaya
>Assignee: Ishan Chattopadhyaya
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 9.7
>
> Attachments: SOLR-13350-pre-PR-2508.patch, SOLR-13350.patch, 
> SOLR-13350.patch, SOLR-13350.patch
>
>  Time Spent: 13h 40m
>  Remaining Estimate: 0h
>
> AFAICT, SolrIndexSearcher can be used only to search all the segments of an 
> index in series. However, using CollectorManagers, segments can be searched 
> concurrently and result in reduced latency. Opening this issue to explore the 
> effectiveness of using CollectorManagers in SolrIndexSearcher from latency 
> and throughput perspective.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



[jira] [Commented] (SOLR-13350) Explore collector managers for multi-threaded search

2024-07-24 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17868367#comment-17868367
 ] 

ASF subversion and git services commented on SOLR-13350:


Commit ffde4199a094136bde4f63ca4037b05f6388e488 in solr's branch 
refs/heads/main from Christine Poerschke
[ https://gitbox.apache.org/repos/asf?p=solr.git;h=ffde4199a09 ]

SOLR-13350: multi-threaded search: default to 'available processors' thread 
count (#2569)

Co-authored-by: Ishan Chattopadhyaya 
Co-authored-by: Gus Heck <46900717+gus-...@users.noreply.github.com>

> Explore collector managers for multi-threaded search
> 
>
> Key: SOLR-13350
> URL: https://issues.apache.org/jira/browse/SOLR-13350
> Project: Solr
>  Issue Type: New Feature
>Reporter: Ishan Chattopadhyaya
>Assignee: Ishan Chattopadhyaya
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 9.7
>
> Attachments: SOLR-13350-pre-PR-2508.patch, SOLR-13350.patch, 
> SOLR-13350.patch, SOLR-13350.patch
>
>  Time Spent: 13h 40m
>  Remaining Estimate: 0h
>
> AFAICT, SolrIndexSearcher can be used only to search all the segments of an 
> index in series. However, using CollectorManagers, segments can be searched 
> concurrently and result in reduced latency. Opening this issue to explore the 
> effectiveness of using CollectorManagers in SolrIndexSearcher from latency 
> and throughput perspective.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



[jira] [Commented] (SOLR-13350) Explore collector managers for multi-threaded search

2024-07-23 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17868123#comment-17868123
 ] 

ASF subversion and git services commented on SOLR-13350:


Commit cfec121bab2ecfc4c06e20a5533596025ae63d98 in solr's branch 
refs/heads/branch_9x from Christine Poerschke
[ https://gitbox.apache.org/repos/asf?p=solr.git;h=cfec121bab2 ]

SOLR-13350: multi-threaded search: replace cached with fixed threadpool (#2508)

(cherry picked from commit dfdbf85a1da491377ab519b025b80d60d4b2d534)


> Explore collector managers for multi-threaded search
> 
>
> Key: SOLR-13350
> URL: https://issues.apache.org/jira/browse/SOLR-13350
> Project: Solr
>  Issue Type: New Feature
>Reporter: Ishan Chattopadhyaya
>Assignee: Ishan Chattopadhyaya
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 9.7
>
> Attachments: SOLR-13350-pre-PR-2508.patch, SOLR-13350.patch, 
> SOLR-13350.patch, SOLR-13350.patch
>
>  Time Spent: 13.5h
>  Remaining Estimate: 0h
>
> AFAICT, SolrIndexSearcher can be used only to search all the segments of an 
> index in series. However, using CollectorManagers, segments can be searched 
> concurrently and result in reduced latency. Opening this issue to explore the 
> effectiveness of using CollectorManagers in SolrIndexSearcher from latency 
> and throughput perspective.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



[jira] [Commented] (SOLR-13350) Explore collector managers for multi-threaded search

2024-07-23 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17868122#comment-17868122
 ] 

ASF subversion and git services commented on SOLR-13350:


Commit dfdbf85a1da491377ab519b025b80d60d4b2d534 in solr's branch 
refs/heads/main from Christine Poerschke
[ https://gitbox.apache.org/repos/asf?p=solr.git;h=dfdbf85a1da ]

SOLR-13350: multi-threaded search: replace cached with fixed threadpool (#2508)



> Explore collector managers for multi-threaded search
> 
>
> Key: SOLR-13350
> URL: https://issues.apache.org/jira/browse/SOLR-13350
> Project: Solr
>  Issue Type: New Feature
>Reporter: Ishan Chattopadhyaya
>Assignee: Ishan Chattopadhyaya
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 9.7
>
> Attachments: SOLR-13350-pre-PR-2508.patch, SOLR-13350.patch, 
> SOLR-13350.patch, SOLR-13350.patch
>
>  Time Spent: 13.5h
>  Remaining Estimate: 0h
>
> AFAICT, SolrIndexSearcher can be used only to search all the segments of an 
> index in series. However, using CollectorManagers, segments can be searched 
> concurrently and result in reduced latency. Opening this issue to explore the 
> effectiveness of using CollectorManagers in SolrIndexSearcher from latency 
> and throughput perspective.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



[jira] [Commented] (SOLR-13350) Explore collector managers for multi-threaded search

2024-07-23 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17868111#comment-17868111
 ] 

ASF subversion and git services commented on SOLR-13350:


Commit 06950c656f21577db624102b913fb659ef1f0306 in solr's branch 
refs/heads/main from Christine Poerschke
[ https://gitbox.apache.org/repos/asf?p=solr.git;h=06950c656f2 ]

SOLR-13350: multi-threaded search: (undocumented) opt-out ability (#2570)

On a request level, multiThreaded=false is already possible but for full (node 
level) opt-out SolrIndexSearcher must pass a null executor to Lucene's 
IndexSearcher.

> Explore collector managers for multi-threaded search
> 
>
> Key: SOLR-13350
> URL: https://issues.apache.org/jira/browse/SOLR-13350
> Project: Solr
>  Issue Type: New Feature
>Reporter: Ishan Chattopadhyaya
>Assignee: Ishan Chattopadhyaya
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 9.7
>
> Attachments: SOLR-13350-pre-PR-2508.patch, SOLR-13350.patch, 
> SOLR-13350.patch, SOLR-13350.patch
>
>  Time Spent: 13h 10m
>  Remaining Estimate: 0h
>
> AFAICT, SolrIndexSearcher can be used only to search all the segments of an 
> index in series. However, using CollectorManagers, segments can be searched 
> concurrently and result in reduced latency. Opening this issue to explore the 
> effectiveness of using CollectorManagers in SolrIndexSearcher from latency 
> and throughput perspective.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



[jira] [Commented] (SOLR-13350) Explore collector managers for multi-threaded search

2024-07-18 Thread Christine Poerschke (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17867066#comment-17867066
 ] 

Christine Poerschke commented on SOLR-13350:


bq. ... currently nothing controls the passing of the executor to Lucene's 
{{IndexSearcher}} constructor. ...

Something like https://github.com/apache/solr/pull/2570 would be a way to 
control it, on a node level.

bq. ... Why this change in bin/solr to get the max CPUs instead of calling 
ManagementFactory.getOperatingSystemMXBean().getAvailableProcessors() ? ...

https://github.com/apache/solr/pull/2569 takes up [~gus]'s suggestion to use 
{{Runtime.getRuntime().availableProcessors()}} mixed together with 
[~ichattopadhyaya]'s change to remove {{nproc}} use, plus matching config and 
doc update.

bq. ... Lucene has {{MAX_DOCS_PER_SLICE}} and {{MAX_SEGMENTS_PER_SLICE}} 
constants and slicing logic ...

Experiments with that did not provide support for making this configurable.

bq. ... I'm benchmarking more thoroughly at the moment. ...

Looking forward to learning more about the results, as and when they are 
available. Thanks for re-benchmarking on this!

> Explore collector managers for multi-threaded search
> 
>
> Key: SOLR-13350
> URL: https://issues.apache.org/jira/browse/SOLR-13350
> Project: Solr
>  Issue Type: New Feature
>Reporter: Ishan Chattopadhyaya
>Assignee: Ishan Chattopadhyaya
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 9.7
>
> Attachments: SOLR-13350-pre-PR-2508.patch, SOLR-13350.patch, 
> SOLR-13350.patch, SOLR-13350.patch
>
>  Time Spent: 13h
>  Remaining Estimate: 0h
>
> AFAICT, SolrIndexSearcher can be used only to search all the segments of an 
> index in series. However, using CollectorManagers, segments can be searched 
> concurrently and result in reduced latency. Opening this issue to explore the 
> effectiveness of using CollectorManagers in SolrIndexSearcher from latency 
> and throughput perspective.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



[jira] [Commented] (SOLR-13350) Explore collector managers for multi-threaded search

2024-06-26 Thread David Smiley (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17860143#comment-17860143
 ] 

David Smiley commented on SOLR-13350:
-

I saw an obscure failure for [this 
assertion|https://github.com/apache/solr/blob/2fe98d962bb498c29a032708739c9a41e1a263d9/solr/core/src/test/org/apache/solr/search/TestCollapseQParserPlugin.java#L119]
 in TestCollapseQParserPlugin.testMultiSort and I have a suspicion that 
subtleties in segment processing order from concurrent segment search may have 
tickled it.  The assertion tests something rather arbitrary, arguably should 
not be tested like this -- i.e. it should test that one of the returned docs 
has ID 2 or 3 as both are valid answers.

> Explore collector managers for multi-threaded search
> 
>
> Key: SOLR-13350
> URL: https://issues.apache.org/jira/browse/SOLR-13350
> Project: Solr
>  Issue Type: New Feature
>Reporter: Ishan Chattopadhyaya
>Assignee: Ishan Chattopadhyaya
>Priority: Blocker
> Fix For: 9.7
>
> Attachments: SOLR-13350-pre-PR-2508.patch, SOLR-13350.patch, 
> SOLR-13350.patch, SOLR-13350.patch
>
>  Time Spent: 12h 10m
>  Remaining Estimate: 0h
>
> AFAICT, SolrIndexSearcher can be used only to search all the segments of an 
> index in series. However, using CollectorManagers, segments can be searched 
> concurrently and result in reduced latency. Opening this issue to explore the 
> effectiveness of using CollectorManagers in SolrIndexSearcher from latency 
> and throughput perspective.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



[jira] [Commented] (SOLR-13350) Explore collector managers for multi-threaded search

2024-06-25 Thread Ishan Chattopadhyaya (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17860043#comment-17860043
 ] 

Ishan Chattopadhyaya commented on SOLR-13350:
-

> https://github.com/apache/solr/pull/2508 opened to explore replacing the 
> cached with a fixed threadpool.

Thanks [~cpoerschke]! This seems definitely faster than using cached 
threadpools. I'm benchmarking more thoroughly at the moment.
I've marked this issue as a release blocker, as we shouldn't have a release in 
this state (without your patch, and potentially other fixes on top of that).

> Explore collector managers for multi-threaded search
> 
>
> Key: SOLR-13350
> URL: https://issues.apache.org/jira/browse/SOLR-13350
> Project: Solr
>  Issue Type: New Feature
>Reporter: Ishan Chattopadhyaya
>Assignee: Ishan Chattopadhyaya
>Priority: Blocker
> Attachments: SOLR-13350-pre-PR-2508.patch, SOLR-13350.patch, 
> SOLR-13350.patch, SOLR-13350.patch
>
>  Time Spent: 12h 10m
>  Remaining Estimate: 0h
>
> AFAICT, SolrIndexSearcher can be used only to search all the segments of an 
> index in series. However, using CollectorManagers, segments can be searched 
> concurrently and result in reduced latency. Opening this issue to explore the 
> effectiveness of using CollectorManagers in SolrIndexSearcher from latency 
> and throughput perspective.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



[jira] [Commented] (SOLR-13350) Explore collector managers for multi-threaded search

2024-06-19 Thread Christine Poerschke (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17856297#comment-17856297
 ] 

Christine Poerschke commented on SOLR-13350:


[~dsmiley] at https://github.com/apache/solr/pull/2508#issuecomment-2176117831 
wrote:

bq. Can you recommend a way to benchmark this myself, like if I wanted to tweak 
it to see how my ideas work out?

[~ichattopadhyaya] above wrote:

bq. ... There are a few more things I've identified that can be improved here, 
around the defaults, ...

I don't know what the ideas or things might be but just wanted to share that 
Lucene has {{MAX_DOCS_PER_SLICE}} and {{MAX_SEGMENTS_PER_SLICE}} constants and 
slicing logic -- 
https://github.com/apache/lucene/blob/releases/lucene/9.9.2/lucene/core/src/java/org/apache/lucene/search/IndexSearcher.java#L325-L332
 -- and I'm planning to run some experiments w.r.t. reducing the 
{{MAX_SEGMENTS_PER_SLICE}} to 1.

(And to be clear, I'm not suggesting that {{SolrIndexSearcher}} provide a way 
for users to be able to configure these {{MAX_*_PER_SLICE}} values, just 
sharing that the slicing logic is there.)

> Explore collector managers for multi-threaded search
> 
>
> Key: SOLR-13350
> URL: https://issues.apache.org/jira/browse/SOLR-13350
> Project: Solr
>  Issue Type: New Feature
>Reporter: Ishan Chattopadhyaya
>Assignee: Ishan Chattopadhyaya
>Priority: Major
> Attachments: SOLR-13350-pre-PR-2508.patch, SOLR-13350.patch, 
> SOLR-13350.patch, SOLR-13350.patch
>
>  Time Spent: 12h 10m
>  Remaining Estimate: 0h
>
> AFAICT, SolrIndexSearcher can be used only to search all the segments of an 
> index in series. However, using CollectorManagers, segments can be searched 
> concurrently and result in reduced latency. Opening this issue to explore the 
> effectiveness of using CollectorManagers in SolrIndexSearcher from latency 
> and throughput perspective.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



[jira] [Commented] (SOLR-13350) Explore collector managers for multi-threaded search

2024-06-18 Thread David Smiley (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17856077#comment-17856077
 ] 

David Smiley commented on SOLR-13350:
-

Why this change in 
[bin/solr|https://github.com/apache/solr/blob/ff6607d25f023a59f866a66820037bb215342ca8/solr/bin/solr#L1445]
 to get the max CPUs instead of calling 
ManagementFactory.getOperatingSystemMXBean().getAvailableProcessors() ?

> Explore collector managers for multi-threaded search
> 
>
> Key: SOLR-13350
> URL: https://issues.apache.org/jira/browse/SOLR-13350
> Project: Solr
>  Issue Type: New Feature
>Reporter: Ishan Chattopadhyaya
>Assignee: Ishan Chattopadhyaya
>Priority: Major
> Attachments: SOLR-13350-pre-PR-2508.patch, SOLR-13350.patch, 
> SOLR-13350.patch, SOLR-13350.patch
>
>  Time Spent: 12h 10m
>  Remaining Estimate: 0h
>
> AFAICT, SolrIndexSearcher can be used only to search all the segments of an 
> index in series. However, using CollectorManagers, segments can be searched 
> concurrently and result in reduced latency. Opening this issue to explore the 
> effectiveness of using CollectorManagers in SolrIndexSearcher from latency 
> and throughput perspective.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



[jira] [Commented] (SOLR-13350) Explore collector managers for multi-threaded search

2024-06-18 Thread Christine Poerschke (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17856008#comment-17856008
 ] 

Christine Poerschke commented on SOLR-13350:


bq. ... The "just pass an executor" patch was onto a Solr 9.5 cloud, ...

I've attached the {{SOLR-13350-pre-PR-2508.patch}} in case that might be useful 
for others.

> Explore collector managers for multi-threaded search
> 
>
> Key: SOLR-13350
> URL: https://issues.apache.org/jira/browse/SOLR-13350
> Project: Solr
>  Issue Type: New Feature
>Reporter: Ishan Chattopadhyaya
>Assignee: Ishan Chattopadhyaya
>Priority: Major
> Attachments: SOLR-13350-pre-PR-2508.patch, SOLR-13350.patch, 
> SOLR-13350.patch, SOLR-13350.patch
>
>  Time Spent: 12h
>  Remaining Estimate: 0h
>
> AFAICT, SolrIndexSearcher can be used only to search all the segments of an 
> index in series. However, using CollectorManagers, segments can be searched 
> concurrently and result in reduced latency. Opening this issue to explore the 
> effectiveness of using CollectorManagers in SolrIndexSearcher from latency 
> and throughput perspective.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



[jira] [Commented] (SOLR-13350) Explore collector managers for multi-threaded search

2024-06-07 Thread Christine Poerschke (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17853217#comment-17853217
 ] 

Christine Poerschke commented on SOLR-13350:


https://github.com/apache/solr/pull/2508 opened to explore replacing the cached 
with a fixed threadpool.

> Explore collector managers for multi-threaded search
> 
>
> Key: SOLR-13350
> URL: https://issues.apache.org/jira/browse/SOLR-13350
> Project: Solr
>  Issue Type: New Feature
>Reporter: Ishan Chattopadhyaya
>Assignee: Ishan Chattopadhyaya
>Priority: Major
> Attachments: SOLR-13350.patch, SOLR-13350.patch, SOLR-13350.patch
>
>  Time Spent: 11h 50m
>  Remaining Estimate: 0h
>
> AFAICT, SolrIndexSearcher can be used only to search all the segments of an 
> index in series. However, using CollectorManagers, segments can be searched 
> concurrently and result in reduced latency. Opening this issue to explore the 
> effectiveness of using CollectorManagers in SolrIndexSearcher from latency 
> and throughput perspective.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



[jira] [Commented] (SOLR-13350) Explore collector managers for multi-threaded search

2024-06-07 Thread Christine Poerschke (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17853178#comment-17853178
 ] 

Christine Poerschke commented on SOLR-13350:


Hello again.
{quote}Overall the results could be summarised as "unexpected" or "surprising" 
– passing an executor increased latency by around 20x with a reduction in 
container CPU use to approximately match that. The thread count used seemed to 
make no difference, we've tried a few different ones.
{quote}
{quote}The "just pass an executor" patch was onto a Solr 9.5 cloud, and I 
haven't really dived into the details much, but I'm wondering if the 
implications for this ticket here might be that searches passing 
{{multiThreaded=false}} could be impacted because currently nothing controls 
the passing of the executor to Lucene's {{IndexSearcher}} constructor.
{quote}
So the unexpected increase in latency can be explained as follows I think:
 * A threadpool executor with {{corePoolSize}} of 0 and {{maximumPoolSize}} of 
N and {{queueCapacity}} N*1000 was/is used.
 ** 
[https://github.com/apache/solr/blob/26365679bde2f620b399f7801387e1cbee68cdc5/solr/core/src/java/org/apache/solr/core/CoreContainer.java#L446-L450]
 ** 
[https://github.com/apache/solr/blob/26365679bde2f620b399f7801387e1cbee68cdc5/solr/solrj/src/java/org/apache/solr/common/util/ExecutorUtil.java#L234-L243]
 * The 
[https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/util/concurrent/ThreadPoolExecutor.html]
 documentation says _"If corePoolSize or more threads are running, the Executor 
always prefers queuing a request rather than adding a new thread."_ and _"If a 
request cannot be queued, a new thread is created unless this would exceed 
maximumPoolSize, in which case, the task will be rejected."_
 ** With a very generous queue size queuing would have been always possible and 
I think that meant we were effectively running single threaded.

Experimentally using a fixed-size threadpool executor i.e. 
[https://github.com/apache/solr/blob/26365679bde2f620b399f7801387e1cbee68cdc5/solr/solrj/src/java/org/apache/solr/common/util/ExecutorUtil.java#L198-L202]
 removed the slowness (and +very nicely+ improved latency relative to the 
pre-experimental baseline).

A fixed-size threadpool executor with an unlimited queue capacity might be 
undesirable though, in which case an alternative approach could be to retain 
the queueCapacity and to (say) have {{corePoolSize}} match {{maximumPoolSize}} 
in value, or something along those lines.

> Explore collector managers for multi-threaded search
> 
>
> Key: SOLR-13350
> URL: https://issues.apache.org/jira/browse/SOLR-13350
> Project: Solr
>  Issue Type: New Feature
>Reporter: Ishan Chattopadhyaya
>Assignee: Ishan Chattopadhyaya
>Priority: Major
> Attachments: SOLR-13350.patch, SOLR-13350.patch, SOLR-13350.patch
>
>  Time Spent: 11.5h
>  Remaining Estimate: 0h
>
> AFAICT, SolrIndexSearcher can be used only to search all the segments of an 
> index in series. However, using CollectorManagers, segments can be searched 
> concurrently and result in reduced latency. Opening this issue to explore the 
> effectiveness of using CollectorManagers in SolrIndexSearcher from latency 
> and throughput perspective.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



[jira] [Commented] (SOLR-13350) Explore collector managers for multi-threaded search

2024-05-29 Thread Christine Poerschke (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17850405#comment-17850405
 ] 

Christine Poerschke commented on SOLR-13350:


Hello. I'd like to share some results from experiments with a subset of this 
ticket's changes when benchmarking dense vector search.

Having noticed that {{AbstractKnnVectorQuery.rewrite}} has parallelism and that 
Solr's {{SolrIndexSearcher}} constructor did not yet pass an executor to its 
super class i.e. Lucene's {{IndexSearcher}} I was curious if picking out only 
the "pass an executor" part of the changes in this ticket would be beneficial.

Links for Solr 9.4.1 using Lucene 9.8.0 version:
 * 
[https://github.com/apache/lucene/blob/releases/lucene/9.8.0/lucene/core/src/java/org/apache/lucene/search/AbstractKnnVectorQuery.java#L83-L87]
 * 
[https://github.com/apache/lucene/blob/releases/lucene/9.8.0/lucene/core/src/java/org/apache/lucene/search/IndexSearcher.java#L235]

Links for Solr 9.5.0 using Lucene 9.9.2 version:
 * 
[https://github.com/apache/lucene/blob/releases/lucene/9.9.2/lucene/core/src/java/org/apache/lucene/search/AbstractKnnVectorQuery.java#L82-L88]
 * 
[https://github.com/apache/lucene/blob/releases/lucene/9.9.2/lucene/core/src/java/org/apache/lucene/search/IndexSearcher.java#L233-L234]

The links are for 9.4.1 and 9.5.0 versions for convenience and to reflect 
initial code reading and subsequent experimental code base.

Overall the results could be summarised as "unexpected" or "surprising" – 
passing an executor increased latency by around 20x with a reduction in 
container CPU use to approximately match that. The thread count used seemed to 
make no difference, we've tried a few different ones.

The "just pass an executor" patch was onto a Solr 9.5 cloud, and I haven't 
really dived into the details much, but I'm wondering if the implications for 
this ticket here might be that searches passing {{multiThreaded=false}} could 
be impacted because currently nothing controls the passing of the executor to 
Lucene's {{IndexSearcher}} constructor.

(And last but not least, a shoutout to [~abenedetti] because his 
[comment|https://lists.apache.org/thread/xc174z8qnls3o0by644n3fbtt28po7lc] on 
[this|https://lists.apache.org/thread/0olh0z2dc78y01k34yg06yrpzts2ggmp] user 
mailing list thread really encouraged me to take the time to write-up and share 
these results here.)

> Explore collector managers for multi-threaded search
> 
>
> Key: SOLR-13350
> URL: https://issues.apache.org/jira/browse/SOLR-13350
> Project: Solr
>  Issue Type: New Feature
>Reporter: Ishan Chattopadhyaya
>Assignee: Ishan Chattopadhyaya
>Priority: Major
> Attachments: SOLR-13350.patch, SOLR-13350.patch, SOLR-13350.patch
>
>  Time Spent: 11.5h
>  Remaining Estimate: 0h
>
> AFAICT, SolrIndexSearcher can be used only to search all the segments of an 
> index in series. However, using CollectorManagers, segments can be searched 
> concurrently and result in reduced latency. Opening this issue to explore the 
> effectiveness of using CollectorManagers in SolrIndexSearcher from latency 
> and throughput perspective.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



[jira] [Commented] (SOLR-13350) Explore collector managers for multi-threaded search

2024-05-22 Thread Andrzej Bialecki (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17848668#comment-17848668
 ] 

Andrzej Bialecki commented on SOLR-13350:
-

{quote}As of now, the timeAllowed requests are anyway executed without 
multithreading
{quote}
This is based on a {{QueryCommand.timeAllowed}} flag that is set only from the 
{{timeAllowed}} param. However, this concept was extended in SOLR-17138 to 
{{QueryLimits}} that is now initialized also using other params. There is 
indeed some inconsistency here that's a left-over from that change, in the 
sense that `QueryCommand.timeAllowed` should have been either removed 
completely or replaced with something like {{{}queryLimits{}}}, to make sure to 
check the current SolrRequestInfo for QueryLimits.

In any case, the minimal workaround for this could be to check 
{{QueryLimits.getCurrentLimits().isLimitsEnabled()}} instead of 
{{{}QueryCommand.timeAllowed{}}}. But a better fix would be to properly unbreak 
the tracking of the parent {{SolrRequestInfo}} in MT search.

> Explore collector managers for multi-threaded search
> 
>
> Key: SOLR-13350
> URL: https://issues.apache.org/jira/browse/SOLR-13350
> Project: Solr
>  Issue Type: New Feature
>Reporter: Ishan Chattopadhyaya
>Assignee: Ishan Chattopadhyaya
>Priority: Major
> Attachments: SOLR-13350.patch, SOLR-13350.patch, SOLR-13350.patch
>
>  Time Spent: 11.5h
>  Remaining Estimate: 0h
>
> AFAICT, SolrIndexSearcher can be used only to search all the segments of an 
> index in series. However, using CollectorManagers, segments can be searched 
> concurrently and result in reduced latency. Opening this issue to explore the 
> effectiveness of using CollectorManagers in SolrIndexSearcher from latency 
> and throughput perspective.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



[jira] [Commented] (SOLR-13350) Explore collector managers for multi-threaded search

2024-05-20 Thread Ishan Chattopadhyaya (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17847967#comment-17847967
 ] 

Ishan Chattopadhyaya commented on SOLR-13350:
-

bq. Are there other risks/trade-offs for enabling it besides timeAllowed being 
unsupported? Basically, when would a user not want this?

I'll add some guidance for users to the reference guide on this, before closing 
this issue.

> Explore collector managers for multi-threaded search
> 
>
> Key: SOLR-13350
> URL: https://issues.apache.org/jira/browse/SOLR-13350
> Project: Solr
>  Issue Type: New Feature
>Reporter: Ishan Chattopadhyaya
>Assignee: Ishan Chattopadhyaya
>Priority: Major
> Attachments: SOLR-13350.patch, SOLR-13350.patch, SOLR-13350.patch
>
>  Time Spent: 11.5h
>  Remaining Estimate: 0h
>
> AFAICT, SolrIndexSearcher can be used only to search all the segments of an 
> index in series. However, using CollectorManagers, segments can be searched 
> concurrently and result in reduced latency. Opening this issue to explore the 
> effectiveness of using CollectorManagers in SolrIndexSearcher from latency 
> and throughput perspective.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



[jira] [Commented] (SOLR-13350) Explore collector managers for multi-threaded search

2024-05-20 Thread Ishan Chattopadhyaya (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17847965#comment-17847965
 ] 

Ishan Chattopadhyaya commented on SOLR-13350:
-

bq. This is caused by breaking the end-to-end tracking of request context in 
SolrRequestInfo, which uses a thread-local deque to provide the same context 
for both the main and all sub-requests. This tracking is needed to setup the 
correct query timeout instance on the searcher ( QueryLimits ) for time-limited 
searches in the SolrIndexSearcher:727 . However, now that this method is 
executed in a separate "searcherCollector" thread the SolrRequestInfo instance 
it obtains is empty because it doesn't match the original thread that set it.

QueryLimits has two parts: timeAllowed and cpuThreadLimits.

# For timeAllowed, I can see that this value is passed to the searcher 
https://github.com/apache/solr/blob/main/solr/core/src/java/org/apache/solr/search/SolrIndexSearcher.java#L726-L735.
 Hence, I think timeAllowed will be honoured by Lucene properly.
# For cpuThreadLimits, the limits are set in SolrRequestInfo 
(https://github.com/apache/solr/blob/main/solr/core/src/java/org/apache/solr/request/SolrRequestInfo.java#L86-L87).
 Seems like they are not getting carried over to the sub-threads. Noble Paul 
pointed me to the InheritableThreadLocalProvider for this.

As of now, the timeAllowed requests are anyway executed without multithreading: 
https://github.com/apache/solr/blob/main/solr/core/src/java/org/apache/solr/search/MultiThreadedSearcher.java#L125
I'm considering adding the cpuThreadLimits based requests also to this 
exception list.

> Explore collector managers for multi-threaded search
> 
>
> Key: SOLR-13350
> URL: https://issues.apache.org/jira/browse/SOLR-13350
> Project: Solr
>  Issue Type: New Feature
>Reporter: Ishan Chattopadhyaya
>Assignee: Ishan Chattopadhyaya
>Priority: Major
> Attachments: SOLR-13350.patch, SOLR-13350.patch, SOLR-13350.patch
>
>  Time Spent: 11.5h
>  Remaining Estimate: 0h
>
> AFAICT, SolrIndexSearcher can be used only to search all the segments of an 
> index in series. However, using CollectorManagers, segments can be searched 
> concurrently and result in reduced latency. Opening this issue to explore the 
> effectiveness of using CollectorManagers in SolrIndexSearcher from latency 
> and throughput perspective.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



[jira] [Commented] (SOLR-13350) Explore collector managers for multi-threaded search

2024-05-18 Thread Ishan Chattopadhyaya (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17847589#comment-17847589
 ] 

Ishan Chattopadhyaya commented on SOLR-13350:
-

bq. Yes please revert, this is not fully baked, and 9x is supposed to be stable.

We are far away from the next release, so let us work towards fixing the query 
limits to work with MT search. If we are unable to, there is still not need to 
revert this change but just changing the defaults should be fine for the 
release.

> Explore collector managers for multi-threaded search
> 
>
> Key: SOLR-13350
> URL: https://issues.apache.org/jira/browse/SOLR-13350
> Project: Solr
>  Issue Type: New Feature
>Reporter: Ishan Chattopadhyaya
>Assignee: Ishan Chattopadhyaya
>Priority: Major
> Attachments: SOLR-13350.patch, SOLR-13350.patch, SOLR-13350.patch
>
>  Time Spent: 11.5h
>  Remaining Estimate: 0h
>
> AFAICT, SolrIndexSearcher can be used only to search all the segments of an 
> index in series. However, using CollectorManagers, segments can be searched 
> concurrently and result in reduced latency. Opening this issue to explore the 
> effectiveness of using CollectorManagers in SolrIndexSearcher from latency 
> and throughput perspective.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



[jira] [Commented] (SOLR-13350) Explore collector managers for multi-threaded search

2024-05-16 Thread Gus Heck (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846929#comment-17846929
 ] 

Gus Heck commented on SOLR-13350:
-

As it is, this will cause a break in back compatibility for folks that have 
adopted the new query limits functionality. The effect will be for the feature 
they intend to rely on to protect themselves from runaway queries to silently 
stop working.

> Explore collector managers for multi-threaded search
> 
>
> Key: SOLR-13350
> URL: https://issues.apache.org/jira/browse/SOLR-13350
> Project: Solr
>  Issue Type: New Feature
>Reporter: Ishan Chattopadhyaya
>Assignee: Ishan Chattopadhyaya
>Priority: Major
> Attachments: SOLR-13350.patch, SOLR-13350.patch, SOLR-13350.patch
>
>  Time Spent: 11h 20m
>  Remaining Estimate: 0h
>
> AFAICT, SolrIndexSearcher can be used only to search all the segments of an 
> index in series. However, using CollectorManagers, segments can be searched 
> concurrently and result in reduced latency. Opening this issue to explore the 
> effectiveness of using CollectorManagers in SolrIndexSearcher from latency 
> and throughput perspective.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



[jira] [Commented] (SOLR-13350) Explore collector managers for multi-threaded search

2024-05-16 Thread Gus Heck (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846927#comment-17846927
 ] 

Gus Heck commented on SOLR-13350:
-

Yes please revert, this is not fully baked, and 9x is supposed to be stable.

> Explore collector managers for multi-threaded search
> 
>
> Key: SOLR-13350
> URL: https://issues.apache.org/jira/browse/SOLR-13350
> Project: Solr
>  Issue Type: New Feature
>Reporter: Ishan Chattopadhyaya
>Assignee: Ishan Chattopadhyaya
>Priority: Major
> Attachments: SOLR-13350.patch, SOLR-13350.patch, SOLR-13350.patch
>
>  Time Spent: 11h 20m
>  Remaining Estimate: 0h
>
> AFAICT, SolrIndexSearcher can be used only to search all the segments of an 
> index in series. However, using CollectorManagers, segments can be searched 
> concurrently and result in reduced latency. Opening this issue to explore the 
> effectiveness of using CollectorManagers in SolrIndexSearcher from latency 
> and throughput perspective.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



[jira] [Commented] (SOLR-13350) Explore collector managers for multi-threaded search

2024-05-13 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846029#comment-17846029
 ] 

ASF subversion and git services commented on SOLR-13350:


Commit b8410234993e44f9b64b02f053159f52c73f4433 in solr's branch 
refs/heads/branch_9x from Ishan Chattopadhyaya
[ https://gitbox.apache.org/repos/asf?p=solr.git;h=b8410234993 ]

SOLR-13350: Multithreaded search


> Explore collector managers for multi-threaded search
> 
>
> Key: SOLR-13350
> URL: https://issues.apache.org/jira/browse/SOLR-13350
> Project: Solr
>  Issue Type: New Feature
>Reporter: Ishan Chattopadhyaya
>Assignee: Ishan Chattopadhyaya
>Priority: Major
> Attachments: SOLR-13350.patch, SOLR-13350.patch, SOLR-13350.patch
>
>  Time Spent: 11h 20m
>  Remaining Estimate: 0h
>
> AFAICT, SolrIndexSearcher can be used only to search all the segments of an 
> index in series. However, using CollectorManagers, segments can be searched 
> concurrently and result in reduced latency. Opening this issue to explore the 
> effectiveness of using CollectorManagers in SolrIndexSearcher from latency 
> and throughput perspective.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



[jira] [Commented] (SOLR-13350) Explore collector managers for multi-threaded search

2024-05-10 Thread David Smiley (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17845332#comment-17845332
 ] 

David Smiley commented on SOLR-13350:
-

I suspect timeAllowed support could be added with relative ease.  For reasons 
above, it won't work OOTB though.

> Explore collector managers for multi-threaded search
> 
>
> Key: SOLR-13350
> URL: https://issues.apache.org/jira/browse/SOLR-13350
> Project: Solr
>  Issue Type: New Feature
>Reporter: Ishan Chattopadhyaya
>Assignee: Ishan Chattopadhyaya
>Priority: Major
> Attachments: SOLR-13350.patch, SOLR-13350.patch, SOLR-13350.patch
>
>  Time Spent: 11h 20m
>  Remaining Estimate: 0h
>
> AFAICT, SolrIndexSearcher can be used only to search all the segments of an 
> index in series. However, using CollectorManagers, segments can be searched 
> concurrently and result in reduced latency. Opening this issue to explore the 
> effectiveness of using CollectorManagers in SolrIndexSearcher from latency 
> and throughput perspective.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



[jira] [Commented] (SOLR-13350) Explore collector managers for multi-threaded search

2024-05-10 Thread Andrzej Bialecki (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17845258#comment-17845258
 ] 

Andrzej Bialecki commented on SOLR-13350:
-

This is caused by breaking the end-to-end tracking of request context in 
{{{}SolrRequestInfo{}}}, which uses a thread-local deque to provide the same 
context for both the main and all sub-requests. This tracking is needed to 
setup the correct query timeout instance on the searcher ( {{QueryLimits}} ) 
for time-limited searches in the {{SolrIndexSearcher:727}} . However, now that 
this method is executed in a separate "searcherCollector" thread the 
{{SolrRequestInfo}} instance it obtains is empty because it doesn't match the 
original thread that set it.

> Explore collector managers for multi-threaded search
> 
>
> Key: SOLR-13350
> URL: https://issues.apache.org/jira/browse/SOLR-13350
> Project: Solr
>  Issue Type: New Feature
>Reporter: Ishan Chattopadhyaya
>Assignee: Ishan Chattopadhyaya
>Priority: Major
> Attachments: SOLR-13350.patch, SOLR-13350.patch, SOLR-13350.patch
>
>  Time Spent: 11h 20m
>  Remaining Estimate: 0h
>
> AFAICT, SolrIndexSearcher can be used only to search all the segments of an 
> index in series. However, using CollectorManagers, segments can be searched 
> concurrently and result in reduced latency. Opening this issue to explore the 
> effectiveness of using CollectorManagers in SolrIndexSearcher from latency 
> and throughput perspective.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



[jira] [Commented] (SOLR-13350) Explore collector managers for multi-threaded search

2024-05-09 Thread Gus Heck (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17845149#comment-17845149
 ] 

Gus Heck commented on SOLR-13350:
-

I'm a little concerned that multithreaded is set to false [in the 
TestCpuAllowed 
test|https://github.com/apache/solr/commit/ff6607d25f023a59f866a66820037bb215342ca8#diff-874aaa9daaf158f756a5bd6d61ec8ec5a6961063c5916e54641d2be7ad39a4b9R189]
 and 
[TestQueryLimits|https://github.com/apache/solr/commit/ff6607d25f023a59f866a66820037bb215342ca8#diff-0d9937f23efd32a0eadf335a72ea5c79727d15d1ed5a53bb89834da6164b8a36R89]
 tests.

Was that meant to be revisited? if not, it seems to say that multi-threaded 
search is incompatible with query limits. That's an important thing to document 
if it is the intention. Even if documented, it doesn't seem great that users 
have to chose... Especially since fanning out on multiple threads presents an 
even greater opportunity to starve innocent bystander queries with a 
misbehaving query.

> Explore collector managers for multi-threaded search
> 
>
> Key: SOLR-13350
> URL: https://issues.apache.org/jira/browse/SOLR-13350
> Project: Solr
>  Issue Type: New Feature
>Reporter: Ishan Chattopadhyaya
>Assignee: Ishan Chattopadhyaya
>Priority: Major
> Attachments: SOLR-13350.patch, SOLR-13350.patch, SOLR-13350.patch
>
>  Time Spent: 11h 20m
>  Remaining Estimate: 0h
>
> AFAICT, SolrIndexSearcher can be used only to search all the segments of an 
> index in series. However, using CollectorManagers, segments can be searched 
> concurrently and result in reduced latency. Opening this issue to explore the 
> effectiveness of using CollectorManagers in SolrIndexSearcher from latency 
> and throughput perspective.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



[jira] [Commented] (SOLR-13350) Explore collector managers for multi-threaded search

2024-05-09 Thread Houston Putman (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17845138#comment-17845138
 ] 

Houston Putman commented on SOLR-13350:
---

I've been looking around and can't quite tell why timeAllowed isn't allowed 
with multithreaded search. Since each multi-threaded collector can respect 
timeAllowed itself, it seems easy enough to support. I'm sure I'm missing 
something though.

> Explore collector managers for multi-threaded search
> 
>
> Key: SOLR-13350
> URL: https://issues.apache.org/jira/browse/SOLR-13350
> Project: Solr
>  Issue Type: New Feature
>Reporter: Ishan Chattopadhyaya
>Assignee: Ishan Chattopadhyaya
>Priority: Major
> Attachments: SOLR-13350.patch, SOLR-13350.patch, SOLR-13350.patch
>
>  Time Spent: 11h 20m
>  Remaining Estimate: 0h
>
> AFAICT, SolrIndexSearcher can be used only to search all the segments of an 
> index in series. However, using CollectorManagers, segments can be searched 
> concurrently and result in reduced latency. Opening this issue to explore the 
> effectiveness of using CollectorManagers in SolrIndexSearcher from latency 
> and throughput perspective.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



[jira] [Commented] (SOLR-13350) Explore collector managers for multi-threaded search

2024-05-06 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17843845#comment-17843845
 ] 

ASF subversion and git services commented on SOLR-13350:


Commit f823e9b8695b05566a428cd7f7055714b6c325bd in solr's branch 
refs/heads/main from Ishan Chattopadhyaya
[ https://gitbox.apache.org/repos/asf?p=solr.git;h=f823e9b8695 ]

SOLR-13350: Fixing logging to be less verbose


> Explore collector managers for multi-threaded search
> 
>
> Key: SOLR-13350
> URL: https://issues.apache.org/jira/browse/SOLR-13350
> Project: Solr
>  Issue Type: New Feature
>Reporter: Ishan Chattopadhyaya
>Assignee: Ishan Chattopadhyaya
>Priority: Major
> Attachments: SOLR-13350.patch, SOLR-13350.patch, SOLR-13350.patch
>
>  Time Spent: 10h 40m
>  Remaining Estimate: 0h
>
> AFAICT, SolrIndexSearcher can be used only to search all the segments of an 
> index in series. However, using CollectorManagers, segments can be searched 
> concurrently and result in reduced latency. Opening this issue to explore the 
> effectiveness of using CollectorManagers in SolrIndexSearcher from latency 
> and throughput perspective.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



[jira] [Commented] (SOLR-13350) Explore collector managers for multi-threaded search

2024-05-06 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17843842#comment-17843842
 ] 

ASF subversion and git services commented on SOLR-13350:


Commit a4979e0b80b93bbf7d76e7101de4fa34332672d7 in solr's branch 
refs/heads/main from Ishan Chattopadhyaya
[ https://gitbox.apache.org/repos/asf?p=solr.git;h=a4979e0b80b ]

SOLR-13350: Fixing tidy


> Explore collector managers for multi-threaded search
> 
>
> Key: SOLR-13350
> URL: https://issues.apache.org/jira/browse/SOLR-13350
> Project: Solr
>  Issue Type: New Feature
>Reporter: Ishan Chattopadhyaya
>Assignee: Ishan Chattopadhyaya
>Priority: Major
> Attachments: SOLR-13350.patch, SOLR-13350.patch, SOLR-13350.patch
>
>  Time Spent: 10h 40m
>  Remaining Estimate: 0h
>
> AFAICT, SolrIndexSearcher can be used only to search all the segments of an 
> index in series. However, using CollectorManagers, segments can be searched 
> concurrently and result in reduced latency. Opening this issue to explore the 
> effectiveness of using CollectorManagers in SolrIndexSearcher from latency 
> and throughput perspective.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



[jira] [Commented] (SOLR-13350) Explore collector managers for multi-threaded search

2024-05-06 Thread Ishan Chattopadhyaya (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17843841#comment-17843841
 ] 

Ishan Chattopadhyaya commented on SOLR-13350:
-

bq. Do we have a limited queue for the pool, leading to the 
"RejectedExecution"?  I think we'd like a caller-runs policy so we don't wait & 
starve

Changed it to a larger queue than then number of threads, and removed the 
rejected execution handler as well.

bq. Commit ff6607d25f023a59f866a66820037bb215342ca8 in solr's branch 
refs/heads/main from Ishan Chattopadhyaya
Will backport to branch_9x after a few days of baking on main.

> Explore collector managers for multi-threaded search
> 
>
> Key: SOLR-13350
> URL: https://issues.apache.org/jira/browse/SOLR-13350
> Project: Solr
>  Issue Type: New Feature
>Reporter: Ishan Chattopadhyaya
>Assignee: Ishan Chattopadhyaya
>Priority: Major
> Attachments: SOLR-13350.patch, SOLR-13350.patch, SOLR-13350.patch
>
>  Time Spent: 10h 40m
>  Remaining Estimate: 0h
>
> AFAICT, SolrIndexSearcher can be used only to search all the segments of an 
> index in series. However, using CollectorManagers, segments can be searched 
> concurrently and result in reduced latency. Opening this issue to explore the 
> effectiveness of using CollectorManagers in SolrIndexSearcher from latency 
> and throughput perspective.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



[jira] [Commented] (SOLR-13350) Explore collector managers for multi-threaded search

2024-05-06 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17843838#comment-17843838
 ] 

ASF subversion and git services commented on SOLR-13350:


Commit ff6607d25f023a59f866a66820037bb215342ca8 in solr's branch 
refs/heads/main from Ishan Chattopadhyaya
[ https://gitbox.apache.org/repos/asf?p=solr.git;h=ff6607d25f0 ]

SOLR-13350: Multithreaded search (closes #2248)


> Explore collector managers for multi-threaded search
> 
>
> Key: SOLR-13350
> URL: https://issues.apache.org/jira/browse/SOLR-13350
> Project: Solr
>  Issue Type: New Feature
>Reporter: Ishan Chattopadhyaya
>Assignee: Ishan Chattopadhyaya
>Priority: Major
> Attachments: SOLR-13350.patch, SOLR-13350.patch, SOLR-13350.patch
>
>  Time Spent: 10h 20m
>  Remaining Estimate: 0h
>
> AFAICT, SolrIndexSearcher can be used only to search all the segments of an 
> index in series. However, using CollectorManagers, segments can be searched 
> concurrently and result in reduced latency. Opening this issue to explore the 
> effectiveness of using CollectorManagers in SolrIndexSearcher from latency 
> and throughput perspective.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



[jira] [Commented] (SOLR-13350) Explore collector managers for multi-threaded search

2024-04-01 Thread David Smiley (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17832928#comment-17832928
 ] 

David Smiley commented on SOLR-13350:
-

Do we have a limited queue for the pool, leading to the "RejectedExecution"?  I 
think we'd like a caller-runs policy so we don't wait & starve

> Explore collector managers for multi-threaded search
> 
>
> Key: SOLR-13350
> URL: https://issues.apache.org/jira/browse/SOLR-13350
> Project: Solr
>  Issue Type: New Feature
>Reporter: Ishan Chattopadhyaya
>Assignee: Ishan Chattopadhyaya
>Priority: Major
> Attachments: SOLR-13350.patch, SOLR-13350.patch, SOLR-13350.patch
>
>  Time Spent: 7h 10m
>  Remaining Estimate: 0h
>
> AFAICT, SolrIndexSearcher can be used only to search all the segments of an 
> index in series. However, using CollectorManagers, segments can be searched 
> concurrently and result in reduced latency. Opening this issue to explore the 
> effectiveness of using CollectorManagers in SolrIndexSearcher from latency 
> and throughput perspective.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



[jira] [Commented] (SOLR-13350) Explore collector managers for multi-threaded search

2024-04-01 Thread Ishan Chattopadhyaya (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17832913#comment-17832913
 ] 

Ishan Chattopadhyaya commented on SOLR-13350:
-

{code}
java.util.concurrent.RejectedExecutionException: Task 
org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor$$Lambda$250/0x000800376040@2c13c7b6
 rejected from 
org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor@407b44ad[Running,
 pool size = 6, active threads = 0, queued tasks = 6, completed tasks = 643]
at 
java.base/java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2055)
 ~[?:?]
at 
java.base/java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:825)
 ~[?:?]
at 
java.base/java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1355)
 ~[?:?]
at 
org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.execute(ExecutorUtil.java:275)
 ~[solrj/:?]
at 
org.apache.lucene.search.TaskExecutor$TaskGroup.invokeAll(TaskExecutor.java:153)
 ~[lucene-core-9.9.2.jar:9.9.2 a2939784c4ca60bc28bf488b5479c02fc2e5e22c - 
2024-01-25 09:51:09]
at 
org.apache.lucene.search.TaskExecutor.invokeAll(TaskExecutor.java:76) 
~[lucene-core-9.9.2.jar:9.9.2 a2939784c4ca60bc28bf488b5479c02fc2e5e22c - 
2024-01-25 09:51:09]
at org.apache.lucene.index.TermStates.build(TermStates.java:116) 
~[lucene-core-9.9.2.jar:9.9.2 a2939784c4ca60bc28bf488b5479c02fc2e5e22c - 
2024-01-25 09:51:09]
at org.apache.lucene.search.TermQuery.createWeight(TermQuery.java:275) 
~[lucene-core-9.9.2.jar:9.9.2 a2939784c4ca60bc28bf488b5479c02fc2e5e22c - 
2024-01-25 09:51:09]
at 
org.apache.lucene.search.IndexSearcher.createWeight(IndexSearcher.java:900) 
~[lucene-core-9.9.2.jar:9.9.2 a2939784c4ca60bc28bf488b5479c02fc2e5e22c - 
2024-01-25 09:51:09]
at 
org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:691) 
~[lucene-core-9.9.2.jar:9.9.2 a2939784c4ca60bc28bf488b5479c02fc2e5e22c - 
2024-01-25 09:51:09]
at 
org.apache.solr.search.SolrIndexSearcher.searchCollectorManagers(SolrIndexSearcher.java:2108)
 ~[core/:?]
at 
org.apache.solr.search.SolrIndexSearcher.getDocListNC(SolrIndexSearcher.java:1922)
 ~[core/:?]
at 
org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:1702)
 ~[core/:?]

{code}

> Explore collector managers for multi-threaded search
> 
>
> Key: SOLR-13350
> URL: https://issues.apache.org/jira/browse/SOLR-13350
> Project: Solr
>  Issue Type: New Feature
>Reporter: Ishan Chattopadhyaya
>Assignee: Ishan Chattopadhyaya
>Priority: Major
> Attachments: SOLR-13350.patch, SOLR-13350.patch, SOLR-13350.patch
>
>  Time Spent: 7h 10m
>  Remaining Estimate: 0h
>
> AFAICT, SolrIndexSearcher can be used only to search all the segments of an 
> index in series. However, using CollectorManagers, segments can be searched 
> concurrently and result in reduced latency. Opening this issue to explore the 
> effectiveness of using CollectorManagers in SolrIndexSearcher from latency 
> and throughput perspective.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



[jira] [Commented] (SOLR-13350) Explore collector managers for multi-threaded search

2024-04-01 Thread Ishan Chattopadhyaya (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17832907#comment-17832907
 ] 

Ishan Chattopadhyaya commented on SOLR-13350:
-

I think I understand what is going on:

This issue in Lucene now uses the provided threadpool that Solr passes to the 
searcher to spawn concurrent sub-tasks for building the term states: 
https://github.com/apache/lucene/pull/12183

However, this leads to starvation, because once all threads are executing user 
queries, there are no free threads in the threadpool to spawn more tasks for 
building these term states.

Related issue (but not the cause) for this is: 
https://github.com/apache/lucene/commit/2106bf5172a9c38a8457db383eb9f5cd1918ddc5


> Explore collector managers for multi-threaded search
> 
>
> Key: SOLR-13350
> URL: https://issues.apache.org/jira/browse/SOLR-13350
> Project: Solr
>  Issue Type: New Feature
>Reporter: Ishan Chattopadhyaya
>Assignee: Ishan Chattopadhyaya
>Priority: Major
> Attachments: SOLR-13350.patch, SOLR-13350.patch, SOLR-13350.patch
>
>  Time Spent: 7h 10m
>  Remaining Estimate: 0h
>
> AFAICT, SolrIndexSearcher can be used only to search all the segments of an 
> index in series. However, using CollectorManagers, segments can be searched 
> concurrently and result in reduced latency. Opening this issue to explore the 
> effectiveness of using CollectorManagers in SolrIndexSearcher from latency 
> and throughput perspective.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



[jira] [Commented] (SOLR-13350) Explore collector managers for multi-threaded search

2024-02-10 Thread David Smiley (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17816348#comment-17816348
 ] 

David Smiley commented on SOLR-13350:
-

I'm confused why some of you propose other than "b" – Global.  How would the 
client/user know how many threads are appropriate?  They are not 
running/operating Solr; they don't know it's hardware or load.

> Explore collector managers for multi-threaded search
> 
>
> Key: SOLR-13350
> URL: https://issues.apache.org/jira/browse/SOLR-13350
> Project: Solr
>  Issue Type: New Feature
>Reporter: Ishan Chattopadhyaya
>Assignee: Ishan Chattopadhyaya
>Priority: Major
> Attachments: SOLR-13350.patch, SOLR-13350.patch, SOLR-13350.patch
>
>  Time Spent: 3h 50m
>  Remaining Estimate: 0h
>
> AFAICT, SolrIndexSearcher can be used only to search all the segments of an 
> index in series. However, using CollectorManagers, segments can be searched 
> concurrently and result in reduced latency. Opening this issue to explore the 
> effectiveness of using CollectorManagers in SolrIndexSearcher from latency 
> and throughput perspective.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



[jira] [Commented] (SOLR-13350) Explore collector managers for multi-threaded search

2024-02-08 Thread Noble Paul (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17815862#comment-17815862
 ] 

Noble Paul commented on SOLR-13350:
---

I think this must be purely a request parameter. If the user wishes the value 
to be fixed without explicitly psssing the parameter, they can use 
{{"defaults"}} in {{solrconfix.xml}}

> Explore collector managers for multi-threaded search
> 
>
> Key: SOLR-13350
> URL: https://issues.apache.org/jira/browse/SOLR-13350
> Project: Solr
>  Issue Type: New Feature
>Reporter: Ishan Chattopadhyaya
>Assignee: Ishan Chattopadhyaya
>Priority: Major
> Attachments: SOLR-13350.patch, SOLR-13350.patch, SOLR-13350.patch
>
>  Time Spent: 3h 50m
>  Remaining Estimate: 0h
>
> AFAICT, SolrIndexSearcher can be used only to search all the segments of an 
> index in series. However, using CollectorManagers, segments can be searched 
> concurrently and result in reduced latency. Opening this issue to explore the 
> effectiveness of using CollectorManagers in SolrIndexSearcher from latency 
> and throughput perspective.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



[jira] [Commented] (SOLR-13350) Explore collector managers for multi-threaded search

2024-02-07 Thread Ishan Chattopadhyaya (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17815426#comment-17815426
 ] 

Ishan Chattopadhyaya commented on SOLR-13350:
-

One thing that's important is configuration of the number of threads to be used 
for querying. There are several options:

a) Per collection configuration
b) Global configuration (core container based thread pool, as in the current 
patch)
c) Request time configuration (possibly overriding any global defaults)

I'm thinking (a) is not the best, and thinking of a combination of (b) and (c) 
here.
WDYT [~cpoerschke]?

> Explore collector managers for multi-threaded search
> 
>
> Key: SOLR-13350
> URL: https://issues.apache.org/jira/browse/SOLR-13350
> Project: Solr
>  Issue Type: New Feature
>Reporter: Ishan Chattopadhyaya
>Assignee: Ishan Chattopadhyaya
>Priority: Major
> Attachments: SOLR-13350.patch, SOLR-13350.patch, SOLR-13350.patch
>
>  Time Spent: 3h 50m
>  Remaining Estimate: 0h
>
> AFAICT, SolrIndexSearcher can be used only to search all the segments of an 
> index in series. However, using CollectorManagers, segments can be searched 
> concurrently and result in reduced latency. Opening this issue to explore the 
> effectiveness of using CollectorManagers in SolrIndexSearcher from latency 
> and throughput perspective.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



[jira] [Commented] (SOLR-13350) Explore collector managers for multi-threaded search

2024-02-06 Thread David Smiley (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17815016#comment-17815016
 ] 

David Smiley commented on SOLR-13350:
-

I'm really excited about this; thanks for resuming [~ichattopadhyaya] !  And 
[~atri] for his initial work, of course.

> Explore collector managers for multi-threaded search
> 
>
> Key: SOLR-13350
> URL: https://issues.apache.org/jira/browse/SOLR-13350
> Project: Solr
>  Issue Type: New Feature
>Reporter: Ishan Chattopadhyaya
>Assignee: Ishan Chattopadhyaya
>Priority: Major
> Attachments: SOLR-13350.patch, SOLR-13350.patch, SOLR-13350.patch
>
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> AFAICT, SolrIndexSearcher can be used only to search all the segments of an 
> index in series. However, using CollectorManagers, segments can be searched 
> concurrently and result in reduced latency. Opening this issue to explore the 
> effectiveness of using CollectorManagers in SolrIndexSearcher from latency 
> and throughput perspective.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



[jira] [Commented] (SOLR-13350) Explore collector managers for multi-threaded search

2024-02-06 Thread Ishan Chattopadhyaya (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17814975#comment-17814975
 ] 

Ishan Chattopadhyaya commented on SOLR-13350:
-

Same 20k queries as above:

1 user thread at a time, 6 search executor threads at a time:
{code}
50th=5.1998165,
90th=8.449806,
95th=10.19132270002,
mean=29.829016496896898,
total-queries=19980,
total-time=678569
{code}

6 user threads at a time, 1 search executor threads at a time:
{code}
50th=92.9081930001,
90th=378.8129378,
95th=575.193647003,
mean=200.04457501921922,
total-queries=19980,
total-time=671495
{code}

Seems like the throughput is very close to same (671s vs 678s for 20k queries), 
this indicates that overheads of the solution are are negligible.

> Explore collector managers for multi-threaded search
> 
>
> Key: SOLR-13350
> URL: https://issues.apache.org/jira/browse/SOLR-13350
> Project: Solr
>  Issue Type: New Feature
>Reporter: Ishan Chattopadhyaya
>Assignee: Ishan Chattopadhyaya
>Priority: Major
> Attachments: SOLR-13350.patch, SOLR-13350.patch, SOLR-13350.patch
>
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> AFAICT, SolrIndexSearcher can be used only to search all the segments of an 
> index in series. However, using CollectorManagers, segments can be searched 
> concurrently and result in reduced latency. Opening this issue to explore the 
> effectiveness of using CollectorManagers in SolrIndexSearcher from latency 
> and throughput perspective.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



[jira] [Commented] (SOLR-13350) Explore collector managers for multi-threaded search

2024-02-06 Thread Ishan Chattopadhyaya (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17814949#comment-17814949
 ] 

Ishan Chattopadhyaya commented on SOLR-13350:
-

With 100 million documents, and 20k queries (6 at a time) [0]:

With this change (it has a executor thread pool size of 6, so total 36 threads 
are processing at most):
{code}
50th=16.619524,
90th=30.94703160027,
95th=38.94829185025,
mean=20.014714108908905,
total-queries=19980,
total-time=67360
{code}

Before this change:
{code}
50th=92.9081930001,
90th=378.8129378,
95th=575.193647003,
mean=200.04457501921922,
total-queries=19980,
total-time=671495
{code}

[0] - 
https://github.com/fullstorydev/solr-bench/blob/master/suites/stress-facets-local.json

> Explore collector managers for multi-threaded search
> 
>
> Key: SOLR-13350
> URL: https://issues.apache.org/jira/browse/SOLR-13350
> Project: Solr
>  Issue Type: New Feature
>Reporter: Ishan Chattopadhyaya
>Assignee: Ishan Chattopadhyaya
>Priority: Major
> Attachments: SOLR-13350.patch, SOLR-13350.patch, SOLR-13350.patch
>
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> AFAICT, SolrIndexSearcher can be used only to search all the segments of an 
> index in series. However, using CollectorManagers, segments can be searched 
> concurrently and result in reduced latency. Opening this issue to explore the 
> effectiveness of using CollectorManagers in SolrIndexSearcher from latency 
> and throughput perspective.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



[jira] [Commented] (SOLR-13350) Explore collector managers for multi-threaded search

2024-02-06 Thread Ishan Chattopadhyaya (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17814927#comment-17814927
 ] 

Ishan Chattopadhyaya commented on SOLR-13350:
-

bq. It's surprising perhaps that GitHub can't identify you, looking at the 
https://infra.apache.org/new-committers-guide.html#config-access documentation 
nothing seems obviously new there.

Thanks Christine, those instructions worked great. All these years, I had added 
my details to id.apache.org and was expecting it to work (as per instructions I 
followed). This boxer thing is nice.

Latest PR is here now: https://github.com/apache/solr/pull/2248

> Explore collector managers for multi-threaded search
> 
>
> Key: SOLR-13350
> URL: https://issues.apache.org/jira/browse/SOLR-13350
> Project: Solr
>  Issue Type: New Feature
>Reporter: Ishan Chattopadhyaya
>Assignee: Ishan Chattopadhyaya
>Priority: Major
> Attachments: SOLR-13350.patch, SOLR-13350.patch, SOLR-13350.patch
>
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> AFAICT, SolrIndexSearcher can be used only to search all the segments of an 
> index in series. However, using CollectorManagers, segments can be searched 
> concurrently and result in reduced latency. Opening this issue to explore the 
> effectiveness of using CollectorManagers in SolrIndexSearcher from latency 
> and throughput perspective.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



[jira] [Commented] (SOLR-13350) Explore collector managers for multi-threaded search

2024-02-06 Thread Christine Poerschke (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17814821#comment-17814821
 ] 

Christine Poerschke commented on SOLR-13350:


bq. Latest patch here: ... (I can't create a PR, since Github can't identify me 
as a collaborator). ...

Thanks [~ichattopadhyaya] for continuing to pursue this!

It's surprising perhaps that GitHub can't identify you, looking at the 
https://infra.apache.org/new-committers-guide.html#config-access documentation 
nothing seems obviously new there.

If you'd like someone else to try and create a PR from the above branch, I'd be 
happy to give that a go if that would help.



> Explore collector managers for multi-threaded search
> 
>
> Key: SOLR-13350
> URL: https://issues.apache.org/jira/browse/SOLR-13350
> Project: Solr
>  Issue Type: New Feature
>Reporter: Ishan Chattopadhyaya
>Assignee: Ishan Chattopadhyaya
>Priority: Major
> Attachments: SOLR-13350.patch, SOLR-13350.patch, SOLR-13350.patch
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> AFAICT, SolrIndexSearcher can be used only to search all the segments of an 
> index in series. However, using CollectorManagers, segments can be searched 
> concurrently and result in reduced latency. Opening this issue to explore the 
> effectiveness of using CollectorManagers in SolrIndexSearcher from latency 
> and throughput perspective.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



[jira] [Commented] (SOLR-13350) Explore collector managers for multi-threaded search

2024-02-05 Thread Ishan Chattopadhyaya (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17814496#comment-17814496
 ] 

Ishan Chattopadhyaya commented on SOLR-13350:
-

Latest patch here: 
https://github.com/apache/solr/compare/apache:main...apache:jira/solr-13350
(I can't create a PR, since Github can't identify me as a collaborator).

Status:
* Search is faster
* Faceting is slower

Investigating the slowdown in faceting when using multiple threads. Just a 
note: When I replace the ThreadSafeBitSet (originally borrowed from Netflix's 
zeno project) with a non-threadsafe version, speed is better but results are 
wrong.

> Explore collector managers for multi-threaded search
> 
>
> Key: SOLR-13350
> URL: https://issues.apache.org/jira/browse/SOLR-13350
> Project: Solr
>  Issue Type: New Feature
>Reporter: Ishan Chattopadhyaya
>Assignee: Ishan Chattopadhyaya
>Priority: Major
> Attachments: SOLR-13350.patch, SOLR-13350.patch, SOLR-13350.patch
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> AFAICT, SolrIndexSearcher can be used only to search all the segments of an 
> index in series. However, using CollectorManagers, segments can be searched 
> concurrently and result in reduced latency. Opening this issue to explore the 
> effectiveness of using CollectorManagers in SolrIndexSearcher from latency 
> and throughput perspective.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



[jira] [Commented] (SOLR-13350) Explore collector managers for multi-threaded search

2023-06-21 Thread Ishan Chattopadhyaya (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17735961#comment-17735961
 ] 

Ishan Chattopadhyaya commented on SOLR-13350:
-

Thanks Mark!

bq.  If each thread is working on an index segment wouldn't it make sense to 
collect segment level bitsets? This would eliminate the need for thread safe 
bitset.

I thought about it. Doing it naïvely might end up requiring a lot of memory 
this way. But, it should be possible to do so efficiently (each segment level 
bitset being as large as the range of docs it holds, then merging them properly 
at the end). Thanks for the clue.


> Explore collector managers for multi-threaded search
> 
>
> Key: SOLR-13350
> URL: https://issues.apache.org/jira/browse/SOLR-13350
> Project: Solr
>  Issue Type: New Feature
>Reporter: Ishan Chattopadhyaya
>Assignee: Ishan Chattopadhyaya
>Priority: Major
> Attachments: SOLR-13350.patch, SOLR-13350.patch, SOLR-13350.patch
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> AFAICT, SolrIndexSearcher can be used only to search all the segments of an 
> index in series. However, using CollectorManagers, segments can be searched 
> concurrently and result in reduced latency. Opening this issue to explore the 
> effectiveness of using CollectorManagers in SolrIndexSearcher from latency 
> and throughput perspective.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



[jira] [Commented] (SOLR-13350) Explore collector managers for multi-threaded search

2023-06-21 Thread Joel Bernstein (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17735958#comment-17735958
 ] 

Joel Bernstein commented on SOLR-13350:
---

If each thread is working on an index segment wouldn't it make sense to collect 
segment level bitsets? This would eliminate the need for thread safe bitset. 

> Explore collector managers for multi-threaded search
> 
>
> Key: SOLR-13350
> URL: https://issues.apache.org/jira/browse/SOLR-13350
> Project: Solr
>  Issue Type: New Feature
>Reporter: Ishan Chattopadhyaya
>Assignee: Ishan Chattopadhyaya
>Priority: Major
> Attachments: SOLR-13350.patch, SOLR-13350.patch, SOLR-13350.patch
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> AFAICT, SolrIndexSearcher can be used only to search all the segments of an 
> index in series. However, using CollectorManagers, segments can be searched 
> concurrently and result in reduced latency. Opening this issue to explore the 
> effectiveness of using CollectorManagers in SolrIndexSearcher from latency 
> and throughput perspective.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



[jira] [Commented] (SOLR-13350) Explore collector managers for multi-threaded search

2023-06-21 Thread Mark Robert Miller (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17735929#comment-17735929
 ] 

Mark Robert Miller commented on SOLR-13350:
---

I don't recall this code, but if the bitset was like an array, you wouldn't 
need a memory barrier for threads to be able to work on disjoint sections of 
the array. But you would need a memory barrier or something like an effectively 
final array variable for the threads to access the array reference, and you'd 
need a memory barrier on the values of the array for another thread to see them 
correctly. 

> Explore collector managers for multi-threaded search
> 
>
> Key: SOLR-13350
> URL: https://issues.apache.org/jira/browse/SOLR-13350
> Project: Solr
>  Issue Type: New Feature
>Reporter: Ishan Chattopadhyaya
>Assignee: Ishan Chattopadhyaya
>Priority: Major
> Attachments: SOLR-13350.patch, SOLR-13350.patch, SOLR-13350.patch
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> AFAICT, SolrIndexSearcher can be used only to search all the segments of an 
> index in series. However, using CollectorManagers, segments can be searched 
> concurrently and result in reduced latency. Opening this issue to explore the 
> effectiveness of using CollectorManagers in SolrIndexSearcher from latency 
> and throughput perspective.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



[jira] [Commented] (SOLR-13350) Explore collector managers for multi-threaded search

2023-06-21 Thread Ishan Chattopadhyaya (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17735817#comment-17735817
 ] 

Ishan Chattopadhyaya commented on SOLR-13350:
-

[~markrmil...@gmail.com], the ThreadSafeBitSet has a spin lock and compare and 
set,
https://github.com/chatman/solr/blob/jira/solr-13350-9x/solr/core/src/java/org/apache/solr/search/ThreadSafeBitSet.java#L81-L89
How important is a thread safe bitset here? I am thinking that separate threads 
will operate on disjoint regions within the same bitset, and likely never 
collide on the same bits. Is that alone not sufficient to ensure consistency 
under concurrent updates to the bitset, without cas loop or locking etc.?

> Explore collector managers for multi-threaded search
> 
>
> Key: SOLR-13350
> URL: https://issues.apache.org/jira/browse/SOLR-13350
> Project: Solr
>  Issue Type: New Feature
>Reporter: Ishan Chattopadhyaya
>Assignee: Ishan Chattopadhyaya
>Priority: Major
> Attachments: SOLR-13350.patch, SOLR-13350.patch, SOLR-13350.patch
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> AFAICT, SolrIndexSearcher can be used only to search all the segments of an 
> index in series. However, using CollectorManagers, segments can be searched 
> concurrently and result in reduced latency. Opening this issue to explore the 
> effectiveness of using CollectorManagers in SolrIndexSearcher from latency 
> and throughput perspective.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



[jira] [Commented] (SOLR-13350) Explore collector managers for multi-threaded search

2023-06-21 Thread Ishan Chattopadhyaya (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17735586#comment-17735586
 ] 

Ishan Chattopadhyaya commented on SOLR-13350:
-

bq. Did a performance testing on this suite: 
https://github.com/fullstorydev/solr-bench/blob/master/suites/stress-facets-local.json
Querying performance is about 24x slower! :-o Seems like some obvious 
bottleneck, that I'll be working on chasing down.

Seems like running a faceting query causes a massive slowdown (26 seconds, as 
opposed to 300ms), but running a scores (need scores) query first, and then 
firing a follow up faceting query (using the cached docset from the previous 
step) seems to be quick and fast. I'm looking into why this could be happening.

> Explore collector managers for multi-threaded search
> 
>
> Key: SOLR-13350
> URL: https://issues.apache.org/jira/browse/SOLR-13350
> Project: Solr
>  Issue Type: New Feature
>Reporter: Ishan Chattopadhyaya
>Assignee: Ishan Chattopadhyaya
>Priority: Major
> Attachments: SOLR-13350.patch, SOLR-13350.patch, SOLR-13350.patch
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> AFAICT, SolrIndexSearcher can be used only to search all the segments of an 
> index in series. However, using CollectorManagers, segments can be searched 
> concurrently and result in reduced latency. Opening this issue to explore the 
> effectiveness of using CollectorManagers in SolrIndexSearcher from latency 
> and throughput perspective.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



[jira] [Commented] (SOLR-13350) Explore collector managers for multi-threaded search

2023-06-20 Thread Ishan Chattopadhyaya (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17735331#comment-17735331
 ] 

Ishan Chattopadhyaya commented on SOLR-13350:
-

I've updated the patches here to latest main and 9x:
https://github.com/chatman/solr/tree/jira/solr-13350-5
https://github.com/chatman/solr/tree/jira/solr-13350-9x

Did a performance testing on this suite: 
https://github.com/fullstorydev/solr-bench/blob/master/suites/stress-facets-local.json
Querying performance is about 24x slower! :-o Seems like some obvious 
bottleneck, that I'll be working on chasing down.

FYI [~markrmil...@gmail.com].

> Explore collector managers for multi-threaded search
> 
>
> Key: SOLR-13350
> URL: https://issues.apache.org/jira/browse/SOLR-13350
> Project: Solr
>  Issue Type: New Feature
>Reporter: Ishan Chattopadhyaya
>Assignee: Ishan Chattopadhyaya
>Priority: Major
> Attachments: SOLR-13350.patch, SOLR-13350.patch, SOLR-13350.patch
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> AFAICT, SolrIndexSearcher can be used only to search all the segments of an 
> index in series. However, using CollectorManagers, segments can be searched 
> concurrently and result in reduced latency. Opening this issue to explore the 
> effectiveness of using CollectorManagers in SolrIndexSearcher from latency 
> and throughput perspective.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org



[jira] [Commented] (SOLR-13350) Explore collector managers for multi-threaded search

2021-04-05 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17315118#comment-17315118
 ] 

ASF subversion and git services commented on SOLR-13350:


Commit 36f268b65b12bb05da700f0a2c843acca7f30af5 in lucene-solr's branch 
refs/heads/tmp from Ishan Chattopadhyaya
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=36f268b ]

SOLR-13350: Multi-threaded search using collectors manager


> Explore collector managers for multi-threaded search
> 
>
> Key: SOLR-13350
> URL: https://issues.apache.org/jira/browse/SOLR-13350
> Project: Solr
>  Issue Type: New Feature
>Reporter: Ishan Chattopadhyaya
>Assignee: Ishan Chattopadhyaya
>Priority: Major
> Attachments: SOLR-13350.patch, SOLR-13350.patch, SOLR-13350.patch
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> AFAICT, SolrIndexSearcher can be used only to search all the segments of an 
> index in series. However, using CollectorManagers, segments can be searched 
> concurrently and result in reduced latency. Opening this issue to explore the 
> effectiveness of using CollectorManagers in SolrIndexSearcher from latency 
> and throughput perspective.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org