[jira] [Commented] (SOLR-13350) Explore collector managers for multi-threaded search
[ https://issues.apache.org/jira/browse/SOLR-13350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17868449#comment-17868449 ]

David Smiley commented on SOLR-13350:

There is no dedicated test for this functionality, not that there needs to be one. But nonetheless, can someone recommend a particular test I might use that exploits the CollectorManager functionality, especially the DocSet construction aspect? [~cpoerschke] maybe you can recommend one as you were working on that.

> Explore collector managers for multi-threaded search
>
>                 Key: SOLR-13350
>                 URL: https://issues.apache.org/jira/browse/SOLR-13350
>             Project: Solr
>          Issue Type: New Feature
>            Reporter: Ishan Chattopadhyaya
>            Assignee: Ishan Chattopadhyaya
>            Priority: Blocker
>              Labels: pull-request-available
>             Fix For: 9.7
>         Attachments: SOLR-13350-pre-PR-2508.patch, SOLR-13350.patch, SOLR-13350.patch, SOLR-13350.patch
>          Time Spent: 15h 20m
>  Remaining Estimate: 0h
>
> AFAICT, SolrIndexSearcher can be used only to search all the segments of an
> index in series. However, using CollectorManagers, segments can be searched
> concurrently and result in reduced latency. Opening this issue to explore the
> effectiveness of using CollectorManagers in SolrIndexSearcher from latency
> and throughput perspective.

--
This message was sent by Atlassian Jira (v8.20.10#820010)

To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org
[jira] [Commented] (SOLR-13350) Explore collector managers for multi-threaded search
[ https://issues.apache.org/jira/browse/SOLR-13350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17868369#comment-17868369 ]

ASF subversion and git services commented on SOLR-13350:

Commit 8b3d5b0f57160d61b5b7510049a7235413df03ae in solr's branch refs/heads/branch_9x from Christine Poerschke
[ https://gitbox.apache.org/repos/asf?p=solr.git;h=8b3d5b0f571 ]

SOLR-13350: multi-threaded search: default to 'available processors' thread count (#2569)

Co-authored-by: Ishan Chattopadhyaya
Co-authored-by: Gus Heck <46900717+gus-...@users.noreply.github.com>

(cherry picked from commit ffde4199a094136bde4f63ca4037b05f6388e488)
[jira] [Commented] (SOLR-13350) Explore collector managers for multi-threaded search
[ https://issues.apache.org/jira/browse/SOLR-13350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17868367#comment-17868367 ]

ASF subversion and git services commented on SOLR-13350:

Commit ffde4199a094136bde4f63ca4037b05f6388e488 in solr's branch refs/heads/main from Christine Poerschke
[ https://gitbox.apache.org/repos/asf?p=solr.git;h=ffde4199a09 ]

SOLR-13350: multi-threaded search: default to 'available processors' thread count (#2569)

Co-authored-by: Ishan Chattopadhyaya
Co-authored-by: Gus Heck <46900717+gus-...@users.noreply.github.com>
[jira] [Commented] (SOLR-13350) Explore collector managers for multi-threaded search
[ https://issues.apache.org/jira/browse/SOLR-13350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17868123#comment-17868123 ]

ASF subversion and git services commented on SOLR-13350:

Commit cfec121bab2ecfc4c06e20a5533596025ae63d98 in solr's branch refs/heads/branch_9x from Christine Poerschke
[ https://gitbox.apache.org/repos/asf?p=solr.git;h=cfec121bab2 ]

SOLR-13350: multi-threaded search: replace cached with fixed threadpool (#2508)

(cherry picked from commit dfdbf85a1da491377ab519b025b80d60d4b2d534)
[jira] [Commented] (SOLR-13350) Explore collector managers for multi-threaded search
[ https://issues.apache.org/jira/browse/SOLR-13350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17868122#comment-17868122 ]

ASF subversion and git services commented on SOLR-13350:

Commit dfdbf85a1da491377ab519b025b80d60d4b2d534 in solr's branch refs/heads/main from Christine Poerschke
[ https://gitbox.apache.org/repos/asf?p=solr.git;h=dfdbf85a1da ]

SOLR-13350: multi-threaded search: replace cached with fixed threadpool (#2508)
[jira] [Commented] (SOLR-13350) Explore collector managers for multi-threaded search
[ https://issues.apache.org/jira/browse/SOLR-13350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17868111#comment-17868111 ]

ASF subversion and git services commented on SOLR-13350:

Commit 06950c656f21577db624102b913fb659ef1f0306 in solr's branch refs/heads/main from Christine Poerschke
[ https://gitbox.apache.org/repos/asf?p=solr.git;h=06950c656f2 ]

SOLR-13350: multi-threaded search: (undocumented) opt-out ability (#2570)

On a request level, multiThreaded=false is already possible but for full (node level) opt-out SolrIndexSearcher must pass a null executor to Lucene's IndexSearcher.
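The "null executor means no concurrency" contract mentioned in the commit message above can be illustrated with plain JDK classes. This is only a sketch of the idea, not Solr or Lucene code: the class `OptionalExecutorDemo` and the method `countHits` are hypothetical names, and each `Callable` stands in for searching one slice of segments.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical stand-in for a searcher that may or may not be given an executor.
class OptionalExecutorDemo {

  /** Sums per-slice hit counts; runs on the executor, or in the calling thread when it is null. */
  public static int countHits(List<Callable<Integer>> slices, ExecutorService executor)
      throws Exception {
    int total = 0;
    if (executor == null) {
      // Opt-out path: every slice is processed serially by the caller.
      for (Callable<Integer> slice : slices) {
        total += slice.call();
      }
      return total;
    }
    // Concurrent path: slices are handed to the pool and the results merged afterwards.
    for (Future<Integer> result : executor.invokeAll(slices)) {
      total += result.get();
    }
    return total;
  }

  public static void main(String[] args) throws Exception {
    List<Callable<Integer>> slices = new ArrayList<>();
    slices.add(() -> 3);
    slices.add(() -> 4);
    slices.add(() -> 5);

    System.out.println("serial:     " + countHits(slices, null));
    ExecutorService pool = Executors.newFixedThreadPool(2);
    try {
      System.out.println("concurrent: " + countHits(slices, pool));
    } finally {
      pool.shutdown();
    }
  }
}
```

Both paths return the same total; only the threads doing the work differ, which is why a node-level opt-out can be expressed simply by not supplying an executor.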
[jira] [Commented] (SOLR-13350) Explore collector managers for multi-threaded search
[ https://issues.apache.org/jira/browse/SOLR-13350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17867066#comment-17867066 ]

Christine Poerschke commented on SOLR-13350:

bq. ... currently nothing controls the passing of the executor to Lucene's {{IndexSearcher}} constructor. ...

Something like https://github.com/apache/solr/pull/2570 would be a way to control it, on a node level.

bq. ... Why this change in bin/solr to get the max CPUs instead of calling ManagementFactory.getOperatingSystemMXBean().getAvailableProcessors() ? ...

https://github.com/apache/solr/pull/2569 takes up [~gus]'s suggestion to use {{Runtime.getRuntime().availableProcessors()}} mixed together with [~ichattopadhyaya]'s change to remove {{nproc}} use, plus matching config and doc update.

bq. ... Lucene has {{MAX_DOCS_PER_SLICE}} and {{MAX_SEGMENTS_PER_SLICE}} constants and slicing logic ...

Experiments with that did not provide support for making this configurable.

bq. ... I'm benchmarking more thoroughly at the moment. ...

Looking forward to learning more about the results, as and when they are available. Thanks for re-benchmarking on this!
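For readers comparing the two JDK calls named in this comment, here is a minimal sketch that prints both. The class `CpuCountDemo` and the helper `defaultSearchThreads` are hypothetical and only illustrate the "fall back to available processors when nothing is configured" idea; they are not taken from the Solr PR.

```java
import java.lang.management.ManagementFactory;

// Hypothetical demo class; only the two JDK calls are real APIs.
class CpuCountDemo {

  /** Returns the configured thread count, or the JVM's processor count when none is configured. */
  public static int defaultSearchThreads(int configured) {
    if (configured > 0) {
      return configured;
    }
    return Math.max(1, Runtime.getRuntime().availableProcessors());
  }

  public static void main(String[] args) {
    int viaRuntime = Runtime.getRuntime().availableProcessors();
    int viaMxBean = ManagementFactory.getOperatingSystemMXBean().getAvailableProcessors();
    System.out.println("Runtime.availableProcessors():               " + viaRuntime);
    System.out.println("OperatingSystemMXBean.getAvailableProcessors(): " + viaMxBean);
    System.out.println("default threads when unconfigured:            " + defaultSearchThreads(0));
  }
}
```

Asking the JVM rather than shelling out to {{nproc}} keeps the value consistent with what the running process is actually allowed to use, for example under container CPU limits that the JVM is aware of.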
[jira] [Commented] (SOLR-13350) Explore collector managers for multi-threaded search
[ https://issues.apache.org/jira/browse/SOLR-13350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17860143#comment-17860143 ]

David Smiley commented on SOLR-13350:

I saw an obscure failure for [this assertion|https://github.com/apache/solr/blob/2fe98d962bb498c29a032708739c9a41e1a263d9/solr/core/src/test/org/apache/solr/search/TestCollapseQParserPlugin.java#L119] in TestCollapseQParserPlugin.testMultiSort and I have a suspicion that subtleties in segment processing order from concurrent segment search may have tickled it. The assertion tests something rather arbitrary that arguably should not be tested like this -- i.e. it should test that one of the returned docs has ID 2 or 3, as both are valid answers.
[jira] [Commented] (SOLR-13350) Explore collector managers for multi-threaded search
[ https://issues.apache.org/jira/browse/SOLR-13350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17860043#comment-17860043 ]

Ishan Chattopadhyaya commented on SOLR-13350:

> https://github.com/apache/solr/pull/2508 opened to explore replacing the cached with a fixed threadpool.

Thanks [~cpoerschke]! This seems definitely faster than using cached threadpools. I'm benchmarking more thoroughly at the moment.

I've marked this issue as a release blocker, as we shouldn't have a release in this state (without your patch, and potentially other fixes on top of that).
[jira] [Commented] (SOLR-13350) Explore collector managers for multi-threaded search
[ https://issues.apache.org/jira/browse/SOLR-13350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17856297#comment-17856297 ]

Christine Poerschke commented on SOLR-13350:

[~dsmiley] at https://github.com/apache/solr/pull/2508#issuecomment-2176117831 wrote:

bq. Can you recommend a way to benchmark this myself, like if I wanted to tweak it to see how my ideas work out?

[~ichattopadhyaya] above wrote:

bq. ... There are a few more things I've identified that can be improved here, around the defaults, ...

I don't know what the ideas or things might be but just wanted to share that Lucene has {{MAX_DOCS_PER_SLICE}} and {{MAX_SEGMENTS_PER_SLICE}} constants and slicing logic -- https://github.com/apache/lucene/blob/releases/lucene/9.9.2/lucene/core/src/java/org/apache/lucene/search/IndexSearcher.java#L325-L332 -- and I'm planning to run some experiments w.r.t. reducing the {{MAX_SEGMENTS_PER_SLICE}} to 1.

(And to be clear, I'm not suggesting that {{SolrIndexSearcher}} provide a way for users to be able to configure these {{MAX_*_PER_SLICE}} values, just sharing that the slicing logic is there.)
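To make the slicing idea in this comment concrete, here is a simplified paraphrase of grouping segments into slices under a per-slice document limit and a per-slice segment limit. It is written from an understanding of the general approach rather than copied from Lucene, it works on plain integers instead of {{LeafReaderContext}} objects, and the class name `SliceSketch` and the limit values passed in `main` are illustrative assumptions; the linked source is the authority on the real constants and logic.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical, simplified model: each Integer is one segment's document count.
class SliceSketch {

  /** Groups segments into slices, largest first, closing a slice once either limit is reached. */
  public static List<List<Integer>> slices(
      List<Integer> segmentMaxDocs, int maxDocsPerSlice, int maxSegmentsPerSlice) {
    List<Integer> sorted = new ArrayList<>(segmentMaxDocs);
    sorted.sort(Collections.reverseOrder());

    List<List<Integer>> result = new ArrayList<>();
    List<Integer> group = null;
    long docSum = 0;
    for (int maxDoc : sorted) {
      if (maxDoc > maxDocsPerSlice) {
        // A segment above the document limit gets a slice to itself.
        result.add(Collections.singletonList(maxDoc));
        continue;
      }
      if (group == null) {
        group = new ArrayList<>();
        result.add(group);
      }
      group.add(maxDoc);
      docSum += maxDoc;
      if (group.size() >= maxSegmentsPerSlice || docSum > maxDocsPerSlice) {
        // Limit reached: the next small segment starts a new slice.
        group = null;
        docSum = 0;
      }
    }
    return result;
  }

  public static void main(String[] args) {
    List<Integer> segments = Arrays.asList(10, 300_000, 100_000, 100_000, 100_000, 10);
    System.out.println("up to 5 segments per slice: " + slices(segments, 250_000, 5));
    System.out.println("1 segment per slice:        " + slices(segments, 250_000, 1));
  }
}
```

With a segment limit of 1 every segment becomes its own slice, which maximises the number of tasks handed to the executor at the cost of more per-task overhead; that is the trade-off the planned experiment would probe.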
[jira] [Commented] (SOLR-13350) Explore collector managers for multi-threaded search
[ https://issues.apache.org/jira/browse/SOLR-13350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17856077#comment-17856077 ]

David Smiley commented on SOLR-13350:

Why this change in [bin/solr|https://github.com/apache/solr/blob/ff6607d25f023a59f866a66820037bb215342ca8/solr/bin/solr#L1445] to get the max CPUs instead of calling ManagementFactory.getOperatingSystemMXBean().getAvailableProcessors() ?
[jira] [Commented] (SOLR-13350) Explore collector managers for multi-threaded search
[ https://issues.apache.org/jira/browse/SOLR-13350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17856008#comment-17856008 ]

Christine Poerschke commented on SOLR-13350:

bq. ... The "just pass an executor" patch was onto a Solr 9.5 cloud, ...

I've attached the {{SOLR-13350-pre-PR-2508.patch}} in case that might be useful for others.
[jira] [Commented] (SOLR-13350) Explore collector managers for multi-threaded search
[ https://issues.apache.org/jira/browse/SOLR-13350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17853217#comment-17853217 ]

Christine Poerschke commented on SOLR-13350:

https://github.com/apache/solr/pull/2508 opened to explore replacing the cached with a fixed threadpool.
[jira] [Commented] (SOLR-13350) Explore collector managers for multi-threaded search
[ https://issues.apache.org/jira/browse/SOLR-13350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17853178#comment-17853178 ]

Christine Poerschke commented on SOLR-13350:

Hello again.

{quote}Overall the results could be summarised as "unexpected" or "surprising" – passing an executor increased latency by around 20x with a reduction in container CPU use to approximately match that. The thread count used seemed to make no difference, we've tried a few different ones.{quote}

{quote}The "just pass an executor" patch was onto a Solr 9.5 cloud, and I haven't really dived into the details much, but I'm wondering if the implications for this ticket here might be that searches passing {{multiThreaded=false}} could be impacted because currently nothing controls the passing of the executor to Lucene's {{IndexSearcher}} constructor.{quote}

So the unexpected increase in latency can be explained as follows I think:

* A threadpool executor with {{corePoolSize}} of 0 and {{maximumPoolSize}} of N and {{queueCapacity}} N*1000 was/is used.
** [https://github.com/apache/solr/blob/26365679bde2f620b399f7801387e1cbee68cdc5/solr/core/src/java/org/apache/solr/core/CoreContainer.java#L446-L450]
** [https://github.com/apache/solr/blob/26365679bde2f620b399f7801387e1cbee68cdc5/solr/solrj/src/java/org/apache/solr/common/util/ExecutorUtil.java#L234-L243]
* The [https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/util/concurrent/ThreadPoolExecutor.html] documentation says _"If corePoolSize or more threads are running, the Executor always prefers queuing a request rather than adding a new thread."_ and _"If a request cannot be queued, a new thread is created unless this would exceed maximumPoolSize, in which case, the task will be rejected."_
** With a very generous queue size, queuing would always have been possible and I think that meant we were effectively running single threaded.

Experimentally using a fixed-size threadpool executor i.e. [https://github.com/apache/solr/blob/26365679bde2f620b399f7801387e1cbee68cdc5/solr/solrj/src/java/org/apache/solr/common/util/ExecutorUtil.java#L198-L202] removed the slowness (and +very nicely+ improved latency relative to the pre-experimental baseline).

A fixed-size threadpool executor with an unlimited queue capacity might be undesirable though, in which case an alternative approach could be to retain the queueCapacity and to (say) have {{corePoolSize}} match {{maximumPoolSize}} in value, or something along those lines.
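The queuing behaviour described in this comment can be reproduced with the JDK alone. The sketch below is not Solr's executor setup; the class `PoolSizingDemo`, the method `peakConcurrency`, and the sizes used are assumptions chosen for illustration. It submits a batch of short sleeping tasks and reports the largest number that were ever running at once.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo of how corePoolSize interacts with a large bounded queue.
class PoolSizingDemo {

  /** Runs the given number of 50 ms tasks and returns the highest number observed running together. */
  public static int peakConcurrency(int corePoolSize, int maxPoolSize, int queueCapacity, int tasks)
      throws InterruptedException {
    ThreadPoolExecutor pool =
        new ThreadPoolExecutor(
            corePoolSize, maxPoolSize, 60L, TimeUnit.SECONDS,
            new LinkedBlockingQueue<>(queueCapacity));
    AtomicInteger running = new AtomicInteger();
    AtomicInteger peak = new AtomicInteger();
    CountDownLatch done = new CountDownLatch(tasks);
    try {
      for (int i = 0; i < tasks; i++) {
        pool.execute(() -> {
          // Record how many tasks are active right now and remember the maximum.
          peak.accumulateAndGet(running.incrementAndGet(), Math::max);
          try {
            Thread.sleep(50);
          } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
          } finally {
            running.decrementAndGet();
            done.countDown();
          }
        });
      }
      done.await();
    } finally {
      pool.shutdown();
    }
    return peak.get();
  }

  public static void main(String[] args) throws InterruptedException {
    int n = 4;
    // Core size 0: tasks are queued, and a single worker drains the queue.
    System.out.println("core=0, max=n: peak = " + peakConcurrency(0, n, n * 1000, 12));
    // Core size n: the first n tasks each start a thread before anything is queued.
    System.out.println("core=n, max=n: peak = " + peakConcurrency(n, n, n * 1000, 12));
  }
}
```

With a core size of 0 the pool only grows beyond one thread once the queue is full, which a queue of N*1000 entries makes very unlikely in practice. Setting the core size equal to the maximum keeps the bounded queue while still starting all N threads.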
[jira] [Commented] (SOLR-13350) Explore collector managers for multi-threaded search
[ https://issues.apache.org/jira/browse/SOLR-13350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17850405#comment-17850405 ]

Christine Poerschke commented on SOLR-13350:

Hello. I'd like to share some results from experiments with a subset of this ticket's changes when benchmarking dense vector search.

Having noticed that {{AbstractKnnVectorQuery.rewrite}} has parallelism and that Solr's {{SolrIndexSearcher}} constructor did not yet pass an executor to its super class i.e. Lucene's {{IndexSearcher}} I was curious if picking out only the "pass an executor" part of the changes in this ticket would be beneficial.

Links for Solr 9.4.1 using Lucene 9.8.0 version:
* [https://github.com/apache/lucene/blob/releases/lucene/9.8.0/lucene/core/src/java/org/apache/lucene/search/AbstractKnnVectorQuery.java#L83-L87]
* [https://github.com/apache/lucene/blob/releases/lucene/9.8.0/lucene/core/src/java/org/apache/lucene/search/IndexSearcher.java#L235]

Links for Solr 9.5.0 using Lucene 9.9.2 version:
* [https://github.com/apache/lucene/blob/releases/lucene/9.9.2/lucene/core/src/java/org/apache/lucene/search/AbstractKnnVectorQuery.java#L82-L88]
* [https://github.com/apache/lucene/blob/releases/lucene/9.9.2/lucene/core/src/java/org/apache/lucene/search/IndexSearcher.java#L233-L234]

The links are for 9.4.1 and 9.5.0 versions for convenience and to reflect initial code reading and subsequent experimental code base.

Overall the results could be summarised as "unexpected" or "surprising" – passing an executor increased latency by around 20x with a reduction in container CPU use to approximately match that. The thread count used seemed to make no difference, we've tried a few different ones.

The "just pass an executor" patch was onto a Solr 9.5 cloud, and I haven't really dived into the details much, but I'm wondering if the implications for this ticket here might be that searches passing {{multiThreaded=false}} could be impacted because currently nothing controls the passing of the executor to Lucene's {{IndexSearcher}} constructor.

(And last but not least, a shoutout to [~abenedetti] because his [comment|https://lists.apache.org/thread/xc174z8qnls3o0by644n3fbtt28po7lc] on [this|https://lists.apache.org/thread/0olh0z2dc78y01k34yg06yrpzts2ggmp] user mailing list thread really encouraged me to take the time to write-up and share these results here.)
[jira] [Commented] (SOLR-13350) Explore collector managers for multi-threaded search
[ https://issues.apache.org/jira/browse/SOLR-13350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17848668#comment-17848668 ]

Andrzej Bialecki commented on SOLR-13350:

{quote}As of now, the timeAllowed requests are anyway executed without multithreading{quote}

This is based on a {{QueryCommand.timeAllowed}} flag that is set only from the {{timeAllowed}} param. However, this concept was extended in SOLR-17138 to {{QueryLimits}} that is now initialized also using other params. There is indeed some inconsistency here that's a left-over from that change, in the sense that {{QueryCommand.timeAllowed}} should have been either removed completely or replaced with something like {{queryLimits}}, to make sure to check the current SolrRequestInfo for QueryLimits.

In any case, the minimal workaround for this could be to check {{QueryLimits.getCurrentLimits().isLimitsEnabled()}} instead of {{QueryCommand.timeAllowed}}. But a better fix would be to properly unbreak the tracking of the parent {{SolrRequestInfo}} in MT search.
[jira] [Commented] (SOLR-13350) Explore collector managers for multi-threaded search
[ https://issues.apache.org/jira/browse/SOLR-13350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17847967#comment-17847967 ]

Ishan Chattopadhyaya commented on SOLR-13350:

bq. Are there other risks/trade-offs for enabling it besides timeAllowed being unsupported? Basically, when would a user not want this?

I'll add some guidance for users to the reference guide on this, before closing this issue.
[jira] [Commented] (SOLR-13350) Explore collector managers for multi-threaded search
[ https://issues.apache.org/jira/browse/SOLR-13350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17847965#comment-17847965 ]

Ishan Chattopadhyaya commented on SOLR-13350:
---------------------------------------------

bq. This is caused by breaking the end-to-end tracking of request context in SolrRequestInfo, which uses a thread-local deque to provide the same context for both the main and all sub-requests. This tracking is needed to set up the correct query timeout instance on the searcher ( QueryLimits ) for time-limited searches in the SolrIndexSearcher:727 . However, now that this method is executed in a separate "searcherCollector" thread the SolrRequestInfo instance it obtains is empty because it doesn't match the original thread that set it.

QueryLimits has two parts: timeAllowed and cpuThreadLimits.
# For timeAllowed, I can see that this value is passed to the searcher: https://github.com/apache/solr/blob/main/solr/core/src/java/org/apache/solr/search/SolrIndexSearcher.java#L726-L735. Hence, I think timeAllowed will be honoured by Lucene properly.
# For cpuThreadLimits, the limits are set in SolrRequestInfo (https://github.com/apache/solr/blob/main/solr/core/src/java/org/apache/solr/request/SolrRequestInfo.java#L86-L87). It seems they are not getting carried over to the sub-threads. Noble Paul pointed me to the InheritableThreadLocalProvider for this.

As of now, the timeAllowed requests are anyway executed without multithreading: https://github.com/apache/solr/blob/main/solr/core/src/java/org/apache/solr/search/MultiThreadedSearcher.java#L125

I'm considering adding the cpuThreadLimits-based requests to this exception list as well.
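For readers unfamiliar with the problem of limits "not getting carried over to the sub-threads", here is a minimal sketch of one general way to carry thread-local state onto pool threads: capture it at submission time and re-install it around the task. It uses only JDK classes. The class name {{ContextPropagatingExecutor}} and the {{CONTEXT}} thread-local are hypothetical stand-ins, and this is not a description of how Solr's InheritableThreadLocalProvider is implemented.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ContextPropagatingExecutor implements Executor {
    // Hypothetical per-request context; stands in for whatever state the request thread holds.
    static final ThreadLocal<String> CONTEXT = new ThreadLocal<>();

    private final Executor delegate;

    ContextPropagatingExecutor(Executor delegate) {
        this.delegate = delegate;
    }

    @Override
    public void execute(Runnable task) {
        // Capture the context on the submitting thread...
        final String captured = CONTEXT.get();
        delegate.execute(() -> {
            // ...install it on the worker thread, and restore the worker's previous
            // state afterwards so a pooled thread does not leak it into the next task.
            final String previous = CONTEXT.get();
            CONTEXT.set(captured);
            try {
                task.run();
            } finally {
                if (previous == null) {
                    CONTEXT.remove();
                } else {
                    CONTEXT.set(previous);
                }
            }
        });
    }

    /** Reads CONTEXT from a task run on the given executor and returns what that task saw. */
    static String contextSeenBy(Executor executor) {
        CompletableFuture<String> seen = CompletableFuture.supplyAsync(() -> CONTEXT.get(), executor);
        return seen.join();
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            CONTEXT.set("limits-for-request-1");
            System.out.println("plain pool:   " + contextSeenBy(pool));
            System.out.println("wrapped pool: " + contextSeenBy(new ContextPropagatingExecutor(pool)));
        } finally {
            CONTEXT.remove();
            pool.shutdown();
        }
    }
}
```

Wrapping at submission time (rather than relying on the JDK's {{InheritableThreadLocal}}) matters for pools, because pooled threads are typically created long before the request that later uses them.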
[jira] [Commented] (SOLR-13350) Explore collector managers for multi-threaded search
[ https://issues.apache.org/jira/browse/SOLR-13350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17847589#comment-17847589 ]

Ishan Chattopadhyaya commented on SOLR-13350:
---------------------------------------------

bq. Yes please revert, this is not fully baked, and 9x is supposed to be stable.

We are far away from the next release, so let us work towards fixing the query limits to work with MT search. If we are unable to, there is still no need to revert this change; just changing the defaults should be fine for the release.
[jira] [Commented] (SOLR-13350) Explore collector managers for multi-threaded search
[ https://issues.apache.org/jira/browse/SOLR-13350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846929#comment-17846929 ]

Gus Heck commented on SOLR-13350:
---------------------------------

As it is, this will cause a break in back compatibility for folks that have adopted the new query limits functionality. The effect will be for the feature they intend to rely on to protect themselves from runaway queries to silently stop working.
[jira] [Commented] (SOLR-13350) Explore collector managers for multi-threaded search
[ https://issues.apache.org/jira/browse/SOLR-13350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846927#comment-17846927 ]

Gus Heck commented on SOLR-13350:
---------------------------------

Yes please revert, this is not fully baked, and 9x is supposed to be stable.
[jira] [Commented] (SOLR-13350) Explore collector managers for multi-threaded search
[ https://issues.apache.org/jira/browse/SOLR-13350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846029#comment-17846029 ]

ASF subversion and git services commented on SOLR-13350:
--------------------------------------------------------

Commit b8410234993e44f9b64b02f053159f52c73f4433 in solr's branch refs/heads/branch_9x from Ishan Chattopadhyaya
[ https://gitbox.apache.org/repos/asf?p=solr.git;h=b8410234993 ]

SOLR-13350: Multithreaded search
[jira] [Commented] (SOLR-13350) Explore collector managers for multi-threaded search
[ https://issues.apache.org/jira/browse/SOLR-13350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17845332#comment-17845332 ]

David Smiley commented on SOLR-13350:
-------------------------------------

I suspect timeAllowed support could be added with relative ease. For reasons above, it won't work OOTB though.
[jira] [Commented] (SOLR-13350) Explore collector managers for multi-threaded search
[ https://issues.apache.org/jira/browse/SOLR-13350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17845258#comment-17845258 ]

Andrzej Bialecki commented on SOLR-13350:
-----------------------------------------

This is caused by breaking the end-to-end tracking of request context in {{SolrRequestInfo}}, which uses a thread-local deque to provide the same context for both the main and all sub-requests. This tracking is needed to set up the correct query timeout instance on the searcher ({{QueryLimits}}) for time-limited searches in {{SolrIndexSearcher:727}}. However, now that this method is executed in a separate "searcherCollector" thread, the {{SolrRequestInfo}} instance it obtains is empty because it doesn't match the original thread that set it.
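The effect described in this comment can be reproduced with plain JDK classes. The sketch below is only an illustration under assumptions: {{REQUEST_INFO}} is a hypothetical stand-in for thread-local request state, not Solr's actual {{SolrRequestInfo}} code. A value set on the calling thread is simply absent when read from a pool thread.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ThreadLocalVisibilityDemo {
    // Hypothetical stand-in for request state that is kept in a thread-local.
    static final ThreadLocal<String> REQUEST_INFO = new ThreadLocal<>();

    /** Reads the thread-local on the thread that calls this method. */
    static String seenOnCallingThread() {
        return REQUEST_INFO.get();
    }

    /** Reads the thread-local from a task that runs on a pool thread. */
    static String seenOnWorkerThread(Executor pool) {
        CompletableFuture<String> seen = CompletableFuture.supplyAsync(() -> REQUEST_INFO.get(), pool);
        return seen.join();
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(1);
        try {
            REQUEST_INFO.set("request with query limits");
            System.out.println("calling thread sees: " + seenOnCallingThread());
            System.out.println("worker thread sees:  " + seenOnWorkerThread(pool));
        } finally {
            REQUEST_INFO.remove();
            pool.shutdown();
        }
    }
}
```

The worker reads {{null}} because each thread has its own slot for a {{ThreadLocal}}; nothing copies the caller's value across unless the code does so explicitly.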
[jira] [Commented] (SOLR-13350) Explore collector managers for multi-threaded search
[ https://issues.apache.org/jira/browse/SOLR-13350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17845149#comment-17845149 ]

Gus Heck commented on SOLR-13350:
---------------------------------

I'm a little concerned that multithreaded is set to false in the [TestCpuAllowed|https://github.com/apache/solr/commit/ff6607d25f023a59f866a66820037bb215342ca8#diff-874aaa9daaf158f756a5bd6d61ec8ec5a6961063c5916e54641d2be7ad39a4b9R189] and [TestQueryLimits|https://github.com/apache/solr/commit/ff6607d25f023a59f866a66820037bb215342ca8#diff-0d9937f23efd32a0eadf335a72ea5c79727d15d1ed5a53bb89834da6164b8a36R89] tests. Was that meant to be revisited? If not, it seems to say that multi-threaded search is incompatible with query limits. That's an important thing to document if it is the intention. Even if documented, it doesn't seem great that users have to choose, especially since fanning out on multiple threads presents an even greater opportunity to starve innocent bystander queries with a misbehaving query.
[jira] [Commented] (SOLR-13350) Explore collector managers for multi-threaded search
[ https://issues.apache.org/jira/browse/SOLR-13350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17845138#comment-17845138 ]

Houston Putman commented on SOLR-13350:
---------------------------------------

I've been looking around and can't quite tell why timeAllowed isn't allowed with multithreaded search. Since each multi-threaded collector can respect timeAllowed itself, it seems easy enough to support. I'm sure I'm missing something though.
[jira] [Commented] (SOLR-13350) Explore collector managers for multi-threaded search
[ https://issues.apache.org/jira/browse/SOLR-13350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17843845#comment-17843845 ]

ASF subversion and git services commented on SOLR-13350:
--------------------------------------------------------

Commit f823e9b8695b05566a428cd7f7055714b6c325bd in solr's branch refs/heads/main from Ishan Chattopadhyaya
[ https://gitbox.apache.org/repos/asf?p=solr.git;h=f823e9b8695 ]

SOLR-13350: Fixing logging to be less verbose
[jira] [Commented] (SOLR-13350) Explore collector managers for multi-threaded search
[ https://issues.apache.org/jira/browse/SOLR-13350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17843842#comment-17843842 ]

ASF subversion and git services commented on SOLR-13350:
--------------------------------------------------------

Commit a4979e0b80b93bbf7d76e7101de4fa34332672d7 in solr's branch refs/heads/main from Ishan Chattopadhyaya
[ https://gitbox.apache.org/repos/asf?p=solr.git;h=a4979e0b80b ]

SOLR-13350: Fixing tidy
[jira] [Commented] (SOLR-13350) Explore collector managers for multi-threaded search
[ https://issues.apache.org/jira/browse/SOLR-13350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17843841#comment-17843841 ]

Ishan Chattopadhyaya commented on SOLR-13350:
---------------------------------------------

bq. Do we have a limited queue for the pool, leading to the "RejectedExecution"? I think we'd like a caller-runs policy so we don't wait & starve

Changed it to a queue larger than the number of threads, and removed the rejected execution handler as well.

bq. Commit ff6607d25f023a59f866a66820037bb215342ca8 in solr's branch refs/heads/main from Ishan Chattopadhyaya

Will backport to branch_9x after a few days of baking on main.
[jira] [Commented] (SOLR-13350) Explore collector managers for multi-threaded search
[ https://issues.apache.org/jira/browse/SOLR-13350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17843838#comment-17843838 ]

ASF subversion and git services commented on SOLR-13350:
--------------------------------------------------------

Commit ff6607d25f023a59f866a66820037bb215342ca8 in solr's branch refs/heads/main from Ishan Chattopadhyaya
[ https://gitbox.apache.org/repos/asf?p=solr.git;h=ff6607d25f0 ]

SOLR-13350: Multithreaded search (closes #2248)
[jira] [Commented] (SOLR-13350) Explore collector managers for multi-threaded search
[ https://issues.apache.org/jira/browse/SOLR-13350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17832928#comment-17832928 ]

David Smiley commented on SOLR-13350:
-------------------------------------

Do we have a limited queue for the pool, leading to the "RejectedExecution"? I think we'd like a caller-runs policy so we don't wait & starve.
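To make the "limited queue" and "caller-runs policy" terms concrete, the following sketch uses only the JDK's {{ThreadPoolExecutor}}. It is an illustration under assumptions (a pool of one thread with a queue of one slot, and a hypothetical class name), not Solr's executor configuration. With the JDK's default {{AbortPolicy}} the overflowing task is rejected; with {{CallerRunsPolicy}} the submitting thread runs it itself.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.RejectedExecutionHandler;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class RejectionPolicyDemo {
    /** Submits three tasks to a 1-thread pool with a 1-slot queue and logs what happened to each. */
    static List<String> runThreeTasks(RejectedExecutionHandler handler) {
        final Thread caller = Thread.currentThread();
        final CountDownLatch release = new CountDownLatch(1);
        final List<String> log = Collections.synchronizedList(new ArrayList<>());
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS, new ArrayBlockingQueue<>(1), handler);
        try {
            // Task 1 occupies the only worker thread until the latch is released.
            pool.execute(() -> {
                try {
                    release.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            // Task 2 fills the single queue slot.
            pool.execute(() -> log.add("queued task ran on " + where(caller)));
            // Task 3 finds both the worker and the queue full, so the handler decides its fate.
            try {
                pool.execute(() -> log.add("overflow task ran on " + where(caller)));
            } catch (RejectedExecutionException e) {
                log.add("overflow task rejected");
            }
        } finally {
            release.countDown();
            pool.shutdown();
            try {
                pool.awaitTermination(10, TimeUnit.SECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return new ArrayList<>(log);
    }

    static String where(Thread caller) {
        return Thread.currentThread() == caller ? "caller thread" : "pool thread";
    }

    public static void main(String[] args) {
        System.out.println("AbortPolicy:      " + runThreeTasks(new ThreadPoolExecutor.AbortPolicy()));
        System.out.println("CallerRunsPolicy: " + runThreeTasks(new ThreadPoolExecutor.CallerRunsPolicy()));
    }
}
```

Caller-runs trades rejection for back-pressure: the submitting thread is busy running the overflow task and therefore cannot submit more work until it finishes.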
[jira] [Commented] (SOLR-13350) Explore collector managers for multi-threaded search
[ https://issues.apache.org/jira/browse/SOLR-13350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17832913#comment-17832913 ]

Ishan Chattopadhyaya commented on SOLR-13350:
---------------------------------------------

{code}
java.util.concurrent.RejectedExecutionException: Task org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor$$Lambda$250/0x000800376040@2c13c7b6 rejected from org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor@407b44ad[Running, pool size = 6, active threads = 0, queued tasks = 6, completed tasks = 643]
    at java.base/java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2055) ~[?:?]
    at java.base/java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:825) ~[?:?]
    at java.base/java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1355) ~[?:?]
    at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.execute(ExecutorUtil.java:275) ~[solrj/:?]
    at org.apache.lucene.search.TaskExecutor$TaskGroup.invokeAll(TaskExecutor.java:153) ~[lucene-core-9.9.2.jar:9.9.2 a2939784c4ca60bc28bf488b5479c02fc2e5e22c - 2024-01-25 09:51:09]
    at org.apache.lucene.search.TaskExecutor.invokeAll(TaskExecutor.java:76) ~[lucene-core-9.9.2.jar:9.9.2 a2939784c4ca60bc28bf488b5479c02fc2e5e22c - 2024-01-25 09:51:09]
    at org.apache.lucene.index.TermStates.build(TermStates.java:116) ~[lucene-core-9.9.2.jar:9.9.2 a2939784c4ca60bc28bf488b5479c02fc2e5e22c - 2024-01-25 09:51:09]
    at org.apache.lucene.search.TermQuery.createWeight(TermQuery.java:275) ~[lucene-core-9.9.2.jar:9.9.2 a2939784c4ca60bc28bf488b5479c02fc2e5e22c - 2024-01-25 09:51:09]
    at org.apache.lucene.search.IndexSearcher.createWeight(IndexSearcher.java:900) ~[lucene-core-9.9.2.jar:9.9.2 a2939784c4ca60bc28bf488b5479c02fc2e5e22c - 2024-01-25 09:51:09]
    at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:691) ~[lucene-core-9.9.2.jar:9.9.2 a2939784c4ca60bc28bf488b5479c02fc2e5e22c - 2024-01-25 09:51:09]
    at org.apache.solr.search.SolrIndexSearcher.searchCollectorManagers(SolrIndexSearcher.java:2108) ~[core/:?]
    at org.apache.solr.search.SolrIndexSearcher.getDocListNC(SolrIndexSearcher.java:1922) ~[core/:?]
    at org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:1702) ~[core/:?]
{code}
[jira] [Commented] (SOLR-13350) Explore collector managers for multi-threaded search
[ https://issues.apache.org/jira/browse/SOLR-13350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17832907#comment-17832907 ]

Ishan Chattopadhyaya commented on SOLR-13350:
---------------------------------------------

I think I understand what is going on: since the following change, Lucene uses the thread pool that Solr passes to the searcher to spawn concurrent sub-tasks for building the term states:
https://github.com/apache/lucene/pull/12183

However, this leads to starvation, because once all threads are executing user queries, there are no free threads in the thread pool to run the additional tasks for building these term states.

A related issue (but not the cause of this) is: https://github.com/apache/lucene/commit/2106bf5172a9c38a8457db383eb9f5cd1918ddc5
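The starvation pattern described here (a task running on a pool submits a sub-task to the same pool and waits for it) can be shown in isolation. The sketch below uses JDK classes only, with a hypothetical class name; it is not Lucene's or Solr's code. It also uses an unbounded queue and a timeout so that the example terminates, whereas the stack trace in this thread came from a bounded queue with rejection.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class NestedSubmitStarvationDemo {
    /**
     * Runs an outer task that submits an inner task to the same pool and waits for it.
     * Returns true if the inner task finished within the timeout, false if it starved.
     */
    static boolean nestedTaskCompletes(int poolThreads, long timeoutMillis) {
        ExecutorService pool = Executors.newFixedThreadPool(poolThreads);
        try {
            Future<Boolean> outer = pool.submit(() -> {
                // The outer task holds a pool thread while it waits for the inner task.
                Future<String> inner = pool.submit(() -> "sub-task result");
                try {
                    inner.get(timeoutMillis, TimeUnit.MILLISECONDS);
                    return true;
                } catch (TimeoutException e) {
                    // No free thread ever picked up the inner task.
                    return false;
                }
            });
            return outer.get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("1 thread:  inner task completes = " + nestedTaskCompletes(1, 200));
        System.out.println("2 threads: inner task completes = " + nestedTaskCompletes(2, 5000));
    }
}
```

With one thread the inner task can never start, because the only worker is the one waiting for it. Adding threads only moves the threshold: once every worker is an outer task waiting on a sub-task, the same stall appears.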
[jira] [Commented] (SOLR-13350) Explore collector managers for multi-threaded search
[ https://issues.apache.org/jira/browse/SOLR-13350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17816348#comment-17816348 ]

David Smiley commented on SOLR-13350:
-------------------------------------

I'm confused why some of you propose something other than (b), global configuration. How would the client/user know how many threads are appropriate? They are not running/operating Solr; they don't know its hardware or load.
[jira] [Commented] (SOLR-13350) Explore collector managers for multi-threaded search
[ https://issues.apache.org/jira/browse/SOLR-13350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17815862#comment-17815862 ]

Noble Paul commented on SOLR-13350:
-----------------------------------

I think this must be purely a request parameter. If the user wishes the value to be fixed without explicitly passing the parameter, they can use {{"defaults"}} in {{solrconfig.xml}}.
[jira] [Commented] (SOLR-13350) Explore collector managers for multi-threaded search
[ https://issues.apache.org/jira/browse/SOLR-13350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17815426#comment-17815426 ]

Ishan Chattopadhyaya commented on SOLR-13350:
---------------------------------------------

One thing that's important is configuration of the number of threads to be used for querying. There are several options:

a) Per collection configuration
b) Global configuration (core container based thread pool, as in the current patch)
c) Request time configuration (possibly overriding any global defaults)

I'm thinking (a) is not the best, and thinking of a combination of (b) and (c) here. WDYT [~cpoerschke]?
[jira] [Commented] (SOLR-13350) Explore collector managers for multi-threaded search
[ https://issues.apache.org/jira/browse/SOLR-13350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17815016#comment-17815016 ]

David Smiley commented on SOLR-13350:
-------------------------------------

I'm really excited about this; thanks for resuming [~ichattopadhyaya]! And [~atri] for his initial work, of course.
[jira] [Commented] (SOLR-13350) Explore collector managers for multi-threaded search
[ https://issues.apache.org/jira/browse/SOLR-13350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17814975#comment-17814975 ]

Ishan Chattopadhyaya commented on SOLR-13350:

Same 20k queries as in my previous comment.

1 user thread at a time, 6 search executor threads at a time:
{code}
50th=5.1998165, 90th=8.449806, 95th=10.19132270002, mean=29.829016496896898, total-queries=19980, total-time=678569
{code}

6 user threads at a time, 1 search executor thread at a time:
{code}
50th=92.9081930001, 90th=378.8129378, 95th=575.193647003, mean=200.04457501921922, total-queries=19980, total-time=671495
{code}

The throughput is very nearly the same in both cases (671s vs 678s for 20k queries), which indicates that the overheads of the solution are negligible.
[jira] [Commented] (SOLR-13350) Explore collector managers for multi-threaded search
[ https://issues.apache.org/jira/browse/SOLR-13350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17814949#comment-17814949 ]

Ishan Chattopadhyaya commented on SOLR-13350:

With 100 million documents, and 20k queries (6 at a time) [0]:

With this change (it has an executor thread pool size of 6, so at most 36 threads in total are processing):
{code}
50th=16.619524, 90th=30.94703160027, 95th=38.94829185025, mean=20.014714108908905, total-queries=19980, total-time=67360
{code}

Before this change:
{code}
50th=92.9081930001, 90th=378.8129378, 95th=575.193647003, mean=200.04457501921922, total-queries=19980, total-time=671495
{code}

[0] - https://github.com/fullstorydev/solr-bench/blob/master/suites/stress-facets-local.json
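To put the two result lines side by side, the ratios can be worked out with a few lines of arithmetic. The sketch below assumes that total-time is reported in milliseconds (the unit is not stated in the output itself); the class and method names are made up for this note.

```java
/** Hypothetical helper for comparing the two benchmark runs quoted above. */
final class BenchSummary {

  /** Queries per second, given a query count and a wall-clock time in milliseconds. */
  static double qps(long queries, long totalMillis) {
    return queries * 1000.0 / totalMillis;
  }

  /** How many times smaller "after" is than "before" (latency or total time). */
  static double speedup(double before, double after) {
    return before / after;
  }

  public static void main(String[] args) {
    long queries = 19_980;
    System.out.printf("qps before: %.1f%n", qps(queries, 671_495));
    System.out.printf("qps after:  %.1f%n", qps(queries, 67_360));
    System.out.printf("total-time speedup: %.2fx%n", speedup(671_495, 67_360));
    System.out.printf("median speedup:     %.2fx%n", speedup(92.9081930001, 16.619524));
    System.out.printf("95th pct speedup:   %.2fx%n", speedup(575.193647003, 38.94829185025));
  }
}
```

Under that assumption the run with the change finishes the same 19980 queries in roughly a tenth of the time, with the median latency improving by a bit more than 5x and the 95th percentile by roughly 15x.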
[jira] [Commented] (SOLR-13350) Explore collector managers for multi-threaded search
[ https://issues.apache.org/jira/browse/SOLR-13350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17814927#comment-17814927 ]

Ishan Chattopadhyaya commented on SOLR-13350:

bq. It's surprising perhaps that GitHub can't identify you, looking at the https://infra.apache.org/new-committers-guide.html#config-access documentation nothing seems obviously new there.

Thanks Christine, those instructions worked great. All these years, I had added my details to id.apache.org and was expecting it to work (as per the instructions I followed). This boxer thing is nice.

Latest PR is here now: https://github.com/apache/solr/pull/2248
[jira] [Commented] (SOLR-13350) Explore collector managers for multi-threaded search
[ https://issues.apache.org/jira/browse/SOLR-13350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17814821#comment-17814821 ]

Christine Poerschke commented on SOLR-13350:

bq. Latest patch here: ... (I can't create a PR, since Github can't identify me as a collaborator). ...

Thanks [~ichattopadhyaya] for continuing to pursue this!

It's perhaps surprising that GitHub can't identify you; looking at the https://infra.apache.org/new-committers-guide.html#config-access documentation, nothing seems obviously new there.

If you'd like someone else to try to create a PR from the above branch, I'd be happy to give that a go.
[jira] [Commented] (SOLR-13350) Explore collector managers for multi-threaded search
[ https://issues.apache.org/jira/browse/SOLR-13350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17814496#comment-17814496 ]

Ishan Chattopadhyaya commented on SOLR-13350:

Latest patch here: https://github.com/apache/solr/compare/apache:main...apache:jira/solr-13350 (I can't create a PR, since GitHub can't identify me as a collaborator).

Status:
* Search is faster
* Faceting is slower

I'm investigating the slowdown in faceting when using multiple threads. Just a note: when I replace the ThreadSafeBitSet (originally borrowed from Netflix's zeno project) with a non-thread-safe version, speed is better but the results are wrong.
[jira] [Commented] (SOLR-13350) Explore collector managers for multi-threaded search
[ https://issues.apache.org/jira/browse/SOLR-13350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17735961#comment-17735961 ]

Ishan Chattopadhyaya commented on SOLR-13350:

Thanks Mark!

bq. If each thread is working on an index segment wouldn't it make sense to collect segment level bitsets? This would eliminate the need for thread safe bitset.

I thought about it. Doing it naïvely might end up requiring a lot of memory. But it should be possible to do so efficiently, with each segment-level bitset being only as large as the range of docs it holds, and then merging them properly at the end. Thanks for the clue.
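The segment-level idea can be sketched as follows: each task fills a private bitset sized to its own segment, using segment-local doc ids, and a single thread merges the parts by adding each segment's doc base. This is a minimal illustration under stated assumptions, not the code in the patch. `Segment`, `collect`, and the `matches` predicate are hypothetical names, and `java.util.BitSet` stands in for whatever bitset Solr would actually use.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.IntPredicate;

/** Illustrative only: names are hypothetical, not Solr or Lucene APIs. */
final class SegmentBitSetMerge {

  /** A segment covering global doc ids [docBase, docBase + maxDoc). */
  static final class Segment {
    final int docBase;
    final int maxDoc;

    Segment(int docBase, int maxDoc) {
      this.docBase = docBase;
      this.maxDoc = maxDoc;
    }
  }

  /** matches must be safe to call from several threads at once. */
  static BitSet collect(List<Segment> segments, IntPredicate matches, int threads) {
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    try {
      List<Future<BitSet>> parts = new ArrayList<>();
      for (Segment seg : segments) {
        parts.add(pool.submit(() -> {
          // Private to this task and only as large as the segment's doc range.
          BitSet local = new BitSet(seg.maxDoc);
          for (int doc = 0; doc < seg.maxDoc; doc++) {
            if (matches.test(seg.docBase + doc)) {
              local.set(doc);
            }
          }
          return local;
        }));
      }
      // Merge on the calling thread; Future.get() makes each task's writes visible here.
      BitSet merged = new BitSet();
      for (int i = 0; i < segments.size(); i++) {
        BitSet local = parts.get(i).get();
        int base = segments.get(i).docBase;
        for (int doc = local.nextSetBit(0); doc >= 0; doc = local.nextSetBit(doc + 1)) {
          merged.set(base + doc);
        }
      }
      return merged;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    } catch (ExecutionException e) {
      throw new IllegalStateException(e.getCause());
    } finally {
      pool.shutdown();
    }
  }
}
```

No bitset is ever written by more than one thread, so no locking or CAS is needed. The price is the extra merge pass, which here copies bit by bit for clarity; a real implementation would likely merge whole words.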
[jira] [Commented] (SOLR-13350) Explore collector managers for multi-threaded search
[ https://issues.apache.org/jira/browse/SOLR-13350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17735958#comment-17735958 ]

Joel Bernstein commented on SOLR-13350:

If each thread is working on an index segment, wouldn't it make sense to collect segment-level bitsets? This would eliminate the need for a thread-safe bitset.
[jira] [Commented] (SOLR-13350) Explore collector managers for multi-threaded search
[ https://issues.apache.org/jira/browse/SOLR-13350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17735929#comment-17735929 ]

Mark Robert Miller commented on SOLR-13350:

I don't recall this code, but if the bitset were like an array, you wouldn't need a memory barrier for threads to be able to work on disjoint sections of the array. But you would need a memory barrier, or something like an effectively final array variable, for the threads to access the array reference, and you'd need a memory barrier on the values of the array for another thread to see them correctly.
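The three conditions described above can be illustrated with plain threads. In this sketch (the class name and the fill pattern are made up), the workers capture an effectively final array reference, write only to their own range of elements, and the reader relies on `Thread.join()` as the barrier that makes those writes visible.

```java
/** Illustrative only: disjoint writes to one array, published to the reader via join(). */
final class DisjointArrayWrites {

  static long[] fill(int length, int threads) {
    final long[] words = new long[length]; // the reference never changes after this point
    Thread[] workers = new Thread[threads];
    int chunk = (length + threads - 1) / threads;
    for (int t = 0; t < threads; t++) {
      final int from = Math.min(length, t * chunk);
      final int to = Math.min(length, from + chunk);
      workers[t] = new Thread(() -> {
        // Only this thread touches words[from..to), so the writers need no synchronization.
        for (int i = from; i < to; i++) {
          words[i] = ~0L;
        }
      });
      workers[t].start(); // start() happens-before everything the worker does
    }
    for (Thread w : workers) {
      try {
        w.join(); // join() happens-before the reads made after it returns
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        throw new IllegalStateException(e);
      }
    }
    return words;
  }
}
```

Note that the guarantee holds per array element. Two threads setting different bits inside the same `long` are updating the same element, which is not covered by this argument.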
[jira] [Commented] (SOLR-13350) Explore collector managers for multi-threaded search
[ https://issues.apache.org/jira/browse/SOLR-13350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17735817#comment-17735817 ]

Ishan Chattopadhyaya commented on SOLR-13350:

[~markrmil...@gmail.com], the ThreadSafeBitSet uses a spin lock and compare-and-set: https://github.com/chatman/solr/blob/jira/solr-13350-9x/solr/core/src/java/org/apache/solr/search/ThreadSafeBitSet.java#L81-L89

How important is a thread-safe bitset here? I am thinking that separate threads will operate on disjoint regions within the same bitset, and will likely never collide on the same bits. Is that alone not sufficient to ensure consistency under concurrent updates to the bitset, without a CAS loop, locking, etc.?
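One detail worth keeping in mind for this question: threads that never touch the same bit can still touch the same 64-bit word, for example where one segment's doc range ends and the next begins in the middle of a word. A plain `words[i] |= mask` is a read-modify-write of the whole word, so one of two concurrent updates can be lost. The sketch below shows the general shape of a compare-and-set loop that avoids this. It is written from scratch on top of `AtomicLongArray` for illustration and is not the ThreadSafeBitSet code linked above.

```java
import java.util.concurrent.atomic.AtomicLongArray;

/** Illustrative fixed-size bitset whose set() is safe under concurrent callers. */
final class CasBitSet {
  private final AtomicLongArray words;

  CasBitSet(int numBits) {
    this.words = new AtomicLongArray((numBits + 63) >>> 6);
  }

  void set(int bit) {
    int idx = bit >>> 6;
    long mask = 1L << bit; // for long shifts Java uses only the low 6 bits of the count
    long old;
    do {
      old = words.get(idx);
      if ((old & mask) != 0) {
        return; // already set, nothing to write
      }
      // Retry if another thread changed this word between the read and the write.
    } while (!words.compareAndSet(idx, old, old | mask));
  }

  boolean get(int bit) {
    return (words.get(bit >>> 6) & (1L << bit)) != 0;
  }

  int cardinality() {
    int n = 0;
    for (int i = 0; i < words.length(); i++) {
      n += Long.bitCount(words.get(i));
    }
    return n;
  }
}
```

If every thread's range started and ended on a multiple of 64, no word would be shared and the loop would never retry, but segment boundaries are not generally aligned that way.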
[jira] [Commented] (SOLR-13350) Explore collector managers for multi-threaded search
[ https://issues.apache.org/jira/browse/SOLR-13350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17735586#comment-17735586 ]

Ishan Chattopadhyaya commented on SOLR-13350:

bq. Did a performance testing on this suite: https://github.com/fullstorydev/solr-bench/blob/master/suites/stress-facets-local.json Querying performance is about 24x slower! :-o Seems like some obvious bottleneck, that I'll be working on chasing down.

It seems that running a faceting query on its own causes a massive slowdown (26 seconds, as opposed to 300ms), whereas running a query that needs scores first and then firing a follow-up faceting query (which uses the cached docset from the previous step) is fast. I'm looking into why this could be happening.
[jira] [Commented] (SOLR-13350) Explore collector managers for multi-threaded search
[ https://issues.apache.org/jira/browse/SOLR-13350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17735331#comment-17735331 ]

Ishan Chattopadhyaya commented on SOLR-13350:

I've updated the patches here to latest main and 9x:
https://github.com/chatman/solr/tree/jira/solr-13350-5
https://github.com/chatman/solr/tree/jira/solr-13350-9x

I ran a performance test on this suite: https://github.com/fullstorydev/solr-bench/blob/master/suites/stress-facets-local.json

Querying performance is about 24x slower! :-o This looks like some obvious bottleneck, which I'll be working on chasing down. FYI [~markrmil...@gmail.com].
[jira] [Commented] (SOLR-13350) Explore collector managers for multi-threaded search
[ https://issues.apache.org/jira/browse/SOLR-13350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17315118#comment-17315118 ]

ASF subversion and git services commented on SOLR-13350:

Commit 36f268b65b12bb05da700f0a2c843acca7f30af5 in lucene-solr's branch refs/heads/tmp from Ishan Chattopadhyaya
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=36f268b ]

SOLR-13350: Multi-threaded search using collectors manager