[jira] [Resolved] (SOLR-15447) Reproduce with line inaccurate for class methods
[ https://issues.apache.org/jira/browse/SOLR-15447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

David Smiley resolved SOLR-15447.
---------------------------------
    Fix Version/s: 9.7
         Assignee: David Smiley
       Resolution: Fixed

> Reproduce with line inaccurate for class methods
> ------------------------------------------------
>
>                 Key: SOLR-15447
>                 URL: https://issues.apache.org/jira/browse/SOLR-15447
>             Project: Solr
>          Issue Type: Test
>          Components: Tests
>            Reporter: Mike Drob
>            Assignee: David Smiley
>            Priority: Major
>             Fix For: 9.7
>
>      Attachments: SOLR-15447.patch
>
> I had this failure running tests recently:
> {noformat}
> - org.apache.solr.cloud.api.collections.ShardSplitTest.classMethod (:solr:core)
>   Test output: /Users/mdrob/code/solr/solr/core/build/test-results/test/outputs/OUTPUT-org.apache.solr.cloud.api.collections.ShardSplitTest.txt
>   Reproduce with: gradlew :solr:core:test --tests "org.apache.solr.cloud.api.collections.ShardSplitTest.classMethod" -Ptests.jvms=16 -Ptests.jvmargs=-XX:TieredStopAtLevel=1 -Ptests.seed=40D22CF6F9086FB4 -Ptests.file.encoding=UTF-8
> {noformat}
> There is no "classMethod" method, so the given command line fails to reproduce the failure. Unfortunately, I do not have the logs anymore, but I strongly suspect this was a failure either in the test setup, the teardown, or possibly something like the thread leak detector.
>
> We should figure out how we can provide a better "Reproduce with" line for this type of failure.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org
[jira] [Commented] (SOLR-15447) Reproduce with line inaccurate for class methods
[ https://issues.apache.org/jira/browse/SOLR-15447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17851882#comment-17851882 ]

ASF subversion and git services commented on SOLR-15447:
--------------------------------------------------------

Commit 27ea8262994aef1a37b920a6f88586ea0540a0b7 in solr's branch refs/heads/branch_9x from David Smiley
[ https://gitbox.apache.org/repos/asf?p=solr.git;h=27ea8262994 ]

Build: report test history URLs. Fix classMethod suffix (#2487)

* Test failures reported at the end of a build now include links to view test history at ge.apache.org and fucit.org
* Use "./gradlew" (dot-slash) so we can paste the repro line to a shell
* SOLR-15447: classMethod suffix is removed

Refactored the gradle script some too.

(cherry picked from commit c80f344d0dfddf9124a3c29953722a2a000d3296)
Re: [PR] Build: report test history via GE [solr]
dsmiley merged PR #2487: URL: https://github.com/apache/solr/pull/2487

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (SOLR-15447) Reproduce with line inaccurate for class methods
[ https://issues.apache.org/jira/browse/SOLR-15447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17851880#comment-17851880 ]

ASF subversion and git services commented on SOLR-15447:
--------------------------------------------------------

Commit c80f344d0dfddf9124a3c29953722a2a000d3296 in solr's branch refs/heads/main from David Smiley
[ https://gitbox.apache.org/repos/asf?p=solr.git;h=c80f344d0df ]

Build: report test history URLs. Fix classMethod suffix (#2487)

* Test failures reported at the end of a build now include links to view test history at ge.apache.org and fucit.org
* Use "./gradlew" (dot-slash) so we can paste the repro line to a shell
* SOLR-15447: classMethod suffix is removed

Refactored the gradle script some too.
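The fix referenced in the commits above amounts to normalizing the reported test name before emitting the "Reproduce with" line: JUnit reports suite-level failures (setup, teardown, thread-leak checks) under the pseudo-method name "classMethod", which no `--tests` filter can match. A minimal sketch of that normalization — the class and method names here are illustrative assumptions, not the actual Gradle build-script code:

```java
public final class ReproLine {
    // Pseudo-method JUnit uses for suite-level (non-method) failures.
    private static final String CLASS_METHOD_SUFFIX = ".classMethod";

    /** Strip the pseudo-method so the filter targets the whole suite instead. */
    static String normalizeTestFilter(String reportedName) {
        if (reportedName.endsWith(CLASS_METHOD_SUFFIX)) {
            return reportedName.substring(
                0, reportedName.length() - CLASS_METHOD_SUFFIX.length());
        }
        return reportedName;
    }

    public static void main(String[] args) {
        String name = "org.apache.solr.cloud.api.collections.ShardSplitTest.classMethod";
        // "./gradlew" (dot-slash) so the line can be pasted directly into a shell.
        System.out.println("Reproduce with: ./gradlew :solr:core:test --tests \""
            + normalizeTestFilter(name) + "\"");
    }
}
```

With the suffix stripped, the filter matches the whole test class, so the suite-level failure (if deterministic) can actually be reproduced.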
Re: [PR] Upgrade stackbrew, fix gitCommit, remove 32bit images [solr-docker]
dsmiley commented on PR #23: URL: https://github.com/apache/solr-docker/pull/23#issuecomment-2146399330

I'd imagine a user who expects/requires 32-bit would find out relatively quickly; it's not a subtle thing. It works or it doesn't. So IMO there's no need to announce to or warn users.
Re: [PR] SOLR-17269: Do not publish synthetic solr core (of Coordinator node) to ZK [solr]
dsmiley commented on code in PR #2438: URL: https://github.com/apache/solr/pull/2438#discussion_r1625218938

##########
solr/core/src/java/org/apache/solr/core/ConfigSetService.java:
##########
@@ -391,6 +402,21 @@ protected NamedList loadConfigSetFlags(SolrResourceLoader loader) throws
   */
  protected abstract SolrResourceLoader createCoreResourceLoader(CoreDescriptor cd);

+  /**
+   * Create a SolrResourceLoader for a core with the provided configSetName.
+   *
+   * By default, this will just call {@link
+   * ConfigSetService#createConfigSetService(CoreContainer)}. Child implementation might override
+   * this to make use of the configSetName directly
+   *
+   * @param cd the core's CoreDescriptor
+   * @param configSetName an optional config set name
+   * @return a SolrResourceLoader
+   */
+  protected SolrResourceLoader createCoreResourceLoader(CoreDescriptor cd, String configSetName) {

Review Comment:
> The ultimate goal is to minimize code change and yet avoid calling [this part of the code](https://github.com/apache/solr/blob/main/solr/core/src/java/org/apache/solr/cloud/ZkConfigSetService.java#L77-L100) for the synthetic core, so we wouldn't be registering against ZK.

Why? It appears it'd have no ultimate effect -- it ought to resolve the same config for the same collection. Oh, is it the same collection for this feature?

I don't love subclassing CoreDescriptor, to be honest; you could just add a generic property in the core descriptor if you needed to flag one, like "synthetic". Nonetheless, overall I'm unsure why loading the ConfigSet needs to be different here.

Also, if for some reason we felt it useful, we could change ZkConfigSetService.loadConfigSet such that an existing configSet name in the CD will be used, and thus avoid resolving it.
Re: [PR] SOLR-17269: Do not publish synthetic solr core (of Coordinator node) to ZK [solr]
patsonluk commented on PR #2438: URL: https://github.com/apache/solr/pull/2438#issuecomment-2146314074

@dsmiley @noblepaul Many thanks for the helpful code review! I have pushed some changes based on the suggestions; would you mind reviewing again?
Re: [PR] Upgrade stackbrew, fix gitCommit, remove 32bit images [solr-docker]
janhoy commented on PR #23: URL: https://github.com/apache/solr-docker/pull/23#issuecomment-2146299104

The history here is that right after the 9.6 release I upgraded a cluster on my M1 Mac to the new version, but got some strange errors. Solr would not start, but threw some weird low-level JDK IO exception. I spent quite some time before realizing that, due to some serious delays before the Docker builders built the arm64 images, Docker on my M1 Mac decided to instead pull the armv7 32-bit images, which fail due to the 64/32-bit mismatch. Worse, even after the arm64 images became available, Docker would NOT pull them automatically until I deleted the faulty image or force-pulled the new one. I can imagine the same happening on some ARM-based cloud server if it happened to auto-upgrade immediately after a release announcement but before the arm64 image was built (this can take 12+ hours).

I was not aware we even build a 32-bit image. And I cannot imagine what system would run Solr on such a chip. Some old RasPi perhaps, although nowadays even smaller systems are 64-bit, no? Remember that the Docker binary images are a convenience release and the supported architectures are clearly labeled on Hub, so we haven't taken anything away or broken any promise, other than perhaps it is surprising to potential 32-bit users that support disappears in a minor release. Also, users can build whatever support they like using the Dockerfile we publish.

Do you think we should do an ANNOUNCE-type email about this?
Re: [PR] SOLR-17269: Do not publish synthetic solr core (of Coordinator node) to ZK [solr]
patsonluk commented on code in PR #2438: URL: https://github.com/apache/solr/pull/2438#discussion_r1625146698

##########
solr/core/src/test/org/apache/solr/search/TestCoordinatorRole.java:
##########
@@ -105,14 +102,11 @@ public void testSimple() throws Exception {
     assertEquals(10, rslt.getResults().size());

+    String SYNTHETIC_COLLECTION = CoordinatorHttpSolrCall.getSyntheticCollectionName("conf");
     DocCollection collection =
         cluster.getSolrClient().getClusterStateProvider().getCollection(SYNTHETIC_COLLECTION);
-    assertNotNull(collection);
-
-    Set expectedNodes = new HashSet<>();
-    expectedNodes.add(coordinatorJetty.getNodeName());
-    collection.forEachReplica((s, replica) -> expectedNodes.remove(replica.getNodeName()));
-    assertTrue(expectedNodes.isEmpty());
+    // this should be empty as synthetic collection does not register with ZK
+    assertNull(collection);

Review Comment:
Good call!
Re: [PR] SOLR-17269: Do not publish synthetic solr core (of Coordinator node) to ZK [solr]
patsonluk commented on code in PR #2438: URL: https://github.com/apache/solr/pull/2438#discussion_r1625146698

##########
solr/core/src/test/org/apache/solr/search/TestCoordinatorRole.java:
##########
@@ -105,14 +102,11 @@ public void testSimple() throws Exception {
     assertEquals(10, rslt.getResults().size());

+    String SYNTHETIC_COLLECTION = CoordinatorHttpSolrCall.getSyntheticCollectionName("conf");
     DocCollection collection =
         cluster.getSolrClient().getClusterStateProvider().getCollection(SYNTHETIC_COLLECTION);
-    assertNotNull(collection);
-
-    Set expectedNodes = new HashSet<>();
-    expectedNodes.add(coordinatorJetty.getNodeName());
-    collection.forEachReplica((s, replica) -> expectedNodes.remove(replica.getNodeName()));
-    assertTrue(expectedNodes.isEmpty());
+    // this should be empty as synthetic collection does not register with ZK
+    assertNull(collection);

Review Comment:
Good call! Also, we should confirm that there's no node created for the synthetic core. Thank you for the suggestion!
Re: [PR] SOLR-17269: Do not publish synthetic solr core (of Coordinator node) to ZK [solr]
patsonluk commented on code in PR #2438: URL: https://github.com/apache/solr/pull/2438#discussion_r1625146172

##########
solr/core/src/java/org/apache/solr/core/ConfigSetService.java:
##########
@@ -391,6 +402,21 @@ protected NamedList loadConfigSetFlags(SolrResourceLoader loader) throws
   */
  protected abstract SolrResourceLoader createCoreResourceLoader(CoreDescriptor cd);

+  /**
+   * Create a SolrResourceLoader for a core with the provided configSetName.
+   *
+   * By default, this will just call {@link
+   * ConfigSetService#createConfigSetService(CoreContainer)}. Child implementation might override
+   * this to make use of the configSetName directly
+   *
+   * @param cd the core's CoreDescriptor
+   * @param configSetName an optional config set name
+   * @return a SolrResourceLoader
+   */
+  protected SolrResourceLoader createCoreResourceLoader(CoreDescriptor cd, String configSetName) {

Review Comment:
Yes, the choice I took was problematic! The ultimate goal is to minimize code change and yet avoid calling [this part of the code](https://github.com/apache/solr/blob/main/solr/core/src/java/org/apache/solr/cloud/ZkConfigSetService.java#L77-L100) for the synthetic core, so we wouldn't be registering against ZK. Since the quoted code "reads the config name from cluster state and sets it against the CoreDescriptor", I thought a new "explicit" configset could indicate that we can skip that logic for `ZkConfigSetService`. But it didn't turn out well if it confused other devs...

Adding any flags/override method in `ConfigSetService` is likely not great, since other non-ZK implementations shouldn't really care. So I'm proposing to add a new class `SyntheticCoreDescriptor` which extends `CoreDescriptor`, and `ZkConfigSetService#createCoreResourceLoader` could special-case it (`instanceof SyntheticCoreDescriptor` -- not great, but at least it's all contained in the ZK-related class) to bypass the ZK logic.

Going to commit and push the change; please do let me know if there are better alternatives!
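The proposal in the comment above — a marker subclass of `CoreDescriptor` that `ZkConfigSetService` special-cases — can be sketched as follows. The class bodies and return values are illustrative stand-ins (not the actual Solr types or the patch), kept minimal so the control flow is runnable:

```java
// Stand-in for org.apache.solr.core.CoreDescriptor, to keep the sketch self-contained.
class CoreDescriptor {}

// Marker subclass: identifies the coordinator's synthetic core without
// leaking a "synthetic" flag into the generic ConfigSetService API.
class SyntheticCoreDescriptor extends CoreDescriptor {}

class ZkConfigSetService {
    // Returns a label instead of a real SolrResourceLoader, to keep this runnable.
    String createCoreResourceLoader(CoreDescriptor cd) {
        if (cd instanceof SyntheticCoreDescriptor) {
            // Synthetic core: skip reading the config name from cluster state
            // and skip registering anything in ZooKeeper.
            return "local-configset";
        }
        // All other cores keep the existing ZK-backed config resolution.
        return "zk-configset";
    }
}
```

The design trade-off discussed in the thread is visible here: the `instanceof` check confines the special case to the ZK implementation, at the cost of a type test that a generic core-descriptor property would avoid.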
[jira] [Commented] (SOLR-16122) TestLeaderElectionZkExpiry failing frequently
[ https://issues.apache.org/jira/browse/SOLR-16122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17851833#comment-17851833 ]

David Smiley commented on SOLR-16122:
-------------------------------------

This test seems to fail due to thread leaks. Happened yesterday in CI:
{noformat}
  2> INFO: All leaked threads terminated.
  > com.carrotsearch.randomizedtesting.ThreadLeakError: 2 threads leaked from SUITE scope at org.apache.solr.cloud.TestLeaderElectionZkExpiry:
  >    1) Thread[id=9557, name=zkConnectionManagerCallback-5960-thread-1-EventThread, state=WAITING, group=TGRP-TestLeaderElectionZkExpiry]
  >         at java.base@11.0.16.1/jdk.internal.misc.Unsafe.park(Native Method)
  >         at java.base@11.0.16.1/java.util.concurrent.locks.LockSupport.park(LockSupport.java:194)
  >         at java.base@11.0.16.1/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2081)
  >         at java.base@11.0.16.1/java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:433)
  >         at app//org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:535)
  >    2) Thread[id=9549, name=zkConnectionManagerCallback-5960-thread-1-EventThread, state=WAITING, group=TGRP-TestLeaderElectionZkExpiry]
  >         at java.base@11.0.16.1/java.lang.Object.wait(Native Method)
  >         at java.base@11.0.16.1/java.lang.Object.wait(Object.java:328)
  >         at app//org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1583)
  >         at app//org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1555)
  >         at app//org.apache.zookeeper.ClientCnxn.close(ClientCnxn.java:1522)
  >         at app//org.apache.zookeeper.ZooKeeper.close(ZooKeeper.java:1227)
  >         at app//org.apache.solr.common.cloud.SolrZkClient.updateKeeper(SolrZkClient.java:863)
  >         at app//org.apache.solr.common.cloud.ConnectionManager$1.update(ConnectionManager.java:190)
  >         at app//org.apache.solr.common.cloud.DefaultConnectionStrategy.reconnect(DefaultConnectionStrategy.java:59)
  >         at app//org.apache.solr.common.cloud.ConnectionManager.process(ConnectionManager.java:179)
  >         at app//org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:564)
  >         at app//org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:539)
  >         at __randomizedtesting.SeedInfo.seed([B35AE6C0068D8659]:0)
{noformat}
And also for me on Crave recently (this time the OverseerExitThread):
{noformat}
  2> SEVERE: 1 thread leaked from SUITE scope at org.apache.solr.cloud.TestLeaderElectionZkExpiry:
  2>    1) Thread[id=349, name=OverseerExitThread, state=TIMED_WAITING, group=Overseer state updater.]
  2>         at java.base@11.0.23/java.lang.Thread.sleep(Native Method)
  2>         at app//org.apache.solr.common.cloud.ZkCmdExecutor.retryDelay(ZkCmdExecutor.java:101)
  2>         at app//org.apache.solr.common.cloud.ZkCmdExecutor.retryOperation(ZkCmdExecutor.java:80)
  2>         at app//org.apache.solr.common.cloud.SolrZkClient.delete(SolrZkClient.java:345)
  2>         at app//org.apache.solr.cloud.LeaderElector.checkIfIamLeader(LeaderElector.java:118)
  2>         at app//org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:310)
  2>         at app//org.apache.solr.cloud.LeaderElector.retryElection(LeaderElector.java:395)
  2>         at app//org.apache.solr.cloud.LeaderElector.checkIfIamLeader(LeaderElector.java:133)
  2>         at app//org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:310)
  2>         at app//org.apache.solr.cloud.LeaderElector.retryElection(LeaderElector.java:395)
  2>         at app//org.apache.solr.cloud.ZkController.rejoinOverseerElection(ZkController.java:2364)
  2>         at app//org.apache.solr.cloud.Overseer$ClusterStateUpdater.checkIfIamStillLeader(Overseer.java:511)
  2>         at app//org.apache.solr.cloud.Overseer$ClusterStateUpdater$$Lambda$1667/0x00010099b840.run(Unknown Source)
  2>         at java.base@11.0.23/java.lang.Thread.run(Thread.java:829)
{noformat}
This one above seems clear to me how it could happen, since a new Thread is spawned with no wait [here|https://github.com/apache/solr/blob/70b6e4f6952cb7f9b3647865404487c68264668d/solr/core/src/java/org/apache/solr/cloud/Overseer.java#L417].

> TestLeaderElectionZkExpiry failing frequently
> ---------------------------------------------
>
>                 Key: SOLR-16122
>                 URL: https://issues.apache.org/jira/browse/SOLR-16122
>             Project: Solr
>          Issue Type: Bug
>    Affects Versions: 9.0
>            Reporter: Jan Høydahl
>            Priority: Major
>
> Failing in 10% of runs - marking as {{@BadApple}} before the 9.0 release
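The pattern Smiley points at — spawning a cleanup thread and returning without waiting on it — is exactly what the randomized-testing leak detector flags. A minimal illustration of the shape of a fix (the names and the bounded join are assumptions for illustration; this is not the Overseer code):

```java
public class OverseerExitSketch {
    // Hypothetical stand-in for the exit/cleanup work the real thread performs
    // (check leadership, rejoin election, delete ZK nodes, ...).
    static Thread startExitThread(Runnable work) {
        Thread t = new Thread(work, "OverseerExitThread");
        t.start();
        return t;
    }

    public static void main(String[] args) throws InterruptedException {
        Thread t = startExitThread(() -> {
            // cleanup work would go here
        });
        // Bounded join before the suite tears down; code that returns without
        // any such wait is what lets the thread-leak detector fire.
        t.join(10_000);
        System.out.println(t.isAlive() ? "still running" : "terminated");
    }
}
```

Keeping a reference to the spawned thread (or registering it with an executor that is shut down and awaited in teardown) gives the test lifecycle something to wait on.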
Re: [PR] Refactor duplication in UpdateLog.init [solr]
dsmiley commented on PR #2491: URL: https://github.com/apache/solr/pull/2491#issuecomment-2146223629

`org.apache.solr.cloud.TestLeaderElectionZkExpiry` failed due to leaked threads in the test (they terminated eventually). Also not related to this change. Will comment on https://issues.apache.org/jira/browse/SOLR-16122 where this test is BadApple'ed.
Re: [PR] SOLR-17044: Consolidate SolrJ URL-building logic [solr]
dsmiley commented on code in PR #2455: URL: https://github.com/apache/solr/pull/2455#discussion_r1624900136

##########
solr/solrj/src/java/org/apache/solr/client/solrj/util/ClientUtils.java:
##########
@@ -56,6 +63,51 @@ public static Collection toContentStreams(
     return streams;
   }

+  /**
+   * Create the full URL for a SolrRequest (excepting query parameters) as a String
+   *
+   * @param solrRequest the {@link SolrRequest} to build the URL for
+   * @param requestWriter a {@link RequestWriter} from the {@link SolrClient} that will be sending
+   *     the request
+   * @param serverRootUrl the root URL of the Solr server being targeted. May be overridden by {@link
+   *     SolrRequest#getBasePath()}, if present.
+   * @param collection the collection to send the request to. May be null if no collection is
+   *     needed.
+   * @throws MalformedURLException if {@code serverRootUrl} or {@link SolrRequest#getBasePath()}
+   *     contain a malformed URL string
+   */
+  public static String buildRequestUrl(
+      SolrRequest solrRequest,
+      RequestWriter requestWriter,
+      String serverRootUrl,
+      String collection)
+      throws MalformedURLException {
+    String basePath = solrRequest.getBasePath() == null ? serverRootUrl : solrRequest.getBasePath();
+
+    if (solrRequest instanceof V2Request) {
+      if (System.getProperty("solr.v2RealPath") == null) {
+        basePath = changeV2RequestEndpoint(basePath);
+      } else {
+        basePath = serverRootUrl + "/v2";
+      }
+    }
+
+    if (solrRequest.requiresCollection() && collection != null) basePath += "/" + collection;
+
+    String path = requestWriter.getPath(solrRequest);

Review Comment:
#2494 -- I increased the scope of an existing issue a little, as it is somewhat related.
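The URL-assembly precedence in the reviewed hunk — an explicit per-request base path winning over the server root, then an optional collection segment, then the request path — can be exercised in isolation. A self-contained sketch of those rules (not the SolrJ implementation; the V2 rewrite and sysprop branch are omitted):

```java
public class UrlBuild {
    // Mirrors the precedence in the reviewed method: an explicit request base
    // path wins over the server root; the collection segment is appended only
    // when the request needs one and a collection was supplied.
    static String buildRequestUrl(String requestBasePath, String serverRootUrl,
                                  boolean requiresCollection, String collection,
                                  String requestPath) {
        String basePath = requestBasePath == null ? serverRootUrl : requestBasePath;
        if (requiresCollection && collection != null) {
            basePath += "/" + collection;
        }
        return basePath + requestPath;
    }
}
```

For example, a collection-scoped query against the server root yields `<root>/<collection>/select`, while a request carrying its own base path ignores the server root entirely.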
[jira] [Updated] (SOLR-17320) HttpShardHandler should obey `timeAllowed` parameter in query
[ https://issues.apache.org/jira/browse/SOLR-17320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hitesh Khamesra updated SOLR-17320:
-----------------------------------
    Description:
HttpShardHandler should use the `timeAllowed` query parameter as the timeout for any shard response. We have observed that sometimes different shards take different amounts of time to process a query. In those cases, if the user has specified timeAllowed, then Solr should use that time to return any partial response. I have added a patch for it: [https://github.com/apache/solr/pull/2493]

  was:
HttpShardHandler should use `timeAllowed` param in query to timeout for any shard response. We have observed that sometime different shard takes different time to process the query. In those cases, if user has specify timeAllowed, then solr use the time to return any partial response. i have added the patch for it. https://github.com/apache/solr/pull/2493

> HttpShardHandler should obey `timeAllowed` parameter in query
> -------------------------------------------------------------
>
>                 Key: SOLR-17320
>                 URL: https://issues.apache.org/jira/browse/SOLR-17320
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public (Default Security Level. Issues are Public)
>          Components: query, SolrCloud
>    Affects Versions: main (10.0), 9.6.1
>            Reporter: Hitesh Khamesra
>            Priority: Minor
>
> HttpShardHandler should use the `timeAllowed` query parameter as the timeout for any shard response. We have observed that sometimes different shards take different amounts of time to process a query. In those cases, if the user has specified timeAllowed, then Solr should use that time to return any partial response. I have added a patch for it: [https://github.com/apache/solr/pull/2493]
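The improvement described above — bounding how long the distributed-search coordinator waits for each shard by the query's timeAllowed, and returning a partial response on timeout — can be sketched with a `Future` timeout. This is illustrative only (the names and partial-result marker are assumptions, not the HttpShardHandler patch):

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class ShardWait {
    /**
     * Wait for a shard response no longer than timeAllowed; on timeout,
     * return a partial-results marker instead of blocking indefinitely
     * on a slow shard.
     */
    static String takeResponse(Future<String> shardResponse, long timeAllowedMs) {
        try {
            return shardResponse.get(timeAllowedMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // Slow shard: give up and let the caller assemble partial results.
            return "partial";
        } catch (InterruptedException | ExecutionException e) {
            throw new RuntimeException(e);
        }
    }
}
```

In a real handler the per-shard deadline would be derived from a shared overall deadline (time already spent is subtracted from timeAllowed for each successive wait), but the bounded `get` captures the core idea.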
[PR] SOLR-17256: Deprecate SolrRequest get/set BasePath [solr]
dsmiley opened a new pull request, #2494: URL: https://github.com/apache/solr/pull/2494

... and RequestWriter.getPath

https://issues.apache.org/jira/browse/SOLR-17256

First commit is deprecation; maybe just take this PR now and then follow up in another PR with replacements and removal.

Not touching CHANGES.txt merely to mark some obscure methods deprecated, but will eventually add to CHANGES.txt when there's something more interesting.
Re: [PR] SOLR-17044: Consolidate SolrJ URL-building logic [solr]
gerlowskija commented on code in PR #2455: URL: https://github.com/apache/solr/pull/2455#discussion_r1624884632

##########
solr/solrj/src/java/org/apache/solr/client/solrj/util/ClientUtils.java:
##########
@@ -56,6 +63,51 @@ public static Collection toContentStreams(
     return streams;
   }

+  public static String buildRequestUrl(
+      SolrRequest solrRequest,
+      RequestWriter requestWriter,
+      String serverRootUrl,
+      String collection)
+      throws MalformedURLException {
+    String basePath = solrRequest.getBasePath() == null ? serverRootUrl : solrRequest.getBasePath();
+
+    if (solrRequest instanceof V2Request) {
+      if (System.getProperty("solr.v2RealPath") == null) {
+        basePath = changeV2RequestEndpoint(basePath);
+      } else {
+        basePath = serverRootUrl + "/v2";

Review Comment:
Agreed. One of my goals for this PR is to get all the logic into one place so that we can more easily identify the warts that don't make sense anymore, and look at removing them.

For this one in particular, the sysprop seems to be primarily for our tests, and isn't documented anywhere... maybe it could be removed altogether?
Re: [PR] SOLR-17044: Consolidate SolrJ URL-building logic [solr]
gerlowskija commented on code in PR #2455: URL: https://github.com/apache/solr/pull/2455#discussion_r1624867836

##########
solr/solrj/src/java/org/apache/solr/client/solrj/util/ClientUtils.java:
##########
@@ -56,6 +63,51 @@ public static Collection toContentStreams(
     return streams;
   }

+  public static String buildRequestUrl(
+      SolrRequest solrRequest,
+      RequestWriter requestWriter,
+      String serverRootUrl,
+      String collection)
+      throws MalformedURLException {
+    String basePath = solrRequest.getBasePath() == null ? serverRootUrl : solrRequest.getBasePath();
+
+    if (solrRequest instanceof V2Request) {
+      if (System.getProperty("solr.v2RealPath") == null) {
+        basePath = changeV2RequestEndpoint(basePath);
+      } else {
+        basePath = serverRootUrl + "/v2";
+      }
+    }
+
+    if (solrRequest.requiresCollection() && collection != null) basePath += "/" + collection;
+
+    String path = requestWriter.getPath(solrRequest);

Review Comment:
Agreed! I didn't tackle that here because, as a public and pluggable class in SolrJ, there are some backcompat implications that I didn't want to drag this PR down with. But I'm all for the change if you want to go for it, or if it climbs high enough up my priority list...
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
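The base-path selection in `buildRequestUrl` above can be sketched in isolation. This is a minimal sketch, not the SolrJ API: the `SolrRequest`/`RequestWriter` machinery is stubbed out as plain parameters, the V2 branch is omitted, and only the null-check and collection-appending rules mirror the quoted hunk.

```java
public class UrlSketch {
    // Mirrors the quoted logic: prefer the request's own base path over the
    // server root, then append the collection (when required) and the path.
    static String buildRequestUrl(
            String requestBasePath, String serverRootUrl,
            boolean requiresCollection, String collection, String path) {
        String basePath = requestBasePath == null ? serverRootUrl : requestBasePath;
        if (requiresCollection && collection != null) {
            basePath += "/" + collection;
        }
        return basePath + path;
    }

    public static void main(String[] args) {
        // prints http://localhost:8983/solr/techproducts/select
        System.out.println(buildRequestUrl(
                null, "http://localhost:8983/solr", true, "techproducts", "/select"));
    }
}
```

A per-request base path (when non-null) wins over the client's server root, which is what lets a single client target different nodes per request.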
Re: [PR] SOLR-14673: Add bin/solr stream CLI [solr]
gerlowskija commented on PR #2479: URL: https://github.com/apache/solr/pull/2479#issuecomment-2145825393

A few high-level questions/concerns:

1. `bin/solr` already has an "api" tool, which can be used to invoke streaming expressions, e.g. `bin/solr api -get "$SOLR_URL/techproducts/stream?expr=search(techproducts)"`. I'm all for syntactic sugar, but I wonder whether this is worth the maintenance cost if the main thing it "buys" us is saving people from having to provide the full API path as the "api" tool requires?

2. If I'm reading the PR correctly, one other capability of the proposed `bin/solr stream` tool is that it can evaluate streams "locally" in some cases, i.e. without a full running Solr. Which is pretty cool - you could imagine a real super-user doing some pretty involved ETL that builds off of an expression like `update(techproducts, unique(cat(...)))`. But I'd worry about some of the documentation challenges surrounding this. For instance, how would a user know which expressions can be run locally, and which require a Solr to execute on? For expressions that have a mix of both locally and remotely-executed clauses, is there any way for a user to know which clauses are executed where?

To clarify - I think the upside here is pretty cool, I'm just worried about the documentation end and what we might need to make it usable by folks in practice.
[jira] [Updated] (SOLR-17320) HttpShardHandler should obey `timeAllowed` parameter in query
[ https://issues.apache.org/jira/browse/SOLR-17320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hitesh Khamesra updated SOLR-17320:
-----------------------------------
    Description: 
HttpShardHandler should use the `timeAllowed` query parameter as a timeout for each shard's response. We have observed that different shards sometimes take different amounts of time to process a query. In such cases, if the user has specified timeAllowed, Solr should use that time limit to return a partial response.

I have added a patch for it: https://github.com/apache/solr/pull/2493

  was:
HttpShardHandler should use the `timeAllowed` query parameter as a timeout for each shard's response. We have observed that different shards sometimes take different amounts of time to process a query. In such cases, if the user has specified timeAllowed, Solr should use that time limit to return a partial response.

I have added a patch for it.

> HttpShardHandler should obey `timeAllowed` parameter in query
> -------------------------------------------------------------
>
>                 Key: SOLR-17320
>                 URL: https://issues.apache.org/jira/browse/SOLR-17320
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public(Default Security Level. Issues are Public)
>          Components: query, SolrCloud
>    Affects Versions: main (10.0), 9.6.1
>            Reporter: Hitesh Khamesra
>            Priority: Minor
>
> HttpShardHandler should use the `timeAllowed` query parameter as a timeout for each shard's response. We have observed that different shards sometimes take different amounts of time to process a query. In such cases, if the user has specified timeAllowed, Solr should use that time limit to return a partial response.
> I have added a patch for it: https://github.com/apache/solr/pull/2493

-- 
This message was sent by Atlassian Jira
(v8.20.10#820010)
[jira] [Created] (SOLR-17320) HttpShardHandler should obey `timeAllowed` parameter in query
Hitesh Khamesra created SOLR-17320:
--------------------------------------

             Summary: HttpShardHandler should obey `timeAllowed` parameter in query
                 Key: SOLR-17320
                 URL: https://issues.apache.org/jira/browse/SOLR-17320
             Project: Solr
          Issue Type: Improvement
      Security Level: Public (Default Security Level. Issues are Public)
          Components: query, SolrCloud
    Affects Versions: 9.6.1, main (10.0)
            Reporter: Hitesh Khamesra

HttpShardHandler should use the `timeAllowed` query parameter as a timeout for each shard's response. We have observed that different shards sometimes take different amounts of time to process a query. In such cases, if the user has specified timeAllowed, Solr should use that time limit to return a partial response.

I have added a patch for it.
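The behavior being proposed (bounding how long the shard handler waits for any one shard, then returning partial results) can be sketched with a plain `CompletableFuture` timeout. The names here are illustrative only, not the actual HttpShardHandler API:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class ShardWaitSketch {
    // Wait up to timeAllowedMs for one shard's response; on timeout, fall
    // back to a partial-results marker instead of blocking indefinitely.
    static String awaitShardResponse(CompletableFuture<String> shardCall, long timeAllowedMs) {
        try {
            return shardCall.get(timeAllowedMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            return "partialResults=true";
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        CompletableFuture<String> slow = new CompletableFuture<>(); // never completes
        System.out.println(awaitShardResponse(slow, 50));  // prints partialResults=true
        System.out.println(awaitShardResponse(
                CompletableFuture.completedFuture("ok"), 50));  // prints ok
    }
}
```

The key point is that the timeout comes from the query's own `timeAllowed` value rather than a fixed socket/handler default, so slow shards degrade a single query to partial results instead of stalling it.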
[PR] Added support for timeAllowed time out in HttpShardHandler [solr]
hiteshk25 opened a new pull request, #2493: URL: https://github.com/apache/solr/pull/2493

https://issues.apache.org/jira/browse/SOLR-X

# Description

Please provide a short description of the changes you're making with this pull request.

# Solution

Please provide a short description of the approach taken to implement your solution.

# Tests

Please describe the tests you've developed or run to confirm this patch implements the feature or solves the problem.

# Checklist

Please review the following and check all that apply:

- [ ] I have reviewed the guidelines for [How to Contribute](https://github.com/apache/solr/blob/main/CONTRIBUTING.md) and my code conforms to the standards described there to the best of my ability.
- [ ] I have created a Jira issue and added the issue ID to my pull request title.
- [ ] I have given Solr maintainers [access](https://help.github.com/en/articles/allowing-changes-to-a-pull-request-branch-created-from-a-fork) to contribute to my PR branch. (optional but recommended)
- [ ] I have developed this patch against the `main` branch.
- [ ] I have run `./gradlew check`.
- [ ] I have added tests for my changes.
- [ ] I have added documentation for the [Reference Guide](https://github.com/apache/solr/tree/main/solr/solr-ref-guide)
[PR] Add 'score' field caveat to cursormark docs [solr]
gerlowskija opened a new pull request, #2492: URL: https://github.com/apache/solr/pull/2492

# Description

Cursormark can give some funky results if used across multiple replicas in a SolrCloud collection, if `score` is used as a sort field. But there's not currently any warning of this in the ref-guide.

# Solution

This PR documents the limitation for users in the ref-guide.

# Tests

N/A

# Checklist

Please review the following and check all that apply:

- [x] I have reviewed the guidelines for [How to Contribute](https://github.com/apache/solr/blob/main/CONTRIBUTING.md) and my code conforms to the standards described there to the best of my ability.
- [x] I have given Solr maintainers [access](https://help.github.com/en/articles/allowing-changes-to-a-pull-request-branch-created-from-a-fork) to contribute to my PR branch. (optional but recommended)
- [x] I have developed this patch against the `main` branch.
- [x] I have run `./gradlew check`.
- [x] I have added documentation for the [Reference Guide](https://github.com/apache/solr/tree/main/solr/solr-ref-guide)
[jira] [Commented] (SOLR-16654) Add support for node-level caches
[ https://issues.apache.org/jira/browse/SOLR-16654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17851733#comment-17851733 ]

Michael Gibney commented on SOLR-16654:
---------------------------------------

Thanks for the heads-up. I dug into this a bit, and it looks like it's actually quite difficult to come up with concrete expectations regarding the exact entries that will be warmed and/or evicted. I'll re-work this test accordingly (testing for expectations that _can_ reliably be met) and push a fix.

> Add support for node-level caches
> ---------------------------------
>
>                 Key: SOLR-16654
>                 URL: https://issues.apache.org/jira/browse/SOLR-16654
>             Project: Solr
>          Issue Type: New Feature
>    Affects Versions: main (10.0)
>            Reporter: Michael Gibney
>            Priority: Minor
>             Fix For: 9.4
>
>          Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> Caches are currently configured only at the level of individual cores, sized according to expected usage patterns for the core.
> The main tradeoff in cache sizing is heap space, which is of course limited at the JVM/node level. Thus there is a conflict between sizing caches to per-core use patterns vs. sizing caches to enforce limits on overall heap usage.
> This issue proposes some minor changes to facilitate the introduction of node-level caches:
> # support a {{}} node in {{solr.xml}}, to parse named cache configs, for caches to be instantiated/accessible at the level of {{CoreContainer}}. The syntax of this config node would be identical to the syntax of the "user caches" config in {{solrconfig.xml}}.
> # provide a hook in searcher warming to initialize core-level caches with the initial associated searcher (analogous to {{warm()}}, but for the initial searcher -- see SOLR-16017, which fwiw was initially opened to support a different use case that requires identical functionality).
> Part of the appeal of this approach is that the above (minimal) changes are the only changes required to enable pluggable node-level cache implementations -- i.e. no further API changes are necessary, and no behavioral changes are introduced for existing code.
> Note: I anticipate that the functionality enabled by node-level caches will mainly be useful for enforcing global resource limits -- it is not primarily expected to be used for sharing entries across different cores/searchers (although such use would be possible).
> Initial use cases envisioned:
> # "thin" core-level caches (filterCache, queryResultCache, etc.) backed by "node-level" caches.
> # dynamic (i.e. not static-"firstSearcher") warming of OrdinalMaps, by placing OrdinalMaps in an actual cache with, e.g., a time-based expiration policy.
> This functionality would be particularly useful for cases with many cores per node, and even more so in cases with uneven core usage patterns. But having the ability to configure resource limits at a level that directly corresponds to the available resources (i.e., node-level) would be generally useful for all cases.
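As a concrete illustration of point 1 in the proposal above (a node-level config mirroring the per-core "user caches" syntax), a solr.xml fragment might look like the following. The element name and attributes are an assumption modeled on the solrconfig.xml cache syntax, not taken from the patch:

```xml
<!-- Hypothetical: named node-level caches declared in solr.xml, using the
     same attributes as a per-core <cache/> entry in solrconfig.xml -->
<solr>
  <caches>
    <cache name="nodeLevelUserCache"
           class="solr.CaffeineCache"
           size="4096"
           initialSize="1024"
           autowarmCount="0"/>
  </caches>
</solr>
```

Per the proposal, such caches would be instantiated by the CoreContainer, so individual cores could reference them as a shared, globally bounded resource.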
[jira] [Commented] (SOLR-17272) PerReplicaState: Replica "state" and "leader" are still in state.json
[ https://issues.apache.org/jira/browse/SOLR-17272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17851715#comment-17851715 ]

ASF subversion and git services commented on SOLR-17272:
--------------------------------------------------------

Commit 71160170a4d44ef755e6ff71c47378f45ca79c9e in solr's branch refs/heads/main from Noble Paul
[ https://gitbox.apache.org/repos/asf?p=solr.git;h=71160170a4d ]

SOLR-17272: PerReplicaState: Replica "state" and "leader" are still in state.json (#2458)

> PerReplicaState: Replica "state" and "leader" are still in state.json
> ---------------------------------------------------------------------
>
>                 Key: SOLR-17272
>                 URL: https://issues.apache.org/jira/browse/SOLR-17272
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Justin Sweeney
>            Assignee: Noble Paul
>            Priority: Major
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> When PerReplicaState is enabled, the replica "state" and "leader" attributes should be managed per replica and no longer be needed in state.json. These attributes should be removed from state.json in this case to avoid confusion about the source of truth for replica state.
Re: [PR] SOLR-17272: PerReplicaState: Replica "state" and "leader" are still in state.json [solr]
noblepaul merged PR #2458: URL: https://github.com/apache/solr/pull/2458
Re: [PR] Introduce support for Reciprocal Rank Fusion (combining queries) [solr]
renatoh commented on code in PR #2489: URL: https://github.com/apache/solr/pull/2489#discussion_r1624657341

## solr/core/src/java/org/apache/solr/search/combining/ReciprocalRankFusion.java: ##

@@ -0,0 +1,208 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.solr.search.combining;
+
+import static org.apache.lucene.search.TotalHits.Relation.GREATER_THAN_OR_EQUAL_TO;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.StringJoiner;
+import java.util.stream.Collectors;
+import java.util.stream.Stream;
+import org.apache.lucene.document.Document;
+import org.apache.lucene.search.Explanation;
+import org.apache.lucene.search.Query;
+import org.apache.solr.common.params.CombinerParams;
+import org.apache.solr.common.params.SolrParams;
+import org.apache.solr.common.util.NamedList;
+import org.apache.solr.common.util.SimpleOrderedMap;
+import org.apache.solr.schema.IndexSchema;
+import org.apache.solr.search.DocIterator;
+import org.apache.solr.search.DocList;
+import org.apache.solr.search.DocSlice;
+import org.apache.solr.search.QueryResult;
+import org.apache.solr.search.SolrIndexSearcher;
+import org.apache.solr.search.SortedIntDocSet;
+
+/**
+ * Reciprocal Rank Fusion (RRF) is an algorithm that takes as input multiple ranked lists to produce
+ * a unified result set. Examples of use cases where RRF can be used include hybrid search and
+ * multiple Knn vector queries executed concurrently. RRF is based on the concept of reciprocal
+ * rank, which is the inverse of the rank of a document in a ranked list of search results. Search
+ * results are combined taking into account the positions of the items in the original rankings,
+ * giving higher scores to items that are ranked higher in multiple lists. RRF was first introduced
+ * by Cormack et al. in [1].
+ *
+ * [1] Cormack, Gordon V. et al. “Reciprocal rank fusion outperforms condorcet and individual
+ * rank learning methods.” Proceedings of the 32nd international ACM SIGIR conference on Research
+ * and development in information retrieval (2009)
+ */
+public class ReciprocalRankFusion extends QueriesCombiner {
+  int k;
+
+  public ReciprocalRankFusion(SolrParams requestParams) {
+    super(requestParams);
+    this.k =
+        requestParams.getInt(CombinerParams.COMBINER_RRF_K, CombinerParams.COMBINER_RRF_K_DEFAULT);
+  }
+
+  @Override
+  public QueryResult combine(QueryResult[] rankedLists) {
+    QueryResult combinedRankedList = initCombinedResult(rankedLists);
+    List<DocList> docLists = new ArrayList<>(rankedLists.length);
+    for (QueryResult rankedList : rankedLists) {
+      docLists.add(rankedList.getDocList());
+    }
+    combineResults(combinedRankedList, docLists, false);
+    return combinedRankedList;
+  }
+
+  private Map<Integer, List<Integer>> combineResults(
+      QueryResult combinedRankedList,
+      List<DocList> rankedLists,
+      boolean saveRankPositionsForExplain) {
+    Map<Integer, List<Integer>> docIdToRanks = null;
+    HashMap<Integer, Float> docIdToScore = new HashMap<>();
+    for (DocList rankedList : rankedLists) {
+      DocIterator docs = rankedList.iterator();
+      int ranking = 1;
+      while (docs.hasNext() && ranking <= upTo) {
+        int docId = docs.nextDoc();
+        float rrfScore = 1f / (k + ranking);
+        docIdToScore.compute(docId, (id, score) -> (score == null) ? rrfScore : score + rrfScore);
+        ranking++;
+      }
+    }
+    Stream<Map.Entry<Integer, Float>> sortedByScoreDescending =
+        docIdToScore.entrySet().stream()
+            .sorted(Collections.reverseOrder(Map.Entry.comparingByValue()));
+
+    int combinedResultsLength = docIdToScore.size();
+    int[] combinedResultsDocIds = new int[combinedResultsLength];
+    float[] combinedResultScores = new float[combinedResultsLength];
+
+    int i = 0;
+    for (Map.Entry<Integer, Float> scoredDoc :
+        sortedByScoreDescending.collect(Collectors.toList())) {
+      combinedResultsDocIds[i] = scoredDoc.getKey();
+      combinedResultScores[i] = scoredDoc.getValue();
+      i++;
+    }
+
+    if (saveRankPositionsForExplain) {
+      docIdToRanks = getRanks(rankedLists,
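The RRF scoring loop in the hunk above can be reproduced as a small self-contained sketch. Plain int arrays stand in for Solr's DocList, and `k = 60` (the conventional value from the Cormack et al. paper) is assumed here in place of `CombinerParams.COMBINER_RRF_K_DEFAULT`:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class RrfSketch {
    // Each doc earns 1/(k + rank) per list it appears in; the fused order
    // is by descending summed score, as in the combineResults loop above.
    static List<Integer> fuse(int k, int[]... rankedLists) {
        Map<Integer, Float> docIdToScore = new HashMap<>();
        for (int[] list : rankedLists) {
            for (int rank = 1; rank <= list.length; rank++) {
                float rrfScore = 1f / (k + rank);
                docIdToScore.merge(list[rank - 1], rrfScore, Float::sum);
            }
        }
        return docIdToScore.entrySet().stream()
                .sorted((a, b) -> Float.compare(b.getValue(), a.getValue()))
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Doc 7 tops both lists and wins; doc 3 appears mid-list in both,
        // so its summed score beats docs 1 and 5, which appear only once.
        System.out.println(fuse(60, new int[] {7, 3, 1}, new int[] {7, 5, 3}));  // prints [7, 3, 5, 1]
    }
}
```

Note that only rank positions matter, never the raw scores from each list, which is why RRF needs no score normalization across heterogeneous queries (e.g. lexical plus kNN).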
Re: [PR] Refactor duplication in UpdateLog.init [solr]
dsmiley commented on PR #2491: URL: https://github.com/apache/solr/pull/2491#issuecomment-2145322947

`TestLBHttp2SolrClient.testTwoServers` seemed to fail because Crave was running slow and hit a somewhat short timeout
Re: [PR] Introduce support for Reciprocal Rank Fusion (combining queries) [solr]
alessandrobenedetti commented on code in PR #2489: URL: https://github.com/apache/solr/pull/2489#discussion_r1624531316

## solr/core/src/java/org/apache/solr/search/combining/ReciprocalRankFusion.java: ##

[quotes the same ReciprocalRankFusion.java hunk shown in the review comment above]
Re: [PR] Upgrade stackbrew, fix gitCommit, remove 32bit images [solr-docker]
gus-asf commented on PR #23: URL: https://github.com/apache/solr-docker/pull/23#issuecomment-2145182055

Removing support for old stuff seems like something that should involve dev/user list discussion...
Re: [PR] SOLR-17099: snitch does not return spurious tags [solr]
psalagnac commented on code in PR #2278: URL: https://github.com/apache/solr/pull/2278#discussion_r1624124928

## solr/CHANGES.txt: ##

@@ -103,6 +103,7 @@ Optimizations
 
 * GITHUB#2217: Scale to 10K+ collections better in ZkStateReader.refreshCollectionsList (David Smiley)
 
+* SOLR-17099: snitch does not return spurious tags (Pierre Salagnac)

Review Comment:
Thanks for your feedback, and sorry for the time it took me to get back to this change. I updated the changelog to something more readable.
Re: [PR] Introduce support for Reciprocal Rank Fusion (combining queries) [solr]
renatoh commented on code in PR #2489: URL: https://github.com/apache/solr/pull/2489#discussion_r1623870400

## solr/core/src/java/org/apache/solr/search/combining/ReciprocalRankFusion.java: ##

[quotes the same ReciprocalRankFusion.java hunk shown in the review comment above]