Re: [PR] Build: gradlew avoid downloader [solr]
dsmiley commented on PR #2419: URL: https://github.com/apache/solr/pull/2419#issuecomment-2116609994 I'll merge this in a couple days if I don't hear back. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
Re: [PR] Minor: Http2SolrClient cleanups [solr]
dsmiley commented on PR #2453: URL: https://github.com/apache/solr/pull/2453#issuecomment-2116609184 I'll merge this in a couple days if I don't hear anything back. CC @jdyer1 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
[jira] [Commented] (SOLR-17297) Classloading issue with plugin and modules
[ https://issues.apache.org/jira/browse/SOLR-17297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17847147#comment-17847147 ] David Smiley commented on SOLR-17297: - Interesting. SolrResourceLoader.addToClassLoader has a clear contract in its javadocs that it MUST only be called prior to using to get any resources. Yet I suspect the underlying class loader here has already loaded stuff, based on your description. bq. Switching setupSharedLib and initModules might work too (haven't tested) My thoughts exactly. Try that :-) FWIW where I work, our Lucene PostingsFormat/Codec level stuff goes into WEB-INF/lib because we ran into issues with it in sharedLib. I'm glad to see you've looked closely at the matter. > Classloading issue with plugin and modules > -- > > Key: SOLR-17297 > URL: https://issues.apache.org/jira/browse/SOLR-17297 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: SolrCloud >Affects Versions: 9.3 >Reporter: Patson Luk >Priority: Critical > > h2. Summary > Using plugin jar and enabling any modules could trigger > {{java.lang.ClassNotFoundException}} > h2. Description > 1. An implementation of {{org.apache.lucene.codecs.PostingsFormat}} with the > jar within the /lib > 2. Enable modules in solr.xml for example name="modules">opentelemetry > 3. Now on startup. As a part of {{NodeConfig#setupSharedLib}}, it would load > all the SPIs, it locates our jar and loads the class with a > {{FactoryURLClassLoader}} with the classpaths point at the jar of the lib, > which is correct > 4. After {{NodeConfig#setupSharedLib}}, {{NodeConfig#initModules}} is > invoked, which eventually calls {{SolrResourceLoader#addURLsToClassLoader}} > that closes the previous class loader, which is the one used in 3. > 5. Now a core is loaded with that codec, it runs the code which > references other classes within our plugin jar, but unfortunately it would > use the Classloader that loads our class in step 3., and such loader is > marked as "closed" hence no longer load the correct resource/class. This > triggers ClassNotFoundException. > I have tried several things, the only thing that seems to work so far is > commenting out {{IOUtils.closeWhileHandlingException(oldLoader);}} in > {{SolrResourceLoader#addURLsToClassLoader}}, which is likely not the right > workaround as the {{closeWhileHandlingException}} should be there for a > reason ;) > Switching {{setupSharedLib}} and {{initModules}} might work too (haven't > tested), but I don't want to try any weird changes since I don't really know > the ordering significance. > Would appreciate some helps from the Solr experts! :) -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
[jira] [Commented] (SOLR-17296) rerank w/scaling (still) broken when using debug to get explain info
[ https://issues.apache.org/jira/browse/SOLR-17296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17847120#comment-17847120 ] Chris M. Hostetter commented on SOLR-17296: --- {quote}I recommend we immediately: ... {quote} I've created a branch/PR with this solution implemented. There is currently just one {{nocommit}} which is a placeholder string for citing the new "SOLR-XXX" jira i proposed above to explore a longer term fix. If there are no objections to the approach I've taken in this branch i'll create that new jira and update the branch before merging. FWIW: I feel like this is a serious enough issue, with a safe enough solution, that I'd like to try and get it in the proposed 9.6.1 -- assuming we can get eyeballs on it ASAP. (I will be offline and unreachable for 2 weeks after May 21) /ping [~jbernste] [~cpoerschke] [~andywebb] > rerank w/scaling (still) broken when using debug to get explain info > > > Key: SOLR-17296 > URL: https://issues.apache.org/jira/browse/SOLR-17296 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) >Affects Versions: 9.4 >Reporter: Chris M. Hostetter >Priority: Major > Attachments: SOLR-17296.test.patch > > Time Spent: 10m > Remaining Estimate: 0h > > The changes made in SOLR-16931 (9.4) attempted to work around problems that > existed when attempting to enable degugging (to get score explanations) in > combination with using {{reRankScale}} ... > {quote}The reason for this is that in order to do proper explain for > minMaxScaling you need to know the min and max score in the result set. This > piece of state is maintained in the ReRankScaler itself which is inside of > the ReRankQuery. But for this information to be populated the query must > first be run. In distributed mode, explain is called in the second pass when > the ids query is run so the state needed for the explain is not populated. ... > {quote} > However, the solution attempted was incomplete and failed to account for > multiple factors... > {quote}... The PR attached to this addresses this problem by doing a single > pass distributed query if debugQuery is turned on and if reRank score scaling > is applied. I'll add a distributed test for this as well. > This change is very limited in scope because the single pass distributed is > only switched on in the very specific case when debugQuery=true and > reRankScaling is on. > {quote} > > * NPEs are still possible... > ** Instead of checking for {{ResponseBuilder.isDebugResults()}} (which is > what triggers explain logic) the new code only checked for specific debug > request param combinations: > *** {{debuQuery=true}} (a legacy option intended only for backcompat) > * > ** > *** {{debug=true}} (intended as an alias for {{debug=all}} > ** It did not check for either of these options, which if used will still > trigger an NPE... > *** {{debug=results}} (which actually dictates the value of > {{ResponseBuilder.isDebugResults()}} > *** {{debug=all}} (a short hand for setting all debug options) > * the attempt to force a single pass query didn't modify the correct variable > ** The new code modified a conditional based on a {{boolean > distribSinglePass}} for setting {{sreq.purpose}} and > {{rb.onePassDistributedQuery}} > ** But it did not modify the value of the {{boolean distribSinglePass}} > itself - meaning other logic that uses that variable in that method still > assumes multiple passes will be used. > ** In particular, these means that even though a single pass is used for > both {{PURPOSE_GET_TOP_IDS}} and {{PURPOSE_GET_FIELDS}} the full {{"fl"}} > requested by the user is not propagated as part of this request > *** Only the uniqueKey and any sot fields are ultimately returned to the > user. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
[PR] SOLR-17296: fix "explain" bugs with rerank scaling + multi-shard [solr]
hossman opened a new pull request, #2465: URL: https://github.com/apache/solr/pull/2465 https://issues.apache.org/jira/browse/SOLR-17296 Implements the fix proposed in jira... > I recommend we immediately: > > * revert the logic added by [SOLR-16931](https://issues.apache.org/jira/browse/SOLR-16931) > * Open a new SOLR-XXX jira for long term consideration of how to redesign debug component to get the multistage info needed for explaining reranked results > * harden ReRankScaler.explain so that if the dat it expects isn't available, it wraps the data it DOES have with a "0.0 == scaled reranking not available (consider using distrib.singlePass - see SOLR-XXX)" ...branch currently has one `nocommit` pertaining to what (new) jira we want to cite and link to (where we can have a longer term discussion & implementation of a fix that doesn't require using `distrib.singlePass`) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
[jira] [Created] (SOLR-17297) Classloading issue with plugin and modules
Patson Luk created SOLR-17297: - Summary: Classloading issue with plugin and modules Key: SOLR-17297 URL: https://issues.apache.org/jira/browse/SOLR-17297 Project: Solr Issue Type: Bug Security Level: Public (Default Security Level. Issues are Public) Components: SolrCloud Affects Versions: 9.3 Reporter: Patson Luk h2. Summary Using plugin jar and enabling any modules could trigger {{java.lang.ClassNotFoundException}} h2. Description 1. An implementation of {{org.apache.lucene.codecs.PostingsFormat}} with the jar within the /lib 2. Enable modules in solr.xml for example opentelemetry 3. Now on startup. As a part of {{NodeConfig#setupSharedLib}}, it would load all the SPIs, it locates our jar and loads the class with a {{FactoryURLClassLoader}} with the classpaths point at the jar of the lib, which is correct 4. After {{NodeConfig#setupSharedLib}}, {{NodeConfig#initModules}} is invoked, which eventually calls {{SolrResourceLoader#addURLsToClassLoader}} that closes the previous class loader, which is the one used in 3. 5. Now a core is loaded with that codec, it runs the code which references other classes within our plugin jar, but unfortunately it would use the Classloader that loads our class in step 3., and such loader is marked as "closed" hence no longer load the correct resource/class. This triggers ClassNotFoundException. I have tried several things, the only thing that seems to work so far is commenting out {{IOUtils.closeWhileHandlingException(oldLoader);}} in {{SolrResourceLoader#addURLsToClassLoader}}, which is likely not the right workaround as the {{closeWhileHandlingException}} should be there for a reason ;) Switching {{setupSharedLib}} and {{initModules}} might work too (haven't tested), but I don't want to try any weird changes since I don't really know the ordering significance. Would appreciate some helps from the Solr experts! :) -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
[jira] [Commented] (SOLR-16866) CachingDirectoryFactory Throwing Error When Closing
[ https://issues.apache.org/jira/browse/SOLR-16866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17847106#comment-17847106 ] Chris M. Hostetter commented on SOLR-16866: --- Also, FWIW: I'm not familiar enough with the code paths involved to know if switching to {{File.separatorChar}} is the right fix (are these Strings filesystem paths, or "conceptual" paths that have been "normalized" ) but given the windows test failures i'm going to assume your latest patch is valid. If no one with a windows box (or better confidence in the code) can validate, i suggest you commit to main and watch the Windows jenkins builds like a hawk, and if it starts passing then merge to 9x / 9.6.1 > CachingDirectoryFactory Throwing Error When Closing > --- > > Key: SOLR-16866 > URL: https://issues.apache.org/jira/browse/SOLR-16866 > Project: Solr > Issue Type: Bug >Affects Versions: 9.2.1 >Reporter: Stefan Pieper >Assignee: Michael Gibney >Priority: Major > Labels: core > Fix For: 9.6 > > Attachments: solr.blueprint_gnylctsgemcz_users.log > > Time Spent: 3h 40m > Remaining Estimate: 0h > > Observed occasional ERROR log entries when {{CachingDirectryFactory#close()}} > is called. The respective snippet from the log is this: > {noformat} > 2023-07-06 08:43:37.618 DEBUG (qtp212011969-1801) [ > blueprint_gnylctsgemcz_users] o.a.s.c.CachingDirectoryFactory Removing > directory after core close: /var/solr/data/blueprint_gnylctsgemcz_users/data > 2023-07-06 08:43:37.620 DEBUG (qtp212011969-1801) [ > blueprint_gnylctsgemcz_users] o.a.s.c.CachingDirectoryFactory Removing > directory after core close: > /var/solr/data/blueprint_gnylctsgemcz_users/data/index > 2023-07-06 08:43:37.620 ERROR (qtp212011969-1801) [ > blueprint_gnylctsgemcz_users] o.a.s.c.CachingDirectoryFactory Error removing > directory => java.nio.file.NoSuchFileException: > /var/solr/data/blueprint_gnylctsgemcz_users/data/index > java.nio.file.NoSuchFileException: > /var/solr/data/blueprint_gnylctsgemcz_users/data/index > {noformat} > It seems like the order of directory removal is causing this (first parent > dir _data_, then sub-dir _data/index_). Entries should be re-arranged or just > the parent dir be handled. > Full log with entries dealing with the respective core is attached. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
[jira] [Updated] (SOLR-17263) HttpJdkSolrClient doesn't encode curly braces etc
[ https://issues.apache.org/jira/browse/SOLR-17263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Webb updated SOLR-17263: - Resolution: Fixed Status: Resolved (was: Patch Available) > HttpJdkSolrClient doesn't encode curly braces etc > - > > Key: SOLR-17263 > URL: https://issues.apache.org/jira/browse/SOLR-17263 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: SolrJ >Affects Versions: 9.6.0 >Reporter: Andy Webb >Priority: Major > Fix For: 9.6.1 > > Time Spent: 2.5h > Remaining Estimate: 0h > > Ref > https://issues.apache.org/jira/browse/SOLR-599?focusedCommentId=17842429#comment-17842429 > - {{HttpJdkSolrClient}} should use {{{}SolrParams{}}}' {{toQueryString()}} > method when constructing URLs to that all URL-unsafe characters are encoded. > It's implicitly using the {{toString()}} method currently which is intended > for logging etc purposes. > Attempting to use alternate query parsers in requests as shown below will > currently fail as the curly braces aren't encoded. > {noformat} > myquery.set("fq", "{!terms f=myfield}value1,value2"); {noformat} > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
[jira] [Commented] (SOLR-17263) HttpJdkSolrClient doesn't encode curly braces etc
[ https://issues.apache.org/jira/browse/SOLR-17263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17847092#comment-17847092 ] Andy Webb commented on SOLR-17263: -- Thanks [~dsmiley]! I'm happy with all the above, would say that's all we need to merge in for 9.6.1 here. https://github.com/apache/solr/pull/2454 could be included too subject to review, but it's just a refactoring so it can wait. > HttpJdkSolrClient doesn't encode curly braces etc > - > > Key: SOLR-17263 > URL: https://issues.apache.org/jira/browse/SOLR-17263 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: SolrJ >Affects Versions: 9.6.0 >Reporter: Andy Webb >Priority: Major > Fix For: 9.6.1 > > Time Spent: 2.5h > Remaining Estimate: 0h > > Ref > https://issues.apache.org/jira/browse/SOLR-599?focusedCommentId=17842429#comment-17842429 > - {{HttpJdkSolrClient}} should use {{{}SolrParams{}}}' {{toQueryString()}} > method when constructing URLs to that all URL-unsafe characters are encoded. > It's implicitly using the {{toString()}} method currently which is intended > for logging etc purposes. > Attempting to use alternate query parsers in requests as shown below will > currently fail as the curly braces aren't encoded. > {noformat} > myquery.set("fq", "{!terms f=myfield}value1,value2"); {noformat} > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
Re: [PR] Refactor preparePutOrPost in HttpJdkSolrClient [solr]
andywebb1975 commented on PR #2454: URL: https://github.com/apache/solr/pull/2454#issuecomment-2116128010 @jdyer1 does this look OK to you? The tests pass but I'm not totally familiar with the use case to know that this is definitely correct. (It's not a bug so I'd be happy to target 9.7.0 with this if we merge it.) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
Re: [PR] SOLR-17263: HttpJdkSolrClient doesn't encode curly braces etc (follow-up PR) [solr]
andywebb1975 closed pull request #2435: SOLR-17263: HttpJdkSolrClient doesn't encode curly braces etc (follow-up PR) URL: https://github.com/apache/solr/pull/2435 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
Re: [PR] SOLR-17263: HttpJdkSolrClient doesn't encode curly braces etc (follow-up PR) [solr]
andywebb1975 commented on PR #2435: URL: https://github.com/apache/solr/pull/2435#issuecomment-2116124328 > I merged the first PR to main, branch_9x, and branch_9_6. Shall this PR be merged too? There are conflicts. Are we all "Watching" the parent JIRA issue? I commented there yesterday. Thanks @dsmiley - no, this one's redundant now. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
[jira] [Commented] (SOLR-16866) CachingDirectoryFactory Throwing Error When Closing
[ https://issues.apache.org/jira/browse/SOLR-16866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17847081#comment-17847081 ] Chris M. Hostetter commented on SOLR-16866: --- if you have to modify code outside of a test to fix this then it definitely needs a new Jira since 9.6 is already released. > CachingDirectoryFactory Throwing Error When Closing > --- > > Key: SOLR-16866 > URL: https://issues.apache.org/jira/browse/SOLR-16866 > Project: Solr > Issue Type: Bug >Affects Versions: 9.2.1 >Reporter: Stefan Pieper >Assignee: Michael Gibney >Priority: Major > Labels: core > Fix For: 9.6 > > Attachments: solr.blueprint_gnylctsgemcz_users.log > > Time Spent: 3h 40m > Remaining Estimate: 0h > > Observed occasional ERROR log entries when {{CachingDirectryFactory#close()}} > is called. The respective snippet from the log is this: > {noformat} > 2023-07-06 08:43:37.618 DEBUG (qtp212011969-1801) [ > blueprint_gnylctsgemcz_users] o.a.s.c.CachingDirectoryFactory Removing > directory after core close: /var/solr/data/blueprint_gnylctsgemcz_users/data > 2023-07-06 08:43:37.620 DEBUG (qtp212011969-1801) [ > blueprint_gnylctsgemcz_users] o.a.s.c.CachingDirectoryFactory Removing > directory after core close: > /var/solr/data/blueprint_gnylctsgemcz_users/data/index > 2023-07-06 08:43:37.620 ERROR (qtp212011969-1801) [ > blueprint_gnylctsgemcz_users] o.a.s.c.CachingDirectoryFactory Error removing > directory => java.nio.file.NoSuchFileException: > /var/solr/data/blueprint_gnylctsgemcz_users/data/index > java.nio.file.NoSuchFileException: > /var/solr/data/blueprint_gnylctsgemcz_users/data/index > {noformat} > It seems like the order of directory removal is causing this (first parent > dir _data_, then sub-dir _data/index_). Entries should be re-arranged or just > the parent dir be handled. > Full log with entries dealing with the respective core is attached. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
[jira] [Updated] (SOLR-17296) rerank w/scaling (still) broken when using debug to get explain info
[ https://issues.apache.org/jira/browse/SOLR-17296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris M. Hostetter updated SOLR-17296: -- Description: The changes made in SOLR-16931 (9.4) attempted to work around problems that existed when attempting to enable degugging (to get score explanations) in combination with using {{reRankScale}} ... {quote}The reason for this is that in order to do proper explain for minMaxScaling you need to know the min and max score in the result set. This piece of state is maintained in the ReRankScaler itself which is inside of the ReRankQuery. But for this information to be populated the query must first be run. In distributed mode, explain is called in the second pass when the ids query is run so the state needed for the explain is not populated. ... {quote} However, the solution attempted was incomplete and failed to account for multiple factors... {quote}... The PR attached to this addresses this problem by doing a single pass distributed query if debugQuery is turned on and if reRank score scaling is applied. I'll add a distributed test for this as well. This change is very limited in scope because the single pass distributed is only switched on in the very specific case when debugQuery=true and reRankScaling is on. {quote} * NPEs are still possible... ** Instead of checking for {{ResponseBuilder.isDebugResults()}} (which is what triggers explain logic) the new code only checked for specific debug request param combinations: *** {{debuQuery=true}} (a legacy option intended only for backcompat) * ** *** {{debug=true}} (intended as an alias for {{debug=all}} ** It did not check for either of these options, which if used will still trigger an NPE... *** {{debug=results}} (which actually dictates the value of {{ResponseBuilder.isDebugResults()}} *** {{debug=all}} (a short hand for setting all debug options) * the attempt to force a single pass query didn't modify the correct variable ** The new code modified a conditional based on a {{boolean distribSinglePass}} for setting {{sreq.purpose}} and {{rb.onePassDistributedQuery}} ** But it did not modify the value of the {{boolean distribSinglePass}} itself - meaning other logic that uses that variable in that method still assumes multiple passes will be used. ** In particular, these means that even though a single pass is used for both {{PURPOSE_GET_TOP_IDS}} and {{PURPOSE_GET_FIELDS}} the full {{"fl"}} requested by the user is not propagated as part of this request *** Only the uniqueKey and any sot fields are ultimately returned to the user. was: The changes made in SOLR-16931 (9.4) attempted to work around problems that existed when attempting to enable degugging (to get score explanations) in combination with using {{reRankScale}} ... {quote}The reason for this is that in order to do proper explain for minMaxScaling you need to know the min and max score in the result set. This piece of state is maintained in the ReRankScaler itself which is inside of the ReRankQuery. But for this information to be populated the query must first be run. In distributed mode, explain is called in the second pass when the ids query is run so the state needed for the explain is not populated. ... {quote} However, the solution attempted was incomplete and failed to account for multiple factors... {quote}... The PR attached to this addresses this problem by doing a single pass distributed query if debugQuery is turned on and if reRank score scaling is applied. I'll add a distributed test for this as well. This change is very limited in scope because the single pass distributed is only switched on in the very specific case when debugQuery=true and reRankScaling is on. {quote} * NPEs are still possible... ** Instead of checking for {{ResponseBuilder.isDebugResults()}} (which is what triggers explain logic) the new code only checked for specific debug request param combinations: * ** *** {{debuQuery=true}} (a legacy option intended only for backcompat) *** {{debug=true}} (intended as an alias for {{debug=all}} ** It did not check for either of these options, which if used will still trigger an NPE... *** {{debug=results}} (which actually dictates the value of {{ResponseBuilder.isDebugResults()}} *** {{debug=all}} (a short hand for setting all debug options) * the attempt to force a single pass query didn't modify the correct variable ** The new code modified a conditional based on a {{boolean distribSinglePass}} for setting {{sreq.purpose}} and {{rb.onePassDistributedQuery}} ** But it did not modify the value of the {{boolean distribSinglePass}} itself - meaning other logic that uses that variable in that method still assumes multiple passes will be used. ** In particular, these means that even though a single pass is used for both {{PURPOSE_GET_TOP_IDS}} and {{PURPOSE_GET_FIELDS}} the full
[jira] [Commented] (SOLR-17296) rerank w/scaling (still) broken when using debug to get explain info
[ https://issues.apache.org/jira/browse/SOLR-17296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17847076#comment-17847076 ] Chris M. Hostetter commented on SOLR-17296: --- I would argue that the "old" problem of NPE is the lesser of two evils compared to the "new" problem of silently failing to return fields – i'd much rather people who try to enable debugging get an error if it doesn't work then maybe not noticed that their response is missing fields they expected (when it's {{fl=*}} and you have a lot of stored fields it's really obvious something is wrong – if you use {{fl=id,special_rare_field}} you might not realize that {{special_rare_field}} is missing from docs that should actually have it. I also think that, in general, we should not *CHANGE* the core logic of how requests are processed (in this case to use single pass) because a debugging feature is enabled. If getting accurate debugging of reranked results requires using a single pass, let's *_inform_* our users of that, and let them *c{_}hoose{_}* to do it. I recommend we immediately: * revert the logic added by SOLR-16931 * Open a new SOLR-XXX jira for long term consideration of how to redesign debug component to get the multistage info needed for explaining reranked results * harden {{ReRankScaler.explain}} so that if the dat it expects isn't available, it wraps the data it *DOES* have with a {{"0.0 == scaled reranking not available (consider using distrib.singlePass - see SOLR-XXX)"}} > rerank w/scaling (still) broken when using debug to get explain info > > > Key: SOLR-17296 > URL: https://issues.apache.org/jira/browse/SOLR-17296 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) >Affects Versions: 9.4 >Reporter: Chris M. Hostetter >Priority: Major > Attachments: SOLR-17296.test.patch > > > The changes made in SOLR-16931 (9.4) attempted to work around problems that > existed when attempting to enable degugging (to get score explanations) in > combination with using {{reRankScale}} ... > {quote}The reason for this is that in order to do proper explain for > minMaxScaling you need to know the min and max score in the result set. This > piece of state is maintained in the ReRankScaler itself which is inside of > the ReRankQuery. But for this information to be populated the query must > first be run. In distributed mode, explain is called in the second pass when > the ids query is run so the state needed for the explain is not populated. ... > {quote} > However, the solution attempted was incomplete and failed to account for > multiple factors... > {quote}... The PR attached to this addresses this problem by doing a single > pass distributed query if debugQuery is turned on and if reRank score scaling > is applied. I'll add a distributed test for this as well. > This change is very limited in scope because the single pass distributed is > only switched on in the very specific case when debugQuery=true and > reRankScaling is on. > {quote} * NPEs are still possible... > ** Instead of checking for {{ResponseBuilder.isDebugResults()}} (which is > what triggers explain logic) the new code only checked for specific debug > request param combinations: > * > ** > *** {{debuQuery=true}} (a legacy option intended only for backcompat) > *** {{debug=true}} (intended as an alias for {{debug=all}} > ** It did not check for either of these options, which if used will still > trigger an NPE... > *** {{debug=results}} (which actually dictates the value of > {{ResponseBuilder.isDebugResults()}} > *** {{debug=all}} (a short hand for setting all debug options) > * the attempt to force a single pass query didn't modify the correct variable > ** The new code modified a conditional based on a {{boolean > distribSinglePass}} for setting {{sreq.purpose}} and > {{rb.onePassDistributedQuery}} > ** But it did not modify the value of the {{boolean distribSinglePass}} > itself - meaning other logic that uses that variable in that method still > assumes multiple passes will be used. > ** In particular, these means that even though a single pass is used for > both {{PURPOSE_GET_TOP_IDS}} and {{PURPOSE_GET_FIELDS}} the full {{"fl"}} > requested by the user is not propagated as part of this request > *** Only the uniqueKey and any sot fields are ultimately returned to the > user. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
[jira] [Updated] (SOLR-17296) rerank w/scaling (still) broken when using debug to get explain info
[ https://issues.apache.org/jira/browse/SOLR-17296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris M. Hostetter updated SOLR-17296: -- Attachment: SOLR-17296.test.patch Status: Open (was: Open) Both of these problems are trivial to reproduce using the cloud example Data setup and baseline rerank request (no debugging) {noformat} $ ./solr/packaging/build/dev/bin/solr start -e cloud -noprompt ... $ curl -X POST -H 'Content-Type: application/csv' 'http://localhost:8983/solr/gettingstarted/update?commit=true' --data-binary @solr/example/exampledocs/books.csv ... $ curl http://localhost:8983/solr/gettingstarted/select --form-string 'q=*:*' --form-string 'rows=1' --form-string 'rq={!rerank reRankQuery=$rr reRankScale=0-1 reRankWeight=0.3 reRankDocs=1000 reRankOperator=add}' --form-string 'rr=inStock=true' { "responseHeader":{ "zkConnected":true, "status":0, "QTime":26, "params":{ "rr":"inStock=true", "q":"*:*", "rows":"1", "rq":"{!rerank reRankQuery=$rr reRankScale=0-1 reRankWeight=0.3 reRankDocs=1000 reRankOperator=add}" } }, "response":{ "numFound":10, "start":0, "maxScore":1.0, "numFoundExact":true, "docs":[{ "id":"0812521390", "cat":["book"], "name":["The Black Company"], "price":[6.99], "inStock":[false], "author":["Glen Cook"], "series_t":"The Chronicles of The Black Company", "sequence_i":1, "genre_s":"fantasy", "_version_":1799231031515021312 }] } } {noformat} Try using {{debug=true}} and now none of the stored fields are returned... {noformat} $ curl http://localhost:8983/solr/gettingstarted/select --form-string 'q=*:*' --form-string 'rows=1' --form-string 'rq={!rerank reRankQuery=$rr reRankScale=0-1 reRankWeight=0.3 reRankDocs=1000 reRankOperator=add}' --form-string 'rr=inStock=true' --form-string 'debug=true' { "responseHeader":{ "zkConnected":true, "status":0, "QTime":13, "params":{ "rr":"inStock=true", "q":"*:*", "debug":"true", "rows":"1", "rq":"{!rerank reRankQuery=$rr reRankScale=0-1 reRankWeight=0.3 reRankDocs=1000 reRankOperator=add}" } }, "response":{ "numFound":10, "start":0, "maxScore":1.0, "numFoundExact":true, "docs":[{ "id":"0812521390" }] }, "debug":{ "track":{ ... {noformat} Use {{debug=results}} (or {{debug=all}} to trigger NPE... {noformat} $ curl http://localhost:8983/solr/gettingstarted/select --form-string 'q=*:*' --form-string 'rows=1' --form-string 'rq={!rerank reRankQuery=$rr reRankScale=0-1 reRankWeight=0.3 reRankDocs=1000 reRankOperator=add}' --form-string 'rr=inStock=true' --form-string 'debug=results' { "responseHeader":{ "zkConnected":true, "status":500, "QTime":37, "params":{ "rr":"inStock=true", "q":"*:*", "debug":"results", "rows":"1", "rq":"{!rerank reRankQuery=$rr reRankScale=0-1 reRankWeight=0.3 reRankDocs=1000 reRankOperator=add}" } }, "error":{ "metadata":["error-class","org.apache.solr.client.solrj.impl.BaseHttpSolrClient$RemoteSolrException","root-error-class","org.apache.solr.client.solrj.impl.BaseHttpSolrClient$RemoteSolrException"], "msg":"Error from server at http://localhost:7574/solr/gettingstarted_shard1_replica_n2/select: java.lang.NullPointerException\n\tat org.apache.solr.search.ReRankScaler.explain(ReRankScaler.java:348)... {noformat} I'm attaching a very basic test patch that demonstrates both of these problems. > rerank w/scaling (still) broken when using debug to get explain info > > > Key: SOLR-17296 > URL: https://issues.apache.org/jira/browse/SOLR-17296 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) >Affects Versions: 9.4 >Reporter: Chris M. Hostetter >Priority: Major > Attachments: SOLR-17296.test.patch > > > The changes made in SOLR-16931 (9.4) attempted to work around problems that > existed when attempting to enable degugging (to get score explanations) in > combination with using {{reRankScale}} ... > {quote}The reason for this is that in order to do proper explain for > minMaxScaling you need to know the min and max score in the result set. This > piece of state is maintained in the ReRankScaler itself which is inside of > the ReRankQuery. But for this information to be populated the query must > first be run. In distributed mode, explain is called in the second pass when > the ids query is run so the state needed for the explain is not populated. ... > {quote} > However, the solution attempted was incomplete and failed to account for > multiple factors... > {quote}... The PR attached to this addresses this
[jira] [Updated] (SOLR-17296) rerank w/scaling (still) broken when using debug to get explain info
[ https://issues.apache.org/jira/browse/SOLR-17296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris M. Hostetter updated SOLR-17296: -- Description: The changes made in SOLR-16931 (9.4) attempted to work around problems that existed when attempting to enable degugging (to get score explanations) in combination with using {{reRankScale}} ... {quote}The reason for this is that in order to do proper explain for minMaxScaling you need to know the min and max score in the result set. This piece of state is maintained in the ReRankScaler itself which is inside of the ReRankQuery. But for this information to be populated the query must first be run. In distributed mode, explain is called in the second pass when the ids query is run so the state needed for the explain is not populated. ... {quote} However, the solution attempted was incomplete and failed to account for multiple factors... {quote}... The PR attached to this addresses this problem by doing a single pass distributed query if debugQuery is turned on and if reRank score scaling is applied. I'll add a distributed test for this as well. This change is very limited in scope because the single pass distributed is only switched on in the very specific case when debugQuery=true and reRankScaling is on. {quote} * NPEs are still possible... ** Instead of checking for {{ResponseBuilder.isDebugResults()}} (which is what triggers explain logic) the new code only checked for specific debug request param combinations: * ** *** {{debuQuery=true}} (a legacy option intended only for backcompat) *** {{debug=true}} (intended as an alias for {{debug=all}} ** It did not check for either of these options, which if used will still trigger an NPE... *** {{debug=results}} (which actually dictates the value of {{ResponseBuilder.isDebugResults()}} *** {{debug=all}} (a short hand for setting all debug options) * the attempt to force a single pass query didn't modify the correct variable ** The new code modified a conditional based on a {{boolean distribSinglePass}} for setting {{sreq.purpose}} and {{rb.onePassDistributedQuery}} ** But it did not modify the value of the {{boolean distribSinglePass}} itself - meaning other logic that uses that variable in that method still assumes multiple passes will be used. ** In particular, these means that even though a single pass is used for both {{PURPOSE_GET_TOP_IDS}} and {{PURPOSE_GET_FIELDS}} the full {{"fl"}} requested by the user is not propagated as part of this request *** Only the uniqueKey and any sot fields are ultimately returned to the user. was: The changes made in SOLR-16931 (9.4) attempted to work around problems that existed when attempting to enable degugging (to get score explanations) in combination with using {{reRankScale}} ... {quote}The reason for this is that in order to do proper explain for minMaxScaling you need to know the min and max score in the result set. This piece of state is maintained in the ReRankScaler itself which is inside of the ReRankQuery. But for this information to be populated the query must first be run. In distributed mode, explain is called in the second pass when the ids query is run so the state needed for the explain is not populated. ... {quote} However, the solution attempted was incomplete and failed to account for multiple factors... {quote}... The PR attached to this addresses this problem by doing a single pass distributed query if debugQuery is turned on and if reRank score scaling is applied. I'll add a distributed test for this as well. This change is very limited in scope because the single pass distributed is only switched on in the very specific case when debugQuery=true and reRankScaling is on. {quote} * NPEs are still possible... ** Instead of checking for {{ResponseBuilder.isDebugResults()}} (which is what triggers explain logic) the new code only checked for specific debug request param combinations: *** {{debuQuery=true}} (a legacy option intended only for backcompat) *** {{debug=true}} (intended as an alias for {{debug=all}} ** It did not check for either of these options, which if used will still trigger an NPE... *** {{debug=results}} (which actually dictates the value of {{ResponseBuilder.isDebugResults()}} *** {{debug=all}} (a short hand for setting all debug options) * the attempt to force a single pass query didn't modify the correct variable ** The new code modified a conditional based on a {{boolean distribSinglePass}} for setting {{sreq.purpose}} and {{rb.onePassDistributedQuery}} ** But it did not modify the value of the {{boolean distribSinglePass}} itself - meaning other logic that uses that variable in that method still assumes multiple passes will be used. ** In particular, these means that even though a single pass is used for both {{PURPOSE_GET_TOP_IDS}} and {{PURPOSE_GET_FIELDS}} the full {{"fl"}}
[jira] [Created] (SOLR-17296) rerank w/scaling (still) broken when using debug to get explain info
Chris M. Hostetter created SOLR-17296: - Summary: rerank w/scaling (still) broken when using debug to get explain info Key: SOLR-17296 URL: https://issues.apache.org/jira/browse/SOLR-17296 Project: Solr Issue Type: Bug Security Level: Public (Default Security Level. Issues are Public) Affects Versions: 9.4 Reporter: Chris M. Hostetter The changes made in SOLR-16931 (9.4) attempted to work around problems that existed when attempting to enable degugging (to get score explanations) in combination with using {{reRankScale}} ... {quote}The reason for this is that in order to do proper explain for minMaxScaling you need to know the min and max score in the result set. This piece of state is maintained in the ReRankScaler itself which is inside of the ReRankQuery. But for this information to be populated the query must first be run. In distributed mode, explain is called in the second pass when the ids query is run so the state needed for the explain is not populated. ... {quote} However, the solution attempted was incomplete and failed to account for multiple factors... {quote}... The PR attached to this addresses this problem by doing a single pass distributed query if debugQuery is turned on and if reRank score scaling is applied. I'll add a distributed test for this as well. This change is very limited in scope because the single pass distributed is only switched on in the very specific case when debugQuery=true and reRankScaling is on. {quote} * NPEs are still possible... ** Instead of checking for {{ResponseBuilder.isDebugResults()}} (which is what triggers explain logic) the new code only checked for specific debug request param combinations: *** {{debuQuery=true}} (a legacy option intended only for backcompat) *** {{debug=true}} (intended as an alias for {{debug=all}} ** It did not check for either of these options, which if used will still trigger an NPE... *** {{debug=results}} (which actually dictates the value of {{ResponseBuilder.isDebugResults()}} *** {{debug=all}} (a short hand for setting all debug options) * the attempt to force a single pass query didn't modify the correct variable ** The new code modified a conditional based on a {{boolean distribSinglePass}} for setting {{sreq.purpose}} and {{rb.onePassDistributedQuery}} ** But it did not modify the value of the {{boolean distribSinglePass}} itself - meaning other logic that uses that variable in that method still assumes multiple passes will be used. ** In particular, these means that even though a single pass is used for both {{PURPOSE_GET_TOP_IDS}} and {{PURPOSE_GET_FIELDS}} the full {{"fl"}} requested by the user is not propagated as part of this request *** Only the uniqueKey and any sot fields are ultimately returned to the user. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
Re: [PR] SOLR-4587: integrate lucene-monitor into solr [solr]
kotman12 commented on code in PR #2382: URL: https://github.com/apache/solr/pull/2382#discussion_r1603762136 ## solr/modules/monitor/src/java/org/apache/lucene/monitor/MonitorFields.java: ## @@ -0,0 +1,38 @@ +/* + * + * * Licensed to the Apache Software Foundation (ASF) under one or more + * * contributor license agreements. See the NOTICE file distributed with + * * this work for additional information regarding copyright ownership. + * * The ASF licenses this file to You under the Apache License, Version 2.0 + * * (the "License"); you may not use this file except in compliance with + * * the License. You may obtain a copy of the License at + * * + * * http://www.apache.org/licenses/LICENSE-2.0 + * * + * * Unless required by applicable law or agreed to in writing, software + * * distributed under the License is distributed on an "AS IS" BASIS, + * * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * * See the License for the specific language governing permissions and + * * limitations under the License. + * + */ + +package org.apache.lucene.monitor; + +import java.util.Set; + +public class MonitorFields { + + public static final String QUERY_ID = QueryIndex.FIELDS.query_id + "_"; + public static final String CACHE_ID = QueryIndex.FIELDS.cache_id + "_"; + public static final String MONITOR_QUERY = QueryIndex.FIELDS.mq + "_"; + public static final String PAYLOAD = QueryIndex.FIELDS.mq + "_payload_"; Review Comment: > So I was exploring not specifically w.r.t. making the payload field name overridable but the other fields (query id, cache id, query) also and payload was just first to explore. Yes, optionally-choosable names would let users choose names/terminology to suit their use case, and perhaps might it also then reduce or remove the need for the aliasing functionality (which I haven't yet looked into further)? Ok I am on board with the configuration then. I don't necessarily think it can obviate aliasing because aliasing only applies to fields dynamically generated by the presearcher. These don't exist in `MonitorFields`. For instance, say you have a query `foo:bar` and you naturally have some schema definition that `foo` conforms to in your collection of _regular_ documents. You now want to create a collection of queries for reverse search and we want to make that as easy as possible. So you use should be able to use the same schema (with the same query analyzers etc) with the addition of a few control fields. One of those is `__monitor_alias_*` so that when the presearcher is done tokenizing that query `foo:bar` it actually sends the token stream to the field `__monitor_alias_foo`. So if your original `foo` field had all sorts of fancy configurations like `docValues` or `stored`, the presearcher generated field won't know about any of that and will instead give you a plain old indexed field. Now because of what appears to be an implementation choice, solr will actually let you write a field with a configuration that is different from the schema because the only validation is against the first _written_ field (to make sure that there is compatibility between the first document's field value and all the documents that come after). But relying on this feels hacky. Also, the schema discrepancy mentioned above would be a total blocker if you wanted to store queries alongside your documents in the same collection (because the first written document would be a "true" document with whatever fancy configuration you had defined in the schema). Finally, the multi-pass presearcher sometimes changes field names on the fly which further motivated the dynamic aliasing approach. Now if there was ever a presearcher that gave you back a different _type_ of field then we'd probably have to define another dynamic alias field .. but that doesn't seem like something that will happen very often. > https://github.com/apache/lucene/blob/releases/lucene/9.10.0/lucene/monitor/src/java/org/apache/lucene/monitor/QueryIndex.java#L46-L48 -- and suffixed an _ underscore, is the suffixing needed or does it just make maybe debugging or so easier? I kind of just assumed based on `_root_` and `_version_` that `_` wrapped fields are considered out of the user space, at least by convention, so I wanted to be consistent. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
Re: [PR] SOLR-4587: integrate lucene-monitor into solr [solr]
kotman12 commented on code in PR #2382: URL: https://github.com/apache/solr/pull/2382#discussion_r1603762136 ## solr/modules/monitor/src/java/org/apache/lucene/monitor/MonitorFields.java: ## @@ -0,0 +1,38 @@ +/* + * + * * Licensed to the Apache Software Foundation (ASF) under one or more + * * contributor license agreements. See the NOTICE file distributed with + * * this work for additional information regarding copyright ownership. + * * The ASF licenses this file to You under the Apache License, Version 2.0 + * * (the "License"); you may not use this file except in compliance with + * * the License. You may obtain a copy of the License at + * * + * * http://www.apache.org/licenses/LICENSE-2.0 + * * + * * Unless required by applicable law or agreed to in writing, software + * * distributed under the License is distributed on an "AS IS" BASIS, + * * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * * See the License for the specific language governing permissions and + * * limitations under the License. + * + */ + +package org.apache.lucene.monitor; + +import java.util.Set; + +public class MonitorFields { + + public static final String QUERY_ID = QueryIndex.FIELDS.query_id + "_"; + public static final String CACHE_ID = QueryIndex.FIELDS.cache_id + "_"; + public static final String MONITOR_QUERY = QueryIndex.FIELDS.mq + "_"; + public static final String PAYLOAD = QueryIndex.FIELDS.mq + "_payload_"; Review Comment: > So I was exploring not specifically w.r.t. making the payload field name overridable but the other fields (query id, cache id, query) also and payload was just first to explore. Yes, optionally-choosable names would let users choose names/terminology to suit their use case, and perhaps might it also then reduce or remove the need for the aliasing functionality (which I haven't yet looked into further)? Ok I am on board with the configuration then. I don't necessarily think it can obviate aliasing because aliasing only applies to fields dynamically generated by the presearcher. These don't exist in `MonitorFields`. For instance, say you have a query `foo:bar` and you naturally have some schema definition that `foo` conforms to in your collection of _regular_ documents. You now want to create a collection of queries for reverse search and we want to make that as easy as possible. So you use should be able to use the same schema (with the same query analyzers etc) with the addition of a few control fields. One of those is `__monitor_alias_*` so that when the presearcher is done tokenizing that query `foo:bar` it actually sends the token stream to the field `__monitor_alias_foo`. So if your original `foo` field had all sorts of fancy configurations like `docValues` or `stored`, the presearcher generated field won't know about any of that and will instead give you a plain old indexed field. Now because of what appears to be an implementation choice, solr will actually let you write a field with a configuration that is different from the schema because the only validation is against the first _written_ field (to make sure that there is compatibility between the first document's field value and all the documents that come after). But relying on this feels hacky. Also, the schema discrepancy mentioned above would be a total blocker if you wanted to store queries alongside your documents in the same collection (because the first written document would be a "true" document with whatever fancy configuration you had defined in the schema). Finally, the multi-pass presearcher does things to field names which further motivated the dynamic aliasing approach. Now if there was ever a presearcher that gave you back a different _type_ of field then we'd probably have to define another dynamic alias field .. but that doesn't seem like something that will happen very often. > https://github.com/apache/lucene/blob/releases/lucene/9.10.0/lucene/monitor/src/java/org/apache/lucene/monitor/QueryIndex.java#L46-L48 -- and suffixed an _ underscore, is the suffixing needed or does it just make maybe debugging or so easier? I kind of just assumed based on `_root_` and `_version_` that `_` wrapped fields are considered out of the user space, at least by convention, so I wanted to be consistent. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
Re: [PR] SOLR-4587: integrate lucene-monitor into solr [solr]
kotman12 commented on code in PR #2382: URL: https://github.com/apache/solr/pull/2382#discussion_r1603762136 ## solr/modules/monitor/src/java/org/apache/lucene/monitor/MonitorFields.java: ## @@ -0,0 +1,38 @@ +/* + * + * * Licensed to the Apache Software Foundation (ASF) under one or more + * * contributor license agreements. See the NOTICE file distributed with + * * this work for additional information regarding copyright ownership. + * * The ASF licenses this file to You under the Apache License, Version 2.0 + * * (the "License"); you may not use this file except in compliance with + * * the License. You may obtain a copy of the License at + * * + * * http://www.apache.org/licenses/LICENSE-2.0 + * * + * * Unless required by applicable law or agreed to in writing, software + * * distributed under the License is distributed on an "AS IS" BASIS, + * * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * * See the License for the specific language governing permissions and + * * limitations under the License. + * + */ + +package org.apache.lucene.monitor; + +import java.util.Set; + +public class MonitorFields { + + public static final String QUERY_ID = QueryIndex.FIELDS.query_id + "_"; + public static final String CACHE_ID = QueryIndex.FIELDS.cache_id + "_"; + public static final String MONITOR_QUERY = QueryIndex.FIELDS.mq + "_"; + public static final String PAYLOAD = QueryIndex.FIELDS.mq + "_payload_"; Review Comment: > So I was exploring not specifically w.r.t. making the payload field name overridable but the other fields (query id, cache id, query) also and payload was just first to explore. Yes, optionally-choosable names would let users choose names/terminology to suit their use case, and perhaps might it also then reduce or remove the need for the aliasing functionality (which I haven't yet looked into further)? Ok I am on board with the configuration then. I don't necessarily think it can obviate aliasing because aliasing only applies to fields dynamically generated by the presearcher. These don't exist in `MonitorFields`. For instance, say you have a query `foo:bar` and you naturally have some schema definition that `foo` conforms to in your collection of _regular_ documents. You now want to create a collection of queries for reverse search and we want to make that as easy as possible. So you use should be able to use the same schema (with the same query analyzers etc) with the addition of a few control fields. One of those is `__monitor_alias_*` so that when the presearcher is done tokenizing that query `foo:bar` it actually sends the token stream to the field `__monitor_alias_foo`. So if your original `foo` field had all sorts of fancy configurations like `docValues` or `stored`, the presearcher generated field won't know about any of that and will instead give you a plain old indexed field. Now because of what appears to be an implementation choice, solr will actually let you write a field with a configuration that is different from the schema because the only validation is against the first _written_ field (to make sure that there is compatibility between the first document's field value and all the documents that come after). But relying on this feels hacky. Also, the schema discrepancy mentioned above would be a total blocker if you wanted to store queries alongside your documents in the same collection (because the first written document would be a "true" document with whatever fancy configuration you had defined in the schema). Finally, the multi-pass presearcher does things to field names which further motivated the dynamic aliasing approach. Now if there was ever a presearcher that gave you back a different _type_ of field then we'd probably have to define another dynamic alias field .. but that doesn't seem like something that will happen very often. > https://github.com/apache/lucene/blob/releases/lucene/9.10.0/lucene/monitor/src/java/org/apache/lucene/monitor/QueryIndex.java#L46-L48 -- and suffixed an _ underscore, is the suffixing needed or does it just make maybe debugging or so easier? I kind of just assumed based on `_root_` and `_version_` that `_` wrapped fields are considered out of the user space, at least by convention, so I wanted to be consistent. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
Re: [PR] SOLR-4587: integrate lucene-monitor into solr [solr]
kotman12 commented on code in PR #2382: URL: https://github.com/apache/solr/pull/2382#discussion_r1603762136 ## solr/modules/monitor/src/java/org/apache/lucene/monitor/MonitorFields.java: ## @@ -0,0 +1,38 @@ +/* + * + * * Licensed to the Apache Software Foundation (ASF) under one or more + * * contributor license agreements. See the NOTICE file distributed with + * * this work for additional information regarding copyright ownership. + * * The ASF licenses this file to You under the Apache License, Version 2.0 + * * (the "License"); you may not use this file except in compliance with + * * the License. You may obtain a copy of the License at + * * + * * http://www.apache.org/licenses/LICENSE-2.0 + * * + * * Unless required by applicable law or agreed to in writing, software + * * distributed under the License is distributed on an "AS IS" BASIS, + * * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * * See the License for the specific language governing permissions and + * * limitations under the License. + * + */ + +package org.apache.lucene.monitor; + +import java.util.Set; + +public class MonitorFields { + + public static final String QUERY_ID = QueryIndex.FIELDS.query_id + "_"; + public static final String CACHE_ID = QueryIndex.FIELDS.cache_id + "_"; + public static final String MONITOR_QUERY = QueryIndex.FIELDS.mq + "_"; + public static final String PAYLOAD = QueryIndex.FIELDS.mq + "_payload_"; Review Comment: > So I was exploring not specifically w.r.t. making the payload field name overridable but the other fields (query id, cache id, query) also and payload was just first to explore. Yes, optionally-choosable names would let users choose names/terminology to suit their use case, and perhaps might it also then reduce or remove the need for the aliasing functionality (which I haven't yet looked into further)? Ok I am on board with the configuration then. I don't necessarily think it can obviate aliasing because aliasing only applies to fields dynamically generated by the presearcher. These don't exist in `MonitorFields`. For instance, say you have a query `foo:bar` and you naturally have some schema definition that `foo` conforms to in your collection of _regular_ documents. You now want to create a collection of queries for reverse search and we want to make that as easy as possible. So you use should be able to use the same schema (with the same query analyzers etc) with the addition of a few control fields. One of those is `__monitor_alias_*` so that when the presearcher is done tokenizing that query `foo:bar` it actually sends the token stream to the field `__monitor_alias_foo`. So if your original `foo` field had all sorts of fancy configurations like `docValues` or `stored`, the presearcher generated field won't know about any of that and will instead give you a plain old indexed field. Now because of what appears to be an implementation choice, solr will actually let you write a field with a configuration that is different from the schema because the only validation is against the first _written_ field (to make sure that there is compatibility between the first document's field value and all the documents that come after). But relying on this feels hacky. Also, the schema discrepancy mentioned above would be a total blocker if you wanted to store queries alongside your documents in the same collection (because the first written document would be a "true" document with whatever fancy configuration you had defined in the schema). Finally, the multi-pass presearcher does things to field names which partly motivated the dynamic aliasing approach as well. Now if there was ever a presearcher that gave you back a different _type_ of field then we'd probably have to define another dynamic alias field .. but that doesn't seem like something that will happen very often. > https://github.com/apache/lucene/blob/releases/lucene/9.10.0/lucene/monitor/src/java/org/apache/lucene/monitor/QueryIndex.java#L46-L48 -- and suffixed an _ underscore, is the suffixing needed or does it just make maybe debugging or so easier? I kind of just assumed based on `_root_` and `_version_` that `_` wrapped fields are considered out of the user space, at least by convention, so I wanted to be consistent. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
Re: [PR] SOLR-4587: integrate lucene-monitor into solr [solr]
kotman12 commented on code in PR #2382: URL: https://github.com/apache/solr/pull/2382#discussion_r1603762136 ## solr/modules/monitor/src/java/org/apache/lucene/monitor/MonitorFields.java: ## @@ -0,0 +1,38 @@ +/* + * + * * Licensed to the Apache Software Foundation (ASF) under one or more + * * contributor license agreements. See the NOTICE file distributed with + * * this work for additional information regarding copyright ownership. + * * The ASF licenses this file to You under the Apache License, Version 2.0 + * * (the "License"); you may not use this file except in compliance with + * * the License. You may obtain a copy of the License at + * * + * * http://www.apache.org/licenses/LICENSE-2.0 + * * + * * Unless required by applicable law or agreed to in writing, software + * * distributed under the License is distributed on an "AS IS" BASIS, + * * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * * See the License for the specific language governing permissions and + * * limitations under the License. + * + */ + +package org.apache.lucene.monitor; + +import java.util.Set; + +public class MonitorFields { + + public static final String QUERY_ID = QueryIndex.FIELDS.query_id + "_"; + public static final String CACHE_ID = QueryIndex.FIELDS.cache_id + "_"; + public static final String MONITOR_QUERY = QueryIndex.FIELDS.mq + "_"; + public static final String PAYLOAD = QueryIndex.FIELDS.mq + "_payload_"; Review Comment: > So I was exploring not specifically w.r.t. making the payload field name overridable but the other fields (query id, cache id, query) also and payload was just first to explore. Yes, optionally-choosable names would let users choose names/terminology to suit their use case, and perhaps might it also then reduce or remove the need for the aliasing functionality (which I haven't yet looked into further)? Ok I am on board with the configuration then. I don't necessarily think it can obviate aliasing because aliasing only applies to fields dynamically generated by the presearcher. These don't exist in `MonitorFields`. For instance, say you have a query `foo:bar` and you naturally have some schema definition that `foo` conforms to in your collection of _regular_ documents. You now want to create a collection of queries for reverse search and we want to make that as easy as possible. So you use should be able to use the same schema (with the same query analyzers etc) with the addition of a few control fields. One of those is `__monitor_alias_*` so that when the presearcher is done tokenizing that query `foo:bar` it actually sends the token stream to the field `__monitor_alias_foo`. So if your original `foo` field had all sorts of fancy configurations like `docValues` or `stored`, the presearcher generated field won't know about any of that and will instead give you a plain old indexed field. Now because of what appears to be an implementation choice, solr will actually let you write a field with a configuration that is different from the schema because the only validation is against the first _written_ field (to make sure that there is compatibility between the first document's field value and all the documents that come after). But relying on this feels hacky. Also, the schema discrepancy mentioned above would be a total blocker if you wanted to store queries alongside your documents in the same collection (because the first written document would be a "true" document with whatever fancy configuration you had defined in the schema). Finally, the multi-pass presearcher does things to field names which parlty motivated the dynamic aliasing as well. Now if there was ever a presearcher that gave you back a different _type_ of field then we'd probably have to define another dynamic alias field .. but that doesn't seem like something that will happen very often. > https://github.com/apache/lucene/blob/releases/lucene/9.10.0/lucene/monitor/src/java/org/apache/lucene/monitor/QueryIndex.java#L46-L48 -- and suffixed an _ underscore, is the suffixing needed or does it just make maybe debugging or so easier? I kind of just assumed based on `_root_` and `_version_` that `_` wrapped fields are considered out of the user space, at least by convention, so I wanted to be consistent. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
Re: [PR] SOLR-4587: integrate lucene-monitor into solr [solr]
kotman12 commented on code in PR #2382: URL: https://github.com/apache/solr/pull/2382#discussion_r1603762136 ## solr/modules/monitor/src/java/org/apache/lucene/monitor/MonitorFields.java: ## @@ -0,0 +1,38 @@ +/* + * + * * Licensed to the Apache Software Foundation (ASF) under one or more + * * contributor license agreements. See the NOTICE file distributed with + * * this work for additional information regarding copyright ownership. + * * The ASF licenses this file to You under the Apache License, Version 2.0 + * * (the "License"); you may not use this file except in compliance with + * * the License. You may obtain a copy of the License at + * * + * * http://www.apache.org/licenses/LICENSE-2.0 + * * + * * Unless required by applicable law or agreed to in writing, software + * * distributed under the License is distributed on an "AS IS" BASIS, + * * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * * See the License for the specific language governing permissions and + * * limitations under the License. + * + */ + +package org.apache.lucene.monitor; + +import java.util.Set; + +public class MonitorFields { + + public static final String QUERY_ID = QueryIndex.FIELDS.query_id + "_"; + public static final String CACHE_ID = QueryIndex.FIELDS.cache_id + "_"; + public static final String MONITOR_QUERY = QueryIndex.FIELDS.mq + "_"; + public static final String PAYLOAD = QueryIndex.FIELDS.mq + "_payload_"; Review Comment: > So I was exploring not specifically w.r.t. making the payload field name overridable but the other fields (query id, cache id, query) also and payload was just first to explore. Yes, optionally-choosable names would let users choose names/terminology to suit their use case, and perhaps might it also then reduce or remove the need for the aliasing functionality (which I haven't yet looked into further)? Ok I am on board with the configuration then. I don't necessarily think it can obviate aliasing because aliasing only applies to fields dynamically generated by the presearcher. These don't exist in `MonitorFields`. For instance, say you have a query `foo:bar` and you naturally have some schema definition that `foo` conforms to in your collection of _regular_ documents. You now want to create a collection of queries for reverse search and we want to make that as easy as possible. So you use should be able to use the same schema (with the same query analyzers etc) with the addition of a few control fields. One of those is `__monitor_alias_*` so that when the presearcher is done tokenizing that query `foo:bar` it actually sends the token stream to the field `__monitor_alias_foo`. So if your original `foo` field had all sorts of fancy configurations like `docValues` or `stored`, the presearcher generated field won't know about any of that and will instead give you a plain old indexed field. Now because of what appears to be an implementation choice, solr will actually let you write a field with a configuration that is different from the schema because the only validation is against the first _written_ field (to make sure that there is compatibility between the first document's field value and all the documents that come after). But relying on this feels hacky. Also, the schema discrepancy mentioned above would be a total blocker if you wanted to store queries alongside your documents in the same collection (because the first written document would be a "true" document with whatever fancy configuration you had defined in the schema). Finally, the multi-pass presearcher does things to field names that also warrant a more dynamic approach to handling it. Now if there was ever a presearcher that gave you back a different _type_ of field then we'd probably have to define another dynamic alias field .. but that doesn't seem like something that will happen very often. > https://github.com/apache/lucene/blob/releases/lucene/9.10.0/lucene/monitor/src/java/org/apache/lucene/monitor/QueryIndex.java#L46-L48 -- and suffixed an _ underscore, is the suffixing needed or does it just make maybe debugging or so easier? I kind of just assumed based on `_root_` and `_version_` that `_` wrapped fields are considered out of the user space, at least by convention, so I wanted to be consistent. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
[jira] [Resolved] (SOLR-17275) Major performance regression of CloudSolrClient in Solr 9.6.0 when using aliases
[ https://issues.apache.org/jira/browse/SOLR-17275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Smiley resolved SOLR-17275. - Resolution: Fixed > Major performance regression of CloudSolrClient in Solr 9.6.0 when using > aliases > > > Key: SOLR-17275 > URL: https://issues.apache.org/jira/browse/SOLR-17275 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: SolrJ >Affects Versions: 9.6.0 > Environment: SolrJ 9.6.0, Ubuntu 22.04, Java 17 >Reporter: Rafał Harabień >Priority: Blocker > Fix For: 9.6.1 > > Attachments: image-2024-05-06-17-23-42-236.png > > Time Spent: 1h > Remaining Estimate: 0h > > I observe worse performance of CloudSolrClient after upgrading from SolrJ > 9.5.0 to 9.6.0, especially on p99. > p99 jumped from ~25 ms to ~400 ms > p90 jumped from ~9.9 ms to ~22 ms > p75 jumped from ~7 ms to ~11 ms > p50 jumped from ~4.5 ms to ~7.5 ms > Screenshot from Grafana (at ~14:30 was deployed the new version): > !image-2024-05-06-17-23-42-236.png! > I've got a thread-dump and I can see many threads waiting in > [ZkStateReader.forceUpdateCollection|https://github.com/apache/solr/blob/f8e5a93c11267e13b7b43005a428bfb910ac6e57/solr/solrj-zookeeper/src/java/org/apache/solr/common/cloud/ZkStateReader.java#L503]: > {noformat} > Thread info: "suggest-solrThreadPool-thread-52" prio=5 Id=600 BLOCKED on > org.apache.solr.common.cloud.ZkStateReader@62e6bc3d owned by > "suggest-solrThreadPool-thread-34" Id=582 > at > app//org.apache.solr.common.cloud.ZkStateReader.forceUpdateCollection(ZkStateReader.java:506) > - blocked on org.apache.solr.common.cloud.ZkStateReader@62e6bc3d > at > app//org.apache.solr.client.solrj.impl.ZkClientClusterStateProvider.getState(ZkClientClusterStateProvider.java:155) > at > app//org.apache.solr.client.solrj.impl.CloudSolrClient.resolveAliases(CloudSolrClient.java:1207) > at > app//org.apache.solr.client.solrj.impl.CloudSolrClient.sendRequest(CloudSolrClient.java:1099) > at > app//org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:892) > at > app//org.apache.solr.client.solrj.impl.CloudSolrClient.request(CloudSolrClient.java:820) > at > app//org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:255) > at > app//org.apache.solr.client.solrj.SolrClient.query(SolrClient.java:927) > ... > Number of locked synchronizers = 1 > - java.util.concurrent.ThreadPoolExecutor$Worker@1beb7ed3 > {noformat} > At the same time qTime from Solr hasn't changed so I'm pretty sure it's a > client regression. > I've tried reproducing it locally and I can see > [forceUpdateCollection|https://github.com/apache/solr/blob/f8e5a93c11267e13b7b43005a428bfb910ac6e57/solr/solrj-zookeeper/src/java/org/apache/solr/common/cloud/ZkStateReader.java#L503] > function being called for every request in my application. I can see that > [this|https://github.com/apache/solr/commit/8cf552aa3642be473c6a08ce44feceb9cbe396d7] > commit > changed the logic in ZkClientClusterStateProvider.getState so the mentioned > function gets called if clusterState.getCollectionRef [returns > null|https://github.com/apache/solr/blob/f8e5a93c11267e13b7b43005a428bfb910ac6e57/solr/solrj-zookeeper/src/java/org/apache/solr/client/solrj/impl/ZkClientClusterStateProvider.java#L151]. > In 9.5.0 it wasn't the case (forceUpdateCollection was not called in this > place). I can see in the debugger that getCollectionRef only supports > collections and not aliases (collectionStates map contains only collections). > In my application all collections are referenced using aliases so I guess > that's why I can see the regression in Solr response time. > I am not familiar with the code enough to prepare a PR but I hope this > insight will be enough to fix this issue. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
Re: [PR] SOLR-17263: HttpJdkSolrClient doesn't encode curly braces etc (follow-up PR) [solr]
dsmiley commented on PR #2435: URL: https://github.com/apache/solr/pull/2435#issuecomment-2115975927 I merged the first PR to main, branch_9x, and branch_9_6. Shall this PR be merged too? There are conflicts. Are we all "Watching" the parent JIRA issue? I commented there yesterday. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
Re: [PR] Refactor preparePutOrPost in HttpJdkSolrClient [solr]
dsmiley commented on PR #2454: URL: https://github.com/apache/solr/pull/2454#issuecomment-2115973786 Shall this be merged? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
[jira] [Commented] (SOLR-17153) CloudSolrClient should not throw "Collection not found" with an out-dated ClusterState
[ https://issues.apache.org/jira/browse/SOLR-17153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17847060#comment-17847060 ] ASF subversion and git services commented on SOLR-17153: Commit 53963292db39803996f76a5229ac3b91c6265b5e in solr's branch refs/heads/branch_9_6 from aparnasuresh85 [ https://gitbox.apache.org/repos/asf?p=solr.git;h=53963292db3 ] SOLR-17275: Revert SolrJ ZkClientClusterStateProvider SOLR-17153 (#2463) SolrJ ZkClientClusterStateProvider: revert SOLR-17153 for perf regression when aliases are used. Other parts of that issue are not reverted. Affects only Solr 9.6. Thanks Rafał Harabień for reporting the problem. (cherry picked from commit f073dc86e1ab714c3e8eaa4a989698feb90f8a27) > CloudSolrClient should not throw "Collection not found" with an out-dated > ClusterState > -- > > Key: SOLR-17153 > URL: https://issues.apache.org/jira/browse/SOLR-17153 > Project: Solr > Issue Type: Improvement > Components: SolrJ >Reporter: David Smiley >Priority: Major > Fix For: 9.6 > > Time Spent: 5h 50m > Remaining Estimate: 0h > > Today, CloudSolrClient will locally fail if it's asked to send a request to a > collection that it thinks does not exist due to its local ClusterState view > being out-of-date. We shouldn't fail! And most SolrCloud tests should then > remove their waitForState calls that follow collection creation! Other stale > state matters are out-of-scope. > Proposal: CloudSolrClient shouldn't try and be too smart. Always route a > request to Solr (any node); don't presume its state is up-to-date. Maybe, > after a response is received, it can check if its state has been updated and > if not then explicitly get a new state. Or not if that's too complicated. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
[jira] [Commented] (SOLR-17275) Major performance regression of CloudSolrClient in Solr 9.6.0 when using aliases
[ https://issues.apache.org/jira/browse/SOLR-17275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17847059#comment-17847059 ] ASF subversion and git services commented on SOLR-17275: Commit 53963292db39803996f76a5229ac3b91c6265b5e in solr's branch refs/heads/branch_9_6 from aparnasuresh85 [ https://gitbox.apache.org/repos/asf?p=solr.git;h=53963292db3 ] SOLR-17275: Revert SolrJ ZkClientClusterStateProvider SOLR-17153 (#2463) SolrJ ZkClientClusterStateProvider: revert SOLR-17153 for perf regression when aliases are used. Other parts of that issue are not reverted. Affects only Solr 9.6. Thanks Rafał Harabień for reporting the problem. (cherry picked from commit f073dc86e1ab714c3e8eaa4a989698feb90f8a27) > Major performance regression of CloudSolrClient in Solr 9.6.0 when using > aliases > > > Key: SOLR-17275 > URL: https://issues.apache.org/jira/browse/SOLR-17275 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: SolrJ >Affects Versions: 9.6.0 > Environment: SolrJ 9.6.0, Ubuntu 22.04, Java 17 >Reporter: Rafał Harabień >Priority: Blocker > Fix For: 9.6.1 > > Attachments: image-2024-05-06-17-23-42-236.png > > Time Spent: 1h > Remaining Estimate: 0h > > I observe worse performance of CloudSolrClient after upgrading from SolrJ > 9.5.0 to 9.6.0, especially on p99. > p99 jumped from ~25 ms to ~400 ms > p90 jumped from ~9.9 ms to ~22 ms > p75 jumped from ~7 ms to ~11 ms > p50 jumped from ~4.5 ms to ~7.5 ms > Screenshot from Grafana (at ~14:30 was deployed the new version): > !image-2024-05-06-17-23-42-236.png! > I've got a thread-dump and I can see many threads waiting in > [ZkStateReader.forceUpdateCollection|https://github.com/apache/solr/blob/f8e5a93c11267e13b7b43005a428bfb910ac6e57/solr/solrj-zookeeper/src/java/org/apache/solr/common/cloud/ZkStateReader.java#L503]: > {noformat} > Thread info: "suggest-solrThreadPool-thread-52" prio=5 Id=600 BLOCKED on > org.apache.solr.common.cloud.ZkStateReader@62e6bc3d owned by > "suggest-solrThreadPool-thread-34" Id=582 > at > app//org.apache.solr.common.cloud.ZkStateReader.forceUpdateCollection(ZkStateReader.java:506) > - blocked on org.apache.solr.common.cloud.ZkStateReader@62e6bc3d > at > app//org.apache.solr.client.solrj.impl.ZkClientClusterStateProvider.getState(ZkClientClusterStateProvider.java:155) > at > app//org.apache.solr.client.solrj.impl.CloudSolrClient.resolveAliases(CloudSolrClient.java:1207) > at > app//org.apache.solr.client.solrj.impl.CloudSolrClient.sendRequest(CloudSolrClient.java:1099) > at > app//org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:892) > at > app//org.apache.solr.client.solrj.impl.CloudSolrClient.request(CloudSolrClient.java:820) > at > app//org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:255) > at > app//org.apache.solr.client.solrj.SolrClient.query(SolrClient.java:927) > ... > Number of locked synchronizers = 1 > - java.util.concurrent.ThreadPoolExecutor$Worker@1beb7ed3 > {noformat} > At the same time qTime from Solr hasn't changed so I'm pretty sure it's a > client regression. > I've tried reproducing it locally and I can see > [forceUpdateCollection|https://github.com/apache/solr/blob/f8e5a93c11267e13b7b43005a428bfb910ac6e57/solr/solrj-zookeeper/src/java/org/apache/solr/common/cloud/ZkStateReader.java#L503] > function being called for every request in my application. I can see that > [this|https://github.com/apache/solr/commit/8cf552aa3642be473c6a08ce44feceb9cbe396d7] > commit > changed the logic in ZkClientClusterStateProvider.getState so the mentioned > function gets called if clusterState.getCollectionRef [returns > null|https://github.com/apache/solr/blob/f8e5a93c11267e13b7b43005a428bfb910ac6e57/solr/solrj-zookeeper/src/java/org/apache/solr/client/solrj/impl/ZkClientClusterStateProvider.java#L151]. > In 9.5.0 it wasn't the case (forceUpdateCollection was not called in this > place). I can see in the debugger that getCollectionRef only supports > collections and not aliases (collectionStates map contains only collections). > In my application all collections are referenced using aliases so I guess > that's why I can see the regression in Solr response time. > I am not familiar with the code enough to prepare a PR but I hope this > insight will be enough to fix this issue. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands,
[jira] [Commented] (SOLR-17153) CloudSolrClient should not throw "Collection not found" with an out-dated ClusterState
[ https://issues.apache.org/jira/browse/SOLR-17153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17847058#comment-17847058 ] ASF subversion and git services commented on SOLR-17153: Commit 9954fc467e70d3e261f17656c2819fcc2e8faffe in solr's branch refs/heads/branch_9x from aparnasuresh85 [ https://gitbox.apache.org/repos/asf?p=solr.git;h=9954fc467e7 ] SOLR-17275: Revert SolrJ ZkClientClusterStateProvider SOLR-17153 (#2463) SolrJ ZkClientClusterStateProvider: revert SOLR-17153 for perf regression when aliases are used. Other parts of that issue are not reverted. Affects only Solr 9.6. Thanks Rafał Harabień for reporting the problem. (cherry picked from commit f073dc86e1ab714c3e8eaa4a989698feb90f8a27) > CloudSolrClient should not throw "Collection not found" with an out-dated > ClusterState > -- > > Key: SOLR-17153 > URL: https://issues.apache.org/jira/browse/SOLR-17153 > Project: Solr > Issue Type: Improvement > Components: SolrJ >Reporter: David Smiley >Priority: Major > Fix For: 9.6 > > Time Spent: 5h 50m > Remaining Estimate: 0h > > Today, CloudSolrClient will locally fail if it's asked to send a request to a > collection that it thinks does not exist due to its local ClusterState view > being out-of-date. We shouldn't fail! And most SolrCloud tests should then > remove their waitForState calls that follow collection creation! Other stale > state matters are out-of-scope. > Proposal: CloudSolrClient shouldn't try and be too smart. Always route a > request to Solr (any node); don't presume its state is up-to-date. Maybe, > after a response is received, it can check if its state has been updated and > if not then explicitly get a new state. Or not if that's too complicated. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
[jira] [Commented] (SOLR-17275) Major performance regression of CloudSolrClient in Solr 9.6.0 when using aliases
[ https://issues.apache.org/jira/browse/SOLR-17275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17847057#comment-17847057 ] ASF subversion and git services commented on SOLR-17275: Commit 9954fc467e70d3e261f17656c2819fcc2e8faffe in solr's branch refs/heads/branch_9x from aparnasuresh85 [ https://gitbox.apache.org/repos/asf?p=solr.git;h=9954fc467e7 ] SOLR-17275: Revert SolrJ ZkClientClusterStateProvider SOLR-17153 (#2463) SolrJ ZkClientClusterStateProvider: revert SOLR-17153 for perf regression when aliases are used. Other parts of that issue are not reverted. Affects only Solr 9.6. Thanks Rafał Harabień for reporting the problem. (cherry picked from commit f073dc86e1ab714c3e8eaa4a989698feb90f8a27) > Major performance regression of CloudSolrClient in Solr 9.6.0 when using > aliases > > > Key: SOLR-17275 > URL: https://issues.apache.org/jira/browse/SOLR-17275 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: SolrJ >Affects Versions: 9.6.0 > Environment: SolrJ 9.6.0, Ubuntu 22.04, Java 17 >Reporter: Rafał Harabień >Priority: Blocker > Fix For: 9.6.1 > > Attachments: image-2024-05-06-17-23-42-236.png > > Time Spent: 1h > Remaining Estimate: 0h > > I observe worse performance of CloudSolrClient after upgrading from SolrJ > 9.5.0 to 9.6.0, especially on p99. > p99 jumped from ~25 ms to ~400 ms > p90 jumped from ~9.9 ms to ~22 ms > p75 jumped from ~7 ms to ~11 ms > p50 jumped from ~4.5 ms to ~7.5 ms > Screenshot from Grafana (at ~14:30 was deployed the new version): > !image-2024-05-06-17-23-42-236.png! > I've got a thread-dump and I can see many threads waiting in > [ZkStateReader.forceUpdateCollection|https://github.com/apache/solr/blob/f8e5a93c11267e13b7b43005a428bfb910ac6e57/solr/solrj-zookeeper/src/java/org/apache/solr/common/cloud/ZkStateReader.java#L503]: > {noformat} > Thread info: "suggest-solrThreadPool-thread-52" prio=5 Id=600 BLOCKED on > org.apache.solr.common.cloud.ZkStateReader@62e6bc3d owned by > "suggest-solrThreadPool-thread-34" Id=582 > at > app//org.apache.solr.common.cloud.ZkStateReader.forceUpdateCollection(ZkStateReader.java:506) > - blocked on org.apache.solr.common.cloud.ZkStateReader@62e6bc3d > at > app//org.apache.solr.client.solrj.impl.ZkClientClusterStateProvider.getState(ZkClientClusterStateProvider.java:155) > at > app//org.apache.solr.client.solrj.impl.CloudSolrClient.resolveAliases(CloudSolrClient.java:1207) > at > app//org.apache.solr.client.solrj.impl.CloudSolrClient.sendRequest(CloudSolrClient.java:1099) > at > app//org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:892) > at > app//org.apache.solr.client.solrj.impl.CloudSolrClient.request(CloudSolrClient.java:820) > at > app//org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:255) > at > app//org.apache.solr.client.solrj.SolrClient.query(SolrClient.java:927) > ... > Number of locked synchronizers = 1 > - java.util.concurrent.ThreadPoolExecutor$Worker@1beb7ed3 > {noformat} > At the same time qTime from Solr hasn't changed so I'm pretty sure it's a > client regression. > I've tried reproducing it locally and I can see > [forceUpdateCollection|https://github.com/apache/solr/blob/f8e5a93c11267e13b7b43005a428bfb910ac6e57/solr/solrj-zookeeper/src/java/org/apache/solr/common/cloud/ZkStateReader.java#L503] > function being called for every request in my application. I can see that > [this|https://github.com/apache/solr/commit/8cf552aa3642be473c6a08ce44feceb9cbe396d7] > commit > changed the logic in ZkClientClusterStateProvider.getState so the mentioned > function gets called if clusterState.getCollectionRef [returns > null|https://github.com/apache/solr/blob/f8e5a93c11267e13b7b43005a428bfb910ac6e57/solr/solrj-zookeeper/src/java/org/apache/solr/client/solrj/impl/ZkClientClusterStateProvider.java#L151]. > In 9.5.0 it wasn't the case (forceUpdateCollection was not called in this > place). I can see in the debugger that getCollectionRef only supports > collections and not aliases (collectionStates map contains only collections). > In my application all collections are referenced using aliases so I guess > that's why I can see the regression in Solr response time. > I am not familiar with the code enough to prepare a PR but I hope this > insight will be enough to fix this issue. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands,
[jira] [Commented] (SOLR-17275) Major performance regression of CloudSolrClient in Solr 9.6.0 when using aliases
[ https://issues.apache.org/jira/browse/SOLR-17275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17847054#comment-17847054 ] ASF subversion and git services commented on SOLR-17275: Commit f073dc86e1ab714c3e8eaa4a989698feb90f8a27 in solr's branch refs/heads/main from aparnasuresh85 [ https://gitbox.apache.org/repos/asf?p=solr.git;h=f073dc86e1a ] SOLR-17275: Revert SolrJ ZkClientClusterStateProvider SOLR-17153 (#2463) SolrJ ZkClientClusterStateProvider: revert SOLR-17153 for perf regression when aliases are used. Other parts of that issue are not reverted. Affects only Solr 9.6. Thanks Rafał Harabień for reporting the problem. > Major performance regression of CloudSolrClient in Solr 9.6.0 when using > aliases > > > Key: SOLR-17275 > URL: https://issues.apache.org/jira/browse/SOLR-17275 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: SolrJ >Affects Versions: 9.6.0 > Environment: SolrJ 9.6.0, Ubuntu 22.04, Java 17 >Reporter: Rafał Harabień >Priority: Blocker > Fix For: 9.6.1 > > Attachments: image-2024-05-06-17-23-42-236.png > > Time Spent: 1h > Remaining Estimate: 0h > > I observe worse performance of CloudSolrClient after upgrading from SolrJ > 9.5.0 to 9.6.0, especially on p99. > p99 jumped from ~25 ms to ~400 ms > p90 jumped from ~9.9 ms to ~22 ms > p75 jumped from ~7 ms to ~11 ms > p50 jumped from ~4.5 ms to ~7.5 ms > Screenshot from Grafana (at ~14:30 was deployed the new version): > !image-2024-05-06-17-23-42-236.png! > I've got a thread-dump and I can see many threads waiting in > [ZkStateReader.forceUpdateCollection|https://github.com/apache/solr/blob/f8e5a93c11267e13b7b43005a428bfb910ac6e57/solr/solrj-zookeeper/src/java/org/apache/solr/common/cloud/ZkStateReader.java#L503]: > {noformat} > Thread info: "suggest-solrThreadPool-thread-52" prio=5 Id=600 BLOCKED on > org.apache.solr.common.cloud.ZkStateReader@62e6bc3d owned by > "suggest-solrThreadPool-thread-34" Id=582 > at > app//org.apache.solr.common.cloud.ZkStateReader.forceUpdateCollection(ZkStateReader.java:506) > - blocked on org.apache.solr.common.cloud.ZkStateReader@62e6bc3d > at > app//org.apache.solr.client.solrj.impl.ZkClientClusterStateProvider.getState(ZkClientClusterStateProvider.java:155) > at > app//org.apache.solr.client.solrj.impl.CloudSolrClient.resolveAliases(CloudSolrClient.java:1207) > at > app//org.apache.solr.client.solrj.impl.CloudSolrClient.sendRequest(CloudSolrClient.java:1099) > at > app//org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:892) > at > app//org.apache.solr.client.solrj.impl.CloudSolrClient.request(CloudSolrClient.java:820) > at > app//org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:255) > at > app//org.apache.solr.client.solrj.SolrClient.query(SolrClient.java:927) > ... > Number of locked synchronizers = 1 > - java.util.concurrent.ThreadPoolExecutor$Worker@1beb7ed3 > {noformat} > At the same time qTime from Solr hasn't changed so I'm pretty sure it's a > client regression. > I've tried reproducing it locally and I can see > [forceUpdateCollection|https://github.com/apache/solr/blob/f8e5a93c11267e13b7b43005a428bfb910ac6e57/solr/solrj-zookeeper/src/java/org/apache/solr/common/cloud/ZkStateReader.java#L503] > function being called for every request in my application. I can see that > [this|https://github.com/apache/solr/commit/8cf552aa3642be473c6a08ce44feceb9cbe396d7] > commit > changed the logic in ZkClientClusterStateProvider.getState so the mentioned > function gets called if clusterState.getCollectionRef [returns > null|https://github.com/apache/solr/blob/f8e5a93c11267e13b7b43005a428bfb910ac6e57/solr/solrj-zookeeper/src/java/org/apache/solr/client/solrj/impl/ZkClientClusterStateProvider.java#L151]. > In 9.5.0 it wasn't the case (forceUpdateCollection was not called in this > place). I can see in the debugger that getCollectionRef only supports > collections and not aliases (collectionStates map contains only collections). > In my application all collections are referenced using aliases so I guess > that's why I can see the regression in Solr response time. > I am not familiar with the code enough to prepare a PR but I hope this > insight will be enough to fix this issue. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
[jira] [Commented] (SOLR-17153) CloudSolrClient should not throw "Collection not found" with an out-dated ClusterState
[ https://issues.apache.org/jira/browse/SOLR-17153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17847055#comment-17847055 ] ASF subversion and git services commented on SOLR-17153: Commit f073dc86e1ab714c3e8eaa4a989698feb90f8a27 in solr's branch refs/heads/main from aparnasuresh85 [ https://gitbox.apache.org/repos/asf?p=solr.git;h=f073dc86e1a ] SOLR-17275: Revert SolrJ ZkClientClusterStateProvider SOLR-17153 (#2463) SolrJ ZkClientClusterStateProvider: revert SOLR-17153 for perf regression when aliases are used. Other parts of that issue are not reverted. Affects only Solr 9.6. Thanks Rafał Harabień for reporting the problem. > CloudSolrClient should not throw "Collection not found" with an out-dated > ClusterState > -- > > Key: SOLR-17153 > URL: https://issues.apache.org/jira/browse/SOLR-17153 > Project: Solr > Issue Type: Improvement > Components: SolrJ >Reporter: David Smiley >Priority: Major > Fix For: 9.6 > > Time Spent: 5h 50m > Remaining Estimate: 0h > > Today, CloudSolrClient will locally fail if it's asked to send a request to a > collection that it thinks does not exist due to its local ClusterState view > being out-of-date. We shouldn't fail! And most SolrCloud tests should then > remove their waitForState calls that follow collection creation! Other stale > state matters are out-of-scope. > Proposal: CloudSolrClient shouldn't try and be too smart. Always route a > request to Solr (any node); don't presume its state is up-to-date. Maybe, > after a response is received, it can check if its state has been updated and > if not then explicitly get a new state. Or not if that's too complicated. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
Re: [PR] SOLR-17275: Major performance regression of CloudSolrClient in Solr 9.6.0 when using aliases [solr]
dsmiley merged PR #2463: URL: https://github.com/apache/solr/pull/2463 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
Re: [PR] SOLR-17275: Major performance regression of CloudSolrClient in Solr 9.6.0 when using aliases [solr]
dsmiley commented on code in PR #2463: URL: https://github.com/apache/solr/pull/2463#discussion_r1603805974 ## solr/CHANGES.txt: ## @@ -136,7 +136,13 @@ Other Changes * SOLR-17248: Refactor ZK related SolrCli tools to separate SolrZkClient and CloudSolrClient instantiation/usage (Lamine Idjeraoui via Eric Pugh) * SOLR-16505: Use Jetty HTTP2 for index replication and other "recovery" operations - (Sanjay Dutt, David Smiley) + (Sanjay Dutt, David Smiley Review Comment: ```suggestion (Sanjay Dutt, David Smiley) ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
Re: [PR] SOLR-17275: Major performance regression of CloudSolrClient in Solr 9.6.0 when using aliases [solr]
dsmiley commented on code in PR #2463: URL: https://github.com/apache/solr/pull/2463#discussion_r1603804215 ## solr/CHANGES.txt: ## @@ -136,7 +136,13 @@ Other Changes * SOLR-17248: Refactor ZK related SolrCli tools to separate SolrZkClient and CloudSolrClient instantiation/usage (Lamine Idjeraoui via Eric Pugh) * SOLR-16505: Use Jetty HTTP2 for index replication and other "recovery" operations - (Sanjay Dutt, David Smiley) + (Sanjay Dutt, David Smiley + + += 9.6.1 +Bug Fixes +-- +* SOLR-17275: SolrJ ZkClientClusterStateProvider, revert SOLR-17153 for regression when aliases are used. (Aparna Suresh)) Review Comment: ```suggestion * SOLR-17275: SolrJ ZkClientClusterStateProvider, revert SOLR-17153 for perf regression when aliases are used. (Aparna Suresh) ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
[jira] [Commented] (SOLR-16866) CachingDirectoryFactory Throwing Error When Closing
[ https://issues.apache.org/jira/browse/SOLR-16866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17847048#comment-17847048 ] Houston Putman commented on SOLR-16866: --- This might actually be a problem on Windows, not just testing, so maybe we should have a separate Jira for this? So it can be tagged to the 9.6.1 release > CachingDirectoryFactory Throwing Error When Closing > --- > > Key: SOLR-16866 > URL: https://issues.apache.org/jira/browse/SOLR-16866 > Project: Solr > Issue Type: Bug >Affects Versions: 9.2.1 >Reporter: Stefan Pieper >Assignee: Michael Gibney >Priority: Major > Labels: core > Fix For: 9.6 > > Attachments: solr.blueprint_gnylctsgemcz_users.log > > Time Spent: 3h 40m > Remaining Estimate: 0h > > Observed occasional ERROR log entries when {{CachingDirectryFactory#close()}} > is called. The respective snippet from the log is this: > {noformat} > 2023-07-06 08:43:37.618 DEBUG (qtp212011969-1801) [ > blueprint_gnylctsgemcz_users] o.a.s.c.CachingDirectoryFactory Removing > directory after core close: /var/solr/data/blueprint_gnylctsgemcz_users/data > 2023-07-06 08:43:37.620 DEBUG (qtp212011969-1801) [ > blueprint_gnylctsgemcz_users] o.a.s.c.CachingDirectoryFactory Removing > directory after core close: > /var/solr/data/blueprint_gnylctsgemcz_users/data/index > 2023-07-06 08:43:37.620 ERROR (qtp212011969-1801) [ > blueprint_gnylctsgemcz_users] o.a.s.c.CachingDirectoryFactory Error removing > directory => java.nio.file.NoSuchFileException: > /var/solr/data/blueprint_gnylctsgemcz_users/data/index > java.nio.file.NoSuchFileException: > /var/solr/data/blueprint_gnylctsgemcz_users/data/index > {noformat} > It seems like the order of directory removal is causing this (first parent > dir _data_, then sub-dir _data/index_). Entries should be re-arranged or just > the parent dir be handled. > Full log with entries dealing with the respective core is attached. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
[PR] SOLR-16866: Use file separator instead of '/' in CachingDirectoryFactory [solr]
HoustonPutman opened a new pull request, #2464: URL: https://github.com/apache/solr/pull/2464 https://issues.apache.org/jira/browse/SOLR-16866 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
[jira] [Commented] (SOLR-16866) CachingDirectoryFactory Throwing Error When Closing
[ https://issues.apache.org/jira/browse/SOLR-16866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17847046#comment-17847046 ] Houston Putman commented on SOLR-16866: --- I think this is due to a bad {{isSubPath()}} method, which used '/' instead of File.separator. I'll put up a PR, but I can't test this locally since I don't have a windows machine. > CachingDirectoryFactory Throwing Error When Closing > --- > > Key: SOLR-16866 > URL: https://issues.apache.org/jira/browse/SOLR-16866 > Project: Solr > Issue Type: Bug >Affects Versions: 9.2.1 >Reporter: Stefan Pieper >Assignee: Michael Gibney >Priority: Major > Labels: core > Fix For: 9.6 > > Attachments: solr.blueprint_gnylctsgemcz_users.log > > Time Spent: 3.5h > Remaining Estimate: 0h > > Observed occasional ERROR log entries when {{CachingDirectryFactory#close()}} > is called. The respective snippet from the log is this: > {noformat} > 2023-07-06 08:43:37.618 DEBUG (qtp212011969-1801) [ > blueprint_gnylctsgemcz_users] o.a.s.c.CachingDirectoryFactory Removing > directory after core close: /var/solr/data/blueprint_gnylctsgemcz_users/data > 2023-07-06 08:43:37.620 DEBUG (qtp212011969-1801) [ > blueprint_gnylctsgemcz_users] o.a.s.c.CachingDirectoryFactory Removing > directory after core close: > /var/solr/data/blueprint_gnylctsgemcz_users/data/index > 2023-07-06 08:43:37.620 ERROR (qtp212011969-1801) [ > blueprint_gnylctsgemcz_users] o.a.s.c.CachingDirectoryFactory Error removing > directory => java.nio.file.NoSuchFileException: > /var/solr/data/blueprint_gnylctsgemcz_users/data/index > java.nio.file.NoSuchFileException: > /var/solr/data/blueprint_gnylctsgemcz_users/data/index > {noformat} > It seems like the order of directory removal is causing this (first parent > dir _data_, then sub-dir _data/index_). Entries should be re-arranged or just > the parent dir be handled. > Full log with entries dealing with the respective core is attached. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
Re: [PR] SOLR-4587: integrate lucene-monitor into solr [solr]
kotman12 commented on code in PR #2382: URL: https://github.com/apache/solr/pull/2382#discussion_r1603769246 ## solr/modules/monitor/src/java/org/apache/solr/monitor/update/MonitorUpdateRequestProcessor.java: ## @@ -0,0 +1,257 @@ +/* + * + * * Licensed to the Apache Software Foundation (ASF) under one or more + * * contributor license agreements. See the NOTICE file distributed with + * * this work for additional information regarding copyright ownership. + * * The ASF licenses this file to You under the Apache License, Version 2.0 + * * (the "License"); you may not use this file except in compliance with + * * the License. You may obtain a copy of the License at + * * + * * http://www.apache.org/licenses/LICENSE-2.0 + * * + * * Unless required by applicable law or agreed to in writing, software + * * distributed under the License is distributed on an "AS IS" BASIS, + * * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * * See the License for the specific language governing permissions and + * * limitations under the License. + * + */ + +package org.apache.solr.monitor.update; + +import java.io.IOException; +import java.io.Reader; +import java.util.ArrayList; +import java.util.List; +import java.util.Map; +import java.util.Optional; +import java.util.stream.Collectors; +import java.util.stream.Stream; +import org.apache.lucene.analysis.Analyzer; +import org.apache.lucene.analysis.TokenStream; +import org.apache.lucene.document.Document; +import org.apache.lucene.document.Field; +import org.apache.lucene.document.InvertableType; +import org.apache.lucene.document.StoredValue; +import org.apache.lucene.index.DocValuesType; +import org.apache.lucene.index.IndexableField; +import org.apache.lucene.index.IndexableFieldType; +import org.apache.lucene.monitor.MonitorFields; +import org.apache.lucene.monitor.MonitorQuery; +import org.apache.lucene.monitor.Presearcher; +import org.apache.lucene.monitor.QCEVisitor; +import org.apache.lucene.util.BytesRef; +import org.apache.solr.common.SolrException; +import org.apache.solr.common.SolrInputDocument; +import org.apache.solr.common.util.JavaBinCodec; +import org.apache.solr.core.SolrCore; +import org.apache.solr.monitor.MonitorConstants; +import org.apache.solr.monitor.MonitorSchemaFields; +import org.apache.solr.monitor.SimpleQueryParser; +import org.apache.solr.schema.IndexSchema; +import org.apache.solr.schema.SchemaField; +import org.apache.solr.update.AddUpdateCommand; +import org.apache.solr.update.processor.UpdateRequestProcessor; + +public class MonitorUpdateRequestProcessor extends UpdateRequestProcessor { + + private final SolrCore core; + private final IndexSchema indexSchema; + private final Presearcher presearcher; + private final MonitorSchemaFields monitorSchemaFields; + + public MonitorUpdateRequestProcessor( + UpdateRequestProcessor next, SolrCore core, Presearcher presearcher) { +super(next); +this.core = core; +this.indexSchema = core.getLatestSchema(); +this.presearcher = presearcher; +this.monitorSchemaFields = new MonitorSchemaFields(indexSchema); + } + + @Override + public void processAdd(AddUpdateCommand cmd) throws IOException { +var solrInputDocument = cmd.getSolrInputDocument(); +var queryId = +(String) solrInputDocument.getFieldValue(indexSchema.getUniqueKeyField().getName()); +var queryFieldValue = solrInputDocument.getFieldValue(MonitorFields.MONITOR_QUERY); +if (queryFieldValue != null) { + var payload = + Optional.ofNullable(solrInputDocument.getFieldValue(MonitorFields.PAYLOAD)) + .map(Object::toString) + .orElse(null); + List children = + Optional.of(queryFieldValue) + .filter(String.class::isInstance) + .map(String.class::cast) + .map( + queryStr -> + new MonitorQuery( + queryId, SimpleQueryParser.parse(queryStr, core), queryStr, Map.of())) + .stream() + .flatMap(monitorQuery -> decompose(monitorQuery, payload)) + .map(this::toSolrInputDoc) + .collect(Collectors.toList()); + if (children.isEmpty()) { +throw new SolrException( +SolrException.ErrorCode.INVALID_STATE, "Query could not be decomposed"); + } + SolrInputDocument firstChild = children.get(0); + if (solrInputDocument.hasChildDocuments()) { +solrInputDocument.getChildDocuments().clear(); + } + solrInputDocument.addChildDocuments(children.stream().skip(1).collect(Collectors.toList())); + if (solrInputDocument.hasChildDocuments()) { +solrInputDocument +.getChildDocuments() +.forEach( +child -> +solrInputDocument.forEach( +field -> { + if
Re: [PR] SOLR-4587: integrate lucene-monitor into solr [solr]
kotman12 commented on code in PR #2382: URL: https://github.com/apache/solr/pull/2382#discussion_r1603762136 ## solr/modules/monitor/src/java/org/apache/lucene/monitor/MonitorFields.java: ## @@ -0,0 +1,38 @@ +/* + * + * * Licensed to the Apache Software Foundation (ASF) under one or more + * * contributor license agreements. See the NOTICE file distributed with + * * this work for additional information regarding copyright ownership. + * * The ASF licenses this file to You under the Apache License, Version 2.0 + * * (the "License"); you may not use this file except in compliance with + * * the License. You may obtain a copy of the License at + * * + * * http://www.apache.org/licenses/LICENSE-2.0 + * * + * * Unless required by applicable law or agreed to in writing, software + * * distributed under the License is distributed on an "AS IS" BASIS, + * * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * * See the License for the specific language governing permissions and + * * limitations under the License. + * + */ + +package org.apache.lucene.monitor; + +import java.util.Set; + +public class MonitorFields { + + public static final String QUERY_ID = QueryIndex.FIELDS.query_id + "_"; + public static final String CACHE_ID = QueryIndex.FIELDS.cache_id + "_"; + public static final String MONITOR_QUERY = QueryIndex.FIELDS.mq + "_"; + public static final String PAYLOAD = QueryIndex.FIELDS.mq + "_payload_"; Review Comment: > So I was exploring not specifically w.r.t. making the payload field name overridable but the other fields (query id, cache id, query) also and payload was just first to explore. Yes, optionally-choosable names would let users choose names/terminology to suit their use case, and perhaps might it also then reduce or remove the need for the aliasing functionality (which I haven't yet looked into further)? Ok I am on board with the configuration then. I don't necessarily think it can obviate aliasing because aliasing only applies to fields dynamically generated by the presearcher. These don't exist in `MonitorFields`. For instance, say you have a query `foo:bar` and you naturally have some schema definition that `foo` conforms to in your collection of _regular_ documents. You now want to create a collection of queries for reverse search and we want to make that as easy as possible. So you use should be able to use the same schema (with the same query analyzers etc) with the addition of a few control fields. One of those is `__monitor_alias_*` so that when the presearcher is done tokenizing that query `foo:bar` it actually sends the token stream to the field `__monitor_alias_foo`. So if your original `foo` field had all sorts of fancy configurations like `docValues` or `stored`, the presearcher generated field won't know about any of that and will instead give you a plain old indexed field. Now because of what appears to be an implementation choice, solr will actually let you write a field with a configuration that is different from the schema because the only validation is against the first _written_ field (to make sure that there is compatibility between the first document's field value and all the documents that come after). But relying on this feels hacky. Also, the schema discrepancy mentioned above would be a total blocker if you wanted to store queries alongside your documents in the same collection (because the first written document would be a "true" document with whatever fancy configuration you had defined in the schema). Finally, the multi-pass presearcher does things to field names that also warrant a more dynamic approach to handling it. Now if there was ever a presearcher that gave you back a different _type_ of field then we'd probably have to define another dynamic alias field .. but that doesn't seem like something that will happen ver y often. > https://github.com/apache/lucene/blob/releases/lucene/9.10.0/lucene/monitor/src/java/org/apache/lucene/monitor/QueryIndex.java#L46-L48 -- and suffixed an _ underscore, is the suffixing needed or does it just make maybe debugging or so easier? I kind of just assumed based on `_root_` and `_version_` that `_` wrapped fields are considered out of the user space, at least by convention, so I wanted to be consistent. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
Re: [PR] SOLR-4587: integrate lucene-monitor into solr [solr]
cpoerschke commented on code in PR #2382: URL: https://github.com/apache/solr/pull/2382#discussion_r1603721535 ## solr/modules/monitor/src/java/org/apache/solr/monitor/update/MonitorUpdateRequestProcessor.java: ## @@ -0,0 +1,257 @@ +/* + * + * * Licensed to the Apache Software Foundation (ASF) under one or more + * * contributor license agreements. See the NOTICE file distributed with + * * this work for additional information regarding copyright ownership. + * * The ASF licenses this file to You under the Apache License, Version 2.0 + * * (the "License"); you may not use this file except in compliance with + * * the License. You may obtain a copy of the License at + * * + * * http://www.apache.org/licenses/LICENSE-2.0 + * * + * * Unless required by applicable law or agreed to in writing, software + * * distributed under the License is distributed on an "AS IS" BASIS, + * * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * * See the License for the specific language governing permissions and + * * limitations under the License. + * + */ + +package org.apache.solr.monitor.update; + +import java.io.IOException; +import java.io.Reader; +import java.util.ArrayList; +import java.util.List; +import java.util.Map; +import java.util.Optional; +import java.util.stream.Collectors; +import java.util.stream.Stream; +import org.apache.lucene.analysis.Analyzer; +import org.apache.lucene.analysis.TokenStream; +import org.apache.lucene.document.Document; +import org.apache.lucene.document.Field; +import org.apache.lucene.document.InvertableType; +import org.apache.lucene.document.StoredValue; +import org.apache.lucene.index.DocValuesType; +import org.apache.lucene.index.IndexableField; +import org.apache.lucene.index.IndexableFieldType; +import org.apache.lucene.monitor.MonitorFields; +import org.apache.lucene.monitor.MonitorQuery; +import org.apache.lucene.monitor.Presearcher; +import org.apache.lucene.monitor.QCEVisitor; +import org.apache.lucene.util.BytesRef; +import org.apache.solr.common.SolrException; +import org.apache.solr.common.SolrInputDocument; +import org.apache.solr.common.util.JavaBinCodec; +import org.apache.solr.core.SolrCore; +import org.apache.solr.monitor.MonitorConstants; +import org.apache.solr.monitor.MonitorSchemaFields; +import org.apache.solr.monitor.SimpleQueryParser; +import org.apache.solr.schema.IndexSchema; +import org.apache.solr.schema.SchemaField; +import org.apache.solr.update.AddUpdateCommand; +import org.apache.solr.update.processor.UpdateRequestProcessor; + +public class MonitorUpdateRequestProcessor extends UpdateRequestProcessor { + + private final SolrCore core; + private final IndexSchema indexSchema; + private final Presearcher presearcher; + private final MonitorSchemaFields monitorSchemaFields; + + public MonitorUpdateRequestProcessor( + UpdateRequestProcessor next, SolrCore core, Presearcher presearcher) { +super(next); +this.core = core; +this.indexSchema = core.getLatestSchema(); +this.presearcher = presearcher; +this.monitorSchemaFields = new MonitorSchemaFields(indexSchema); + } + + @Override + public void processAdd(AddUpdateCommand cmd) throws IOException { +var solrInputDocument = cmd.getSolrInputDocument(); +var queryId = +(String) solrInputDocument.getFieldValue(indexSchema.getUniqueKeyField().getName()); +var queryFieldValue = solrInputDocument.getFieldValue(MonitorFields.MONITOR_QUERY); +if (queryFieldValue != null) { + var payload = + Optional.ofNullable(solrInputDocument.getFieldValue(MonitorFields.PAYLOAD)) + .map(Object::toString) + .orElse(null); + List children = + Optional.of(queryFieldValue) + .filter(String.class::isInstance) + .map(String.class::cast) + .map( + queryStr -> + new MonitorQuery( + queryId, SimpleQueryParser.parse(queryStr, core), queryStr, Map.of())) + .stream() + .flatMap(monitorQuery -> decompose(monitorQuery, payload)) + .map(this::toSolrInputDoc) + .collect(Collectors.toList()); + if (children.isEmpty()) { +throw new SolrException( +SolrException.ErrorCode.INVALID_STATE, "Query could not be decomposed"); + } + SolrInputDocument firstChild = children.get(0); + if (solrInputDocument.hasChildDocuments()) { +solrInputDocument.getChildDocuments().clear(); + } + solrInputDocument.addChildDocuments(children.stream().skip(1).collect(Collectors.toList())); + if (solrInputDocument.hasChildDocuments()) { +solrInputDocument +.getChildDocuments() +.forEach( +child -> +solrInputDocument.forEach( +field -> { + if
Re: [PR] SOLR-4587: integrate lucene-monitor into solr [solr]
kotman12 commented on code in PR #2382: URL: https://github.com/apache/solr/pull/2382#discussion_r1603720227 ## solr/modules/monitor/src/java/org/apache/lucene/monitor/MonitorFields.java: ## @@ -0,0 +1,38 @@ +/* + * + * * Licensed to the Apache Software Foundation (ASF) under one or more + * * contributor license agreements. See the NOTICE file distributed with + * * this work for additional information regarding copyright ownership. + * * The ASF licenses this file to You under the Apache License, Version 2.0 + * * (the "License"); you may not use this file except in compliance with + * * the License. You may obtain a copy of the License at + * * + * * http://www.apache.org/licenses/LICENSE-2.0 + * * + * * Unless required by applicable law or agreed to in writing, software + * * distributed under the License is distributed on an "AS IS" BASIS, + * * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * * See the License for the specific language governing permissions and + * * limitations under the License. + * + */ + +package org.apache.lucene.monitor; + +import java.util.Set; + +public class MonitorFields { + + public static final String QUERY_ID = QueryIndex.FIELDS.query_id + "_"; + public static final String CACHE_ID = QueryIndex.FIELDS.cache_id + "_"; + public static final String MONITOR_QUERY = QueryIndex.FIELDS.mq + "_"; + public static final String PAYLOAD = QueryIndex.FIELDS.mq + "_payload_"; + public static final String VERSION = "_version_"; Review Comment: I didn't realize this was `public`! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
[jira] [Commented] (SOLR-17275) Major performance regression of CloudSolrClient in Solr 9.6.0 when using aliases
[ https://issues.apache.org/jira/browse/SOLR-17275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17847033#comment-17847033 ] Aparna Suresh commented on SOLR-17275: -- Indeed I confirmed the same once I saw your response. Somehow my test didnt capture the interaction with Zk with how it was designed, but debugging revealed the true issue. Another reason to revert the logic is to have CloudSolrClient not try to be up-to-date with global cluster state, which was the intention of the parent Jira - SOLR-17153. > Major performance regression of CloudSolrClient in Solr 9.6.0 when using > aliases > > > Key: SOLR-17275 > URL: https://issues.apache.org/jira/browse/SOLR-17275 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: SolrJ >Affects Versions: 9.6.0 > Environment: SolrJ 9.6.0, Ubuntu 22.04, Java 17 >Reporter: Rafał Harabień >Priority: Blocker > Fix For: 9.6.1 > > Attachments: image-2024-05-06-17-23-42-236.png > > Time Spent: 0.5h > Remaining Estimate: 0h > > I observe worse performance of CloudSolrClient after upgrading from SolrJ > 9.5.0 to 9.6.0, especially on p99. > p99 jumped from ~25 ms to ~400 ms > p90 jumped from ~9.9 ms to ~22 ms > p75 jumped from ~7 ms to ~11 ms > p50 jumped from ~4.5 ms to ~7.5 ms > Screenshot from Grafana (at ~14:30 was deployed the new version): > !image-2024-05-06-17-23-42-236.png! > I've got a thread-dump and I can see many threads waiting in > [ZkStateReader.forceUpdateCollection|https://github.com/apache/solr/blob/f8e5a93c11267e13b7b43005a428bfb910ac6e57/solr/solrj-zookeeper/src/java/org/apache/solr/common/cloud/ZkStateReader.java#L503]: > {noformat} > Thread info: "suggest-solrThreadPool-thread-52" prio=5 Id=600 BLOCKED on > org.apache.solr.common.cloud.ZkStateReader@62e6bc3d owned by > "suggest-solrThreadPool-thread-34" Id=582 > at > app//org.apache.solr.common.cloud.ZkStateReader.forceUpdateCollection(ZkStateReader.java:506) > - blocked on org.apache.solr.common.cloud.ZkStateReader@62e6bc3d > at > app//org.apache.solr.client.solrj.impl.ZkClientClusterStateProvider.getState(ZkClientClusterStateProvider.java:155) > at > app//org.apache.solr.client.solrj.impl.CloudSolrClient.resolveAliases(CloudSolrClient.java:1207) > at > app//org.apache.solr.client.solrj.impl.CloudSolrClient.sendRequest(CloudSolrClient.java:1099) > at > app//org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:892) > at > app//org.apache.solr.client.solrj.impl.CloudSolrClient.request(CloudSolrClient.java:820) > at > app//org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:255) > at > app//org.apache.solr.client.solrj.SolrClient.query(SolrClient.java:927) > ... > Number of locked synchronizers = 1 > - java.util.concurrent.ThreadPoolExecutor$Worker@1beb7ed3 > {noformat} > At the same time qTime from Solr hasn't changed so I'm pretty sure it's a > client regression. > I've tried reproducing it locally and I can see > [forceUpdateCollection|https://github.com/apache/solr/blob/f8e5a93c11267e13b7b43005a428bfb910ac6e57/solr/solrj-zookeeper/src/java/org/apache/solr/common/cloud/ZkStateReader.java#L503] > function being called for every request in my application. I can see that > [this|https://github.com/apache/solr/commit/8cf552aa3642be473c6a08ce44feceb9cbe396d7] > commit > changed the logic in ZkClientClusterStateProvider.getState so the mentioned > function gets called if clusterState.getCollectionRef [returns > null|https://github.com/apache/solr/blob/f8e5a93c11267e13b7b43005a428bfb910ac6e57/solr/solrj-zookeeper/src/java/org/apache/solr/client/solrj/impl/ZkClientClusterStateProvider.java#L151]. > In 9.5.0 it wasn't the case (forceUpdateCollection was not called in this > place). I can see in the debugger that getCollectionRef only supports > collections and not aliases (collectionStates map contains only collections). > In my application all collections are referenced using aliases so I guess > that's why I can see the regression in Solr response time. > I am not familiar with the code enough to prepare a PR but I hope this > insight will be enough to fix this issue. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
Re: [PR] SOLR-4587: integrate lucene-monitor into solr [solr]
cpoerschke commented on code in PR #2382: URL: https://github.com/apache/solr/pull/2382#discussion_r1603697031 ## solr/modules/monitor/src/java/org/apache/lucene/monitor/MonitorFields.java: ## @@ -0,0 +1,38 @@ +/* + * + * * Licensed to the Apache Software Foundation (ASF) under one or more + * * contributor license agreements. See the NOTICE file distributed with + * * this work for additional information regarding copyright ownership. + * * The ASF licenses this file to You under the Apache License, Version 2.0 + * * (the "License"); you may not use this file except in compliance with + * * the License. You may obtain a copy of the License at + * * + * * http://www.apache.org/licenses/LICENSE-2.0 + * * + * * Unless required by applicable law or agreed to in writing, software + * * distributed under the License is distributed on an "AS IS" BASIS, + * * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * * See the License for the specific language governing permissions and + * * limitations under the License. + * + */ + +package org.apache.lucene.monitor; + +import java.util.Set; + +public class MonitorFields { + + public static final String QUERY_ID = QueryIndex.FIELDS.query_id + "_"; + public static final String CACHE_ID = QueryIndex.FIELDS.cache_id + "_"; + public static final String MONITOR_QUERY = QueryIndex.FIELDS.mq + "_"; + public static final String PAYLOAD = QueryIndex.FIELDS.mq + "_payload_"; Review Comment: > ... is there a particular reason you were exploring making the payload field name overridable? Maybe to let users give it a more self-describing name for their particular use case? ... So I was exploring not specifically w.r.t. making the payload field name overridable but the other fields (query id, cache id, query) also and payload was just first to explore. Yes, optionally-choosable names would let users choose names/terminology to suit their use case, and perhaps might it also then reduce or remove the need for the aliasing functionality (which I haven't yet looked into further)? As an aside, I noticed that the query id and cache id and query field name choices picked the Lucene `QueryIndex` fields -- https://github.com/apache/lucene/blob/releases/lucene/9.10.0/lucene/monitor/src/java/org/apache/lucene/monitor/QueryIndex.java#L46-L48 -- and suffixed an `_` underscore, is the suffixing needed or does it just make maybe debugging or so easier? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
Re: [PR] SOLR-4587: integrate lucene-monitor into solr [solr]
cpoerschke commented on code in PR #2382: URL: https://github.com/apache/solr/pull/2382#discussion_r1603682245 ## solr/modules/monitor/src/java/org/apache/solr/monitor/update/MonitorUpdateRequestProcessor.java: ## @@ -0,0 +1,257 @@ +/* + * + * * Licensed to the Apache Software Foundation (ASF) under one or more + * * contributor license agreements. See the NOTICE file distributed with + * * this work for additional information regarding copyright ownership. + * * The ASF licenses this file to You under the Apache License, Version 2.0 + * * (the "License"); you may not use this file except in compliance with + * * the License. You may obtain a copy of the License at + * * + * * http://www.apache.org/licenses/LICENSE-2.0 + * * + * * Unless required by applicable law or agreed to in writing, software + * * distributed under the License is distributed on an "AS IS" BASIS, + * * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * * See the License for the specific language governing permissions and + * * limitations under the License. + * + */ + +package org.apache.solr.monitor.update; + +import java.io.IOException; +import java.io.Reader; +import java.util.ArrayList; +import java.util.List; +import java.util.Map; +import java.util.Optional; +import java.util.stream.Collectors; +import java.util.stream.Stream; +import org.apache.lucene.analysis.Analyzer; +import org.apache.lucene.analysis.TokenStream; +import org.apache.lucene.document.Document; +import org.apache.lucene.document.Field; +import org.apache.lucene.document.InvertableType; +import org.apache.lucene.document.StoredValue; +import org.apache.lucene.index.DocValuesType; +import org.apache.lucene.index.IndexableField; +import org.apache.lucene.index.IndexableFieldType; +import org.apache.lucene.monitor.MonitorFields; +import org.apache.lucene.monitor.MonitorQuery; +import org.apache.lucene.monitor.Presearcher; +import org.apache.lucene.monitor.QCEVisitor; +import org.apache.lucene.util.BytesRef; +import org.apache.solr.common.SolrException; +import org.apache.solr.common.SolrInputDocument; +import org.apache.solr.common.util.JavaBinCodec; +import org.apache.solr.core.SolrCore; +import org.apache.solr.monitor.MonitorConstants; +import org.apache.solr.monitor.MonitorSchemaFields; +import org.apache.solr.monitor.SimpleQueryParser; +import org.apache.solr.schema.IndexSchema; +import org.apache.solr.schema.SchemaField; +import org.apache.solr.update.AddUpdateCommand; +import org.apache.solr.update.processor.UpdateRequestProcessor; + +public class MonitorUpdateRequestProcessor extends UpdateRequestProcessor { + + private final SolrCore core; + private final IndexSchema indexSchema; + private final Presearcher presearcher; + private final MonitorSchemaFields monitorSchemaFields; + + public MonitorUpdateRequestProcessor( + UpdateRequestProcessor next, SolrCore core, Presearcher presearcher) { +super(next); +this.core = core; +this.indexSchema = core.getLatestSchema(); +this.presearcher = presearcher; +this.monitorSchemaFields = new MonitorSchemaFields(indexSchema); + } + + @Override + public void processAdd(AddUpdateCommand cmd) throws IOException { +var solrInputDocument = cmd.getSolrInputDocument(); +var queryId = +(String) solrInputDocument.getFieldValue(indexSchema.getUniqueKeyField().getName()); +var queryFieldValue = solrInputDocument.getFieldValue(MonitorFields.MONITOR_QUERY); +if (queryFieldValue != null) { + var payload = + Optional.ofNullable(solrInputDocument.getFieldValue(MonitorFields.PAYLOAD)) + .map(Object::toString) + .orElse(null); + List children = + Optional.of(queryFieldValue) + .filter(String.class::isInstance) + .map(String.class::cast) + .map( + queryStr -> + new MonitorQuery( + queryId, SimpleQueryParser.parse(queryStr, core), queryStr, Map.of())) + .stream() + .flatMap(monitorQuery -> decompose(monitorQuery, payload)) + .map(this::toSolrInputDoc) + .collect(Collectors.toList()); + if (children.isEmpty()) { +throw new SolrException( +SolrException.ErrorCode.INVALID_STATE, "Query could not be decomposed"); + } + SolrInputDocument firstChild = children.get(0); + if (solrInputDocument.hasChildDocuments()) { +solrInputDocument.getChildDocuments().clear(); + } + solrInputDocument.addChildDocuments(children.stream().skip(1).collect(Collectors.toList())); + if (solrInputDocument.hasChildDocuments()) { +solrInputDocument +.getChildDocuments() +.forEach( +child -> +solrInputDocument.forEach( +field -> { + if
Re: [PR] SOLR-4587: integrate lucene-monitor into solr [solr]
cpoerschke commented on code in PR #2382: URL: https://github.com/apache/solr/pull/2382#discussion_r1603659406 ## solr/modules/monitor/src/java/org/apache/lucene/monitor/MonitorFields.java: ## @@ -0,0 +1,38 @@ +/* + * + * * Licensed to the Apache Software Foundation (ASF) under one or more + * * contributor license agreements. See the NOTICE file distributed with + * * this work for additional information regarding copyright ownership. + * * The ASF licenses this file to You under the Apache License, Version 2.0 + * * (the "License"); you may not use this file except in compliance with + * * the License. You may obtain a copy of the License at + * * + * * http://www.apache.org/licenses/LICENSE-2.0 + * * + * * Unless required by applicable law or agreed to in writing, software + * * distributed under the License is distributed on an "AS IS" BASIS, + * * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * * See the License for the specific language governing permissions and + * * limitations under the License. + * + */ + +package org.apache.lucene.monitor; + +import java.util.Set; + +public class MonitorFields { + + public static final String QUERY_ID = QueryIndex.FIELDS.query_id + "_"; + public static final String CACHE_ID = QueryIndex.FIELDS.cache_id + "_"; + public static final String MONITOR_QUERY = QueryIndex.FIELDS.mq + "_"; + public static final String PAYLOAD = QueryIndex.FIELDS.mq + "_payload_"; + public static final String VERSION = "_version_"; Review Comment: minor: wondering if `CommonParams.VERSION_FIELD` could be used for this? https://github.com/apache/solr/blob/releases/solr/9.6.0/solr/solrj/src/java/org/apache/solr/common/params/CommonParams.java#L329 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
Re: [PR] SOLR-4587: integrate lucene-monitor into solr [solr]
cpoerschke commented on code in PR #2382: URL: https://github.com/apache/solr/pull/2382#discussion_r1603658364 ## solr/modules/monitor/src/java/org/apache/lucene/monitor/MonitorFields.java: ## @@ -0,0 +1,38 @@ +/* + * + * * Licensed to the Apache Software Foundation (ASF) under one or more + * * contributor license agreements. See the NOTICE file distributed with + * * this work for additional information regarding copyright ownership. + * * The ASF licenses this file to You under the Apache License, Version 2.0 + * * (the "License"); you may not use this file except in compliance with + * * the License. You may obtain a copy of the License at + * * + * * http://www.apache.org/licenses/LICENSE-2.0 + * * + * * Unless required by applicable law or agreed to in writing, software + * * distributed under the License is distributed on an "AS IS" BASIS, + * * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * * See the License for the specific language governing permissions and + * * limitations under the License. + * + */ + +package org.apache.lucene.monitor; + +import java.util.Set; + +public class MonitorFields { + + public static final String QUERY_ID = QueryIndex.FIELDS.query_id + "_"; + public static final String CACHE_ID = QueryIndex.FIELDS.cache_id + "_"; + public static final String MONITOR_QUERY = QueryIndex.FIELDS.mq + "_"; + public static final String PAYLOAD = QueryIndex.FIELDS.mq + "_payload_"; + public static final String VERSION = "_version_"; + public static final String ANYTOKEN_FIELD = TermFilteredPresearcher.ANYTOKEN_FIELD; Review Comment: cross-reference https://github.com/apache/lucene/pull/13379 w.r.t. making `TermFilteredPresearcher.ANYTOKEN_FIELD` public -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
Re: [PR] SOLR-4587: integrate lucene-monitor into solr [solr]
cpoerschke commented on code in PR #2382: URL: https://github.com/apache/solr/pull/2382#discussion_r1603658051 ## solr/modules/monitor/src/java/org/apache/solr/monitor/search/ReverseSearchComponent.java: ## @@ -0,0 +1,260 @@ +/* + * + * * Licensed to the Apache Software Foundation (ASF) under one or more + * * contributor license agreements. See the NOTICE file distributed with + * * this work for additional information regarding copyright ownership. + * * The ASF licenses this file to You under the Apache License, Version 2.0 + * * (the "License"); you may not use this file except in compliance with + * * the License. You may obtain a copy of the License at + * * + * * http://www.apache.org/licenses/LICENSE-2.0 + * * + * * Unless required by applicable law or agreed to in writing, software + * * distributed under the License is distributed on an "AS IS" BASIS, + * * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * * See the License for the specific language governing permissions and + * * limitations under the License. + * + */ + +package org.apache.solr.monitor.search; + +import static org.apache.solr.monitor.MonitorConstants.MONITOR_DOCUMENTS_KEY; +import static org.apache.solr.monitor.MonitorConstants.MONITOR_OUTPUT_KEY; +import static org.apache.solr.monitor.MonitorConstants.QUERY_MATCH_TYPE_KEY; +import static org.apache.solr.monitor.MonitorConstants.WRITE_TO_DOC_LIST_KEY; +import static org.apache.solr.monitor.search.PresearcherFactory.DEFAULT_ALIAS_PREFIX; +import static org.apache.solr.monitor.search.QueryMatchResponseCodec.mergeResponses; + +import java.util.ArrayList; +import java.util.Arrays; +import java.util.Comparator; +import java.util.HashMap; +import java.util.List; +import java.util.Map; +import java.util.Optional; +import java.util.concurrent.ExecutorService; +import java.util.function.BiPredicate; +import java.util.stream.Collectors; +import org.apache.lucene.document.Document; +import org.apache.lucene.monitor.DocumentBatchVisitor; +import org.apache.lucene.monitor.MonitorFields; +import org.apache.lucene.monitor.Presearcher; +import org.apache.lucene.monitor.QueryDecomposer; +import org.apache.lucene.monitor.TermFilteredPresearcher; +import org.apache.lucene.search.MatchAllDocsQuery; +import org.apache.lucene.search.Query; +import org.apache.lucene.util.BytesRef; +import org.apache.lucene.util.NamedThreadFactory; +import org.apache.solr.common.SolrException; +import org.apache.solr.common.SolrException.ErrorCode; +import org.apache.solr.common.util.ExecutorUtil; +import org.apache.solr.common.util.NamedList; +import org.apache.solr.core.CloseHook; +import org.apache.solr.core.SolrCore; +import org.apache.solr.handler.component.QueryComponent; +import org.apache.solr.handler.component.ResponseBuilder; +import org.apache.solr.handler.component.ShardRequest; +import org.apache.solr.handler.loader.JsonLoader; +import org.apache.solr.monitor.AliasingPresearcher; +import org.apache.solr.monitor.SolrMonitorQueryDecoder; +import org.apache.solr.monitor.cache.MonitorQueryCache; +import org.apache.solr.monitor.cache.SharedMonitorCache; +import org.apache.solr.request.SolrQueryRequest; +import org.apache.solr.schema.SchemaField; +import org.apache.solr.update.DocumentBuilder; +import org.apache.solr.util.SolrPluginUtils; +import org.apache.solr.util.plugin.SolrCoreAware; + +public class ReverseSearchComponent extends QueryComponent implements SolrCoreAware { + + public static final String COMPONENT_NAME = "reverseSearch"; + private static final String MATCHER_THREAD_COUNT_KEY = "threadCount"; + + private static final String SOLR_MONITOR_CACHE_NAME_KEY = "solrMonitorCacheName"; + private static final String SOLR_MONITOR_CACHE_NAME_DEFAULT = "solrMonitorCache"; + private String solrMonitorCacheName = SOLR_MONITOR_CACHE_NAME_DEFAULT; + + private QueryDecomposer queryDecomposer; + private Presearcher presearcher; + private SolrMatcherSinkFactory solrMatcherSinkFactory = new SolrMatcherSinkFactory(); + private ExecutorService executor; + private PresearcherFactory.PresearcherParameters presearcherParameters; + + @Override + public void init(NamedList args) { +super.init(args); +Object solrMonitorCacheName = args.remove(SOLR_MONITOR_CACHE_NAME_KEY); +if (solrMonitorCacheName != null) { + this.solrMonitorCacheName = (String) solrMonitorCacheName; +} +Object threadCount = args.remove(MATCHER_THREAD_COUNT_KEY); +if (threadCount instanceof Integer) { + executor = + ExecutorUtil.newMDCAwareFixedThreadPool( + (Integer) threadCount, new NamedThreadFactory("monitor-matcher")); + solrMatcherSinkFactory = new SolrMatcherSinkFactory(executor); +} +presearcherParameters = new PresearcherFactory.PresearcherParameters(); +SolrPluginUtils.invokeSetters(presearcherParameters, args); + } + + @Override + public void prepare(ResponseBuilder rb) { +var
Re: [PR] SOLR-13350: Multithreaded search [solr]
gus-asf commented on PR #2248: URL: https://github.com/apache/solr/pull/2248#issuecomment-2115586822 The discussion on this is long, so maybe I've missed it, but the actual merged code has introduced the possibility (though I suspect it might never happen) of a non-numeric Max Score... public float getMaxScore(int totalHits) { if (totalHits > 0) { for (Object res : result) { if (res instanceof MaxScoreResult) { return ((MaxScoreResult) res).maxScore; } } return Float.NaN;<< } else { return 0.0f; } } Did I miss discussion of this? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
[jira] [Comment Edited] (SOLR-17275) Major performance regression of CloudSolrClient in Solr 9.6.0 when using aliases
[ https://issues.apache.org/jira/browse/SOLR-17275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17847007#comment-17847007 ] Rafał Harabień edited comment on SOLR-17275 at 5/16/24 3:30 PM: [~aparnasuresh] I think communication with ZK actually happens: {noformat} tryLazyCollection.get() {noformat} calls {noformat} ZkStateReader::LazyCollectionRef::get(false) // allowCached = false{noformat} -> cachedDocCollection will be null -> shouldFetch will be true -> ZkStateReader::getCollectionLive will be called -> ZkStateReader::fetchCollectionState will be called -> zkClient.getData() will be called I confirmed it in a debugger. Synchronized block may cause some small delay too, because we handle a relatively big traffic, but sending ZK request sounds like a bigger problem to me. Anyway PR looks good to me so I'm looking forward to 9.6.1 release :) was (Author: JIRAUSER303242): [~aparnasuresh] I think communication with ZK actually happens: {noformat} tryLazyCollection.get() {noformat} calls {noformat} ZkStateReader::LazyCollectionRef::get(false) // allowCached = false{noformat} -> cachedDocCollection will be null -> shouldFetch will be true -> ZkStateReader::getCollectionLive will be called-> ZkStateReader::fetchCollectionState will be called-> zkClient.getData() will be called I confirmed it in a debugger. Synchronized block may cause some small delay too, because we handle a relatively big traffic, but sending ZK request sounds like a bigger problem to me. Anyway PR looks good to me so I'm looking forward to 9.6.1 release :) > Major performance regression of CloudSolrClient in Solr 9.6.0 when using > aliases > > > Key: SOLR-17275 > URL: https://issues.apache.org/jira/browse/SOLR-17275 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: SolrJ >Affects Versions: 9.6.0 > Environment: SolrJ 9.6.0, Ubuntu 22.04, Java 17 >Reporter: Rafał Harabień >Priority: Blocker > Fix For: 9.6.1 > > Attachments: image-2024-05-06-17-23-42-236.png > > Time Spent: 0.5h > Remaining Estimate: 0h > > I observe worse performance of CloudSolrClient after upgrading from SolrJ > 9.5.0 to 9.6.0, especially on p99. > p99 jumped from ~25 ms to ~400 ms > p90 jumped from ~9.9 ms to ~22 ms > p75 jumped from ~7 ms to ~11 ms > p50 jumped from ~4.5 ms to ~7.5 ms > Screenshot from Grafana (at ~14:30 was deployed the new version): > !image-2024-05-06-17-23-42-236.png! > I've got a thread-dump and I can see many threads waiting in > [ZkStateReader.forceUpdateCollection|https://github.com/apache/solr/blob/f8e5a93c11267e13b7b43005a428bfb910ac6e57/solr/solrj-zookeeper/src/java/org/apache/solr/common/cloud/ZkStateReader.java#L503]: > {noformat} > Thread info: "suggest-solrThreadPool-thread-52" prio=5 Id=600 BLOCKED on > org.apache.solr.common.cloud.ZkStateReader@62e6bc3d owned by > "suggest-solrThreadPool-thread-34" Id=582 > at > app//org.apache.solr.common.cloud.ZkStateReader.forceUpdateCollection(ZkStateReader.java:506) > - blocked on org.apache.solr.common.cloud.ZkStateReader@62e6bc3d > at > app//org.apache.solr.client.solrj.impl.ZkClientClusterStateProvider.getState(ZkClientClusterStateProvider.java:155) > at > app//org.apache.solr.client.solrj.impl.CloudSolrClient.resolveAliases(CloudSolrClient.java:1207) > at > app//org.apache.solr.client.solrj.impl.CloudSolrClient.sendRequest(CloudSolrClient.java:1099) > at > app//org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:892) > at > app//org.apache.solr.client.solrj.impl.CloudSolrClient.request(CloudSolrClient.java:820) > at > app//org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:255) > at > app//org.apache.solr.client.solrj.SolrClient.query(SolrClient.java:927) > ... > Number of locked synchronizers = 1 > - java.util.concurrent.ThreadPoolExecutor$Worker@1beb7ed3 > {noformat} > At the same time qTime from Solr hasn't changed so I'm pretty sure it's a > client regression. > I've tried reproducing it locally and I can see > [forceUpdateCollection|https://github.com/apache/solr/blob/f8e5a93c11267e13b7b43005a428bfb910ac6e57/solr/solrj-zookeeper/src/java/org/apache/solr/common/cloud/ZkStateReader.java#L503] > function being called for every request in my application. I can see that > [this|https://github.com/apache/solr/commit/8cf552aa3642be473c6a08ce44feceb9cbe396d7] > commit > changed the logic in ZkClientClusterStateProvider.getState so the mentioned > function gets called if clusterState.getCollectionRef [returns >
[jira] [Commented] (SOLR-17275) Major performance regression of CloudSolrClient in Solr 9.6.0 when using aliases
[ https://issues.apache.org/jira/browse/SOLR-17275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17847007#comment-17847007 ] Rafał Harabień commented on SOLR-17275: --- [~aparnasuresh] I think communication with ZK actually happens: {noformat} tryLazyCollection.get() {noformat} calls {noformat} ZkStateReader::LazyCollectionRef::get(false) // allowCached = false{noformat} -> cachedDocCollection will be null -> shouldFetch will be true -> ZkStateReader::getCollectionLive will be called-> ZkStateReader::fetchCollectionState will be called-> zkClient.getData() will be called I confirmed it in a debugger. Synchronized block may cause some small delay too, because we handle a relatively big traffic, but sending ZK request sounds like a bigger problem to me. Anyway PR looks good to me so I'm looking forward to 9.6.1 release :) > Major performance regression of CloudSolrClient in Solr 9.6.0 when using > aliases > > > Key: SOLR-17275 > URL: https://issues.apache.org/jira/browse/SOLR-17275 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: SolrJ >Affects Versions: 9.6.0 > Environment: SolrJ 9.6.0, Ubuntu 22.04, Java 17 >Reporter: Rafał Harabień >Priority: Blocker > Fix For: 9.6.1 > > Attachments: image-2024-05-06-17-23-42-236.png > > Time Spent: 0.5h > Remaining Estimate: 0h > > I observe worse performance of CloudSolrClient after upgrading from SolrJ > 9.5.0 to 9.6.0, especially on p99. > p99 jumped from ~25 ms to ~400 ms > p90 jumped from ~9.9 ms to ~22 ms > p75 jumped from ~7 ms to ~11 ms > p50 jumped from ~4.5 ms to ~7.5 ms > Screenshot from Grafana (at ~14:30 was deployed the new version): > !image-2024-05-06-17-23-42-236.png! > I've got a thread-dump and I can see many threads waiting in > [ZkStateReader.forceUpdateCollection|https://github.com/apache/solr/blob/f8e5a93c11267e13b7b43005a428bfb910ac6e57/solr/solrj-zookeeper/src/java/org/apache/solr/common/cloud/ZkStateReader.java#L503]: > {noformat} > Thread info: "suggest-solrThreadPool-thread-52" prio=5 Id=600 BLOCKED on > org.apache.solr.common.cloud.ZkStateReader@62e6bc3d owned by > "suggest-solrThreadPool-thread-34" Id=582 > at > app//org.apache.solr.common.cloud.ZkStateReader.forceUpdateCollection(ZkStateReader.java:506) > - blocked on org.apache.solr.common.cloud.ZkStateReader@62e6bc3d > at > app//org.apache.solr.client.solrj.impl.ZkClientClusterStateProvider.getState(ZkClientClusterStateProvider.java:155) > at > app//org.apache.solr.client.solrj.impl.CloudSolrClient.resolveAliases(CloudSolrClient.java:1207) > at > app//org.apache.solr.client.solrj.impl.CloudSolrClient.sendRequest(CloudSolrClient.java:1099) > at > app//org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:892) > at > app//org.apache.solr.client.solrj.impl.CloudSolrClient.request(CloudSolrClient.java:820) > at > app//org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:255) > at > app//org.apache.solr.client.solrj.SolrClient.query(SolrClient.java:927) > ... > Number of locked synchronizers = 1 > - java.util.concurrent.ThreadPoolExecutor$Worker@1beb7ed3 > {noformat} > At the same time qTime from Solr hasn't changed so I'm pretty sure it's a > client regression. > I've tried reproducing it locally and I can see > [forceUpdateCollection|https://github.com/apache/solr/blob/f8e5a93c11267e13b7b43005a428bfb910ac6e57/solr/solrj-zookeeper/src/java/org/apache/solr/common/cloud/ZkStateReader.java#L503] > function being called for every request in my application. I can see that > [this|https://github.com/apache/solr/commit/8cf552aa3642be473c6a08ce44feceb9cbe396d7] > commit > changed the logic in ZkClientClusterStateProvider.getState so the mentioned > function gets called if clusterState.getCollectionRef [returns > null|https://github.com/apache/solr/blob/f8e5a93c11267e13b7b43005a428bfb910ac6e57/solr/solrj-zookeeper/src/java/org/apache/solr/client/solrj/impl/ZkClientClusterStateProvider.java#L151]. > In 9.5.0 it wasn't the case (forceUpdateCollection was not called in this > place). I can see in the debugger that getCollectionRef only supports > collections and not aliases (collectionStates map contains only collections). > In my application all collections are referenced using aliases so I guess > that's why I can see the regression in Solr response time. > I am not familiar with the code enough to prepare a PR but I hope this > insight will be enough to fix this issue. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (SOLR-13350) Explore collector managers for multi-threaded search
[ https://issues.apache.org/jira/browse/SOLR-13350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846929#comment-17846929 ] Gus Heck commented on SOLR-13350: - As it is, this will cause a break in back compatibility for folks that have adopted the new query limits functionality. The effect will be for the feature they intend to rely on to protect themselves from runaway queries to silently stop working. > Explore collector managers for multi-threaded search > > > Key: SOLR-13350 > URL: https://issues.apache.org/jira/browse/SOLR-13350 > Project: Solr > Issue Type: New Feature >Reporter: Ishan Chattopadhyaya >Assignee: Ishan Chattopadhyaya >Priority: Major > Attachments: SOLR-13350.patch, SOLR-13350.patch, SOLR-13350.patch > > Time Spent: 11h 20m > Remaining Estimate: 0h > > AFAICT, SolrIndexSearcher can be used only to search all the segments of an > index in series. However, using CollectorManagers, segments can be searched > concurrently and result in reduced latency. Opening this issue to explore the > effectiveness of using CollectorManagers in SolrIndexSearcher from latency > and throughput perspective. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
[jira] [Commented] (SOLR-13350) Explore collector managers for multi-threaded search
[ https://issues.apache.org/jira/browse/SOLR-13350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846927#comment-17846927 ] Gus Heck commented on SOLR-13350: - Yes please revert, this is not fully baked, and 9x is supposed to be stable. > Explore collector managers for multi-threaded search > > > Key: SOLR-13350 > URL: https://issues.apache.org/jira/browse/SOLR-13350 > Project: Solr > Issue Type: New Feature >Reporter: Ishan Chattopadhyaya >Assignee: Ishan Chattopadhyaya >Priority: Major > Attachments: SOLR-13350.patch, SOLR-13350.patch, SOLR-13350.patch > > Time Spent: 11h 20m > Remaining Estimate: 0h > > AFAICT, SolrIndexSearcher can be used only to search all the segments of an > index in series. However, using CollectorManagers, segments can be searched > concurrently and result in reduced latency. Opening this issue to explore the > effectiveness of using CollectorManagers in SolrIndexSearcher from latency > and throughput perspective. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
[jira] [Updated] (SOLR-17295) The downloads page should link to our OpenAPI spec artifact
[ https://issues.apache.org/jira/browse/SOLR-17295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jan Høydahl updated SOLR-17295: --- Description: We now publish OpenAPI spec in our downloads section: [https://downloads.apache.org/solr/solr/9.6.0/openApi/] We should link to this on the website download page at [https://solr.apache.org/downloads] PR will be done against [https://github.com/apache/solr-site] was: We now publish OpenAPI spec in our downloads section: [https://downloads.apache.org/solr/solr/9.6.0/openApi/] We should link to this on the website download page. PR will be done against [https://github.com/apache/solr-site] > The downloads page should link to our OpenAPI spec artifact > --- > > Key: SOLR-17295 > URL: https://issues.apache.org/jira/browse/SOLR-17295 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) > Components: website >Reporter: Jan Høydahl >Priority: Major > Labels: good-first-issue, newdev > > We now publish OpenAPI spec in our downloads section: > [https://downloads.apache.org/solr/solr/9.6.0/openApi/] > > We should link to this on the website download page at > [https://solr.apache.org/downloads] > PR will be done against [https://github.com/apache/solr-site] -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
[jira] [Commented] (SOLR-17219) Exceptions occur while Solr reads some core's configset (java.io.IOException: Error opening /configs/)
[ https://issues.apache.org/jira/browse/SOLR-17219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846910#comment-17846910 ] Chris Young commented on SOLR-17219: I've also hit this issue on Solr 9.5, collections that fail seem to be random, at times when I've restarted all cores load but in general its very inconsistent. Tried moving to Solr 9.6 but also experiencing the issue there as well. > Exceptions occur while Solr reads some core's configset (java.io.IOException: > Error opening /configs/) > > > Key: SOLR-17219 > URL: https://issues.apache.org/jira/browse/SOLR-17219 > Project: Solr > Issue Type: Bug > Components: SolrCloud >Affects Versions: 9.5.0 > Environment: My first attempts were on Windows via services hosted > through procrun (both zookeeper and solr nodes). I also tried with a Docker > Dekstop ensemble. > It seems that this error occurs less frequently via Docker. But it happens > anyway. >Reporter: Guillaume Jactat >Priority: Major > Attachments: stack.txt > > > Hello, > I'm currently testing SolrCloud to get a better idea of how to recover from > node failures. > I have a simple configuration: one ZooKeeper server and 3 *Solr 9.5* nodes. > I upload a configset in Zookeeper via Solr's Configsets API. I create 200 > collections, all bound to the same configset. > I leave the collections empty for the moment. > When I stop/start one node, the process of recovery happens. And almost > everytime, i get the following error (full stack is attached to this issue): > java.io.IOException: Error opening > /configs/CoreModel–CB38FE6CFE/lang/stopwords_fi.txt > Its not always the same configset's file. Sometimes, everything goes fine. > But when this error occurs, the whole process of recovery seem compromised, > leaving a lot of cores/collections "down". No "retry" happens, maybe because > Solr assumes that the configset is wrong and no retry could fix it ? > I've tried the same setup on Windows Service (procrun) and Docker Desktop > containers. It seems that this error occurs less frequently with docker but > it happens anyway. > I didn't find anything close to this error on the web... I have no clue why > this error happens. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
[jira] [Created] (SOLR-17295) The downloads page should link to our OpenAPI spec artifact
Jan Høydahl created SOLR-17295: -- Summary: The downloads page should link to our OpenAPI spec artifact Key: SOLR-17295 URL: https://issues.apache.org/jira/browse/SOLR-17295 Project: Solr Issue Type: Improvement Security Level: Public (Default Security Level. Issues are Public) Components: website Reporter: Jan Høydahl We now publish OpenAPI spec in our downloads section: [https://downloads.apache.org/solr/solr/9.6.0/openApi/] We should link to this on the website download page. PR will be done against [https://github.com/apache/solr-site] -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
Re: [PR] Update resources.md [solr-site]
janhoy merged PR #92: URL: https://github.com/apache/solr-site/pull/92 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
Re: [PR] Support for 'source' field in CycloneDX VEX output [solr-site]
janhoy merged PR #95: URL: https://github.com/apache/solr-site/pull/95 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
Re: [PR] Suppress CVE-2023-51074 [solr-site]
janhoy merged PR #96: URL: https://github.com/apache/solr-site/pull/96 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
Re: [PR] SOLR-17290: Update SyncStrategy and PeerSyncWithLeader to use the recovery Http2SolrClient [solr]
iamsanjay commented on PR #2460: URL: https://github.com/apache/solr/pull/2460#issuecomment-2114635995 > Would love some thoughts on how we could handle the auth scenarios better without having to write an explicit test for it on every piece of code, once with auth and once with out ;-) IMO, There are not many test cases that have written along with Auth enabled in SolrCloud environment, and apparently it's none for this specific scenario #2462 where a RecoveryStrategy is begin tested in Auth enabled environment. I am not sure how many cases this new test class #2462 can cover, however the idea would be to keep on extending this one to cover more cases for Cloud related scenario. For instance, after merging #2462, same test case would be extends to test SyncStrategy and PeerSyncWithLeader Http2SolrClient calls to make sure that they are executing without any errors in auth enabled SolrCloud environment. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
Re: [PR] SOLR-17294: The stall detection in the ConcurrentUpdateSolrClients easily detects false positives. [solr]
sigram commented on code in PR #2461: URL: https://github.com/apache/solr/pull/2461#discussion_r1602919046 ## solr/solrj/src/java/org/apache/solr/client/solrj/impl/ConcurrentUpdateSolrClient.java: ## @@ -566,24 +576,28 @@ public NamedList request(final SolrRequest request, String collection if (!success) { // stall prevention int currentQueueSize = queue.size(); + long currentTime = System.nanoTime(); + long processed = processedCount.sum(); if (currentQueueSize != lastQueueSize) { // there's still some progress in processing the queue - not stalled lastQueueSize = currentQueueSize; +lastProcessedCount = processed; +lastCheckTime = currentTime; lastStallTime = -1; } else { -if (lastStallTime == -1) { - // mark a stall but keep trying - lastStallTime = System.nanoTime(); -} else { - long currentStallTime = - TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - lastStallTime); - if (currentStallTime > stallTimeMillis) { +long timeElapsed = TimeUnit.NANOSECONDS.toMillis(currentTime - lastCheckTime); +if (timeElapsed > stallTimeMillis) { Review Comment: Please note that the same logic is present in 3 places - request(), blockUntilFinished() and waitForEmptyQueue(). Also, we're supposed to use `ConcurrentUpdateHttp2SolrClient` now ;) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org