[jira] [Comment Edited] (SOLR-13998) Add thread safety annotation to classes
[ https://issues.apache.org/jira/browse/SOLR-13998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17176075#comment-17176075 ] Anshum Gupta edited comment on SOLR-13998 at 8/12/20, 6:56 AM: --- [~marcussorealheis] - The idea behind this came from a branch that was shared by Mark, and while I was planning to pick a bunch of those changes to improve a few things in Solr, I didn't get the bandwidth to continue with this effort. The idea here is to have an option that allows annotating classes in Solr w.r.t. thread safety. While retrofitting that might be tricky, it can certainly be used for new classes. This mechanism allows other developers to rightfully annotate classes, so others could safely consume/extend those. I agree, in an ideal world, being able to annotate existing classes would have been an awesome plan, and while that was the plan to some extent, it never got there. Hopefully I'll get back to it, or someone else will, and make things better for other devs. 
> Add thread safety annotation to classes > --- > > Key: SOLR-13998 > URL: https://issues.apache.org/jira/browse/SOLR-13998 > Project: Solr > Issue Type: Improvement >Reporter: Anshum Gupta >Assignee: Anshum Gupta >Priority: Major > Fix For: master (9.0), 8.4 > > Time Spent: 1.5h > Remaining Estimate: 0h > > Add annotations that can be used to mark classes as thread safe / single > threaded in Solr. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-13998) Add thread safety annotation to classes
[ https://issues.apache.org/jira/browse/SOLR-13998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17176075#comment-17176075 ] Anshum Gupta commented on SOLR-13998: - [~marcussorealheis] - The idea behind this came from a branch that was shared by Mark, and while I was planning to pick a bunch of those changes to improve a few things in Solr, I didn't get the bandwidth to continue with this effort. I think the idea here is to have an option that allows annotating classes in Solr w.r.t. thread safety. While retrofitting that might be tricky, it can certainly be used for new classes. This mechanism allows other developers to rightfully annotate classes, so others could safely consume/extend those. I agree, in an ideal world, being able to annotate existing classes would have been an awesome plan, and while that was the plan to some extent, it never got there. Hopefully I'll get back to it, or someone else will, and make things better for other devs.
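For readers following the thread, a minimal sketch of what such marker annotations could look like. The annotation names echo the SolrSingleThreaded annotation discussed here, but the retention policy, the `describe` helper, and the sample classes are all illustrative assumptions, not Solr's actual code:

```java
import java.lang.annotation.Documented;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

public class ThreadSafetyDemo {

  /** Marks a class whose instances may safely be shared across threads. */
  @Documented
  @Retention(RetentionPolicy.RUNTIME)
  @Target(ElementType.TYPE)
  public @interface SolrThreadSafe {}

  /** Marks a class that must only ever be used by a single thread. */
  @Documented
  @Retention(RetentionPolicy.RUNTIME)
  @Target(ElementType.TYPE)
  public @interface SolrSingleThreaded {}

  @SolrThreadSafe
  static final class SharedCache {}

  @SolrSingleThreaded
  static final class RequestScratchpad {}

  /** Returns the thread-safety note declared on a class, or "unannotated". */
  static String describe(Class<?> clazz) {
    if (clazz.isAnnotationPresent(SolrThreadSafe.class)) return "thread-safe";
    if (clazz.isAnnotationPresent(SolrSingleThreaded.class)) return "single-threaded";
    return "unannotated";
  }

  public static void main(String[] args) {
    System.out.println(describe(SharedCache.class));       // thread-safe
    System.out.println(describe(RequestScratchpad.class)); // single-threaded
  }
}
```

Runtime retention is an assumption here; it makes the annotations inspectable by tests or tooling that could enforce the contract, rather than serving as documentation only.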
[jira] [Commented] (SOLR-14726) Streamline getting started experience
[ https://issues.apache.org/jira/browse/SOLR-14726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17176071#comment-17176071 ] Jan Høydahl commented on SOLR-14726: cURL is the "default", so probably what we should default to. Instead of mentioning wget or Perl, we could perhaps mention HTTPie (I love it), as well as the DevTools section of YASA, which I suppose can handle POST as well as GET, see screenshot: !yasa-http.png! Once the Ref Guide is consistent in using standard tools, we still don't need to remove bin/post in 9.0; it can remain as a hidden gem throughout 10.x to cause the least surprise for folks who have already integrated it into their tooling, and to allow decent alternatives for indexing a whole folder structure to emerge. > Streamline getting started experience > - > > Key: SOLR-14726 > URL: https://issues.apache.org/jira/browse/SOLR-14726 > Project: Solr > Issue Type: Task > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Ishan Chattopadhyaya >Priority: Major > Labels: newdev > Attachments: yasa-http.png > > > The reference guide Solr tutorial is here: > https://lucene.apache.org/solr/guide/8_6/solr-tutorial.html > It needs to be simplified and made easy to follow. It should also reflect our > best practices, which should also be followed in production. I have the following > suggestions: > # Make it less verbose. It is too long. On my laptop, it required 35 Page Down > presses to get to the bottom of the page! > # The first step of the tutorial should be to enable security (basic auth should > suffice). > # {{./bin/solr start -e cloud}} <-- All references to -e should be removed. > # All references to {{bin/solr post}} should be replaced with {{curl}} > # Convert all {{bin/solr create}} references to curl collection creation > commands > # Add Docker-based startup instructions. 
> # Create a Jupyter Notebook version of the entire tutorial, make it so that > it can be easily executed from Google Colaboratory. Here's an example: > https://twitter.com/TheSearchStack/status/1289703715981496320 > # Provide downloadable Postman and Insomnia files so that the same tutorial > can be executed from those tools. Except for starting Solr, all other steps > should be possible to carry out from those tools. > # Use V2 APIs everywhere in the tutorial > # Remove all example modes, sample data (films, tech products etc.), > configsets from Solr's distribution (instead let the examples refer to them > from github) > # Remove the post tool from Solr, curl should suffice.
[jira] [Commented] (LUCENE-9448) Make an equivalent to Ant's "run" target for Luke module
[ https://issues.apache.org/jira/browse/LUCENE-9448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17176070#comment-17176070 ] Dawid Weiss commented on LUCENE-9448: - I added a pull request this time - removed the Solr distribution bit since we established it's not the right time, but added an optional "standalone distribution-only" JAR as an example (highlighter) and a readme file with substitutable parameters. Once you run: gradlew -p lucene\luke assemble you'll see the readme file will contain a java command to launch Luke. I don't think scripts are needed after this (it's a developer tool after all, so I assume minimal terminal capabilities). I'm leaving the rest to you, Tomoko - continue as you please. > Make an equivalent to Ant's "run" target for Luke module > > > Key: LUCENE-9448 > URL: https://issues.apache.org/jira/browse/LUCENE-9448 > Project: Lucene - Core > Issue Type: Sub-task >Reporter: Tomoko Uchida >Priority: Minor > Attachments: LUCENE-9448.patch, LUCENE-9448.patch > > > With the Ant build, the Luke Swing app can be launched by "ant run" after checking > out the source code. "ant run" allows developers to immediately see the > effects of UI changes without creating the whole zip/tgz package (originally, > it was suggested when integrating Luke into Lucene). > In Gradle, the {{:lucene:luke:run}} task would be easily implemented with > {{JavaExec}}, I think.
[jira] [Updated] (SOLR-14726) Streamline getting started experience
[ https://issues.apache.org/jira/browse/SOLR-14726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jan Høydahl updated SOLR-14726: --- Attachment: yasa-http.png
[jira] [Commented] (LUCENE-8626) standardise test class naming
[ https://issues.apache.org/jira/browse/LUCENE-8626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17176068#comment-17176068 ] Marcus Eagan commented on LUCENE-8626: -- Great ideas [~dweiss] and [~dsmiley]! > standardise test class naming > - > > Key: LUCENE-8626 > URL: https://issues.apache.org/jira/browse/LUCENE-8626 > Project: Lucene - Core > Issue Type: Test >Reporter: Christine Poerschke >Priority: Major > Attachments: SOLR-12939.01.patch, SOLR-12939.02.patch, > SOLR-12939.03.patch, SOLR-12939_hoss_validation_groovy_experiment.patch > > > This was mentioned and proposed on the dev mailing list. Starting this ticket > here to start to make it happen? > History: This ticket was created as > https://issues.apache.org/jira/browse/SOLR-12939 ticket and then got > JIRA-moved to become https://issues.apache.org/jira/browse/LUCENE-8626 ticket.
[GitHub] [lucene-solr] dweiss opened a new pull request #1742: Standalone distribution assembly for Luke
dweiss opened a new pull request #1742: URL: https://github.com/apache/lucene-solr/pull/1742 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [lucene-solr] dweiss commented on a change in pull request #1732: Clean up many small fixes
dweiss commented on a change in pull request #1732: URL: https://github.com/apache/lucene-solr/pull/1732#discussion_r469031590 ## File path: lucene/core/src/java/org/apache/lucene/index/SegmentInfos.java ## @@ -440,7 +440,7 @@ public static final SegmentInfos readCommit(Directory directory, ChecksumIndexIn if (format >= VERSION_70) { // oldest supported version CodecUtil.checkFooter(input, priorE); } else { -throw IOUtils.rethrowAlways(priorE); Review comment: And what's wrong with a throw from within finally? A finally block is technically just a block of code, like any other. The compiler very likely assumes you're suppressing an exception if you throw from within finally, but that's not the case here. I don't know if moving that throw will change the logic. Maybe not. Maybe yes. Given the two options, I wouldn't touch it. My concern was that you slipped such things in as part of an otherwise "trivial" set of patches.
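The semantics at issue in this review — a throw from within finally silently replacing whatever exception was already in flight, which is why static analysis flags it — can be shown with a small self-contained sketch (class and method names invented for illustration):

```java
public class FinallyThrowDemo {

  /** Returns the message of whichever exception actually reaches the caller. */
  static String result() {
    try {
      try {
        throw new IllegalStateException("original");
      } finally {
        // This throw discards the in-flight IllegalStateException:
        // only the exception below propagates out of the inner try.
        throw new RuntimeException("from finally");
      }
    } catch (RuntimeException e) {
      return e.getMessage();
    }
  }

  public static void main(String[] args) {
    System.out.println(result()); // prints "from finally"
  }
}
```

This is exactly the suppression the compiler warns about in general; as the comment notes, in the SegmentInfos case the throw is intentional, so the warning does not indicate a bug there.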
[GitHub] [lucene-solr] dweiss commented on a change in pull request #1732: Clean up many small fixes
dweiss commented on a change in pull request #1732: URL: https://github.com/apache/lucene-solr/pull/1732#discussion_r469027938 ## File path: lucene/core/src/java/org/apache/lucene/analysis/Analyzer.java ## @@ -367,12 +367,12 @@ public void close() { /** * Original source of the tokens. */ -protected final Consumer source; Review comment: Maybe. It doesn't matter though - this changes the API of a class that's been there for ages. I bet there is a class out there somewhere (let's say A extends Analyzer) and another one (B extends A) where A overrides the getter but B reaches out for the original field. Do we want this to break just to hide a field that can be useful for subclasses, merely to silence an automatic code inspection? I don't think we should.
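The compatibility concern described above (A overrides the getter while B reaches for the protected field) can be sketched as follows; the classes are hypothetical stand-ins, not the real Analyzer hierarchy:

```java
public class FieldCompatDemo {

  /** Stand-in for the published base class with a protected field. */
  static class A {
    protected String source = "base-source"; // the field under discussion

    protected String getSource() {
      return source;
    }
  }

  /** Subclass in someone's codebase that overrides the getter. */
  static class B extends A {
    @Override
    protected String getSource() {
      return "wrapped(" + super.getSource() + ")";
    }
  }

  /**
   * Deeper subclass that reads the field directly. Making `source` private
   * in A would break this class at compile time — the source/binary
   * compatibility risk the review comment raises.
   */
  static class C extends B {
    String raw() {
      return source; // direct field access bypasses B's override
    }
  }

  public static void main(String[] args) {
    System.out.println(new C().getSource()); // wrapped(base-source)
    System.out.println(new C().raw());       // base-source
  }
}
```

The point of the sketch is that both access paths are legal today, so removing or hiding the field is an API change even though an inspection tool sees only an "unused encapsulation opportunity".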
[jira] [Commented] (SOLR-13998) Add thread safety annotation to classes
[ https://issues.apache.org/jira/browse/SOLR-13998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17176031#comment-17176031 ] Marcus Eagan commented on SOLR-13998: - Hi [~anshum] I'm working on some janitorial work in the project, and I noticed that you added annotations but did not implement them. Thus, the title is misleading, and one of the annotations added here has only been used once since you added this PR in December. I'm curious what you had in mind for the SolrSingleThreaded annotation and why you didn't actually apply the annotation anywhere. Furthermore, would you like help expanding the usage of this class, or do you feel it was hastily added and is a waste of time? I doubt the latter because there should be quite a few obvious uses for this code. However, I will defer to you since you added it. I'm looking through all the PRs of this year to create an inventory of the sort of code and behavior (in code) that I'd hope to steward the community away from. I have a few items so far, but this is one I was not totally sure about because of its sparsity and how long it had been just sitting in the repo (7 months before any usage at all). I also don't know if the way Solr operates is that we just throw tools into the toolbox, and if someone uses them one day, great; if not, someone will one day. Ideally, if we bring something to the project we ourselves would at least use it because we see value in it. I'm an outsider, learning every day, and hoping to improve the project.
[jira] [Commented] (LUCENE-8626) standardise test class naming
[ https://issues.apache.org/jira/browse/LUCENE-8626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17176013#comment-17176013 ] David Smiley commented on LUCENE-8626: -- I request that the Solr side be delayed some, and maybe get its own issue. The reason for my request is [~markrmil...@gmail.com]'s massive branch "reference_impl", which he writes about in SOLR-14636. It's unclear how/when master & this branch will get reconciled, but presently I suggest using caution when doing very wide-scale changes, _especially_ for renames or moves.
[jira] [Commented] (SOLR-13412) Make the Lucene Luke module available from a Solr distribution
[ https://issues.apache.org/jira/browse/SOLR-13412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17176006#comment-17176006 ] Tomoko Uchida commented on SOLR-13412: -- FYI: I opened LUCENE-9459 to make a getting started guide for the luke module. > Make the Lucene Luke module available from a Solr distribution > -- > > Key: SOLR-13412 > URL: https://issues.apache.org/jira/browse/SOLR-13412 > Project: Solr > Issue Type: Improvement >Reporter: Erick Erickson >Assignee: Erick Erickson >Priority: Major > Attachments: SOLR-13412.patch > > > Now that [~Tomoko Uchida] has put in a great effort to bring Luke into the > project, I think it would be good to be able to access it from a Solr distro. > I want to go to the right place under the Solr install directory and start > Luke up to examine the local indexes. > This ticket is explicitly _not_ about accessing it from the admin UI; Luke is > a stand-alone app that must be invoked on the node that has a Lucene index on > the local filesystem. > We need to > * have it included in Solr when running "ant package". > * add some bits to the ref guide on how to invoke it > ** Where to invoke it from > ** mention anything that has to be installed. > ** any other "gotchas" someone just installing Solr should be aware of. > * Ant should not be necessary. > > I'll assign this to myself to keep track of, but would not be offended in the > least if someone with more knowledge of "ant package" and the like wanted to > take it over ;) > If we can do it at all
[jira] [Updated] (LUCENE-9459) "Getting Started" guide for Luke
[ https://issues.apache.org/jira/browse/LUCENE-9459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tomoko Uchida updated LUCENE-9459: -- Description: While Luke has been a very popular, widely accepted tool among Lucene users (including Solr and Elasticsearch users) for 10+ years, it lacks good documentation and/or a user guide. Although Luke is a GUI tool that describes itself, that's not enough for new users. The lack of documentation is partly due to my laziness, though there is some inherent difficulty in explaining such a low-level tool; if you don't know Lucene you don't understand Luke's capability or usefulness, and once you understand Lucene at some level, it's obvious to you and no explanation is needed. :) Nonetheless, it would be great if we had "Getting Started" documentation for Luke on our web site for new users/devs. We may be able to have a Markdown file with some screenshots and usage descriptions, then convert it to HTML by a Gradle task, so that we can publish it with the whole API documentation. 
> "Getting Started" guide for Luke > - > > Key: LUCENE-9459 > URL: https://issues.apache.org/jira/browse/LUCENE-9459 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/luke >Reporter: Tomoko Uchida >Priority: Major > Labels: newdev > > While Luke has been very popular, widely accepted tool to Lucene users > (including Solr and Elasticsearch users) for 10+ years, it lacks good > documentation and/or user guide. Although Luke is a GUI tool that describes > itself, it's not good for new users. > The lack of documentation is partly due to my laziness though, there is some > inherent difficulty of explaining such low-level tool; if you don't know > Lucene you don't understand Luke's capability or usefulness, if you once > understand Lucene at some level, it's obvious to you and no explanation is > needed. :) > Nonetheless, it would be great if we have "Getting Started" documentation for > Luke on our web site for new users/devs. > We may be able to have a Markdown file with some screenshots and usage > descriptions, then convert it to HTML by Gradle task, so that we can publish > it with whole API documentation. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Created] (LUCENE-9459) "Getting Started" guide for Luke
Tomoko Uchida created LUCENE-9459: - Summary: "Getting Started" guide for Luke Key: LUCENE-9459 URL: https://issues.apache.org/jira/browse/LUCENE-9459 Project: Lucene - Core Issue Type: Improvement Components: modules/luke Reporter: Tomoko Uchida While Luke has been a very popular, widely accepted tool among Lucene users (including Solr and Elasticsearch users) for 10+ years, it lacks good documentation and/or a user guide. Although Luke is a GUI tool that describes itself, that's not enough for new users. The lack of documentation is partly due to my laziness, though there is some inherent difficulty in explaining such a low-level tool; if you don't know Lucene you don't understand Luke's capability or usefulness, and once you understand Lucene at some level, it's obvious to you and no explanation is needed. :) Nonetheless, it would be great if we had "Getting Started" documentation for Luke on our web site for new users/devs. We may be able to have a Markdown file with some screenshots and usage descriptions, then convert it to HTML by a Gradle task, so that we can publish it with the whole API documentation.
[GitHub] [lucene-solr] CaoManhDat commented on a change in pull request #1724: SOLR-14684: CloudExitableDirectoryReaderTest failing about 25% of the time
CaoManhDat commented on a change in pull request #1724: URL: https://github.com/apache/lucene-solr/pull/1724#discussion_r468970705 ## File path: solr/solrj/src/java/org/apache/solr/client/solrj/impl/LBSolrClient.java ## @@ -155,6 +159,7 @@ public ServerIterator(Req req, Map zombieServers) { this.req = req; this.zombieServers = zombieServers; this.timeAllowedNano = getTimeAllowedInNanos(req.getRequest()); + log.info("TimeAllowedNano:{}", this.timeAllowedNano); Review comment: Thank you, I totally forgot about that when creating this PR.
[jira] [Created] (SOLR-14730) Build new SolrJ APIs without concrete classes like NamedList/Map
Noble Paul created SOLR-14730: - Summary: Build new SolrJ APIs without concrete classes like NamedList/Map Key: SOLR-14730 URL: https://issues.apache.org/jira/browse/SOLR-14730 Project: Solr Issue Type: Improvement Security Level: Public (Default Security Level. Issues are Public) Reporter: Noble Paul We must minimize weakly typed code. Our public APIs should be programmed against interfaces and, wherever possible, use POJOs.
[jira] [Updated] (SOLR-14730) Build new SolrJ APIs without concrete classes like NamedList/Map
[ https://issues.apache.org/jira/browse/SOLR-14730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Noble Paul updated SOLR-14730: -- Labels: clean-api (was: )
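As a rough illustration of the weakly typed vs. interface-driven styles this issue contrasts — the `Map` below stands in for a NamedList-like structure, and `QueryResult` is a hypothetical interface invented for the sketch, not an actual SolrJ API:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class TypedApiDemo {

  /** Weakly typed access: every caller must know the key names and cast. */
  @SuppressWarnings("unchecked")
  static int numFoundWeaklyTyped(Map<String, Object> response) {
    Map<String, Object> results = (Map<String, Object>) response.get("response");
    return (int) results.get("numFound"); // unchecked cast, typo-prone key
  }

  /** Strongly typed alternative: the response shape lives in one interface. */
  interface QueryResult {
    int numFound();
  }

  /** Parses the raw structure once at the boundary into a typed view. */
  static QueryResult parse(Map<String, Object> raw) {
    int n = numFoundWeaklyTyped(raw); // the cast happens exactly once
    return () -> n;
  }

  public static void main(String[] args) {
    Map<String, Object> inner = new LinkedHashMap<>();
    inner.put("numFound", 42);
    Map<String, Object> raw = new LinkedHashMap<>();
    raw.put("response", inner);
    System.out.println(parse(raw).numFound()); // 42
  }
}
```

The design point matches the issue description: callers program against the interface, and the weakly typed plumbing is confined to a single parsing seam rather than scattered through user code.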
[jira] [Comment Edited] (SOLR-14687) Make child/parent query parsers natively aware of _nest_path_
[ https://issues.apache.org/jira/browse/SOLR-14687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17175086#comment-17175086 ] Chris M. Hostetter edited comment on SOLR-14687 at 8/12/20, 12:32 AM: -- besides the fact that Jira's WYSIWYG editor lied to me and munged up some of the formatting of "STAR:STAR" and "UNDERSCORE nest UNDERSCORE path UNDERSCORE" in many places, something else has been nagging that I felt like I was overlooking, and I finally figured out what it is: I hadn't really accounted for docs that _have_ a "nest path" but whose path doesn't have any common ancestors with the {{parentPath}} specified – ie: how would a mix of {{/a/b/c}} hierarchy docs mixed in an index with docs having a hierarchy of {{/x/y/z}} wind up affecting each other? I *think* that what I described above would still mostly work for the "parent" parser – even if the "parent filter" generated by a {{parentPath="/a/b/c"}} as I described above didn't really "rule out" the other docs, because they still wouldn't match the "nest path with a prefix of /a/b/c" rule for the "children", but it still wouldn't really be a "correct" "parents bit set filter" as the underlying code expects it to be in terms of identifying all "non children" documents ... but I'm _pretty sure_ it would be broken for the "child" parser case, because some doc with an "/x" or "/x/y" path isn't going to be matched by the "parents filter bitset" and so might get swallowed up in the list of children. The other thing that bugged me was the (mistaken & misguided) need to ' ... compute a list of all "prefix subpaths" ... ' – I'm not sure why I thought that was necessary, instead of just saying "must _NOT_ have a prefix of the specified path" – ie: {code:java} GIVEN:{!foo parentPath="/a/b/c"} ... 
INSTEAD OF:PARENT FILTER BITSET = ((*:* -_nest_path_:*) OR _nest_path_:(/a /a/b /a/b/c)) JUST USE:PARENT FILTER BITSET = (*:* -{prefix f="_nest_path_" v="/a/b/c/"}) {code} ...which (IIUC) should solve both problems, by matching: * docs w/o any nest path * docs with a nest path that does NOT start with /a/b/c/ ** which includes the immediate "/a/b/c" parents, as well as their ancestors, as well as any docs with completely orthogonal paths (like /x/y/z) But of course: in the case of {{parentFilter="/"}} this would still simply be "docs w/o a nest path" That should work, right? I also think I made some mistakes/typos in my examples above in trying to articulate what the equivalent "old style" query would be, so let me restate all of the examples in full... {noformat} NEW: q={!parent parentPath="/a/b/c"}c_title:son OLD: q=(+{!field f="_nest_path_" v="/a/b/c"} +{!parent which=$ff v=$vv}) ff=(*:* -{prefix f="_nest_path_" v="/a/b/c/"}) vv=(+c_title:son +{prefix f="_nest_path_" v="/a/b/c/"}) {noformat} {noformat} NEW: q={!parent parentPath="/"}c_title:son OLD: q=(-_nest_path_:* +{!parent which=$ff v=$vv} ff=(*:* -_nest_path_:*) vv=(+c_title:son +_nest_path_:*) {noformat} {noformat} NEW: q={!child parentPath="/a/b/c"}p_title:dad OLD: q={!child of=$ff v=$vv}) ff=(*:* -{prefix f="_nest_path_" v="/a/b/c/"}) vv=(+p_title:dad +{field f="_nest_path_" v="/a/b/c"}) {noformat} {noformat} NEW: q={!child parentPath="/"}p_title:dad OLD: q={!child of=$ff v=$vv}) ff=(*:* -_nest_path_:*) vv=(+p_title:dad -_nest_path_:*) {noformat} [~mkhl] - what do you think about this approach? do you see any flaws in the logic here? ... 
if the logic looks correct, I'd like to write it up as a "how to create a *safe* of/which local param when using nest path" doc tip for SOLR-14383 and move forward there as a documentation improvement, even if there are still feature/implementation/syntax concerns/discussion to happen here as far as a "new feature" goes. *EDIT*: fixed brain fart / typo of + vs - in last example
[jira] [Commented] (SOLR-14677) DIH doesnt close DataSource when import encounters errors
[ https://issues.apache.org/jira/browse/SOLR-14677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17175901#comment-17175901 ] Jason Gerlowski commented on SOLR-14677: I created a PR a few minutes ago that ensures that errors arising during close() operations on DIHWriter don't derail efforts to close the EntityProcessors and DataSources. Hoping to commit this in a few days, and then I'll propose it in the plugin repo I guess? > DIH doesnt close DataSource when import encounters errors > - > > Key: SOLR-14677 > URL: https://issues.apache.org/jira/browse/SOLR-14677 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: contrib - DataImportHandler >Affects Versions: 7.5, master (9.0) >Reporter: Jason Gerlowski >Assignee: Jason Gerlowski >Priority: Minor > Attachments: error-solr.log, no-error-solr.log > > Time Spent: 10m > Remaining Estimate: 0h > > DIH imports don't close DataSource's (which can hold db connections, etc.) in > all cases. Specifically, if an import runs into an unexpected error > forwarding processed docs to other nodes, it will neglect to close the > DataSource's when it finishes. > This problem goes back to at least 7.5. This is partially mitigated in older > versions of some DataSource implementations (e.g. JdbcDataSource) by means of > a "finalize" hook which invokes "close()" when the DataSource object is > garbage-collected. In practice, this means that resources might be held open > longer than necessary but will be closed within a few seconds or minutes by > GC. This only helps JdbcDataSource though - all other DataSource impl's risk > leaking resources. > In master/9.0, which requires a minimum of Java 11 and doesn't have the > finalize-hook, the connections are never cleaned up when an error is > encountered during DIH. DIH will likely be removed for the 9.0 release, but > if it isn't this bug should be fixed. 
-- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] gerlowskija opened a new pull request #1741: SOLR-14677: Always close DIH EntityProcessor/DataSource
gerlowskija opened a new pull request #1741: URL: https://github.com/apache/lucene-solr/pull/1741 # Description Prior to this commit, the wrapup logic at the end of DocBuilder.execute() closed out a series of DIH objects, but did so in a way that an exception closing any of them resulted in the remainder staying open. This is especially problematic since Writer.close() throws exceptions that DIH uses to determine the success/failure of the run. In practice this caused network errors sending DIH data to other Solr nodes to result in leaked JDBC connections. # Solution This PR changes DocBuilder's termination logic to handle exceptions more gracefully, ensuring that errors closing a DIHWriter (for example) don't prevent the closure of entity-processor and DataSource objects. # Tests None # Checklist Please review the following and check all that apply: - [x] I have reviewed the guidelines for [How to Contribute](https://wiki.apache.org/solr/HowToContribute) and my code conforms to the standards described there to the best of my ability. - [x] I have created a Jira issue and added the issue ID to my pull request title. - [x] I have given Solr maintainers [access](https://help.github.com/en/articles/allowing-changes-to-a-pull-request-branch-created-from-a-fork) to contribute to my PR branch. (optional but recommended) - [x] I have developed this patch against the `master` branch. - [x] I have run `ant precommit` and the appropriate test suite. - [ ] I have added tests for my changes. - [ ] I have added documentation for the [Ref Guide](https://github.com/apache/lucene-solr/tree/master/solr/solr-ref-guide) (for Solr changes only). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
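The termination logic described above (one throwing close() must not skip the remaining closes) can be sketched in plain Java, independent of the DIH classes. The {{CloseAll}} class and {{closeAll}} helper below are illustrative names, not the actual DocBuilder code; the idea is to attempt every close and carry later failures as suppressed exceptions on the first one:

```java
import java.util.ArrayList;
import java.util.List;

public class CloseAll {
    /**
     * Closes every resource in order, even if some of them throw.
     * The first exception is rethrown after all close() attempts,
     * with later failures attached as suppressed exceptions.
     */
    static void closeAll(List<? extends AutoCloseable> resources) throws Exception {
        Exception first = null;
        for (AutoCloseable r : resources) {
            try {
                r.close();
            } catch (Exception e) {
                if (first == null) {
                    first = e;
                } else {
                    first.addSuppressed(e);
                }
            }
        }
        if (first != null) {
            throw first;
        }
    }

    public static void main(String[] args) {
        List<String> closed = new ArrayList<>();
        List<AutoCloseable> resources = List.of(
            () -> closed.add("writer"),
            () -> { throw new RuntimeException("network error"); }, // stand-in for a failing DIHWriter.close()
            () -> closed.add("entityProcessor"),
            () -> closed.add("dataSource"));
        try {
            closeAll(resources);
        } catch (Exception e) {
            System.out.println("failed: " + e.getMessage());
        }
        System.out.println(closed); // the later resources were still closed
    }
}
```

With the naive {{for (r : resources) r.close();}} wrapup, the simulated writer failure would have left the entity processor and data source open, which is exactly the leaked-JDBC-connection symptom the PR targets.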
For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Assigned] (SOLR-14677) DIH doesnt close DataSource when import encounters errors
[ https://issues.apache.org/jira/browse/SOLR-14677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Gerlowski reassigned SOLR-14677: -- Assignee: Jason Gerlowski > DIH doesnt close DataSource when import encounters errors > - > > Key: SOLR-14677 > URL: https://issues.apache.org/jira/browse/SOLR-14677 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: contrib - DataImportHandler >Affects Versions: 7.5, master (9.0) >Reporter: Jason Gerlowski >Assignee: Jason Gerlowski >Priority: Minor > Attachments: error-solr.log, no-error-solr.log > > > DIH imports don't close DataSource's (which can hold db connections, etc.) in > all cases. Specifically, if an import runs into an unexpected error > forwarding processed docs to other nodes, it will neglect to close the > DataSource's when it finishes. > This problem goes back to at least 7.5. This is partially mitigated in older > versions of some DataSource implementations (e.g. JdbcDataSource) by means of > a "finalize" hook which invokes "close()" when the DataSource object is > garbage-collected. In practice, this means that resources might be held open > longer than necessary but will be closed within a few seconds or minutes by > GC. This only helps JdbcDataSource though - all other DataSource impl's risk > leaking resources. > In master/9.0, which requires a minimum of Java 11 and doesn't have the > finalize-hook, the connections are never cleaned up when an error is > encountered during DIH. DIH will likely be removed for the 9.0 release, but > if it isn't this bug should be fixed. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] dsmiley opened a new pull request #1740: LUCENE-9458: WDGF and WDF should tie-break by endOffset
dsmiley opened a new pull request #1740: URL: https://github.com/apache/lucene-solr/pull/1740 Can happen with catenateAll and not generating word xor number part when the input ends with the non-generated sub-token. https://issues.apache.org/jira/browse/LUCENE-9458 CC @jimczi maybe you could review this please; I believe you reviewed the predecessor.
[jira] [Created] (LUCENE-9458) WordDelimiterGraphFilter (and non-graph) should tie-break order using end offset
David Smiley created LUCENE-9458: Summary: WordDelimiterGraphFilter (and non-graph) should tie-break order using end offset Key: LUCENE-9458 URL: https://issues.apache.org/jira/browse/LUCENE-9458 Project: Lucene - Core Issue Type: Improvement Components: modules/analysis Reporter: David Smiley Assignee: David Smiley WordDelimiterGraphFilter and WordDelimiterFilter do not consult the end offset in their sub-token _ordering_. In the event of a tie-break, I propose the longer token come first. This usually happens already, but not always, and so this also feels like an inconsistency when you see it. This issue can be thought of as a bug fix to LUCENE-9006 or an improvement; I have no strong feelings on the issue classification. Before reading further, definitely read that issue. I see this is a problem when using CATENATE_ALL with either GENERATE_WORD_PARTS xor GENERATE_NUMBER_PARTS when the input ends with that part not being generated. Consider the input: "other-9" and let's assume we want to catenate all, generate word parts, but nothing else (not numbers). This should be tokenized in this order: "other9", "other" but today is emitted in reverse order. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
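The proposed ordering (position first, then on a tie the longer token, i.e. the greater end offset, first) can be sketched in plain Java. The {{Token}} record and {{EMIT_ORDER}} comparator below are illustrative, not Lucene's actual attribute API:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class TokenOrder {
    /** Illustrative stand-in for an analysis sub-token; not Lucene's API. */
    record Token(String text, int position, int startOffset, int endOffset) {}

    // Emit order: by position, then by start offset, and on a full tie
    // the token with the greater end offset (the longer one) comes first.
    static final Comparator<Token> EMIT_ORDER =
        Comparator.comparingInt(Token::position)
                  .thenComparingInt(Token::startOffset)
                  .thenComparing(Comparator.<Token>comparingInt(Token::endOffset).reversed());

    public static void main(String[] args) {
        // Input "other-9" with catenateAll + generateWordParts (no number parts):
        // "other9" (catenated, offsets 0-7) and "other" (word part, offsets 0-5)
        // share position 0, so the end-offset tie-break puts "other9" first.
        List<Token> tokens = new ArrayList<>(List.of(
            new Token("other", 0, 0, 5),
            new Token("other9", 0, 0, 7)));
        tokens.sort(EMIT_ORDER);
        System.out.println(tokens.get(0).text() + ", " + tokens.get(1).text()); // other9, other
    }
}
```

This reproduces the "other-9" example from the issue: without the end-offset tie-break, "other" can be emitted before the catenated "other9".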
[jira] [Commented] (SOLR-14706) Upgrading 8.6.0 to 8.6.1 causes collection creation to fail
[ https://issues.apache.org/jira/browse/SOLR-14706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17175844#comment-17175844 ] ASF subversion and git services commented on SOLR-14706: Commit dc0d049b62b2dd1e5bddc30beda526b79e4a7383 in lucene-solr's branch refs/heads/branch_8x from Houston Putman [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=dc0d049 ] SOLR-14706: Fix support for default autoscaling policy (8x forward-port) (#1739) > Upgrading 8.6.0 to 8.6.1 causes collection creation to fail > --- > > Key: SOLR-14706 > URL: https://issues.apache.org/jira/browse/SOLR-14706 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: AutoScaling >Affects Versions: 8.7, 8.6.1 > Environment: 8.6.1 upgraded from 8.6.0 with more than one node >Reporter: Gus Heck >Assignee: Houston Putman >Priority: Blocker > Fix For: 8.6.1 > > Time Spent: 2.5h > Remaining Estimate: 0h > > The following steps will reproduce a situation in which collection creation > fails with this stack trace: > {code:java} > 2020-08-03 12:17:58.617 INFO > (OverseerThreadFactory-22-thread-1-processing-n:192.168.2.106:8981_solr) [ > ] o.a.s.c.a.c.CreateCollectionCmd Create collection test861 > 2020-08-03 12:17:58.751 ERROR > (OverseerThreadFactory-22-thread-1-processing-n:192.168.2.106:8981_solr) [ > ] o.a.s.c.a.c.OverseerCollectionMessageHandler Collection: test861 operation: > create failed:org.apache.solr.common.SolrException > at > org.apache.solr.cloud.api.collections.CreateCollectionCmd.call(CreateCollectionCmd.java:347) > at > org.apache.solr.cloud.api.collections.OverseerCollectionMessageHandler.processMessage(OverseerCollectionMessageHandler.java:264) > at > org.apache.solr.cloud.OverseerTaskProcessor$Runner.run(OverseerTaskProcessor.java:517) > at > org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:212) > at > 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > Caused by: java.lang.RuntimeException: Only one extra tag supported for the > tag cores in { > "cores":"#EQUAL", > "node":"#ANY", > "strict":"false"} > at > org.apache.solr.client.solrj.cloud.autoscaling.Clause.(Clause.java:122) > at > org.apache.solr.client.solrj.cloud.autoscaling.Clause.create(Clause.java:235) > at > java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193) > at > java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1374) > at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481) > at > java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471) > at > java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708) > at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) > at > java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499) > at > org.apache.solr.client.solrj.cloud.autoscaling.Policy.(Policy.java:144) > at > org.apache.solr.client.solrj.cloud.autoscaling.AutoScalingConfig.getPolicy(AutoScalingConfig.java:372) > at > org.apache.solr.cloud.api.collections.Assign.usePolicyFramework(Assign.java:300) > at > org.apache.solr.cloud.api.collections.Assign.usePolicyFramework(Assign.java:277) > at > org.apache.solr.cloud.api.collections.Assign$AssignStrategyFactory.create(Assign.java:661) > at > org.apache.solr.cloud.api.collections.CreateCollectionCmd.buildReplicaPositions(CreateCollectionCmd.java:415) > at > org.apache.solr.cloud.api.collections.CreateCollectionCmd.call(CreateCollectionCmd.java:192) > ... 
6 more > {code} > Generalized steps: > # Deploy 8.6.0 with separate data directories, create a collection to prove > it's working > # download > https://dist.apache.org/repos/dist/dev/lucene/lucene-solr-8.6.1-RC1-reva32a3ac4e43f629df71e5ae30a3330be94b095f2/solr/solr-8.6.1.tgz > # Stop the server on all nodes > # replace the 8.6.0 with 8.6.1 > # Start the server > # via the admin UI create a collection > # Observe failure warning box (with no text), check logs, find above trace > Or more exactly here are my actual commands with a checkout of the 8.6.0 tag > in the working dir to which cloud.sh was configured: > # /cloud.sh new -r upgrademe > # Create collection named test860 via admin ui with _default > # ./cloud.sh stop > # cd upgrademe/ > # cp ../8_6_1_RC1/solr-8.6.1.tgz . > # mv solr-8.6.0
[GitHub] [lucene-solr] HoustonPutman merged pull request #1739: SOLR-14706: Fix support for default autoscaling policy (8x forward-port)
HoustonPutman merged pull request #1739: URL: https://github.com/apache/lucene-solr/pull/1739
[jira] [Created] (SOLR-14729) Investigate and harden TestExportWriter.testExpr failures
Joel Bernstein created SOLR-14729: - Summary: Investigate and harden TestExportWriter.testExpr failures Key: SOLR-14729 URL: https://issues.apache.org/jira/browse/SOLR-14729 Project: Solr Issue Type: Bug Security Level: Public (Default Security Level. Issues are Public) Reporter: Joel Bernstein TestExportWriter.testExpr is failing way too much (6.4% of the time). This ticket will fix the problem.
[jira] [Commented] (SOLR-13807) Caching for term facet counts
[ https://issues.apache.org/jira/browse/SOLR-13807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17175836#comment-17175836 ] Michael Gibney commented on SOLR-13807: --- After SOLR-13132 was merged to master, it was a bit of a challenge to reconcile with the complementary "term facet cache" (this issue). I've taken an initial stab at this and pushed to [PR #1357|https://github.com/apache/lucene-solr/pull/1357], and I think it's at the point where it's once again ready for consideration.

Below are some naive performance benchmarks, using [^SOLR-13807-benchmarks.tgz] (based on similar benchmarks for SOLR-13132). {{filterCache}} is irrelevant for what's illustrated here (all count or sweep collection, single-shard thus no refinement). I included hooks in the included scripts to easily change the filterCache size and termFacetCache size for evaluation. For purpose of {{relatedness}} evaluation, fgSet == base search result domain. All results discussed here are for single-valued string fields, but multivalued string fields are also included in the benchmark attachment (results for multi-valued didn't differ substantially from those for single-valued). There's a row for each docset domain recall percentage (percentage of \*:* domain returned by main query/fg), and a column for each field cardinality; cell values indicate latency (QTime) in ms against a single core with 3 million docs, no deletes; each value is the average of 10 repeated invocations of the relevant request (standard deviation isn't captured here, but was quite low, fwiw). 
Below are for current (including SOLR-13132) master; no caches (filterCache, if present, would be unused):
{code}
[magibney@mbp SOLR-13132-benchmarks]$ ./check.sh s count # sort-by-count, master
cdnlty:     10    100     1k    10k   100k     1m
.1%          0      0      0      0      0      4
1%           1      0      1      1      2      5
10%          7      7      8      8     10     16
20%         17     14     16     15     19     31
30%         22     19     23     20     24     42
40%         27     26     28     28     32     50
50%         33     32     35     32     38     59
99.99%      65     60     67     62     72    107

[magibney@mbp SOLR-13132-benchmarks]$ ./check.sh s true # sort-by-skg, master
cdnlty:     10    100     1k    10k   100k     1m
.1%        179    174    183    190    192    225
1%         182    177    186    183    194    236
10%        193    191    196    197    226    256
20%        206    200    207    207    234    300
30%        216    210    217    216    239    316
40%        228    225    231    231    253    331
50%        239    234    241    240    266    347
99.99%     285    280    287    287    311    403
{code}
Below are for 77daac4ae2a4d1c40652eafbbdb42b582fe2d02d (SOLR-13807), with _no_ termFacetCache configured (apples-to-apples, since there are changes in some of the hot facet code paths):
{code}
[magibney@mbp SOLR-13132-benchmarks]$ ./check.sh s count # sort-by-count, no_cache
cdnlty:     10    100     1k    10k   100k     1m
.1%          0      0      0      0      0      3
1%           1      1      1      1      1      6
10%          8      8      9      8     11     14
20%         16     15     16     15     20     32
30%         21     21     23     22     26     42
40%         28     27     31     28     34     53
50%         35     33     37     34     40     63
99.99%      68     64     71     66     74    108

[magibney@mbp SOLR-13132-benchmarks]$ ./check.sh s true # sort-by-skg, no_cache
cdnlty:     10    100     1k    10k   100k     1m
.1%         96     80     89     97     96    129
1%          88     83     90     88    101    133
10%         99     97    103    102    122    162
20%        117    107    113    113    135    194
30%        120    117    123    122    144    211
40%        130    129    134    134    156    232
50%        143    140    147    144    169    249
99.99%     179    175    181    179    201    305
{code}
Below are for 77daac4ae2a4d1c40652eafbbdb42b582fe2d02d (SOLR-13807), with {{solr.termFacetCacheSize=20}} configured. 
{code}
[magibney@mbp SOLR-13132-benchmarks]$ ./check.sh s count # sort-by-count, cache size 20
cdnlty:     10    100     1k    10k   100k     1m
.1%          0      0      0      0      0      2
1%           0      0      0      0      1     10
10%          3      4      4      4      5     16
20%          8      7      8      7      9     20
30%         11     10     12     11     13     25
40%         13     13     15     15     15     28
50%         15     16     16     18     20     32
99.99%      29     30     30     29     32     45

[magibney@mbp SOLR-13132-benchmarks]$ ./check.sh s true # sort-by-skg, cache size 20
cdnlty:     10    100     1k    10k   100k     1m
.1% 0 0
[jira] [Created] (LUCENE-9457) Why is Kuromoji tokenization throughput bimodal?
Michael McCandless created LUCENE-9457: -- Summary: Why is Kuromoji tokenization throughput bimodal? Key: LUCENE-9457 URL: https://issues.apache.org/jira/browse/LUCENE-9457 Project: Lucene - Core Issue Type: Improvement Reporter: Michael McCandless With the recent accidental regression of Japanese (Kuromoji) tokenization throughput due to exciting FST optimizations, we [added new nightly Lucene benchmarks|https://github.com/mikemccand/luceneutil/issues/64] to measure tokenization throughput for {{JapaneseTokenizer}}: [https://home.apache.org/~mikemccand/lucenebench/analyzers.html] It has already been running for ~5-6 weeks now! But for some reason, it looks bi-modal? "Normally" it is ~.45 M tokens/sec, but for two data points it dropped down to ~.33 M tokens/sec, which is odd. It could be hotspot noise maybe? But would be good to get to the root cause and fix it if possible. Hotspot noise that randomly steals ~27% of your tokenization throughput is no good!! Or does anyone have any other ideas of what could be bi-modal in Kuromoji? I don't think [this performance test|https://github.com/mikemccand/luceneutil/blob/master/src/main/perf/TestAnalyzerPerf.java] has any randomness in it... -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (SOLR-13807) Caching for term facet counts
[ https://issues.apache.org/jira/browse/SOLR-13807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Gibney updated SOLR-13807: -- Attachment: SOLR-13807-benchmarks.tgz > Caching for term facet counts > - > > Key: SOLR-13807 > URL: https://issues.apache.org/jira/browse/SOLR-13807 > Project: Solr > Issue Type: New Feature > Components: Facet Module >Affects Versions: master (9.0), 8.2 >Reporter: Michael Gibney >Priority: Minor > Attachments: SOLR-13807-benchmarks.tgz, > SOLR-13807__SOLR-13132_test_stub.patch > > Time Spent: 10m > Remaining Estimate: 0h > > Solr does not have a facet count cache; so for _every_ request, term facets > are recalculated for _every_ (facet) field, by iterating over _every_ field > value for _every_ doc in the result domain, and incrementing the associated > count. > As a result, subsequent requests end up redoing a lot of the same work, > including all associated object allocation, GC, etc. This situation could > benefit from integrated caching. > Because of the domain-based, serial/iterative nature of term facet > calculation, latency is proportional to the size of the result domain. > Consequently, one common/clear manifestation of this issue is high latency > for faceting over an unrestricted domain (e.g., {{\*:\*}}), as might be > observed on a top-level landing page that exposes facets. This type of > "static" case is often mitigated by external (to Solr) caching, either with a > caching layer between Solr and a front-end application, or within a front-end > application, or even with a caching layer between the end user and a > front-end application. > But in addition to the overhead of handling this caching elsewhere in the > stack (or, for a new user, even being aware of this as a potential issue to > mitigate), any external caching mitigation is really only appropriate for > relatively static cases like the "landing page" example described above. 
A > Solr-internal facet count cache (analogous to the {{filterCache}}) would > provide the following additional benefits: > # ease of use/out-of-the-box configuration to address a common performance > concern > # compact (specifically caching count arrays, without the extra baggage that > accompanies a naive external caching approach) > # NRT-friendly (could be implemented to be segment-aware) > # modular, capable of reusing the same cached values in conjunction with > variant requests over the same result domain (this would support common use > cases like paging, but also potentially more interesting direct uses of > facets). > # could be used for distributed refinement (i.e., if facet counts over a > given domain are cached, a refinement request could simply look up the > ordinal value for each enumerated term and directly grab the count out of the > count array that was cached during the first phase of facet calculation) > # composable (e.g., in aggregate functions that calculate values based on > facet counts across different domains, like SKG/relatedness – see SOLR-13132) -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] mikemccand commented on a change in pull request #1733: LUCENE-9450 Use BinaryDocValues in the taxonomy writer
mikemccand commented on a change in pull request #1733: URL: https://github.com/apache/lucene-solr/pull/1733#discussion_r468812527 ## File path: lucene/facet/src/java/org/apache/lucene/facet/taxonomy/directory/DirectoryTaxonomyReader.java ## @@ -323,8 +323,10 @@ public FacetLabel getPath(int ordinal) throws IOException { } } -Document doc = indexReader.document(ordinal); -FacetLabel ret = new FacetLabel(FacetsConfig.stringToPath(doc.get(Consts.FULL))); +boolean found = MultiDocValues.getBinaryValues(indexReader, Consts.FULL).advanceExact(catIDInteger); Review comment: One more idea: instead of using `MultiDocValues` sugar, I think we should use Lucene's `ReaderUtil` to quickly (binary search) determine which leaf holds this `docId`, then pull `BinaryDocValues` from that `LeafReader`? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
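The leaf lookup suggested above boils down to a binary search over the per-leaf starting doc ids (docBase), which is what Lucene's `ReaderUtil.subIndex` does. A minimal pure-Java sketch of that search (the `LeafLookup` class name and the sample `docBases` array are made up for illustration):

```java
public class LeafLookup {
    /**
     * Returns the index of the leaf containing docId, given the starting
     * doc id (docBase) of each leaf in increasing order and the total
     * maxDoc as an upper bound. Mirrors the idea behind ReaderUtil.subIndex.
     */
    static int subIndex(int docId, int[] docBases, int maxDoc) {
        int lo = 0, hi = docBases.length - 1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            // The leaf at `mid` covers doc ids [docBases[mid], end)
            int end = (mid + 1 < docBases.length) ? docBases[mid + 1] : maxDoc;
            if (docId < docBases[mid]) {
                hi = mid - 1;
            } else if (docId >= end) {
                lo = mid + 1;
            } else {
                return mid;
            }
        }
        throw new IllegalArgumentException("docId out of range: " + docId);
    }

    public static void main(String[] args) {
        // Three hypothetical leaves covering docs [0,10), [10,25), [25,40)
        int[] docBases = {0, 10, 25};
        System.out.println(subIndex(0, docBases, 40));   // 0
        System.out.println(subIndex(24, docBases, 40));  // 1
        System.out.println(subIndex(39, docBases, 40));  // 2
    }
}
```

Once the leaf index is known, the per-leaf doc id is simply `docId - docBases[leaf]`, and the `BinaryDocValues` can be pulled from that single `LeafReader` instead of going through the `MultiDocValues` wrapper.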
[GitHub] [lucene-solr] mikemccand commented on a change in pull request #1733: LUCENE-9450 Use BinaryDocValues in the taxonomy writer
mikemccand commented on a change in pull request #1733: URL: https://github.com/apache/lucene-solr/pull/1733#discussion_r468808904 ## File path: lucene/facet/src/java/org/apache/lucene/facet/taxonomy/directory/DirectoryTaxonomyReader.java ## @@ -323,8 +323,10 @@ public FacetLabel getPath(int ordinal) throws IOException { } } -Document doc = indexReader.document(ordinal); -FacetLabel ret = new FacetLabel(FacetsConfig.stringToPath(doc.get(Consts.FULL))); +boolean found = MultiDocValues.getBinaryValues(indexReader, Consts.FULL).advanceExact(catIDInteger); Review comment: OK, I see one issue -- you are pulling a new `BinaryDocValues`, calling `.advanceExact` on it (good), but then pulling a new `BinaryDocValues` below and not calling `.advanceExact` on it. I think you must add a new local variable, e.g. `BinaryDocValues values`. Pull it once (using the `MultiDocValues.getBinaryValues` sugar API). Then call `.advanceExact` on that and assert it succeeded. Finally, use that same `values` instance (now that it has advanced to the right `docId`) to call `.binaryValue().utf8ToString()`. I think that should fix the `NPE`? This is misuse of the API for the default Lucene Codec for `BinaryDocValues`, since you were calling `.binaryValue()` before `.advanceExact()`. It is somewhat disappointing that the codec threw a confusing `NPE` and not a clearer (best effort) exception stating that you must first call `.advanceExact`. Maybe we could improve the default Codec? (Though, not if that would hurt performance of correct usage). OK I see: the `NPE` is because of `MultiDocValues.currentValues` is `null` since `.advanceExact` was not yet called. Maybe we could add an `assert` there, confirming `.advanceExact` was indeed called and had returned `true`? It would have made debugging this easier, and should not hurt performance when assertions are disabled ... 
## File path: lucene/facet/src/java/org/apache/lucene/facet/taxonomy/directory/DirectoryTaxonomyReader.java ## @@ -323,8 +323,10 @@ public FacetLabel getPath(int ordinal) throws IOException { } } -Document doc = indexReader.document(ordinal); -FacetLabel ret = new FacetLabel(FacetsConfig.stringToPath(doc.get(Consts.FULL))); +boolean found = MultiDocValues.getBinaryValues(indexReader, Consts.FULL).advanceExact(catIDInteger); Review comment: I think instead of the boxed `Integer catIDInteger` we should pass the `int ordinal` to `.advanceExact(...)`? Not the cause of the `NPE`, just cleaner. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-13972) Insecure Solr should generate startup warning
[ https://issues.apache.org/jira/browse/SOLR-13972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17175803#comment-17175803 ] Jason Gerlowski commented on SOLR-13972: Reopening based on the mailing list thread Jan referenced above. I'll take care of this this week. > Insecure Solr should generate startup warning > - > > Key: SOLR-13972 > URL: https://issues.apache.org/jira/browse/SOLR-13972 > Project: Solr > Issue Type: Bug >Reporter: Ishan Chattopadhyaya >Assignee: Jason Gerlowski >Priority: Critical > Fix For: master (9.0), 8.4 > > Time Spent: 2h 10m > Remaining Estimate: 0h > > Warning to the effect of, start Solr with: "solr auth enable -credentials > solr:foo -blockUnknown true” (or some other way to achieve the same effect) > if you want to expose this Solr instance directly to users. Maybe the link to > the ref guide discussing all this might be in good measure here. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (SOLR-13972) Insecure Solr should generate startup warning
[ https://issues.apache.org/jira/browse/SOLR-13972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Gerlowski updated SOLR-13972: --- Status: Reopened (was: Closed) > Insecure Solr should generate startup warning > - > > Key: SOLR-13972 > URL: https://issues.apache.org/jira/browse/SOLR-13972 > Project: Solr > Issue Type: Bug >Reporter: Ishan Chattopadhyaya >Assignee: Jason Gerlowski >Priority: Critical > Fix For: master (9.0), 8.4 > > Time Spent: 2h 10m > Remaining Estimate: 0h > > Warning to the effect of, start Solr with: "solr auth enable -credentials > solr:foo -blockUnknown true” (or some other way to achieve the same effect) > if you want to expose this Solr instance directly to users. Maybe the link to > the ref guide discussing all this might be in good measure here. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] HoustonPutman opened a new pull request #1739: SOLR-14706: Fix support for default autoscaling policy (8x forward-port)
HoustonPutman opened a new pull request #1739: URL: https://github.com/apache/lucene-solr/pull/1739 forward-porting #1716 for https://issues.apache.org/jira/browse/SOLR-14706
[GitHub] [lucene-solr] jimczi commented on a change in pull request #1725: LUCENE-9449 Skip docs with _doc sort and "after"
jimczi commented on a change in pull request #1725: URL: https://github.com/apache/lucene-solr/pull/1725#discussion_r468774381 ## File path: lucene/core/src/java/org/apache/lucene/search/FieldValueHitQueue.java ## @@ -160,18 +160,20 @@ private FieldValueHitQueue(SortField[] fields, int size, boolean filterNonCompet * The number of hits to retain. Must be greater than zero. * @param filterNonCompetitiveDocs *{@code true} If comparators should be allowed to filter non-competitive documents, {@code false} otherwise + * @param hasAfter + *{@code true} If this sort has "after" FieldDoc */ public static FieldValueHitQueue create(SortField[] fields, int size, - boolean filterNonCompetitiveDocs) { + boolean filterNonCompetitiveDocs, boolean hasAfter) { Review comment: Can we avoid adding `hasAfter` here ? See my comment below. ## File path: lucene/core/src/java/org/apache/lucene/search/FilteringDocLeafComparator.java ## @@ -0,0 +1,157 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.lucene.search; + +import org.apache.lucene.index.LeafReaderContext; + +import java.io.IOException; + +/** + * This comparator is used when there is sort by _doc asc together with "after" FieldDoc. 
+ * The comparator provides an iterator that can quickly skip to the desired "after" document. + */ +public class FilteringDocLeafComparator implements FilteringLeafFieldComparator { +private final FieldComparator.DocComparator in; +private DocIdSetIterator topValueIterator; // iterator that starts from topValue if possible +private final int minDoc; +private final int maxDoc; +private final int docBase; +private boolean iteratorUpdated = false; + +public FilteringDocLeafComparator(LeafFieldComparator in, LeafReaderContext context) { Review comment: Can we force the `in` to be a `FieldComparator.DocComparator` ? ## File path: lucene/core/src/java/org/apache/lucene/search/FilteringFieldComparator.java ## @@ -68,10 +68,12 @@ public int compareValues(T first, T second) { * @param comparator – comparator to wrap * @param reverse – if this sort is reverse * @param singleSort – true if this sort is based on a single field and there are no other sort fields for tie breaking + * @param hasAfter – true if this sort has after FieldDoc * @return comparator wrapped as a filtering comparator or the original comparator if the filtering functionality * is not implemented for it */ - public static FieldComparator wrapToFilteringComparator(FieldComparator comparator, boolean reverse, boolean singleSort) { + public static FieldComparator wrapToFilteringComparator(FieldComparator comparator, boolean reverse, boolean singleSort, + boolean hasAfter) { Review comment: Do we really need to add the `hasAfter` ? Can we check the if the `topValue` in the DocComparator is greater than 0 instead ? ## File path: lucene/core/src/java/org/apache/lucene/search/FieldValueHitQueue.java ## @@ -121,7 +121,7 @@ protected boolean lessThan(final Entry hitA, final Entry hitB) { } // prevent instantiation and extension. 
- private FieldValueHitQueue(SortField[] fields, int size, boolean filterNonCompetitiveDocs) { + private FieldValueHitQueue(SortField[] fields, int size, boolean filterNonCompetitiveDocs, boolean hasAfter) { Review comment: Not sure that `hasAfter` is really needed here.
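jimczi's suggestion above — deriving the "after" state from the comparator itself instead of threading a `hasAfter` flag through `FieldValueHitQueue.create` — can be sketched roughly as follows. The class and method names are illustrative only, not the actual Lucene `FieldComparator.DocComparator` API.

```java
// Illustrative sketch, not real Lucene code: setTopValue() is invoked when a
// searchAfter FieldDoc is supplied, so its presence is enough to decide whether
// skipping past the "after" doc applies -- no extra hasAfter parameter needed.
public class DocComparatorSketch {
    private int topValue = 0;       // global doc id of the "after" document
    private boolean topValueSet = false;

    public void setTopValue(int value) {
        this.topValue = value;
        this.topValueSet = true;
    }

    // The "check if topValue is greater than 0" idea from the review: true only
    // when an "after" doc was supplied and there is something to skip past.
    public boolean hasAfter() {
        return topValueSet && topValue > 0;
    }
}
```

With this shape, `create` keeps its original signature and the filtering wrapper asks the comparator directly.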
[jira] [Commented] (SOLR-13412) Make the Lucene Luke module available from a Solr distribution
[ https://issues.apache.org/jira/browse/SOLR-13412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17175750#comment-17175750 ] Tomoko Uchida commented on SOLR-13412: -- [~epugh] : {quote}I think that the real issue here is that most of our users don't know that Luke exists, and that it's a powerful tool. What if we kept Luke as a standalone artifact, and instead talked about Luke in the Solr Ref Guide? We mention the Luke request handler on [https://lucene.apache.org/solr/guide/8_6/implicit-requesthandlers.html], that could also link to a page with more details on Luke and where to download it from? Which reminds me we should add the word Luke to the Ref Guide glossary page! {quote} {quote}I just poked around the lucene.apache.org site, and there is no mention of Luke anywhere... {quote} Thank you for pointing that out specifically. Indeed, documentation and a user guide are the most powerful promotion tools, and Luke currently lacks both. Although Luke is a GUI tool that largely explains itself, that is not enough for new users. I'm partly responsible for that - I once created an issue [https://github.com/DmitryKey/luke/issues/116] and then abandoned it. I have always thought we should have "getting started" documentation for Luke on our web site so that we can link to it from the Solr Ref Guide or anywhere else. If you have any ideas, please feel free to share them and open an issue if needed :) > Make the Lucene Luke module available from a Solr distribution > -- > > Key: SOLR-13412 > URL: https://issues.apache.org/jira/browse/SOLR-13412 > Project: Solr > Issue Type: Improvement >Reporter: Erick Erickson >Assignee: Erick Erickson >Priority: Major > Attachments: SOLR-13412.patch > > > Now that [~Tomoko Uchida] has put in a great effort to bring Luke into the > project, I think it would be good to be able to access it from a Solr distro.
> I want to go to the right place under the Solr install directory and start > Luke up to examine the local indexes. > This ticket is explicitly _not_ about accessing it from the admin UI, Luke is > a stand-alone app that must be invoked on the node that has a Lucene index on > the local filesystem > We need to > * have it included in Solr when running "ant package". > * add some bits to the ref guide on how to invoke > ** Where to invoke it from > ** mention anything that has to be installed. > ** any other "gotchas" someone just installing Solr should be aware of. > * Ant should not be necessary. > * > > I'll assign this to myself to keep track of, but would not be offended in the > least if someone with more knowledge of "ant package" and the like wanted to > take it over ;) > If we can do it at all -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] madrob commented on a change in pull request #1726: SOLR-14722: timeAllowed should track from req creation
madrob commented on a change in pull request #1726: URL: https://github.com/apache/lucene-solr/pull/1726#discussion_r468762245 ## File path: solr/core/src/java/org/apache/solr/search/SolrQueryTimeoutImpl.java ## @@ -67,8 +69,21 @@ public boolean shouldExit() { } /** - * Method to set the time at which the timeOut should happen. - * @param timeAllowed set the time at which this thread should timeout. + * Sets or clears the time allowed based on how much time remains from the start of the request plus the configured + * {@link CommonParams#TIME_ALLOWED}. + */ + public static void set(SolrQueryRequest req) { +long timeAllowed = req.getParams().getLong(CommonParams.TIME_ALLOWED, -1L); +if (timeAllowed >= 0L) { Review comment: https://github.com/apache/lucene-solr/pull/1726/files#diff-d8beef800870f194d61993d701fd9cc2L77 has `>=` https://github.com/apache/lucene-solr/pull/1726/files#diff-65e9f3712efc1ec962ea82a04a1d7aa1L104 has `>` Either way, please update the docs at https://github.com/apache/lucene-solr/blob/master/solr/solrj/src/java/org/apache/solr/common/params/CommonParams.java#L162 because they are absolutely wrong (`>= 0` means no timeout???) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
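The change under review measures `timeAllowed` against the request's creation time rather than the moment the timeout is armed, so time spent queued still counts against the budget. A minimal sketch of that bookkeeping, with assumed names (the real implementation lives in `SolrQueryTimeoutImpl` and reads the start time from the `SolrQueryRequest`):

```java
import java.util.concurrent.TimeUnit;

// Sketch only: deadline accounting for timeAllowed measured from request
// creation. Field and method names are assumptions, not Solr's actual API.
public class QueryTimeoutSketch {
    private Long timeoutAtNanos; // null = no timeout configured

    /**
     * @param requestStartNanos when the request object was created
     * @param timeAllowedMs     value of the timeAllowed param; negative = unset
     * @param nowNanos          current time
     */
    public void set(long requestStartNanos, long timeAllowedMs, long nowNanos) {
        if (timeAllowedMs >= 0L) {
            // time already spent (e.g. waiting in a queue) is subtracted from the budget
            long elapsedNanos = nowNanos - requestStartNanos;
            timeoutAtNanos = nowNanos + TimeUnit.MILLISECONDS.toNanos(timeAllowedMs) - elapsedNanos;
        } else {
            timeoutAtNanos = null; // clear: no timeout for this request
        }
    }

    public boolean shouldExit(long nowNanos) {
        return timeoutAtNanos != null && nowNanos >= timeoutAtNanos;
    }
}
```

Note how a negative `timeAllowed` clears the deadline, which is why the boundary semantics of `>=` vs `>` flagged in the review matter.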
[GitHub] [lucene-solr] atris commented on a change in pull request #1737: SOLR-14615: Implement CPU Utilization Based Circuit Breaker
atris commented on a change in pull request #1737: URL: https://github.com/apache/lucene-solr/pull/1737#discussion_r468756781 ## File path: solr/core/src/java/org/apache/solr/core/SolrConfig.java ## @@ -811,10 +818,18 @@ private void initLibs(SolrResourceLoader loader, boolean isConfigsetTrusted) { loader.reloadLuceneSPI(); } - private void validateMemoryBreakerThreshold() { + private void validateCircuitBreakerThresholds() { if (useCircuitBreakers) { - if (memoryCircuitBreakerThresholdPct > 95 || memoryCircuitBreakerThresholdPct < 50) { -throw new IllegalArgumentException("Valid value range of memoryCircuitBreakerThresholdPct is 50 - 95"); + if (isMemoryCircuitBreakerEnabled) { +if (memoryCircuitBreakerThresholdPct > 95 || memoryCircuitBreakerThresholdPct < 50) { + throw new IllegalArgumentException("Valid value range of memoryCircuitBreakerThresholdPct is 50 - 95"); +} + } + + if (isCpuCircuitBreakerEnabled) { +if (cpuCircuitBreakerThresholdPct > 95 || cpuCircuitBreakerThresholdPct < 40) { + throw new IllegalArgumentException("Valid value range for cpuCircuitBreakerThresholdPct is 40 - 95"); Review comment: I see values between 0 - 100. Ran stress locally and validated values. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
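For context on the 0–100 range discussed above: `OperatingSystemMXBean.getSystemLoadAverage()` returns the one-minute system load average (roughly the number of runnable tasks, e.g. 2.0 on a saturated 2-core machine), not a percentage, and returns a negative value on platforms where it is unavailable. A hedged sketch of normalizing it for comparison against a percentage threshold — the helper is an illustration, not the PR's actual code:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

// Sketch: getSystemLoadAverage() is not on a 0-100 scale; dividing by the
// processor count yields a value comparable to a percentage threshold.
public class CpuLoadSketch {
    /** Returns load as a percentage of available processors, or -1 if unavailable. */
    public static double loadPct(double loadAverage, int processors) {
        if (loadAverage < 0 || processors <= 0) {
            return -1; // platform does not report a load average
        }
        return 100.0 * loadAverage / processors;
    }

    public static void main(String[] args) {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        System.out.println(loadPct(os.getSystemLoadAverage(), os.getAvailableProcessors()));
    }
}
```

The normalized value can exceed 100 when the run queue is deeper than the core count, which a validation range like 40–95 implicitly assumes away.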
[GitHub] [lucene-solr] uschindler commented on pull request #1593: LUCENE-9409: Check file lengths before creating slices.
uschindler commented on pull request #1593: URL: https://github.com/apache/lucene-solr/pull/1593#issuecomment-672106735 I am fine to fix the test. Sure you have to first figure out why the index is out of bounds, and the exact exception may be misleading, but that's actually what's happening here. If you want other exceptions, another fix would be to enforce the IO layer to have a meaningful exception and implement it for all directory implementations. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
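One possible shape of the "meaningful exception in the IO layer" alternative mentioned above: validate slice bounds eagerly when the slice is created, so the failure names the file and offsets instead of surfacing later as a bare out-of-bounds. This is a sketch with assumed names, not the actual Lucene `Directory`/`IndexInput` API.

```java
import java.io.EOFException;

// Sketch: eager bounds check for a slice over a file of known length, with an
// exception message that carries enough context to debug the caller.
public class SliceCheckSketch {
    public static void checkSlice(String name, long offset, long length, long fileLength)
            throws EOFException {
        if (offset < 0 || length < 0 || offset + length > fileLength) {
            throw new EOFException("slice(offset=" + offset + ", length=" + length
                + ") is out of bounds for \"" + name + "\" (fileLength=" + fileLength + ")");
        }
    }
}
```

Enforcing such a check uniformly would mean implementing it in every directory implementation, which is the cost uschindler weighs against simply fixing the test's expectations.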
[GitHub] [lucene-solr] atris merged pull request #1736: Harden RequestRateLimiter Tests
atris merged pull request #1736: URL: https://github.com/apache/lucene-solr/pull/1736 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] jpountz commented on pull request #1593: LUCENE-9409: Check file lengths before creating slices.
jpountz commented on pull request #1593: URL: https://github.com/apache/lucene-solr/pull/1593#issuecomment-672097226 I repurposed this PR to instead make the test expect out-of-bounds exceptions. Does it look better to you @rmuir @uschindler ? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Comment Edited] (LUCENE-9379) Directory based approach for index encryption
[ https://issues.apache.org/jira/browse/LUCENE-9379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17175699#comment-17175699 ] Rajeswari Natarajan edited comment on LUCENE-9379 at 8/11/20, 4:59 PM: --- [~bruno.roustant] and [~dsmiley] , if we go with implicit router, shard management/rebalancing/routing becomes manual. Solrcloud will not take care of these (In solr mailing lists always I see users are advised against taking this route) , so looking to see if encryption possible with composite id router and multiple tenants per collection . We might have around 3000+ collections going forward , so having one collection per tenant will make our cluster really heavy. Please share your thoughts and if anyone has attempted this kind of encryption was (Author: raji): [~bruno.roustant] and [~dsmiley] , if we go with implicit router, shard management/rebalancing/routing becomes manual. Solrcloud will not take care of these (In solr mailing lists always I see users are advised against taking this route_ , so looking to see if encryption possible with composite id router and multiple tenants per collection . We might have around 3000+ collections going forward , so having one collection per tenant will make our cluster really heavy. Please share your thoughts and if anyone has attempted this kind of encryption > Directory based approach for index encryption > - > > Key: LUCENE-9379 > URL: https://issues.apache.org/jira/browse/LUCENE-9379 > Project: Lucene - Core > Issue Type: New Feature >Reporter: Bruno Roustant >Assignee: Bruno Roustant >Priority: Major > Time Spent: 2h 20m > Remaining Estimate: 0h > > +Important+: This Lucene Directory wrapper approach is to be considered only > if an OS level encryption is not possible. OS level encryption better fits > Lucene usage of OS cache, and thus is more performant. > But there are some use-case where OS level encryption is not possible. This > Jira issue was created to address those. 
> > > The goal is to provide optional encryption of the index, with a scope limited > to an encryptable Lucene Directory wrapper. > Encryption is at rest on disk, not in memory. > This simple approach should fit any Codec as it would be orthogonal, without > modifying APIs as much as possible. > Use a standard encryption method. Limit perf/memory impact as much as > possible. > Determine how callers provide encryption keys. They must not be stored on > disk. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] atris commented on a change in pull request #1737: SOLR-14615: Implement CPU Utilization Based Circuit Breaker
atris commented on a change in pull request #1737: URL: https://github.com/apache/lucene-solr/pull/1737#discussion_r468728850 ## File path: solr/core/src/java/org/apache/solr/util/circuitbreaker/CPUCircuitBreaker.java ## @@ -0,0 +1,108 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.solr.util.circuitbreaker; + +import java.lang.invoke.MethodHandles; +import java.lang.management.ManagementFactory; +import java.lang.management.OperatingSystemMXBean; + +import org.apache.solr.core.SolrConfig; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +/** + * + * Tracks current CPU usage and triggers if the specified threshold is breached. + * + * This circuit breaker gets the average CPU load over the last minute and uses + * that data to take a decision. Ideally, we should be able to cache the value + * locally and only query once the minute has elapsed. However, that will introduce + * more complexity than the current structure and might not get us major performance + * wins. If this ever becomes a performance bottleneck, that can be considered. 
+ * + * + * + * The configuration to define which mode to use and the trigger threshold are defined in + * solrconfig.xml + * + */ +public class CPUCircuitBreaker extends CircuitBreaker { + private static final Logger log = LoggerFactory.getLogger(MethodHandles.lookup().lookupClass()); + private static final OperatingSystemMXBean operatingSystemMXBean = ManagementFactory.getOperatingSystemMXBean(); + + private final boolean isCpuCircuitBreakerEnabled; + private final double cpuUsageThreshold; + + // Assumption -- the value of these parameters will be set correctly before invoking getDebugInfo() + private final ThreadLocal seenCPUUsage = new ThreadLocal<>(); + private final ThreadLocal allowedCPUUsage = new ThreadLocal<>(); + + public CPUCircuitBreaker(SolrConfig solrConfig) { +super(solrConfig); + +this.isCpuCircuitBreakerEnabled = solrConfig.isCpuCircuitBreakerEnabled; +this.cpuUsageThreshold = solrConfig.cpuCircuitBreakerThresholdPct; + } + + @Override + public boolean isTripped() { +if (!isEnabled()) { + return false; +} + +if (!isCpuCircuitBreakerEnabled) { + return false; +} + +double localAllowedCPUUsage = getCpuUsageThreshold(); +double localSeenCPUUsage = calculateLiveCPUUsage(); + +if (localSeenCPUUsage < 0) { + if (log.isWarnEnabled()) { +String msg = "Unable to get CPU usage"; + +log.warn(msg); + +return false; Review comment: Good catch, thanks. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
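The "good catch" above refers to the quoted diff placing `return false` inside the `log.isWarnEnabled()` guard, which would make the breaker's decision depend on the configured log level. A minimal sketch of the corrected control flow (names assumed, not the PR's final code):

```java
// Sketch: the bail-out must happen regardless of whether warn logging is enabled.
public class CpuBreakerGuardSketch {
    public static boolean isTripped(double seenCpuUsage, double threshold, boolean warnEnabled) {
        if (seenCpuUsage < 0) {
            if (warnEnabled) {
                System.err.println("Unable to get CPU usage");
            }
            return false; // outside the logging guard: never trip on missing data
        }
        return seenCpuUsage >= threshold;
    }
}
```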
[GitHub] [lucene-solr] atris commented on a change in pull request #1736: Harden RequestRateLimiter Tests
atris commented on a change in pull request #1736: URL: https://github.com/apache/lucene-solr/pull/1736#discussion_r468727485 ## File path: solr/core/src/test/org/apache/solr/servlet/TestRequestRateLimiter.java ## @@ -48,7 +47,7 @@ public static void setupCluster() throws Exception { configureCluster(1).addConfig(FIRST_COLLECTION, configset("cloud-minimal")).configure(); } - @Test + @Nightly Review comment: Updated This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9379) Directory based approach for index encryption
[ https://issues.apache.org/jira/browse/LUCENE-9379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17175699#comment-17175699 ] Rajeswari Natarajan commented on LUCENE-9379: - [~bruno.roustant] and [~dsmiley] , if we go with the implicit router, shard management/rebalancing/routing becomes manual. SolrCloud will not take care of these (on the Solr mailing lists I always see users advised against taking this route), so we are looking to see if encryption is possible with the composite id router and multiple tenants per collection. We might have around 3000+ collections going forward, so having one collection per tenant would make our cluster really heavy. Please share your thoughts, and let us know if anyone has attempted this kind of encryption. > Directory based approach for index encryption > - > > Key: LUCENE-9379 > URL: https://issues.apache.org/jira/browse/LUCENE-9379 > Project: Lucene - Core > Issue Type: New Feature >Reporter: Bruno Roustant >Assignee: Bruno Roustant >Priority: Major > Time Spent: 2h 20m > Remaining Estimate: 0h > > +Important+: This Lucene Directory wrapper approach is to be considered only > if an OS level encryption is not possible. OS level encryption better fits > Lucene usage of OS cache, and thus is more performant. > But there are some use-case where OS level encryption is not possible. This > Jira issue was created to address those. > > > The goal is to provide optional encryption of the index, with a scope limited > to an encryptable Lucene Directory wrapper. > Encryption is at rest on disk, not in memory. > This simple approach should fit any Codec as it would be orthogonal, without > modifying APIs as much as possible. > Use a standard encryption method. Limit perf/memory impact as much as > possible. > Determine how callers provide encryption keys. They must not be stored on > disk.
-- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] madrob commented on a change in pull request #1736: Harden RequestRateLimiter Tests
madrob commented on a change in pull request #1736: URL: https://github.com/apache/lucene-solr/pull/1736#discussion_r468724755 ## File path: solr/core/src/test/org/apache/solr/servlet/TestRequestRateLimiter.java ## @@ -66,20 +65,19 @@ public void testConcurrentQueries() throws Exception { solrDispatchFilter.replaceRateLimitManager(rateLimitManager); -processTest(client); +processTest(client, 1 /* number of documents */, 350 /* number of queries */); MockRequestRateLimiter mockQueryRateLimiter = (MockRequestRateLimiter) rateLimitManager.getRequestRateLimiter(SolrRequest.SolrRequestType.QUERY); -assertEquals(25, mockQueryRateLimiter.incomingRequestCount.get()); -assertTrue("Incoming accepted new request count did not match. Expected 5 incoming " + mockQueryRateLimiter.acceptedNewRequestCount.get(), -mockQueryRateLimiter.acceptedNewRequestCount.get() < 25); -assertTrue("Incoming rejected new request count did not match. Expected 20 incoming " + mockQueryRateLimiter.rejectedRequestCount.get(), -mockQueryRateLimiter.rejectedRequestCount.get() > 0); +assertEquals(350, mockQueryRateLimiter.incomingRequestCount.get()); + +assertTrue((mockQueryRateLimiter.acceptedNewRequestCount.get() == mockQueryRateLimiter.incomingRequestCount.get() Review comment: yea I was looking at the wrong side of the diff This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] madrob commented on a change in pull request #1736: Harden RequestRateLimiter Tests
madrob commented on a change in pull request #1736: URL: https://github.com/apache/lucene-solr/pull/1736#discussion_r468721648 ## File path: solr/core/src/test/org/apache/solr/servlet/TestRequestRateLimiter.java ## @@ -48,7 +47,7 @@ public static void setupCluster() throws Exception { configureCluster(1).addConfig(FIRST_COLLECTION, configset("cloud-minimal")).configure(); } - @Test + @Nightly Review comment: s/Nightly/Test This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] dsmiley merged pull request #1735: LUCENE spell: Implement SuggestWord.toString
dsmiley merged pull request #1735: URL: https://github.com/apache/lucene-solr/pull/1735 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] madrob commented on a change in pull request #1728: SOLR-14596: equals/hashCode for common SolrRequest classes
madrob commented on a change in pull request #1728: URL: https://github.com/apache/lucene-solr/pull/1728#discussion_r468716670 ## File path: solr/solrj/src/java/org/apache/solr/client/solrj/SolrRequest.java ## @@ -244,6 +247,8 @@ public String getBasePath() { public void addHeader(String key, String value) { if (headers == null) { headers = new HashMap<>(); + final HashMap asdf = new HashMap<>(); Review comment: what? ## File path: solr/solrj/src/test/org/apache/solr/client/solrj/request/TestUpdateRequest.java ## @@ -17,52 +17,164 @@ package org.apache.solr.client.solrj.request; import java.util.Arrays; +import java.util.List; +import com.google.common.collect.Lists; import org.apache.solr.common.SolrInputDocument; import org.junit.Before; import org.junit.Rule; import org.junit.Test; import org.junit.rules.ExpectedException; +import static org.apache.solr.SolrTestCaseJ4.adoc; +import static org.junit.Assert.assertEquals; +import static org.junit.Assert.assertNotEquals; + public class TestUpdateRequest { @Rule public ExpectedException exception = ExpectedException.none(); @Before public void expectException() { -exception.expect(NullPointerException.class); -exception.expectMessage("Cannot add a null SolrInputDocument"); } @Test public void testCannotAddNullSolrInputDocument() { +exception.expect(NullPointerException.class); +exception.expectMessage("Cannot add a null SolrInputDocument"); + UpdateRequest req = new UpdateRequest(); req.add((SolrInputDocument) null); } @Test public void testCannotAddNullDocumentWithOverwrite() { +exception.expect(NullPointerException.class); +exception.expectMessage("Cannot add a null SolrInputDocument"); + UpdateRequest req = new UpdateRequest(); req.add(null, true); } @Test public void testCannotAddNullDocumentWithCommitWithin() { +exception.expect(NullPointerException.class); +exception.expectMessage("Cannot add a null SolrInputDocument"); + UpdateRequest req = new UpdateRequest(); req.add(null, 1); } @Test public void 
testCannotAddNullDocumentWithParameters() { +exception.expect(NullPointerException.class); +exception.expectMessage("Cannot add a null SolrInputDocument"); + UpdateRequest req = new UpdateRequest(); req.add(null, 1, true); } @Test public void testCannotAddNullDocumentAsPartOfList() { +exception.expect(NullPointerException.class); +exception.expectMessage("Cannot add a null SolrInputDocument"); + UpdateRequest req = new UpdateRequest(); req.add(Arrays.asList(new SolrInputDocument(), new SolrInputDocument(), null)); } + @Test + public void testEqualsMethod() { +final SolrInputDocument doc1 = new SolrInputDocument("id", "1", "value_s", "foo"); +final SolrInputDocument doc2 = new SolrInputDocument("id", "2", "value_s", "bar"); +final SolrInputDocument doc3 = new SolrInputDocument("id", "3", "value_s", "baz"); +/* Review comment: left over from other testing? ## File path: solr/solrj/src/test/org/apache/solr/client/solrj/request/TestUpdateRequest.java ## @@ -17,52 +17,164 @@ package org.apache.solr.client.solrj.request; import java.util.Arrays; +import java.util.List; +import com.google.common.collect.Lists; import org.apache.solr.common.SolrInputDocument; import org.junit.Before; import org.junit.Rule; import org.junit.Test; import org.junit.rules.ExpectedException; +import static org.apache.solr.SolrTestCaseJ4.adoc; +import static org.junit.Assert.assertEquals; +import static org.junit.Assert.assertNotEquals; + public class TestUpdateRequest { @Rule public ExpectedException exception = ExpectedException.none(); @Before public void expectException() { -exception.expect(NullPointerException.class); Review comment: remove the whole method? 
## File path: solr/solrj/src/java/org/apache/solr/client/solrj/ResponseParser.java ## @@ -49,4 +52,31 @@ public String getVersion() { return "2.2"; } + + @Override + public int hashCode() { +return new HashCodeBuilder() +.append(getWriterType()) +.append(getContentType()) +.append(getVersion()) +.toHashCode(); + } + + @Override + public boolean equals(Object rhs) { +if (rhs == null || getClass() != rhs.getClass()) { + return false; +} else if (this == rhs) { + return true; +} else if (hashCode() != rhs.hashCode()) { + return false; +} + +final ResponseParser rhsCast = (ResponseParser) rhs; Review comment: I think I prefer Objects.hash, but I'm not sure why? Definitely willing to be convinced the other way if there's a reason or a difference or even if they're equivalent and there is already inertia here. This is a
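On the `Objects.hash` question above: the two approaches are behaviorally equivalent for this purpose — both fold field hashes into one value, differing only in mixing constants — so the trade-off is mostly dependency weight (`HashCodeBuilder` comes from commons-lang) versus the small varargs-boxing cost of `Objects.hash`. A sketch on an illustrative class, not the real `ResponseParser`:

```java
import java.util.Objects;

// Illustrative equals/hashCode pair using only JDK helpers.
public class ParserSketch {
    private final String writerType;
    private final String version;

    public ParserSketch(String writerType, String version) {
        this.writerType = writerType;
        this.version = version;
    }

    @Override
    public int hashCode() {
        return Objects.hash(writerType, version);
    }

    @Override
    public boolean equals(Object rhs) {
        if (this == rhs) return true;
        if (rhs == null || getClass() != rhs.getClass()) return false;
        // The diff's "hashCode() != rhs.hashCode() -> not equal" shortcut is
        // correct but only pays off when hash codes are cached; equal hashes
        // still require the field-by-field comparison below.
        ParserSketch other = (ParserSketch) rhs;
        return Objects.equals(writerType, other.writerType)
            && Objects.equals(version, other.version);
    }
}
```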
[GitHub] [lucene-solr] madrob commented on a change in pull request #1737: SOLR-14615: Implement CPU Utilization Based Circuit Breaker
madrob commented on a change in pull request #1737: URL: https://github.com/apache/lucene-solr/pull/1737#discussion_r468711266 ## File path: solr/core/src/java/org/apache/solr/util/circuitbreaker/CPUCircuitBreaker.java ## @@ -0,0 +1,108 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.solr.util.circuitbreaker; + +import java.lang.invoke.MethodHandles; +import java.lang.management.ManagementFactory; +import java.lang.management.OperatingSystemMXBean; + +import org.apache.solr.core.SolrConfig; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +/** + * + * Tracks current CPU usage and triggers if the specified threshold is breached. + * + * This circuit breaker gets the average CPU load over the last minute and uses + * that data to take a decision. Ideally, we should be able to cache the value + * locally and only query once the minute has elapsed. However, that will introduce + * more complexity than the current structure and might not get us major performance + * wins. If this ever becomes a performance bottleneck, that can be considered. 
+ * + * + * + * The configuration to define which mode to use and the trigger threshold are defined in + * solrconfig.xml + * + */ +public class CPUCircuitBreaker extends CircuitBreaker { + private static final Logger log = LoggerFactory.getLogger(MethodHandles.lookup().lookupClass()); + private static final OperatingSystemMXBean operatingSystemMXBean = ManagementFactory.getOperatingSystemMXBean(); + + private final boolean isCpuCircuitBreakerEnabled; + private final double cpuUsageThreshold; + + // Assumption -- the value of these parameters will be set correctly before invoking getDebugInfo() + private final ThreadLocal seenCPUUsage = new ThreadLocal<>(); Review comment: thread locals should be static ## File path: solr/core/src/java/org/apache/solr/core/SolrConfig.java ## @@ -811,10 +818,18 @@ private void initLibs(SolrResourceLoader loader, boolean isConfigsetTrusted) { loader.reloadLuceneSPI(); } - private void validateMemoryBreakerThreshold() { + private void validateCircuitBreakerThresholds() { if (useCircuitBreakers) { - if (memoryCircuitBreakerThresholdPct > 95 || memoryCircuitBreakerThresholdPct < 50) { -throw new IllegalArgumentException("Valid value range of memoryCircuitBreakerThresholdPct is 50 - 95"); + if (isMemoryCircuitBreakerEnabled) { +if (memoryCircuitBreakerThresholdPct > 95 || memoryCircuitBreakerThresholdPct < 50) { + throw new IllegalArgumentException("Valid value range of memoryCircuitBreakerThresholdPct is 50 - 95"); +} + } + + if (isCpuCircuitBreakerEnabled) { +if (cpuCircuitBreakerThresholdPct > 95 || cpuCircuitBreakerThresholdPct < 40) { + throw new IllegalArgumentException("Valid value range for cpuCircuitBreakerThresholdPct is 40 - 95"); Review comment: I don't think CPU load average is typically measured on a 0-100 scale. Can you confirm some sample values of what calculateLiveCPUUsage returns? 
[GitHub] [lucene-solr] atris commented on a change in pull request #1736: Harden RequestRateLimiter Tests
atris commented on a change in pull request #1736: URL: https://github.com/apache/lucene-solr/pull/1736#discussion_r468713194 ## File path: solr/core/src/test/org/apache/solr/servlet/TestRequestRateLimiter.java ## @@ -66,20 +65,19 @@ public void testConcurrentQueries() throws Exception { solrDispatchFilter.replaceRateLimitManager(rateLimitManager); -processTest(client); +processTest(client, 1 /* number of documents */, 350 /* number of queries */); MockRequestRateLimiter mockQueryRateLimiter = (MockRequestRateLimiter) rateLimitManager.getRequestRateLimiter(SolrRequest.SolrRequestType.QUERY); -assertEquals(25, mockQueryRateLimiter.incomingRequestCount.get()); -assertTrue("Incoming accepted new request count did not match. Expected 5 incoming " + mockQueryRateLimiter.acceptedNewRequestCount.get(), -mockQueryRateLimiter.acceptedNewRequestCount.get() < 25); -assertTrue("Incoming rejected new request count did not match. Expected 20 incoming " + mockQueryRateLimiter.rejectedRequestCount.get(), -mockQueryRateLimiter.rejectedRequestCount.get() > 0); +assertEquals(350, mockQueryRateLimiter.incomingRequestCount.get()); + +assertTrue((mockQueryRateLimiter.acceptedNewRequestCount.get() == mockQueryRateLimiter.incomingRequestCount.get() Review comment: That is what we do in this assert? assertEquals(mockQueryRateLimiter.incomingRequestCount.get(), mockQueryRateLimiter.acceptedNewRequestCount.get() + mockQueryRateLimiter.rejectedRequestCount.get()); This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
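The invariant under discussion — every incoming request is either accepted or rejected, never both and never neither — can be exercised with a plain `Semaphore` standing in for the rate limiter (class and field names here are illustrative, not Solr's actual `MockRequestRateLimiter`):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class RateLimitInvariant {
    static final AtomicInteger incoming = new AtomicInteger();
    static final AtomicInteger accepted = new AtomicInteger();
    static final AtomicInteger rejected = new AtomicInteger();

    // Stand-in for the rate limiter: a bounded pool of concurrent request slots.
    static void request(Semaphore slots) {
        incoming.incrementAndGet();
        if (slots.tryAcquire()) {
            try {
                accepted.incrementAndGet();
            } finally {
                slots.release();
            }
        } else {
            rejected.incrementAndGet();
        }
    }

    static void run(int numRequests) throws InterruptedException {
        incoming.set(0); accepted.set(0); rejected.set(0);
        Semaphore slots = new Semaphore(5);
        ExecutorService pool = Executors.newFixedThreadPool(32);
        for (int i = 0; i < numRequests; i++) {
            pool.submit(() -> request(slots));
        }
        pool.shutdown();
        pool.awaitTermination(30, TimeUnit.SECONDS);
    }

    public static void main(String[] args) throws InterruptedException {
        run(350);
        // Conservation law: every request is accepted XOR rejected. This is
        // the one assertion that holds no matter how flaky the load is.
        assert incoming.get() == accepted.get() + rejected.get();
        assert accepted.get() > 0; // at least the first acquire always succeeds
        System.out.println("accepted=" + accepted.get() + " rejected=" + rejected.get());
    }
}
```

Note that how many requests get rejected depends on timing, which is exactly why the conservation assertion is more robust than asserting exact accepted/rejected counts.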
[GitHub] [lucene-solr] dsmiley commented on a change in pull request #1726: SOLR-14722: timeAllowed should track from req creation
dsmiley commented on a change in pull request #1726: URL: https://github.com/apache/lucene-solr/pull/1726#discussion_r468710559 ## File path: solr/core/src/java/org/apache/solr/search/SolrQueryTimeoutImpl.java ## @@ -67,8 +69,21 @@ public boolean shouldExit() { } /** - * Method to set the time at which the timeOut should happen. - * @param timeAllowed set the time at which this thread should timeout. + * Sets or clears the time allowed based on how much time remains from the start of the request plus the configured + * {@link CommonParams#TIME_ALLOWED}. + */ + public static void set(SolrQueryRequest req) { +long timeAllowed = req.getParams().getLong(CommonParams.TIME_ALLOWED, -1L); +if (timeAllowed >= 0L) { Review comment: `>` vs `>=` is debatable; there's an argument both ways. I suspect there's a test for it this way but moreover, I don't think we should change it. It's fine. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] atris commented on a change in pull request #1726: SOLR-14722: timeAllowed should track from req creation
atris commented on a change in pull request #1726: URL: https://github.com/apache/lucene-solr/pull/1726#discussion_r468709958 ## File path: solr/core/src/java/org/apache/solr/search/SolrQueryTimeoutImpl.java ## @@ -67,8 +69,21 @@ public boolean shouldExit() { } /** - * Method to set the time at which the timeOut should happen. - * @param timeAllowed set the time at which this thread should timeout. + * Sets or clears the time allowed based on how much time remains from the start of the request plus the configured + * {@link CommonParams#TIME_ALLOWED}. + */ + public static void set(SolrQueryRequest req) { +long timeAllowed = req.getParams().getLong(CommonParams.TIME_ALLOWED, -1L); +if (timeAllowed >= 0L) { Review comment: This seems inconsistent -- should we not be marking no timeout as -1? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
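The semantics being debated: a negative `timeAllowed` (the `-1L` default) means "no timeout", while a non-negative value is reduced by the time already spent since request creation before the timer is armed. A standalone sketch of that arithmetic (method and constant names are illustrative, not Solr's actual `SolrQueryTimeoutImpl` API):

```java
public class TimeAllowedSketch {
    static final long NO_TIMEOUT = Long.MIN_VALUE; // sentinel for "timeout not set"

    /**
     * Mirrors the idea of set(SolrQueryRequest): timeAllowed < 0 clears the
     * timeout; otherwise the budget shrinks by the elapsed request time.
     */
    static long remainingBudgetMs(long timeAllowedMs, long elapsedSinceReqStartMs) {
        if (timeAllowedMs < 0L) {
            return NO_TIMEOUT;                          // reset(): no timeout at all
        }
        return timeAllowedMs - elapsedSinceReqStartMs;  // may be <= 0: already expired
    }

    public static void main(String[] args) {
        assert remainingBudgetMs(-1L, 100L) == NO_TIMEOUT;  // default param value
        assert remainingBudgetMs(500L, 120L) == 380L;       // 380 ms left to run
        assert remainingBudgetMs(500L, 600L) == -100L;      // should exit immediately
        System.out.println("ok");
    }
}
```

The `>` vs `>=` debate is the `timeAllowed == 0` edge case: with `>=`, a zero budget arms an already-expired timer rather than disabling the timeout.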
[GitHub] [lucene-solr] madrob commented on a change in pull request #1736: Harden RequestRateLimiter Tests
madrob commented on a change in pull request #1736: URL: https://github.com/apache/lucene-solr/pull/1736#discussion_r468709509 ## File path: solr/core/src/test/org/apache/solr/servlet/TestRequestRateLimiter.java ## @@ -66,20 +65,19 @@ public void testConcurrentQueries() throws Exception { solrDispatchFilter.replaceRateLimitManager(rateLimitManager); -processTest(client); +processTest(client, 1 /* number of documents */, 350 /* number of queries */); MockRequestRateLimiter mockQueryRateLimiter = (MockRequestRateLimiter) rateLimitManager.getRequestRateLimiter(SolrRequest.SolrRequestType.QUERY); -assertEquals(25, mockQueryRateLimiter.incomingRequestCount.get()); -assertTrue("Incoming accepted new request count did not match. Expected 5 incoming " + mockQueryRateLimiter.acceptedNewRequestCount.get(), -mockQueryRateLimiter.acceptedNewRequestCount.get() < 25); -assertTrue("Incoming rejected new request count did not match. Expected 20 incoming " + mockQueryRateLimiter.rejectedRequestCount.get(), -mockQueryRateLimiter.rejectedRequestCount.get() > 0); +assertEquals(350, mockQueryRateLimiter.incomingRequestCount.get()); + +assertTrue((mockQueryRateLimiter.acceptedNewRequestCount.get() == mockQueryRateLimiter.incomingRequestCount.get() Review comment: Should we assert that accepted + rejected = total? And that accepted > 0. ## File path: solr/core/src/test/org/apache/solr/servlet/TestRequestRateLimiter.java ## @@ -48,7 +47,7 @@ public static void setupCluster() throws Exception { configureCluster(1).addConfig(FIRST_COLLECTION, configset("cloud-minimal")).configure(); } - @Test + @Nightly Review comment: This isn't what I meant, sorry for being unclear. Keep this as `@Test` but when selecting the number of documents and queries do something like https://github.com/apache/lucene-solr/blob/master/lucene/core/src/test/org/apache/lucene/index/TestMultiDocValues.java#L52 This is an automated message from the Apache Git Service. 
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
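The TestMultiDocValues pattern madrob references keeps the test under `@Test` but randomizes its size, so ordinary runs stay cheap while some runs (and nightly multipliers) push harder. A self-contained imitation of the idea behind LuceneTestCase's `atLeast` — real tests would use the framework's method and its reproducible seed, not this hand-rolled version:

```java
import java.util.Random;

public class RandomizedSizing {

    /** Returns a value in [min, 3*min], a simplified sketch of LuceneTestCase.atLeast. */
    static int atLeast(Random random, int min) {
        return min + random.nextInt(2 * min + 1);
    }

    public static void main(String[] args) {
        Random random = new Random(); // the test framework would supply a seeded Random
        int numQueries = atLeast(random, 100);
        System.out.println("running " + numQueries + " queries");
        assert numQueries >= 100 && numQueries <= 300;
    }
}
```

Assertions in such a test must then be written against the randomized count (as in the conservation assert above the diff) rather than hard-coded constants like 25 or 350.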
[GitHub] [lucene-solr] dsmiley commented on a change in pull request #1726: SOLR-14722: timeAllowed should track from req creation
dsmiley commented on a change in pull request #1726: URL: https://github.com/apache/lucene-solr/pull/1726#discussion_r468709003 ## File path: solr/core/src/java/org/apache/solr/search/SolrQueryTimeoutImpl.java ## @@ -67,8 +69,21 @@ public boolean shouldExit() { } /** - * Method to set the time at which the timeOut should happen. - * @param timeAllowed set the time at which this thread should timeout. + * Sets or clears the time allowed based on how much time remains from the start of the request plus the configured + * {@link CommonParams#TIME_ALLOWED}. + */ + public static void set(SolrQueryRequest req) { +long timeAllowed = req.getParams().getLong(CommonParams.TIME_ALLOWED, -1L); +if (timeAllowed >= 0L) { + set(timeAllowed - (long)req.getRequestTimer().getTime()); // reduce by time already spent +} else { + reset(); +} + } + + /** + * Sets the time allowed (milliseconds), assuming we start a timer immediately. + * You should probably invoke {@link #set(SolrQueryRequest)} instead. */ public static void set(Long timeAllowed) { Review comment: Oh yeah; I forgot that -- indeed a primitive. It's weird it was boxed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9448) Make an equivalent to Ant's "run" target for Luke module
[ https://issues.apache.org/jira/browse/LUCENE-9448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17175661#comment-17175661 ] Tomoko Uchida commented on LUCENE-9448: --- Thank you [~dweiss] for your help. {quote}luke's build file declares an exportable configuration 'standalone' that includes a set of preassembled dependencies and classpath-enriched Luke JAR. This configuration is separate from the default project JAR. It can be assembled (standaloneAssemble), compressed into a tgz archive (standalonePackage) or exported and reused elsewhere {quote} As for the 'standalone' package, there are drop-in, runtime-only dependencies (many analysis modules) which are not required for development at all. If we build a complete standalone distribution package with the Gradle script, we need to collect all such jars or add all of them to the compile-time dependencies. I'll try it a little later (I have little time right now). > Make an equivalent to Ant's "run" target for Luke module > > > Key: LUCENE-9448 > URL: https://issues.apache.org/jira/browse/LUCENE-9448 > Project: Lucene - Core > Issue Type: Sub-task >Reporter: Tomoko Uchida >Priority: Minor > Attachments: LUCENE-9448.patch, LUCENE-9448.patch > > > With Ant build, Luke Swing app can be launched by "ant run" after checking > out the source code. "ant run" allows developers to immediately see the > effects of UI changes without creating the whole zip/tgz package (originally, > it was suggested when integrating Luke to Lucene). > In Gradle, {{:lucene:luke:run}} task would be easily implemented with > {{JavaExec}}, I think. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Created] (LUCENE-9456) Stored fields should store the number of chunks in the meta file
Adrien Grand created LUCENE-9456: Summary: Stored fields should store the number of chunks in the meta file Key: LUCENE-9456 URL: https://issues.apache.org/jira/browse/LUCENE-9456 Project: Lucene - Core Issue Type: Improvement Reporter: Adrien Grand Currently stored fields record numChunks/numDirtyChunks at the very end of the data file. They should migrate to the meta file instead, so that they would be validated when opening the index (meta files get their checksum validated entirely, data files don't). -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] atris commented on a change in pull request #1736: Harden RequestRateLimiter Tests
atris commented on a change in pull request #1736: URL: https://github.com/apache/lucene-solr/pull/1736#discussion_r468681532 ## File path: solr/core/src/test/org/apache/solr/servlet/TestRequestRateLimiter.java ## @@ -66,15 +66,11 @@ public void testConcurrentQueries() throws Exception { solrDispatchFilter.replaceRateLimitManager(rateLimitManager); -processTest(client); +processTest(client, 1 /* number of documents */, 350 /* number of queries */); MockRequestRateLimiter mockQueryRateLimiter = (MockRequestRateLimiter) rateLimitManager.getRequestRateLimiter(SolrRequest.SolrRequestType.QUERY); -assertEquals(25, mockQueryRateLimiter.incomingRequestCount.get()); -assertTrue("Incoming accepted new request count did not match. Expected 5 incoming " + mockQueryRateLimiter.acceptedNewRequestCount.get(), Review comment: It isn't really a relaxation -- the remaining assert should cover all cases that can happen for rate limiting. The catch is that rate limiting is not a guaranteed phenomenon -- we create a high load under which it should occur. I have added an additional assert -- let me know if it looks fine. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Resolved] (LUCENE-9453) DocumentWriterFlushControl missing explicit sync on write
[ https://issues.apache.org/jira/browse/LUCENE-9453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mike Drob resolved LUCENE-9453. --- Fix Version/s: master (9.0) Lucene Fields: (was: New) Assignee: Mike Drob Resolution: Fixed Thanks for the feedback [~dweiss], [~simonw]. Added the assert and committed this. > DocumentWriterFlushControl missing explicit sync on write > - > > Key: LUCENE-9453 > URL: https://issues.apache.org/jira/browse/LUCENE-9453 > Project: Lucene - Core > Issue Type: Bug > Components: core/index >Reporter: Mike Drob >Assignee: Mike Drob >Priority: Trivial > Fix For: master (9.0) > > Time Spent: 0.5h > Remaining Estimate: 0h > > checkoutAndBlock is not synchronized, but has a non-atomic write to > {{numPending}}. Meanwhile, all of the other writes to numPending are in sync > methods. > In this case it turns out to be ok because all of the code paths calling this > method are already sync: > {{synchronized doAfterDocument -> checkout -> checkoutAndBlock}} > {{checkoutLargestNonPendingWriter -> synchronized(this) -> checkout -> > checkoutAndBlock}} > If we make {{synchronized checkoutAndBlock}} that protects us against future > changes, shouldn't cause any performance impact since the code paths will > already be going through a sync block, and will make an IntelliJ warning go > away. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9453) DocumentWriterFlushControl missing explicit sync on write
[ https://issues.apache.org/jira/browse/LUCENE-9453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17175651#comment-17175651 ] ASF subversion and git services commented on LUCENE-9453: - Commit 092076ec39e0f71ae92d36cd4ebe69e21a97ce4e in lucene-solr's branch refs/heads/master from Mike Drob [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=092076e ] LUCENE-9453 Assert lock held before volatile write (#1734) Found via IntelliJ warnings. > DocumentWriterFlushControl missing explicit sync on write > - > > Key: LUCENE-9453 > URL: https://issues.apache.org/jira/browse/LUCENE-9453 > Project: Lucene - Core > Issue Type: Bug > Components: core/index >Reporter: Mike Drob >Priority: Trivial > Time Spent: 0.5h > Remaining Estimate: 0h > > checkoutAndBlock is not synchronized, but has a non-atomic write to > {{numPending}}. Meanwhile, all of the other writes to numPending are in sync > methods. > In this case it turns out to be ok because all of the code paths calling this > method are already sync: > {{synchronized doAfterDocument -> checkout -> checkoutAndBlock}} > {{checkoutLargestNonPendingWriter -> synchronized(this) -> checkout -> > checkoutAndBlock}} > If we make {{synchronized checkoutAndBlock}} that protects us against future > changes, shouldn't cause any performance impact since the code paths will > already be going through a sync block, and will make an IntelliJ warning go > away. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
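The committed fix ("Assert lock held before volatile write") costs nothing in production — `assert` compiles to a no-op unless the JVM runs with `-ea` — yet documents and enforces the locking contract under test. The pattern, reduced to its essentials (class and method names here are a sketch, not the real DocumentsWriterFlushControl):

```java
public class FlushControlSketch {
    private volatile int numPending; // written only while holding 'this'

    private void checkoutAndBlock() {
        // Test-time-only guard: every caller must already be synchronized,
        // because the decrement below is a non-atomic read-modify-write.
        assert Thread.holdsLock(this) : "caller must hold the monitor";
        numPending--;
    }

    synchronized void doAfterDocument() {
        numPending++;
        checkoutAndBlock(); // safe: the monitor is held on this call path
    }

    int pending() { return numPending; }

    public static void main(String[] args) {
        FlushControlSketch fc = new FlushControlSketch();
        fc.doAfterDocument(); // increment then decrement, net zero
        assert fc.pending() == 0;
        System.out.println("pending=" + fc.pending());
    }
}
```

Compared to marking `checkoutAndBlock` itself `synchronized`, the assert makes the precondition explicit without adding a (re-entrant, hence cheap but redundant) lock acquisition.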
[GitHub] [lucene-solr] madrob merged pull request #1734: LUCENE-9453 Add sync around volatile write
madrob merged pull request #1734: URL: https://github.com/apache/lucene-solr/pull/1734 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] gautamworah96 commented on pull request #1733: LUCENE-9450 Use BinaryDocValues in the taxonomy writer
gautamworah96 commented on pull request #1733: URL: https://github.com/apache/lucene-solr/pull/1733#issuecomment-672021513 Changes in this revision (incorporated from feedback on JIRA): * Added a call to `advanceExact()` before calling `.binaryValue()` and an `assert` to check that the field exists in the index * Re-added the `StringField` with the `Field.Store.YES` changed to `Field.Store.NO`. * I've not added new tests at the moment. Trying to get the existing ones to work first. From the error log: Note that the code is able to successfully execute the `assert found` statement (so the field does exist), and it fails on the next line This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Resolved] (SOLR-14727) Add gradle files to the 8x .gitignore file.
[ https://issues.apache.org/jira/browse/SOLR-14727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erick Erickson resolved SOLR-14727. --- Fix Version/s: 8.7 Resolution: Fixed > Add gradle files to the 8x .gitignore file. > --- > > Key: SOLR-14727 > URL: https://issues.apache.org/jira/browse/SOLR-14727 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Erick Erickson >Assignee: Erick Erickson >Priority: Major > Fix For: 8.7 > > Attachments: SOLR-14727.patch > > > This is a little different than I thought. Apparently it's an interaction > with IntelliJ. One sequence is something like: > * import the Gradle build in IntelliJ on master > * switch to branch_8x on the command line > * switch back to master from the command line and you can't because > "untracked changes would be overwritten by..." > there may be other ways to get into this bind. > At any rate, I don't see a problem with adding > gradle.properties > gradle/ > gradlew > gradlew.bat > to .gitignore on branch_8x only. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-14727) Add gradle files to the 8x .gitignore file.
[ https://issues.apache.org/jira/browse/SOLR-14727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17175633#comment-17175633 ] ASF subversion and git services commented on SOLR-14727: Commit 703becc0372fcfaf8c0184a63bfd9a7070458c6d in lucene-solr's branch refs/heads/branch_8x from Erick Erickson [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=703becc ] SOLR-14727: Add gradle files to the 8x .gitignore file. > Add gradle files to the 8x .gitignore file. > --- > > Key: SOLR-14727 > URL: https://issues.apache.org/jira/browse/SOLR-14727 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Erick Erickson >Assignee: Erick Erickson >Priority: Major > Attachments: SOLR-14727.patch > > > This is a little different than I thought. Apparently it's an interaction > with IntelliJ. One sequence is something like: > * import the Gradle build in IntelliJ on master > * switch to branch_8x on the command line > * switch back to master from the command line and you can't because > "untracked changes would be overwritten by..." > there may be other ways to get into this bind. > At any rate, I don't see a problem with adding > gradle.properties > gradle/ > gradlew > gradlew.bat > to .gitignore on branch_8x only. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (SOLR-14727) Add gradle files to the 8x .gitignore file.
[ https://issues.apache.org/jira/browse/SOLR-14727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erick Erickson updated SOLR-14727: -- Attachment: (was: SOLR-14727.patch) > Add gradle files to the 8x .gitignore file. > --- > > Key: SOLR-14727 > URL: https://issues.apache.org/jira/browse/SOLR-14727 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Erick Erickson >Assignee: Erick Erickson >Priority: Major > Attachments: SOLR-14727.patch > > > This is a little different than I thought. Apparently it's an interaction > with IntelliJ. One sequence is something like: > * import the Gradle build in IntelliJ on master > * switch to branch_8x on the command line > * switch back to master from the command line and you can't because > "untracked changes would be overwritten by..." > there may be other ways to get into this bind. > At any rate, I don't see a problem with adding > gradle.properties > gradle/ > gradlew > gradlew.bat > to .gitignore on branch_8x only. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (SOLR-14727) Add gradle files to the 8x .gitignore file.
[ https://issues.apache.org/jira/browse/SOLR-14727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erick Erickson updated SOLR-14727: -- Attachment: SOLR-14727.patch > Add gradle files to the 8x .gitignore file. > --- > > Key: SOLR-14727 > URL: https://issues.apache.org/jira/browse/SOLR-14727 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Erick Erickson >Assignee: Erick Erickson >Priority: Major > Attachments: SOLR-14727.patch > > > This is a little different than I thought. Apparently it's an interaction > with IntelliJ. One sequence is something like: > * import the Gradle build in IntelliJ on master > * switch to branch_8x on the command line > * switch back to master from the command line and you can't because > "untracked changes would be overwritten by..." > there may be other ways to get into this bind. > At any rate, I don't see a problem with adding > gradle.properties > gradle/ > gradlew > gradlew.bat > to .gitignore on branch_8x only. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (SOLR-14727) Add gradle files to the 8x .gitignore file.
[ https://issues.apache.org/jira/browse/SOLR-14727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erick Erickson updated SOLR-14727: -- Description: This is a little different than I thought. Apparently it's an interaction with IntelliJ. One sequence is something like: * import the Gradle build in IntelliJ on master * switch to branch_8x on the command line * switch back to master from the command line and you can't because "untracked changes would be overwritten by..." there may be other ways to get into this bind. At any rate, I don't see a problem with adding gradle.properties gradle/ gradlew gradlew.bat to .gitignore on branch_8x only. was: This is a little different than I thought. Apparently it's an interaction with IntelliJ. One sequence is something like: * import the Gradle build in IntelliJ on master * switch to branch_8x on the command line * switch back to master from the command line and you can't because "untracked changes would be overwritten by..." there may be other ways to get into this bind. At any rate, I don't see a problem with adding gradle.properties gradle/ gradle/ gradlew gradlew.bat to .gitignore on branch_8x only. > Add gradle files to the 8x .gitignore file. > --- > > Key: SOLR-14727 > URL: https://issues.apache.org/jira/browse/SOLR-14727 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Erick Erickson >Assignee: Erick Erickson >Priority: Major > Attachments: SOLR-14727.patch > > > This is a little different than I thought. Apparently it's an interaction > with IntelliJ. One sequence is something like: > * import the Gradle build in IntelliJ on master > * switch to branch_8x on the command line > * switch back to master from the command line and you can't because > "untracked changes would be overwritten by..." > there may be other ways to get into this bind. 
> At any rate, I don't see a problem with adding > gradle.properties > gradle/ > gradlew > gradlew.bat > to .gitignore on branch_8x only. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (SOLR-14727) Add gradle files to the 8x .gitignore file.
[ https://issues.apache.org/jira/browse/SOLR-14727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erick Erickson updated SOLR-14727: -- Attachment: SOLR-14727.patch > Add gradle files to the 8x .gitignore file. > --- > > Key: SOLR-14727 > URL: https://issues.apache.org/jira/browse/SOLR-14727 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Erick Erickson >Assignee: Erick Erickson >Priority: Major > Attachments: SOLR-14727.patch > > > This is a little different than I thought. Apparently it's an interaction > with IntelliJ. One sequence is something like: > * import the Gradle build in IntelliJ on master > * switch to branch_8x on the command line > * switch back to master from the command line and you can't because > "untracked changes would be overwritten by..." > there may be other ways to get into this bind. > At any rate, I don't see a problem with adding > gradle.properties > gradle/ > gradle/ > gradlew > gradlew.bat > to .gitignore on branch_8x only. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (SOLR-14727) Add gradle files to the 8x .gitignore file.
[ https://issues.apache.org/jira/browse/SOLR-14727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erick Erickson updated SOLR-14727: -- Description: This is a little different than I thought. Apparently it's an interaction with IntelliJ. One sequence is something like: * import the Gradle build in IntelliJ on master * switch to branch_8x on the command line * switch back to master from the command line and you can't because "untracked changes would be overwritten by..." there may be other ways to get into this bind. At any rate, I don't see a problem with adding gradle.properties gradle/ gradle/ gradlew gradlew.bat to .gitignore on branch_8x only. was: It's annoying to switch from master to 8x after building with Gradle and then be unable to switch back because Git sees files the gradle directory and thinks you have added files. This will be for 8x only > Add gradle files to the 8x .gitignore file. > --- > > Key: SOLR-14727 > URL: https://issues.apache.org/jira/browse/SOLR-14727 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Erick Erickson >Assignee: Erick Erickson >Priority: Major > > This is a little different than I thought. Apparently it's an interaction > with IntelliJ. One sequence is something like: > * import the Gradle build in IntelliJ on master > * switch to branch_8x on the command line > * switch back to master from the command line and you can't because > "untracked changes would be overwritten by..." > there may be other ways to get into this bind. > At any rate, I don't see a problem with adding > gradle.properties > gradle/ > gradle/ > gradlew > gradlew.bat > to .gitignore on branch_8x only. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] madrob commented on a change in pull request #1732: Clean up many small fixes
madrob commented on a change in pull request #1732: URL: https://github.com/apache/lucene-solr/pull/1732#discussion_r468631394 ## File path: lucene/core/src/java/org/apache/lucene/index/SegmentInfos.java ## @@ -440,7 +440,7 @@ public static final SegmentInfos readCommit(Directory directory, ChecksumIndexIn if (format >= VERSION_70) { // oldest supported version CodecUtil.checkFooter(input, priorE); } else { -throw IOUtils.rethrowAlways(priorE); Review comment: The original compiler complaint was that the throw is inside the finally block. Could I replace the "Unreachable code" at the end with this rethrow? I believe the logic will be the same. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
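The compiler complaint madrob mentions exists because a `throw` (or `return`) inside a `finally` block silently discards any exception already in flight from the `try` block. A minimal, generic demonstration of that masking behavior (not the actual SegmentInfos code):

```java
public class FinallyMasking {

    static String masking() {
        try {
            try {
                throw new IllegalStateException("real failure");
            } finally {
                // This throw replaces the in-flight IllegalStateException,
                // which is lost without ever being observed.
                throw new RuntimeException("masked it");
            }
        } catch (RuntimeException e) {
            return e.getMessage(); // only the finally block's exception survives
        }
    }

    public static void main(String[] args) {
        System.out.println(masking()); // prints "masked it", not "real failure"
        assert "masked it".equals(masking());
    }
}
```

This is why helpers like `IOUtils.rethrowAlways(priorE)` are normally invoked from a `catch`/cleanup path rather than from inside `finally`: the prior exception should be the one that propagates (or suppresses), not be overwritten.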
[jira] [Commented] (SOLR-13412) Make the Lucene Luke module available from a Solr distribution
[ https://issues.apache.org/jira/browse/SOLR-13412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17175609#comment-17175609 ] Erick Erickson commented on SOLR-13412: --- [~dweiss] Yes, I saw that. My take at this point is to answer your question with "no, we shouldn't add such a low-level tool to Solr's distribution". I think we can enhance the Luke Request Handler with relative ease to satisfy most Solr users. For those who need more, I suspect the intersection of users who really get value from Luke and the users who would be comfortable building Lucene is quite large, although I have no proof... I'll wait for complaints ;) Thanks again for your help with LUCENE-9448 > Make the Lucene Luke module available from a Solr distribution > -- > > Key: SOLR-13412 > URL: https://issues.apache.org/jira/browse/SOLR-13412 > Project: Solr > Issue Type: Improvement >Reporter: Erick Erickson >Assignee: Erick Erickson >Priority: Major > Attachments: SOLR-13412.patch > > > Now that [~Tomoko Uchida] has put in a great effort to bring Luke into the > project, I think it would be good to be able to access it from a Solr distro. > I want to go to the right place under the Solr install directory and start > Luke up to examine the local indexes. > This ticket is explicitly _not_ about accessing it from the admin UI, Luke is > a stand-alone app that must be invoked on the node that has a Lucene index on > the local filesystem > We need to > * have it included in Solr when running "ant package". > * add some bits to the ref guide on how to invoke > ** Where to invoke it from > ** mention anything that has to be installed. > ** any other "gotchas" someone just installing Solr should be aware of. > * Ant should not be necessary. 
> * > > I'll assign this to myself to keep track of, but would not be offended in the least if someone with more knowledge of "ant package" and the like wanted to take it over ;) > If we can do it at all
[GitHub] [lucene-solr] madrob commented on a change in pull request #1732: Clean up many small fixes
madrob commented on a change in pull request #1732: URL: https://github.com/apache/lucene-solr/pull/1732#discussion_r468622877 ## File path: lucene/core/src/java/org/apache/lucene/analysis/Analyzer.java ## @@ -367,12 +367,12 @@ public void close() { /** * Original source of the tokens. */ -protected final Consumer source; Review comment: I think it's because the field is final and there is a getter for it, so the code analyzer prefers encapsulation?
[jira] [Commented] (SOLR-13412) Make the Lucene Luke module available from a Solr distribution
[ https://issues.apache.org/jira/browse/SOLR-13412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17175606#comment-17175606 ] David Eric Pugh commented on SOLR-13412: I've been following this conversation and wanted to throw my 2 cents in. Luke is a *really* powerful tool, and I've used it a couple of times to troubleshoot some hard-to-understand problems. Having said that, it functions *very* differently than Solr, especially for those of us using SolrCloud, and adding it to the Solr distribution feels like it goes against the current trend of shrinking the size of the Solr distribution. I think that the real issue here is that most of our users don't know that Luke exists and that it's a powerful tool. What if we kept Luke as a standalone artifact and instead talked about Luke in the Solr Ref Guide? We mention the Luke request handler on https://lucene.apache.org/solr/guide/8_6/implicit-requesthandlers.html; that page could also link to a page with more details on Luke and where to download it from. Which reminds me, we should add the word Luke to the Ref Guide glossary page! I just poked around the lucene.apache.org site, and there is no mention of Luke anywhere...
[jira] [Resolved] (SOLR-14650) Default autoscaling policy rules are ineffective
[ https://issues.apache.org/jira/browse/SOLR-14650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Houston Putman resolved SOLR-14650. --- Fix Version/s: (was: 8.7) Resolution: Abandoned The default autoscaling policy has been removed as of 8.6.1 (and therefore 8.7) > Default autoscaling policy rules are ineffective > > > Key: SOLR-14650 > URL: https://issues.apache.org/jira/browse/SOLR-14650 > Project: Solr > Issue Type: Bug > Components: AutoScaling >Affects Versions: 8.6 >Reporter: Andrzej Bialecki >Assignee: Andrzej Bialecki >Priority: Major > > There's a faulty logic in {{Assign.usePolicyFramework()}} that makes the > default policy (added in SOLR-12845) ineffective - that is, in the absence of > any user-provided modifications to the policy rules the code reverts to > LEGACY assignment. > The logic in this method is convoluted and opaque, it's difficult for users > to be sure what strategy is used when - instead we should make this choice > explicit. > (BTW, the default ruleset is probably too expensive for large clusters > anyway, given the unresolved performance problems in the policy engine).
[jira] [Commented] (LUCENE-9379) Directory based approach for index encryption
[ https://issues.apache.org/jira/browse/LUCENE-9379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17175583#comment-17175583 ] Bruno Roustant commented on LUCENE-9379: [~Raji] maybe a better approach would be to have one tenant per collection, but you might have many tenants, so the performance with many collections is poor? If that is the case, then I think the root problem is the perf for many collections. Without the composite id router you could use OS encryption per collection. > Directory based approach for index encryption > - > > Key: LUCENE-9379 > URL: https://issues.apache.org/jira/browse/LUCENE-9379 > Project: Lucene - Core > Issue Type: New Feature >Reporter: Bruno Roustant >Assignee: Bruno Roustant >Priority: Major > Time Spent: 2h 20m > Remaining Estimate: 0h > > +Important+: This Lucene Directory wrapper approach is to be considered only > if an OS level encryption is not possible. OS level encryption better fits > Lucene usage of OS cache, and thus is more performant. > But there are some use-case where OS level encryption is not possible. This > Jira issue was created to address those. > > > The goal is to provide optional encryption of the index, with a scope limited > to an encryptable Lucene Directory wrapper. > Encryption is at rest on disk, not in memory. > This simple approach should fit any Codec as it would be orthogonal, without > modifying APIs as much as possible. > Use a standard encryption method. Limit perf/memory impact as much as > possible. > Determine how callers provide encryption keys. They must not be stored on > disk.
[jira] [Updated] (SOLR-14728) Add self join optimization to the TopLevelJoinQuery
[ https://issues.apache.org/jira/browse/SOLR-14728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joel Bernstein updated SOLR-14728: -- Description: A simple optimization can be put in place to massively improve join performance when the TopLevelJoinQuery is performing a self join (same core) and the *to* and *from* fields are the same field. In this scenario the top level doc values ordinals can be used directly as a filter avoiding the most expensive part of the join which is the bytes ref reconciliation between the *to* and *from* fields. (was: A simple optimization can be put in place to massively improve join performance when the TopLevelJoinQuery is performing a self join (same core) and the *to* and *from* fields are the same field. In this scenario the top level doc values ordinals can be used directly as a filter avoiding the most expensive part of the join which is the bytes ref reconciliation between the *to* and *from* fields. ) > Add self join optimization to the TopLevelJoinQuery > --- > > Key: SOLR-14728 > URL: https://issues.apache.org/jira/browse/SOLR-14728 > Project: Solr > Issue Type: New Feature > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Joel Bernstein >Priority: Major > > A simple optimization can be put in place to massively improve join > performance when the TopLevelJoinQuery is performing a self join (same core) > and the *to* and *from* fields are the same field. In this scenario the top > level doc values ordinals can be used directly as a filter avoiding the most > expensive part of the join which is the bytes ref reconciliation between the > *to* and *from* fields.
[jira] [Updated] (SOLR-14728) Add self join optimization to the TopLevelJoinQuery
[ https://issues.apache.org/jira/browse/SOLR-14728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joel Bernstein updated SOLR-14728: -- Description: A simple optimization can be put in place to massively improve join performance when the TopLevelJoinQuery is performing a self join (same core) and the *to* and *from* fields are the same field. In this scenario the top level doc values ordinals can be used directly as a filter avoiding the most expensive part of the join which is the bytes ref reconciliation between the *to* and *from* fields. (was: A simple optimization that can be put in place to massively improve join performance when the TopLevelJoinQuery is performing a self join (same core) and the *to* and *from* fields are the same field. In this scenario the top level doc values ordinals can be used directly as a filter avoiding the most expensive part of the join which is the bytes ref reconciliation between the *to* and *from* fields. )
[jira] [Created] (LUCENE-9455) ExitableTermsEnum (in ExitableDirectoryReader) should sample next()
David Smiley created LUCENE-9455: Summary: ExitableTermsEnum (in ExitableDirectoryReader) should sample next() Key: LUCENE-9455 URL: https://issues.apache.org/jira/browse/LUCENE-9455 Project: Lucene - Core Issue Type: Improvement Components: core/other Reporter: David Smiley ExitableTermsEnum calls "checkAndThrow" on *every* call to next(). This is too expensive; it should sample. I observed ElasticSearch uses the same approach; I think Lucene would benefit from this: https://github.com/elastic/elasticsearch/blob/4af4eb99e18fdaadac879b1223e986227dd2ee71/server/src/main/java/org/elasticsearch/search/internal/ExitableDirectoryReader.java#L151 CC [~jimczi]
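The sampling idea proposed here can be sketched as a cheap bitmask test on a call counter, so the comparatively expensive timeout check runs only once every 2^k calls instead of on every next(). This is a hypothetical, self-contained illustration (all names invented), not the actual ExitableTermsEnum or Elasticsearch code:

```java
// Minimal sketch of sampled timeout checking: the expensive check runs only
// when the low bits of a call counter are zero (here, every 16th call).
public class SamplingTimeoutSketch {
  private static final int CHECK_MASK = 0x0F; // check on every 16th call
  private int calls = 0;
  private final long deadlineNanos;

  public SamplingTimeoutSketch(long timeoutNanos) {
    this.deadlineNanos = System.nanoTime() + timeoutNanos;
  }

  // Called from the hot loop (e.g. once per next()).
  // Returns true if this call actually performed the expensive check.
  public boolean maybeCheckTimeout() {
    if ((calls++ & CHECK_MASK) == 0) {
      if (System.nanoTime() > deadlineNanos) {
        throw new RuntimeException("query timed out");
      }
      return true;
    }
    return false;
  }

  public static void main(String[] args) {
    SamplingTimeoutSketch s = new SamplingTimeoutSketch(Long.MAX_VALUE / 2);
    int checks = 0;
    for (int i = 0; i < 64; i++) {
      if (s.maybeCheckTimeout()) checks++;
    }
    System.out.println(checks); // 64 calls, check every 16th -> 4 checks
  }
}
```

The power-of-two mask keeps the per-call overhead to one increment and one AND; the trade-off is that a timeout is noticed up to 15 calls late, which is usually acceptable for query cancellation.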
[jira] [Updated] (SOLR-14728) Add self join optimization to the TopLevelJoinQuery
[ https://issues.apache.org/jira/browse/SOLR-14728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joel Bernstein updated SOLR-14728: -- Description: A simple optimization that can be put in place to massively improve join performance when the TopLevelJoinQuery is performing a self join (same core) and the *to* and *from* fields are the same field. In this scenario the top level doc values ordinals can be used directly as a filter avoiding the most expensive part of the join which is the bytes ref reconciliation between the *to* and *from* fields. (was: A simple strategy can be put in place to massively improve join performance when the TopLevelJoinQuery is performing a self join (same core) and the *to* and *from* fields are the same field. In this scenario the top level doc values ordinals can be used directly as a filter, *avoiding* the most expensive part of the join which is the bytes ref reconciliation between the *to* and *from* fields. )
[jira] [Resolved] (SOLR-14692) JSON Facet "join" domain should take optional "method" property
[ https://issues.apache.org/jira/browse/SOLR-14692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Gerlowski resolved SOLR-14692. Fix Version/s: 8.7 master (9.0) Assignee: Jason Gerlowski Resolution: Fixed All wrapped up; thanks to Munendra for the review comments. > JSON Facet "join" domain should take optional "method" property > --- > > Key: SOLR-14692 > URL: https://issues.apache.org/jira/browse/SOLR-14692 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) > Components: faceting, JSON Request API >Affects Versions: master (9.0), 8.6 >Reporter: Jason Gerlowski >Assignee: Jason Gerlowski >Priority: Minor > Fix For: master (9.0), 8.7 > > Time Spent: 50m > Remaining Estimate: 0h > > Solr offers several different join implementations which can be switched > between by providing the "method" local-param on JoinQuery. Each of these > implementations has different performance characteristics and can behave very > differently depending on a user's data and use case. > When joins are used internally as a part of JSON Faceting's "join" > domain-transform though, users have no way to specify which implementation > they would like to use. We should correct this by adding a "method" property > to the join domain-transform. This will let users choose the join that's > most performant for their use case during JSON Facet requests.
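To illustrate the feature being resolved here, a JSON Facet request using the new "method" property on a join domain might look like the following. This is a hypothetical sketch: the field names and facet label are invented, and "topLevelDV" is shown as one of Solr's standard join methods (check the Ref Guide for the exact values accepted by the domain transform):

```json
{
  "query": "*:*",
  "facet": {
    "by_author": {
      "type": "terms",
      "field": "author_s",
      "domain": {
        "join": {
          "from": "book_id_s",
          "to": "id",
          "method": "topLevelDV"
        }
      }
    }
  }
}
```

Before this change the join domain transform always used the default join implementation; the "method" property lets the request pick whichever implementation performs best for the data at hand.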
[jira] [Updated] (SOLR-14728) Add self join optimization to the TopLevelJoinQuery
[ https://issues.apache.org/jira/browse/SOLR-14728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joel Bernstein updated SOLR-14728: -- Description: A simple strategy can be put in place to massively improve join performance when the TopLevelJoinQuery is performing a self join (same core) and the *to* and *from* fields are the same field. In this scenario the top level doc values ordinals can be used directly as a filter, *avoiding* the most expensive part of the join which is the bytes ref reconciliation between the *to* and *from* fields. (was: A simple strategy can be put in place to massively improve join performance when the TopLevelJoinQuery is performing a self join (same core) and the *to* and *from* fields are the same field. In this scenario the top level doc values ordinals can be used directly as a filter avoiding the most expensive part of the join which is the bytes ref reconciliation between the *to* and *from* fields. )
[jira] [Created] (SOLR-14728) Add self join optimization to the TopLevelJoinQuery
Joel Bernstein created SOLR-14728: - Summary: Add self join optimization to the TopLevelJoinQuery Key: SOLR-14728 URL: https://issues.apache.org/jira/browse/SOLR-14728 Project: Solr Issue Type: New Feature Security Level: Public (Default Security Level. Issues are Public) Reporter: Joel Bernstein A simple strategy can be put in place to massively improve join performance when the TopLevelJoinQuery is performing a self join (same core) and the *to* and *from* fields are the same field. In this scenario the top level doc values ordinals can be used directly as a filter avoiding the most expensive part of the join which is the bytes ref reconciliation between the *to* and *from* fields.
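The optimization described above can be illustrated with a self-contained sketch (invented names, plain arrays and bitsets standing in for Lucene doc values): when both sides of the join read the same field in the same core, they share a single ordinal space, so the "from" side's ordinals can be collected into a bitset and applied directly as a filter, skipping the term-by-term BytesRef reconciliation entirely.

```java
import java.util.BitSet;

// Toy model of the self-join shortcut: docToOrd maps each doc to the ordinal
// of its (single-valued) field term; fromDocs is the set of docs matched by
// the "from" query. Not Solr code -- a simplified illustration only.
public class SelfJoinOrdinalSketch {
  public static BitSet selfJoin(int[] docToOrd, BitSet fromDocs) {
    // 1. Collect the ordinals present on the "from" side.
    BitSet fromOrds = new BitSet();
    for (int doc = fromDocs.nextSetBit(0); doc >= 0; doc = fromDocs.nextSetBit(doc + 1)) {
      fromOrds.set(docToOrd[doc]);
    }
    // 2. Filter every doc by ordinal membership. Because "to" and "from" are
    //    the same field in the same core, both sides share one ordinal space,
    //    so no BytesRef comparison between two term dictionaries is needed.
    BitSet joined = new BitSet();
    for (int doc = 0; doc < docToOrd.length; doc++) {
      if (fromOrds.get(docToOrd[doc])) {
        joined.set(doc);
      }
    }
    return joined;
  }

  public static void main(String[] args) {
    int[] docToOrd = {0, 1, 1, 2, 0}; // 5 docs over 3 distinct terms
    BitSet from = new BitSet();
    from.set(1);                      // doc 1 matched the "from" query (ord 1)
    System.out.println(selfJoin(docToOrd, from)); // docs 1 and 2 share ord 1
  }
}
```

In the general (non-self-join) case the two fields have separate term dictionaries, so each "from" ordinal's term bytes must be looked up in the "to" dictionary; the sketch shows why that step disappears when the ordinal spaces coincide.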
[jira] [Commented] (SOLR-14692) JSON Facet "join" domain should take optional "method" property
[ https://issues.apache.org/jira/browse/SOLR-14692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17175544#comment-17175544 ] ASF subversion and git services commented on SOLR-14692: Commit d6992f74e0673d2ed5593c6d9312651e94267446 in lucene-solr's branch refs/heads/branch_8x from Jason Gerlowski [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=d6992f7 ] SOLR-14692: Allow 'method' specification on JSON Facet join domain transforms (#1707)
[jira] [Created] (SOLR-14727) Add gradle files to the 8x .gitignore file.
Erick Erickson created SOLR-14727: - Summary: Add gradle files to the 8x .gitignore file. Key: SOLR-14727 URL: https://issues.apache.org/jira/browse/SOLR-14727 Project: Solr Issue Type: Improvement Security Level: Public (Default Security Level. Issues are Public) Reporter: Erick Erickson Assignee: Erick Erickson It's annoying to switch from master to 8x after building with Gradle and then be unable to switch back because Git sees files in the gradle directory and thinks you have added files. This will be for 8x only
[jira] [Commented] (SOLR-13412) Make the Lucene Luke module available from a Solr distribution
[ https://issues.apache.org/jira/browse/SOLR-13412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17175528#comment-17175528 ] Dawid Weiss commented on SOLR-13412: bq. My current thinking is that if the stand-alone Luke app is packaged with Lucene automagically, Look at the patch I provided, Erick. This is simple to do. Question is whether it makes sense to add such a low-level tool to Solr's distribution.
[GitHub] [lucene-solr] madrob commented on a change in pull request #1736: Harden RequestRateLimiter Tests
madrob commented on a change in pull request #1736: URL: https://github.com/apache/lucene-solr/pull/1736#discussion_r468542925 ## File path: solr/core/src/test/org/apache/solr/servlet/TestRequestRateLimiter.java ## @@ -66,15 +66,11 @@ public void testConcurrentQueries() throws Exception { solrDispatchFilter.replaceRateLimitManager(rateLimitManager); -processTest(client); +processTest(client, 1 /* number of documents */, 350 /* number of queries */); Review comment: Can we limit the higher footprint to Nightly? ## File path: solr/core/src/test/org/apache/solr/servlet/TestRequestRateLimiter.java ## @@ -124,12 +120,12 @@ private void processTest(CloudSolrClient client) throws Exception { List> futures; try { - for (int i = 0; i < 25; i++) { + for (int i = 0; i < numQueries; i++) { callableList.add(() -> { try { QueryResponse response = client.query(new SolrQuery("*:*")); -assertEquals(100, response.getResults().getNumFound()); +assertEquals(1, response.getResults().getNumFound()); Review comment: should be numDocuments ## File path: solr/core/src/test/org/apache/solr/servlet/TestRequestRateLimiter.java ## @@ -66,15 +66,11 @@ public void testConcurrentQueries() throws Exception { solrDispatchFilter.replaceRateLimitManager(rateLimitManager); -processTest(client); +processTest(client, 1 /* number of documents */, 350 /* number of queries */); MockRequestRateLimiter mockQueryRateLimiter = (MockRequestRateLimiter) rateLimitManager.getRequestRateLimiter(SolrRequest.SolrRequestType.QUERY); -assertEquals(25, mockQueryRateLimiter.incomingRequestCount.get()); -assertTrue("Incoming accepted new request count did not match. Expected 5 incoming " + mockQueryRateLimiter.acceptedNewRequestCount.get(), Review comment: Somewhat concerning that the fix to the test is to relax the assertion conditions
[jira] [Commented] (LUCENE-9439) Matches API should enumerate hit fields that have no positions (no iterator)
[ https://issues.apache.org/jira/browse/LUCENE-9439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17175519#comment-17175519 ] Dawid Weiss commented on LUCENE-9439: - Seems to pass all tests and checks. Works for me in a production system too. > Matches API should enumerate hit fields that have no positions (no iterator) > > > Key: LUCENE-9439 > URL: https://issues.apache.org/jira/browse/LUCENE-9439 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Dawid Weiss >Assignee: Dawid Weiss >Priority: Minor > Attachments: LUCENE-9439.patch, matchhighlighter.patch > > Time Spent: 2h 40m > Remaining Estimate: 0h > > I have been fiddling with Matches API and it's great. There is one corner > case that doesn't work for me though -- queries that affect fields without > positions return {{MatchesUtil.MATCH_WITH_NO_TERMS}} but this constant is > problematic as it doesn't carry the field name that caused it (returns null). > The associated fromSubMatches combines all these constants into one (or > swallows them) which is another problem. > I think it would be more consistent if MATCH_WITH_NO_TERMS was replaced with > a true match (carrying field name) returning an empty iterator (or a constant > "empty" iterator NO_TERMS). > I have a very compelling use case: I wrote an "auto-highlighter" that runs on > top of Matches API and automatically picks up query-relevant fields and > snippets. Everything works beautifully except for cases where fields are > searchable but don't have any positions (token-like fields). > I can work on a patch but wanted to reach out first - [~romseygeek]?
[jira] [Commented] (SOLR-14630) CloudSolrClient doesn't pick correct core when server contains more shards
[ https://issues.apache.org/jira/browse/SOLR-14630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17175518#comment-17175518 ] Ivan Djurasevic commented on SOLR-14630: {quote}When you say "batch update", do you mean more than one document in the same request or perhaps something else? If the batch size was one then does the issue happen also, I wonder? {quote} Batch size is not the problem. The same issue happens when the batch contains 50 documents and when it contains one. {quote}I'm not very familiar with update request processor chains but the [https://lucene.apache.org/solr/guide/8_6/update-request-processors.html#update-processors-in-solrcloud] documentation was useful and the SOLR-8030 ticket mentioned in it sounds interesting. {quote} The update processor chain is not the problem (it has some other issues; I will raise bugs for that team, too). I was describing our process and why it is important to hit the correct shard without forwarding requests. {quote}What if {{inputCollections}} contained more than one element? {quote} Yes, this is a problem: I was trying to search across collections, and with my fix it doesn't work. It seems that the HttpSolrCall class can't parse the URL when it contains multiple core names. {quote}What if {{inputCollections}} contained an alias that was resolved at [line 1080|https://github.com/apache/lucene-solr/blob/releases/lucene-solr/8.5.2/solr/solrj/src/java/org/apache/solr/client/solrj/impl/BaseCloudSolrClient.java#L1080], does it matter that before the alias (e.g. {{collection_one}}) was appended but now a core name (e.g. {{collection1_shard2_replica1}}) is appended? {quote} Aliases shouldn't be a problem once we solve the issue with multiple collections (because we resolve the real collection names before creating the URL). Unfortunately, to fix this issue we will need to refactor the HttpSolrCall class, too.
> CloudSolrClient doesn't pick correct core when server contains more shards > -- > > Key: SOLR-14630 > URL: https://issues.apache.org/jira/browse/SOLR-14630 > Project: Solr > Issue Type: Bug > Components: SolrCloud, SolrJ >Affects Versions: 8.5.1, 8.5.2 >Reporter: Ivan Djurasevic >Priority: Major > Attachments: > 0001-SOLR-14630-Test-case-demonstrating-_route_-is-broken.patch > > > Precondition: create collection with 4 shards on one server. > During search and update, solr cloud client picks wrong core even _route_ > exists in query param. In BaseSolrClient class, method sendRequest, > > {code:java} > sortedReplicas.forEach( replica -> { > if (seenNodes.add(replica.getNodeName())) { > theUrlList.add(ZkCoreNodeProps.getCoreUrl(replica.getBaseUrl(), > joinedInputCollections)); > } > }); > {code} > > Previous part of code adds base url(localhost:8983/solr/collection_name) to > theUrlList, it doesn't create core address(localhost:8983/solr/core_name). If > we change previous code to: > {quote} > {code:java} > sortedReplicas.forEach(replica -> { > if (seenNodes.add(replica.getNodeName())) { > theUrlList.add(replica.getCoreUrl()); > } > });{code} > {quote} > Solr cloud client picks core which is defined with _route_ parameter. > >
[GitHub] [lucene-solr] gerlowskija merged pull request #1707: SOLR-14692: Allow 'method' specification on JSON Facet join domain transforms
gerlowskija merged pull request #1707: URL: https://github.com/apache/lucene-solr/pull/1707
[jira] [Commented] (SOLR-14692) JSON Facet "join" domain should take optional "method" property
[ https://issues.apache.org/jira/browse/SOLR-14692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17175514#comment-17175514 ] ASF subversion and git services commented on SOLR-14692: Commit 5887032e95953a8d93d723e1a5210793472def71 in lucene-solr's branch refs/heads/master from Jason Gerlowski [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=5887032 ] SOLR-14692: Allow 'method' specification on JSON Facet join domain transforms (#1707)
[GitHub] [lucene-solr] gerlowskija commented on pull request #1707: SOLR-14692: Allow 'method' specification on JSON Facet join domain transforms
gerlowskija commented on pull request #1707: URL: https://github.com/apache/lucene-solr/pull/1707#issuecomment-671912624 Thanks for the review Munendra; I made the changes you suggested. Merging now.
[jira] [Commented] (LUCENE-9439) Matches API should enumerate hit fields that have no positions (no iterator)
[ https://issues.apache.org/jira/browse/LUCENE-9439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17175481#comment-17175481 ] Dawid Weiss commented on LUCENE-9439: - A cleaned up version in the PR. Running tests and checks now. > Matches API should enumerate hit fields that have no positions (no iterator) > > > Key: LUCENE-9439 > URL: https://issues.apache.org/jira/browse/LUCENE-9439 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Dawid Weiss >Assignee: Dawid Weiss >Priority: Minor > Attachments: LUCENE-9439.patch, matchhighlighter.patch > > Time Spent: 2h 40m > Remaining Estimate: 0h > > I have been fiddling with Matches API and it's great. There is one corner > case that doesn't work for me though -- queries that affect fields without > positions return {{MatchesUtil.MATCH_WITH_NO_TERMS}} but this constant is > problematic as it doesn't carry the field name that caused it (returns null). > The associated fromSubMatches combines all these constants into one (or > swallows them) which is another problem. > I think it would be more consistent if MATCH_WITH_NO_TERMS was replaced with > a true match (carrying field name) returning an empty iterator (or a constant > "empty" iterator NO_TERMS). > I have a very compelling use case: I wrote an "auto-highlighter" that runs on > top of Matches API and automatically picks up query-relevant fields and > snippets. Everything works beautifully except for cases where fields are > searchable but don't have any positions (token-like fields). > I can work on a patch but wanted to reach out first - [~romseygeek]?
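The problem described above can be sketched with a minimal stand-in, which is NOT the real Lucene Matches API: it only illustrates why a single shared MATCH_WITH_NO_TERMS constant loses the matching field's name, while a true per-field match with an empty position iterator keeps it. All class and method names here are hypothetical simplifications.

```java
import java.util.Collections;
import java.util.Iterator;

public class MatchesSketch {

    /** Hypothetical, simplified stand-in for a per-field match entry. */
    static final class FieldMatch {
        final String field;                    // null for the shared constant
        final Iterator<Integer> positions;     // empty when the field has no positions

        FieldMatch(String field) {
            this.field = field;
            this.positions = Collections.emptyIterator();
        }
    }

    // Current behaviour: one shared constant, so the field name is gone (null).
    static final FieldMatch MATCH_WITH_NO_TERMS = new FieldMatch(null);

    // Proposed behaviour: a true match carrying the field name, with an empty iterator.
    static FieldMatch noPositionsMatch(String field) {
        return new FieldMatch(field);
    }

    public static void main(String[] args) {
        // An auto-highlighter cannot tell which field produced this match:
        System.out.println(MATCH_WITH_NO_TERMS.field);      // null
        // ...but it can with a per-field match:
        System.out.println(noPositionsMatch("tag").field);  // tag
    }
}
```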
[GitHub] [lucene-solr] atris edited a comment on pull request #1737: SOLR-14615: Implement CPU Utilization Based Circuit Breaker
atris edited a comment on pull request #1737: URL: https://github.com/apache/lucene-solr/pull/1737#issuecomment-671879842 > The configuration can be as simple as > > `` > > This way you can just read all the attributes all at once from the `PluginInfo` . > CircuitBreaker should be a type of plugin. It should be an interface As discussed offline, I will refactor circuit breaker infrastructure to use PluginInfo as a part of 8.7 (hence will leave this PR's JIRA open for that effort). Not proceeding with that effort in this PR.
[jira] [Commented] (SOLR-14588) Circuit Breakers Infrastructure and Real JVM Based Circuit Breaker
[ https://issues.apache.org/jira/browse/SOLR-14588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17175453#comment-17175453 ] Atri Sharma commented on SOLR-14588: As discussed offline, this can be done as a refactor before 8.7 – hence leaving this Jira open to track that specific effort. > Circuit Breakers Infrastructure and Real JVM Based Circuit Breaker > -- > > Key: SOLR-14588 > URL: https://issues.apache.org/jira/browse/SOLR-14588 > Project: Solr > Issue Type: Improvement >Reporter: Atri Sharma >Assignee: Atri Sharma >Priority: Blocker > Fix For: master (9.0), 8.7 > > Time Spent: 13h 50m > Remaining Estimate: 0h > > This Jira tracks the addition of circuit breakers in the search path and > implements a JVM-based circuit breaker which rejects incoming search requests > if the JVM heap usage exceeds a defined percentage.
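A heap-based trip check of the kind this Jira describes can be sketched as follows. This is a simplified illustration, not the actual Solr implementation; the class name and threshold handling are hypothetical, and it only shows the core idea of comparing current heap usage against a configured percentage.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;

public class HeapCircuitBreakerSketch {
    private final double thresholdPct;  // e.g. 95.0 for "trip above 95% heap usage"
    private final MemoryMXBean memoryBean = ManagementFactory.getMemoryMXBean();

    public HeapCircuitBreakerSketch(double thresholdPct) {
        this.thresholdPct = thresholdPct;
    }

    // Returns true when current heap usage exceeds the configured percentage;
    // in that case an incoming search request would be rejected.
    public boolean isTripped() {
        long used = memoryBean.getHeapMemoryUsage().getUsed();
        long max = memoryBean.getHeapMemoryUsage().getMax();
        if (max <= 0) {
            return false;  // max heap not reported; fail open
        }
        return (100.0 * used) / max > thresholdPct;
    }

    public static void main(String[] args) {
        HeapCircuitBreakerSketch breaker = new HeapCircuitBreakerSketch(95.0);
        System.out.println("tripped=" + breaker.isTripped());
    }
}
```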
[GitHub] [lucene-solr] atris commented on pull request #1737: SOLR-14615: Implement CPU Utilization Based Circuit Breaker
atris commented on pull request #1737: URL: https://github.com/apache/lucene-solr/pull/1737#issuecomment-671879842 > The configuration can be as simple as > > `` > > This way you can just read all the attributes all at once from the `PluginInfo` . > CircuitBreaker should be a type of plugin. It should be an interface As discussed offline, I will refactor circuit breaker infrastructure to use PluginInfo as a part of 8.7 (hence will leave this PR's JIRA open for that effort). Not proceeding with that effort in this PR.
[jira] [Resolved] (LUCENE-9454) Upgrade hamcrest to version 2.2
[ https://issues.apache.org/jira/browse/LUCENE-9454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dawid Weiss resolved LUCENE-9454. - Resolution: Fixed > Upgrade hamcrest to version 2.2 > --- > > Key: LUCENE-9454 > URL: https://issues.apache.org/jira/browse/LUCENE-9454 > Project: Lucene - Core > Issue Type: Task >Affects Versions: master (9.0) >Reporter: Dawid Weiss >Assignee: Dawid Weiss >Priority: Trivial > Time Spent: 0.5h > Remaining Estimate: 0h >
[jira] [Commented] (LUCENE-9454) Upgrade hamcrest to version 2.2
[ https://issues.apache.org/jira/browse/LUCENE-9454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17175420#comment-17175420 ] ASF subversion and git services commented on LUCENE-9454: - Commit 5375a2d2ada2bb3bd94cffcb49a730ec234c8649 in lucene-solr's branch refs/heads/master from Dawid Weiss [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=5375a2d ] LUCENE-9454: upgrade hamcrest to version 2.2. (#1738) > Upgrade hamcrest to version 2.2 > --- > > Key: LUCENE-9454 > URL: https://issues.apache.org/jira/browse/LUCENE-9454 > Project: Lucene - Core > Issue Type: Task >Affects Versions: master (9.0) >Reporter: Dawid Weiss >Assignee: Dawid Weiss >Priority: Trivial > Time Spent: 0.5h > Remaining Estimate: 0h >
[GitHub] [lucene-solr] dweiss merged pull request #1738: LUCENE-9454: upgrade hamcrest to version 2.2.
dweiss merged pull request #1738: URL: https://github.com/apache/lucene-solr/pull/1738
[jira] [Comment Edited] (SOLR-14588) Circuit Breakers Infrastructure and Real JVM Based Circuit Breaker
[ https://issues.apache.org/jira/browse/SOLR-14588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17175402#comment-17175402 ] Noble Paul edited comment on SOLR-14588 at 8/11/20, 9:38 AM: - The configuration can be as follows {code:xml} {code} Nowhere else in {{solrconfig.xml}} or {{SolrConfig.java}} should we have a reference to the circuit breaker was (Author: noble.paul): The configuration can be as follows {code:xml} {code} Nowhere else in {{solrconfig.xml}} or {{SolrConfig.java}} should we have a reference to the circuit breaker > Circuit Breakers Infrastructure and Real JVM Based Circuit Breaker > -- > > Key: SOLR-14588 > URL: https://issues.apache.org/jira/browse/SOLR-14588 > Project: Solr > Issue Type: Improvement >Reporter: Atri Sharma >Assignee: Atri Sharma >Priority: Blocker > Fix For: master (9.0), 8.7 > > Time Spent: 13h 50m > Remaining Estimate: 0h > > This Jira tracks the addition of circuit breakers in the search path and > implements a JVM-based circuit breaker which rejects incoming search requests > if the JVM heap usage exceeds a defined percentage.