[jira] [Comment Edited] (SOLR-13998) Add thread safety annotation to classes

2020-08-11 Thread Anshum Gupta (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17176075#comment-17176075
 ] 

Anshum Gupta edited comment on SOLR-13998 at 8/12/20, 6:56 AM:
---

[~marcussorealheis] - The idea behind this came from a branch that was shared 
by Mark, and while I was planning to pick a bunch of those changes to improve a 
few things in Solr, I didn't get the bandwidth to continue with this effort. 
The idea here is to have an option that allows us to annotate classes in Solr 
w.r.t. thread safety. While retrofitting that might be tricky, it can certainly 
be used for new classes. This mechanism allows other developers to correctly 
annotate classes so that others can safely consume/extend them.
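
For illustration only, a minimal sketch of what such a documentation-only marker 
and its use could look like (the names and retention policy here are 
assumptions; the actual annotations added to Solr may differ):

{code:java}
import java.lang.annotation.Documented;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

/** Hypothetical marker stating that a class is safe to use from multiple threads. */
@Documented
@Retention(RetentionPolicy.SOURCE)
@Target(ElementType.TYPE)
@interface SolrThreadSafe {
}

// A class declares its concurrency contract so consumers/extenders know what to expect.
@SolrThreadSafe
class SomeRegistry {
  // all public methods are intended to be callable concurrently
}
{code}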

I agree, in an ideal world, being able to annotate existing classes would have 
been an awesome plan, and while that was the plan to some extent, it never got 
there. Hopefully I'll get back to it, or someone else will, and make things 
better for other devs.


was (Author: anshumg):
[~marcussorealheis] - The idea behind this came from a branch that was shared 
by Mark, and while I was planning to pick a bunch of those changes to improve a 
few things in Solr, I didn't get the bandwidth to continue with this effort. I 
think the idea here is to have an option that allows us to annotate classes in 
Solr w.r.t. thread safety. While retrofitting that might be tricky, it can 
certainly be used for new classes. This mechanism allows other developers to 
correctly annotate classes so that others can safely consume/extend them.

I agree, in an ideal world, being able to annotate existing classes would have 
been an awesome plan, and while that was the plan to some extent, it never got 
there. Hopefully I'll get back to it, or someone else will, and make things 
better for other devs.

> Add thread safety annotation to classes
> ---
>
> Key: SOLR-13998
> URL: https://issues.apache.org/jira/browse/SOLR-13998
> Project: Solr
>  Issue Type: Improvement
>Reporter: Anshum Gupta
>Assignee: Anshum Gupta
>Priority: Major
> Fix For: master (9.0), 8.4
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Add annotations that can be used to mark classes as thread safe / single 
> threaded in Solr.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-13998) Add thread safety annotation to classes

2020-08-11 Thread Anshum Gupta (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17176075#comment-17176075
 ] 

Anshum Gupta commented on SOLR-13998:
-

[~marcussorealheis] - The idea behind this came from a branch that was shared 
by Mark, and while I was planning to pick a bunch of those changes to improve a 
few things in Solr, I didn't get the bandwidth to continue with this effort. I 
think the idea here is to have an option that allows us to annotate classes in 
Solr w.r.t. thread safety. While retrofitting that might be tricky, it can 
certainly be used for new classes. This mechanism allows other developers to 
correctly annotate classes so that others can safely consume/extend them.

I agree, in an ideal world, being able to annotate existing classes would have 
been an awesome plan, and while that was the plan to some extent, it never got 
there. Hopefully I'll get back to it, or someone else will, and make things 
better for other devs.

> Add thread safety annotation to classes
> ---
>
> Key: SOLR-13998
> URL: https://issues.apache.org/jira/browse/SOLR-13998
> Project: Solr
>  Issue Type: Improvement
>Reporter: Anshum Gupta
>Assignee: Anshum Gupta
>Priority: Major
> Fix For: master (9.0), 8.4
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Add annotations that can be used to mark classes as thread safe / single 
> threaded in Solr.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-14726) Streamline getting started experience

2020-08-11 Thread Jira


[ 
https://issues.apache.org/jira/browse/SOLR-14726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17176071#comment-17176071
 ] 

Jan Høydahl commented on SOLR-14726:


cURL is the "default", so it is probably what we should default to. Instead of 
mentioning wget or Perl, we could perhaps mention HTTPie (I love it), as well 
as the DevTools section of YASA, which I suppose can handle POST as well as 
GET; see screenshot: 
 !yasa-http.png!

Once the Ref Guide is consistent in using standard tools, we still don't need 
to remove bin/post in 9.0; it can remain as a hidden gem throughout 10.x to 
cause the least surprise for folks who have already integrated it into their 
tooling, and to allow decent alternatives for indexing a whole folder structure 
to emerge.

> Streamline getting started experience
> -
>
> Key: SOLR-14726
> URL: https://issues.apache.org/jira/browse/SOLR-14726
> Project: Solr
>  Issue Type: Task
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Ishan Chattopadhyaya
>Priority: Major
>  Labels: newdev
> Attachments: yasa-http.png
>
>
> The reference guide Solr tutorial is here:
> https://lucene.apache.org/solr/guide/8_6/solr-tutorial.html
> It needs to be simplified and easy to follow. Also, it should reflect our 
> best practices, that should also be followed in production. I have following 
> suggestions:
> # Make it less verbose. It is too long. On my laptop, it required 35 page 
> downs button presses to get to the bottom of the page!
> # First step of the tutorial should be to enable security (basic auth should 
> suffice).
> # {{./bin/solr start -e cloud}} <-- All references of -e should be removed.
> # All references of {{bin/solr post}} to be replaced with {{curl}}
> # Convert all {{bin/solr create}} references to curl of collection creation 
> commands
> # Add docker based startup instructions.
> # Create a Jupyter Notebook version of the entire tutorial, make it so that 
> it can be easily executed from Google Colaboratory. Here's an example: 
> https://twitter.com/TheSearchStack/status/1289703715981496320
> # Provide downloadable Postman and Insomnia files so that the same tutorial 
> can be executed from those tools. Except for starting Solr, all other steps 
> should be possible to be carried out from those tools.
> # Use V2 APIs everywhere in the tutorial
> # Remove all example modes, sample data (films, tech products etc.), 
> configsets from Solr's distribution (instead let the examples refer to them 
> from github)
> # Remove the post tool from Solr, curl should suffice.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (LUCENE-9448) Make an equivalent to Ant's "run" target for Luke module

2020-08-11 Thread Dawid Weiss (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17176070#comment-17176070
 ] 

Dawid Weiss commented on LUCENE-9448:
-

I added a pull request this time - I removed the Solr distribution bit since we 
established it's not the right time, but added an optional "standalone 
distribution-only" JAR as an example (highlighter) and a readme file with 
substitutable parameters. Once you run:

gradlew -p lucene\luke assemble

you'll see that the readme file contains a java command to launch Luke. I don't 
think scripts are needed after this (it's a developer tool after all, so I 
assume minimal terminal capabilities).

I'm leaving the rest to you, Tomoko - continue as you please.

> Make an equivalent to Ant's "run" target for Luke module
> 
>
> Key: LUCENE-9448
> URL: https://issues.apache.org/jira/browse/LUCENE-9448
> Project: Lucene - Core
>  Issue Type: Sub-task
>Reporter: Tomoko Uchida
>Priority: Minor
> Attachments: LUCENE-9448.patch, LUCENE-9448.patch
>
>
> With Ant build, Luke Swing app can be launched by "ant run" after checking 
> out the source code. "ant run" allows developers to immediately see the 
> effects of UI changes without creating the whole zip/tgz package (originally, 
> it was suggested when integrating Luke to Lucene).
> In Gradle, {{:lucene:luke:run}} task would be easily implemented with 
> {{JavaExec}}, I think.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Updated] (SOLR-14726) Streamline getting started experience

2020-08-11 Thread Jira


 [ 
https://issues.apache.org/jira/browse/SOLR-14726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Høydahl updated SOLR-14726:
---
Attachment: yasa-http.png

> Streamline getting started experience
> -
>
> Key: SOLR-14726
> URL: https://issues.apache.org/jira/browse/SOLR-14726
> Project: Solr
>  Issue Type: Task
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Ishan Chattopadhyaya
>Priority: Major
>  Labels: newdev
> Attachments: yasa-http.png
>
>
> The reference guide Solr tutorial is here:
> https://lucene.apache.org/solr/guide/8_6/solr-tutorial.html
> It needs to be simplified and easy to follow. Also, it should reflect our 
> best practices, that should also be followed in production. I have following 
> suggestions:
> # Make it less verbose. It is too long. On my laptop, it required 35 page 
> downs button presses to get to the bottom of the page!
> # First step of the tutorial should be to enable security (basic auth should 
> suffice).
> # {{./bin/solr start -e cloud}} <-- All references of -e should be removed.
> # All references of {{bin/solr post}} to be replaced with {{curl}}
> # Convert all {{bin/solr create}} references to curl of collection creation 
> commands
> # Add docker based startup instructions.
> # Create a Jupyter Notebook version of the entire tutorial, make it so that 
> it can be easily executed from Google Colaboratory. Here's an example: 
> https://twitter.com/TheSearchStack/status/1289703715981496320
> # Provide downloadable Postman and Insomnia files so that the same tutorial 
> can be executed from those tools. Except for starting Solr, all other steps 
> should be possible to be carried out from those tools.
> # Use V2 APIs everywhere in the tutorial
> # Remove all example modes, sample data (films, tech products etc.), 
> configsets from Solr's distribution (instead let the examples refer to them 
> from github)
> # Remove the post tool from Solr, curl should suffice.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (LUCENE-8626) standardise test class naming

2020-08-11 Thread Marcus Eagan (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-8626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17176068#comment-17176068
 ] 

Marcus Eagan commented on LUCENE-8626:
--

Great ideas [~dweiss] and [~dsmiley]!

> standardise test class naming
> -
>
> Key: LUCENE-8626
> URL: https://issues.apache.org/jira/browse/LUCENE-8626
> Project: Lucene - Core
>  Issue Type: Test
>Reporter: Christine Poerschke
>Priority: Major
> Attachments: SOLR-12939.01.patch, SOLR-12939.02.patch, 
> SOLR-12939.03.patch, SOLR-12939_hoss_validation_groovy_experiment.patch
>
>
> This was mentioned and proposed on the dev mailing list. Starting this ticket 
> here to start to make it happen?
> History: This ticket was created as 
> https://issues.apache.org/jira/browse/SOLR-12939 ticket and then got 
> JIRA-moved to become https://issues.apache.org/jira/browse/LUCENE-8626 ticket.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] dweiss opened a new pull request #1742: Standalone distribution assembly for Luke

2020-08-11 Thread GitBox


dweiss opened a new pull request #1742:
URL: https://github.com/apache/lucene-solr/pull/1742


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] dweiss commented on a change in pull request #1732: Clean up many small fixes

2020-08-11 Thread GitBox


dweiss commented on a change in pull request #1732:
URL: https://github.com/apache/lucene-solr/pull/1732#discussion_r469031590



##
File path: lucene/core/src/java/org/apache/lucene/index/SegmentInfos.java
##
@@ -440,7 +440,7 @@ public static final SegmentInfos readCommit(Directory 
directory, ChecksumIndexIn
   if (format >= VERSION_70) { // oldest supported version
 CodecUtil.checkFooter(input, priorE);
   } else {
-throw IOUtils.rethrowAlways(priorE);

Review comment:
   And what's wrong with a throw from within finally? A finally block is 
technically just a block of code, like any other. The compiler very likely 
assumes you're suppressing an exception if you throw from within finally, but 
that's not the case here. 
   
   I don't know if moving that throw will change the logic. Maybe not. Maybe 
yes. Given the two options, I wouldn't touch it. My concern was that you 
slipped such things in as part of an otherwise "trivial" set of patches. 
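   
   To spell out the pattern being discussed, a simplified, hypothetical sketch 
(not the actual SegmentInfos code): a throw inside finally is an ordinary 
throw, and rethrowing a previously captured exception suppresses nothing here.
   
       import java.io.IOException;
       import org.apache.lucene.util.IOUtils;
       
       class RethrowFromFinallyDemo {
         static void read() throws IOException {
           Throwable priorE = null;
           try {
             doWork();                              // body that may fail
           } catch (Throwable t) {
             priorE = t;                            // capture the original failure
           } finally {
             if (priorE != null) {
               throw IOUtils.rethrowAlways(priorE); // propagates the original exception
             }
           }
         }
         static void doWork() throws IOException {
           throw new IOException("simulated failure");
         }
       }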





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] dweiss commented on a change in pull request #1732: Clean up many small fixes

2020-08-11 Thread GitBox


dweiss commented on a change in pull request #1732:
URL: https://github.com/apache/lucene-solr/pull/1732#discussion_r469027938



##
File path: lucene/core/src/java/org/apache/lucene/analysis/Analyzer.java
##
@@ -367,12 +367,12 @@ public void close() {
 /**
  * Original source of the tokens.
  */
-protected final Consumer source;

Review comment:
   Maybe. It doesn't matter though - this changes the API of a class that's 
been there for ages. I bet there is a class out there somewhere (let's say A 
extends Analyzer) and another one (B extends A) where A overrides the getter 
but B reaches out for the original field. Do we want this to break just to hide 
a field that can be useful for subclasses just to silence an automatic code 
inspection? I don't think we should.
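   
   A tiny, hypothetical sketch of the breakage scenario described above 
(made-up classes, not the actual Lucene ones):
   
       // A overrides the getter; B still reads the inherited field directly.
       class Base {
         protected final Object source = new Object();   // long-standing protected field
         protected Object getSource() { return source; } // accessor added later
       }
       class A extends Base {
         @Override
         protected Object getSource() { return super.getSource(); }
       }
       class B extends A {
         Object useIt() { return source; } // hiding or removing the field breaks B at compile time
       }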





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-13998) Add thread safety annotation to classes

2020-08-11 Thread Marcus Eagan (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17176031#comment-17176031
 ] 

Marcus Eagan commented on SOLR-13998:
-

Hi [~anshum], I'm working on some janitorial work in the project, and I noticed 
that you added annotations but did not apply them. Thus, this title is 
misleading, and one of the annotations added here has only been used once since 
you added this PR in December. 

I'm curious what you had in mind for the SolrSingleThreaded annotation, and why 
you didn't actually apply the annotation anywhere.

Furthermore, would you like help expanding the usage of this class, or do you 
feel it was hastily added and is a waste of time? I doubt the latter, because 
there should be quite a few obvious uses for this code. However, I will defer 
to you since you added it. I'm looking through all the PRs of this year to 
create an inventory of the sort of code and behavior (in code) that I'd hope to 
steward the community away from. I have a few items so far, but this is one I 
was not totally sure about because of its sparsity and how long it had been 
just sitting in the repo (7 months before any usage at all). 

I also don't know whether the way Solr operates is that we just throw tools 
into the toolbox, and if someone uses them one day, great; if not, someone will 
one day. Ideally, if we bring something to the project, we ourselves would at 
least use it because we see value in it. I'm an outsider, learning every day, 
and hoping to improve the project. 

> Add thread safety annotation to classes
> ---
>
> Key: SOLR-13998
> URL: https://issues.apache.org/jira/browse/SOLR-13998
> Project: Solr
>  Issue Type: Improvement
>Reporter: Anshum Gupta
>Assignee: Anshum Gupta
>Priority: Major
> Fix For: master (9.0), 8.4
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Add annotations that can be used to mark classes as thread safe / single 
> threaded in Solr.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (LUCENE-8626) standardise test class naming

2020-08-11 Thread David Smiley (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-8626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17176013#comment-17176013
 ] 

David Smiley commented on LUCENE-8626:
--

I request that the Solr side be delayed some, and maybe get its own issue.  The 
reason for my request is [~markrmil...@gmail.com]'s massive branch 
"reference_impl", which he writes about in SOLR-14636.  It's unclear how/when 
master & this branch will get reconciled, but presently I suggest using caution 
when doing very wide-scale changes, _especially_ for renames or moves.

> standardise test class naming
> -
>
> Key: LUCENE-8626
> URL: https://issues.apache.org/jira/browse/LUCENE-8626
> Project: Lucene - Core
>  Issue Type: Test
>Reporter: Christine Poerschke
>Priority: Major
> Attachments: SOLR-12939.01.patch, SOLR-12939.02.patch, 
> SOLR-12939.03.patch, SOLR-12939_hoss_validation_groovy_experiment.patch
>
>
> This was mentioned and proposed on the dev mailing list. Starting this ticket 
> here to start to make it happen?
> History: This ticket was created as 
> https://issues.apache.org/jira/browse/SOLR-12939 ticket and then got 
> JIRA-moved to become https://issues.apache.org/jira/browse/LUCENE-8626 ticket.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-13412) Make the Lucene Luke module available from a Solr distribution

2020-08-11 Thread Tomoko Uchida (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17176006#comment-17176006
 ] 

Tomoko Uchida commented on SOLR-13412:
--

FYI: I opened LUCENE-9459 to create a getting started guide for the Luke module.

> Make the Lucene Luke module available from a Solr distribution
> --
>
> Key: SOLR-13412
> URL: https://issues.apache.org/jira/browse/SOLR-13412
> Project: Solr
>  Issue Type: Improvement
>Reporter: Erick Erickson
>Assignee: Erick Erickson
>Priority: Major
> Attachments: SOLR-13412.patch
>
>
> Now that [~Tomoko Uchida] has put in a great effort to bring Luke into the 
> project, I think it would be good to be able to access it from a Solr distro.
> I want to go to the right place under the Solr install directory and start 
> Luke up to examine the local indexes. 
> This ticket is explicitly _not_ about accessing it from the admin UI; Luke is 
> a stand-alone app that must be invoked on the node that has a Lucene index on 
> the local filesystem.
> We need to 
>  * have it included in Solr when running "ant package". 
>  * add some bits to the ref guide on how to invoke
>  ** Where to invoke it from
>  ** mention anything that has to be installed.
>  ** any other "gotchas" someone just installing Solr should be aware of.
>  * Ant should not be necessary.
>  * 
>  
> I'll assign this to myself to keep track of, but would not be offended in the 
> least if someone with more knowledge of "ant package" and the like wanted to 
> take it over ;)
> If we can do it at all



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Updated] (LUCENE-9459) "Getting Started" guide for Luke

2020-08-11 Thread Tomoko Uchida (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-9459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tomoko Uchida updated LUCENE-9459:
--
Description: 
While Luke has been very popular, widely accepted tool to Lucene users 
(including Solr and Elasticsearch users) for 10+ years, it lacks good 
documentation and/or user guide. Although Luke is a GUI tool that describes 
itself, it's not good for new users.

The lack of documentation is partly due to my laziness though, there is some 
inherent difficulty of explaining such low-level tool; if you don't know Lucene 
you don't understand Luke's capability or usefulness, if you once understand 
Lucene at some level, it's obvious to you and no explanation is needed.  :)

Nonetheless, it would be great if we have "Getting Started" documentation for 
Luke on our web site for new users/devs.

We may be able to have a Markdown file with some screenshots and usage 
descriptions, then convert it to HTML by Gradle task,  so that we can publish 
it with whole API documentation.

  was:
While Luke has been very popular, widely accepted tool to Lucene users 
(including Solr and Elasticsearch users) for 10+ years, it lacks good 
documentation and/or user guide. Although Luke is a GUI tool that describes 
itself, it's not good for new users.

The lack of documentation is partly due to my laziness though, there is some 
inherent difficulty of explaining such low-level tool; if you don't know Lucene 
you don't understand Luke's capability or usefulness, if you once understand 
Lucene at some level, it's obvious to you and no explanation is needed.  :)

Nonetheless, it would be great if we have "Getting Started" documentation for 
Luke on our web site for new userd/devs.

We may be able to have a Markdown file with some screenshots and usage 
descriptions, then convert it to HTML by Gradle task,  so that we can publish 
it with whole API documentation.


> "Getting Started" guide for Luke 
> -
>
> Key: LUCENE-9459
> URL: https://issues.apache.org/jira/browse/LUCENE-9459
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/luke
>Reporter: Tomoko Uchida
>Priority: Major
>  Labels: newdev
>
> While Luke has been very popular, widely accepted tool to Lucene users 
> (including Solr and Elasticsearch users) for 10+ years, it lacks good 
> documentation and/or user guide. Although Luke is a GUI tool that describes 
> itself, it's not good for new users.
> The lack of documentation is partly due to my laziness though, there is some 
> inherent difficulty of explaining such low-level tool; if you don't know 
> Lucene you don't understand Luke's capability or usefulness, if you once 
> understand Lucene at some level, it's obvious to you and no explanation is 
> needed.  :)
> Nonetheless, it would be great if we have "Getting Started" documentation for 
> Luke on our web site for new users/devs.
> We may be able to have a Markdown file with some screenshots and usage 
> descriptions, then convert it to HTML by Gradle task,  so that we can publish 
> it with whole API documentation.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Created] (LUCENE-9459) "Getting Started" guide for Luke

2020-08-11 Thread Tomoko Uchida (Jira)
Tomoko Uchida created LUCENE-9459:
-

 Summary: "Getting Started" guide for Luke 
 Key: LUCENE-9459
 URL: https://issues.apache.org/jira/browse/LUCENE-9459
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/luke
Reporter: Tomoko Uchida


While Luke has been very popular, widely accepted tool to Lucene users 
(including Solr and Elasticsearch users) for 10+ years, it lacks good 
documentation and/or user guide. Although Luke is a GUI tool that describes 
itself, it's not good for new users.

The lack of documentation is partly due to my laziness though, there is some 
inherent difficulty of explaining such low-level tool; if you don't know Lucene 
you don't understand Luke's capability or usefulness, if you once understand 
Lucene at some level, it's obvious to you and no explanation is needed.  :)

Nonetheless, it would be great if we have "Getting Started" documentation for 
Luke on our web site for new userd/devs.

We may be able to have a Markdown file with some screenshots and usage 
descriptions, then convert it to HTML by Gradle task,  so that we can publish 
it with whole API documentation.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] CaoManhDat commented on a change in pull request #1724: SOLR-14684: CloudExitableDirectoryReaderTest failing about 25% of the time

2020-08-11 Thread GitBox


CaoManhDat commented on a change in pull request #1724:
URL: https://github.com/apache/lucene-solr/pull/1724#discussion_r468970705



##
File path: 
solr/solrj/src/java/org/apache/solr/client/solrj/impl/LBSolrClient.java
##
@@ -155,6 +159,7 @@ public ServerIterator(Req req, Map 
zombieServers) {
   this.req = req;
   this.zombieServers = zombieServers;
   this.timeAllowedNano = getTimeAllowedInNanos(req.getRequest());
+  log.info("TimeAllowedNano:{}", this.timeAllowedNano);

Review comment:
   Thank you, I totally forgot about that when creating this PR.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Created] (SOLR-14730) Build new SolrJ APIs without concrete classes like NamedList/Map

2020-08-11 Thread Noble Paul (Jira)
Noble Paul created SOLR-14730:
-

 Summary: Build new SolrJ APIs without concrete classes like 
NamedList/Map
 Key: SOLR-14730
 URL: https://issues.apache.org/jira/browse/SOLR-14730
 Project: Solr
  Issue Type: Improvement
  Security Level: Public (Default Security Level. Issues are Public)
Reporter: Noble Paul


We must minimize weakly typed code. Our public APIs should be programmed 
against interfaces and, wherever possible, use POJOs.
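
A rough, purely illustrative sketch of the intent (the names below are made up, 
not an actual SolrJ API):

{code:java}
// Strongly typed request/response POJOs behind an interface,
// instead of hand-building and unpacking NamedList/Map structures.
public interface CollectionAdmin {
  CreateCollectionResponse create(CreateCollectionRequest request);
}

class CreateCollectionRequest {   // POJO request
  String name;
  int numShards = 1;
  int replicationFactor = 1;
}

class CreateCollectionResponse {  // POJO response
  boolean success;
  long elapsedMillis;
}
{code}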



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Updated] (SOLR-14730) Build new SolrJ APIs without concrete classes like NamedList/Map

2020-08-11 Thread Noble Paul (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul updated SOLR-14730:
--
Labels: clean-api  (was: )

> Build new SolrJ APIs without concrete classes like NamedList/Map
> 
>
> Key: SOLR-14730
> URL: https://issues.apache.org/jira/browse/SOLR-14730
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Noble Paul
>Priority: Major
>  Labels: clean-api
>
> We must minimize weakly typed code. Our public APIs should be programmed 
> against interfaces and wherever possible use POJOs



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Comment Edited] (SOLR-14687) Make child/parent query parsers natively aware of _nest_path_

2020-08-11 Thread Chris M. Hostetter (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17175086#comment-17175086
 ] 

Chris M. Hostetter edited comment on SOLR-14687 at 8/12/20, 12:32 AM:
--

besides the fact that Jira's WYSIWYG editor lied to me and munged up some of 
the formatting of "STAR:STAR" and "UNDERSCORE nest UNDERSCORE path UNDERSCORE" 
in many places, something else has been nagging that i felt like i was 
overlooking and i finally figured out what it is: I hadn't really accounted for 
docs that _have_ a "nest path" but their path doesn't have any common ancestors 
with the {{parentPath}} specified – ie: how would a mix of {{/a/b/c}} hierarchy 
docs mixed in an index with docs having a hierarchy of {{/x/y/z}} wind up 
affecting each other?

I *think* that what i described above would still mostly work for the "parent" 
parser – even if the "parent filter" generated by a {{parentPath="/a/b/c"}} as 
i described above didn't really "rule out" the other docs, because this still 
wouldn't match the "nest path with a prefix of /a/b/c" rule for the "children", 
but it still wouldn't really be a "correct" "parents bit set filter" as the 
underlying code expects it to be in terms of identifying all "non children" 
documents ... but I'm _pretty sure_ it would be broken for the "child" parser 
case, because some doc with an "/x" or "/x/y" path isn't going to be matched 
by the "parents filter bitset" so might get swallowed up in the list of 
children.

The other thing that bugged me was the (mistaken & misguided) need to ' ... 
compute a list of all "prefix subpaths" ... ' – i'm not sure why i thought that 
was necessary, instead of just saying "must _NOT_ have a prefix of the 
specified path" – ie:
{code:java}
 GIVEN:{!foo parentPath="/a/b/c"} ...

INSTEAD OF:PARENT FILTER BITSET = ((*:* -_nest_path_:*) OR _nest_path_:(/a 
/a/b /a/b/c))

  JUST USE:PARENT FILTER BITSET = (*:* -{prefix f="_nest_path_" 
v="/a/b/c/"}) {code}
...which (IIUC) should solve both problems, by matching:
 * docs w/o any nest path
 * docs with a nest path that does NOT start with /a/b/c/
 ** which includes the immediate "/a/b/c" parents, as well as their ancestors, 
as well as any docs with completely orthogonal paths (like /x/y/z)

But of course: in the case of {{parentFilter="/"}} this would still simply be 
"docs w/o a nest path"

That should work, right?

I also think i made some mistakes/typos in my examples above in trying to 
articulate what the equivalent "old style" query would be, so let me restate all 
of the examples in full...
{noformat}
NEW:  q={!parent parentPath="/a/b/c"}c_title:son

OLD:  q=(+{!field f="_nest_path_" v="/a/b/c"} +{!parent which=$ff v=$vv})
 ff=(*:* -{prefix f="_nest_path_" v="/a/b/c/"}) 
 vv=(+c_title:son +{prefix f="_nest_path_" v="/a/b/c/"})
{noformat}
{noformat}
NEW:  q={!parent parentPath="/"}c_title:son

OLD:  q=(-_nest_path_:* +{!parent which=$ff v=$vv}
 ff=(*:* -_nest_path_:*) 
 vv=(+c_title:son +_nest_path_:*)
{noformat}
{noformat}
NEW:  q={!child parentPath="/a/b/c"}p_title:dad

OLD:  q={!child of=$ff v=$vv})
 ff=(*:* -{prefix f="_nest_path_" v="/a/b/c/"}) 
 vv=(+p_title:dad +{field f="_nest_path_" v="/a/b/c"})
{noformat}
{noformat}
NEW:  q={!child parentPath="/"}p_title:dad

OLD:  q={!child of=$ff v=$vv})
 ff=(*:* -_nest_path_:*) 
 vv=(+p_title:dad -_nest_path_:*)
{noformat}
 

[~mkhl] - what do you think about this approach? do you see any flaws in the 
logic here? ... if the logic looks correct, I'd like to write it up as "how to 
create a *safe* of/which local param when using nest path" doc tip for 
SOLR-14383 and move forward there as a documentation improvement, even if there 
are still feature/implementation/syntax concerns/discussion to happen here as 
far as a "new feature"

 *EDIT*: fixed brain fart / typo of + vs - in last example


was (Author: hossman):
besides that fact that Jira's WYSIWYG editor lied to me and munged up some of 
the formatting of "STAR:STAR" and "UNDERSCORE nest UNDERSCORE path UNDERSCORE" 
in many places, something else has been nagging that i felt like i was 
overlooking and i finally figured out what it is: I hadn't really accounted for 
docs that _have_ a "nest path" but their path doesn't have any common ancestors 
with the {{parentPath}} specified – ie: how would a mix of {{/a/b/c}} hierarchy 
docs mixed in an index with docs having a hierarchy of {{/x/y/z}} wind up 
affecting each other?

I *think* that what i described above would still mostly work for the "parent" 
parser – even if the "parent filter" generated by a {{parentPath="/a/b/c"}} as 
i described above didn't really "rule out" the other docs, because this still 
wouldn't match the "nest path with a prefix of /a/b/c" rule for the "children", 
but it still wouldn't really be a "correct" "parents bit set filter" as

[jira] [Commented] (SOLR-14677) DIH doesnt close DataSource when import encounters errors

2020-08-11 Thread Jason Gerlowski (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17175901#comment-17175901
 ] 

Jason Gerlowski commented on SOLR-14677:


I created a PR a few minutes ago that ensures that errors arising during 
close() operations on DIHWriter don't derail efforts to close the 
EntityProcessors and DataSources.  Hoping to commit this in a few days, and 
then I'll propose it in the plugin repo I guess?
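
The general shape of the fix is the usual "close everything even if one close 
fails" pattern; a simplified sketch with hypothetical names (not the exact 
DocBuilder code):

{code:java}
static void closeAll(AutoCloseable... resources) {
  Exception firstError = null;
  for (AutoCloseable resource : resources) {
    try {
      if (resource != null) {
        resource.close();
      }
    } catch (Exception e) {
      if (firstError == null) {
        firstError = e;   // remember the first failure, but keep closing the rest
      }
    }
  }
  if (firstError != null) {
    throw new RuntimeException("DIH wrap-up failed", firstError);
  }
}

// e.g. closeAll(writer, entityProcessor, dataSource);
{code}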

> DIH doesnt close DataSource when import encounters errors
> -
>
> Key: SOLR-14677
> URL: https://issues.apache.org/jira/browse/SOLR-14677
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: contrib - DataImportHandler
>Affects Versions: 7.5, master (9.0)
>Reporter: Jason Gerlowski
>Assignee: Jason Gerlowski
>Priority: Minor
> Attachments: error-solr.log, no-error-solr.log
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> DIH imports don't close DataSource's (which can hold db connections, etc.) in 
> all cases.  Specifically, if an import runs into an unexpected error 
> forwarding processed docs to other nodes, it will neglect to close the 
> DataSource's when it finishes.
> This problem goes back to at least 7.5.  This is partially mitigated in older 
> versions of some DataSource implementations (e.g. JdbcDataSource) by means of 
> a "finalize" hook which invokes "close()" when the DataSource object is 
> garbage-collected.  In practice, this means that resources might be held open 
> longer than necessary but will be closed within a few seconds or minutes by 
> GC.  This only helps JdbcDataSource though - all other DataSource impl's risk 
> leaking resources. 
> In master/9.0, which requires a minimum of Java 11 and doesn't have the 
> finalize-hook, the connections are never cleaned up when an error is 
> encountered during DIH.  DIH will likely be removed for the 9.0 release, but 
> if it isn't this bug should be fixed.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] gerlowskija opened a new pull request #1741: SOLR-14677: Always close DIH EntityProcessor/DataSource

2020-08-11 Thread GitBox


gerlowskija opened a new pull request #1741:
URL: https://github.com/apache/lucene-solr/pull/1741


   # Description
   
   Prior to this commit, the wrapup logic at the end of
   DocBuilder.execute() closed out a series of DIH objects, but did so
   in a way that an exception closing any of them resulted in the remainder
   staying open.  This is especially problematic since Writer.close()
   throws exceptions that DIH uses to determine the success/failure of the
   run.
   
   In practice this caused network errors sending DIH data to other Solr
   nodes to result in leaked JDBC connections.
   
   # Solution
   
   This PR changes DocBuilder's termination logic to handle exceptions
   more gracefully, ensuring that errors closing a DIHWriter (for example)
   don't prevent the closure of entity-processor and DataSource objects.
   
   # Tests
   
   None
   
   # Checklist
   
   Please review the following and check all that apply:
   
   - [x] I have reviewed the guidelines for [How to 
Contribute](https://wiki.apache.org/solr/HowToContribute) and my code conforms 
to the standards described there to the best of my ability.
   - [x] I have created a Jira issue and added the issue ID to my pull request 
title.
   - [x] I have given Solr maintainers 
[access](https://help.github.com/en/articles/allowing-changes-to-a-pull-request-branch-created-from-a-fork)
 to contribute to my PR branch. (optional but recommended)
   - [x] I have developed this patch against the `master` branch.
   - [x] I have run `ant precommit` and the appropriate test suite.
   - [ ] I have added tests for my changes.
   - [ ] I have added documentation for the [Ref 
Guide](https://github.com/apache/lucene-solr/tree/master/solr/solr-ref-guide) 
(for Solr changes only).
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Assigned] (SOLR-14677) DIH doesnt close DataSource when import encounters errors

2020-08-11 Thread Jason Gerlowski (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Gerlowski reassigned SOLR-14677:
--

Assignee: Jason Gerlowski

> DIH doesnt close DataSource when import encounters errors
> -
>
> Key: SOLR-14677
> URL: https://issues.apache.org/jira/browse/SOLR-14677
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: contrib - DataImportHandler
>Affects Versions: 7.5, master (9.0)
>Reporter: Jason Gerlowski
>Assignee: Jason Gerlowski
>Priority: Minor
> Attachments: error-solr.log, no-error-solr.log
>
>
> DIH imports don't close DataSource's (which can hold db connections, etc.) in 
> all cases.  Specifically, if an import runs into an unexpected error 
> forwarding processed docs to other nodes, it will neglect to close the 
> DataSource's when it finishes.
> This problem goes back to at least 7.5.  This is partially mitigated in older 
> versions of some DataSource implementations (e.g. JdbcDataSource) by means of 
> a "finalize" hook which invokes "close()" when the DataSource object is 
> garbage-collected.  In practice, this means that resources might be held open 
> longer than necessary but will be closed within a few seconds or minutes by 
> GC.  This only helps JdbcDataSource though - all other DataSource impl's risk 
> leaking resources. 
> In master/9.0, which requires a minimum of Java 11 and doesn't have the 
> finalize-hook, the connections are never cleaned up when an error is 
> encountered during DIH.  DIH will likely be removed for the 9.0 release, but 
> if it isn't this bug should be fixed.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] dsmiley opened a new pull request #1740: LUCENE-9458: WDGF and WDF should tie-break by endOffset

2020-08-11 Thread GitBox


dsmiley opened a new pull request #1740:
URL: https://github.com/apache/lucene-solr/pull/1740


   Can happen with catenateAll and not generating word xor number part when the 
input ends with the non-generated sub-token.
   
   https://issues.apache.org/jira/browse/LUCENE-9458
   
   CC @jimczi maybe you could review this please; I believe you reviewed the 
predecessor.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Created] (LUCENE-9458) WordDelimiterGraphFilter (and non-graph) should tie-break order using end offset

2020-08-11 Thread David Smiley (Jira)
David Smiley created LUCENE-9458:


 Summary: WordDelimiterGraphFilter (and non-graph) should tie-break 
order using end offset
 Key: LUCENE-9458
 URL: https://issues.apache.org/jira/browse/LUCENE-9458
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/analysis
Reporter: David Smiley
Assignee: David Smiley


WordDelimiterGraphFilter and WordDelimiterFilter do not consult the end offset 
in their sub-token _ordering_.  In the event of a tie-break, I propose the 
longer token come first.  This usually happens already, but not always, and so 
this also feels like an inconsistency when you see it.  This issue can be 
thought of as a bug fix to LUCENE-9006 or an improvement; I have no strong 
feelings on the issue classification.  Before reading further, definitely read 
that issue.

I see this as a problem when using CATENATE_ALL with either GENERATE_WORD_PARTS 
xor GENERATE_NUMBER_PARTS, when the input ends with the part that is not being 
generated.  Consider the input "other-9", and let's assume we want to catenate 
all and generate word parts, but nothing else (not numbers).  This should be 
tokenized in this order: "other9", "other", but today it is emitted in reverse 
order.
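
For reference, a small sketch that exercises this case with the existing filter 
(the flag constants are the ones named in this issue; the three-argument 
constructor and the rest of the scaffolding are assumptions, so treat this as 
illustrative only):

{code:java}
import java.io.StringReader;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.Tokenizer;
import org.apache.lucene.analysis.core.WhitespaceTokenizer;
import org.apache.lucene.analysis.miscellaneous.WordDelimiterGraphFilter;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;

// Prints the emitted terms for "other-9" with GENERATE_WORD_PARTS | CATENATE_ALL;
// per this issue the expected order is "other9" then "other".
public class WdgfTieBreakDemo {
  public static void main(String[] args) throws Exception {
    Tokenizer tok = new WhitespaceTokenizer();
    tok.setReader(new StringReader("other-9"));
    int flags = WordDelimiterGraphFilter.GENERATE_WORD_PARTS
        | WordDelimiterGraphFilter.CATENATE_ALL;
    try (TokenStream ts = new WordDelimiterGraphFilter(tok, flags, null)) {
      CharTermAttribute term = ts.addAttribute(CharTermAttribute.class);
      ts.reset();
      while (ts.incrementToken()) {
        System.out.println(term.toString());
      }
      ts.end();
    }
  }
}
{code}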

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-14706) Upgrading 8.6.0 to 8.6.1 causes collection creation to fail

2020-08-11 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17175844#comment-17175844
 ] 

ASF subversion and git services commented on SOLR-14706:


Commit dc0d049b62b2dd1e5bddc30beda526b79e4a7383 in lucene-solr's branch 
refs/heads/branch_8x from Houston Putman
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=dc0d049 ]

SOLR-14706: Fix support for default autoscaling policy (8x forward-port) (#1739)



> Upgrading 8.6.0 to 8.6.1 causes collection creation to fail
> ---
>
> Key: SOLR-14706
> URL: https://issues.apache.org/jira/browse/SOLR-14706
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: AutoScaling
>Affects Versions: 8.7, 8.6.1
> Environment: 8.6.1 upgraded from 8.6.0 with more than one node
>Reporter: Gus Heck
>Assignee: Houston Putman
>Priority: Blocker
> Fix For: 8.6.1
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> The following steps will reproduce a situation in which collection creation 
> fails with this stack trace:
> {code:java}
> 2020-08-03 12:17:58.617 INFO  
> (OverseerThreadFactory-22-thread-1-processing-n:192.168.2.106:8981_solr) [   
> ] o.a.s.c.a.c.CreateCollectionCmd Create collection test861
> 2020-08-03 12:17:58.751 ERROR 
> (OverseerThreadFactory-22-thread-1-processing-n:192.168.2.106:8981_solr) [   
> ] o.a.s.c.a.c.OverseerCollectionMessageHandler Collection: test861 operation: 
> create failed:org.apache.solr.common.SolrException
>   at 
> org.apache.solr.cloud.api.collections.CreateCollectionCmd.call(CreateCollectionCmd.java:347)
>   at 
> org.apache.solr.cloud.api.collections.OverseerCollectionMessageHandler.processMessage(OverseerCollectionMessageHandler.java:264)
>   at 
> org.apache.solr.cloud.OverseerTaskProcessor$Runner.run(OverseerTaskProcessor.java:517)
>   at 
> org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:212)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.RuntimeException: Only one extra tag supported for the 
> tag cores in {
>   "cores":"#EQUAL",
>   "node":"#ANY",
>   "strict":"false"}
>   at 
> org.apache.solr.client.solrj.cloud.autoscaling.Clause.(Clause.java:122)
>   at 
> org.apache.solr.client.solrj.cloud.autoscaling.Clause.create(Clause.java:235)
>   at 
> java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
>   at 
> java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1374)
>   at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
>   at 
> java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
>   at 
> java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
>   at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
>   at 
> java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499)
>   at 
> org.apache.solr.client.solrj.cloud.autoscaling.Policy.(Policy.java:144)
>   at 
> org.apache.solr.client.solrj.cloud.autoscaling.AutoScalingConfig.getPolicy(AutoScalingConfig.java:372)
>   at 
> org.apache.solr.cloud.api.collections.Assign.usePolicyFramework(Assign.java:300)
>   at 
> org.apache.solr.cloud.api.collections.Assign.usePolicyFramework(Assign.java:277)
>   at 
> org.apache.solr.cloud.api.collections.Assign$AssignStrategyFactory.create(Assign.java:661)
>   at 
> org.apache.solr.cloud.api.collections.CreateCollectionCmd.buildReplicaPositions(CreateCollectionCmd.java:415)
>   at 
> org.apache.solr.cloud.api.collections.CreateCollectionCmd.call(CreateCollectionCmd.java:192)
>   ... 6 more
> {code}
> Generalized steps:
> # Deploy 8.6.0 with separate data directories, create a collection to prove 
> it's working
> # download 
> https://dist.apache.org/repos/dist/dev/lucene/lucene-solr-8.6.1-RC1-reva32a3ac4e43f629df71e5ae30a3330be94b095f2/solr/solr-8.6.1.tgz
> # Stop the server on all nodes
> # replace the 8.6.0 with 8.6.1 
> # Start the server
> # via the admin UI create a collection
> # Observe failure warning box (with no text), check logs, find above trace
> Or more exactly here are my actual commands with a checkout of the 8.6.0 tag 
> in the working dir to which cloud.sh was configured:
> # ./cloud.sh new -r upgrademe 
> # Create collection named test860 via admin ui with _default
> # ./cloud.sh stop 
> # cd upgrademe/
> # cp ../8_6_1_RC1/solr-8.6.1.tgz .
> # mv solr-8.6.0

[GitHub] [lucene-solr] HoustonPutman merged pull request #1739: SOLR-14706: Fix support for default autoscaling policy (8x forward-port)

2020-08-11 Thread GitBox


HoustonPutman merged pull request #1739:
URL: https://github.com/apache/lucene-solr/pull/1739


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Created] (SOLR-14729) Investigate and harden TestExportWriter.testExpr failures

2020-08-11 Thread Joel Bernstein (Jira)
Joel Bernstein created SOLR-14729:
-

 Summary: Investigate and harden TestExportWriter.testExpr failures
 Key: SOLR-14729
 URL: https://issues.apache.org/jira/browse/SOLR-14729
 Project: Solr
  Issue Type: Bug
  Security Level: Public (Default Security Level. Issues are Public)
Reporter: Joel Bernstein


TestExportWriter.testExpr is failing way too much (6.4% of the time). This 
ticket will fix the problem.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-13807) Caching for term facet counts

2020-08-11 Thread Michael Gibney (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17175836#comment-17175836
 ] 

Michael Gibney commented on SOLR-13807:
---

After SOLR-13132 was merged to master, it was a bit of a challenge to reconcile 
with the complementary "term facet cache" (this issue). I've taken an initial 
stab at this and pushed to [PR 
#1357|https://github.com/apache/lucene-solr/pull/1357], and I think it's at the 
point where it's once again ready for consideration.

Below are some naive performance benchmarks, using [^SOLR-13807-benchmarks.tgz] 
(based on similar benchmarks for SOLR-13132).

{{filterCache}} is irrelevant for what's illustrated here (all count or sweep 
collection, single-shard thus no refinement). I included hooks in the 
scripts to easily change the filterCache size and termFacetCache size for 
evaluation. For purpose of {{relatedness}} evaluation, fgSet == base search 
result domain. All results discussed here are for single-valued string fields, 
but multivalued string fields are also included in the benchmark attachment 
(results for multi-valued didn't differ substantially from those for 
single-valued).

There's a row for each docset domain recall percentage (percentage of \*:* 
domain returned by main query/fg), and a column for each field cardinality; 
cell values indicate latency (QTime) in ms against a single core with 3 million 
docs, no deletes; each value is the average of 10 repeated invocations of the 
relevant request (standard deviation isn't captured here, but was quite 
low, fwiw).

Below are for current (including SOLR-13132) master; no caches (filterCache, if 
present, would be unused):
{code}
[magibney@mbp SOLR-13132-benchmarks]$ ./check.sh s count # sort-by-count, master
cdnlty: 10  100 1k  10k 100k  1m
.1% 0   0   0   0   0   4
1%  1   0   1   1   2   5
10% 7   7   8   8   10  16
20% 17  14  16  15  19  31
30% 22  19  23  20  24  42
40% 27  26  28  28  32  50
50% 33  32  35  32  38  59
99.99%  65  60  67  62  72  107

[magibney@mbp SOLR-13132-benchmarks]$ ./check.sh s true # sort-by-skg, master
cdnlty: 10  100 1k  10k 100k  1m
.1% 179 174 183 190 192 225
1%  182 177 186 183 194 236
10% 193 191 196 197 226 256
20% 206 200 207 207 234 300
30% 216 210 217 216 239 316
40% 228 225 231 231 253 331
50% 239 234 241 240 266 347
99.99%  285 280 287 287 311 403
{code}

Below are for 77daac4ae2a4d1c40652eafbbdb42b582fe2d02d (SOLR-13807), with _no_ 
termFacetCache configured (apples-to-apples, since there are changes in some of 
the hot facet code paths):
{code}
[magibney@mbp SOLR-13132-benchmarks]$ ./check.sh s count # sort-by-count, 
no_cache
cdnlty: 10  100 1k  10k 100k  1m
.1% 0   0   0   0   0   3
1%  1   1   1   1   1   6
10% 8   8   9   8   11  14
20% 16  15  16  15  20  32
30% 21  21  23  22  26  42
40% 28  27  31  28  34  53
50% 35  33  37  34  40  63
99.99%  68  64  71  66  74  108

[magibney@mbp SOLR-13132-benchmarks]$ ./check.sh s true # sort-by-skg, no_cache
cdnlty: 10  100 1k  10k 100k  1m
.1% 96  80  89  97  96  129
1%  88  83  90  88  101 133
10% 99  97  103 102 122 162
20% 117 107 113 113 135 194
30% 120 117 123 122 144 211
40% 130 129 134 134 156 232
50% 143 140 147 144 169 249
99.99%  179 175 181 179 201 305
{code}

Below are for 77daac4ae2a4d1c40652eafbbdb42b582fe2d02d (SOLR-13807), with 
{{solr.termFacetCacheSize=20}} configured.
{code}
[magibney@mbp SOLR-13132-benchmarks]$ ./check.sh s count # sort-by-count, cache 
size 20
cdnlty: 10  100 1k  10k 100k  1m
.1% 0   0   0   0   0   2
1%  0   0   0   0   1   10
10% 3   4   4   4   5   16
20% 8   7   8   7   9   20
30% 11  10  12  11  13  25
40% 13  13  15  15  15  28
50% 15  16  16  18  20  32
99.99%  29  30  30  29  32  45

[magibney@mbp SOLR-13132-benchmarks]$ ./check.sh s true # sort-by-skg, cache 
size 20
cdnlty: 10  100 1k  10k 100k  1m
.1% 0   0   

[jira] [Created] (LUCENE-9457) Why is Kuromoji tokenization throughput bimodal?

2020-08-11 Thread Michael McCandless (Jira)
Michael McCandless created LUCENE-9457:
--

 Summary: Why is Kuromoji tokenization throughput bimodal?
 Key: LUCENE-9457
 URL: https://issues.apache.org/jira/browse/LUCENE-9457
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Michael McCandless


With the recent accidental regression of Japanese (Kuromoji) tokenization 
throughput due to exciting FST optimizations, we [added new nightly Lucene 
benchmarks|https://github.com/mikemccand/luceneutil/issues/64] to measure 
tokenization throughput for {{JapaneseTokenizer}}: 
[https://home.apache.org/~mikemccand/lucenebench/analyzers.html]

It has already been running for ~5-6 weeks now!  But for some reason, it looks 
bi-modal?  "Normally" it is ~.45 M tokens/sec, but for two data points it 
dropped down to ~.33 M tokens/sec, which is odd.  It could be hotspot noise 
maybe?  But would be good to get to the root cause and fix it if possible.

Hotspot noise that randomly steals ~27% of your tokenization throughput is no 
good!!

Or does anyone have any other ideas of what could be bi-modal in Kuromoji?  I 
don't think [this performance 
test|https://github.com/mikemccand/luceneutil/blob/master/src/main/perf/TestAnalyzerPerf.java]
 has any randomness in it...



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Updated] (SOLR-13807) Caching for term facet counts

2020-08-11 Thread Michael Gibney (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-13807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Gibney updated SOLR-13807:
--
Attachment: SOLR-13807-benchmarks.tgz

> Caching for term facet counts
> -
>
> Key: SOLR-13807
> URL: https://issues.apache.org/jira/browse/SOLR-13807
> Project: Solr
>  Issue Type: New Feature
>  Components: Facet Module
>Affects Versions: master (9.0), 8.2
>Reporter: Michael Gibney
>Priority: Minor
> Attachments: SOLR-13807-benchmarks.tgz, 
> SOLR-13807__SOLR-13132_test_stub.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Solr does not have a facet count cache; so for _every_ request, term facets 
> are recalculated for _every_ (facet) field, by iterating over _every_ field 
> value for _every_ doc in the result domain, and incrementing the associated 
> count.
> As a result, subsequent requests end up redoing a lot of the same work, 
> including all associated object allocation, GC, etc. This situation could 
> benefit from integrated caching.
> Because of the domain-based, serial/iterative nature of term facet 
> calculation, latency is proportional to the size of the result domain. 
> Consequently, one common/clear manifestation of this issue is high latency 
> for faceting over an unrestricted domain (e.g., {{\*:\*}}), as might be 
> observed on a top-level landing page that exposes facets. This type of 
> "static" case is often mitigated by external (to Solr) caching, either with a 
> caching layer between Solr and a front-end application, or within a front-end 
> application, or even with a caching layer between the end user and a 
> front-end application.
> But in addition to the overhead of handling this caching elsewhere in the 
> stack (or, for a new user, even being aware of this as a potential issue to 
> mitigate), any external caching mitigation is really only appropriate for 
> relatively static cases like the "landing page" example described above. A 
> Solr-internal facet count cache (analogous to the {{filterCache}}) would 
> provide the following additional benefits:
>  # ease of use/out-of-the-box configuration to address a common performance 
> concern
>  # compact (specifically caching count arrays, without the extra baggage that 
> accompanies a naive external caching approach)
>  # NRT-friendly (could be implemented to be segment-aware)
>  # modular, capable of reusing the same cached values in conjunction with 
> variant requests over the same result domain (this would support common use 
> cases like paging, but also potentially more interesting direct uses of 
> facets). 
>  # could be used for distributed refinement (i.e., if facet counts over a 
> given domain are cached, a refinement request could simply look up the 
> ordinal value for each enumerated term and directly grab the count out of the 
> count array that was cached during the first phase of facet calculation)
>  # composable (e.g., in aggregate functions that calculate values based on 
> facet counts across different domains, like SKG/relatedness – see SOLR-13132)
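
For illustration only, here is a minimal sketch (hypothetical names, not the
proposed implementation) of the kind of structure such a cache could hold: a
count-per-ordinal array keyed by (facet field, domain query), computed once on
a miss and then reused for paging, refinement lookups by ordinal, or aggregate
functions over the same domain.

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.function.Supplier;

    class FacetCountCacheSketch {
      // key = facet field plus the query that defines the result domain
      private final Map<String, int[]> cache = new ConcurrentHashMap<>();

      /** Returns the cached per-ordinal count array, computing it only on a miss. */
      int[] getCounts(String field, String domainQuery, Supplier<int[]> compute) {
        return cache.computeIfAbsent(field + "|" + domainQuery, key -> compute.get());
      }
    }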



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] mikemccand commented on a change in pull request #1733: LUCENE-9450 Use BinaryDocValues in the taxonomy writer

2020-08-11 Thread GitBox


mikemccand commented on a change in pull request #1733:
URL: https://github.com/apache/lucene-solr/pull/1733#discussion_r468812527



##
File path: 
lucene/facet/src/java/org/apache/lucene/facet/taxonomy/directory/DirectoryTaxonomyReader.java
##
@@ -323,8 +323,10 @@ public FacetLabel getPath(int ordinal) throws IOException {
   }
 }
 
-Document doc = indexReader.document(ordinal);
-FacetLabel ret = new 
FacetLabel(FacetsConfig.stringToPath(doc.get(Consts.FULL)));
+boolean found = MultiDocValues.getBinaryValues(indexReader, 
Consts.FULL).advanceExact(catIDInteger);

Review comment:
   One more idea: instead of using `MultiDocValues` sugar, I think we 
should use Lucene's `ReaderUtil` to quickly (binary search) determine which 
leaf holds this `docId`, then pull `BinaryDocValues` from that `LeafReader`?
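
   As a rough sketch (not the actual patch; it assumes the surrounding
   `getPath(int ordinal)` context and the usual imports such as
   `org.apache.lucene.index.ReaderUtil`), that per-leaf lookup could look like:

       // binary-search the leaf that contains this ordinal, then read only that leaf
       List<LeafReaderContext> leaves = indexReader.leaves();
       int leafIndex = ReaderUtil.subIndex(ordinal, leaves);
       LeafReaderContext leaf = leaves.get(leafIndex);
       BinaryDocValues values = leaf.reader().getBinaryDocValues(Consts.FULL);
       FacetLabel ret = null;
       if (values != null && values.advanceExact(ordinal - leaf.docBase)) {
         ret = new FacetLabel(FacetsConfig.stringToPath(values.binaryValue().utf8ToString()));
       }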





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] mikemccand commented on a change in pull request #1733: LUCENE-9450 Use BinaryDocValues in the taxonomy writer

2020-08-11 Thread GitBox


mikemccand commented on a change in pull request #1733:
URL: https://github.com/apache/lucene-solr/pull/1733#discussion_r468808904



##
File path: 
lucene/facet/src/java/org/apache/lucene/facet/taxonomy/directory/DirectoryTaxonomyReader.java
##
@@ -323,8 +323,10 @@ public FacetLabel getPath(int ordinal) throws IOException {
   }
 }
 
-Document doc = indexReader.document(ordinal);
-FacetLabel ret = new 
FacetLabel(FacetsConfig.stringToPath(doc.get(Consts.FULL)));
+boolean found = MultiDocValues.getBinaryValues(indexReader, 
Consts.FULL).advanceExact(catIDInteger);

Review comment:
   OK, I see one issue -- you are pulling a new `BinaryDocValues`, calling 
`.advanceExact` on it (good), but then pulling a new `BinaryDocValues` below 
and not calling `.advanceExact` on it.
   
   I think you must add a new local variable, e.g. `BinaryDocValues values`.  
Pull it once (using the `MultiDocValues.getBinaryValues` sugar API).  Then call 
`.advanceExact` on that and assert it succeeded. Finally, use that same 
`values` instance (now that it has advanced to the right `docId`) to call 
`.binaryValue().utf8ToString()`.
   
   I think that should fix the `NPE`?
   
   This is a misuse of the API for the default Lucene Codec for 
`BinaryDocValues`, since you were calling `.binaryValue()` before 
`.advanceExact()`.  It is somewhat disappointing that the codec threw a 
confusing `NPE` rather than a clearer (best effort) exception stating that you 
must first call `.advanceExact`.  Maybe we could improve the default Codec?  
(Though not if that would hurt performance of correct usage.)  OK, I see: the 
`NPE` happens because `MultiDocValues.currentValues` is `null` since 
`.advanceExact` was not yet called.  Maybe we could add an `assert` there, 
confirming `.advanceExact` was indeed called and returned `true`?  It would 
have made debugging this easier, and should not hurt performance when 
assertions are disabled ...
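
   Concretely, a sketch of that pattern (not the actual patch) would be:

       // pull the values once, advance to the target doc, then read from the same instance
       BinaryDocValues values = MultiDocValues.getBinaryValues(indexReader, Consts.FULL);
       boolean found = values.advanceExact(ordinal);
       assert found : "every taxonomy ordinal should have a stored label";
       FacetLabel ret = new FacetLabel(FacetsConfig.stringToPath(values.binaryValue().utf8ToString()));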

##
File path: 
lucene/facet/src/java/org/apache/lucene/facet/taxonomy/directory/DirectoryTaxonomyReader.java
##
@@ -323,8 +323,10 @@ public FacetLabel getPath(int ordinal) throws IOException {
   }
 }
 
-Document doc = indexReader.document(ordinal);
-FacetLabel ret = new 
FacetLabel(FacetsConfig.stringToPath(doc.get(Consts.FULL)));
+boolean found = MultiDocValues.getBinaryValues(indexReader, 
Consts.FULL).advanceExact(catIDInteger);

Review comment:
   I think instead of the boxed `Integer catIDInteger` we should pass the 
`int ordinal` to `.advanceExact(...)`?  Not the cause of the `NPE`, just 
cleaner.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-13972) Insecure Solr should generate startup warning

2020-08-11 Thread Jason Gerlowski (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17175803#comment-17175803
 ] 

Jason Gerlowski commented on SOLR-13972:


Reopening based on the mailing list thread Jan referenced above.  I'll take 
care of this this week.

> Insecure Solr should generate startup warning
> -
>
> Key: SOLR-13972
> URL: https://issues.apache.org/jira/browse/SOLR-13972
> Project: Solr
>  Issue Type: Bug
>Reporter: Ishan Chattopadhyaya
>Assignee: Jason Gerlowski
>Priority: Critical
> Fix For: master (9.0), 8.4
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> Warning to the effect of, start Solr with: "solr auth enable -credentials 
> solr:foo -blockUnknown true” (or some other way to achieve the same effect) 
> if you want to expose this Solr instance directly to users. Maybe the link to 
> the ref guide discussing all this might be in good measure here.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Updated] (SOLR-13972) Insecure Solr should generate startup warning

2020-08-11 Thread Jason Gerlowski (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-13972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Gerlowski updated SOLR-13972:
---
Status: Reopened  (was: Closed)

> Insecure Solr should generate startup warning
> -
>
> Key: SOLR-13972
> URL: https://issues.apache.org/jira/browse/SOLR-13972
> Project: Solr
>  Issue Type: Bug
>Reporter: Ishan Chattopadhyaya
>Assignee: Jason Gerlowski
>Priority: Critical
> Fix For: master (9.0), 8.4
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> Warning to the effect of, start Solr with: "solr auth enable -credentials 
> solr:foo -blockUnknown true” (or some other way to achieve the same effect) 
> if you want to expose this Solr instance directly to users. Maybe the link to 
> the ref guide discussing all this might be in good measure here.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] HoustonPutman opened a new pull request #1739: SOLR-14706: Fix support for default autoscaling policy (8x forward-port)

2020-08-11 Thread GitBox


HoustonPutman opened a new pull request #1739:
URL: https://github.com/apache/lucene-solr/pull/1739


   forward-porting #1716 for https://issues.apache.org/jira/browse/SOLR-14706



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] jimczi commented on a change in pull request #1725: LUCENE-9449 Skip docs with _doc sort and "after"

2020-08-11 Thread GitBox


jimczi commented on a change in pull request #1725:
URL: https://github.com/apache/lucene-solr/pull/1725#discussion_r468774381



##
File path: lucene/core/src/java/org/apache/lucene/search/FieldValueHitQueue.java
##
@@ -160,18 +160,20 @@ private FieldValueHitQueue(SortField[] fields, int size, 
boolean filterNonCompet
*  The number of hits to retain. Must be greater than zero.
* @param filterNonCompetitiveDocs
*{@code true} If comparators should be allowed to filter 
non-competitive documents, {@code false} otherwise
+   * @param hasAfter
+   *{@code true} If this sort has "after" FieldDoc
*/
   public static  FieldValueHitQueue 
create(SortField[] fields, int size,
-  boolean filterNonCompetitiveDocs) {
+  boolean filterNonCompetitiveDocs, boolean hasAfter) {

Review comment:
   Can we avoid adding `hasAfter` here ? See my comment below.

##
File path: 
lucene/core/src/java/org/apache/lucene/search/FilteringDocLeafComparator.java
##
@@ -0,0 +1,157 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.lucene.search;
+
+import org.apache.lucene.index.LeafReaderContext;
+
+import java.io.IOException;
+
+/**
+ * This comparator is used when there is sort by _doc asc together with 
"after" FieldDoc.
+ * The comparator provides an iterator that can quickly skip to the desired 
"after" document.
+ */
+public class FilteringDocLeafComparator implements 
FilteringLeafFieldComparator {
+private final FieldComparator.DocComparator in;
+private DocIdSetIterator topValueIterator; // iterator that starts from 
topValue if possible
+private final int minDoc;
+private final int maxDoc;
+private final int docBase;
+private boolean iteratorUpdated = false;
+
+public FilteringDocLeafComparator(LeafFieldComparator in, 
LeafReaderContext context) {

Review comment:
   Can we force the `in` to be a `FieldComparator.DocComparator` ? 

##
File path: 
lucene/core/src/java/org/apache/lucene/search/FilteringFieldComparator.java
##
@@ -68,10 +68,12 @@ public int compareValues(T first, T second) {
* @param comparator – comparator to wrap
* @param reverse – if this sort is reverse
* @param singleSort – true if this sort is based on a single field and 
there are no other sort fields for tie breaking
+   * @param hasAfter – true if this sort has after FieldDoc
* @return comparator wrapped as a filtering comparator or the original 
comparator if the filtering functionality
* is not implemented for it
*/
-  public static FieldComparator 
wrapToFilteringComparator(FieldComparator comparator, boolean reverse, 
boolean singleSort) {
+  public static FieldComparator 
wrapToFilteringComparator(FieldComparator comparator, boolean reverse, 
boolean singleSort,
+  boolean hasAfter) {

Review comment:
   Do we really need to add `hasAfter`? Can we check if the `topValue` in the 
DocComparator is greater than 0 instead?

##
File path: lucene/core/src/java/org/apache/lucene/search/FieldValueHitQueue.java
##
@@ -121,7 +121,7 @@ protected boolean lessThan(final Entry hitA, final Entry 
hitB) {
   }
   
   // prevent instantiation and extension.
-  private FieldValueHitQueue(SortField[] fields, int size, boolean 
filterNonCompetitiveDocs) {
+  private FieldValueHitQueue(SortField[] fields, int size, boolean 
filterNonCompetitiveDocs, boolean hasAfter) {

Review comment:
   Not sure that `hasAfter` is really needed here.

##
File path: 
lucene/core/src/java/org/apache/lucene/search/FilteringDocLeafComparator.java
##
@@ -0,0 +1,157 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, softwar

[jira] [Commented] (SOLR-13412) Make the Lucene Luke module available from a Solr distribution

2020-08-11 Thread Tomoko Uchida (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17175750#comment-17175750
 ] 

Tomoko Uchida commented on SOLR-13412:
--

[~epugh] :
{quote}I think that the real issue here is that most of our users don't know 
that Luke exists, and that it's a powerful tool. What if we kept Luke as a 
standalone artifact, and instead talked about Luke in the Solr Ref Guide? We 
mention the Luke request handler on 
[https://lucene.apache.org/solr/guide/8_6/implicit-requesthandlers.html], which 
could also link to a page with more details on Luke and where to download it 
from. Which reminds me: we should add the word Luke to the Ref Guide glossary 
page!
{quote}
{quote}I just poked around the lucene.apache.org site, and there is no mention 
of Luke anywhere...
{quote}
Thank you for pointing that out specifically. Indeed, documentation and/or a 
user guide is the most powerful promotion tool, and Luke currently lacks both. 
Although Luke is a GUI tool that largely explains itself, that is not enough 
for new users. I'm partly responsible for that - I once created an issue 
[https://github.com/DmitryKey/luke/issues/116] and then abandoned it. I have 
always thought we should have "getting started" documentation for Luke on our 
web site so that we can link to it from the Solr Ref Guide or anywhere else. 
If you have any ideas, please feel free to share them and open an issue if 
needed :)

> Make the Lucene Luke module available from a Solr distribution
> --
>
> Key: SOLR-13412
> URL: https://issues.apache.org/jira/browse/SOLR-13412
> Project: Solr
>  Issue Type: Improvement
>Reporter: Erick Erickson
>Assignee: Erick Erickson
>Priority: Major
> Attachments: SOLR-13412.patch
>
>
> Now that [~Tomoko Uchida] has put in a great effort to bring Luke into the 
> project, I think it would be good to be able to access it from a Solr distro.
> I want to go to the right place under the Solr install directory and start 
> Luke up to examine the local indexes. 
> This ticket is explicitly _not_ about accessing it from the admin UI, Luke is 
> a stand-alone app that must be invoked on the node that has a Lucene index on 
> the local filesystem
> We need to 
>  * have it included in Solr when running "ant package". 
>  * add some bits to the ref guide on how to invoke
>  ** Where to invoke it from
>  ** mention anything that has to be installed.
>  ** any other "gotchas" someone just installing Solr should be aware of.
>  * Ant should not be necessary.
>  * 
>  
> I'll assign this to myself to keep track of, but would not be offended in the 
> least if someone with more knowledge of "ant package" and the like wanted to 
> take it over ;)
> If we can do it at all



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] madrob commented on a change in pull request #1726: SOLR-14722: timeAllowed should track from req creation

2020-08-11 Thread GitBox


madrob commented on a change in pull request #1726:
URL: https://github.com/apache/lucene-solr/pull/1726#discussion_r468762245



##
File path: solr/core/src/java/org/apache/solr/search/SolrQueryTimeoutImpl.java
##
@@ -67,8 +69,21 @@ public boolean shouldExit() {
   }
 
   /**
-   * Method to set the time at which the timeOut should happen.
-   * @param timeAllowed set the time at which this thread should timeout.
+   * Sets or clears the time allowed based on how much time remains from the 
start of the request plus the configured
+   * {@link CommonParams#TIME_ALLOWED}.
+   */
+  public static void set(SolrQueryRequest req) {
+long timeAllowed = req.getParams().getLong(CommonParams.TIME_ALLOWED, -1L);
+if (timeAllowed >= 0L) {

Review comment:
   
https://github.com/apache/lucene-solr/pull/1726/files#diff-d8beef800870f194d61993d701fd9cc2L77
 has `>=`
   
https://github.com/apache/lucene-solr/pull/1726/files#diff-65e9f3712efc1ec962ea82a04a1d7aa1L104
 has `>`
   
   Either way, please update the docs at 
https://github.com/apache/lucene-solr/blob/master/solr/solrj/src/java/org/apache/solr/common/params/CommonParams.java#L162
 because they are absolutely wrong (`>= 0` means no timeout???)





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] atris commented on a change in pull request #1737: SOLR-14615: Implement CPU Utilization Based Circuit Breaker

2020-08-11 Thread GitBox


atris commented on a change in pull request #1737:
URL: https://github.com/apache/lucene-solr/pull/1737#discussion_r468756781



##
File path: solr/core/src/java/org/apache/solr/core/SolrConfig.java
##
@@ -811,10 +818,18 @@ private void initLibs(SolrResourceLoader loader, boolean 
isConfigsetTrusted) {
 loader.reloadLuceneSPI();
   }
 
-  private void validateMemoryBreakerThreshold() {
+  private void validateCircuitBreakerThresholds() {
 if (useCircuitBreakers) {
-  if (memoryCircuitBreakerThresholdPct > 95 || 
memoryCircuitBreakerThresholdPct < 50) {
-throw new IllegalArgumentException("Valid value range of 
memoryCircuitBreakerThresholdPct is 50 -  95");
+  if (isMemoryCircuitBreakerEnabled) {
+if (memoryCircuitBreakerThresholdPct > 95 || 
memoryCircuitBreakerThresholdPct < 50) {
+  throw new IllegalArgumentException("Valid value range of 
memoryCircuitBreakerThresholdPct is 50 -  95");
+}
+  }
+
+  if (isCpuCircuitBreakerEnabled) {
+if (cpuCircuitBreakerThresholdPct > 95 || 
cpuCircuitBreakerThresholdPct < 40) {
+  throw new IllegalArgumentException("Valid value range for 
cpuCircuitBreakerThresholdPct is 40 - 95");

Review comment:
   I see values between 0 and 100. I ran stress locally and validated the values.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] uschindler commented on pull request #1593: LUCENE-9409: Check file lengths before creating slices.

2020-08-11 Thread GitBox


uschindler commented on pull request #1593:
URL: https://github.com/apache/lucene-solr/pull/1593#issuecomment-672106735


   I am fine with fixing the test. Sure, you have to first figure out why the 
index is out of bounds, and the exact exception may be misleading, but that's 
actually what's happening here. If you want other exceptions, another fix would 
be to require the IO layer to throw a meaningful exception and implement that 
for all directory implementations.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] atris merged pull request #1736: Harden RequestRateLimiter Tests

2020-08-11 Thread GitBox


atris merged pull request #1736:
URL: https://github.com/apache/lucene-solr/pull/1736


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] jpountz commented on pull request #1593: LUCENE-9409: Check file lengths before creating slices.

2020-08-11 Thread GitBox


jpountz commented on pull request #1593:
URL: https://github.com/apache/lucene-solr/pull/1593#issuecomment-672097226


   I repurposed this PR to instead make the test expect out-of-bounds 
exceptions. Does it look better to you @rmuir @uschindler ?



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Comment Edited] (LUCENE-9379) Directory based approach for index encryption

2020-08-11 Thread Rajeswari Natarajan (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17175699#comment-17175699
 ] 

Rajeswari Natarajan edited comment on LUCENE-9379 at 8/11/20, 4:59 PM:
---

[~bruno.roustant] and [~dsmiley], if we go with the implicit router, shard 
management/rebalancing/routing becomes manual and SolrCloud will not take care 
of these (on the Solr mailing lists users are consistently advised against 
taking this route), so I am looking to see whether encryption is possible with 
the composite id router and multiple tenants per collection. We might have 
around 3000+ collections going forward, so having one collection per tenant 
would make our cluster really heavy. Please share your thoughts, and whether 
anyone has attempted this kind of encryption.


was (Author: raji):
[~bruno.roustant] and [~dsmiley] , if we go with implicit  router, shard 
management/rebalancing/routing becomes manual. Solrcloud will not take care of 
these (In solr mailing lists always I see users are advised against taking this 
route_ , so looking to see if encryption possible with composite id router and 
multiple tenants per collection . We might have around 3000+ collections going 
forward  , so having one collection per tenant will make our cluster really 
heavy.  Please share your thoughts and if anyone has attempted this kind of 
encryption

> Directory based approach for index encryption
> -
>
> Key: LUCENE-9379
> URL: https://issues.apache.org/jira/browse/LUCENE-9379
> Project: Lucene - Core
>  Issue Type: New Feature
>Reporter: Bruno Roustant
>Assignee: Bruno Roustant
>Priority: Major
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> +Important+: This Lucene Directory wrapper approach is to be considered only 
> if an OS level encryption is not possible. OS level encryption better fits 
> Lucene usage of OS cache, and thus is more performant.
> But there are some use-case where OS level encryption is not possible. This 
> Jira issue was created to address those.
> 
>  
> The goal is to provide optional encryption of the index, with a scope limited 
> to an encryptable Lucene Directory wrapper.
> Encryption is at rest on disk, not in memory.
> This simple approach should fit any Codec as it would be orthogonal, without 
> modifying APIs as much as possible.
> Use a standard encryption method. Limit perf/memory impact as much as 
> possible.
> Determine how callers provide encryption keys. They must not be stored on 
> disk.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] atris commented on a change in pull request #1737: SOLR-14615: Implement CPU Utilization Based Circuit Breaker

2020-08-11 Thread GitBox


atris commented on a change in pull request #1737:
URL: https://github.com/apache/lucene-solr/pull/1737#discussion_r468728850



##
File path: 
solr/core/src/java/org/apache/solr/util/circuitbreaker/CPUCircuitBreaker.java
##
@@ -0,0 +1,108 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.solr.util.circuitbreaker;
+
+import java.lang.invoke.MethodHandles;
+import java.lang.management.ManagementFactory;
+import java.lang.management.OperatingSystemMXBean;
+
+import org.apache.solr.core.SolrConfig;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * 
+ * Tracks current CPU usage and triggers if the specified threshold is 
breached.
+ *
+ * This circuit breaker gets the average CPU load over the last minute and uses
+ * that data to take a decision. Ideally, we should be able to cache the value
+ * locally and only query once the minute has elapsed. However, that will 
introduce
+ * more complexity than the current structure and might not get us major 
performance
+ * wins. If this ever becomes a performance bottleneck, that can be considered.
+ * 
+ *
+ * 
+ * The configuration to define which mode to use and the trigger threshold are 
defined in
+ * solrconfig.xml
+ * 
+ */
+public class CPUCircuitBreaker extends CircuitBreaker {
+  private static final Logger log = 
LoggerFactory.getLogger(MethodHandles.lookup().lookupClass());
+  private static final OperatingSystemMXBean operatingSystemMXBean = 
ManagementFactory.getOperatingSystemMXBean();
+
+  private final boolean isCpuCircuitBreakerEnabled;
+  private final double cpuUsageThreshold;
+
+  // Assumption -- the value of these parameters will be set correctly before 
invoking getDebugInfo()
+  private final ThreadLocal<Double> seenCPUUsage = new ThreadLocal<>();
+  private final ThreadLocal<Double> allowedCPUUsage = new ThreadLocal<>();
+
+  public CPUCircuitBreaker(SolrConfig solrConfig) {
+super(solrConfig);
+
+this.isCpuCircuitBreakerEnabled = solrConfig.isCpuCircuitBreakerEnabled;
+this.cpuUsageThreshold = solrConfig.cpuCircuitBreakerThresholdPct;
+  }
+
+  @Override
+  public boolean isTripped() {
+if (!isEnabled()) {
+  return false;
+}
+
+if (!isCpuCircuitBreakerEnabled) {
+  return false;
+}
+
+double localAllowedCPUUsage = getCpuUsageThreshold();
+double localSeenCPUUsage = calculateLiveCPUUsage();
+
+if (localSeenCPUUsage < 0) {
+  if (log.isWarnEnabled()) {
+String msg = "Unable to get CPU usage";
+
+log.warn(msg);
+
+return false;

Review comment:
   Good catch, thanks.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] atris commented on a change in pull request #1736: Harden RequestRateLimiter Tests

2020-08-11 Thread GitBox


atris commented on a change in pull request #1736:
URL: https://github.com/apache/lucene-solr/pull/1736#discussion_r468727485



##
File path: 
solr/core/src/test/org/apache/solr/servlet/TestRequestRateLimiter.java
##
@@ -48,7 +47,7 @@ public static void setupCluster() throws Exception {
 configureCluster(1).addConfig(FIRST_COLLECTION, 
configset("cloud-minimal")).configure();
   }
 
-  @Test
+  @Nightly

Review comment:
   Updated





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (LUCENE-9379) Directory based approach for index encryption

2020-08-11 Thread Rajeswari Natarajan (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17175699#comment-17175699
 ] 

Rajeswari Natarajan commented on LUCENE-9379:
-

[~bruno.roustant] and [~dsmiley], if we go with the implicit router, shard 
management/rebalancing/routing becomes manual and SolrCloud will not take care 
of these (on the Solr mailing lists users are consistently advised against 
taking this route), so I am looking to see whether encryption is possible with 
the composite id router and multiple tenants per collection. We might have 
around 3000+ collections going forward, so having one collection per tenant 
would make our cluster really heavy. Please share your thoughts, and whether 
anyone has attempted this kind of encryption.

> Directory based approach for index encryption
> -
>
> Key: LUCENE-9379
> URL: https://issues.apache.org/jira/browse/LUCENE-9379
> Project: Lucene - Core
>  Issue Type: New Feature
>Reporter: Bruno Roustant
>Assignee: Bruno Roustant
>Priority: Major
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> +Important+: This Lucene Directory wrapper approach is to be considered only 
> if an OS level encryption is not possible. OS level encryption better fits 
> Lucene usage of OS cache, and thus is more performant.
> But there are some use-case where OS level encryption is not possible. This 
> Jira issue was created to address those.
> 
>  
> The goal is to provide optional encryption of the index, with a scope limited 
> to an encryptable Lucene Directory wrapper.
> Encryption is at rest on disk, not in memory.
> This simple approach should fit any Codec as it would be orthogonal, without 
> modifying APIs as much as possible.
> Use a standard encryption method. Limit perf/memory impact as much as 
> possible.
> Determine how callers provide encryption keys. They must not be stored on 
> disk.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] madrob commented on a change in pull request #1736: Harden RequestRateLimiter Tests

2020-08-11 Thread GitBox


madrob commented on a change in pull request #1736:
URL: https://github.com/apache/lucene-solr/pull/1736#discussion_r468724755



##
File path: 
solr/core/src/test/org/apache/solr/servlet/TestRequestRateLimiter.java
##
@@ -66,20 +65,19 @@ public void testConcurrentQueries() throws Exception {
 
 solrDispatchFilter.replaceRateLimitManager(rateLimitManager);
 
-processTest(client);
+processTest(client, 1 /* number of documents */, 350 /* number of 
queries */);
 
 MockRequestRateLimiter mockQueryRateLimiter = (MockRequestRateLimiter) 
rateLimitManager.getRequestRateLimiter(SolrRequest.SolrRequestType.QUERY);
 
-assertEquals(25, mockQueryRateLimiter.incomingRequestCount.get());
-assertTrue("Incoming accepted new request count did not match. Expected 5 
incoming " + mockQueryRateLimiter.acceptedNewRequestCount.get(),
-mockQueryRateLimiter.acceptedNewRequestCount.get() < 25);
-assertTrue("Incoming rejected new request count did not match. Expected 20 
incoming " + mockQueryRateLimiter.rejectedRequestCount.get(),
-mockQueryRateLimiter.rejectedRequestCount.get() > 0);
+assertEquals(350, mockQueryRateLimiter.incomingRequestCount.get());
+
+assertTrue((mockQueryRateLimiter.acceptedNewRequestCount.get() == 
mockQueryRateLimiter.incomingRequestCount.get()

Review comment:
   yea I was looking at the wrong side of the diff





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] madrob commented on a change in pull request #1736: Harden RequestRateLimiter Tests

2020-08-11 Thread GitBox


madrob commented on a change in pull request #1736:
URL: https://github.com/apache/lucene-solr/pull/1736#discussion_r468721648



##
File path: 
solr/core/src/test/org/apache/solr/servlet/TestRequestRateLimiter.java
##
@@ -48,7 +47,7 @@ public static void setupCluster() throws Exception {
 configureCluster(1).addConfig(FIRST_COLLECTION, 
configset("cloud-minimal")).configure();
   }
 
-  @Test
+  @Nightly

Review comment:
   s/Nightly/Test





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] dsmiley merged pull request #1735: LUCENE spell: Implement SuggestWord.toString

2020-08-11 Thread GitBox


dsmiley merged pull request #1735:
URL: https://github.com/apache/lucene-solr/pull/1735


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] madrob commented on a change in pull request #1728: SOLR-14596: equals/hashCode for common SolrRequest classes

2020-08-11 Thread GitBox


madrob commented on a change in pull request #1728:
URL: https://github.com/apache/lucene-solr/pull/1728#discussion_r468716670



##
File path: solr/solrj/src/java/org/apache/solr/client/solrj/SolrRequest.java
##
@@ -244,6 +247,8 @@ public String getBasePath() {
   public void addHeader(String key, String value) {
 if (headers == null) {
   headers = new HashMap<>();
+  final HashMap asdf = new HashMap<>();

Review comment:
   what?

##
File path: 
solr/solrj/src/test/org/apache/solr/client/solrj/request/TestUpdateRequest.java
##
@@ -17,52 +17,164 @@
 package org.apache.solr.client.solrj.request;
 
 import java.util.Arrays;
+import java.util.List;
 
+import com.google.common.collect.Lists;
 import org.apache.solr.common.SolrInputDocument;
 import org.junit.Before;
 import org.junit.Rule;
 import org.junit.Test;
 import org.junit.rules.ExpectedException;
 
+import static org.apache.solr.SolrTestCaseJ4.adoc;
+import static org.junit.Assert.assertEquals;
+import static org.junit.Assert.assertNotEquals;
+
 public class TestUpdateRequest {
 
   @Rule
   public ExpectedException exception = ExpectedException.none();
 
   @Before
   public void expectException() {
-exception.expect(NullPointerException.class);
-exception.expectMessage("Cannot add a null SolrInputDocument");
   }
 
   @Test
   public void testCannotAddNullSolrInputDocument() {
+exception.expect(NullPointerException.class);
+exception.expectMessage("Cannot add a null SolrInputDocument");
+
 UpdateRequest req = new UpdateRequest();
 req.add((SolrInputDocument) null);
   }
 
   @Test
   public void testCannotAddNullDocumentWithOverwrite() {
+exception.expect(NullPointerException.class);
+exception.expectMessage("Cannot add a null SolrInputDocument");
+
 UpdateRequest req = new UpdateRequest();
 req.add(null, true);
   }
 
   @Test
   public void testCannotAddNullDocumentWithCommitWithin() {
+exception.expect(NullPointerException.class);
+exception.expectMessage("Cannot add a null SolrInputDocument");
+
 UpdateRequest req = new UpdateRequest();
 req.add(null, 1);
   }
 
   @Test
   public void testCannotAddNullDocumentWithParameters() {
+exception.expect(NullPointerException.class);
+exception.expectMessage("Cannot add a null SolrInputDocument");
+
 UpdateRequest req = new UpdateRequest();
 req.add(null, 1, true);
   }
 
   @Test
   public void testCannotAddNullDocumentAsPartOfList() {
+exception.expect(NullPointerException.class);
+exception.expectMessage("Cannot add a null SolrInputDocument");
+
 UpdateRequest req = new UpdateRequest();
 req.add(Arrays.asList(new SolrInputDocument(), new SolrInputDocument(), 
null));
   }
 
+  @Test
+  public void testEqualsMethod() {
+final SolrInputDocument doc1 = new SolrInputDocument("id", "1", "value_s", 
"foo");
+final SolrInputDocument doc2 = new SolrInputDocument("id", "2", "value_s", 
"bar");
+final SolrInputDocument doc3 = new SolrInputDocument("id", "3", "value_s", 
"baz");
+/*

Review comment:
   left over from other testing?

##
File path: 
solr/solrj/src/test/org/apache/solr/client/solrj/request/TestUpdateRequest.java
##
@@ -17,52 +17,164 @@
 package org.apache.solr.client.solrj.request;
 
 import java.util.Arrays;
+import java.util.List;
 
+import com.google.common.collect.Lists;
 import org.apache.solr.common.SolrInputDocument;
 import org.junit.Before;
 import org.junit.Rule;
 import org.junit.Test;
 import org.junit.rules.ExpectedException;
 
+import static org.apache.solr.SolrTestCaseJ4.adoc;
+import static org.junit.Assert.assertEquals;
+import static org.junit.Assert.assertNotEquals;
+
 public class TestUpdateRequest {
 
   @Rule
   public ExpectedException exception = ExpectedException.none();
 
   @Before
   public void expectException() {
-exception.expect(NullPointerException.class);

Review comment:
   remove the whole method?

##
File path: solr/solrj/src/java/org/apache/solr/client/solrj/ResponseParser.java
##
@@ -49,4 +52,31 @@ public String getVersion()
   {
 return "2.2";
   }
+
+  @Override
+  public int hashCode() {
+return new HashCodeBuilder()
+.append(getWriterType())
+.append(getContentType())
+.append(getVersion())
+.toHashCode();
+  }
+
+  @Override
+  public boolean equals(Object rhs) {
+if (rhs == null || getClass() != rhs.getClass()) {
+  return false;
+} else if (this == rhs) {
+  return true;
+} else if (hashCode() != rhs.hashCode()) {
+  return false;
+}
+
+final ResponseParser rhsCast = (ResponseParser) rhs;

Review comment:
   I think I prefer Objects.hash, but I'm not sure why? Definitely willing 
to be convinced the other way if there's a reason or a difference or even if 
they're equivalent and there is already inertia here.
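
   For reference, the `java.util.Objects` variant being suggested would look 
roughly like this (sketch only, not the PR code):

       @Override
       public int hashCode() {
         return Objects.hash(getWriterType(), getContentType(), getVersion());
       }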





This is a

[GitHub] [lucene-solr] madrob commented on a change in pull request #1737: SOLR-14615: Implement CPU Utilization Based Circuit Breaker

2020-08-11 Thread GitBox


madrob commented on a change in pull request #1737:
URL: https://github.com/apache/lucene-solr/pull/1737#discussion_r468711266



##
File path: 
solr/core/src/java/org/apache/solr/util/circuitbreaker/CPUCircuitBreaker.java
##
@@ -0,0 +1,108 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.solr.util.circuitbreaker;
+
+import java.lang.invoke.MethodHandles;
+import java.lang.management.ManagementFactory;
+import java.lang.management.OperatingSystemMXBean;
+
+import org.apache.solr.core.SolrConfig;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * 
+ * Tracks current CPU usage and triggers if the specified threshold is 
breached.
+ *
+ * This circuit breaker gets the average CPU load over the last minute and uses
+ * that data to take a decision. Ideally, we should be able to cache the value
+ * locally and only query once the minute has elapsed. However, that will 
introduce
+ * more complexity than the current structure and might not get us major 
performance
+ * wins. If this ever becomes a performance bottleneck, that can be considered.
+ * 
+ *
+ * 
+ * The configuration to define which mode to use and the trigger threshold are 
defined in
+ * solrconfig.xml
+ * 
+ */
+public class CPUCircuitBreaker extends CircuitBreaker {
+  private static final Logger log = 
LoggerFactory.getLogger(MethodHandles.lookup().lookupClass());
+  private static final OperatingSystemMXBean operatingSystemMXBean = 
ManagementFactory.getOperatingSystemMXBean();
+
+  private final boolean isCpuCircuitBreakerEnabled;
+  private final double cpuUsageThreshold;
+
+  // Assumption -- the value of these parameters will be set correctly before 
invoking getDebugInfo()
+  private final ThreadLocal<Double> seenCPUUsage = new ThreadLocal<>();

Review comment:
   thread locals should be static

##
File path: solr/core/src/java/org/apache/solr/core/SolrConfig.java
##
@@ -811,10 +818,18 @@ private void initLibs(SolrResourceLoader loader, boolean 
isConfigsetTrusted) {
 loader.reloadLuceneSPI();
   }
 
-  private void validateMemoryBreakerThreshold() {
+  private void validateCircuitBreakerThresholds() {
 if (useCircuitBreakers) {
-  if (memoryCircuitBreakerThresholdPct > 95 || 
memoryCircuitBreakerThresholdPct < 50) {
-throw new IllegalArgumentException("Valid value range of 
memoryCircuitBreakerThresholdPct is 50 -  95");
+  if (isMemoryCircuitBreakerEnabled) {
+if (memoryCircuitBreakerThresholdPct > 95 || 
memoryCircuitBreakerThresholdPct < 50) {
+  throw new IllegalArgumentException("Valid value range of 
memoryCircuitBreakerThresholdPct is 50 -  95");
+}
+  }
+
+  if (isCpuCircuitBreakerEnabled) {
+if (cpuCircuitBreakerThresholdPct > 95 || 
cpuCircuitBreakerThresholdPct < 40) {
+  throw new IllegalArgumentException("Valid value range for 
cpuCircuitBreakerThresholdPct is 40 - 95");

Review comment:
   I don't think CPU load average is typically measured on a 0-100 scale. 
Can you confirm some sample values of what calculateLiveCPUUsage returns?
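
   For context, here is a standalone probe (sketch, not from the PR) of what 
the platform MXBean reports; `getSystemLoadAverage()` returns the one-minute 
system load average (roughly, the number of runnable tasks), not a 0-100 
percentage, and a negative value where the platform does not support it:

       import java.lang.management.ManagementFactory;
       import java.lang.management.OperatingSystemMXBean;

       public class LoadAverageProbe {
         public static void main(String[] args) {
           OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
           // e.g. 0.7 or 2.3 on a typical box; -1.0 where unsupported (e.g. Windows)
           System.out.println("1-minute load average: " + os.getSystemLoadAverage());
           System.out.println("available processors : " + os.getAvailableProcessors());
         }
       }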

##
File path: 
solr/core/src/java/org/apache/solr/util/circuitbreaker/CPUCircuitBreaker.java
##
@@ -0,0 +1,108 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.solr.util.circuitbreaker;
+
+import java.lang.invoke.MethodHandles;
+import java.lang.management.ManagementFactory;
+import java.lang.manage

[GitHub] [lucene-solr] atris commented on a change in pull request #1736: Harden RequestRateLimiter Tests

2020-08-11 Thread GitBox


atris commented on a change in pull request #1736:
URL: https://github.com/apache/lucene-solr/pull/1736#discussion_r468713194



##
File path: 
solr/core/src/test/org/apache/solr/servlet/TestRequestRateLimiter.java
##
@@ -66,20 +65,19 @@ public void testConcurrentQueries() throws Exception {
 
 solrDispatchFilter.replaceRateLimitManager(rateLimitManager);
 
-processTest(client);
+processTest(client, 1 /* number of documents */, 350 /* number of 
queries */);
 
 MockRequestRateLimiter mockQueryRateLimiter = (MockRequestRateLimiter) 
rateLimitManager.getRequestRateLimiter(SolrRequest.SolrRequestType.QUERY);
 
-assertEquals(25, mockQueryRateLimiter.incomingRequestCount.get());
-assertTrue("Incoming accepted new request count did not match. Expected 5 
incoming " + mockQueryRateLimiter.acceptedNewRequestCount.get(),
-mockQueryRateLimiter.acceptedNewRequestCount.get() < 25);
-assertTrue("Incoming rejected new request count did not match. Expected 20 
incoming " + mockQueryRateLimiter.rejectedRequestCount.get(),
-mockQueryRateLimiter.rejectedRequestCount.get() > 0);
+assertEquals(350, mockQueryRateLimiter.incomingRequestCount.get());
+
+assertTrue((mockQueryRateLimiter.acceptedNewRequestCount.get() == 
mockQueryRateLimiter.incomingRequestCount.get()

Review comment:
   That is what we do in this assert?
   assertEquals(mockQueryRateLimiter.incomingRequestCount.get(),
   mockQueryRateLimiter.acceptedNewRequestCount.get() + 
mockQueryRateLimiter.rejectedRequestCount.get());





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] dsmiley commented on a change in pull request #1726: SOLR-14722: timeAllowed should track from req creation

2020-08-11 Thread GitBox


dsmiley commented on a change in pull request #1726:
URL: https://github.com/apache/lucene-solr/pull/1726#discussion_r468710559



##
File path: solr/core/src/java/org/apache/solr/search/SolrQueryTimeoutImpl.java
##
@@ -67,8 +69,21 @@ public boolean shouldExit() {
   }
 
   /**
-   * Method to set the time at which the timeOut should happen.
-   * @param timeAllowed set the time at which this thread should timeout.
+   * Sets or clears the time allowed based on how much time remains from the 
start of the request plus the configured
+   * {@link CommonParams#TIME_ALLOWED}.
+   */
+  public static void set(SolrQueryRequest req) {
+long timeAllowed = req.getParams().getLong(CommonParams.TIME_ALLOWED, -1L);
+if (timeAllowed >= 0L) {

Review comment:
   `>` vs `>=` is debatable; there's an argument both ways.  I suspect 
there's a test for it this way but moreover, I don't think we should change it. 
 It's fine.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] atris commented on a change in pull request #1726: SOLR-14722: timeAllowed should track from req creation

2020-08-11 Thread GitBox


atris commented on a change in pull request #1726:
URL: https://github.com/apache/lucene-solr/pull/1726#discussion_r468709958



##
File path: solr/core/src/java/org/apache/solr/search/SolrQueryTimeoutImpl.java
##
@@ -67,8 +69,21 @@ public boolean shouldExit() {
   }
 
   /**
-   * Method to set the time at which the timeOut should happen.
-   * @param timeAllowed set the time at which this thread should timeout.
+   * Sets or clears the time allowed based on how much time remains from the 
start of the request plus the configured
+   * {@link CommonParams#TIME_ALLOWED}.
+   */
+  public static void set(SolrQueryRequest req) {
+long timeAllowed = req.getParams().getLong(CommonParams.TIME_ALLOWED, -1L);
+if (timeAllowed >= 0L) {

Review comment:
   This seems inconsistent -- should we not be marking no timeout as -1?





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] madrob commented on a change in pull request #1736: Harden RequestRateLimiter Tests

2020-08-11 Thread GitBox


madrob commented on a change in pull request #1736:
URL: https://github.com/apache/lucene-solr/pull/1736#discussion_r468709509



##
File path: 
solr/core/src/test/org/apache/solr/servlet/TestRequestRateLimiter.java
##
@@ -66,20 +65,19 @@ public void testConcurrentQueries() throws Exception {
 
 solrDispatchFilter.replaceRateLimitManager(rateLimitManager);
 
-processTest(client);
+processTest(client, 1 /* number of documents */, 350 /* number of 
queries */);
 
 MockRequestRateLimiter mockQueryRateLimiter = (MockRequestRateLimiter) 
rateLimitManager.getRequestRateLimiter(SolrRequest.SolrRequestType.QUERY);
 
-assertEquals(25, mockQueryRateLimiter.incomingRequestCount.get());
-assertTrue("Incoming accepted new request count did not match. Expected 5 
incoming " + mockQueryRateLimiter.acceptedNewRequestCount.get(),
-mockQueryRateLimiter.acceptedNewRequestCount.get() < 25);
-assertTrue("Incoming rejected new request count did not match. Expected 20 
incoming " + mockQueryRateLimiter.rejectedRequestCount.get(),
-mockQueryRateLimiter.rejectedRequestCount.get() > 0);
+assertEquals(350, mockQueryRateLimiter.incomingRequestCount.get());
+
+assertTrue((mockQueryRateLimiter.acceptedNewRequestCount.get() == 
mockQueryRateLimiter.incomingRequestCount.get()

Review comment:
   Should we assert that accepted + rejected = total?
   And that accepted > 0.

##
File path: 
solr/core/src/test/org/apache/solr/servlet/TestRequestRateLimiter.java
##
@@ -48,7 +47,7 @@ public static void setupCluster() throws Exception {
 configureCluster(1).addConfig(FIRST_COLLECTION, 
configset("cloud-minimal")).configure();
   }
 
-  @Test
+  @Nightly

Review comment:
   This isn't what I meant, sorry for being unclear. Keep this as `@Test` 
but when selecting the number of documents and queries do something like 
https://github.com/apache/lucene-solr/blob/master/lucene/core/src/test/org/apache/lucene/index/TestMultiDocValues.java#L52
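
   i.e. something along these lines inside the test (a sketch of that pattern; 
the exact numbers are placeholders):

       // scale the workload up only for nightly runs, keep normal runs fast
       int numDocs = TEST_NIGHTLY ? atLeast(10_000) : atLeast(100);
       int numQueries = TEST_NIGHTLY ? atLeast(350) : atLeast(50);
       processTest(client, numDocs, numQueries);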





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] dsmiley commented on a change in pull request #1726: SOLR-14722: timeAllowed should track from req creation

2020-08-11 Thread GitBox


dsmiley commented on a change in pull request #1726:
URL: https://github.com/apache/lucene-solr/pull/1726#discussion_r468709003



##
File path: solr/core/src/java/org/apache/solr/search/SolrQueryTimeoutImpl.java
##
@@ -67,8 +69,21 @@ public boolean shouldExit() {
   }
 
   /**
-   * Method to set the time at which the timeOut should happen.
-   * @param timeAllowed set the time at which this thread should timeout.
+   * Sets or clears the time allowed based on how much time remains from the 
start of the request plus the configured
+   * {@link CommonParams#TIME_ALLOWED}.
+   */
+  public static void set(SolrQueryRequest req) {
+long timeAllowed = req.getParams().getLong(CommonParams.TIME_ALLOWED, -1L);
+if (timeAllowed >= 0L) {
+  set(timeAllowed - (long)req.getRequestTimer().getTime()); // reduce by 
time already spent
+} else {
+  reset();
+}
+  }
+
+  /**
+   * Sets the time allowed (milliseconds), assuming we start a timer 
immediately.
+   * You should probably invoke {@link #set(SolrQueryRequest)} instead.
*/
   public static void set(Long timeAllowed) {

Review comment:
   Oh yeah; I forgot that -- indeed a primitive.  It's weird it was boxed.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (LUCENE-9448) Make an equivalent to Ant's "run" target for Luke module

2020-08-11 Thread Tomoko Uchida (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17175661#comment-17175661
 ] 

Tomoko Uchida commented on LUCENE-9448:
---

Thank you [~dweiss] for your help. 
{quote}luke's build file declares an exportable configuration 'standalone' that 
includes a set of preassembled dependencies and classpath-enriched Luke JAR. 
This configuration is separate from the default project JAR. It can be 
assembled (standaloneAssemble), compressed into a tgz archive 
(standalonePackage) or exported and reused elsewhere
{quote}
As for the 'standalone' package, there are drop-in runtime-only dependencies 
(many analysis modules) that are not required for development at all. If we 
make a complete standalone distribution package with the Gradle script, we need 
to collect all such jars or add all of them as compile-time dependencies.
 I'll try it a little later (I have little time right now).

> Make an equivalent to Ant's "run" target for Luke module
> 
>
> Key: LUCENE-9448
> URL: https://issues.apache.org/jira/browse/LUCENE-9448
> Project: Lucene - Core
>  Issue Type: Sub-task
>Reporter: Tomoko Uchida
>Priority: Minor
> Attachments: LUCENE-9448.patch, LUCENE-9448.patch
>
>
> With Ant build, Luke Swing app can be launched by "ant run" after checking 
> out the source code. "ant run" allows developers to immediately see the 
> effects of UI changes without creating the whole zip/tgz package (originally, 
> it was suggested when integrating Luke to Lucene).
> In Gradle, {{:lucene:luke:run}} task would be easily implemented with 
> {{JavaExec}}, I think.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Created] (LUCENE-9456) Stored fields should store the number of chunks in the meta file

2020-08-11 Thread Adrien Grand (Jira)
Adrien Grand created LUCENE-9456:


 Summary: Stored fields should store the number of chunks in the 
meta file
 Key: LUCENE-9456
 URL: https://issues.apache.org/jira/browse/LUCENE-9456
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Adrien Grand


Currently stored fields record numChunks/numDirtyChunks at the very end of the 
data file. They should migrate to the meta file instead, so that they would be 
validated when opening the index (meta files get their checksum validated 
entirely, data files don't).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] atris commented on a change in pull request #1736: Harden RequestRateLimiter Tests

2020-08-11 Thread GitBox


atris commented on a change in pull request #1736:
URL: https://github.com/apache/lucene-solr/pull/1736#discussion_r468681532



##
File path: 
solr/core/src/test/org/apache/solr/servlet/TestRequestRateLimiter.java
##
@@ -66,15 +66,11 @@ public void testConcurrentQueries() throws Exception {
 
     solrDispatchFilter.replaceRateLimitManager(rateLimitManager);
 
-    processTest(client);
+    processTest(client, 1 /* number of documents */, 350 /* number of queries */);
 
     MockRequestRateLimiter mockQueryRateLimiter = (MockRequestRateLimiter) rateLimitManager.getRequestRateLimiter(SolrRequest.SolrRequestType.QUERY);
 
-    assertEquals(25, mockQueryRateLimiter.incomingRequestCount.get());
-    assertTrue("Incoming accepted new request count did not match. Expected 5 incoming " + mockQueryRateLimiter.acceptedNewRequestCount.get(),

Review comment:
   It isn't really a relaxation -- the remaining assert should cover all 
cases that can happen for rate limiting. The catch is that rate limiting is not 
a guaranteed phenomenon -- we create a high load and it should kick in. I have 
added an additional assert -- let me know if it looks fine.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Resolved] (LUCENE-9453) DocumentWriterFlushControl missing explicit sync on write

2020-08-11 Thread Mike Drob (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-9453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob resolved LUCENE-9453.
---
Fix Version/s: master (9.0)
Lucene Fields:   (was: New)
 Assignee: Mike Drob
   Resolution: Fixed

Thanks for the feedback [~dweiss], [~simonw]. Added the assert and committed 
this.
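
For readers following the thread: the committed change keeps checkoutAndBlock 
unsynchronized and instead asserts that the caller already holds the monitor. 
A minimal sketch of that idiom with simplified names (this is not the actual 
DocumentsWriterFlushControl code):

{code:java}
class FlushControlSketch {
  private volatile int numPending;

  // Every caller reaches this method from a synchronized path, so instead of
  // adding another synchronized block the invariant is made explicit:
  void checkoutAndBlock() {
    assert Thread.holdsLock(this) : "caller must hold the FlushControl monitor";
    numPending--; // non-atomic read-modify-write, safe only under the monitor
  }

  synchronized void doAfterDocument() {
    checkoutAndBlock(); // monitor is held here, so the assert passes
  }
}
{code}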

> DocumentWriterFlushControl missing explicit sync on write
> -
>
> Key: LUCENE-9453
> URL: https://issues.apache.org/jira/browse/LUCENE-9453
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/index
>Reporter: Mike Drob
>Assignee: Mike Drob
>Priority: Trivial
> Fix For: master (9.0)
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> checkoutAndBlock is not synchronized, but has a non-atomic write to 
> {{numPending}}. Meanwhile, all of the other writes to numPending are in sync 
> methods.
> In this case it turns out to be ok because all of the code paths calling this 
> method are already sync:
> {{synchronized doAfterDocument -> checkout -> checkoutAndBlock}}
> {{checkoutLargestNonPendingWriter -> synchronized(this) -> checkout -> 
> checkoutAndBlock}}
> If we make {{synchronized checkoutAndBlock}} that protects us against future 
> changes, shouldn't cause any performance impact since the code paths will 
> already be going through a sync block, and will make an IntelliJ warning go 
> away.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (LUCENE-9453) DocumentWriterFlushControl missing explicit sync on write

2020-08-11 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17175651#comment-17175651
 ] 

ASF subversion and git services commented on LUCENE-9453:
-

Commit 092076ec39e0f71ae92d36cd4ebe69e21a97ce4e in lucene-solr's branch 
refs/heads/master from Mike Drob
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=092076e ]

LUCENE-9453 Assert lock held before volatile write (#1734)

Found via IntelliJ warnings.

> DocumentWriterFlushControl missing explicit sync on write
> -
>
> Key: LUCENE-9453
> URL: https://issues.apache.org/jira/browse/LUCENE-9453
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/index
>Reporter: Mike Drob
>Priority: Trivial
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> checkoutAndBlock is not synchronized, but has a non-atomic write to 
> {{numPending}}. Meanwhile, all of the other writes to numPending are in sync 
> methods.
> In this case it turns out to be ok because all of the code paths calling this 
> method are already sync:
> {{synchronized doAfterDocument -> checkout -> checkoutAndBlock}}
> {{checkoutLargestNonPendingWriter -> synchronized(this) -> checkout -> 
> checkoutAndBlock}}
> If we make {{synchronized checkoutAndBlock}} that protects us against future 
> changes, shouldn't cause any performance impact since the code paths will 
> already be going through a sync block, and will make an IntelliJ warning go 
> away.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] madrob merged pull request #1734: LUCENE-9453 Add sync around volatile write

2020-08-11 Thread GitBox


madrob merged pull request #1734:
URL: https://github.com/apache/lucene-solr/pull/1734


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] gautamworah96 commented on pull request #1733: LUCENE-9450 Use BinaryDocValues in the taxonomy writer

2020-08-11 Thread GitBox


gautamworah96 commented on pull request #1733:
URL: https://github.com/apache/lucene-solr/pull/1733#issuecomment-672021513


   Changes in this revision (incorporated from feedback on JIRA):
   
   * Added a call to `advanceExact()` before calling `.binaryValue()` and an 
`assert` to check that the field exists in the index
   * Re-added the `StringField` with the `Field.Store.YES` changed to 
`Field.Store.NO`.
   
   * I've not added new tests at the moment. Trying to get the existing ones to 
work first.
   
   From the error log:
   Note that the code is able to successfully execute the `assert found` 
statement (so the field does exist), and it fails on the next line
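
   For reference, the read-side pattern being described looks roughly like the
   sketch below; the `$full_path$` field name and the helper class are assumptions
   for illustration, not the actual taxonomy writer code.

```java
import java.io.IOException;
import org.apache.lucene.index.BinaryDocValues;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.MultiDocValues;
import org.apache.lucene.util.BytesRef;

// A BinaryDocValues iterator must be positioned with advanceExact() before
// binaryValue() may be read for that document.
class TaxonomyDocValuesSketch {
  static BytesRef readFullPath(IndexReader reader, int docId) throws IOException {
    BinaryDocValues values = MultiDocValues.getBinaryValues(reader, "$full_path$");
    boolean found = values != null && values.advanceExact(docId);
    assert found : "doc " + docId + " has no value for the field";
    return found ? values.binaryValue() : null;
  }
}
```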
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Resolved] (SOLR-14727) Add gradle files to the 8x .gitignore file.

2020-08-11 Thread Erick Erickson (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson resolved SOLR-14727.
---
Fix Version/s: 8.7
   Resolution: Fixed

> Add gradle files to the 8x .gitignore file.
> ---
>
> Key: SOLR-14727
> URL: https://issues.apache.org/jira/browse/SOLR-14727
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Erick Erickson
>Assignee: Erick Erickson
>Priority: Major
> Fix For: 8.7
>
> Attachments: SOLR-14727.patch
>
>
> This is a little different than I thought. Apparently it's an interaction 
> with IntelliJ. One sequence is something like:
>  * import the Gradle build in IntelliJ on master
>  * switch to branch_8x on the command line
>  * switch back to master from the command line and you can't because 
> "untracked changes would be overwritten by..."
> there may be other ways to get into this bind.
> At any rate, I don't see a problem with adding
> gradle.properties
>  gradle/
>  gradlew
>  gradlew.bat
> to .gitignore on branch_8x only.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-14727) Add gradle files to the 8x .gitignore file.

2020-08-11 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17175633#comment-17175633
 ] 

ASF subversion and git services commented on SOLR-14727:


Commit 703becc0372fcfaf8c0184a63bfd9a7070458c6d in lucene-solr's branch 
refs/heads/branch_8x from Erick Erickson
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=703becc ]

SOLR-14727: Add gradle files to the 8x .gitignore file.


> Add gradle files to the 8x .gitignore file.
> ---
>
> Key: SOLR-14727
> URL: https://issues.apache.org/jira/browse/SOLR-14727
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Erick Erickson
>Assignee: Erick Erickson
>Priority: Major
> Attachments: SOLR-14727.patch
>
>
> This is a little different than I thought. Apparently it's an interaction 
> with IntelliJ. One sequence is something like:
>  * import the Gradle build in IntelliJ on master
>  * switch to branch_8x on the command line
>  * switch back to master from the command line and you can't because 
> "untracked changes would be overwritten by..."
> there may be other ways to get into this bind.
> At any rate, I don't see a problem with adding
> gradle.properties
>  gradle/
>  gradlew
>  gradlew.bat
> to .gitignore on branch_8x only.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Updated] (SOLR-14727) Add gradle files to the 8x .gitignore file.

2020-08-11 Thread Erick Erickson (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson updated SOLR-14727:
--
Attachment: (was: SOLR-14727.patch)

> Add gradle files to the 8x .gitignore file.
> ---
>
> Key: SOLR-14727
> URL: https://issues.apache.org/jira/browse/SOLR-14727
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Erick Erickson
>Assignee: Erick Erickson
>Priority: Major
> Attachments: SOLR-14727.patch
>
>
> This is a little different than I thought. Apparently it's an interaction 
> with IntelliJ. One sequence is something like:
>  * import the Gradle build in IntelliJ on master
>  * switch to branch_8x on the command line
>  * switch back to master from the command line and you can't because 
> "untracked changes would be overwritten by..."
> there may be other ways to get into this bind.
> At any rate, I don't see a problem with adding
> gradle.properties
>  gradle/
>  gradlew
>  gradlew.bat
> to .gitignore on branch_8x only.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Updated] (SOLR-14727) Add gradle files to the 8x .gitignore file.

2020-08-11 Thread Erick Erickson (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson updated SOLR-14727:
--
Attachment: SOLR-14727.patch

> Add gradle files to the 8x .gitignore file.
> ---
>
> Key: SOLR-14727
> URL: https://issues.apache.org/jira/browse/SOLR-14727
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Erick Erickson
>Assignee: Erick Erickson
>Priority: Major
> Attachments: SOLR-14727.patch
>
>
> This is a little different than I thought. Apparently it's an interaction 
> with IntelliJ. One sequence is something like:
>  * import the Gradle build in IntelliJ on master
>  * switch to branch_8x on the command line
>  * switch back to master from the command line and you can't because 
> "untracked changes would be overwritten by..."
> there may be other ways to get into this bind.
> At any rate, I don't see a problem with adding
> gradle.properties
>  gradle/
>  gradlew
>  gradlew.bat
> to .gitignore on branch_8x only.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Updated] (SOLR-14727) Add gradle files to the 8x .gitignore file.

2020-08-11 Thread Erick Erickson (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson updated SOLR-14727:
--
Description: 
This is a little different than I thought. Apparently it's an interaction with 
IntelliJ. One sequence is something like:
 * import the Gradle build in IntelliJ on master
 * switch to branch_8x on the command line
 * switch back to master from the command line and you can't because "untracked 
changes would be overwritten by..."

there may be other ways to get into this bind.

At any rate, I don't see a problem with adding

gradle.properties
 gradle/
 gradlew
 gradlew.bat

to .gitignore on branch_8x only.

  was:
This is a little different than I thought. Apparently it's an interaction with 
IntelliJ. One sequence is something like:
 * import the Gradle build in IntelliJ on master
 * switch to branch_8x on the command line
 * switch back to master from the command line and you can't because "untracked 
changes would be overwritten by..."

there may be other ways to get into this bind.

At any rate, I don't see a problem with adding

gradle.properties
gradle/
gradle/
gradlew
gradlew.bat

to .gitignore on branch_8x only.


> Add gradle files to the 8x .gitignore file.
> ---
>
> Key: SOLR-14727
> URL: https://issues.apache.org/jira/browse/SOLR-14727
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Erick Erickson
>Assignee: Erick Erickson
>Priority: Major
> Attachments: SOLR-14727.patch
>
>
> This is a little different than I thought. Apparently it's an interaction 
> with IntelliJ. One sequence is something like:
>  * import the Gradle build in IntelliJ on master
>  * switch to branch_8x on the command line
>  * switch back to master from the command line and you can't because 
> "untracked changes would be overwritten by..."
> there may be other ways to get into this bind.
> At any rate, I don't see a problem with adding
> gradle.properties
>  gradle/
>  gradlew
>  gradlew.bat
> to .gitignore on branch_8x only.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Updated] (SOLR-14727) Add gradle files to the 8x .gitignore file.

2020-08-11 Thread Erick Erickson (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson updated SOLR-14727:
--
Attachment: SOLR-14727.patch

> Add gradle files to the 8x .gitignore file.
> ---
>
> Key: SOLR-14727
> URL: https://issues.apache.org/jira/browse/SOLR-14727
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Erick Erickson
>Assignee: Erick Erickson
>Priority: Major
> Attachments: SOLR-14727.patch
>
>
> This is a little different than I thought. Apparently it's an interaction 
> with IntelliJ. One sequence is something like:
>  * import the Gradle build in IntelliJ on master
>  * switch to branch_8x on the command line
>  * switch back to master from the command line and you can't because 
> "untracked changes would be overwritten by..."
> there may be other ways to get into this bind.
> At any rate, I don't see a problem with adding
> gradle.properties
> gradle/
> gradle/
> gradlew
> gradlew.bat
> to .gitignore on branch_8x only.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Updated] (SOLR-14727) Add gradle files to the 8x .gitignore file.

2020-08-11 Thread Erick Erickson (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson updated SOLR-14727:
--
Description: 
This is a little different than I thought. Apparently it's an interaction with 
IntelliJ. One sequence is something like:
 * import the Gradle build in IntelliJ on master
 * switch to branch_8x on the command line
 * switch back to master from the command line and you can't because "untracked 
changes would be overwritten by..."

there may be other ways to get into this bind.

At any rate, I don't see a problem with adding

gradle.properties
gradle/
gradle/
gradlew
gradlew.bat

to .gitignore on branch_8x only.

  was:
It's annoying to switch from master to 8x after building with Gradle and then 
be unable to switch back because Git sees files the gradle directory and thinks 
you have added files.

This will be for 8x only


> Add gradle files to the 8x .gitignore file.
> ---
>
> Key: SOLR-14727
> URL: https://issues.apache.org/jira/browse/SOLR-14727
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Erick Erickson
>Assignee: Erick Erickson
>Priority: Major
>
> This is a little different than I thought. Apparently it's an interaction 
> with IntelliJ. One sequence is something like:
>  * import the Gradle build in IntelliJ on master
>  * switch to branch_8x on the command line
>  * switch back to master from the command line and you can't because 
> "untracked changes would be overwritten by..."
> there may be other ways to get into this bind.
> At any rate, I don't see a problem with adding
> gradle.properties
> gradle/
> gradle/
> gradlew
> gradlew.bat
> to .gitignore on branch_8x only.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] madrob commented on a change in pull request #1732: Clean up many small fixes

2020-08-11 Thread GitBox


madrob commented on a change in pull request #1732:
URL: https://github.com/apache/lucene-solr/pull/1732#discussion_r468631394



##
File path: lucene/core/src/java/org/apache/lucene/index/SegmentInfos.java
##
@@ -440,7 +440,7 @@ public static final SegmentInfos readCommit(Directory directory, ChecksumIndexIn
       if (format >= VERSION_70) { // oldest supported version
         CodecUtil.checkFooter(input, priorE);
       } else {
-        throw IOUtils.rethrowAlways(priorE);

Review comment:
   The original compiler complaint was that the throw is inside the finally 
block. Could I replace the "Unreachable code" at the end with this rethrow? I 
believe the logic will be the same.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-13412) Make the Lucene Luke module available from a Solr distribution

2020-08-11 Thread Erick Erickson (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17175609#comment-17175609
 ] 

Erick Erickson commented on SOLR-13412:
---

[~dweiss] Yes, I saw that. My take at this point is to answer your question 
with "no, we shouldn't add such a low-level tool to Solr's distribution". I 
think we can enhance the Luke Request Handler with relative ease to satisfy 
most Solr users.

For those who need more, I suspect the intersection of users who really get 
value from Luke and the users who would be comfortable building Lucene is quite 
large, although I have no proof... I'll wait for complaints ;)

Thanks again for your help with LUCENE-9448

> Make the Lucene Luke module available from a Solr distribution
> --
>
> Key: SOLR-13412
> URL: https://issues.apache.org/jira/browse/SOLR-13412
> Project: Solr
>  Issue Type: Improvement
>Reporter: Erick Erickson
>Assignee: Erick Erickson
>Priority: Major
> Attachments: SOLR-13412.patch
>
>
> Now that [~Tomoko Uchida] has put in a great effort to bring Luke into the 
> project, I think it would be good to be able to access it from a Solr distro.
> I want to go to the right place under the Solr install directory and start 
> Luke up to examine the local indexes. 
> This ticket is explicitly _not_ about accessing it from the admin UI, Luke is 
> a stand-alone app that must be invoked on the node that has a Lucene index on 
> the local filesystem
> We need to 
>  * have it included in Solr when running "ant package". 
>  * add some bits to the ref guide on how to invoke
>  ** Where to invoke it from
>  ** mention anything that has to be installed.
>  ** any other "gotchas" someone just installing Solr should be aware of.
>  * Ant should not be necessary.
>  * 
>  
> I'll assign this to myself to keep track of, but would not be offended in the 
> least if someone with more knowledge of "ant package" and the like wanted to 
> take it over ;)
> If we can do it at all



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] madrob commented on a change in pull request #1732: Clean up many small fixes

2020-08-11 Thread GitBox


madrob commented on a change in pull request #1732:
URL: https://github.com/apache/lucene-solr/pull/1732#discussion_r468622877



##
File path: lucene/core/src/java/org/apache/lucene/analysis/Analyzer.java
##
@@ -367,12 +367,12 @@ public void close() {
     /**
      * Original source of the tokens.
      */
-    protected final Consumer<Reader> source;

Review comment:
   I think it's because the field is final and there is a getter for it, so 
the code analyzer prefers encapsulation?





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-13412) Make the Lucene Luke module available from a Solr distribution

2020-08-11 Thread David Eric Pugh (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17175606#comment-17175606
 ] 

David Eric Pugh commented on SOLR-13412:


I've been following this conversation, and wanted to throw my two cents in. Luke 
is a *really* powerful tool, and I've used it a couple of times to troubleshoot 
some hard-to-understand problems.

Having said that, it functions *very* differently than Solr, especially for 
those of us using SolrCloud, and adding it to the Solr distribution feels like 
it goes against the current trend of shrinking the size of the Solr 
distribution.

I think the real issue here is that most of our users don't know that Luke 
exists and that it's a powerful tool. What if we kept Luke as a standalone 
artifact and instead talked about Luke in the Solr Ref Guide? We mention the 
Luke request handler on 
https://lucene.apache.org/solr/guide/8_6/implicit-requesthandlers.html; that 
page could also link to a page with more details on Luke and where to download 
it from. Which reminds me: we should add the word Luke to the Ref Guide glossary 
page!

I just poked around the lucene.apache.org site, and there is no mention of Luke 
anywhere...



> Make the Lucene Luke module available from a Solr distribution
> --
>
> Key: SOLR-13412
> URL: https://issues.apache.org/jira/browse/SOLR-13412
> Project: Solr
>  Issue Type: Improvement
>Reporter: Erick Erickson
>Assignee: Erick Erickson
>Priority: Major
> Attachments: SOLR-13412.patch
>
>
> Now that [~Tomoko Uchida] has put in a great effort to bring Luke into the 
> project, I think it would be good to be able to access it from a Solr distro.
> I want to go to the right place under the Solr install directory and start 
> Luke up to examine the local indexes. 
> This ticket is explicitly _not_ about accessing it from the admin UI, Luke is 
> a stand-alone app that must be invoked on the node that has a Lucene index on 
> the local filesystem
> We need to 
>  * have it included in Solr when running "ant package". 
>  * add some bits to the ref guide on how to invoke
>  ** Where to invoke it from
>  ** mention anything that has to be installed.
>  ** any other "gotchas" someone just installing Solr should be aware of.
>  * Ant should not be necessary.
>  * 
>  
> I'll assign this to myself to keep track of, but would not be offended in the 
> least if someone with more knowledge of "ant package" and the like wanted to 
> take it over ;)
> If we can do it at all



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Resolved] (SOLR-14650) Default autoscaling policy rules are ineffective

2020-08-11 Thread Houston Putman (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Houston Putman resolved SOLR-14650.
---
Fix Version/s: (was: 8.7)
   Resolution: Abandoned

The default autoscaling policy has been removed as of 8.6.1 (and therefore 8.7)

> Default autoscaling policy rules are ineffective
> 
>
> Key: SOLR-14650
> URL: https://issues.apache.org/jira/browse/SOLR-14650
> Project: Solr
>  Issue Type: Bug
>  Components: AutoScaling
>Affects Versions: 8.6
>Reporter: Andrzej Bialecki
>Assignee: Andrzej Bialecki
>Priority: Major
>
> There's a faulty logic in {{Assign.usePolicyFramework()}} that makes the 
> default policy (added in SOLR-12845) ineffective - that is, in the absence of 
> any user-provided modifications to the policy rules the code reverts to 
> LEGACY assignment.
> The logic in this method is convoluted and opaque, it's difficult for users 
> to be sure what strategy is used when - instead we should make this choice 
> explicit.
> (BTW, the default ruleset is probably too expensive for large clusters 
> anyway, given the unresolved performance problems in the policy engine).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (LUCENE-9379) Directory based approach for index encryption

2020-08-11 Thread Bruno Roustant (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17175583#comment-17175583
 ] 

Bruno Roustant commented on LUCENE-9379:


[~Raji] maybe a better approach would be to have one tenant per collection, but 
you might have so many tenants that the performance with many collections is poor? 
If that is the case, then I think the root problem is the performance with many 
collections. Without the composite id router you could use OS-level encryption per 
collection.

> Directory based approach for index encryption
> -
>
> Key: LUCENE-9379
> URL: https://issues.apache.org/jira/browse/LUCENE-9379
> Project: Lucene - Core
>  Issue Type: New Feature
>Reporter: Bruno Roustant
>Assignee: Bruno Roustant
>Priority: Major
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> +Important+: This Lucene Directory wrapper approach is to be considered only 
> if an OS level encryption is not possible. OS level encryption better fits 
> Lucene usage of OS cache, and thus is more performant.
> But there are some use-case where OS level encryption is not possible. This 
> Jira issue was created to address those.
> 
>  
> The goal is to provide optional encryption of the index, with a scope limited 
> to an encryptable Lucene Directory wrapper.
> Encryption is at rest on disk, not in memory.
> This simple approach should fit any Codec as it would be orthogonal, without 
> modifying APIs as much as possible.
> Use a standard encryption method. Limit perf/memory impact as much as 
> possible.
> Determine how callers provide encryption keys. They must not be stored on 
> disk.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Updated] (SOLR-14728) Add self join optimization to the TopLevelJoinQuery

2020-08-11 Thread Joel Bernstein (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-14728:
--
Description: A simple optimization can be put in place to massively improve 
join performance when the TopLevelJoinQuery is performing a self join (same 
core) and the *to* and *from* fields are the same field. In this scenario the 
top level doc values ordinals can be used directly as a filter avoiding the 
most expensive part of the join which is the bytes ref reconciliation between 
the *to* and *from* fields.  (was: A simple optimization can be put in place to 
massively improve join performance when the TopLevelJoinQuery is performing a 
self join (same core) and the *to* and *from* fields are the same field. In 
this scenario the top level doc values ordinals can be used directly as a 
filter avoiding the most expensive part of the join which is the bytes ref 
reconciliation between the *to* and *from* fields. )

> Add self join optimization to the TopLevelJoinQuery
> ---
>
> Key: SOLR-14728
> URL: https://issues.apache.org/jira/browse/SOLR-14728
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
>Priority: Major
>
> A simple optimization can be put in place to massively improve join 
> performance when the TopLevelJoinQuery is performing a self join (same core) 
> and the *to* and *from* fields are the same field. In this scenario the top 
> level doc values ordinals can be used directly as a filter avoiding the most 
> expensive part of the join which is the bytes ref reconciliation between the 
> *to* and *from* fields.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Updated] (SOLR-14728) Add self join optimization to the TopLevelJoinQuery

2020-08-11 Thread Joel Bernstein (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-14728:
--
Description: A simple optimization can be put in place to massively improve 
join performance when the TopLevelJoinQuery is performing a self join (same 
core) and the *to* and *from* fields are the same field. In this scenario the 
top level doc values ordinals can be used directly as a filter avoiding the 
most expensive part of the join which is the bytes ref reconciliation between 
the *to* and *from* fields.   (was: A simple optimization that can be put in 
place to massively improve join performance when the TopLevelJoinQuery is 
performing a self join (same core) and the *to* and *from* fields are the same 
field. In this scenario the top level doc values ordinals can be used directly 
as a filter avoiding the most expensive part of the join which is the bytes ref 
reconciliation between the *to* and *from* fields. )

> Add self join optimization to the TopLevelJoinQuery
> ---
>
> Key: SOLR-14728
> URL: https://issues.apache.org/jira/browse/SOLR-14728
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
>Priority: Major
>
> A simple optimization can be put in place to massively improve join 
> performance when the TopLevelJoinQuery is performing a self join (same core) 
> and the *to* and *from* fields are the same field. In this scenario the top 
> level doc values ordinals can be used directly as a filter avoiding the most 
> expensive part of the join which is the bytes ref reconciliation between the 
> *to* and *from* fields. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Created] (LUCENE-9455) ExitableTermsEnum (in ExitableDirectoryReader) should sample next()

2020-08-11 Thread David Smiley (Jira)
David Smiley created LUCENE-9455:


 Summary: ExitableTermsEnum (in ExitableDirectoryReader) should 
sample next()
 Key: LUCENE-9455
 URL: https://issues.apache.org/jira/browse/LUCENE-9455
 Project: Lucene - Core
  Issue Type: Improvement
  Components: core/other
Reporter: David Smiley


ExitableTermsEnum calls "checkAndThrow" on *every* call to next().  This is too 
expensive; it should sample, i.e. only consult the timeout on a fraction of the 
calls.  I observed that Elasticsearch uses such a sampling approach; I think 
Lucene would benefit from this:
https://github.com/elastic/elasticsearch/blob/4af4eb99e18fdaadac879b1223e986227dd2ee71/server/src/main/java/org/elasticsearch/search/internal/ExitableDirectoryReader.java#L151

CC [~jimczi]
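
A rough sketch of the sampling idea, assuming a wrapper that delegates to an 
inner TermsEnum; the interval, the names and the plain RuntimeException are 
illustrative (the real ExitableTermsEnum would throw its own 
ExitingReaderException):

{code:java}
import java.io.IOException;

import org.apache.lucene.index.QueryTimeout;
import org.apache.lucene.index.TermsEnum;
import org.apache.lucene.util.BytesRef;

// Consult the QueryTimeout only on a sample of next() calls instead of every
// call, trading a slightly later exit for far fewer timeout checks.
class SamplingExitCheckSketch {
  private static final int MASK = 0x3F; // check roughly every 64 calls (illustrative)
  private int calls;

  BytesRef next(TermsEnum in, QueryTimeout timeout) throws IOException {
    if ((calls++ & MASK) == 0 && timeout.shouldExit()) {
      // ExitableTermsEnum would throw ExitingReaderException here.
      throw new RuntimeException("Time limit exceeded while iterating terms");
    }
    return in.next();
  }
}
{code}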



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Updated] (SOLR-14728) Add self join optimization to the TopLevelJoinQuery

2020-08-11 Thread Joel Bernstein (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-14728:
--
Description: A simple optimization that can be put in place to massively 
improve join performance when the TopLevelJoinQuery is performing a self join 
(same core) and the *to* and *from* fields are the same field. In this scenario 
the top level doc values ordinals can be used directly as a filter avoiding the 
most expensive part of the join which is the bytes ref reconciliation between 
the *to* and *from* fields.   (was: A simple strategy can be put in place to 
massively improve join performance when the TopLevelJoinQuery is performing a 
self join (same core) and the *to* and *from* fields are the same field. In 
this scenario the top level doc values ordinals can be used directly as a 
filter, *avoiding* the most expensive part of the join which is the bytes ref 
reconciliation between the *to* and *from* fields. )

> Add self join optimization to the TopLevelJoinQuery
> ---
>
> Key: SOLR-14728
> URL: https://issues.apache.org/jira/browse/SOLR-14728
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
>Priority: Major
>
> A simple optimization that can be put in place to massively improve join 
> performance when the TopLevelJoinQuery is performing a self join (same core) 
> and the *to* and *from* fields are the same field. In this scenario the top 
> level doc values ordinals can be used directly as a filter avoiding the most 
> expensive part of the join which is the bytes ref reconciliation between the 
> *to* and *from* fields. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Resolved] (SOLR-14692) JSON Facet "join" domain should take optional "method" property

2020-08-11 Thread Jason Gerlowski (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Gerlowski resolved SOLR-14692.

Fix Version/s: 8.7
   master (9.0)
 Assignee: Jason Gerlowski
   Resolution: Fixed

All wrapped up; thanks to Munendra for the review comments.

> JSON Facet "join" domain should take optional "method" property
> ---
>
> Key: SOLR-14692
> URL: https://issues.apache.org/jira/browse/SOLR-14692
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: faceting, JSON Request API
>Affects Versions: master (9.0), 8.6
>Reporter: Jason Gerlowski
>Assignee: Jason Gerlowski
>Priority: Minor
> Fix For: master (9.0), 8.7
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Solr offers several different join implementations which can be switched off 
> providing the "method" local-param on JoinQuery's.  Each of these 
> implementations has different performance characteristics and can behave very 
> differently depending on a user's data and use case.
> When joins are used internally as a part of JSON Faceting's "join" 
> domain-transform though, users have no way to specify which implementation 
> they would like to use.  We should correct this by adding a "method" property 
> to the join domain-transform.  This will let user's choose the join that's 
> most performant for their use case during JSON Facet requests.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Updated] (SOLR-14728) Add self join optimization to the TopLevelJoinQuery

2020-08-11 Thread Joel Bernstein (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-14728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-14728:
--
Description: A simple strategy can be put in place to massively improve 
join performance when the TopLevelJoinQuery is performing a self join (same 
core) and the *to* and *from* fields are the same field. In this scenario the 
top level doc values ordinals can be used directly as a filter, *avoiding* the 
most expensive part of the join which is the bytes ref reconciliation between 
the *to* and *from* fields.   (was: A simple strategy can be put in place to 
massively improve join performance when the TopLevelJoinQuery is performing a 
self join (same core) and the *to* and *from* fields are the same field. In 
this scenario the top level doc values ordinals can be used directly as a 
filter avoiding the most expensive part of the join which is the bytes ref 
reconciliation between the *to* and *from* fields. )

> Add self join optimization to the TopLevelJoinQuery
> ---
>
> Key: SOLR-14728
> URL: https://issues.apache.org/jira/browse/SOLR-14728
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
>Priority: Major
>
> A simple strategy can be put in place to massively improve join performance 
> when the TopLevelJoinQuery is performing a self join (same core) and the *to* 
> and *from* fields are the same field. In this scenario the top level doc 
> values ordinals can be used directly as a filter, *avoiding* the most 
> expensive part of the join which is the bytes ref reconciliation between the 
> *to* and *from* fields. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Created] (SOLR-14728) Add self join optimization to the TopLevelJoinQuery

2020-08-11 Thread Joel Bernstein (Jira)
Joel Bernstein created SOLR-14728:
-

 Summary: Add self join optimization to the TopLevelJoinQuery
 Key: SOLR-14728
 URL: https://issues.apache.org/jira/browse/SOLR-14728
 Project: Solr
  Issue Type: New Feature
  Security Level: Public (Default Security Level. Issues are Public)
Reporter: Joel Bernstein


A simple strategy can be put in place to massively improve join performance 
when the TopLevelJoinQuery is performing a self join (same core) and the *to* 
and *from* fields are the same field. In this scenario the top level doc values 
ordinals can be used directly as a filter avoiding the most expensive part of 
the join which is the bytes ref reconciliation between the *to* and *from* 
fields. 
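
A hedged sketch of the idea, assuming the join field has SortedSet doc values 
and that the "from" documents are visited in increasing docid order; this only 
illustrates reusing top-level ordinals as a filter and is not TopLevelJoinQuery's 
actual implementation:

{code:java}
import java.io.IOException;

import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.MultiDocValues;
import org.apache.lucene.index.SortedSetDocValues;
import org.apache.lucene.util.LongBitSet;

// Collect the top-level ordinals of the "from" docs once; because the join is a
// self join on the same field in the same core, those ordinals can be tested
// directly against the "to" side, skipping any per-term BytesRef lookups.
class SelfJoinOrdinalsSketch {
  static LongBitSet collectFromOrdinals(IndexReader reader, String field, int[] ascendingFromDocs)
      throws IOException {
    SortedSetDocValues dv = MultiDocValues.getSortedSetValues(reader, field);
    if (dv == null) {
      return new LongBitSet(0); // field has no doc values anywhere
    }
    LongBitSet ords = new LongBitSet(dv.getValueCount());
    for (int doc : ascendingFromDocs) {
      if (dv.advanceExact(doc)) {
        for (long ord = dv.nextOrd(); ord != SortedSetDocValues.NO_MORE_ORDS; ord = dv.nextOrd()) {
          ords.set(ord);
        }
      }
    }
    return ords; // a "to" doc matches if any of its ordinals is set in this bitset
  }
}
{code}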



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-14692) JSON Facet "join" domain should take optional "method" property

2020-08-11 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17175544#comment-17175544
 ] 

ASF subversion and git services commented on SOLR-14692:


Commit d6992f74e0673d2ed5593c6d9312651e94267446 in lucene-solr's branch 
refs/heads/branch_8x from Jason Gerlowski
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=d6992f7 ]

SOLR-14692: Allow 'method' specification on JSON Facet join domain transforms 
(#1707)


> JSON Facet "join" domain should take optional "method" property
> ---
>
> Key: SOLR-14692
> URL: https://issues.apache.org/jira/browse/SOLR-14692
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: faceting, JSON Request API
>Affects Versions: master (9.0), 8.6
>Reporter: Jason Gerlowski
>Priority: Minor
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Solr offers several different join implementations which can be switched off 
> providing the "method" local-param on JoinQuery's.  Each of these 
> implementations has different performance characteristics and can behave very 
> differently depending on a user's data and use case.
> When joins are used internally as a part of JSON Faceting's "join" 
> domain-transform though, users have no way to specify which implementation 
> they would like to use.  We should correct this by adding a "method" property 
> to the join domain-transform.  This will let user's choose the join that's 
> most performant for their use case during JSON Facet requests.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Created] (SOLR-14727) Add gradle files to the 8x .gitignore file.

2020-08-11 Thread Erick Erickson (Jira)
Erick Erickson created SOLR-14727:
-

 Summary: Add gradle files to the 8x .gitignore file.
 Key: SOLR-14727
 URL: https://issues.apache.org/jira/browse/SOLR-14727
 Project: Solr
  Issue Type: Improvement
  Security Level: Public (Default Security Level. Issues are Public)
Reporter: Erick Erickson
Assignee: Erick Erickson


It's annoying to switch from master to 8x after building with Gradle and then 
be unable to switch back because Git sees files in the gradle directory and thinks 
you have added files.

This will be for 8x only



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-13412) Make the Lucene Luke module available from a Solr distribution

2020-08-11 Thread Dawid Weiss (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-13412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17175528#comment-17175528
 ] 

Dawid Weiss commented on SOLR-13412:


bq. My current thinking is that if the stand-alone Luke app is packaged with 
Lucene automagically,

Look at the patch I provided, Erick. This is simple to do. Question is whether 
it makes sense to add such a low-level tool to Solr's distribution.

> Make the Lucene Luke module available from a Solr distribution
> --
>
> Key: SOLR-13412
> URL: https://issues.apache.org/jira/browse/SOLR-13412
> Project: Solr
>  Issue Type: Improvement
>Reporter: Erick Erickson
>Assignee: Erick Erickson
>Priority: Major
> Attachments: SOLR-13412.patch
>
>
> Now that [~Tomoko Uchida] has put in a great effort to bring Luke into the 
> project, I think it would be good to be able to access it from a Solr distro.
> I want to go to the right place under the Solr install directory and start 
> Luke up to examine the local indexes. 
> This ticket is explicitly _not_ about accessing it from the admin UI, Luke is 
> a stand-alone app that must be invoked on the node that has a Lucene index on 
> the local filesystem
> We need to 
>  * have it included in Solr when running "ant package". 
>  * add some bits to the ref guide on how to invoke
>  ** Where to invoke it from
>  ** mention anything that has to be installed.
>  ** any other "gotchas" someone just installing Solr should be aware of.
>  * Ant should not be necessary.
>  * 
>  
> I'll assign this to myself to keep track of, but would not be offended in the 
> least if someone with more knowledge of "ant package" and the like wanted to 
> take it over ;)
> If we can do it at all



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] madrob commented on a change in pull request #1736: Harden RequestRateLimiter Tests

2020-08-11 Thread GitBox


madrob commented on a change in pull request #1736:
URL: https://github.com/apache/lucene-solr/pull/1736#discussion_r468542925



##
File path: 
solr/core/src/test/org/apache/solr/servlet/TestRequestRateLimiter.java
##
@@ -66,15 +66,11 @@ public void testConcurrentQueries() throws Exception {
 
     solrDispatchFilter.replaceRateLimitManager(rateLimitManager);
 
-    processTest(client);
+    processTest(client, 1 /* number of documents */, 350 /* number of queries */);

Review comment:
   Can we limit the higher footprint to Nightly?

##
File path: 
solr/core/src/test/org/apache/solr/servlet/TestRequestRateLimiter.java
##
@@ -124,12 +120,12 @@ private void processTest(CloudSolrClient client) throws Exception {
     List> futures;
 
     try {
-      for (int i = 0; i < 25; i++) {
+      for (int i = 0; i < numQueries; i++) {
         callableList.add(() -> {
           try {
             QueryResponse response = client.query(new SolrQuery("*:*"));
 
-            assertEquals(100, response.getResults().getNumFound());
+            assertEquals(1, response.getResults().getNumFound());

Review comment:
   should be numDocuments

##
File path: 
solr/core/src/test/org/apache/solr/servlet/TestRequestRateLimiter.java
##
@@ -66,15 +66,11 @@ public void testConcurrentQueries() throws Exception {
 
     solrDispatchFilter.replaceRateLimitManager(rateLimitManager);
 
-    processTest(client);
+    processTest(client, 1 /* number of documents */, 350 /* number of queries */);
 
     MockRequestRateLimiter mockQueryRateLimiter = (MockRequestRateLimiter) rateLimitManager.getRequestRateLimiter(SolrRequest.SolrRequestType.QUERY);
 
-    assertEquals(25, mockQueryRateLimiter.incomingRequestCount.get());
-    assertTrue("Incoming accepted new request count did not match. Expected 5 incoming " + mockQueryRateLimiter.acceptedNewRequestCount.get(),

Review comment:
   Somewhat concerning that the fix to the test is to relax the assertion 
conditions





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (LUCENE-9439) Matches API should enumerate hit fields that have no positions (no iterator)

2020-08-11 Thread Dawid Weiss (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17175519#comment-17175519
 ] 

Dawid Weiss commented on LUCENE-9439:
-

Seems to pass all tests and checks. Works for me in a production system too.

> Matches API should enumerate hit fields that have no positions (no iterator)
> 
>
> Key: LUCENE-9439
> URL: https://issues.apache.org/jira/browse/LUCENE-9439
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Dawid Weiss
>Assignee: Dawid Weiss
>Priority: Minor
> Attachments: LUCENE-9439.patch, matchhighlighter.patch
>
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> I have been fiddling with Matches API and it's great. There is one corner 
> case that doesn't work for me though -- queries that affect fields without 
> positions return {{MatchesUtil.MATCH_WITH_NO_TERMS}} but this constant is 
> problematic as it doesn't carry the field name that caused it (returns null).
> The associated fromSubMatches combines all these constants into one (or 
> swallows them) which is another problem.
> I think it would be more consistent if MATCH_WITH_NO_TERMS was replaced with 
> a true match (carrying field name) returning an empty iterator (or a constant 
> "empty" iterator NO_TERMS).
> I have a very compelling use case: I wrote an "auto-highlighter" that runs on 
> top of Matches API and automatically picks up query-relevant fields and 
> snippets. Everything works beautifully except for cases where fields are 
> searchable but don't have any positions (token-like fields).
> I can work on a patch but wanted to reach out first - [~romseygeek]?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-14630) CloudSolrClient doesn't pick correct core when server contains more shards

2020-08-11 Thread Ivan Djurasevic (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17175518#comment-17175518
 ] 

Ivan Djurasevic commented on SOLR-14630:


{quote}When you say "batch update", do you mean more than one document in the 
same request or perhaps something else? If the batch size was one then does the 
issue happen also, I wonder?
{quote}
Batch size is not the problem. The same issue happens when the batch contains 50 
documents and when it contains 1 document.
{quote}I'm not very familiar with update request processor chains but the 
[https://lucene.apache.org/solr/guide/8_6/update-request-processors.html#update-processors-in-solrcloud]
 documentation was useful and the SOLR-8030 ticket mentioned in it sounds 
interesting.
{quote}
The update processor chain is not the problem (they have some other issues; I will 
raise bugs for that team, too). I was describing our process and why it is important 
to hit the correct shard without forwarding requests.
{quote}What if {{inputCollections}} contained more than one element?
{quote}
Yes, this is a problem. I was trying to search across collections and, with my 
fix, it doesn't work. It seems that the HttpSolrCall class can't parse the URL when 
it contains more than one core name.
{quote}What if {{inputCollections}} contained an alias that was resolved at 
[line 
1080|https://github.com/apache/lucene-solr/blob/releases/lucene-solr/8.5.2/solr/solrj/src/java/org/apache/solr/client/solrj/impl/BaseCloudSolrClient.java#L1080],
 does it matter that before the alias (e.g. {{collection_one}}) was appended 
but now a core name (e.g. {{collection1_shard2_replica1}}) is appended?
{quote}
Aliases shouldn't be a problem once we solve the issue with multiple 
collections (because we find the real collection names before creating the URL).

Unfortunately, to fix this issue we will need to refactor the HttpSolrCall class, 
too.

 

> CloudSolrClient doesn't pick correct core when server contains more shards
> --
>
> Key: SOLR-14630
> URL: https://issues.apache.org/jira/browse/SOLR-14630
> Project: Solr
>  Issue Type: Bug
>  Components: SolrCloud, SolrJ
>Affects Versions: 8.5.1, 8.5.2
>Reporter: Ivan Djurasevic
>Priority: Major
> Attachments: 
> 0001-SOLR-14630-Test-case-demonstrating-_route_-is-broken.patch
>
>
> Precondition: create collection with 4 shards on one server.
> During search and update, solr cloud client picks wrong core even _route_ 
> exists in query param. In BaseSolrClient class, method sendRequest, 
>  
> {code:java}
> sortedReplicas.forEach( replica -> {
>   if (seenNodes.add(replica.getNodeName())) {
> theUrlList.add(ZkCoreNodeProps.getCoreUrl(replica.getBaseUrl(), 
> joinedInputCollections));
>   }
> });
> {code}
>  
> Previous part of code adds base url(localhost:8983/solr/collection_name) to 
> theUrlList, it doesn't create core address(localhost:8983/solr/core_name). If 
> we change previous code to:
> {quote}
> {code:java}
> sortedReplicas.forEach(replica -> {
> if (seenNodes.add(replica.getNodeName())) {
> theUrlList.add(replica.getCoreUrl());
> }
> });{code}
> {quote}
> Solr cloud client picks core which is defined with  _route_ parameter.
>  
>   



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] gerlowskija merged pull request #1707: SOLR-14692: Allow 'method' specification on JSON Facet join domain transforms

2020-08-11 Thread GitBox


gerlowskija merged pull request #1707:
URL: https://github.com/apache/lucene-solr/pull/1707


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-14692) JSON Facet "join" domain should take optional "method" property

2020-08-11 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17175514#comment-17175514
 ] 

ASF subversion and git services commented on SOLR-14692:


Commit 5887032e95953a8d93d723e1a5210793472def71 in lucene-solr's branch 
refs/heads/master from Jason Gerlowski
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=5887032 ]

SOLR-14692: Allow 'method' specification on JSON Facet join domain transforms 
(#1707)



> JSON Facet "join" domain should take optional "method" property
> ---
>
> Key: SOLR-14692
> URL: https://issues.apache.org/jira/browse/SOLR-14692
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: faceting, JSON Request API
>Affects Versions: master (9.0), 8.6
>Reporter: Jason Gerlowski
>Priority: Minor
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Solr offers several different join implementations which can be switched 
> between by providing the "method" local-param on join queries.  Each of these 
> implementations has different performance characteristics and can behave very 
> differently depending on a user's data and use case.
> When joins are used internally as a part of JSON Faceting's "join" 
> domain-transform though, users have no way to specify which implementation 
> they would like to use.  We should correct this by adding a "method" property 
> to the join domain-transform.  This will let users choose the join that's 
> most performant for their use case during JSON Facet requests.
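
As a hedged illustration of what the new property enables, here is a SolrJ 
sketch of a JSON Facet request whose "join" domain carries a "method". The 
collection and field names are placeholders, and "index" stands in for 
whatever join methods the join QParser accepts:

{code:java}
import java.util.Map;

import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.request.json.JsonQueryRequest;
import org.apache.solr.client.solrj.response.QueryResponse;

public class JoinDomainMethodExample {
  public static void main(String[] args) throws Exception {
    try (HttpSolrClient client =
        new HttpSolrClient.Builder("http://localhost:8983/solr").build()) {

      // A terms facet whose domain is changed by a "join" transform; the new
      // "method" property selects the join implementation to use.
      Map<String, Object> facet = Map.of(
          "type", "terms",
          "field", "cat",
          "domain", Map.of(
              "join", Map.of(
                  "from", "manu_id_s",
                  "to", "id",
                  "method", "index")));

      JsonQueryRequest request = new JsonQueryRequest()
          .setQuery("*:*")
          .withFacet("categories", facet);
      QueryResponse response = request.process(client, "techproducts");
      System.out.println(response.getResponse().get("facets"));
    }
  }
}
{code}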



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] gerlowskija commented on pull request #1707: SOLR-14692: Allow 'method' specification on JSON Facet join domain transforms

2020-08-11 Thread GitBox


gerlowskija commented on pull request #1707:
URL: https://github.com/apache/lucene-solr/pull/1707#issuecomment-671912624


   Thanks for the review Munendra; I made the changes you suggested.  Merging 
now.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (LUCENE-9439) Matches API should enumerate hit fields that have no positions (no iterator)

2020-08-11 Thread Dawid Weiss (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17175481#comment-17175481
 ] 

Dawid Weiss commented on LUCENE-9439:
-

A cleaned up version in the PR. Running tests and checks now.

> Matches API should enumerate hit fields that have no positions (no iterator)
> 
>
> Key: LUCENE-9439
> URL: https://issues.apache.org/jira/browse/LUCENE-9439
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Dawid Weiss
>Assignee: Dawid Weiss
>Priority: Minor
> Attachments: LUCENE-9439.patch, matchhighlighter.patch
>
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> I have been fiddling with Matches API and it's great. There is one corner 
> case that doesn't work for me though -- queries that affect fields without 
> positions return {{MatchesUtil.MATCH_WITH_NO_TERMS}} but this constant is 
> problematic as it doesn't carry the field name that caused it (returns null).
> The associated fromSubMatches combines all these constants into one (or 
> swallows them), which is another problem.
> I think it would be more consistent if MATCH_WITH_NO_TERMS was replaced with 
> a true match (carrying field name) returning an empty iterator (or a constant 
> "empty" iterator NO_TERMS).
> I have a very compelling use case: I wrote an "auto-highlighter" that runs on 
> top of Matches API and automatically picks up query-relevant fields and 
> snippets. Everything works beautifully except for cases where fields are 
> searchable but don't have any positions (token-like fields).
> I can work on a patch but wanted to reach out first - [~romseygeek]?
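
For readers unfamiliar with the API in question, a minimal sketch of how 
matching fields are usually enumerated (an already-open IndexSearcher and a 
top-level docId are assumed). Fields indexed without positions are exactly the 
case where the current MATCH_WITH_NO_TERMS constant loses the field name, so 
they would not show up in this loop today:

{code:java}
import java.io.IOException;
import java.util.List;

import org.apache.lucene.index.LeafReaderContext;
import org.apache.lucene.index.ReaderUtil;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Matches;
import org.apache.lucene.search.MatchesIterator;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreMode;
import org.apache.lucene.search.Weight;

public final class MatchingFieldsExample {
  /** Prints every field of the given document that the query matched, with offsets where available. */
  public static void printMatchingFields(IndexSearcher searcher, Query query, int docId) throws IOException {
    Weight weight = searcher.createWeight(searcher.rewrite(query), ScoreMode.COMPLETE_NO_SCORES, 1f);

    // Locate the leaf segment containing the (top-level) docId.
    List<LeafReaderContext> leaves = searcher.getIndexReader().leaves();
    LeafReaderContext leaf = leaves.get(ReaderUtil.subIndex(docId, leaves));

    Matches matches = weight.matches(leaf, docId - leaf.docBase);
    if (matches == null) {
      return; // the query did not match this document
    }

    // Matches is Iterable<String> over the names of matching fields; as the issue
    // notes, fields without positions currently surface as MATCH_WITH_NO_TERMS
    // and are not listed here.
    for (String field : matches) {
      MatchesIterator it = matches.getMatches(field);
      while (it != null && it.next()) {
        System.out.println(field + " [" + it.startOffset() + ", " + it.endOffset() + ")");
      }
    }
  }
}
{code}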



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] atris edited a comment on pull request #1737: SOLR-14615: Implement CPU Utilization Based Circuit Breaker

2020-08-11 Thread GitBox


atris edited a comment on pull request #1737:
URL: https://github.com/apache/lucene-solr/pull/1737#issuecomment-671879842


   > The configuration can be as simple as
   > 
   > ``
   > 
   > This way you can just read all the attributes all at once from the 
`PluginInfo` .
   > CircuitBreaker should be a type of plugin. It should be an interface
   
   As discussed offline, I will refactor the circuit breaker infrastructure to use 
PluginInfo as part of 8.7 (hence this PR's JIRA will be left open for that 
effort). I am not proceeding with that effort in this PR.
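
   As a rough sketch of the kind of change being discussed, reading all breaker 
settings from the plugin's `PluginInfo` in one place could look roughly like the 
following. The class name and attribute names below are hypothetical 
illustrations, not the eventual 8.7 API:

   ```java
   import org.apache.solr.core.PluginInfo;

   // Hypothetical configuration holder for a circuit breaker declared as a plugin
   // in solrconfig.xml; all settings come straight from the element's attributes.
   public class CircuitBreakerSettings {
     public final boolean enabled;
     public final int threshold;

     public CircuitBreakerSettings(PluginInfo info) {
       // PluginInfo exposes the plugin element's XML attributes as a plain map, so
       // nothing breaker-specific has to be wired through SolrConfig itself.
       this.enabled = Boolean.parseBoolean(info.attributes.getOrDefault("enabled", "false"));
       this.threshold = Integer.parseInt(info.attributes.getOrDefault("threshold", "95"));
     }
   }
   ```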



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-14588) Circuit Breakers Infrastructure and Real JVM Based Circuit Breaker

2020-08-11 Thread Atri Sharma (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17175453#comment-17175453
 ] 

Atri Sharma commented on SOLR-14588:


As discussed offline, this can be done as a refactor before 8.7 – hence leaving 
this Jira open to track that specific effort.

> Circuit Breakers Infrastructure and Real JVM Based Circuit Breaker
> --
>
> Key: SOLR-14588
> URL: https://issues.apache.org/jira/browse/SOLR-14588
> Project: Solr
>  Issue Type: Improvement
>Reporter: Atri Sharma
>Assignee: Atri Sharma
>Priority: Blocker
> Fix For: master (9.0), 8.7
>
>  Time Spent: 13h 50m
>  Remaining Estimate: 0h
>
> This Jira tracks addition of circuit breakers in the search path and 
> implements JVM based circuit breaker which rejects incoming search requests 
> if the JVM heap usage exceeds a defined percentage.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] atris commented on pull request #1737: SOLR-14615: Implement CPU Utilization Based Circuit Breaker

2020-08-11 Thread GitBox


atris commented on pull request #1737:
URL: https://github.com/apache/lucene-solr/pull/1737#issuecomment-671879842


   > The configuration can be as simple as
   > 
   > ``
   > 
   > This way you can just read all the attributes all at once from the 
`PluginInfo` .
   > CircuitBreaker should be a type of plugin. It should be an interface
   
   As discussed offline, I will refactor the circuit breaker infrastructure to use 
PluginInfo as part of 8.7 (hence this PR's JIRA will be left open for that 
effort). I am not proceeding with that effort in this PR.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Resolved] (LUCENE-9454) Upgrade hamcrest to version 2.2

2020-08-11 Thread Dawid Weiss (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-9454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dawid Weiss resolved LUCENE-9454.
-
Resolution: Fixed

> Upgrade hamcrest to version 2.2
> ---
>
> Key: LUCENE-9454
> URL: https://issues.apache.org/jira/browse/LUCENE-9454
> Project: Lucene - Core
>  Issue Type: Task
>Affects Versions: master (9.0)
>Reporter: Dawid Weiss
>Assignee: Dawid Weiss
>Priority: Trivial
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (LUCENE-9454) Upgrade hamcrest to version 2.2

2020-08-11 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17175420#comment-17175420
 ] 

ASF subversion and git services commented on LUCENE-9454:
-

Commit 5375a2d2ada2bb3bd94cffcb49a730ec234c8649 in lucene-solr's branch 
refs/heads/master from Dawid Weiss
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=5375a2d ]

LUCENE-9454: upgrade hamcrest to version 2.2. (#1738)



> Upgrade hamcrest to version 2.2
> ---
>
> Key: LUCENE-9454
> URL: https://issues.apache.org/jira/browse/LUCENE-9454
> Project: Lucene - Core
>  Issue Type: Task
>Affects Versions: master (9.0)
>Reporter: Dawid Weiss
>Assignee: Dawid Weiss
>Priority: Trivial
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] dweiss merged pull request #1738: LUCENE-9454: upgrade hamcrest to version 2.2.

2020-08-11 Thread GitBox


dweiss merged pull request #1738:
URL: https://github.com/apache/lucene-solr/pull/1738


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Comment Edited] (SOLR-14588) Circuit Breakers Infrastructure and Real JVM Based Circuit Breaker

2020-08-11 Thread Noble Paul (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17175402#comment-17175402
 ] 

Noble Paul edited comment on SOLR-14588 at 8/11/20, 9:38 AM:
-

The configuration can be as follows
{code:xml}


{code}
Nowhere else in {{solrconfig.xml}} or {{SolrConfig.java}} should we have a 
reference to the circuit breaker.


was (Author: noble.paul):
The configuration can be as follows
{code:xml}


{code}
Nowhere else in {{solrconfig.xml}} or {{SolrConfig.java}} should we have a 
reference to the circuit breaker.

> Circuit Breakers Infrastructure and Real JVM Based Circuit Breaker
> --
>
> Key: SOLR-14588
> URL: https://issues.apache.org/jira/browse/SOLR-14588
> Project: Solr
>  Issue Type: Improvement
>Reporter: Atri Sharma
>Assignee: Atri Sharma
>Priority: Blocker
> Fix For: master (9.0), 8.7
>
>  Time Spent: 13h 50m
>  Remaining Estimate: 0h
>
> This Jira tracks addition of circuit breakers in the search path and 
> implements JVM based circuit breaker which rejects incoming search requests 
> if the JVM heap usage exceeds a defined percentage.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org


