[jira] [Resolved] (LUCENE-9814) fix extremely slow 7.0 backwards tests in master

2021-02-26 Thread Robert Muir (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-9814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir resolved LUCENE-9814.
-
Fix Version/s: master (9.0)
   Resolution: Fixed

> fix extremely slow 7.0 backwards tests in master
> 
>
> Key: LUCENE-9814
> URL: https://issues.apache.org/jira/browse/LUCENE-9814
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Robert Muir
> Priority: Major
> Fix For: master (9.0)
>
> Attachments: LUCENE-9814.patch
>
>
> The 7.0 backwards tests added to master must have come from an older branch 
> before they were fixed: they've added minutes to my test times.
> These tests have already been fixed in master, so that the crazy corner-case 
> stress tests are only running slowly in jenkins and we don't have 15-30s long 
> tests locally.
> Re-applying same fixes to 7.0 tests removes minutes from my test times.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (LUCENE-9814) fix extremely slow 7.0 backwards tests in master

2021-02-26 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17292004#comment-17292004
 ] 

ASF subversion and git services commented on LUCENE-9814:
-

Commit 373e1d6c83f5d3e24c9c979acb41a36182f07097 in lucene-solr's branch 
refs/heads/master from Robert Muir
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=373e1d6 ]

LUCENE-9814: fix extremely slow 7.0 backwards tests in master

The 7.0 backwards tests added to master must have come from an older
branch before they were fixed: they've added minutes to my test times.

These tests have already been fixed in master, so that the crazy
corner-case stress tests are only running slowly in jenkins and we don't
have 15-30s long tests locally.

Re-applying same fixes to 7.0 tests removes minutes from my test times.
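
The patch itself is not included in this message. As a rough illustration of the general pattern the commit message describes (heavy stress iterations only on CI, a small count locally), here is a minimal sketch. The environment variable `TESTS_NIGHTLY`, the helper names, and the iteration counts are made up for illustration and are not taken from the Lucene test framework or from the patch.

```python
import os
import random


def is_nightly(env=None):
    """Return True when the hypothetical TESTS_NIGHTLY variable is truthy."""
    env = os.environ if env is None else env
    return env.get("TESTS_NIGHTLY", "false").lower() in ("1", "true", "yes")


def stress_iterations(nightly, local=50, full=5000):
    """Pick a small iteration count locally and the full count on nightly/CI runs."""
    return full if nightly else local


def run_corner_case_stress(nightly, seed=0):
    """Toy stress loop; the per-iteration check stands in for a real corner-case test."""
    rng = random.Random(seed)
    checked = 0
    for _ in range(stress_iterations(nightly)):
        value = rng.randint(0, 1 << 20)
        assert value == int(str(value))  # placeholder invariant, not a real Lucene check
        checked += 1
    return checked


if __name__ == "__main__":
    # Local runs stay fast; a CI job would export TESTS_NIGHTLY=true to get the full run.
    print(run_corner_case_stress(is_nightly()))
```

The same test body runs in both modes, so coverage of the corner cases is kept on CI while local runs stay short.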





[jira] [Commented] (SOLR-15195) Onboard committers who are not on PMC

2021-02-26 Thread Jira


[ 
https://issues.apache.org/jira/browse/SOLR-15195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17291983#comment-17291983
 ] 

Jan Høydahl commented on SOLR-15195:


Actually there is currently no code in solr repos 😉 Yet, this will be completed 
soon. 

> Onboard committers who are not on PMC
> -
>
> Key: SOLR-15195
> URL: https://issues.apache.org/jira/browse/SOLR-15195
> Project: Solr
> Issue Type: Sub-task
> Reporter: Jan Høydahl
> Assignee: Jan Høydahl
> Priority: Major
>
> TLP was started with the PMC only. Now we need to give commit bit to the 
> existing Lucene committers, as decided by VOTE when establishing the project.






[GitHub] [lucene-solr] janhoy commented on pull request #2430: SOLR-15194: relax requirements and allow http urls.

2021-02-26 Thread GitBox


janhoy commented on pull request #2430:
URL: https://github.com/apache/lucene-solr/pull/2430#issuecomment-786956265


   Could you bring the PR up to date with master? Lots of unrelated stuff in 
the diff now.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org






[jira] [Created] (SOLR-15202) Rule-Based Authorization Plugin not honoring "collection" permission parameter

2021-02-26 Thread Ken Liccardo (Jira)
Ken Liccardo created SOLR-15202:
---

 Summary: Rule-Based Authorization Plugin not honoring "collection" 
permission parameter
 Key: SOLR-15202
 URL: https://issues.apache.org/jira/browse/SOLR-15202
 Project: Solr
  Issue Type: Bug
  Security Level: Public (Default Security Level. Issues are Public)
  Components: Authorization
Affects Versions: 8.8.1
 Environment: Debian Buster, openjdk 11, Solr 8.8.1 stand-alone, 
installed as a service
Reporter: Ken Liccardo


It appears the "collection" parameter of authorization.permissions in 
security.json is not honored.  That is, a request made to a collection endpoint 
by an unauthorized user(role) is allowed.  For example, consider the following 
permissions entry in authorization section of security.json:

{{"permissions":[\{"name":"p1","collection":"col1","path":"/select","role":"col1-query"}]}}

A user who is NOT assigned role "col1-query" may still query this collection at 
the following endpoint:

{{[/solr/col1/select?q=id%3A*|http://myserver/solr/col1/select?q=id%3A*]}}

However, if the "collection" parameter is removed from the permissions as 
follows:

{{"permissions":[\{"name":"p1","path":"/select","role":"col1-query"}]}}

then a user who is NOT assigned role "col1-query" is rightfully blocked from 
the endpoint with error 403.

In other words, the "collection" parameter, when present in security.json 
authorization.permissions section, is not being matched against the request, 
and therefore the restriction represented by this permissions entry is not 
enacted.

 

After further investigation by turning on debug logging for the 
RuleBasedAuthorizationPlugin and RuleBasedAuthorizationPluginBase, the 
authorization request is logged as follows:

{{o.a.s.s.RuleBasedAuthorizationPluginBase Attempting to authorize request to 
[/select] of type: [READ], associated with collections[[]]}}

So, even though the request was made to collection "col1", for some reason 
this information is not being passed to the plugin, as represented by the empty 
collections array in the log message "... associated with collections[[]]".  
In the Java code, RuleBasedAuthorizationPluginBase.java, this information 
appears to come from context.getCollectionRequests(), which appears to be 
returning an empty array [ ] instead of, I suppose, ["col1"] that one might 
expect from the request /solr/col1/select.

Whether this is a problem in solr.RuleBasedAuthorizationPlugin, or in whatever 
module passes the context object to the Plugin, I do not know at this point.  
But whatever the case, it renders impotent the potentially highly useful 
"collection" parameter that would allow us to restrict access by collection 
name.
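
To make the suspected failure mode concrete, here is a toy model of rule matching. It is not Solr's code: `Permission` and `is_allowed` are hypothetical names, and the sketch assumes two things that are not confirmed by the source: that a permission carrying a "collection" applies only when the request's collection list contains that name, and that a request matched by no permission is allowed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Permission:
    name: str
    path: str
    role: str
    collection: Optional[str] = None  # None: applies regardless of collection


def is_allowed(permissions, request_path, request_collections, user_roles):
    """Toy evaluation: the first permission matching path and collection decides.

    If no permission matches, the request is allowed (assumption of this sketch).
    """
    for perm in permissions:
        if perm.path != request_path:
            continue
        if perm.collection is not None and perm.collection not in request_collections:
            continue  # scoped to a collection the request context does not name
        return perm.role in user_roles
    return True


scoped = [Permission("p1", "/select", "col1-query", collection="col1")]
unscoped = [Permission("p1", "/select", "col1-query")]
no_roles = set()  # a user who is NOT assigned "col1-query"

# Context as logged in this report: an empty collections list.
print(is_allowed(scoped, "/select", [], no_roles))        # True: rule skipped
print(is_allowed(unscoped, "/select", [], no_roles))      # False: blocked (403)
# Context one would expect for /solr/col1/select.
print(is_allowed(scoped, "/select", ["col1"], no_roles))  # False: blocked
```

Under these assumptions, an empty collections list is enough to reproduce both observations above: the scoped rule never matches, while the unscoped rule still blocks the user.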

 

 

 






[GitHub] [lucene-solr-operator] thelabdude merged pull request #221: Work with basic-auth enabled SolrCloud clusters

2021-02-26 Thread GitBox


thelabdude merged pull request #221:
URL: https://github.com/apache/lucene-solr-operator/pull/221


   






[GitHub] [lucene-solr-operator] thelabdude closed issue #218: Enhancements for working with basic auth enabled Solr clusters

2021-02-26 Thread GitBox


thelabdude closed issue #218:
URL: https://github.com/apache/lucene-solr-operator/issues/218


   






[jira] [Commented] (SOLR-15080) Apache Zeppelin Sandbox Integration

2021-02-26 Thread Jason Gerlowski (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-15080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17291882#comment-17291882
 ] 

Jason Gerlowski commented on SOLR-15080:


Oh really?  I'm pleasantly surprised and that's def a nice tip to pick up. I 
didn't realize that'd obey the {{-c}} option.

That said, I think my general point about the examples coupling data and 
topology/config together still holds, even if my specific example doesn't :P.  
It's great you can run the techproducts example in SolrCloud, but the 
flexibility is ultimately limited.  I may put my foot in my mouth again here, 
but afaik there's no way for techproducts dataset/example to use > 1 Solr node?

That's my ultimate point here.  If we add nyc311 as a new dataset, I'd love to 
see that happen in a way that's (1) agnostic of the ZeppelinTool specifically, 
and (2) in a way that lets you load it into any shape and size of cluster.

And sure, I'll switch to PR on the next iteration.  I'm not sure why I chose a 
patch initially, and I only stuck w/ it today in an attempt at consistency.

> Apache Zeppelin Sandbox Integration  
> -
>
> Key: SOLR-15080
> URL: https://issues.apache.org/jira/browse/SOLR-15080
> Project: Solr
> Issue Type: New Feature
> Reporter: Jason Gerlowski
> Assignee: Jason Gerlowski
> Priority: Major
> Attachments: SOLR-15080.patch, SOLR-15080.patch
>
>
> With the steady expansion of Solr's "Math Expression" and "Streaming 
> Expression" libraries, Solr has a lot of analytics and data exploration 
> capabilities to show off in a "notebook" environment.  Case in point - the 
> "Visual Guide to Math Expressions" being worked on in SOLR-13105.  These docs 
> make heavy use of screenshots taken from Zeppelin, a popular notebook project 
> run by the ASF.  Interested readers are going to want to try their own hand 
> at replicating the specific visualizations shown off in those docs, and in 
> using Solr's analytics capabilities more broadly.
> Zeppelin isn't hard to set up and run, but there are a few steps that might 
> deter or thwart unfamiliar users.  I'd love to see Solr make this easier by 
> offering some sort of integration point with Zeppelin to get users up and 
> running.
> I'm still up in the air on what form would be best for such an integration.  
> But as a strawman I've attached a patch that creates a "zeppelin" tool for 
> "bin/solr".
> This tool is in the same spirit as our Solr "examples" in that it sets a user 
> up to play with a particular use case without any fuss or configuration on 
> their part.  It will install Zeppelin, the Zeppelin "interpreter" needed to 
> talk to Solr, and the Zeppelin configs necessary to talk to a local Solr.  It 
> contains other commands to start/stop Zeppelin and clean out the Zeppelin 
> sandbox, but draws the line there in terms of exposing Zeppelin functionality 
> more broadly.






[GitHub] [lucene-solr-operator] thelabdude commented on a change in pull request #221: Work with basic-auth enabled SolrCloud clusters

2021-02-26 Thread GitBox


thelabdude commented on a change in pull request #221:
URL: 
https://github.com/apache/lucene-solr-operator/pull/221#discussion_r583859256



##
File path: docs/solr-cloud/solr-cloud-crd.md
##
@@ -543,3 +543,234 @@ spec:
 ```
 The example settings above will result in your Solr pods getting names like: 
`-search-solrcloud-0.k8s.solr.cloud` 
 which you can request TLS certificates from LetsEncrypt assuming you own the 
`k8s.solr.cloud` domain.
+
+## Solr Security: Authentication and Authorization
+
+All well-configured Solr clusters should enforce users to authenticate, even 
for read-only operations. Even if you want
+to allow anonymous query requests from unknown users, you should make this 
explicit using Solr's rule-based authorization
+plugin. In other words, always enforce security and then relax constraints as 
needed for specific endpoints based on your
+use case. The Solr operator can bootstrap a default security configuration for 
your SolrCloud during initialization. As such,
+there is no reason to deploy an unsecured SolrCloud cluster when using the 
Solr operator. In most cases, you'll want to combine
+basic authentication with TLS to ensure credentials are never passed in clear 
text.
+
+For background on Solr security, please refer to the [Reference 
Guide](https://lucene.apache.org/solr/guide) for your version of Solr.
+
+Basic authentication is the only authentication scheme supported by the Solr 
operator at this time. In general, you have 
+two basic options for configuring basic authentication with the Solr operator:
+1. Let the Solr operator bootstrap the `security.json` to configure *basic 
authentication* for Solr.
+2. Supply your own `security.json` to Solr, which must define a user account 
that the operator can use to make API requests to secured Solr pods.
+
+If you choose option 2, then you need to provide the credentials to the Solr 
operator using a Kubernetes [Basic Authentication 
Secret](https://kubernetes.io/docs/concepts/configuration/secret/#basic-authentication-secret).
+With option 1, the operator creates the Basic Authentication Secret for you.
+
+### Option 1: Bootstrap Security
+
+The easiest way to get started with Solr security is to have the operator 
bootstrap a `security.json` (stored in ZK) as part of the initial deployment 
process.
+To activate this feature, add the following configuration to your SolrCloud 
CRD definition YAML:
+```
+spec:
+  ...
+  solrSecurity:
+    authenticationType: Basic
+```
+
+Once the cluster is up, you'll need the `admin` user password to login to the 
Solr Admin UI.
+The `admin` user will have a random password generated by the operator during 
`security.json` bootstrapping.
+Use the following command to retrieve the password from the bootstrap secret 
created by the operator:
+```
+kubectl get secret -solrcloud-security-bootstrap -o 
jsonpath='{.data.admin}' | base64 --decode
+```
+_where `` is the name of your SolrCloud_
+
+Once `security.json` is bootstrapped, the operator will not update it! You're 
expected to use the `admin` user to access the Security API to make further 
changes.
+In addition to the `admin` user, the operator defines a `solr` user, which has 
basic read access to Solr resources. You can retrieve the `solr` user password 
using:
+```
+kubectl get secret -solrcloud-security-bootstrap -o 
jsonpath='{.data.solr}' | base64 --decode
+```
+
+The operator makes requests to secured Solr endpoints as the `k8s-oper` user; 
credentials for the `k8s-oper` user are stored in a separate secret of type 
`kubernetes.io/basic-auth`
+with name `-solrcloud-basic-auth`. The `k8s-oper` user is configured 
with read-only access to a minimal set of endpoints, see details in the 
**Authorization** sub-section below.
+Remember, if you change the `k8s-oper` password using the Solr security API, 
then you **must** update the secret with the new password or the operator will 
be locked out.
+Also, changing the password for the `k8s-oper` user in the K8s secret after 
bootstrapping will not update Solr! You're responsible for changing the 
password in both places.
+
+#### Liveness and Readiness Probes
+
+We recommend configuring Solr to allow un-authenticated access over HTTP to 
the probe endpoint(s) and the bootstrapped `security.json` does this for you 
automatically (see next sub-section). 
+However, if you want to secure the probe endpoints, then you need to set 
`probesRequireAuth: true` as shown below:
+```
+spec:
+  ...
+  solrSecurity:
+    authenticationType: Basic
+    probesRequireAuth: true
+```
+When `probesRequireAuth` is set to `true`, the liveness and readiness probes 
execute a command instead of using HTTP. 
+The operator configures a command instead of setting the `Authorization` 
header for the HTTP probes, as that would require a restart of all pods if the 
password changes. 
+With a command, we can load the username and password from a secret; 
Kubernetes will 
+[update the mounted secret 
files

[jira] [Commented] (SOLR-15080) Apache Zeppelin Sandbox Integration

2021-02-26 Thread David Eric Pugh (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-15080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17291863#comment-17291863
 ] 

David Eric Pugh commented on SOLR-15080:


I run {{bin/solr -c -e techproducts}} all the time, so I'm not sure about the 
data set and the topology.   Does it make sense to open up a PR versus a patch 
file for this?




[jira] [Commented] (SOLR-15195) Onboard committers who are not on PMC

2021-02-26 Thread Anshum Gupta (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-15195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17291864#comment-17291864
 ] 

Anshum Gupta commented on SOLR-15195:
-

[~janhoy] - guess we should do this sooner so the committers have access to the 
code, but also to other spaces e.g. Confluence ? 




[GitHub] [lucene-solr] epugh commented on pull request #2430: SOLR-15194: relax requirements and allow http urls.

2021-02-26 Thread GitBox


epugh commented on pull request #2430:
URL: https://github.com/apache/lucene-solr/pull/2430#issuecomment-786817645


   Okay, I've spent quite a few hours attempting to a) Mock up the call to open 
the URL, or b) Introduce a builder pattern that I could then mock out, and had 
no joy.   Refactoring this to mock up the call is out of my depth right now.   
   
   I'd love another LGTM and then I'll merge and backport.






[jira] [Comment Edited] (SOLR-15080) Apache Zeppelin Sandbox Integration

2021-02-26 Thread Jason Gerlowski (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-15080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17291849#comment-17291849
 ] 

Jason Gerlowski edited comment on SOLR-15080 at 2/26/21, 6:08 PM:
--

I've attached an updated version of this patch.  This version massages the CLI 
syntax to more closely fit other {{bin/solr}} commands.  It also cleans up some 
issues around starting/stopping Zeppelin and puts better help text in place 
(which in turn dragged in a few small SolrCLI refactors).  

bq. We drag in some extra interpreters like kotlin and influxdb, in a "perfect 
world" we wouldn't worry about them.

I took a look at this.  Zeppelin offers two downloads - one that includes *all* 
interpreters, and one that only includes a minimal set.  I assumed I'd 
accidentally used the former instead of the latter, but it turns out that the 
patch *does* use the minimal download (it's just not all that minimal).  I'm 
going to open a Zeppelin ticket to discuss making the minimal distribution 
moreso, but we're stuck for the current Zeppelin release at least. 

Still definitely on my list is testing on Windows and a fix for the 
{{update_interpreter}} subcommand.  If I can clear those away soon I'll be 
looking to merge in the next week or so, so I'd love any testing help that 
people could offer on their own systems.  Eric, I think I addressed most of 
your feedback (other than creating additional zeppelin-solr commands, which we 
can handle independently of the integration here), but if I missed something or 
you've got more suggestions let me know!



To get back to the question around making the nyc311 dataset available.  I def 
agree that we should allow that, but I'm unsure about the approach so I'd 
rather tackle it in a separate ticket.

I think I mentioned earlier potentially exposing this using bin/solr's {{-e 
example}} mechanism, but on second thought I'm less sure of this approach.  
Currently, Solr "examples" couple together the node/core topology with the 
dataset.  e.g.  {{-e techproducts}} can only be used with Solr standalone.  
Which is less than ideal.  Ideally you could run something like {{bin/solr 
example}} to set up a particular topology or deployment config, and then have a 
command like {{bin/solr exampledata}} capable of loading datasets into any of 
the example topologies.

Anyway, I'm going to punt on this for now to avoid any sort of rush on sorting 
that out.


was (Author: gerlowskija):
I've attached an updated version of this patch.  This version massages the CLI 
syntax to more closely fit other {{bin/solr}} commands.  It also cleans up some 
issues around starting/stopping Zeppelin and puts better help text in place 
(which in turn dragged in a few small SolrCLI refactors).  

bq. We drag in some extra interpreters like kotlin and influxdb, in a "perfect 
world" we wouldn't worry about them.

I took a look at this.  Zeppelin offers two downloads - one that includes *all* 
interpreters, and one that only includes a minimal set.  I assumed I'd 
accidentally used the former instead of the latter, but it turns out that the 
patch *does* use the minimal download (it's just not all that minimal).  I'm 
going to open a Zeppelin ticket to discuss making the minimal distribution 
moreso, but we're stuck for the current Zeppelin release at least. 

Still definitely on my list is testing on Windows and a fix for the 
{{update_interpreter}} subcommand.  If I can clear those away soon I'll be 
looking to merge in the next week or so, so I'd love any testing help that 
people could offer on their own systems.



To get back to the question around making the nyc311 dataset available.  I def 
agree that we should allow that, but I'm unsure about the approach so I'd 
rather tackle it in a separate ticket.

I think I mentioned earlier potentially exposing this using bin/solr's {{-e 
example}} mechanism, but on second thought I'm less sure of this approach.  
Currently, Solr "examples" couple together the node/core topology with the 
dataset.  e.g.  {{-e techproducts}} can only be used with Solr standalone.  
Which is less than ideal.  Ideally you could run something like {{bin/solr 
example}} to set up a particular topology or deployment config, and then have a 
command like {{bin/solr exampledata}} capable of loading datasets into any of 
the example topologies.

Anyway, I'm going to punt on this for now to avoid any sort of rush on sorting 
that out.


[jira] [Commented] (SOLR-15080) Apache Zeppelin Sandbox Integration

2021-02-26 Thread Jason Gerlowski (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-15080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17291849#comment-17291849
 ] 

Jason Gerlowski commented on SOLR-15080:


I've attached an updated version of this patch.  This version massages the CLI 
syntax to more closely fit other {{bin/solr}} commands.  It also cleans up some 
issues around starting/stopping Zeppelin and puts better help text in place 
(which in turn dragged in a few small SolrCLI refactors).  

bq. We drag in some extra interpreters like kotlin and influxdb, in a "perfect 
world" we wouldn't worry about them.

I took a look at this.  Zeppelin offers two downloads - one that includes *all* 
interpreters, and one that only includes a minimal set.  I assumed I'd 
accidentally used the former instead of the latter, but it turns out that the 
patch *does* use the minimal download (it's just not all that minimal).  I'm 
going to open a Zeppelin ticket to discuss making the minimal distribution 
moreso, but we're stuck for the current Zeppelin release at least. 

Still definitely on my list is testing on Windows and a fix for the 
{{update_interpreter}} subcommand.  If I can clear those away soon I'll be 
looking to merge in the next week or so, so I'd love any testing help that 
people could offer on their own systems.



To get back to the question around making the nyc311 dataset available.  I def 
agree that we should allow that, but I'm unsure about the approach so I'd 
rather tackle it in a separate ticket.

I think I mentioned earlier potentially exposing this using bin/solr's {{-e 
example}} mechanism, but on second thought I'm less sure of this approach.  
Currently, Solr "examples" couple together the node/core topology with the 
dataset.  e.g.  {{-e techproducts}} can only be used with Solr standalone.  
Which is less than ideal.  Ideally you could run something like {{bin/solr 
example}} to set up a particular topology or deployment config, and then have a 
command like {{bin/solr exampledata}} capable of loading datasets into any of 
the example topologies.

Anyway, I'm going to punt on this for now to avoid any sort of rush on sorting 
that out.




[jira] [Updated] (SOLR-15080) Apache Zeppelin Sandbox Integration

2021-02-26 Thread Jason Gerlowski (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-15080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Gerlowski updated SOLR-15080:
---
Attachment: SOLR-15080.patch
Status: Open  (was: Open)

> Apache Zeppelin Sandbox Integration  
> -
>
> Key: SOLR-15080
> URL: https://issues.apache.org/jira/browse/SOLR-15080
> Project: Solr
>  Issue Type: New Feature
>Reporter: Jason Gerlowski
>Assignee: Jason Gerlowski
>Priority: Major
> Attachments: SOLR-15080.patch, SOLR-15080.patch
>
>
> With the steady expansion of Solr's "Math Expression" and "Streaming 
> Expression" libraries, Solr has a lot of analytics and data exploration 
> capabilities to show off in a "notebook" environment.  Case in point - the 
> "Visual Guide to Math Expressions" being worked on in SOLR-13105.  These docs 
> make heavy use of screenshots taken from Zeppelin, a popular notebook project 
> run by the ASF.  Interested readers are going to want to try their own hand 
> at replicating the specific visualizations shown off in those docs, and in 
> using Solr's analytics capabilities more broadly.
> Zeppelin isn't hard to set up and run, but there are a few steps that might 
> deter or thwart unfamiliar users.  I'd love to see Solr make this easier by 
> offering some sort of integration point with Zeppelin to get users up and 
> running.
> I'm still up in the air on what form would be best for such an integration.  
> But as a strawman I've attached a patch that creates a "zeppelin" tool for 
> "bin/solr".
> This tool is in the same spirit as our Solr "examples" in that it sets a user 
> up to play with a particular use case without any fuss or configuration on 
> their part.  It will install Zeppelin, the Zeppelin "interpreter" needed to 
> talk to Solr, and the Zeppelin configs necessary to talk to a local Solr.  It 
> contains other commands to start/stop Zeppelin and clean out the Zeppelin 
> sandbox, but draws the line there in terms of exposing Zeppelin functionality 
> more broadly.






[jira] [Commented] (LUCENE-8626) standardise test class naming

2021-02-26 Thread Christine Poerschke (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-8626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17291821#comment-17291821
 ] 

Christine Poerschke commented on LUCENE-8626:
-

I don't know if we've already got naming convention enforcement in place for 
the "Lucene" tests (now that they are wonderfully standardized) but in case not 
then progressing [~dweiss]'s 
[https://github.com/apache/lucene-solr/pull/1743/files] towards merging could 
be a next step here -- what do people think?

I just pushed commits to bring the PR up to date with master and removed the 
lucene entries in the exception list _but have not yet run all the tests_ i.e. 
possibly some more exceptions need to be added for the "Solr" tests still.

> standardise test class naming
> -
>
> Key: LUCENE-8626
> URL: https://issues.apache.org/jira/browse/LUCENE-8626
> Project: Lucene - Core
>  Issue Type: Test
>Reporter: Christine Poerschke
>Priority: Major
> Attachments: SOLR-12939.01.patch, SOLR-12939.02.patch, 
> SOLR-12939.03.patch, SOLR-12939_hoss_validation_groovy_experiment.patch
>
>  Time Spent: 4h
>  Remaining Estimate: 0h
>
> This was mentioned and proposed on the dev mailing list. Starting this ticket 
> here to start to make it happen?
> History: This ticket was created as 
> https://issues.apache.org/jira/browse/SOLR-12939 ticket and then got 
> JIRA-moved to become https://issues.apache.org/jira/browse/LUCENE-8626 ticket.






[jira] [Commented] (SOLR-15146) Distribute Collection API command execution

2021-02-26 Thread Ilan Ginzburg (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-15146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17291724#comment-17291724
 ] 

Ilan Ginzburg commented on SOLR-15146:
--

I have updated the "removing Overseer" doc to describe how I intend to approach 
this ticket.

Please see 
[https://docs.google.com/document/d/1u4QHsIHuIxlglIW6hekYlXGNOP0HjLGVX5N6inkj6Ok/edit#heading=h.gky9ejm6sfdp]
 (new standalone section not requiring reading the rest of the doc).

Feedback most welcome.

> Distribute Collection API command execution
> ---
>
> Key: SOLR-15146
> URL: https://issues.apache.org/jira/browse/SOLR-15146
> Project: Solr
>  Issue Type: Sub-task
>  Components: SolrCloud
>Affects Versions: master (9.0)
>Reporter: Ilan Ginzburg
>Assignee: Ilan Ginzburg
>Priority: Major
>  Labels: collection-api, overseer
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Building on the distributed cluster state update changes (SOLR-14928), this 
> ticket will distribute the Collection API so that commands can execute on any 
> node (i.e. the node handling the request through {{CollectionsHandler}}) 
> without having to go through a Zookeeper queue and the Overseer.
>  This is the second step (the first was SOLR-14928) after which the Overseer 
> could be removed (the code keeps the existing execution options, so completing 
> this ticket does not mean the Overseer is gone, but it could be removed in a 
> future release).
> -There is a dependency on the distributed cluster state changes because the 
> Overseer locking protecting same collection (or same shard) Collection API 
> commands from executing concurrently will be replaced by optimistic locking 
> of the collection {{state.json}} znodes (or other znodes that will eventually 
> replace/augment {{state.json}}).-
> The goal of this ticket is threefold:
>  * Simplify the code (running synchronously and not going through the 
> Zookeeper queues and the Overseer dequeue logic is much simpler),
>  * Lead to improved performance for most/all use cases (although this is a 
> secondary goal, as long as performance is not degraded) and
>  * Allow a future change (in another future Jira) to the way cluster state is 
> cached on the nodes of the cluster (keep less information, be less dependent 
> on Zookeeper watches, do not care about collections not present on the node). 
> This future work will aim to significantly increase the scale (amount of 
> collections) supported by SolrCloud.






[jira] [Updated] (SOLR-15201) Improve the streaming expression joins

2021-02-26 Thread Joel Bernstein (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-15201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-15201:
--
Description: 
The Solr 9.0 /export handler has faster sorting (SOLR-15064). This improved 
export performance will translate to faster joins. It's time now to revisit the 
Streaming Expression join package and the *parallel* function to improve the 
performance of distributed joins. The HashQParserPlugin is key to this and 
there is an existing ticket to improve this as well (SOLR-15185).

It's also time to turn the *fetch* expression into a fully functioning nested 
loop join.

  was:
The Solr 9.0 /export handler has faster sorting (SOLR-15064). This improved 
export performance will translate to faster joins. It's time now to revisit the 
Streaming Expression join package and the *parallel* function to improve the 
performance of distributed joins. The HashQParserPlugin is key to this and 
there is an existing ticket to improve this (SOLR-15185).

It's also time to turn the *fetch* expression into a fully functioning nested 
loop join.


> Improve the streaming expression joins
> --
>
> Key: SOLR-15201
> URL: https://issues.apache.org/jira/browse/SOLR-15201
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: streaming expressions
>Affects Versions: master (9.0)
>Reporter: Joel Bernstein
>Priority: Major
>
> The Solr 9.0 /export handler has faster sorting (SOLR-15064). This improved 
> export performance will translate to faster joins. It's time now to revisit 
> the Streaming Expression join package and the *parallel* function to improve 
> the performance of distributed joins. The HashQParserPlugin is key to this 
> and there is an existing ticket to improve this as well (SOLR-15185).
> It's also time to turn the *fetch* expression into a fully functioning nested 
> loop join.
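
For readers unfamiliar with the term, a nested loop join can be sketched as below. This is a textbook in-memory version, not Solr's *fetch* implementation; the class name, method name and the tuple-as-`Map` representation are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: a textbook nested loop join over in-memory tuples.
// This is not Solr's fetch implementation; all names here are hypothetical.
final class NestedLoopJoinSketch {
  private NestedLoopJoinSketch() {}

  /** Joins every outer tuple with every inner tuple that has the same value for {@code key}. */
  static List<Map<String, Object>> join(
      List<Map<String, Object>> outer, List<Map<String, Object>> inner, String key) {
    List<Map<String, Object>> joined = new ArrayList<>();
    for (Map<String, Object> left : outer) {
      Object leftKey = left.get(key);
      if (leftKey == null) {
        continue; // tuples without the join key cannot match
      }
      for (Map<String, Object> right : inner) {
        if (leftKey.equals(right.get(key))) {
          Map<String, Object> merged = new HashMap<>(right);
          merged.putAll(left); // on a field-name clash the outer tuple wins
          joined.add(merged);
        }
      }
    }
    return joined;
  }
}
```

The inner loop here scans a list; an index-backed variant would replace that scan with a lookup by key, which is presumably where a search index helps.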






[jira] [Updated] (SOLR-15201) Improve the streaming expression joins

2021-02-26 Thread Joel Bernstein (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-15201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-15201:
--
Description: 
The Solr 9.0 /export handler has faster sorting (SOLR-15064). This improved 
export performance will translate to faster joins. It's time now to revisit the 
Streaming Expression join package and the *parallel* function to improve the 
performance of distributed joins. The HashQParserPlugin is key to this and 
there is an existing ticket to improve this (SOLR-15185).

It's also time to turn the *fetch* expression into a fully functioning nested 
loop join.

  was:
The Solr 9.0 /export handler has faster sorting (SOLR-15064). It's time now to 
revisit the Streaming Expression join package and the *parallel* function to 
improve the performance of distributed joins. The HashQParserPlugin is key to 
this and there is an existing ticket to improve this (SOLR-15185).

It's also time to turn the *fetch* expression into a fully functioning nested 
loop join.


> Improve the streaming expression joins
> --
>
> Key: SOLR-15201
> URL: https://issues.apache.org/jira/browse/SOLR-15201
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: streaming expressions
>Affects Versions: master (9.0)
>Reporter: Joel Bernstein
>Priority: Major
>
> The Solr 9.0 /export handler has faster sorting (SOLR-15064). This improved 
> export performance will translate to faster joins. It's time now to revisit 
> the Streaming Expression join package and the *parallel* function to improve 
> the performance of distributed joins. The HashQParserPlugin is key to this 
> and there is an existing ticket to improve this (SOLR-15185).
> It's also time to turn the *fetch* expression into a fully functioning nested 
> loop join.






[jira] [Updated] (SOLR-15201) Improve the streaming expression joins

2021-02-26 Thread Joel Bernstein (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-15201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-15201:
--
Description: 
The Solr 9.0 /export handler has faster sorting (SOLR-15064). It's time now to 
revisit the Streaming Expression join package and the *parallel* function to 
improve the performance of distributed joins. The HashQParserPlugin is key to 
this and there is an existing ticket to improve this (SOLR-15185).

It's also time to turn the *fetch* expression into a fully functioning nested 
loop join.

  was:The Solr 9.0 /export handler has faster sorting (SOLR-15064). It's time 
now to revisit the Streaming Expression join package and the *parallel* 
function to improve the performance of distributed joins. The HashQParserPlugin 
is key to this and there is an existing ticket to improve this (SOLR-15185).


> Improve the streaming expression joins
> --
>
> Key: SOLR-15201
> URL: https://issues.apache.org/jira/browse/SOLR-15201
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: streaming expressions
>Affects Versions: master (9.0)
>Reporter: Joel Bernstein
>Priority: Major
>
> The Solr 9.0 /export handler has faster sorting (SOLR-15064). It's time now 
> to revisit the Streaming Expression join package and the *parallel* function 
> to improve the performance of distributed joins. The HashQParserPlugin is key 
> to this and there is an existing ticket to improve this (SOLR-15185).
> It's also time to turn the *fetch* expression into a fully functioning nested 
> loop join.






[jira] [Updated] (SOLR-15201) Improve the streaming expression joins

2021-02-26 Thread Joel Bernstein (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-15201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-15201:
--
Summary: Improve the streaming expression joins  (was: Improve the 
streaming expressions joins)

> Improve the streaming expression joins
> --
>
> Key: SOLR-15201
> URL: https://issues.apache.org/jira/browse/SOLR-15201
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: streaming expressions
>Affects Versions: master (9.0)
>Reporter: Joel Bernstein
>Priority: Major
>
> The Solr 9.0 /export handler has faster sorting (SOLR-15064). It's time now 
> to revisit the Streaming Expression join package and the *parallel* function 
> to improve the performance of distributed joins. The HashQParserPlugin is key 
> to this and there is an existing ticket to improve this (SOLR-15185).






[jira] [Created] (SOLR-15201) Improve the streaming expressions joins

2021-02-26 Thread Joel Bernstein (Jira)
Joel Bernstein created SOLR-15201:
-

 Summary: Improve the streaming expressions joins
 Key: SOLR-15201
 URL: https://issues.apache.org/jira/browse/SOLR-15201
 Project: Solr
  Issue Type: Improvement
  Security Level: Public (Default Security Level. Issues are Public)
  Components: streaming expressions
Affects Versions: master (9.0)
Reporter: Joel Bernstein


The Solr 9.0 /export handler has faster sorting (SOLR-15064). It's time now to 
revisit the Streaming Expression join package and the *parallel* function to 
improve the performance of distributed joins. The HashQParserPlugin is key to 
this and there is an existing ticket to improve this (SOLR-15185).
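
To illustrate why a hash-based parser matters for distributed joins, here is a minimal sketch. It assumes the work is split by hashing the join key modulo the number of workers, so every tuple with a given key lands on the same worker. `String.hashCode()` is only a stand-in for the real hash function, and the class and method names are hypothetical.

```java
// Illustrative only: String.hashCode() stands in for whatever hash the real
// query parser uses; the class and method names are hypothetical.
final class HashPartitionSketch {
  private HashPartitionSketch() {}

  /** Maps a join key to a worker index in the range [0, numWorkers). */
  static int workerFor(String joinKey, int numWorkers) {
    if (numWorkers < 1) {
      throw new IllegalArgumentException("numWorkers must be positive");
    }
    // floorMod keeps the result non-negative even when hashCode() is negative.
    return Math.floorMod(joinKey.hashCode(), numWorkers);
  }
}
```

Because both sides of the join use the same function, matching keys meet on one worker and each worker can join its slice independently.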






[jira] [Commented] (SOLR-15185) Improve "hash" QParser

2021-02-26 Thread Joel Bernstein (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-15185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17291690#comment-17291690
 ] 

Joel Bernstein commented on SOLR-15185:
---

Let's see if we can remove the HashQParserPlugin altogether and use the 
HashRangeQParser instead. I can investigate this.

> Improve "hash" QParser
> --
>
> Key: SOLR-15185
> URL: https://issues.apache.org/jira/browse/SOLR-15185
> Project: Solr
>  Issue Type: Improvement
>Reporter: David Smiley
>Assignee: David Smiley
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> * Don't use Filter (to be removed)
> * Do use TwoPhaseIterator, not PostFilter
> * Don't pre-compute matching docs (wasteful)
> * Support more fields, and more field types
> * Faster hash on Strings (avoid Char conversion)
> * Stronger hash when using multiple fields






[GitHub] [lucene-solr] rmuir commented on a change in pull request #2429: LUCENE-9791 Allow calling BytesRefHash#find concurrently

2021-02-26 Thread GitBox


rmuir commented on a change in pull request #2429:
URL: https://github.com/apache/lucene-solr/pull/2429#discussion_r583677955



##
File path: lucene/core/src/test/org/apache/lucene/util/TestBytesRefHash.java
##
@@ -267,6 +271,71 @@ public void testFind() throws Exception {
 }
   }
 
+  @Test
+  public void testConcurrentAccessToUnmodifiableBytesRefHash() throws 
Exception {
+int num = atLeast(2);
+for (int j = 0; j < num; j++) {
+  int numStrings = 797;
+  List<String> strings = new ArrayList<>(numStrings);
+  for (int i = 0; i < numStrings; i++) {
+final String str = TestUtil.randomRealisticUnicodeString(random(), 1, 
1000);
+hash.add(new BytesRef(str));
+assertTrue(strings.add(str));
+  }
+  int hashSize = hash.size();
+
+  UnmodifiableBytesRefHash unmodifiableHash = new 
UnmodifiableBytesRefHash(hash);
+
+  AtomicInteger notFound = new AtomicInteger();
+  AtomicInteger notEquals = new AtomicInteger();
+  AtomicInteger wrongSize = new AtomicInteger();
+  int numThreads = 10;

Review comment:
   Can we tone this down to use fewer threads (at least locally; it is OK to 
increase for NIGHTLY)? Using many threads causes tests like this to run 
extremely slowly on wimpier machines (e.g. I have 2 cores).
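
One way to act on this, as a minimal sketch: derive the thread count from a nightly flag and the core count, so local runs stay light and nightly runs keep the heavier setting. The helper class, its name and the specific caps are hypothetical and not part of the PR; in the real test the flag would presumably come from the test framework's nightly setting.

```java
// Hypothetical helper, not part of the PR: scales a stress test's thread count.
final class StressThreads {
  private StressThreads() {}

  /**
   * Returns nightlyMax for nightly runs. For local runs it returns a count
   * between 2 and 4 based on the core count, never exceeding nightlyMax.
   */
  static int chooseThreads(boolean nightly, int availableCores, int nightlyMax) {
    if (availableCores < 1 || nightlyMax < 1) {
      throw new IllegalArgumentException("availableCores and nightlyMax must be positive");
    }
    if (nightly) {
      return nightlyMax;
    }
    // At least 2 threads so the test still exercises concurrency, at most 4.
    int local = Math.max(2, Math.min(availableCores, 4));
    return Math.min(local, nightlyMax);
  }
}
```

With the values from the diff this would give 2 threads on a 2-core machine and 10 on a nightly run, e.g. `chooseThreads(nightly, Runtime.getRuntime().availableProcessors(), 10)`.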





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (LUCENE-9814) fix extremely slow 7.0 backwards tests in master

2021-02-26 Thread Michael McCandless (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-9814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17291674#comment-17291674
 ] 

Michael McCandless commented on LUCENE-9814:


+1

> fix extremely slow 7.0 backwards tests in master
> 
>
> Key: LUCENE-9814
> URL: https://issues.apache.org/jira/browse/LUCENE-9814
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Robert Muir
>Priority: Major
> Attachments: LUCENE-9814.patch
>
>
> The 7.0 backwards tests added to master must have come from an older branch 
> before they were fixed: they've added minutes to my test times.
> These tests have already been fixed in master, so that the crazy corner-case 
> stress tests are only running slowly in jenkins and we don't have 15-30s long 
> tests locally.
> Re-applying same fixes to 7.0 tests removes minutes from my test times.






[jira] [Updated] (LUCENE-9814) fix extremely slow 7.0 backwards tests in master

2021-02-26 Thread Robert Muir (Jira)


 [ 
https://issues.apache.org/jira/browse/LUCENE-9814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated LUCENE-9814:

Attachment: LUCENE-9814.patch

> fix extremely slow 7.0 backwards tests in master
> 
>
> Key: LUCENE-9814
> URL: https://issues.apache.org/jira/browse/LUCENE-9814
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Robert Muir
>Priority: Major
> Attachments: LUCENE-9814.patch
>
>
> The 7.0 backwards tests added to master must have come from an older branch 
> before they were fixed: they've added minutes to my test times.
> These tests have already been fixed in master, so that the crazy corner-case 
> stress tests are only running slowly in jenkins and we don't have 15-30s long 
> tests locally.
> Re-applying same fixes to 7.0 tests removes minutes from my test times.






[jira] [Created] (LUCENE-9814) fix extremely slow 7.0 backwards tests in master

2021-02-26 Thread Robert Muir (Jira)
Robert Muir created LUCENE-9814:
---

 Summary: fix extremely slow 7.0 backwards tests in master
 Key: LUCENE-9814
 URL: https://issues.apache.org/jira/browse/LUCENE-9814
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Robert Muir
 Attachments: LUCENE-9814.patch

The 7.0 backwards tests added to master must have come from an older branch 
before they were fixed: they've added minutes to my test times.

These tests have already been fixed in master, so that the crazy corner-case 
stress tests are only running slowly in jenkins and we don't have 15-30s long 
tests locally.

Re-applying same fixes to 7.0 tests removes minutes from my test times.






[GitHub] [lucene-solr] epugh commented on pull request #2430: SOLR-15194: relax requirements and allow http urls.

2021-02-26 Thread GitBox


epugh commented on pull request #2430:
URL: https://github.com/apache/lucene-solr/pull/2430#issuecomment-786645918


   I think you are right, so I'll yank out the WARN (and the related test 
code), and then I think we are ready for merging!






[GitHub] [lucene-solr] janhoy commented on pull request #2430: SOLR-15194: relax requirements and allow http urls.

2021-02-26 Thread GitBox


janhoy commented on pull request #2430:
URL: https://github.com/apache/lucene-solr/pull/2430#issuecomment-786612766


   > Do you think the WARN logic is EVEN useful since we have a parameter now 
that you have to intentionally set. I'm wondering what your thought is on 
removing the warning in the log?
   
   I'd probably skip testing that the logger actually logs. And given how this 
is documented, users should be aware of the risk. The only reason for keeping 
the WARN log would be in cases where settings from dev/test environments sneak 
into production unnoticed. But then again, if the production IDP server is set up 
with HTTPS, it would immediately fail if trying an HTTP URL...
   
   My gut feeling is to keep it secure by default, make it easy to switch to 
insecure for dev, and that's it.
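
   A minimal sketch of that "secure by default, opt in to insecure" shape, for illustration only. The class and method names are hypothetical and this is not the code in the PR; the opt-in flag is passed in as a plain boolean rather than read from a system property.

```java
import java.net.URI;

// Hypothetical sketch of a "secure by default" URL check; not the code from the PR.
final class OutboundUrlCheck {
  private OutboundUrlCheck() {}

  /** Accepts HTTPS always, plain HTTP only when the caller has opted in, and nothing else. */
  static void check(String url, boolean allowOutboundHttp) {
    String scheme = URI.create(url).getScheme();
    if ("https".equalsIgnoreCase(scheme)) {
      return;
    }
    if ("http".equalsIgnoreCase(scheme) && allowOutboundHttp) {
      return; // explicit opt-in, intended for dev and test setups
    }
    throw new IllegalArgumentException("Outbound URL must use HTTPS: " + url);
  }
}
```

   The flag could be populated once at startup, for example with `Boolean.getBoolean("solr.auth.jwt.allowOutboundHttp")`, so the default stays `false` unless the property is set on purpose.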






[GitHub] [lucene-solr] bruno-roustant commented on pull request #2412: LUCENE-9737: Flexible configuration for DocValue compressions

2021-02-26 Thread GitBox


bruno-roustant commented on pull request #2412:
URL: https://github.com/apache/lucene-solr/pull/2412#issuecomment-786606912


   @jaisonbi this PR has already received comments. The preferred way to configure 
compression beyond the speed/compression Mode is to use PerField customization.






[GitHub] [lucene-solr] epugh commented on pull request #2430: SOLR-15194: relax requirements and allow http urls.

2021-02-26 Thread GitBox


epugh commented on pull request #2430:
URL: https://github.com/apache/lucene-solr/pull/2430#issuecomment-786585570


   @janhoy I think we all know the answer to "is it worth all the extra code 
lines" since we both have the same feeling ;-).   Do you think the WARN logic 
is EVEN useful since we have a parameter now that you have to intentionally 
set.   I'm wondering what your thought is on removing the warning in the log?






[GitHub] [lucene-solr] epugh commented on a change in pull request #2430: SOLR-15194: relax requirements and allow http urls.

2021-02-26 Thread GitBox


epugh commented on a change in pull request #2430:
URL: https://github.com/apache/lucene-solr/pull/2430#discussion_r583567245



##
File path: solr/core/src/java/org/apache/solr/security/JWTIssuerConfig.java
##
@@ -68,6 +68,11 @@
   private WellKnownDiscoveryConfig wellKnownDiscoveryConfig;
   private String clientId;
   private String authorizationEndpoint;
+  
+  public static boolean ALLOW_OUTBOUND_HTTP = 
Boolean.parseBoolean(System.getProperty("solr.auth.jwt.allowOutboundHttp", 
"false"));
+  public static final String ALLOW_OUTBOUND_HTTP_ERR_MSG = "Outbound non SSL 
protected JWT authentication urls are not enabled, start your nodes with 
-Dsolr.auth.jwt.allowOutboundHttp=true.";

Review comment:
   I like!








[GitHub] [lucene-solr] epugh commented on a change in pull request #2430: SOLR-15194: relax requirements and allow http urls.

2021-02-26 Thread GitBox


epugh commented on a change in pull request #2430:
URL: https://github.com/apache/lucene-solr/pull/2430#discussion_r583565911



##
File path: solr/core/src/java/org/apache/solr/security/JWTIssuerConfig.java
##
@@ -68,6 +68,11 @@
   private WellKnownDiscoveryConfig wellKnownDiscoveryConfig;
   private String clientId;
   private String authorizationEndpoint;
+  
+  public static boolean ALLOW_OUTBOUND_HTTP = 
Boolean.parseBoolean(System.getProperty("solr.auth.jwt.allowOutboundHttp", 
"false"));
+  public static final String ALLOW_OUTBOUND_HTTP_ERR_MSG = "Outbound non SSL 
protected JWT authentication urls are not enabled, start your nodes with 
-Dsolr.auth.jwt.allowOutboundHttp=true.";
+
+

Review comment:
   "Jan-Lint-Bot" ;-) Where is my Solr Java Lint checker? ;-)








[GitHub] [lucene-solr] pawel-bugalski-dynatrace commented on a change in pull request #2429: LUCENE-9791 Allow calling BytesRefHash#find concurrently

2021-02-26 Thread GitBox


pawel-bugalski-dynatrace commented on a change in pull request #2429:
URL: https://github.com/apache/lucene-solr/pull/2429#discussion_r583564032



##
File path: lucene/core/src/test/org/apache/lucene/util/TestBytesRefHash.java
##
@@ -267,6 +271,71 @@ public void testFind() throws Exception {
 }
   }
 
+  @Test
+  public void testConcurrentAccessToUnmodifiableBytesRefHash() throws 
Exception {
+int num = atLeast(2);
+for (int j = 0; j < num; j++) {
+  int numStrings = 797;
+  List<String> strings = new ArrayList<>(numStrings);
+  for (int i = 0; i < numStrings; i++) {
+final String str = TestUtil.randomRealisticUnicodeString(random(), 1, 
1000);
+hash.add(new BytesRef(str));
+assertTrue(strings.add(str));
+  }
+  int hashSize = hash.size();
+
+  UnmodifiableBytesRefHash unmodifiableHash = hash;
+
+  AtomicInteger notFound = new AtomicInteger();
+  AtomicInteger notEquals = new AtomicInteger();
+  AtomicInteger wrongSize = new AtomicInteger();
+  int numThreads = 10;
+  CountDownLatch latch = new CountDownLatch(numThreads);
+  Thread[] threads = new Thread[numThreads];
+  for (int i = 0; i < threads.length; i++) {
+int loops = atLeast(100);
+threads[i] =
+new Thread(
+() -> {
+  BytesRef scratch = new BytesRef();
+  latch.countDown();
+  try {
+latch.await();
+  } catch (InterruptedException e) {

Review comment:
   Yes. My mistake. Fixed.








[GitHub] [lucene-solr] pawel-bugalski-dynatrace commented on a change in pull request #2429: LUCENE-9791 Allow calling BytesRefHash#find concurrently

2021-02-26 Thread GitBox


pawel-bugalski-dynatrace commented on a change in pull request #2429:
URL: https://github.com/apache/lucene-solr/pull/2429#discussion_r583527169



##
File path: lucene/core/src/java/org/apache/lucene/util/BytesRefHash.java
##
@@ -31,18 +31,21 @@
  * to the id is encapsulated inside {@link BytesRefHash} and is guaranteed to 
be increased for each
  * added {@link BytesRef}.
  *
+ * Note that this implementation is not synchronized. If 
multiple threads access
+ * a {@link BytesRefHash} instance concurrently, and at least one of the 
threads modifies it
+ * structurally, it must be synchronized externally. (A structural 
modification is any
+ * operation on the map except operations explicitly listed in {@link 
UnmodifiableBytesRefHash}
+ * interface).
+ *
  * Note: The maximum capacity {@link BytesRef} instance passed to {@link 
#add(BytesRef)} must not
  * be longer than {@link ByteBlockPool#BYTE_BLOCK_SIZE}-2. The internal 
storage is limited to 2GB
  * total byte storage.
  *
  * @lucene.internal
  */
-public final class BytesRefHash implements Accountable {
+public final class BytesRefHash implements Accountable, 
UnmodifiableBytesRefHash {

Review comment:
   Definitely. I've replaced the interface with a class that wraps 
BytesRefHash. To me it looks much better now, but I wonder what you think 
about it.
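
   The general shape of such a wrapper, as a minimal sketch: a read-only view that exposes only lookup operations of an underlying mutable hash. `MutableHash` and `ReadOnlyHash` are hypothetical stand-ins built on `HashMap`; they do not reproduce the `BytesRefHash` API or the class added in the PR.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in for a mutable hash that assigns increasing ids; not the BytesRefHash API.
final class MutableHash {
  private final Map<String, Integer> ids = new HashMap<>();

  /** Adds the key if absent and returns its id (existing or newly assigned). */
  int add(String key) {
    Integer existing = ids.putIfAbsent(key, ids.size());
    return existing == null ? ids.size() - 1 : existing;
  }

  /** Returns the id of the key, or -1 if it is not present. */
  int find(String key) {
    return ids.getOrDefault(key, -1);
  }

  int size() {
    return ids.size();
  }
}

// Read-only view: it has no mutating methods, so holders of this type cannot modify the hash.
final class ReadOnlyHash {
  private final MutableHash delegate;

  ReadOnlyHash(MutableHash delegate) {
    this.delegate = delegate;
  }

  int find(String key) {
    return delegate.find(key);
  }

  int size() {
    return delegate.size();
  }
}
```

   Note that a wrapper like this only narrows the API. Concurrent reads are safe only if nothing keeps mutating the wrapped instance after it has been handed out.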








[GitHub] [lucene-solr] pawel-bugalski-dynatrace commented on a change in pull request #2429: LUCENE-9791 Allow calling BytesRefHash#find concurrently

2021-02-26 Thread GitBox


pawel-bugalski-dynatrace commented on a change in pull request #2429:
URL: https://github.com/apache/lucene-solr/pull/2429#discussion_r583526175



##
File path: lucene/core/src/java/org/apache/lucene/util/BytesRefHash.java
##
@@ -31,18 +31,21 @@
  * to the id is encapsulated inside {@link BytesRefHash} and is guaranteed to 
be increased for each
  * added {@link BytesRef}.
  *
+ * Note that this implementation is not synchronized. If 
multiple threads access
+ * a {@link BytesRefHash} instance concurrently, and at least one of the 
threads modifies it
+ * structurally, it must be synchronized externally. (A structural 
modification is any
+ * operation on the map except operations explicitly listed in {@link 
UnmodifiableBytesRefHash}

Review comment:
   Fixed.








[GitHub] [lucene-solr] donnerpeter opened a new pull request #2434: LUCENE-9813: Add a convenience constructor IntsRef(int[])

2021-02-26 Thread GitBox


donnerpeter opened a new pull request #2434:
URL: https://github.com/apache/lucene-solr/pull/2434


   
   
   
   # Description
   
   Add a convenience constructor `IntsRef(int[])` to avoid repeatedly passing `0` and `array.length` everywhere.
   
   # Solution
   
   Add the constructor, call it where applicable
   
   # Tests
   
   Nothing dedicated, but some tests now use the new API and still pass :)
   
   # Checklist
   
   Please review the following and check all that apply:
   
   - [x] I have reviewed the guidelines for [How to Contribute](https://wiki.apache.org/solr/HowToContribute) and my code conforms to the standards described there to the best of my ability.
   - [x] I have created a Jira issue and added the issue ID to my pull request title.
   - [x] I have given Solr maintainers [access](https://help.github.com/en/articles/allowing-changes-to-a-pull-request-branch-created-from-a-fork) to contribute to my PR branch. (optional but recommended)
   - [x] I have developed this patch against the `master` branch.
   - [x] I have run `./gradlew check`.
   - [ ] I have added tests for my changes.
   - [ ] I have added documentation for the [Ref Guide](https://github.com/apache/lucene-solr/tree/master/solr/solr-ref-guide) (for Solr changes only).
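
   The idea behind the proposed constructor can be sketched as below. This is a simplified illustration, not the patch itself: `IntSlice` is a hypothetical stand-in that mirrors an (array, offset, length) reference, and the one-argument constructor simply delegates with `0` and `ints.length`.

```java
/** Hypothetical stand-in for an (array, offset, length) slice; not Lucene's IntsRef. */
final class IntSlice {
  final int[] ints;
  final int offset;
  final int length;

  /** Full constructor: the caller states the offset and length explicitly. */
  IntSlice(int[] ints, int offset, int length) {
    if (offset < 0 || length < 0 || offset > ints.length - length) {
      throw new IllegalArgumentException("offset/length out of bounds");
    }
    this.ints = ints;
    this.offset = offset;
    this.length = length;
  }

  /** Convenience constructor: covers the whole array. */
  IntSlice(int[] ints) {
    this(ints, 0, ints.length);
  }
}
```

   With this in place, a call such as `new IntSlice(array, 0, array.length)` shortens to `new IntSlice(array)`.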
   






[jira] [Created] (LUCENE-9813) Add a convenience constructor IntsRef(int[])

2021-02-26 Thread Peter Gromov (Jira)
Peter Gromov created LUCENE-9813:


             Summary: Add a convenience constructor IntsRef(int[])
                 Key: LUCENE-9813
                 URL: https://issues.apache.org/jira/browse/LUCENE-9813
             Project: Lucene - Core
          Issue Type: Improvement
            Reporter: Peter Gromov


Add a convenience constructor IntsRef(int[]) to avoid repeatedly passing 0 and array.length everywhere.






[jira] [Updated] (SOLR-15200) Nightly test run.

2021-02-26 Thread Mark Robert Miller (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-15200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Robert Miller updated SOLR-15200:
--
Description: Make the Nightly test run fly. 

> Nightly test run.
> -
>
> Key: SOLR-15200
> URL: https://issues.apache.org/jira/browse/SOLR-15200
> Project: Solr
>  Issue Type: Sub-task
>Reporter: Mark Robert Miller
>Priority: Major
>
> Make the Nightly test run fly. 






[jira] [Commented] (SOLR-15200) Nightly test run.

2021-02-26 Thread Mark Robert Miller (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-15200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17291455#comment-17291455
 ] 

Mark Robert Miller commented on SOLR-15200:
---

I’ve started enabling the Nightly test run. It’s beautiful. 

> Nightly test run.
> -
>
> Key: SOLR-15200
> URL: https://issues.apache.org/jira/browse/SOLR-15200
> Project: Solr
>  Issue Type: Sub-task
>Reporter: Mark Robert Miller
>Priority: Major
>



