Re: [PR] SOLR-12813: subqueries should respect basic auth [solr]

2024-04-16 Thread via GitHub


rseitz commented on code in PR #2404:
URL: https://github.com/apache/solr/pull/2404#discussion_r1568088624


##
solr/core/src/java/org/apache/solr/client/solrj/embedded/EmbeddedSolrServer.java:
##
@@ -215,8 +215,9 @@ public NamedList<Object> request(SolrRequest<?> request, String coreName)
       if (handler == null) {
         throw new SolrException(SolrException.ErrorCode.BAD_REQUEST, "unknown handler: " + path);
       }
-
-      req = _parser.buildRequestFrom(core, params, getContentStreams(request));
+      req =
+          _parser.buildRequestFrom(

Review Comment:
   We could remove the buildRequestFrom() that doesn't take a Principal; 
outside of tests, it's used in only two places. Where it's used in 
`EmbeddedSolrServer.request(SolrRequest request, String coreName)`, we could 
get the Principal from the SolrRequest. Where it's used in 
`DirectSolrConnection.request(SolrRequestHandler handler, SolrParams params, 
String body)` I think we'd need to pass a null Principal as I'm not seeing 
where to get one from in that context.
   
   About the larger question -- are there lots of other cases where enabling 
basic auth causes errors -- I don't think I know enough to give a good answer. 
I do think that the subquery use case is the most obvious/glaring place for 
such an error to arise because it involves creating and executing a brand new 
query. 
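
   For the EmbeddedSolrServer case, a minimal sketch of what I have in mind -- it assumes the new Principal-accepting buildRequestFrom() overload from this PR plus SolrRequest.getUserPrincipal(), and the parameter order is approximate, not the exact diff:

   ```java
   // Sketch: take the principal off the incoming SolrRequest and hand it to the
   // Principal-aware overload, so the no-Principal variant could then be removed.
   req =
       _parser.buildRequestFrom(
           core, params, getContentStreams(request), request.getUserPrincipal());
   ```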






Re: [PR] SOLR-12813: subqueries should respect basic auth [solr]

2024-04-16 Thread via GitHub


rseitz commented on code in PR #2404:
URL: https://github.com/apache/solr/pull/2404#discussion_r1568077123


##
solr/core/src/test/org/apache/solr/response/transform/TestSubQueryTransformerDistrib.java:
##
@@ -61,22 +75,51 @@ public static void setupCluster() throws Exception {
 
 String configName = "solrCloudCollectionConfig";
 int nodeCount = 5;
-configureCluster(nodeCount).addConfig(configName, configDir).configure();
+
+final String SECURITY_JSON =

Review Comment:
   I'm seeing identical SECURITY_JSON constants declared in CreateToolTest, 
DeleteToolTest, PackageToolTest, PostToolTest, 
DistribDocExpirationUpdateProcessorTest, TestPullReplicaWithAuth, and 
BasicAuthIntegrationTest. I'd be happy to move all those identical ones to a 
common place. Also happy to be told where the common place should be :)
   
   There are some other SECURITY_JSON declarations that differ from the above 
ones: CloudAuthStreamingTest, SQLWithAuthzEnabledTest, and 
TestAuthorizationTest.
   
   The one I've declared here in TestSubQueryTransformerDistrib is basically the 
same as the first group, but I added forwardCredentials=true, which tells the 
basic auth plugin to propagate credentials across internode requests. I thought it 
would be _necessary_ to add this here, because forwardCredentials=true 
was needed in my manual testing. That is to say, in manual testing with 
forwardCredentials=false, subqueries did not work -- PKI authentication did not 
kick in seamlessly as one might have hoped. So that PKI case is an unresolved 
detail re: this fix -- should we _expect_ subqueries to work with basic auth 
enabled but forwardCredentials=false, and if so, how do we make that work? The 
other unresolved detail is that here in TestSubQueryTransformerDistrib, I'm 
finding the test passes _regardless_ of whether forwardCredentials is true, 
false, or not provided, and I'm not yet sure why this differs from what's 
observed in manual testing.
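
   For reference, a shared constant might look roughly like the following sketch. It uses the well-known solr/SolrRocks example credentials from the Ref Guide, so the exact hash, roles, and permissions may differ from what the PR actually declares:

   ```java
   // Sketch of a common SECURITY_JSON constant for tests; forwardCredentials=true asks the
   // BasicAuthPlugin to forward the caller's Basic Auth credentials on internode requests.
   static final String SECURITY_JSON =
       """
       {
         "authentication": {
           "blockUnknown": true,
           "class": "solr.BasicAuthPlugin",
           "forwardCredentials": true,
           "credentials": {"solr": "IV0EHq1OnNrj6gvRCwvFwTrZ1+z1oBbnQdiVC3otuq0= Ndd7LKvVBAaZIF0QAVi1ekCfAJXr1GGfLtRUXhgrF8c="}
         },
         "authorization": {
           "class": "solr.RuleBasedAuthorizationPlugin",
           "user-role": {"solr": "admin"},
           "permissions": [{"name": "all", "role": "admin"}]
         }
       }
       """;
   ```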






Re: [PR] SOLR-17023: Use Modern NLP Models via ONNX and Apache OpenNLP with Solr [solr]

2024-04-16 Thread via GitHub


github-actions[bot] commented on PR #1999:
URL: https://github.com/apache/solr/pull/1999#issuecomment-2060090828

   This PR had no visible activity in the past 60 days, labeling it as stale. 
Any new activity will remove the stale label. To attract more reviewers, please 
tag someone or notify the d...@solr.apache.org mailing list. Thank you for your 
contribution!





Re: [PR] SOLR-12813: subqueries should respect basic auth [solr]

2024-04-16 Thread via GitHub


epugh commented on code in PR #2404:
URL: https://github.com/apache/solr/pull/2404#discussion_r1568013748


##
solr/core/src/java/org/apache/solr/client/solrj/embedded/EmbeddedSolrServer.java:
##
@@ -215,8 +215,9 @@ public NamedList<Object> request(SolrRequest<?> request, String coreName)
       if (handler == null) {
         throw new SolrException(SolrException.ErrorCode.BAD_REQUEST, "unknown handler: " + path);
       }
-
-      req = _parser.buildRequestFrom(core, params, getContentStreams(request));
+      req =
+          _parser.buildRequestFrom(

Review Comment:
   I wonder, should we still have a _parser.buildRequestFrom() that 
doesn't require the user principal?   These days, do we need to make sure that 
every method has a user principal?   Maybe what I am asking is, are there lots 
of other places where, if you enable basic auth, you get errors because 
the original code didn't plan for that?



##
solr/core/src/test/org/apache/solr/response/transform/TestSubQueryTransformerDistrib.java:
##
@@ -61,22 +75,51 @@ public static void setupCluster() throws Exception {
 
 String configName = "solrCloudCollectionConfig";
 int nodeCount = 5;
-configureCluster(nodeCount).addConfig(configName, configDir).configure();
+
+final String SECURITY_JSON =

Review Comment:
   Starting to feel like `SECURITY_JSON` needs to be moved into some sort of 
cross-cutting concern/utility method as we see this basic code proliferate more 
and more ;-).   Don't get me wrong, I think it's great we are testing our code 
with Basic Auth more!!!






[jira] [Commented] (SOLR-12813) SolrCloud + 2 shards + subquery + auth = 401 Exception

2024-04-16 Thread Rudi Seitz (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-12813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17837906#comment-17837906
 ] 

Rudi Seitz commented on SOLR-12813:
---

Here's a PR: https://github.com/apache/solr/pull/2404

> SolrCloud + 2 shards + subquery + auth = 401 Exception
> --
>
> Key: SOLR-12813
> URL: https://issues.apache.org/jira/browse/SOLR-12813
> Project: Solr
>  Issue Type: Bug
>  Components: security, SolrCloud
>Affects Versions: 6.4.1, 7.5, 8.11
>Reporter: Igor Fedoryn
>Priority: Major
> Attachments: screen1.png, screen2.png
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Environment: * Solr 6.4.1
>  * Zookeeper 3.4.6
>  * Java 1.8
> Run Zookeeper
> Upload a simple configuration wherein the Solr schema has fields for a 
> parent/child relationship
> Run two Solr instances (2 nodes)
> Create the collection with 1 shard on each Solr node
>  
> Add parent document to one shard and child document to another shard.
> The response for
> /select?q=ChildIdField:VALUE&fl=*,parents:[subquery]&parents.q={!term f=id v=$row.ParentIdsField}
> is correct.
>  
> After that, add Basic Authentication with some user for the collection.
> Restart Solr or reload the Solr collection.
> If the simple request /select?q=*:* with authorization on the Solr server 
> succeeds, then run the previous request
> with authorization on the Solr server, and you get the exception: "Solr HTTP 
> error: Unauthorized (401)"
>  
> Screens in the attachment.






[PR] SOLR-12813: subqueries should respect basic auth [solr]

2024-04-16 Thread via GitHub


rseitz opened a new pull request, #2404:
URL: https://github.com/apache/solr/pull/2404

   https://issues.apache.org/jira/browse/SOLR-12813
   
   # Description
   
   This PR fixes an issue where subqueries don't work when basic auth is 
enabled. The problem surfaces when 2 or more shards are involved, and when the 
Solr node(s) are not started with the -Dbasicauth system property.
   
   The root cause is that the SubQueryAugmenter discards any basic auth 
credentials that have been sent with the original query request. There are two 
separate places where basic auth credentials are lost. First, the 
SubQueryAugmenter's transform() method issues a subquery by calling 
EmbeddedSolrServer.query() without ever setting a user principal on the 
generated QueryRequest. Second, if we look at how EmbeddedSolrServer actually 
processes a QueryRequest, we see that various transformations are applied, 
resulting in a SolrQueryRequestBase that fails to return the user principal via 
getUserPrincipal() even if it had been properly set on the original 
QueryRequest.
   
   # Solution
   
   SubQueryAugmenter.transform() now generates a QueryRequest explicitly, so that 
the user principal can be set on this QueryRequest before it is processed.
   
   EmbeddedSolrServer now attempts to preserve the user principal on a 
QueryRequest when generating a SolrQueryRequestBase. To do this, 
EmbeddedSolrServer relies on an updated buildRequestFrom() utility method in 
SolrRequestParsers that allows for a user principal to be provided explicitly. 
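
   In code terms, the gist is roughly the following simplified sketch (variable and helper names are approximate, not the exact diff):

   ```java
   // SubQueryAugmenter.transform(): build the subquery request explicitly and carry the
   // caller's principal over to it before it is processed.
   QueryRequest subRequest = new QueryRequest(subQueryParams);
   subRequest.setUserPrincipal(callerPrincipal); // principal taken from the outer request

   // EmbeddedSolrServer: preserve that principal when converting the SolrRequest into a
   // SolrQueryRequest, via the updated SolrRequestParsers.buildRequestFrom() overload.
   req =
       _parser.buildRequestFrom(
           core, params, getContentStreams(request), request.getUserPrincipal());
   ```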
   
   # Tests
   
   TestSubQueryTransformerDistrib has been updated to use basic auth. I have 
confirmed that the updated test fails without the fix, but passes with the fix. I 
have also manually tested the change in a 2-node cluster with two shards, where 
I enabled basic auth and issued a subquery successfully.
   
   # Checklist
   
   Please review the following and check all that apply:
   
   - [x] I have reviewed the guidelines for [How to 
Contribute](https://github.com/apache/solr/blob/main/CONTRIBUTING.md) and my 
code conforms to the standards described there to the best of my ability.
   - [x] I have created a Jira issue and added the issue ID to my pull request 
title.
   - [x] I have given Solr maintainers 
[access](https://help.github.com/en/articles/allowing-changes-to-a-pull-request-branch-created-from-a-fork)
 to contribute to my PR branch. (optional but recommended)
   - [x] I have developed this patch against the `main` branch.
   - [x] I have run `./gradlew check`.
   - [x] I have added tests for my changes.
   - [ ] I have added documentation for the [Reference 
Guide](https://github.com/apache/solr/tree/main/solr/solr-ref-guide)
   





Re: [PR] SOLR-17192: Add "field-limiting" URP to catch ill-designed schemas [solr]

2024-04-16 Thread via GitHub


dsmiley commented on code in PR #2395:
URL: https://github.com/apache/solr/pull/2395#discussion_r1567989670


##
solr/core/src/java/org/apache/solr/update/processor/NumFieldLimitingUpdateRequestProcessor.java:
##
@@ -0,0 +1,140 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.solr.update.processor;
+
+import java.io.IOException;
+import java.lang.invoke.MethodHandles;
+import java.util.Locale;
+import org.apache.solr.cloud.CloudDescriptor;
+import org.apache.solr.cloud.ZkController;
+import org.apache.solr.common.SolrException;
+import org.apache.solr.common.cloud.Replica;
+import org.apache.solr.common.cloud.Slice;
+import org.apache.solr.common.util.StrUtils;
+import org.apache.solr.core.CoreDescriptor;
+import org.apache.solr.request.SolrQueryRequest;
+import org.apache.solr.update.AddUpdateCommand;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+public class NumFieldLimitingUpdateRequestProcessor extends UpdateRequestProcessor {
+  private static final Logger log = LoggerFactory.getLogger(MethodHandles.lookup().lookupClass());
+
+  private SolrQueryRequest req;
+  private int fieldThreshold;
+  private int currentNumFields;
+  private boolean warnOnly;
+
+  public NumFieldLimitingUpdateRequestProcessor(
+  SolrQueryRequest req,
+  UpdateRequestProcessor next,
+  int fieldThreshold,
+  int currentNumFields,
+  boolean warnOnly) {
+    super(next);
+    this.req = req;
+    this.fieldThreshold = fieldThreshold;
+    this.currentNumFields = currentNumFields;
+    this.warnOnly = warnOnly;
+  }
+
+  public void processAdd(AddUpdateCommand cmd) throws IOException {
+    if (!isCloudMode() || /* implicit isCloudMode==true */ isLeaderThatOwnsTheDoc(cmd)) {

Review Comment:
   isCloudMode -- let's not; the user can configure or not as they wish.
   
   "isLeaderThatOwnsTheDoc" can also be removed.  Instead in the URP Factory, 
if we are not a leader, then skip this URP.  The "owns the doc concept" I think 
we are agreeing we don't need in this PR.
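
   Something along these lines in the factory is what I mean -- a hypothetical sketch only (the isLeader() helper and the field names are placeholders, not code from this PR):

   ```java
   // Sketch: decide once, at URP-chain construction time, instead of per document in processAdd().
   @Override
   public UpdateRequestProcessor getInstance(
       SolrQueryRequest req, SolrQueryResponse rsp, UpdateRequestProcessor next) {
     if (!isLeader(req)) {
       return next; // non-leaders skip field-limit enforcement entirely
     }
     return new NumFieldLimitingUpdateRequestProcessor(
         req, next, fieldThreshold, getCurrentNumFields(req), warnOnly);
   }
   ```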



##
solr/core/src/java/org/apache/solr/update/processor/NumFieldLimitingUpdateRequestProcessorFactory.java:
##
@@ -0,0 +1,110 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.solr.update.processor;
+
+import org.apache.solr.common.util.NamedList;
+import org.apache.solr.core.SolrCore;
+import org.apache.solr.request.SolrQueryRequest;
+import org.apache.solr.response.SolrQueryResponse;
+import org.apache.solr.search.SolrIndexSearcher;
+import org.apache.solr.util.plugin.SolrCoreAware;
+
+/**
+ * This factory generates an UpdateRequestProcessor which fails update 
requests once a core has
+ * exceeded a configurable maximum number of fields. Meant as a safeguard to 
help users notice
+ * potentially-dangerous schema design before performance and stability 
problems start to occur.
+ *
+ * The URP uses the core's {@link SolrIndexSearcher} to judge the current 
number of fields.
+ * Accordingly, it undercounts the number of fields in the core - missing all 
fields added since the
+ * previous searcher was opened. As such, the URP's request-blocking is "best 
effort" - it cannot be
+ * relied on as a precise limit on the number of fields.
+ *
+ * Additionally, the field-counting includes all documents present in the 
index, including any
+ * deleted docs that haven't yet been purged via segment merging. Note that 
this can differ
+ * 

[jira] [Commented] (SOLR-17106) LBSolrClient: Make it configurable to remove zombie ping checks

2024-04-16 Thread David Smiley (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-17106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17837877#comment-17837877
 ] 

David Smiley commented on SOLR-17106:
-

Just want to mention that we should probably have these settings choose default 
values via EnvUtils (a new thing) so we can conveniently make these settings 
adjustments via system properties or env vars, whichever is convenient to the 
user.
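
As an illustration of the idea -- a hypothetical sketch only; the EnvUtils accessor names and the property names below are assumptions, not an existing API contract:

{code:java}
// Sketch: resolve the new LBSolrClient knobs through EnvUtils so that either a system
// property or the equivalent environment variable can supply the value.
boolean zombiePingEnabled = EnvUtils.getPropertyAsBool("solr.lbclient.zombiePing.enabled", true);
long zombieJailSeconds = EnvUtils.getPropertyAsLong("solr.lbclient.zombieJail.seconds", 60L);
{code}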

> LBSolrClient: Make it configurable to remove zombie ping checks
> ---
>
> Key: SOLR-17106
> URL: https://issues.apache.org/jira/browse/SOLR-17106
> Project: Solr
>  Issue Type: Improvement
>Reporter: Aparna Suresh
>Priority: Minor
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Following discussion from a dev list discussion here: 
> [https://lists.apache.org/thread/f0zfmpg0t48xrtppyfsmfc5ltzsq2qqh]
> The issue involves scalability challenges in SolrJ's *LBSolrClient* when a 
> pod with numerous cores experiences connectivity problems. The "zombie" 
> tracking mechanism, operating on a core basis, becomes a bottleneck during 
> distributed search on a massive multi shard collection. Threads attempting to 
> reach unhealthy cores contribute to a high computational load, causing 
> performance issues. 
> As suggested by Chris Hostetter: LBSolrClient could be configured to disable 
> zombie "ping" checks, but retain zombie tracking. Once a replica/endpoint is 
> identified as a zombie, it could be held in zombie jail for X seconds, before 
> being released - hoping that by this timeframe ZK would be updated to mark 
> this endpoint DOWN or the pod is back up and CloudSolrClient would avoid 
> querying it. In any event, only 1 failed query would be needed to send the 
> server back to zombie jail.
>  
> There are benefits in doing this change:
>  * Eliminate the zombie ping requests, which would otherwise overload pod(s) 
> coming up after a restart
>  * Avoid memory leaks, in case a node/replica goes away permanently, but it 
> stays as zombie forever, with a background thread in LBSolrClient constantly 
> pinging it






[jira] [Created] (SOLR-17234) LBHttp2SolrClient does not skip "zombie" endpoints

2024-04-16 Thread James Dyer (Jira)
James Dyer created SOLR-17234:
-

 Summary: LBHttp2SolrClient does not skip "zombie" endpoints
 Key: SOLR-17234
 URL: https://issues.apache.org/jira/browse/SOLR-17234
 Project: Solr
  Issue Type: Bug
  Security Level: Public (Default Security Level. Issues are Public)
  Components: SolrJ
Affects Versions: main (10.0)
Reporter: James Dyer


While working on SOLR-14763, I found different behavior with 
*LBHttp2SolrClient* between *branch_9x* and *main/10.x*.

If the first Endpoint in the list had previously failed, *branch_9x* will skip 
the failed Endpoint with subsequent requests, and begin requesting with the 
second Endpoint. If all remaining Endpoints fail, it will then retry the first 
Endpoint again.

If the first Endpoint in the list had previously failed, *main/10.x* will 
always try the first Endpoint despite it being in the "Zombie List".  When the 
first Endpoint fails again, it will re-try the second Endpoint.

The *branch_9x* behavior seems more desirable as this minimizes unnecessary 
work by avoiding Endpoints that are known to fail. Indeed, *main/10.x* has an 
obvious bug in *EndpointIterator#fetchNext* where it attempts to get the wrong 
type of key for the map holding the Zombies.  I believe this difference is a 
regression bug in *main/10.x*.

The different behavior is recorded in test 
*LBHttp2SolrClientTest#testAsyncWithFailures*. This test was added 
after-the-fact with SOLR-14763. I needed to change its "asserts" when 
backporting to *branch_9x* to account for the changed behavior.






[jira] [Commented] (SOLR-17066) Deprecate and remove core URLs in HttpSolrClient and friends

2024-04-16 Thread James Dyer (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-17066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17837819#comment-17837819
 ] 

James Dyer commented on SOLR-17066:
---

[~gerlowskija]  Can this ticket be closed?

> Deprecate and remove core URLs in HttpSolrClient and friends
> 
>
> Key: SOLR-17066
> URL: https://issues.apache.org/jira/browse/SOLR-17066
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrJ
>Reporter: Jason Gerlowski
>Priority: Major
>  Time Spent: 12h 40m
>  Remaining Estimate: 0h
>
> Currently, URL-driven SolrClients can consume a base URL that either ends in 
> an API-version specific path ("/solr" for v1 APIs, "/api" for v2), or in the 
> full path to a specific core or collection ("/solr/techproducts").
> The latter option causes a number of problems in practice.  It prevents the 
> client from being used for any "admin" requests or for requests to other 
> cores or collections.  (Short of running a regex on 
> {{SolrClient.getBaseURL}}, it's hard to even tell which of these restrictions 
> a given client might have.)  And lastly, specifying such a core/collection URL 
> makes it tough to mix and match v1 and v2 API requests within the same client 
> (see SOLR-17044).
> We should give SolrJ users a similar way to default the collection/core 
> without any of these downsides.  One approach would be to extend the 
> {{withDefaultCollection}} pattern currently established in 
> {{CloudHttp2SolrClient.Builder}}.
> (IMO we should also revisit the division of responsibilities between 
> SolrClient and SolrRequest implementations - maybe clients shouldn't, 
> directly at least, be holding on to request-specific settings like the 
> core/collection at all.  But that's a much larger concern that we might not 
> want to wade into here.  See SOLR-10466 for more on this topic.)






[jira] [Commented] (SOLR-14763) SolrJ Client Async HTTP/2 Requests

2024-04-16 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17837818#comment-17837818
 ] 

ASF subversion and git services commented on SOLR-14763:


Commit cd84216e0cd610dc7bef1bbd880274d950041c8b in solr's branch 
refs/heads/branch_9_6 from James Dyer
[ https://gitbox.apache.org/repos/asf?p=solr.git;h=cd84216e0cd ]

SOLR-14763 SolrJ HTTP/2 Async API using CompletableFuture (update for 2024) 
(#2402)


> SolrJ Client Async HTTP/2 Requests
> --
>
> Key: SOLR-14763
> URL: https://issues.apache.org/jira/browse/SOLR-14763
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrJ
>Affects Versions: 8.7
>Reporter: Rishi Sankar
>Assignee: James Dyer
>Priority: Major
> Fix For: main (10.0), 9.6
>
>  Time Spent: 7h
>  Remaining Estimate: 0h
>
> In SOLR-14354, [~caomanhdat] created an API to use Jetty async API to make 
> more thread efficient HttpShardHandler requests. This added public async 
> request APIs to Http2SolrClient and LBHttp2SolrClient. There are a few ways 
> this API can be improved, that I will track in this issue:
> 1) Using a CompletableFuture-based async API signature, instead of using 
> internal custom interfaces (Cancellable, AsyncListener) - based on [this 
> discussion|https://lists.apache.org/thread.html/r548f318d9176c84ad1a4ed49ff182eeea9f82f26cb23e372244c8a23%40%3Cdev.lucene.apache.org%3E].
> 2) An async API is also useful in other HTTP/2 Solr clients as well, 
> particularly CloudHttp2SolrClient (SOLR-14675). I will add a requestAsync 
> method to the SolrClient class, with a default method that initially throws 
> an unsupported operation exception (maybe this can be later updated to use an 
> executor to handle the async request as a default impl). For now, I'll 
> override the default implementation in the Http2SolrClient and 
> CloudHttp2SolrClient.






[jira] [Commented] (SOLR-14763) SolrJ Client Async HTTP/2 Requests

2024-04-16 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17837816#comment-17837816
 ] 

ASF subversion and git services commented on SOLR-14763:


Commit 20601cd4314295990044a84bbb7c0a854741fb6d in solr's branch 
refs/heads/branch_9x from James Dyer
[ https://gitbox.apache.org/repos/asf?p=solr.git;h=20601cd4314 ]

SOLR-14763 SolrJ HTTP/2 Async API using CompletableFuture (update for 2024) 
(#2402)


> SolrJ Client Async HTTP/2 Requests
> --
>
> Key: SOLR-14763
> URL: https://issues.apache.org/jira/browse/SOLR-14763
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrJ
>Affects Versions: 8.7
>Reporter: Rishi Sankar
>Assignee: James Dyer
>Priority: Major
> Fix For: main (10.0), 9.6
>
>  Time Spent: 7h
>  Remaining Estimate: 0h
>
> In SOLR-14354, [~caomanhdat] created an API to use Jetty async API to make 
> more thread efficient HttpShardHandler requests. This added public async 
> request APIs to Http2SolrClient and LBHttp2SolrClient. There are a few ways 
> this API can be improved, that I will track in this issue:
> 1) Using a CompletableFuture-based async API signature, instead of using 
> internal custom interfaces (Cancellable, AsyncListener) - based on [this 
> discussion|https://lists.apache.org/thread.html/r548f318d9176c84ad1a4ed49ff182eeea9f82f26cb23e372244c8a23%40%3Cdev.lucene.apache.org%3E].
> 2) An async API is also useful in other HTTP/2 Solr clients as well, 
> particularly CloudHttp2SolrClient (SOLR-14675). I will add a requestAsync 
> method to the SolrClient class, with a default method that initially throws 
> an unsupported operation exception (maybe this can be later updated to use an 
> executor to handle the async request as a default impl). For now, I'll 
> override the default implementation in the Http2SolrClient and 
> CloudHttp2SolrClient.






[jira] [Updated] (SOLR-16594) improve eDismax strategy for generating a term-centric query

2024-04-16 Thread Rudi Seitz (Jira)


 [ 
https://issues.apache.org/jira/browse/SOLR-16594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rudi Seitz updated SOLR-16594:
--
Description: 
When parsing a multi-term query that spans multiple fields, edismax sometimes 
switches from a "term-centric" to a "field-centric" approach. This creates 
inconsistent semantics for the {{mm}} or "min should match" parameter and may 
have an impact on scoring. The goal of this ticket is to improve the approach 
that edismax uses for generating term-centric queries so that edismax would 
less frequently "give up" and resort to the field-centric approach. 
Specifically, we propose that edismax should create a dismax query for each 
distinct startOffset found among the tokens emitted by the field analyzers. 
Since the relevant code in edismax works with Query objects that contain Terms, 
and since Terms do not hold the startOffset of the Token from which Term was 
derived, some plumbing work would need to be done to make the startOffsets 
available to edismax.

 

BACKGROUND:

 

If a user searches for "foo bar" with {{qf=f1 f2}}, a field-centric 
interpretation of the query would contain a clause for each field:

{{  (f1:foo f1:bar) (f2:foo f2:bar)}}

while a term-centric interpretation would contain a clause for each term:

{{  (f1:foo f2:foo) (f1:bar f2:bar)}}

The challenge in generating a term-centric query is that we need to take the 
tokens that emerge from each field's analysis chain and group them according to 
the terms in the user's original query. However, the tokens that emerge from an 
analysis chain do not store a reference to their corresponding input terms. For 
example, if we pass "foo bar" through an ngram analyzer we would get a token 
stream containing "f", "fo", "foo", "b", "ba", "bar". While it may be obvious 
to a human that "f", "fo", and "foo" all come from the "foo" input term, and 
that "b", "ba", and "bar" come from the "bar" input term, there is not always 
an easy way for edismax to see this connection. When {{sow=true}}, edismax 
passes each whitespace-separated term through each analysis chain separately, 
and therefore edismax "knows" that the output tokens from any given analysis 
chain are all derived from the single input term that was passed into that 
chain. However, when {{sow=false}}, edismax passes the entire multi-term 
query through each analysis chain as a whole, resulting in multiple output 
tokens that are not "connected" to their source term.

Edismax still tries to generate a term-centric query when {{sow=false}} by 
first generating a boolean query for each field, and then checking whether all 
of these per-field queries have the same structure. The structure will 
generally be uniform if each analysis chain emits the same number of tokens for 
the given input. If one chain has a synonym filter and another doesn’t, this 
uniformity may depend on whether a synonym rule happened to match a term in the 
user's input.

Assuming the per-field boolean queries _do_ have the same structure, edismax 
reorganizes them into a new boolean query. The new query contains a dismax for 
each clause position in the original queries. If the original queries are 
{{(f1:foo f1:bar)}} and {{(f2:foo f2:bar)}}, we can see they have two clauses 
each, so we would get a dismax containing all the first-position clauses 
{{(f1:foo f2:foo)}} and another dismax containing all the second-position 
clauses {{(f1:bar f2:bar)}}.

We can see that edismax is using clause position as a heuristic to reorganize 
the per-field boolean queries into per-term ones, even though it doesn't know 
for sure which clauses inside those per-field boolean queries are related to 
which input terms. We propose that a better way of reorganizing the per-field 
boolean queries is to create a dismax for each distinct startOffset seen among 
the tokens in the token streams emitted by each field analyzer. The startOffset 
of a token (rather, a PackedTokenAttributeImpl) is "the position of the first 
character corresponding to this token in the source text".

We propose that startOffset is a reasonable way of matching output tokens up 
with the input terms that gave rise to them. For example, if we pass "foo bar" 
through an ngram analysis chain we see that the foo-related tokens all have 
startOffset=0 while the bar-related tokens all have startOffset=4. Likewise, 
tokens that are generated via synonym expansion have a startOffset that points 
to the beginning of the matching input term. For example, if the query "GB" 
generates "GB gib gigabyte gigabytes" via synonym expansion, all of those four 
tokens would have startOffset=0.

Here's an example of how the proposed edismax logic would work. Let's say a 
user searches for "foo bar" across two fields, f1 and f2, where f1 uses a 
standard text analysis chain while f2 generates ngrams. We would get 
field-centric queries {{(f1:foo f1:bar)}} and 

[jira] [Commented] (SOLR-12813) SolrCloud + 2 shards + subquery + auth = 401 Exception

2024-04-16 Thread Rudi Seitz (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-12813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17837803#comment-17837803
 ] 

Rudi Seitz commented on SOLR-12813:
---

I have begun implementing a fix here: 
[https://github.com/rseitz/solr/commit/c51f038f33b21411ce5c01ccf6d9f4d17690d82b]

I found two separate places where credentials are lost. First, the 
SubQueryAugmenterFactory never sets credentials on the subqueries that it 
generates. Second, when a subquery is handled by EmbeddedSolrServer, the query 
goes through various transformations that would drop credentials if they had 
been present in the first place. The code I'm sharing here fixes both issues, 
and I've tested it manually with a collection with 2 shards in a 2-node cluster. 
The fix only works with forwardCredentials=true.

I am working on writing a unit test and creating a PR. In the meantime, I'm 
eager for any feedback on the proposed changes.

> SolrCloud + 2 shards + subquery + auth = 401 Exception
> --
>
> Key: SOLR-12813
> URL: https://issues.apache.org/jira/browse/SOLR-12813
> Project: Solr
>  Issue Type: Bug
>  Components: security, SolrCloud
>Affects Versions: 6.4.1, 7.5, 8.11
>Reporter: Igor Fedoryn
>Priority: Major
> Attachments: screen1.png, screen2.png
>
>
> Environment: * Solr 6.4.1
>  * Zookeeper 3.4.6
>  * Java 1.8
> Run Zookeeper
> Upload a simple configuration wherein the Solr schema has fields for a 
> parent/child relationship
> Run two Solr instances (2 nodes)
> Create the collection with 1 shard on each Solr node
>  
> Add parent document to one shard and child document to another shard.
> The response for
> /select?q=ChildIdField:VALUE&fl=*,parents:[subquery]&parents.q={!term f=id v=$row.ParentIdsField}
> is correct.
>  
> After that, add Basic Authentication with some user for the collection.
> Restart Solr or reload the Solr collection.
> If the simple request /select?q=*:* with authorization on the Solr server 
> succeeds, then run the previous request
> with authorization on the Solr server, and you get the exception: "Solr HTTP 
> error: Unauthorized (401)"
>  
> Screens in the attachment.






Re: [PR] SOLR-17192: Add "field-limiting" URP to catch ill-designed schemas [solr]

2024-04-16 Thread via GitHub


gerlowskija commented on PR #2395:
URL: https://github.com/apache/solr/pull/2395#issuecomment-2059268487

   > I don't like the complexity in this URP relating to tolerance of where the 
URP is placed in the chain; I'd feel better if the URP were simplified from 
that concern and we expect the user to place it at an appropriate spot
   
   Yeah I understand.  I still feel a little reluctant I guess, but I'm likely 
being paranoid.  It's harder to feel good about the "win" in making things 
safer for users if the change opens up another trappy state when misconfigured. 
 But I'll take out the leader-check for now.  We can always re-evaluate later 
if we see people getting bitten by this, and there are other ways to mitigate 
the risk of misconfig (e.g. putting it into the "_default" config for 10.0 so 
users needn't tweak things).





[jira] [Commented] (SOLR-14763) SolrJ Client Async HTTP/2 Requests

2024-04-16 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-14763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17837736#comment-17837736
 ] 

ASF subversion and git services commented on SOLR-14763:


Commit c512116f6a20b3ccd0c76c0743053553da2ff53b in solr's branch 
refs/heads/main from James Dyer
[ https://gitbox.apache.org/repos/asf?p=solr.git;h=c512116f6a2 ]

SOLR-14763 SolrJ HTTP/2 Async API using CompletableFuture (update for 2024) 
(#2402)

 

> SolrJ Client Async HTTP/2 Requests
> --
>
> Key: SOLR-14763
> URL: https://issues.apache.org/jira/browse/SOLR-14763
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrJ
>Affects Versions: 8.7
>Reporter: Rishi Sankar
>Assignee: James Dyer
>Priority: Major
> Fix For: main (10.0), 9.6
>
>  Time Spent: 6.5h
>  Remaining Estimate: 0h
>
> In SOLR-14354, [~caomanhdat] created an API to use Jetty async API to make 
> more thread efficient HttpShardHandler requests. This added public async 
> request APIs to Http2SolrClient and LBHttp2SolrClient. There are a few ways 
> this API can be improved, that I will track in this issue:
> 1) Using a CompletableFuture-based async API signature, instead of using 
> internal custom interfaces (Cancellable, AsyncListener) - based on [this 
> discussion|https://lists.apache.org/thread.html/r548f318d9176c84ad1a4ed49ff182eeea9f82f26cb23e372244c8a23%40%3Cdev.lucene.apache.org%3E].
> 2) An async API is also useful in other HTTP/2 Solr clients as well, 
> particularly CloudHttp2SolrClient (SOLR-14675). I will add a requestAsync 
> method to the SolrClient class, with a default method that initially throws 
> an unsupported operation exception (maybe this can be later updated to use an 
> executor to handle the async request as a default impl). For now, I'll 
> override the default implementation in the Http2SolrClient and 
> CloudHttp2SolrClient.






Re: [PR] SOLR-14763 SolrJ HTTP/2 Async API using CompletableFuture (update for 2024) [solr]

2024-04-16 Thread via GitHub


jdyer1 merged PR #2402:
URL: https://github.com/apache/solr/pull/2402





Re: [PR] SOLR-17204: REPLACENODE supports the source node not being live [solr]

2024-04-16 Thread via GitHub


HoustonPutman commented on PR #2353:
URL: https://github.com/apache/solr/pull/2353#issuecomment-2059215053

   +1 from me, sorry that I missed this.
   
   Should we do the same thing for `MigrateReplicasCmd`? Seems like a good idea 
to keep parity there.





Re: [PR] Add correct exception logging in the ExecutorUtil [solr]

2024-04-16 Thread via GitHub


HoustonPutman merged PR #2384:
URL: https://github.com/apache/solr/pull/2384





[jira] [Comment Edited] (SOLR-15735) SolrJ should fully support Solr's v2 API

2024-04-16 Thread Yohann Callea (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-15735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17837723#comment-17837723
 ] 

Yohann Callea edited comment on SOLR-15735 at 4/16/24 1:51 PM:
---

If it helps make the case for supporting v2 APIs in SolrJ, it is worth mentioning 
that some APIs introduced with Solr 9 are v2-only, without a v1 counterpart.

These v2 only APIs do not seem to be handled properly when called from the 
(generated) client in SolrJ, which is quite unsettling from a SolrJ user's 
perspective.

To illustrate this behavior, let's take the BalanceReplicas API as an example. 
I will simply call it using the _ClusterApi.BalanceReplicas_ client exposed in 
SolrJ in 
[BalanceReplicasTest|https://github.com/apache/solr/blob/main/solr/core/src/test/org/apache/solr/cloud/BalanceReplicasTest.java#L102-L105]
 in place of _postDataAndGetResponse(...)_ so it is reproducible.
{code:java}
BalanceReplicas req = new BalanceReplicas();
req.setWaitForFinalState(true);
req.process(cloudClient); {code}
Such a call consistently throws {*}_SolrException: No collection param 
specified on request and no default collection has been set: []_{*}, as we 
unexpectedly fall into the following else section of 
[CloudSolrClient|https://github.com/apache/solr/blob/main/solr/solrj/src/java/org/apache/solr/client/solrj/impl/CloudSolrClient.java#L1028-L1064]:
{code:java}
if (request instanceof V2Request) {
  if (!liveNodes.isEmpty()) {
List<String> liveNodesList = new ArrayList<>(liveNodes);
Collections.shuffle(liveNodesList, rand);
final var chosenNodeUrl = Utils.getBaseUrlForNodeName(liveNodesList.get(0), 
urlScheme);
requestEndpoints.add(new LBSolrClient.Endpoint(chosenNodeUrl));
  }

} else if (ADMIN_PATHS.contains(request.getPath())) {
  for (String liveNode : liveNodes) {
final var nodeBaseUrl = Utils.getBaseUrlForNodeName(liveNode, urlScheme);
requestEndpoints.add(new LBSolrClient.Endpoint(nodeBaseUrl));
  }

} else { // Typical...
  Set<String> collectionNames = resolveAliases(inputCollections);
  if (collectionNames.isEmpty()) {
throw new SolrException(
SolrException.ErrorCode.BAD_REQUEST,
"No collection param specified on request and no default collection has 
been set: "
+ inputCollections);
  }
  [...]
}{code}
I would not expect a SolrJ user to tinker with their CloudSolrClient to change 
the path prefix from /solr to /api to make it work in this situation.
Maybe SolrJ should expose API clients leveraging V2Request for these specific 
APIs, as it would then work just fine:
{code:java}
V2Request req =
  new V2Request.Builder("cluster/replicas/balance")
.forceV2(true)
.POST()
.withPayload(Map.of(WAIT_FOR_FINAL_STATE, true))
.build();
req.process(cloudClient);
{code}



[jira] [Commented] (SOLR-15735) SolrJ should fully support Solr's v2 API

2024-04-16 Thread Yohann Callea (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-15735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17837723#comment-17837723
 ] 

Yohann Callea commented on SOLR-15735:
--

If it helps make the case for supporting v2 APIs in SolrJ, it is worth mentioning 
that some APIs introduced with Solr 9 are v2-only, without a v1 counterpart.

These v2 only APIs do not seem to be handled properly when called from the 
(generated) client in SolrJ, which is quite unsettling from a SolrJ user's 
perspective.

To illustrate this behavior, let's take the BalanceReplicas API as an example. 
I will simply call it using the _ClusterApi.BalanceReplica_ client exposed in 
SolrJ in 
[BalanceReplicasTest|https://github.com/apache/solr/blob/main/solr/core/src/test/org/apache/solr/cloud/BalanceReplicasTest.java#L102-L105]
 in place of _postDataAndGetResponse(...)_ so it is reproducible.
{code:java}
BalanceReplicas req = new BalanceReplicas();
req.setWaitForFinalState(true);
req.process(cloudClient); {code}
Such a call consistently throws {*}_SolrException: No collection param 
specified on request and no default collection has been set: []_{*}, as we 
unexpectedly fall into the following else section of 
[CloudSolrClient|https://github.com/apache/solr/blob/main/solr/solrj/src/java/org/apache/solr/client/solrj/impl/CloudSolrClient.java#L1028-L1064]:
{code:java}
if (request instanceof V2Request) {
  if (!liveNodes.isEmpty()) {
List<String> liveNodesList = new ArrayList<>(liveNodes);
Collections.shuffle(liveNodesList, rand);
final var chosenNodeUrl = Utils.getBaseUrlForNodeName(liveNodesList.get(0), 
urlScheme);
requestEndpoints.add(new LBSolrClient.Endpoint(chosenNodeUrl));
  }

} else if (ADMIN_PATHS.contains(request.getPath())) {
  for (String liveNode : liveNodes) {
final var nodeBaseUrl = Utils.getBaseUrlForNodeName(liveNode, urlScheme);
requestEndpoints.add(new LBSolrClient.Endpoint(nodeBaseUrl));
  }

} else { // Typical...
  Set<String> collectionNames = resolveAliases(inputCollections);
  if (collectionNames.isEmpty()) {
throw new SolrException(
SolrException.ErrorCode.BAD_REQUEST,
"No collection param specified on request and no default collection has 
been set: "
+ inputCollections);
  }
  [...]
}{code}
I would not expect a SolrJ user to tinker with their CloudSolrClient to change 
the path prefix from /solr to /api to make it work in this situation.
Maybe SolrJ should expose API clients leveraging V2Request for these specific 
APIs, as it would then work just fine:
{code:java}
V2Request req =
  new V2Request.Builder("cluster/replicas/balance")
.forceV2(true)
.POST()
.withPayload(Map.of(WAIT_FOR_FINAL_STATE, true))
.build();
req.process(cloudClient);
{code}

> SolrJ should fully support Solr's v2 API
> 
>
> Key: SOLR-15735
> URL: https://issues.apache.org/jira/browse/SOLR-15735
> Project: Solr
>  Issue Type: Improvement
>  Components: v2 API
>Reporter: Jason Gerlowski
>Priority: Major
>  Labels: V2
>
> Having our v2 API exercised by our test suite would provide a needed boost of 
> confidence and serve to flush out any existing gaps. Doing this though 
> requires that the v2 API is exposed through SolrJ, since SolrJ is mostly what 
> our tests are based on.
> This ticket serves as an umrella to track whatever works ends up being 
> necessary for updating SolrJ to use the V2 API. At a minimum, this will need 
> to include updating individual SolrRequest objects to use a v2 API, and 
> ensuring that SolrClient's offer the same optimizations in routing, etc. to 
> v2 requests as they do for v1.
> One open question that'll impact the scope of this work significantly is 
> whether SolrJ must support v1 and v2 simultaneously, or whether individual 
> SolrRequest implementations can be switched over to v2 without retaining v1 
> support. (See discussion of this 
> [here|https://issues.apache.org/jira/browse/SOLR-15141?focusedCommentId=17435576=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17435576]).






[jira] [Commented] (SOLR-17233) The parameters of q.op are recommended to be trim()

2024-04-16 Thread Eric Pugh (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-17233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17837700#comment-17837700
 ] 

Eric Pugh commented on SOLR-17233:
--

I'd be happy to review a PR for fixing this.  I wonder if we need to have 
better handling of all parameter parsing...  Curious how the space got in in 
the first place?

> The parameters of q.op are recommended to be trim()
> ---
>
> Key: SOLR-17233
> URL: https://issues.apache.org/jira/browse/SOLR-17233
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SolrCloud
>Reporter: ichar
>Priority: Minor
> Attachments: image-2024-04-16-14-46-12-603.png
>
>
> !image-2024-04-16-14-46-12-603.png|width=301,height=265!
> The expected result should be empty, but this result was returned instead. The reason is 
> that the q.op parameter arrives as "and " (with an extra space); because it is not 
> trimmed, it is not recognized as the "and" operator.






Re: [PR] SOLR-17151 - stop processing components once we have exceeded a query limit [solr]

2024-04-16 Thread via GitHub


sigram commented on code in PR #2403:
URL: https://github.com/apache/solr/pull/2403#discussion_r1567093594


##
solr/core/src/java/org/apache/solr/search/QueryLimits.java:
##
@@ -108,12 +110,21 @@ public String formatExceptionMessage(String label) {
 * @throws QueryLimitsExceededException if {@link CommonParams#PARTIAL_RESULTS} request parameter
 *     is false and limits have been reached.
 */
+  public boolean maybeExitWithPartialResults(Supplier<String> label)
+      throws QueryLimitsExceededException {
+    return maybeExitWithPartialResults(label.get());
+  }
+
   public boolean maybeExitWithPartialResults(String label) throws QueryLimitsExceededException {
     if (isLimitsEnabled() && shouldExit()) {
       if (allowPartialResults) {
         if (rsp != null) {
           rsp.setPartialResults();
-          rsp.addPartialResponseDetail(formatExceptionMessage(label));
+          if (rsp.getResponseHeader().get(RESPONSE_HEADER_PARTIAL_RESULTS_DETAILS_KEY) == null) {

Review Comment:
   Hmm, ok... then maybe we should add some processing similar to 
`computeShardCpuTime` to aggregate multiple details from shard responses into a 
single value?






Re: [PR] SOLR-16505: Switch UpdateShardHandler.getRecoveryOnlyHttpClient to Jetty HTTP2 [solr]

2024-04-16 Thread via GitHub


iamsanjay commented on PR #2276:
URL: https://github.com/apache/solr/pull/2276#issuecomment-2058715129

   ### Can we remove the Legacy Auth mechanism? 
   
   No
   
   In a user-managed cluster, the only inter-node communication happening, IMO, is 
downloading an index from one node to another. The `PKIAuthenticationPlugin` 
only works in a cloud environment, and the condition that encloses the PKI 
initialization logic checks `isZookeeperAware()`. Of course, that 
condition evaluates to false in a user-managed cluster, and therefore the method 
responsible for setting auth headers never gets called.
   
   
https://github.com/apache/solr/blob/c3c83ffb8dba17dd79f78429df65869d1b7d87bb/solr/core/src/java/org/apache/solr/core/CoreContainer.java#L836-L855
   
   
https://github.com/apache/solr/blob/c3c83ffb8dba17dd79f78429df65869d1b7d87bb/solr/core/src/java/org/apache/solr/core/CoreContainer.java#L626-L631
   
   I will be adding a test case to exercise legacy replication when Basic Auth is 
enabled.





Re: [PR] SOLR-17151 - stop processing components once we have exceeded a query limit [solr]

2024-04-16 Thread via GitHub


sigram commented on code in PR #2403:
URL: https://github.com/apache/solr/pull/2403#discussion_r1567091001


##
solr/core/src/java/org/apache/solr/search/QueryLimits.java:
##
@@ -108,22 +110,31 @@ public String formatExceptionMessage(String label) {
 * @throws QueryLimitsExceededException if {@link CommonParams#PARTIAL_RESULTS} request parameter
 *     is false and limits have been reached.
 */
-  public boolean maybeExitWithPartialResults(String label) throws QueryLimitsExceededException {
+  public boolean maybeExitWithPartialResults(Supplier<String> label)
+      throws QueryLimitsExceededException {
     if (isLimitsEnabled() && shouldExit()) {
       if (allowPartialResults) {
         if (rsp != null) {
           rsp.setPartialResults();
-          rsp.addPartialResponseDetail(formatExceptionMessage(label));
+          if (rsp.getResponseHeader().get(RESPONSE_HEADER_PARTIAL_RESULTS_DETAILS_KEY) == null) {
+            // don't want to add duplicate keys. Although technically legal, there's a strong risk
+            // that clients won't anticipate it and break.
+            rsp.addPartialResponseDetail(formatExceptionMessage(label.get()));
+          }
         }
         return true;
       } else {
-        throw new QueryLimitsExceededException(formatExceptionMessage(label));
+        throw new QueryLimitsExceededException(formatExceptionMessage(label.get()));
       }
     } else {
       return false;
     }
   }
 
+  public boolean maybeExitWithPartialResults(String label) throws QueryLimitsExceededException {

Review Comment:
   I think we should add some javadoc here that explains why we have two 
different methods for doing essentially the same work, and when to prefer one 
over the other.
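
   For example, the javadoc could contrast the two call styles roughly like this (hypothetical call sites, not from the PR; `limits` and `stage` are placeholders):

   ```java
   // String variant: the label is built eagerly, even when no limit has tripped.
   limits.maybeExitWithPartialResults("ExpandComponent stage " + stage);

   // Supplier variant: the label is only built once a limit has actually been exceeded,
   // so prefer it when constructing the label is non-trivial.
   limits.maybeExitWithPartialResults(() -> "ExpandComponent stage " + stage);
   ```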






[jira] [Commented] (SOLR-16441) Upgrade Jetty to 11.x

2024-04-16 Thread Henrik (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-16441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17837588#comment-17837588
 ] 

Henrik commented on SOLR-16441:
---

Maybe close this issue in favour of SOLR-17069?

> Upgrade Jetty to 11.x
> -
>
> Key: SOLR-16441
> URL: https://issues.apache.org/jira/browse/SOLR-16441
> Project: Solr
>  Issue Type: Improvement
>  Components: Server
>Reporter: Tomas Eduardo Fernandez Lobbe
>Priority: Major
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Solr is currently using Jetty 9.4.x and upgrading to Jetty 10.x in 
> SOLR-15955; we should look at upgrading to Jetty 11, which moves from the javax to 
> the jakarta namespace for the servlet API.
>  






[jira] [Created] (SOLR-17233) The parameters of q.op are recommended to be trim()

2024-04-16 Thread ichar (Jira)
ichar created SOLR-17233:


 Summary: The parameters of q.op are recommended to be trim()
 Key: SOLR-17233
 URL: https://issues.apache.org/jira/browse/SOLR-17233
 Project: Solr
  Issue Type: Improvement
  Security Level: Public (Default Security Level. Issues are Public)
  Components: SolrCloud
Reporter: ichar
 Attachments: image-2024-04-16-14-46-12-603.png

!image-2024-04-16-14-46-12-603.png|width=301,height=265!

The expected result should be empty, but this result was returned instead. The reason is that 
the q.op parameter arrives as "and " (with an extra space); because it is not trimmed, it is 
not recognized as the "and" operator.


