Re: [PR] SOLR-17151 - stop processing components once we have exceeded a query limit [solr]
github-actions[bot] commented on PR #2403: URL: https://github.com/apache/solr/pull/2403#issuecomment-2177273049

This PR has had no visible activity in the past 60 days, so it is being labeled as stale. Any new activity will remove the stale label. To attract more reviewers, please tag someone or notify the d...@solr.apache.org mailing list. Thank you for your contribution!

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
Re: [PR] SOLR-17151 - stop processing components once we have exceeded a query limit [solr]
sigram commented on code in PR #2403: URL: https://github.com/apache/solr/pull/2403#discussion_r1567093594

## solr/core/src/java/org/apache/solr/search/QueryLimits.java:

@@ -108,12 +110,21 @@ public String formatExceptionMessage(String label) {
    * @throws QueryLimitsExceededException if {@link CommonParams#PARTIAL_RESULTS} request parameter
    *     is false and limits have been reached.
    */
+  public boolean maybeExitWithPartialResults(Supplier<String> label)
+      throws QueryLimitsExceededException {
+    return maybeExitWithPartialResults(label.get());
+  }
+
   public boolean maybeExitWithPartialResults(String label) throws QueryLimitsExceededException {
     if (isLimitsEnabled() && shouldExit()) {
       if (allowPartialResults) {
         if (rsp != null) {
           rsp.setPartialResults();
-          rsp.addPartialResponseDetail(formatExceptionMessage(label));
+          if (rsp.getResponseHeader().get(RESPONSE_HEADER_PARTIAL_RESULTS_DETAILS_KEY) == null) {

Review Comment: Hmm, ok... then maybe we should add some processing similar to `computeShardCpuTime` to aggregate multiple details from shard responses into a single value?
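The aggregation sigram suggests — folding per-shard details into one value the way `computeShardCpuTime` folds per-shard CPU time — could look roughly like the following. This is a hypothetical sketch using plain Java collections; the real change would operate on Solr's shard responses and `NamedList` header, and the class and method names here are invented.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: merge the partialResultsDetails value from each shard
// response into a single value for the coordinator's response header,
// analogous to how computeShardCpuTime sums per-shard CPU time.
public class ShardDetailAggregator {

    /** Merge per-shard detail strings, dropping nulls and exact duplicates. */
    public static List<String> aggregateDetails(List<String> perShardDetails) {
        List<String> merged = new ArrayList<>();
        for (String detail : perShardDetails) {
            if (detail != null && !merged.contains(detail)) {
                merged.add(detail);
            }
        }
        return merged;
    }
}
```

The merged list would then be written once under a single partialResultsDetails key, so clients see one entry rather than a duplicate key per shard.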
Re: [PR] SOLR-17151 - stop processing components once we have exceeded a query limit [solr]
sigram commented on code in PR #2403: URL: https://github.com/apache/solr/pull/2403#discussion_r1567091001

## solr/core/src/java/org/apache/solr/search/QueryLimits.java:

@@ -108,22 +110,31 @@ public String formatExceptionMessage(String label) {
    * @throws QueryLimitsExceededException if {@link CommonParams#PARTIAL_RESULTS} request parameter
    *     is false and limits have been reached.
    */
-  public boolean maybeExitWithPartialResults(String label) throws QueryLimitsExceededException {
+  public boolean maybeExitWithPartialResults(Supplier<String> label)
+      throws QueryLimitsExceededException {
     if (isLimitsEnabled() && shouldExit()) {
       if (allowPartialResults) {
         if (rsp != null) {
           rsp.setPartialResults();
-          rsp.addPartialResponseDetail(formatExceptionMessage(label));
+          if (rsp.getResponseHeader().get(RESPONSE_HEADER_PARTIAL_RESULTS_DETAILS_KEY) == null) {
+            // don't want to add duplicate keys. Although technically legal, there's a strong risk
+            // that clients won't anticipate it and break.
+            rsp.addPartialResponseDetail(formatExceptionMessage(label.get()));
+          }
         }
         return true;
       } else {
-        throw new QueryLimitsExceededException(formatExceptionMessage(label));
+        throw new QueryLimitsExceededException(formatExceptionMessage(label.get()));
       }
     } else {
       return false;
     }
   }

+  public boolean maybeExitWithPartialResults(String label) throws QueryLimitsExceededException {

Review Comment: I think we should add some javadoc here that explains why we have two different methods for doing essentially the same work, and when to prefer one over the other.
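The javadoc sigram asks for would essentially document the contract sketched below. This is a simplified, hypothetical sketch, not the actual `QueryLimits` class (the real one also writes to the response header): the `Supplier<String>` overload exists so an expensive label is only built when a limit has actually tripped, while the plain `String` overload suits callers whose label is a cheap constant.

```java
import java.util.function.Supplier;

// Simplified sketch of the two-overload pattern under discussion.
public class QueryLimitsSketch {
    private final boolean limitExceeded;       // stand-in for isLimitsEnabled() && shouldExit()
    private final boolean allowPartialResults;

    public QueryLimitsSketch(boolean limitExceeded, boolean allowPartialResults) {
        this.limitExceeded = limitExceeded;
        this.allowPartialResults = allowPartialResults;
    }

    /**
     * Prefer this overload when building the label is costly (string concatenation,
     * stage-name lookups): the supplier is only invoked on the limit-exceeded path.
     */
    public boolean maybeExitWithPartialResults(Supplier<String> label) {
        if (!limitExceeded) {
            return false; // happy path: label.get() is never called
        }
        return maybeExitWithPartialResults(label.get());
    }

    /** Prefer this overload when the label is a cheap constant. */
    public boolean maybeExitWithPartialResults(String label) {
        if (!limitExceeded) {
            return false;
        }
        if (!allowPartialResults) {
            throw new RuntimeException("Query limits exceeded: " + label);
        }
        return true; // caller should stop work and return partial results
    }
}
```

Note this is the intended (fixed) shape: delegation runs from the `Supplier` overload to the `String` overload, so laziness is preserved, which is the inversion gus-asf mentions later in the thread.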
Re: [PR] SOLR-17151 - stop processing components once we have exceeded a query limit [solr]
gus-asf commented on code in PR #2403: URL: https://github.com/apache/solr/pull/2403#discussion_r1566112815

## solr/core/src/java/org/apache/solr/handler/component/SearchHandler.java:

@@ -621,46 +684,72 @@ public void handleRequestBody(SolrQueryRequest req, SolrQueryResponse rsp) throw
         rsp.addToLog(ThreadCpuTimer.CPU_TIME, totalShardCpuTime);
       }
     }
+  }

-    // SOLR-5550: still provide shards.info if requested even for a short circuited distrib request
-    if (!rb.isDistrib
-        && req.getParams().getBool(ShardParams.SHARDS_INFO, false)
-        && rb.shortCircuitedURL != null) {
-      NamedList shardInfo = new SimpleOrderedMap<>();
-      SimpleOrderedMap nl = new SimpleOrderedMap<>();
-      if (rsp.getException() != null) {
-        Throwable cause = rsp.getException();
-        if (cause instanceof SolrServerException) {
-          cause = ((SolrServerException) cause).getRootCause();
-        } else {
-          if (cause.getCause() != null) {
-            cause = cause.getCause();
-          }
-        }
-        nl.add("error", cause.toString());
-        if (!core.getCoreContainer().hideStackTrace()) {
-          StringWriter trace = new StringWriter();
-          cause.printStackTrace(new PrintWriter(trace));
-          nl.add("trace", trace.toString());
-        }
-      } else if (rb.getResults() != null) {
-        nl.add("numFound", rb.getResults().docList.matches());
-        nl.add(
-            "numFoundExact",
-            rb.getResults().docList.hitCountRelation() == TotalHits.Relation.EQUAL_TO);
-        nl.add("maxScore", rb.getResults().docList.maxScore());
-      }
-      nl.add("shardAddress", rb.shortCircuitedURL);
-      nl.add("time", req.getRequestTimer().getTime()); // elapsed time of this request so far
+  private static String stageInEngilish(int nextStage) {
+    // This should probably be a enum, but that change should be its own ticket.
+    //    public static int STAGE_START = 0;
+    //
+    //    public static int STAGE_PARSE_QUERY = 1000;
+    //    public static int STAGE_TOP_GROUPS = 1500;
+    //    public static int STAGE_EXECUTE_QUERY = 2000;
+    //    public static int STAGE_GET_FIELDS = 3000;
+    //    public static int STAGE_DONE = Integer.MAX_VALUE;
+    switch (nextStage) {
+      case STAGE_START:
+        return "START";
+      case STAGE_PARSE_QUERY:
+        return "PARSE_QUERY";
+      case STAGE_TOP_GROUPS:
+        return "TOP_GROUPS";
+      case STAGE_EXECUTE_QUERY:
+        return "EXECUTE_QUERY";
+      case STAGE_GET_FIELDS:
+        return "GET_FIELDS";
+        // nobody wants to think it was DONE and canceled after it completed...
+      case STAGE_DONE:
+        return "FINISHING";
+      default:
+        throw new SolrException(
+            SolrException.ErrorCode.SERVER_ERROR, "Unrecognized stage:" + nextStage);
+    }
+  }

-      int pos = rb.shortCircuitedURL.indexOf("://");
-      String shardInfoName =
-          pos != -1 ? rb.shortCircuitedURL.substring(pos + 3) : rb.shortCircuitedURL;
-      shardInfo.add(shardInfoName, nl);
-      rsp.getValues().add(ShardParams.SHARDS_INFO, shardInfo);
+  private static void shortCircuitedResults(SolrQueryRequest req, ResponseBuilder rb) {
+
+    if (rb.rsp.getResponse() == null) {
+      rb.rsp.addResponse(new SolrDocumentList());
+
+      // If a cursorMark was passed, and we didn't progress, set
+      // the nextCursorMark to the same position
+      String cursorStr = rb.req.getParams().get(CursorMarkParams.CURSOR_MARK_PARAM);
+      if (null != cursorStr) {
+        rb.rsp.add(CursorMarkParams.CURSOR_MARK_NEXT, cursorStr);
+      }
+    }
+    if (rb.isDebug()) {
+      NamedList debug = new NamedList<>();
+      debug.add("explain", new NamedList<>());
+      rb.rsp.add("debug", debug);
     }
   }

+  private static boolean checkLimitsBefore(
+      SearchComponent c,
+      String when,

Review Comment: stage is what we use for the finishing steps... maybe phase?
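The in-code TODO quoted above ("This should probably be a enum") might eventually look something like the following hypothetical sketch, which keeps the legacy int stage codes while giving each stage a readable label, and reports STAGE_DONE as FINISHING for the reason noted in the original switch.

```java
// Hypothetical sketch of the suggested enum; the int codes mirror the
// STAGE_* constants quoted in the review context. Not the actual Solr code.
public enum Stage {
    START(0),
    PARSE_QUERY(1000),
    TOP_GROUPS(1500),
    EXECUTE_QUERY(2000),
    GET_FIELDS(3000),
    DONE(Integer.MAX_VALUE);

    private final int code;

    Stage(int code) {
        this.code = code;
    }

    /** Resolve a legacy int stage code to its enum constant, or throw on unknown codes. */
    public static Stage fromCode(int code) {
        for (Stage s : values()) {
            if (s.code == code) {
                return s;
            }
        }
        throw new IllegalArgumentException("Unrecognized stage: " + code);
    }

    /** Human-readable label; DONE reads as FINISHING so a cancelled request is not reported as complete. */
    public String label() {
        return this == DONE ? "FINISHING" : name();
    }
}
```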
Re: [PR] SOLR-17151 - stop processing components once we have exceeded a query limit [solr]
gus-asf commented on code in PR #2403: URL: https://github.com/apache/solr/pull/2403#discussion_r1566094054

## solr/core/src/java/org/apache/solr/handler/component/SearchHandler.java:

@@ -621,46 +684,72 @@ public void handleRequestBody(SolrQueryRequest req, SolrQueryResponse rsp) throw
         rsp.addToLog(ThreadCpuTimer.CPU_TIME, totalShardCpuTime);
       }
     }
+  }

-    // SOLR-5550: still provide shards.info if requested even for a short circuited distrib request
-    if (!rb.isDistrib
-        && req.getParams().getBool(ShardParams.SHARDS_INFO, false)
-        && rb.shortCircuitedURL != null) {
-      NamedList shardInfo = new SimpleOrderedMap<>();
-      SimpleOrderedMap nl = new SimpleOrderedMap<>();
-      if (rsp.getException() != null) {
-        Throwable cause = rsp.getException();
-        if (cause instanceof SolrServerException) {
-          cause = ((SolrServerException) cause).getRootCause();
-        } else {
-          if (cause.getCause() != null) {
-            cause = cause.getCause();
-          }
-        }
-        nl.add("error", cause.toString());
-        if (!core.getCoreContainer().hideStackTrace()) {
-          StringWriter trace = new StringWriter();
-          cause.printStackTrace(new PrintWriter(trace));
-          nl.add("trace", trace.toString());
-        }
-      } else if (rb.getResults() != null) {
-        nl.add("numFound", rb.getResults().docList.matches());
-        nl.add(
-            "numFoundExact",
-            rb.getResults().docList.hitCountRelation() == TotalHits.Relation.EQUAL_TO);
-        nl.add("maxScore", rb.getResults().docList.maxScore());
-      }
-      nl.add("shardAddress", rb.shortCircuitedURL);
-      nl.add("time", req.getRequestTimer().getTime()); // elapsed time of this request so far
+  private static String stageInEngilish(int nextStage) {
+    // This should probably be a enum, but that change should be its own ticket.
+    //    public static int STAGE_START = 0;
+    //
+    //    public static int STAGE_PARSE_QUERY = 1000;
+    //    public static int STAGE_TOP_GROUPS = 1500;
+    //    public static int STAGE_EXECUTE_QUERY = 2000;
+    //    public static int STAGE_GET_FIELDS = 3000;
+    //    public static int STAGE_DONE = Integer.MAX_VALUE;
+    switch (nextStage) {
+      case STAGE_START:
+        return "START";
+      case STAGE_PARSE_QUERY:
+        return "PARSE_QUERY";
+      case STAGE_TOP_GROUPS:
+        return "TOP_GROUPS";
+      case STAGE_EXECUTE_QUERY:
+        return "EXECUTE_QUERY";
+      case STAGE_GET_FIELDS:
+        return "GET_FIELDS";
+        // nobody wants to think it was DONE and canceled after it completed...
+      case STAGE_DONE:
+        return "FINISHING";
+      default:
+        throw new SolrException(
+            SolrException.ErrorCode.SERVER_ERROR, "Unrecognized stage:" + nextStage);
+    }
+  }

-      int pos = rb.shortCircuitedURL.indexOf("://");
-      String shardInfoName =
-          pos != -1 ? rb.shortCircuitedURL.substring(pos + 3) : rb.shortCircuitedURL;
-      shardInfo.add(shardInfoName, nl);
-      rsp.getValues().add(ShardParams.SHARDS_INFO, shardInfo);
+  private static void shortCircuitedResults(SolrQueryRequest req, ResponseBuilder rb) {
+
+    if (rb.rsp.getResponse() == null) {
+      rb.rsp.addResponse(new SolrDocumentList());
+
+      // If a cursorMark was passed, and we didn't progress, set
+      // the nextCursorMark to the same position
+      String cursorStr = rb.req.getParams().get(CursorMarkParams.CURSOR_MARK_PARAM);
+      if (null != cursorStr) {
+        rb.rsp.add(CursorMarkParams.CURSOR_MARK_NEXT, cursorStr);
+      }
+    }
+    if (rb.isDebug()) {
+      NamedList debug = new NamedList<>();
+      debug.add("explain", new NamedList<>());
+      rb.rsp.add("debug", debug);
    }
  }

+  private static boolean checkLimitsBefore(
+      SearchComponent c,
+      String when,
+      SolrQueryRequest req,
+      SolrQueryResponse resp,
+      List components) {
+
+    return getQueryLimits(req, resp)
+        .maybeExitWithPartialResults(
+            () -> {

Review Comment: oops, meant to do it the other way around so that the lambda is only evaluated if we're busting the limit. Will fix.
Re: [PR] SOLR-17151 - stop processing components once we have exceeded a query limit [solr]
gus-asf commented on code in PR #2403: URL: https://github.com/apache/solr/pull/2403#discussion_r1566090370

## solr/core/src/java/org/apache/solr/search/QueryLimits.java:

@@ -108,12 +110,21 @@ public String formatExceptionMessage(String label) {
    * @throws QueryLimitsExceededException if {@link CommonParams#PARTIAL_RESULTS} request parameter
    *     is false and limits have been reached.
    */
+  public boolean maybeExitWithPartialResults(Supplier<String> label)
+      throws QueryLimitsExceededException {
+    return maybeExitWithPartialResults(label.get());
+  }
+
   public boolean maybeExitWithPartialResults(String label) throws QueryLimitsExceededException {
     if (isLimitsEnabled() && shouldExit()) {
       if (allowPartialResults) {
         if (rsp != null) {
           rsp.setPartialResults();
-          rsp.addPartialResponseDetail(formatExceptionMessage(label));
+          if (rsp.getResponseHeader().get(RESPONSE_HEADER_PARTIAL_RESULTS_DETAILS_KEY) == null) {

Review Comment: Well, typically before I did this we would get the initial limit violation and then at least one component violation, which isn't really interesting because the first one is where we really stopped. Also, there are tests that check the contents of partialResultsDetail using `JSONTestUtil.match()`, which doesn't seem to have a syntax for handling duplicate keys (or if it does, I can't figure out what it is). And if someone out there has been using a Jackson ObjectMapper or similar to turn the header into an object (or JSON.parse() in JavaScript, where they'll lose one anyway), things could go sideways if they suddenly encounter either a duplicate key or an array there... I can provide a rant about the fact that we allow/use duplicate keys in the first place, and how we should be referring to our APIs as noggit APIs rather than JSON APIs because we accept illegal JSON... etc... If you're interested :)
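gus-asf's guard can be illustrated with a minimal, hypothetical multimap-style header standing in for Solr's NamedList (where repeated keys are legal): only the first detail is recorded, so strict JSON clients never see a duplicate partialResultsDetails key. The class and key name below are illustrative, not Solr's actual API.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Minimal sketch of a multimap-style response header plus the first-writer-wins
// guard under discussion: later limit violations are dropped because the first
// one is where processing actually stopped.
public class PartialResultsHeader {
    static final String DETAILS_KEY = "partialResultsDetails";
    private final List<Map.Entry<String, Object>> entries = new ArrayList<>();

    /** First value stored under the key, or null if absent (NamedList-like lookup). */
    public Object get(String key) {
        for (Map.Entry<String, Object> e : entries) {
            if (e.getKey().equals(key)) {
                return e.getValue();
            }
        }
        return null;
    }

    /** Records the detail only if none is present yet, avoiding duplicate keys. */
    public void addPartialResponseDetail(String detail) {
        if (get(DETAILS_KEY) == null) {
            entries.add(new SimpleEntry<>(DETAILS_KEY, detail));
        }
    }

    public int size() {
        return entries.size();
    }
}
```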
Re: [PR] SOLR-17151 - stop processing components once we have exceeded a query limit [solr]
sigram commented on code in PR #2403: URL: https://github.com/apache/solr/pull/2403#discussion_r1566013026

## solr/core/src/java/org/apache/solr/handler/component/SearchHandler.java:

@@ -621,46 +684,72 @@ public void handleRequestBody(SolrQueryRequest req, SolrQueryResponse rsp) throw
         rsp.addToLog(ThreadCpuTimer.CPU_TIME, totalShardCpuTime);
       }
     }
+  }

-    // SOLR-5550: still provide shards.info if requested even for a short circuited distrib request
-    if (!rb.isDistrib
-        && req.getParams().getBool(ShardParams.SHARDS_INFO, false)
-        && rb.shortCircuitedURL != null) {
-      NamedList shardInfo = new SimpleOrderedMap<>();
-      SimpleOrderedMap nl = new SimpleOrderedMap<>();
-      if (rsp.getException() != null) {
-        Throwable cause = rsp.getException();
-        if (cause instanceof SolrServerException) {
-          cause = ((SolrServerException) cause).getRootCause();
-        } else {
-          if (cause.getCause() != null) {
-            cause = cause.getCause();
-          }
-        }
-        nl.add("error", cause.toString());
-        if (!core.getCoreContainer().hideStackTrace()) {
-          StringWriter trace = new StringWriter();
-          cause.printStackTrace(new PrintWriter(trace));
-          nl.add("trace", trace.toString());
-        }
-      } else if (rb.getResults() != null) {
-        nl.add("numFound", rb.getResults().docList.matches());
-        nl.add(
-            "numFoundExact",
-            rb.getResults().docList.hitCountRelation() == TotalHits.Relation.EQUAL_TO);
-        nl.add("maxScore", rb.getResults().docList.maxScore());
-      }
-      nl.add("shardAddress", rb.shortCircuitedURL);
-      nl.add("time", req.getRequestTimer().getTime()); // elapsed time of this request so far
+  private static String stageInEngilish(int nextStage) {

Review Comment: Engilish -> English :)

## solr/core/src/java/org/apache/solr/search/QueryLimits.java:

@@ -108,12 +110,21 @@ public String formatExceptionMessage(String label) {
    * @throws QueryLimitsExceededException if {@link CommonParams#PARTIAL_RESULTS} request parameter
    *     is false and limits have been reached.
    */
+  public boolean maybeExitWithPartialResults(Supplier<String> label)
+      throws QueryLimitsExceededException {
+    return maybeExitWithPartialResults(label.get());
+  }
+
   public boolean maybeExitWithPartialResults(String label) throws QueryLimitsExceededException {
     if (isLimitsEnabled() && shouldExit()) {
       if (allowPartialResults) {
         if (rsp != null) {
           rsp.setPartialResults();
-          rsp.addPartialResponseDetail(formatExceptionMessage(label));
+          if (rsp.getResponseHeader().get(RESPONSE_HEADER_PARTIAL_RESULTS_DETAILS_KEY) == null) {

Review Comment: This loses information about the exceeded limits inside some components when we accept partial results (i.e. the exception is not thrown immediately). Why wouldn't duplicate keys be ok? NamedList is essentially a multi-map, so multiple entries with the same key are explicitly allowed.

## solr/core/src/test/org/apache/solr/core/ExitableDirectoryReaderTest.java:

@@ -70,9 +70,12 @@ public void testPrefixQuery() throws Exception {
     // this time we should get a query cache hit and hopefully no exception? this may change in the
     // future if time checks are put into other places.
-    assertJQ(req("q", q, "timeAllowed", "1", "sleep", sleep), assertionString);
+    assertJQ(req("q", q, "timeAllowed", "1", "sleep", sleep), assertionString);
+    // 2024-4-15: it did change..., and now this fails with 1 or 2 ms and passes with 3ms... I see
+    // no way this won't be terribly brittle. Maybe TestInjection of some sort to bring this back?

Review Comment: Definitely the answer :) please see how it's used in `TestQueryLimits`.
Re: [PR] SOLR-17151 - stop processing components once we have exceeded a query limit [solr]
gus-asf commented on PR #2403: URL: https://github.com/apache/solr/pull/2403#issuecomment-2057147058

Still needs unit test...