Re: Reporting multiple issues triggering an HTTP 500 in Solr
Among the 77 exceptions we described in the blog post [1] we have now created tickets for: - 24 different null pointer exceptions - 4 cast exceptions (one has been fixed in the mean time) - 3 array index out of bounds - 2 string index out of bounds Turned out that some of them could be merged into the same report (e.g. SOLR-13202), so this amounted to 28 different bug reports (3 of them in Lucene). Following Jan Høydahl's suggestion (thanks!) we have labelled all of them with 'diffblue' and 'newdev', so all issues can be found here: https://issues.apache.org/jira/issues/?jql=labels+%3D+diffblue Now, as described in the blog post [1] we still have a bunch of HTTP requests that produce an HTTP 500 for: - 19 NumberFormatException - 9 SolrException - 4 IllegalArgumentException - 4 IOException - 3 IllegalStateException - 2 UnsupportedOperationException - 1 RuntimeException - 1 org.noggit.JSONParser.ParserException I agree with Jan that they correspond to expected and wanted errors, but should they produce a 400 instead of a 500? In my view a service should never return an HTTP 500, even if you throw invalid or unusual requests at it. What people think? At any rate, does the community want the URLs that trigger those exceptions? If the fix is going to be catching all of them at the highest level in the servlet, then they are probably not necessary. But if you want to fix (some of) them case by case, and provide explanatory error responses for missing parameters, invalid data, etc. then these are useful, as you can use them to start a debugging session to understand where is the best to catch the exception. Also, please remark that the 'Environment' field of all reports we did contains instructions how to rebuild the films collections so that the reported problem are easy to debug. [1] https://www.diffblue.com/blog/2018/12/19/diffblue-microservice-testing-a-sneak-peek-at-our-early-product-and-results?utm_source=solr-br On Thu, Jan 31, 2019 at 12:37 PM Jan Høydahl wrote: > > Thanks for the reports, it will be good to get rid of NPEs even if the input > values are not likely to happen IRL. > Reading your blog post, at the bottom you have a table with various kinds of > exceptions triggering 500 responses. > Out of those, most are expected and wanted errors in my opinion, such as > NumberFormatException or IllegalArgumentException, > they clearly tell the client that their input is wrong. Also, SolrException > is in most cases thrown to explain what went wrong. > > But for NPE, ClassCastException, AIOOB and SIOOB exceptions are good > candidates for better input validation and fixing. > > BTW, here is a JIRA query that will list all your reported issues, in case > someone want to fix multiple of them in one single commit > https://issues.apache.org/jira/browse/SOLR-13180?jql=project%20%3D%20SOLR%20AND%20text%20~%20diffblue > > You may also consider adding a label=diffblue as well to display all issues > with a click, as well as label=newdev to signal that these are excellent > tasks to be done by new Solr/Lucene developers. > > Instead of attaching large home.zip archives to the issues, it would be much > more helpful with a few simple steps as {code} block in the JIRA to > reproduce, e.g. > > bin/solr start -c > bin/solr create -c films > curl -X POST -H 'Content-type:application/json' --data-binary '{"add-field": > {"name":"name", "type":"text_general", "multiValued":false, "stored":true}}' > http://localhost:8983/solr/films/schema > bin/post -c films example/films/films.json > curl "http://localhost:8983/solr/films/select?json=0"; > > Note that the last line there reproduces the exception. > This allows any developer to simply copy/paste those five lines to reproduce > :) > > -- > Jan Høydahl, search solution architect > Cominvent AS - www.cominvent.com > > 28. jan. 2019 kl. 17:06 skrev César Rodríguez : > > Thanks, we will do that. > > Just to be clear, we are talking about opening from 50 to 70 jira > tickets. We found 77 unique points in the source code where an > exception is thrown that causes an HTTP 500, but I'm guessing that > some of them will not be serious enough to be reported. > > We can provide patches for the two issues described below. We will do > our best to describe the probable cause of the error on each > individual report, but we won't be able to provide patches for most of > them. > > More information about this testing effort can be found here: > https://www.diffblue.com/blog/2018/12/19/diffblue-microservice-testing-a-sneak-peek-at-our-early-product-and-results > > On Mon, Jan 28, 2019 at 3:03 PM Mikhail Khludnev wrote: > > > Yes. Please create jiras and attach patches. Tests are highly appreciated. > > > On Mon, Jan 28, 2019 at 4:49 PM César Rodríguez > wrote: > > > Hi there, > > We analyzed the source code of Apache Solr and found a number of > issues that we would like to report. We configured Solr using the > films collection from the quick start tutorial
Re: Reporting multiple issues triggering an HTTP 500 in Solr
Thanks for the reports, it will be good to get rid of NPEs even if the input values are not likely to happen IRL. Reading your blog post, at the bottom you have a table with various kinds of exceptions triggering 500 responses. Out of those, most are expected and wanted errors in my opinion, such as NumberFormatException or IllegalArgumentException, they clearly tell the client that their input is wrong. Also, SolrException is in most cases thrown to explain what went wrong. But for NPE, ClassCastException, AIOOB and SIOOB exceptions are good candidates for better input validation and fixing. BTW, here is a JIRA query that will list all your reported issues, in case someone want to fix multiple of them in one single commit https://issues.apache.org/jira/browse/SOLR-13180?jql=project%20%3D%20SOLR%20AND%20text%20~%20diffblue You may also consider adding a label=diffblue as well to display all issues with a click, as well as label=newdev to signal that these are excellent tasks to be done by new Solr/Lucene developers. Instead of attaching large home.zip archives to the issues, it would be much more helpful with a few simple steps as {code} block in the JIRA to reproduce, e.g. bin/solr start -c bin/solr create -c films curl -X POST -H 'Content-type:application/json' --data-binary '{"add-field": {"name":"name", "type":"text_general", "multiValued":false, "stored":true}}' http://localhost:8983/solr/films/schema bin/post -c films example/films/films.json curl "http://localhost:8983/solr/films/select?json=0"; Note that the last line there reproduces the exception. This allows any developer to simply copy/paste those five lines to reproduce :) -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com > 28. jan. 2019 kl. 17:06 skrev César Rodríguez : > > Thanks, we will do that. > > Just to be clear, we are talking about opening from 50 to 70 jira > tickets. We found 77 unique points in the source code where an > exception is thrown that causes an HTTP 500, but I'm guessing that > some of them will not be serious enough to be reported. > > We can provide patches for the two issues described below. We will do > our best to describe the probable cause of the error on each > individual report, but we won't be able to provide patches for most of > them. > > More information about this testing effort can be found here: > https://www.diffblue.com/blog/2018/12/19/diffblue-microservice-testing-a-sneak-peek-at-our-early-product-and-results > > On Mon, Jan 28, 2019 at 3:03 PM Mikhail Khludnev wrote: >> >> Yes. Please create jiras and attach patches. Tests are highly appreciated. >> >> >> On Mon, Jan 28, 2019 at 4:49 PM César Rodríguez >> wrote: >>> >>> Hi there, >>> >>> We analyzed the source code of Apache Solr and found a number of >>> issues that we would like to report. We configured Solr using the >>> films collection from the quick start tutorial [2]. For every issue >>> that we found we can provide a request URL for which Solr returns an >>> HTTP 500 error (usually due to an uncaught exception). >>> >>> Please find below two example issues we found, together with an >>> explanation of the cause and a patch fixing the bug (attached patches >>> pass the unit tests). The issues we would like to report are similar >>> to those. We found them with help from Diffblue Microservice Testing >>> [1]. >>> >>> We would like to know: >>> * Should we open JIRA tickets for each one of them? >>> * We can provide a URL that triggers the problem, a stack trace, and >>> information about the server setup. What other information would you >>> like to see? >>> >>> I look forward to your response. >>> >>> Best regards, >>> César >>> >>> PS. If this question is not suitable for this list, please indicate >>> where to follow up. >>> >>> === Issue 1: Null pointer exception due to unhandled case in >>> ComplexPhraseQueryParser === >>> >>> Assume that we request the following URL: >>> >>> /solr/films/select?q={!complexphrase}genre:"-om*" >>> >>> Handling this query involves constructing a SpanQuery, which happens >>> in the rewrite method of ComplexPhraseQueryParser. In particular, the >>> expression is decomposed into a BooleanQuery, which has exactly one >>> clause, namely the negative clause -genre:”om*”. The rewrite method >>> then further transforms this into a SpanQuery; in this case, it goes >>> into the path that handles complex queries with both positive and >>> negative clauses. It extracts the subset of positive clauses - note >>> that this set of clauses is empty for this query. The positive clauses >>> are then combined into a SpanNearQuery (around line 340), which is >>> then used to build a SpanNotQuery. >>> >>> Further down the line, the field attribute of the SpanNearQuery is >>> accessed and used as an index into a TreeMap. But since we had an >>> empty set of positive clauses, the SpanNearQuery does not have its >>> field attribute set, so we get a null here - this
Re: Reporting multiple issues triggering an HTTP 500 in Solr
Thanks, we will do that. Just to be clear, we are talking about opening from 50 to 70 jira tickets. We found 77 unique points in the source code where an exception is thrown that causes an HTTP 500, but I'm guessing that some of them will not be serious enough to be reported. We can provide patches for the two issues described below. We will do our best to describe the probable cause of the error on each individual report, but we won't be able to provide patches for most of them. More information about this testing effort can be found here: https://www.diffblue.com/blog/2018/12/19/diffblue-microservice-testing-a-sneak-peek-at-our-early-product-and-results On Mon, Jan 28, 2019 at 3:03 PM Mikhail Khludnev wrote: > > Yes. Please create jiras and attach patches. Tests are highly appreciated. > > > On Mon, Jan 28, 2019 at 4:49 PM César Rodríguez > wrote: >> >> Hi there, >> >> We analyzed the source code of Apache Solr and found a number of >> issues that we would like to report. We configured Solr using the >> films collection from the quick start tutorial [2]. For every issue >> that we found we can provide a request URL for which Solr returns an >> HTTP 500 error (usually due to an uncaught exception). >> >> Please find below two example issues we found, together with an >> explanation of the cause and a patch fixing the bug (attached patches >> pass the unit tests). The issues we would like to report are similar >> to those. We found them with help from Diffblue Microservice Testing >> [1]. >> >> We would like to know: >> * Should we open JIRA tickets for each one of them? >> * We can provide a URL that triggers the problem, a stack trace, and >> information about the server setup. What other information would you >> like to see? >> >> I look forward to your response. >> >> Best regards, >> César >> >> PS. If this question is not suitable for this list, please indicate >> where to follow up. >> >> === Issue 1: Null pointer exception due to unhandled case in >> ComplexPhraseQueryParser === >> >> Assume that we request the following URL: >> >> /solr/films/select?q={!complexphrase}genre:"-om*" >> >> Handling this query involves constructing a SpanQuery, which happens >> in the rewrite method of ComplexPhraseQueryParser. In particular, the >> expression is decomposed into a BooleanQuery, which has exactly one >> clause, namely the negative clause -genre:”om*”. The rewrite method >> then further transforms this into a SpanQuery; in this case, it goes >> into the path that handles complex queries with both positive and >> negative clauses. It extracts the subset of positive clauses - note >> that this set of clauses is empty for this query. The positive clauses >> are then combined into a SpanNearQuery (around line 340), which is >> then used to build a SpanNotQuery. >> >> Further down the line, the field attribute of the SpanNearQuery is >> accessed and used as an index into a TreeMap. But since we had an >> empty set of positive clauses, the SpanNearQuery does not have its >> field attribute set, so we get a null here - this leads to an >> exception. >> >> A possible fix would be to detect the situation where we have an empty >> set of positive clauses and include a single synthetic clause that >> matches either everything or nothing. See attached file >> 0001-Fix-NullPointerException.patch. >> >> === Issue 2: StringIndexOutOfBoundsException when expanding macros === >> >> Assume that we request the following URL: >> >> /solr/films/select?a=${${b}} >> >> Parameter macro expansion [3] seems to take place in >> org.apache.solr.request.macro.MacroExpander._expand(String val). >> However, this method throws a StringIndexOutOfBoundsException for the >> URL above. From reading the code it seems that macros are not expanded >> inside curly brackets ${...}, and so the “${b}” inside “${${b}}” >> should not be expanded. But the function seems to fail to detect this >> specific case and graciously refuse to expand it. Instead, it >> unexpectedly throws the exception. A possible fix could be updating >> the ‘idx’ variable when the StrParser detects that no valid identifier >> can be found inside the brackets. See attached file >> 0001-Macro-expander-fail-gracefully-on-unsupported-syntax.patch. >> >> References: >> [1] https://www.diffblue.com/labs/ >> [2] https://lucene.apache.org/solr/guide/7_6/solr-tutorial.html >> [3] http://yonik.com/solr-query-parameter-substitution/ >> >> -- >> Diffblue Limited, a company registered in England and Wales number >> 09958102, with its registered office at Ramsey House, 10 St. Ebbes Street, >> Oxford, OX1 1PT, England >> >> - >> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org >> For additional commands, e-mail: dev-h...@lucene.apache.org > > > > -- > Sincerely yours > Mikhail Khludnev > > Diffblue Limited, a company registered in England and Wales number 09958102, > with its registered office at Rams
Re: Reporting multiple issues triggering an HTTP 500 in Solr
Yes. Please create jiras and attach patches. Tests are highly appreciated. On Mon, Jan 28, 2019 at 4:49 PM César Rodríguez < cesar.rodrig...@diffblue.com> wrote: > Hi there, > > We analyzed the source code of Apache Solr and found a number of > issues that we would like to report. We configured Solr using the > films collection from the quick start tutorial [2]. For every issue > that we found we can provide a request URL for which Solr returns an > HTTP 500 error (usually due to an uncaught exception). > > Please find below two example issues we found, together with an > explanation of the cause and a patch fixing the bug (attached patches > pass the unit tests). The issues we would like to report are similar > to those. We found them with help from Diffblue Microservice Testing > [1]. > > We would like to know: > * Should we open JIRA tickets for each one of them? > * We can provide a URL that triggers the problem, a stack trace, and > information about the server setup. What other information would you > like to see? > > I look forward to your response. > > Best regards, > César > > PS. If this question is not suitable for this list, please indicate > where to follow up. > > === Issue 1: Null pointer exception due to unhandled case in > ComplexPhraseQueryParser === > > Assume that we request the following URL: > > /solr/films/select?q={!complexphrase}genre:"-om*" > > Handling this query involves constructing a SpanQuery, which happens > in the rewrite method of ComplexPhraseQueryParser. In particular, the > expression is decomposed into a BooleanQuery, which has exactly one > clause, namely the negative clause -genre:”om*”. The rewrite method > then further transforms this into a SpanQuery; in this case, it goes > into the path that handles complex queries with both positive and > negative clauses. It extracts the subset of positive clauses - note > that this set of clauses is empty for this query. The positive clauses > are then combined into a SpanNearQuery (around line 340), which is > then used to build a SpanNotQuery. > > Further down the line, the field attribute of the SpanNearQuery is > accessed and used as an index into a TreeMap. But since we had an > empty set of positive clauses, the SpanNearQuery does not have its > field attribute set, so we get a null here - this leads to an > exception. > > A possible fix would be to detect the situation where we have an empty > set of positive clauses and include a single synthetic clause that > matches either everything or nothing. See attached file > 0001-Fix-NullPointerException.patch. > > === Issue 2: StringIndexOutOfBoundsException when expanding macros === > > Assume that we request the following URL: > > /solr/films/select?a=${${b}} > > Parameter macro expansion [3] seems to take place in > org.apache.solr.request.macro.MacroExpander._expand(String val). > However, this method throws a StringIndexOutOfBoundsException for the > URL above. From reading the code it seems that macros are not expanded > inside curly brackets ${...}, and so the “${b}” inside “${${b}}” > should not be expanded. But the function seems to fail to detect this > specific case and graciously refuse to expand it. Instead, it > unexpectedly throws the exception. A possible fix could be updating > the ‘idx’ variable when the StrParser detects that no valid identifier > can be found inside the brackets. See attached file > 0001-Macro-expander-fail-gracefully-on-unsupported-syntax.patch. > > References: > [1] https://www.diffblue.com/labs/ > [2] https://lucene.apache.org/solr/guide/7_6/solr-tutorial.html > [3] http://yonik.com/solr-query-parameter-substitution/ > > -- > Diffblue Limited, a company registered in England and Wales number > 09958102, with its registered office at Ramsey House, 10 St. Ebbes Street, > Oxford, OX1 1PT, England > > - > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org > For additional commands, e-mail: dev-h...@lucene.apache.org -- Sincerely yours Mikhail Khludnev
Reporting multiple issues triggering an HTTP 500 in Solr
Hi there, We analyzed the source code of Apache Solr and found a number of issues that we would like to report. We configured Solr using the films collection from the quick start tutorial [2]. For every issue that we found we can provide a request URL for which Solr returns an HTTP 500 error (usually due to an uncaught exception). Please find below two example issues we found, together with an explanation of the cause and a patch fixing the bug (attached patches pass the unit tests). The issues we would like to report are similar to those. We found them with help from Diffblue Microservice Testing [1]. We would like to know: * Should we open JIRA tickets for each one of them? * We can provide a URL that triggers the problem, a stack trace, and information about the server setup. What other information would you like to see? I look forward to your response. Best regards, César PS. If this question is not suitable for this list, please indicate where to follow up. === Issue 1: Null pointer exception due to unhandled case in ComplexPhraseQueryParser === Assume that we request the following URL: /solr/films/select?q={!complexphrase}genre:"-om*" Handling this query involves constructing a SpanQuery, which happens in the rewrite method of ComplexPhraseQueryParser. In particular, the expression is decomposed into a BooleanQuery, which has exactly one clause, namely the negative clause -genre:”om*”. The rewrite method then further transforms this into a SpanQuery; in this case, it goes into the path that handles complex queries with both positive and negative clauses. It extracts the subset of positive clauses - note that this set of clauses is empty for this query. The positive clauses are then combined into a SpanNearQuery (around line 340), which is then used to build a SpanNotQuery. Further down the line, the field attribute of the SpanNearQuery is accessed and used as an index into a TreeMap. But since we had an empty set of positive clauses, the SpanNearQuery does not have its field attribute set, so we get a null here - this leads to an exception. A possible fix would be to detect the situation where we have an empty set of positive clauses and include a single synthetic clause that matches either everything or nothing. See attached file 0001-Fix-NullPointerException.patch. === Issue 2: StringIndexOutOfBoundsException when expanding macros === Assume that we request the following URL: /solr/films/select?a=${${b}} Parameter macro expansion [3] seems to take place in org.apache.solr.request.macro.MacroExpander._expand(String val). However, this method throws a StringIndexOutOfBoundsException for the URL above. From reading the code it seems that macros are not expanded inside curly brackets ${...}, and so the “${b}” inside “${${b}}” should not be expanded. But the function seems to fail to detect this specific case and graciously refuse to expand it. Instead, it unexpectedly throws the exception. A possible fix could be updating the ‘idx’ variable when the StrParser detects that no valid identifier can be found inside the brackets. See attached file 0001-Macro-expander-fail-gracefully-on-unsupported-syntax.patch. References: [1] https://www.diffblue.com/labs/ [2] https://lucene.apache.org/solr/guide/7_6/solr-tutorial.html [3] http://yonik.com/solr-query-parameter-substitution/ -- Diffblue Limited, a company registered in England and Wales number 09958102, with its registered office at Ramsey House, 10 St. Ebbes Street, Oxford, OX1 1PT, England From 5cf5371c1d6febf1a6423aa65aecde8be7eed77c Mon Sep 17 00:00:00 2001 From: Johannes Kloos Date: Thu, 24 Jan 2019 16:54:20 + Subject: [PATCH] Fix NullPointerException. --- .../src/java/org/apache/lucene/search/spans/SpanNearQuery.java | 2 ++ .../lucene/queryparser/complexPhrase/ComplexPhraseQueryParser.java | 7 ++- 2 files changed, 8 insertions(+), 1 deletion(-) diff --git a/lucene/core/src/java/org/apache/lucene/search/spans/SpanNearQuery.java b/lucene/core/src/java/org/apache/lucene/search/spans/SpanNearQuery.java index 17b9e51..a312ad2 100644 --- a/lucene/core/src/java/org/apache/lucene/search/spans/SpanNearQuery.java +++ b/lucene/core/src/java/org/apache/lucene/search/spans/SpanNearQuery.java @@ -130,6 +130,8 @@ public class SpanNearQuery extends SpanQuery implements Cloneable { * @param inOrder true if order is important */ public SpanNearQuery(SpanQuery[] clausesIn, int slop, boolean inOrder) { +if (clausesIn.length == 0) + throw new IllegalArgumentException("SpanNearQuery with no clauses"); this.clauses = new ArrayList<>(clausesIn.length); for (SpanQuery clause : clausesIn) { if (this.field == null) { // check field diff --git a/lucene/queryparser/src/java/org/apache/lucene/queryparser/complexPhrase/ComplexPhraseQueryParser.java b/lucene/queryparser/src/java/org/apache/lucene/queryparser/complexPhrase/ComplexPhraseQueryParser.java index ffe0066..574589f