OK, great. We have now moved from "identical setup breaks things in a bugfix version" to "strange behavior when a field does not exist". The "identical" part was actually throwing us off the trail.
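One way to catch a qf alias that names a non-existent field is to check every field in the alias against the fields the schema actually defines. Below is a minimal sketch in Python, assuming the schema field names have already been collected somehow (for example from the Schema API). The helper name `missing_qf_fields` and the `SOLR_TEST_FIELDS` set are hypothetical, and dynamic fields are not handled.

```python
def missing_qf_fields(qf, schema_fields):
    """Return the fields named in a qf string that are absent from the schema."""
    missing = []
    for token in qf.split():
        # Strip an optional boost suffix such as "title^2.0".
        name = token.split("^", 1)[0]
        if name not in schema_fields:
            missing.append(name)
    return missing


# Field list taken from the f.f1.qf alias quoted in this thread.
F1_QF = ("id pmid pmc source_id other_id doi manuscript_id "
         "publication_id secondary_ids")

# Hypothetical field set for solr-test, where secondary_ids was renamed
# to external_data. A real check would load these names from the schema.
SOLR_TEST_FIELDS = {
    "id", "pmid", "pmc", "source_id", "other_id", "doi",
    "manuscript_id", "publication_id", "external_data",
}

print(missing_qf_fields(F1_QF, SOLR_TEST_FIELDS))  # prints ['secondary_ids']
```

Running a check like this before sending the query would have flagged secondary_ids right away.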
And all this leads us to https://issues.apache.org/jira/browse/SOLR-5163, fixed in 8.0.

Hope it helps,
    Alex.

On Mon, 10 Jun 2019 at 09:19, Danilo Tomasoni <tomas...@cosbi.eu> wrote:
>
> Hello, I was able to reproduce this behaviour in an isolated environment,
> and performed some differential analysis between the two versions (which have
> different schemas, diff of schemas attached).
>
> With the schema of solr1, the query is parsed as +(+(....) +(....))
> while with the schema of solr-test, the same query is parsed as +((....)
> (....))
>
> The query is
>
> "q":"(f1:PUBMEDPMID12159614 AND (_query_:\"{!edismax
> qf='medline_chemical_terms medline_mesh_terms' q.op=OR mm=1 v=$subquery1}\"))"
>
> in solr1 and also in solr-test f1 equals
> "f.f1.qf":"id pmid pmc source_id other_id doi manuscript_id publication_id
> secondary_ids"}}
>
> And then I suddenly remembered that the field secondary_ids was renamed to
> external_data in solr-test (before the bulk import).
>
> So I changed the f1 definition, removing secondary_ids and adding
> external_data, and now the behaviour is the same!
>
> How is that possible? Why can the schema (and in this case a non-existing
> field) influence the behaviour of the query parser in such a profound way?
>
> I think that this is a subtle bug and an error should be raised instead of
> performing an unexpected query.
>
> Danilo Tomasoni
>
> Fondazione The Microsoft Research - University of Trento Centre for
> Computational and Systems Biology (COSBI)
> Piazza Manifattura 1, 38068 Rovereto (TN), Italy
> tomas...@cosbi.eu
> http://www.cosbi.eu
>
> As for the European General Data Protection Regulation 2016/679 on the
> protection of natural persons with regard to the processing of personal data,
> we inform you that all the data we possess are object of treatment in the
> respect of the normative provided for by the cited GDPR.
> It is your right to be informed on which of your data are used and how; you
> may ask for their correction, cancellation or you may oppose to their use by
> written request sent by recorded delivery to The Microsoft Research –
> University of Trento Centre for Computational and Systems Biology Scarl,
> Piazza Manifattura 1, 38068 Rovereto (TN), Italy.
> Please don't print this e-mail unless you really need to
>
> ________________________________________
> From: Alexandre Rafalovitch [arafa...@gmail.com]
> Sent: 10 June 2019 12:49
> To: solr-user
> Subject: [SPAM] Re: query parsed in different ways in two identical solr
> instances
>
> Were you able to simplify it to the simplest use case showing the issue? Or
> reproduce it on the stock Solr with stock example? Because otherwise, we
> would be just as stuck in a Jira as now. It is the same people helping....
>
> For example, is the _query_ part significant?
>
> Also, did you try running both queries with echoParams=all just to
> eliminate stray differences? I know you looked at the debug line, but
> perhaps this is worth a check too.
>
> Regards,
>     Alex
>
> On Mon, Jun 10, 2019, 5:46 AM Danilo Tomasoni, <tomas...@cosbi.eu> wrote:
>
> > Hello all,
> > maybe I should consider this as a bug and open an issue?
> >
> > Danilo Tomasoni
> >
> > ________________________________________
> > From: Danilo Tomasoni
> > Sent: 07 June 2019 11:47
> > To: solr-user@lucene.apache.org
> > Subject: RE: query parsed in different ways in two identical solr instances
> >
> > Any thoughts on that difference in the Solr parsing? Is it correct that
> > the first looks like an AND while the second looks like an OR?
> > Thank you
> >
> > Danilo Tomasoni
> >
> > ________________________________________
> > From: Danilo Tomasoni [tomas...@cosbi.eu]
> > Sent: 06 June 2019 16:21
> > To: solr-user@lucene.apache.org
> > Subject: RE: query parsed in different ways in two identical solr instances
> >
> > The two collections are not identical: many overlapping documents, but with
> > some different field names (test also has extra fields that 1 didn't have).
> > Actually we have 42.000.000 docs in solr1, and 40.000.000 in solr-test,
> > but I think this shouldn't be relevant because the query is basically like
> >
> > id=x AND mesh=list of phrase queries
> >
> > where the second part of the AND is handled through a nested query
> > (_query_ magic keyword).
> >
> > I expect that a query like this one would return 1 document (x) or 0
> > documents.
> >
> > The thing that puzzles me is that on solr1 the engine is returning 1
> > document (x)
> > while on test the engine is returning 68.000 documents..
> > If you look at my first e-mail you will notice that in the correct engine
> > the parsed query is like
> >
> > +(+(...) +(...))
> >
> > That is correct for an AND
> >
> > while in the test engine the query is parsed like
> >
> > +((...) (...))
> >
> > which is more like an OR...
> >
> > Danilo Tomasoni
> >
> > ________________________________________
> > From: Alexandre Rafalovitch [arafa...@gmail.com]
> > Sent: 06 June 2019 15:53
> > To: solr-user
> > Subject: Re: query parsed in different ways in two identical solr instances
> >
> > Those two queries look the same after sorting the parameters, yet the
> > results are clearly different. That means the difference is deeper.
> >
> > 1) Have you checked that both collections have the same amount of
> > documents (e.g. mismatched final commit)? Does basic "query=*:*"
> > return the same counts in the same initial order?
> > 2) Are you absolutely sure you are comparing 7.3.0 with 7.3.1? There
> > was SOLR-11501 that may be relevant, but it was fixed in 7.2:
> > https://issues.apache.org/jira/browse/SOLR-11501
> >
> > Regards,
> >    Alex.
> >
> > Are you absolutely sure that your instances are 7.3.0 and 7.3.1?
> >
> > On Thu, 6 Jun 2019 at 09:26, Danilo Tomasoni <tomas...@cosbi.eu> wrote:
> > >
> > > Hello, and thank you for your answer.
> > > Attached you will find the two logs for the working solr1 server, and
> > > the non-working solr-test server.
> > >
> > > Danilo Tomasoni
> > >
> > > ________________________________________
> > > From: Shawn Heisey [apa...@elyograg.org]
> > > Sent: 05 June 2019 17:52
> > > To: solr-user@lucene.apache.org
> > > Subject: Re: query parsed in different ways in two identical solr
> > > instances
> > >
> > > On 6/5/2019 8:41 AM, Danilo Tomasoni wrote:
> > > > Hello,
> > > > I have two solr instances with exactly the same configuration.
> > > > The only difference that I know of is that the first (the working one)
> > > > is solr 7.3.0, while the one that's not working is solr 7.3.1.
> > > >
> > > > If I execute the same query (with debugQuery=on) it gets parsed in
> > > > different ways on the two systems and I don't understand why.
> > >
> > > Look in solr.log. The full query, including parameters that are used
> > > but not on the URL, will be shown there. Provide that whole line from
> > > both versions.
> > >
> > > An example of the kind of line you need to find, with a very simple
> > > query, is below:
> > >
> > > 2019-06-05 15:50:23.691 INFO (qtp1264413185-43) [ x:foo]
> > > o.a.s.c.S.Request [foo] webapp=/solr path=/select
> > > params={q=*:*&_=1559749821933} hits=0 status=0 QTime=38
> > >
> > > If your index has multiple shards, there can be multiple lines. In that
> > > situation, we need the last one, which should be the main query itself
> > > rather than the subqueries.
> > >
> > > Thanks,
> > > Shawn
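
Comparing the request lines from solr.log on two servers, as suggested above, can be partly automated. The following is a minimal sketch in Python that pulls the params={...} section out of a request line and reports the parameters that differ. It assumes the line has the same shape as the example quoted above (a params={...} section followed by hits=). The helper names `request_params` and `diff_params` are hypothetical, and `LINE_B` is an invented variant used only for illustration.

```python
import re
from urllib.parse import parse_qs


def request_params(log_line):
    """Extract the params={...} section of a Solr request log line as a dict."""
    # Greedy match, so a "}" inside the query (e.g. {!edismax ...}) is kept.
    match = re.search(r"params=\{(.*)\}\s+hits=", log_line)
    if match is None:
        raise ValueError("no params={...} section found")
    return parse_qs(match.group(1), keep_blank_values=True)


def diff_params(a, b):
    """Return {name: (value_in_a, value_in_b)} for parameters that differ."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


# The example line quoted in this thread.
LINE_A = (
    "2019-06-05 15:50:23.691 INFO (qtp1264413185-43) [ x:foo] "
    "o.a.s.c.S.Request [foo] webapp=/solr path=/select "
    "params={q=*:*&_=1559749821933} hits=0 status=0 QTime=38"
)
# Invented second line: same request with an extra q.op parameter.
LINE_B = LINE_A.replace("q=*:*", "q=*:*&q.op=AND")

print(diff_params(request_params(LINE_A), request_params(LINE_B)))
# prints {'q.op': (None, ['AND'])}
```

An empty result means the two requests carried the same parameters, which points the investigation at the schema or configuration rather than the request itself.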