Ok, great.

We have now moved from "identical setup breaks things in a bugfix version"
to "strange behavior when a field does not exist". The "identical" part
was actually throwing us off the trail.

And all this leads us to
https://issues.apache.org/jira/browse/SOLR-5163, fixed in 8.0.

Hope it helps,
    Alex.
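
Until an upgrade is possible, a small pre-flight check on the alias
definition may help. The following is only a sketch: the helper name
missing_qf_fields and the field set for solr-test are made up for
illustration (the real list would come from your schema), and it ignores
dynamic fields entirely.

```python
def missing_qf_fields(qf, schema_fields):
    """Return the qf entries whose field name is not in schema_fields."""
    missing = []
    for entry in qf.split():
        name = entry.split("^", 1)[0]  # drop an optional boost such as id^2
        if name not in schema_fields:
            missing.append(name)
    return missing


# Hypothetical field list for solr-test, where secondary_ids was renamed
# to external_data. Replace with the fields from the real schema.
solr_test_fields = {
    "id", "pmid", "pmc", "source_id", "other_id", "doi",
    "manuscript_id", "publication_id", "external_data",
}

# The f.f1.qf value quoted in the message below.
f1_qf = ("id pmid pmc source_id other_id doi manuscript_id "
         "publication_id secondary_ids")

print(missing_qf_fields(f1_qf, solr_test_fields))  # ['secondary_ids']
```

Running something like this against each collection before comparing
parsed queries would flag the renamed field directly.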

On Mon, 10 Jun 2019 at 09:19, Danilo Tomasoni <tomas...@cosbi.eu> wrote:
>
> Hello, I was able to reproduce this behaviour in an isolated environment
> and performed some differential analysis between the two versions (which
> have different schemas; diff of schemas attached).
>
> With the schema of solr1, the query is parsed as +(+(....) +(....))
> while with the schema of solr-test, the same query is parsed as
> +((....) (....))
>
> The query is
>
> "q":"(f1:PUBMEDPMID12159614 AND (_query_:\"{!edismax qf='medline_chemical_terms medline_mesh_terms' q.op=OR mm=1 v=$subquery1}\"))"
>
> In both solr1 and solr-test, f1 is defined as
> "f.f1.qf":"id pmid pmc source_id other_id doi manuscript_id publication_id secondary_ids"}}
>
> And then I suddenly remembered that the field secondary_ids was renamed to
> external_data in solr-test (before the bulk import).
>
> So I changed the f1 definition, removing secondary_ids and adding
> external_data, and now the behaviour is the same!
>
> How is that possible? Why can the schema (and in this case a non-existing
> field) influence the behaviour of the query parser in such a profound way?
>
> I think that this is a subtle bug and an error should be raised instead of
> performing an unexpected query.
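
To make the difference between the two parsed forms concrete, here is a
toy model of how required (+) and optional clauses combine, as I
understand Lucene's Boolean semantics with no minimum-should-match set.
It is not Solr code; the documents, the terms and the matches helper are
invented for illustration.

```python
def matches(doc_terms, clauses):
    """Evaluate a flat Boolean query.

    clauses is a list of (required, term) pairs. If any clause is
    required, all required clauses must match and optional ones only
    affect scoring. If every clause is optional, at least one must match.
    """
    required = [term for req, term in clauses if req]
    optional = [term for req, term in clauses if not req]
    if required:
        return all(term in doc_terms for term in required)
    return any(term in doc_terms for term in optional)


# Hypothetical index: only document "x" has the requested id, but several
# documents share the MeSH term.
docs = {
    "x": {"id:x", "mesh:m"},
    "y": {"mesh:m"},
    "z": {"mesh:m"},
    "w": {"mesh:other"},
}

and_form = [(True, "id:x"), (True, "mesh:m")]    # like +(+(...) +(...))
or_form = [(False, "id:x"), (False, "mesh:m")]   # like +((...) (...))

print(sorted(d for d, t in docs.items() if matches(t, and_form)))  # ['x']
print(sorted(d for d, t in docs.items() if matches(t, or_form)))   # ['x', 'y', 'z']
```

The second form matches every document that satisfies either side, which
is consistent with one instance returning a single document and the
other returning tens of thousands.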
>
> Danilo Tomasoni
>
> Fondazione The Microsoft Research - University of Trento Centre for 
> Computational and Systems Biology (COSBI)
> Piazza Manifattura 1,  38068 Rovereto (TN), Italy
> tomas...@cosbi.eu
> http://www.cosbi.eu
>
> As for the European General Data Protection Regulation 2016/679 on the 
> protection of natural persons with regard to the processing of personal data, 
> we inform you that all the data we possess are object of treatment in the 
> respect of the normative provided for by the cited GDPR.
> It is your right to be informed on which of your data are used and how; you 
> may ask for their correction, cancellation or you may oppose to their use by 
> written request sent by recorded delivery to The Microsoft Research – 
> University of Trento Centre for Computational and Systems Biology Scarl, 
> Piazza Manifattura 1, 38068 Rovereto (TN), Italy.
> Please don't print this e-mail unless you really need to
>
> ________________________________________
> From: Alexandre Rafalovitch [arafa...@gmail.com]
> Sent: 10 June 2019 12:49
> To: solr-user
> Subject: [SPAM] Re: query parsed in different ways in two identical solr instances
>
> Were you able to simplify it to the simplest use case showing the issue? Or
> reproduce it on the stock Solr with stock example? Because otherwise, we
> would be just as stuck in a Jira as now. It is the same people helping....
>
> For example, is the _query_ part significant?
>
> Also, did you try running both queries with echoParams=all just to
> eliminate stray differences? I know you looked at the debug line, but
> perhaps this is worth a check too.
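
A rough sketch of that check follows. It only builds the request URLs
and compares two parameter dictionaries; the host names, the core name
"mycore" and the sample dictionaries are placeholders, and in practice
the dictionaries would be copied from the responseHeader.params section
of each response.

```python
from urllib.parse import urlencode


def select_url(base, q):
    """Build a /select URL that echoes all parameters and enables debug."""
    params = {"q": q, "echoParams": "all", "debugQuery": "on", "wt": "json"}
    return base + "/select?" + urlencode(params)


def param_diff(a, b):
    """Return {key: (value_in_a, value_in_b)} for every differing key."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


# Placeholder base URLs; substitute the real hosts and core names.
print(select_url("http://solr1:8983/solr/mycore", "f1:PUBMEDPMID12159614"))
print(select_url("http://solr-test:8983/solr/mycore", "f1:PUBMEDPMID12159614"))

# Hypothetical echoed parameters from the two instances.
solr1_params = {"q.op": "OR", "f.f1.qf": "id pmid secondary_ids"}
test_params = {"q.op": "OR", "f.f1.qf": "id pmid external_data"}
print(param_diff(solr1_params, test_params))
```

If the diff comes back empty, the request side is ruled out and the
schema or index is the next place to look.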
>
> Regards,
>     Alex
>
>
>
> On Mon, Jun 10, 2019, 5:46 AM Danilo Tomasoni, <tomas...@cosbi.eu> wrote:
>
> > Hello all,
> > maybe I should consider this as a bug and open an issue?
> >
> > Danilo Tomasoni
> >
> > ________________________________________
> > From: Danilo Tomasoni
> > Sent: 07 June 2019 11:47
> > To: solr-user@lucene.apache.org
> > Subject: RE: query parsed in different ways in two identical solr instances
> >
> > Any thoughts on that difference in the Solr parsing? Is it correct that
> > the first looks like an AND while the second looks like an OR?
> > Thank you
> >
> > Danilo Tomasoni
> >
> > ________________________________________
> > From: Danilo Tomasoni [tomas...@cosbi.eu]
> > Sent: 06 June 2019 16:21
> > To: solr-user@lucene.apache.org
> > Subject: RE: query parsed in different ways in two identical solr instances
> >
> > The two collections are not identical: they have many overlapping
> > documents, but some field names differ (test also has extra fields that
> > solr1 didn't have). Actually we have 42.000.000 docs in solr1 and
> > 40.000.000 in solr-test, but I think this shouldn't be relevant because
> > the query is basically like
> >
> > id=x AND mesh=list of phrase queries
> >
> > where the second part of the AND is handled through a nested query
> > (_query_ magic keyword).
> >
> > I expect that a query like this one would return 1 document (x) or 0
> > documents.
> >
> > The thing that puzzles me is that on solr1 the engine is returning 1
> > document (x)
> > while on test the engine is returning 68.000 documents.
> > If you look at my first e-mail you will notice that in the correct engine
> > the parsed query is like
> >
> > +(+(...) +(...))
> >
> > That is correct for an AND
> >
> > while in the test engine the query is parsed like
> >
> > +((...) (...))
> >
> > which is more like an OR...
> >
> >
> > Danilo Tomasoni
> >
> > ________________________________________
> > From: Alexandre Rafalovitch [arafa...@gmail.com]
> > Sent: 06 June 2019 15:53
> > To: solr-user
> > Subject: Re: query parsed in different ways in two identical solr instances
> >
> > Those two queries look the same after sorting the parameters, yet the
> > results are clearly different. That means the difference is deeper.
> >
> > 1) Have you checked that both collections have the same number of
> > documents (e.g. no mismatched final commit)? Does a basic "q=*:*"
> > query return the same counts in the same initial order?
> > 2) Are you absolutely sure you are comparing 7.3.0 with 7.3.1? There
> > was SOLR-11501 that may be relevant, but it was fixed in 7.2:
> > https://issues.apache.org/jira/browse/SOLR-11501
> >
> > Regards,
> >    Alex.
> >
> > Are you absolutely sure that your instances are 7.3.0 and 7.3.1?
> >
> > On Thu, 6 Jun 2019 at 09:26, Danilo Tomasoni <tomas...@cosbi.eu> wrote:
> > >
> > > Hello, and thank you for your answer.
> > > Attached you will find the two logs for the working solr1 server and
> > > the non-working solr-test server.
> > >
> > >
> > > Danilo Tomasoni
> > >
> > >
> > > ________________________________________
> > > From: Shawn Heisey [apa...@elyograg.org]
> > > Sent: 05 June 2019 17:52
> > > To: solr-user@lucene.apache.org
> > > Subject: Re: query parsed in different ways in two identical solr instances
> > >
> > > On 6/5/2019 8:41 AM, Danilo Tomasoni wrote:
> > > > Hello,
> > > > I have two Solr instances with exactly the same configuration.
> > > > The only difference that I know of is that the first (the working one) is
> > > > Solr 7.3.0, while the one that's not working is Solr 7.3.1.
> > > >
> > > > If I execute the same query (with debugQuery=on) it gets parsed in
> > > > different ways on the two systems and I don't understand why.
> > >
> > > Look in solr.log.  The full query, including parameters that are used
> > > but not on the URL, will be shown there.  Provide that whole line from
> > > both versions.
> > >
> > > An example of the kind of line you need to find, with a very simple
> > > query, is below:
> > >
> > > 2019-06-05 15:50:23.691 INFO  (qtp1264413185-43) [   x:foo]
> > > o.a.s.c.S.Request [foo]  webapp=/solr path=/select
> > > params={q=*:*&_=1559749821933} hits=0 status=0 QTime=38
> > >
> > > If your index has multiple shards, there can be multiple lines.  In that
> > > situation, we need the last one, which should be the main query itself
> > > rather than the subqueries.
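
For comparing such lines from two servers, a small parser can pull out
the params block and sort it. This is a sketch that assumes the line has
the same shape as the example above (params={...} followed by hits,
status and QTime); the regular expression and the parse_request_line
helper are my own, not part of Solr, and repeated parameters collapse to
their last value.

```python
import re
from urllib.parse import parse_qsl

# Assumed shape of a request log line, modelled on the example above.
LINE_RE = re.compile(
    r"params=\{(?P<params>.*)\} hits=(?P<hits>\d+) "
    r"status=(?P<status>\d+) QTime=(?P<qtime>\d+)"
)


def parse_request_line(line):
    """Return params (sorted by key), hits, status and QTime, or None."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    params = dict(parse_qsl(m.group("params"), keep_blank_values=True))
    return {
        "params": dict(sorted(params.items())),
        "hits": int(m.group("hits")),
        "status": int(m.group("status")),
        "qtime": int(m.group("qtime")),
    }


# The example line from the message above, joined back onto one line.
line = (
    "2019-06-05 15:50:23.691 INFO  (qtp1264413185-43) [   x:foo] "
    "o.a.s.c.S.Request [foo]  webapp=/solr path=/select "
    "params={q=*:*&_=1559749821933} hits=0 status=0 QTime=38"
)

print(parse_request_line(line))
```

Parsing the matching line from each server this way makes it easy to
compare the two parameter sets key by key.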
> > >
> > > Thanks,
> > > Shawn
> >
