Re: Search by similarity?

2017-09-19 Thread alessandro.benedetti
In addition to that, I still believe More Like This is a better option for you. The reason is that MLT is able to evaluate the interesting terms from your document (title is the only field of interest for you) and boost them accordingly. Regarding your "80% of similarity", this is trickier.
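A minimal sketch of what such a More Like This request could look like, assuming the MoreLikeThis search component is available on the /select handler. The host and collection are taken from elsewhere in this thread; the `id` field, the value `doc-1`, and the helper name `build_mlt_url` are hypothetical.

```python
from urllib.parse import urlencode


def build_mlt_url(base_url, source_query):
    """Build a request that asks Solr for documents similar to the one matched by source_query."""
    params = {
        "q": source_query,     # selects the source document (hypothetical id field)
        "mlt": "true",         # enable the MoreLikeThis component
        "mlt.fl": "title",     # only the title field is of interest here
        "mlt.mintf": 1,        # titles are short, so keep terms that occur once
        "mlt.mindf": 1,        # keep terms even if they appear in a single document
        "mlt.boost": "true",   # boost the query by interesting-term relevance
    }
    return base_url + "?" + urlencode(params)


url = build_mlt_url("http://localhost:8983/solr/SciLit/select", 'id:"doc-1"')
print(url)
```

The `mlt.mintf` and `mlt.mindf` values are a guess for short title fields and would need tuning against the real index.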

Re: Search by similarity?

2017-08-29 Thread Josh Lincoln
I reviewed the dismax docs and it doesn't support the fieldname:term portion of the Lucene syntax. To restrict a search to a field and use mm you can either A) use edismax exactly as you're currently trying to use dismax, or B) use dismax, with the following changes * remove the title: portion of the q
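Option A can be sketched as follows. This is only an illustration that reuses the host, collection, and query string quoted in this thread; the helper name `build_edismax_url` is hypothetical. Note that a literal `%` in `mm=99%` has to be percent-encoded in a URL, which `urlencode` takes care of.

```python
from urllib.parse import urlencode


def build_edismax_url(base_url, query, mm):
    """Build a select URL that uses edismax, so fieldname:term syntax is honoured in q."""
    params = {
        "defType": "edismax",
        "q": query,
        "mm": mm,          # urlencode turns "99%" into "99%25"
        "indent": "on",
        "debug": "query",  # shows the parsed query, as Erick suggested
    }
    return base_url + "?" + urlencode(params)


url = build_edismax_url(
    "http://localhost:8983/solr/SciLit/select",
    'title:"title-123123123-end"',
    "99%",
)
print(url)
```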

Re: Search by similarity?

2017-08-29 Thread Josh Lincoln
Darko, can you use edismax instead? When using dismax, Solr is parsing the title field as if it's a query term. E.g. the query seems to be interpreted as title "title-123123123-end" (note the lack of a colon)...which results in querying all your qf fields for both "title" and "title-123123123-end"

Re: Search by similarity?

2017-08-29 Thread Darko Todoric
Hi Erick, "debug":{ "rawquerystring":"title:\"title-123123123-end\"", "querystring":"title:\"title-123123123-end\"", "parsedquery":"(+(DisjunctionMaxQuery(((author_full:title)^7.0 | (abstract:titl)^2.0 | (title:titl)^3.0 | (keywords:titl)^5.0 | (authors:title)^4.0 | (doi:title:)^1.0)) Disjun

Re: Search by similarity?

2017-08-28 Thread Erick Erickson
What are the results of adding &debug=query to the URL? The parsed query will be especially illuminating. Best, Erick On Mon, Aug 28, 2017 at 4:37 AM, Emir Arnautovic wrote: > Hi Darko, > > The issue is the wrong expectations: title-1-end is parsed to 3 tokens > (guessing) and mm=99% of 3 tokens

Re: Search by similarity?

2017-08-28 Thread Emir Arnautovic
Hi Darko, The issue is a wrong expectation: title-1-end is parsed to 3 tokens (guessing) and mm=99% of 3 tokens is 2.97, which is rounded down to 2. Since all your documents have 'title' and 'end' tokens, all match. If you want to round up, you can use mm=-1% - that will result in zero (or
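The rounding described above can be sketched in a few lines. This is an illustration of the arithmetic only, assuming the percentage result is truncated toward zero and that a negative percentage is subtracted from the total clause count; the function name `min_should_match` is hypothetical and is not Solr's API.

```python
def min_should_match(optional_clauses: int, percent: int) -> int:
    """Return how many optional clauses must match for a percentage mm value."""
    calc = optional_clauses * percent / 100
    truncated = int(calc)  # int() truncates toward zero
    if calc < 0:
        # Negative percentage: that share of clauses may be missing.
        return optional_clauses + truncated
    return truncated


print(min_should_match(3, 99))  # 2.97 truncates to 2, so 'title' and 'end' suffice
print(min_should_match(3, -1))  # -0.03 truncates to 0, so all 3 clauses are required
```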

Re: Search by similarity?

2017-08-28 Thread Darko Todoric
Hm... I cannot get this DisMax to work on my Solr... In Solr I have documents with titles: - "title-1-end" - "title-2-end" - "title-3-end" - ... - ... - "title-312-end" and when I make the query "http://localhost:8983/solr/SciLit/select?defType=dismax&indent=on&mm=99%&q=title:"title-1231231

RE: Search by similarity?

2017-08-25 Thread Markus Jelsma
Yes, that is roughly how MLT works as well. You can also do a full OR-search on the terms using LuceneQParser. Markus -Original message- > From:Junte Zhang > Sent: Friday 25th August 2017 18:38 > To: solr-user@lucene.apache.org > Subject: RE: Search by similarity?

RE: Search by similarity?

2017-08-25 Thread Junte Zhang
If you already have the title of the document, then you could run that title as a new query against the whole index and exclude the source document from the results as a filter. You could use the DisMax query parser: https://cwiki.apache.org/confluence/display/solr/The+DisMax+Query+Parser And
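A rough sketch of that approach: the title is sent as a DisMax query restricted to the title field, and a negative filter query drops the source document. The unique key field name `id`, the value `doc-42`, and the helper name `build_similar_title_url` are assumptions for illustration; the host and collection come from elsewhere in this thread.

```python
from urllib.parse import urlencode


def build_similar_title_url(base_url, title, source_id, id_field="id"):
    """Query the index with a document's title and filter out that document itself."""
    params = {
        "defType": "dismax",
        "q": title,                           # the source document's title, used as free text
        "qf": "title",                        # search only the title field
        "fq": f'-{id_field}:"{source_id}"',   # negative filter excludes the source document
    }
    return base_url + "?" + urlencode(params)


url = build_similar_title_url(
    "http://localhost:8983/solr/SciLit/select",
    "title-1-end",
    "doc-42",
)
print(url)
```

The sketch does not escape quotes inside `source_id`, so it assumes simple identifiers.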