In addition to that, I still believe More Like This is a better option for
you.
The reason is that the MLT is able to evaluate the interesting terms from
your document (title is the only field of interest for you), and boost them
accordingly.
Regarding your "80% similarity" requirement, this is trickier.
I reviewed the dismax docs, and dismax doesn't support the fieldname:term
portion of the Lucene syntax.
To restrict a search to a field and use mm you can either
A) use edismax exactly as you're currently trying to use dismax
B) use dismax, with the following changes
* remove the title: portion of the q
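A rough sketch of the two options as request URLs, in Python. The host and core name come from the URL quoted elsewhere in this thread. Setting qf=title for the dismax variant is an assumption about how the field restriction would be expressed, and select_url is a hypothetical helper, not part of any Solr client.

```python
from urllib.parse import urlencode

# Host and core name as they appear in the thread; adjust for your setup.
BASE = "http://localhost:8983/solr/SciLit/select"


def select_url(def_type, q, mm, qf=None):
    """Build a /select URL (hypothetical helper, only assembles the string)."""
    params = {"defType": def_type, "q": q, "mm": mm}
    if qf is not None:
        # Assumption: with dismax the field restriction moves from q to qf.
        params["qf"] = qf
    return BASE + "?" + urlencode(params)


# Option A: edismax accepts the fieldname:term syntax inside q.
url_a = select_url("edismax", 'title:"title-1-end"', "99%")

# Option B: dismax, with the title: prefix removed from q.
url_b = select_url("dismax", '"title-1-end"', "99%", qf="title")

print(url_a)
print(url_b)
```

urlencode takes care of escaping the quotes, the colon and the % sign in mm, which are easy to get wrong when typing the URL by hand.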
Darko,
Can you use edismax instead?
When using dismax, Solr parses the field name "title" as if it were a query term.
E.g. the query seems to be interpreted as
title "title-123123123-end"
(note the lack of a colon)...which results in querying all your qf fields
for both "title" and "title-123123123-end"
Hi Erick,
"debug":{ "rawquerystring":"title:\"title-123123123-end\"",
"querystring":"title:\"title-123123123-end\"",
"parsedquery":"(+(DisjunctionMaxQuery(((author_full:title)^7.0 |
(abstract:titl)^2.0 | (title:titl)^3.0 | (keywords:titl)^5.0 |
(authors:title)^4.0 | (doi:title:)^1.0))
Disjun
What are the results of adding &debug=query to the URL? The parsed
query will be especially illuminating.
Best,
Erick
On Mon, Aug 28, 2017 at 4:37 AM, Emir Arnautovic
wrote:
> Hi Darko,
>
> The issue is the wrong expectations: title-1-end is parsed to 3 tokens
> (guessing) and mm=99% of 3 tokens
Hi Darko,
The issue is wrong expectations: title-1-end is parsed into 3 tokens
(guessing), and mm=99% of 3 tokens is 2.97, which is rounded down to 2.
Since all your documents have 'title' and 'end' tokens, all match. If
you want to round up, you can use mm=-1% - that will result in zero (or
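The rounding described here can be sketched in a few lines of Python. This is only a model of the documented percentage behaviour of mm (the computed number is rounded down; for a negative percentage it is rounded down before being subtracted from the total), not Solr's actual code, and min_should_match is a hypothetical name.

```python
def min_should_match(num_clauses, mm_percent):
    """Model of how a percentage mm maps to a required clause count.

    Positive percentage: required = floor(num_clauses * percent / 100).
    Negative percentage: that share of clauses may be missing; the share
    is truncated toward zero and then subtracted from the total.
    """
    calc = num_clauses * mm_percent / 100
    if calc < 0:
        return num_clauses + int(calc)  # int() truncates toward zero
    return int(calc)


print(min_should_match(3, 99))  # 2.97 is rounded down
print(min_should_match(3, -1))  # 0.03 clauses may be missing, rounded down
```

Under this model, mm=99% on three tokens requires only two of them, while mm=-1% allows zero missing clauses, so all three are required.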
Hm... I cannot make this DisMax query work on my Solr...
In Solr I have documents with these titles:
- "title-1-end"
- "title-2-end"
- "title-3-end"
- ...
- ...
- "title-312-end"
and when I run the query
"*http://localhost:8983/solr/SciLit/select?defType=dismax&indent=on&mm=99%&q=title:"title-1231231
Yes, that is roughly how MLT works as well. You can also do a full OR-search on
the terms using LuceneQParser.
Markus
-Original message-
> From:Junte Zhang
> Sent: Friday 25th August 2017 18:38
> To: solr-user@lucene.apache.org
> Subject: RE: Search by similarity?
If you already have the title of the document, then you could run that title as
a new query against the whole index and exclude the source document from the
results as a filter.
You could use the DisMax query parser:
https://cwiki.apache.org/confluence/display/solr/The+DisMax+Query+Parser
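A minimal sketch of that idea as request parameters, in Python. The uniqueKey field name "id", the example document id, and the use of mm as a rough stand-in for a similarity threshold are all assumptions for illustration. similar_title_params is a hypothetical helper.

```python
from urllib.parse import urlencode


def similar_title_params(title, source_id, mm="80%"):
    """Run a known title as a dismax query and filter out the source document."""
    return {
        "defType": "dismax",
        "q": title,                   # the source document's title as the query
        "qf": "title",                # search the title field only
        "mm": mm,                     # assumption: rough proxy for "80% similar"
        "fq": f'-id:"{source_id}"',   # assumption: uniqueKey field is named "id"
    }


params = similar_title_params("title-1-end", "doc-42")
print(urlencode(params))
```

The negative filter query only removes the source document from the result set; it does not affect how the remaining documents are scored.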
And