http://splainer.io/ from the gents at OpenSourceConnections is pretty good for this sort of thing, I find…
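For anyone wanting to try that, here is a minimal sketch (not anything from the thread itself) of building a Solr request URL that asks for score explanations. As far as I know Splainer works from a query URL of this kind, and Solr's own debugQuery=true output gives the same raw explain data. The host, the core name "mycore", the fl list and the pf value are assumptions for illustration only; edismax and the title boost of 8 are the suggestions quoted below.

```python
from urllib.parse import urlencode

# Hypothetical host and core name; substitute your own Solr endpoint.
SOLR_SELECT_URL = "http://localhost:8983/solr/mycore/select"


def build_debug_query_url(user_query, base_url=SOLR_SELECT_URL):
    """Return a Solr select URL that requests per-document score explanations."""
    params = {
        "q": user_query,
        "defType": "edismax",     # suggested in the thread in place of dismax
        "qf": "title^8 content",  # the 8x title boost suggested in the thread
        "pf": "title^8",          # assumed phrase boost; the value is illustrative
        "fl": "title,score",      # assumed field list; the field names come from the thread
        "debugQuery": "true",     # ask Solr to include its "explain" output
        "wt": "json",
    }
    # urlencode percent-encodes the values (space -> '+', '^' -> '%5E').
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_debug_query_url("connected vehicle"))
```

Comparing the explain entries for a title match and a content-only match should show which factor (field norms, term frequency, the boost itself) is deciding the order.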
Alan Woodward
www.flax.co.uk

> On 13 Jan 2017, at 16:35, Tom Chiverton <t...@extravision.com> wrote:
>
> Well, I've tried much larger values than 8, and it still doesn't seem to do
> the job ?
>
> For now, assume my users are searching for exact sub strings of a real title.
>
> Tom
>
>
> On 13/01/17 16:22, Walter Underwood wrote:
>> I use a boost of 8 for title with no boost on the content. Both Infoseek and
>> Inktomi settled on the 8X boost, getting there with completely different
>> methodologies.
>>
>> You might not want the title to completely trump the content. That causes
>> some odd anomalies. If someone searches for “ice age 2”, do you really want
>> every title with “2” to come before “ice age two”? Or a search for “steve
>> jobs” to return every article with “job” or “jobs” in the title first?
>>
>> Also, use “edismax”, not “dismax”. Dismax was obsolete in Solr 3.x, five
>> years ago.
>>
>> wunder
>> Walter Underwood
>> wun...@wunderwood.org
>> http://observer.wunderwood.org/ (my blog)
>>
>>
>>> On Jan 13, 2017, at 7:10 AM, Tom Chiverton <t...@extravision.com> wrote:
>>>
>>> I have a few hundred documents with title and content fields.
>>>
>>> I want a match in title to trump matches in content. If I search for
>>> "connected vehicle" then a news article that has that in the content
>>> shouldn't be ranked higher than the page with that in the title is
>>> essentially what I want.
>>>
>>> I have tried dismax with qf=title^2 as well as several other variants with
>>> the standard query parser (like q="title:"foo"^2 OR content:"foo") but
>>> documents without the search term in the title still come out before those
>>> with the term in the title when ordered by score.
>>>
>>> Is there something I am missing ?
>>>
>>> From the docs, something like q=title:"connected vehicle"^2 OR
>>> content:"connected vehicle" should have worked ? Even using ^100 didn't
>>> help.
>>>
>>> I tried with the dismax parser using
>>>
>>> "q": "Connected Vehicle",
>>> "defType": "dismax",
>>> "indent": "true",
>>> "qf": "title^2000 content",
>>> "pf": "pf=title^4000 content^2",
>>> "sort": "score desc",
>>> "wt": "json",
>>>
>>> but that was not better. if I remove content from pf/qf then documents seem
>>> to rank correctly.
>>> Example query and results (content omitted) : http://pastebin.com/5EhrRJP8
>>> with managed-schema http://pastebin.com/mdraWQWE
>>>
>>> --
>>> Tom Chiverton
>>> Lead Developer
>>> e: t...@extravision.com
>>> p: 0161 817 2922
>>> t: @extravision <http://www.twitter.com/extravision>
>>> w: www.extravision.com