Jay, thank you!

I haven't checked edismax yet, but it's on my agenda.
It seems to me that this will have to be handled in Invenio rather than in 
Solr, since "any field" search results (Invenio-based) need to be combined 
with fulltext results (Solr-based).

In any case, this approach needs to be generic, since we support both Solr 
and Xapian (and hopefully others in the future).
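
A minimal sketch of what such a backend-agnostic combination could look like, 
assuming each engine simply returns a mapping of record ID to relevance score. 
The function name combine_results, the fulltext_weight parameter and the 
example scores are hypothetical and not part of Invenio:

```python
def combine_results(metadata_hits, fulltext_hits, fulltext_weight=0.5):
    """Merge two {record_id: score} mappings into one ranked list of IDs.

    Hypothetical helper: the fulltext backend (Solr, Xapian, ...) is only
    expected to deliver a dict, so the merge itself stays engine-independent.
    """
    combined = dict(metadata_hits)
    for recid, score in fulltext_hits.items():
        # Fulltext matches are down-weighted relative to metadata matches.
        combined[recid] = combined.get(recid, 0.0) + fulltext_weight * score
    # Highest combined score first; ties are broken by record ID.
    return sorted(combined, key=lambda recid: (-combined[recid], recid))


# Made-up scores, for illustration only.
metadata_hits = {1: 2.0, 2: 1.0}   # e.g. from the Invenio "any field" index
fulltext_hits = {2: 3.0, 3: 1.0}   # e.g. from Solr or Xapian
ranked = combine_results(metadata_hits, fulltext_hits)
```

Setting fulltext_weight to 0 would reduce the ranking to metadata scores only, 
which is one way the behaviour could be made configurable.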

Cheers
Patrick

________________________________________
From: jluker.cfa.harvard....@gmail.com [jluker.cfa.harvard....@gmail.com] on 
behalf of Jay Luker [jlu...@cfa.harvard.edu]
Sent: Tuesday, December 04, 2012 3:25 PM
To: Patrick Oliver Glauner
Cc: Alexander Wagner; Johnny Mariéthoz; project-invenio-devel (Invenio 
developers mailing-list)
Subject: Re: Default full text search with SOLR

I haven't checked, but I assume from the conversation that you're
using the standard solr query parser (OK, just checked and yes, you
are).

Rather than creating an "all" field, have you considered switching to
the edismax query parser? It allows for easily configuring the default
search to search across different fields, and allows configuring
different boosts for different fields. In that way you could include
fulltext but have matches in that field count less towards the overall
relevance score.

For instance, instead of using copyField to create the "global"
bucket, you would use the qf parameter, like so:

...
<str name="defType">edismax</str>
<str name="qf">abstract^2 author^5 fulltext^.5 keyword^5 title^10</str>
...
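
The same two settings can also be passed as request parameters instead of 
being fixed in the handler defaults. A rough sketch of building such a request 
from Python, assuming the boosts above; the helper name edismax_params is made 
up for illustration, and the Solr URL is deliberately left out:

```python
from urllib.parse import urlencode


def edismax_params(user_query, boosts):
    """Build request parameters for an edismax query with per-field boosts.

    Hypothetical helper; "defType" and "qf" are the Solr parameter names
    used in the configuration snippet above.
    """
    # Produces e.g. "abstract^2 author^5 ..." (fields sorted for stable output).
    qf = " ".join(f"{field}^{boost}" for field, boost in sorted(boosts.items()))
    return {"q": user_query, "defType": "edismax", "qf": qf}


boosts = {"abstract": 2, "author": 5, "fulltext": 0.5, "keyword": 5, "title": 10}
params = edismax_params("Smith, John", boosts)
# URL-encoded form, to be appended to the select handler URL of your Solr core.
query_string = urlencode(params)
```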

--jay

On Tue, Dec 4, 2012 at 9:05 AM, Patrick Oliver Glauner
<patrick.oliver.glau...@cern.ch> wrote:
> Hi Alexander
>
> Thanks for your message! I absolutely agree that fulltext should not be
> included by default in the "all" index. Nevertheless, many people love a
> Google-like search and subconsciously expect fulltext to be queried
> automatically.
> The solution will be configurable. We will see exactly how it looks
> once I implement it.
>
> Regards
> Patrick
>
> ________________________________________
> From: Alexander Wagner [a.wag...@fz-juelich.de]
> Sent: Tuesday, December 04, 2012 2:57 PM
> To: Patrick Oliver Glauner
> Cc: Johnny Mariéthoz; project-invenio-devel (Invenio developers mailing-list)
> Subject: Re: Default full text search with SOLR
>
> On 04.12.2012 14:33, Patrick Oliver Glauner wrote:
>
> Hi!
>
>> Awesome! And a Google-like search field should also definitely query the 
>> fulltext index.
>
> I may mention that it is IMHO not advisable at all to include full texts
> _by default_ in the "all" index. You just get too much garbage in the results.
>
> Think of someone searching for the papers of Smith, John. Searching full
> text by default will give you all papers mentioning a "Smith, John"
> anywhere, be it as an author, in a citation within the text, as a mere
> sample name, what have you... Results will be bad enough if you just
> search metadata. Or think of stuff like "mentioned in the book of Smith"
> or "we handle x y z but not <your search term>".
>
> For some time a huge union catalogue did something like full text
> indexing of reviews only. From one day to the next you were not even
> able to find Gerthsens Physik anymore, because all discussions of physics
> books that compared book X with Gerthsen floated to the top of the list.
>
> You can check this easily in some of those commercially available
> "discovery" systems. You can discover a lot there, you just don't find
> your papers anymore ;>
>
> IMHO full text indices/searches are WAY overrated. But that's my
> personal opinion of course ;)
>
> --
>
> Kind regards,
>
> Alexander Wagner
> Subject Specialist
> Central Library
> 52425 Juelich
>
> mail : a.wag...@fz-juelich.de
> phone: +49 2461 61-1586
> Fax  : +49 2461 61-6103
> www.fz-juelich.de/zb/DE/zb-fi
>
>
> ------------------------------------------------------------------------------------------------
> Forschungszentrum Juelich GmbH
> 52425 Juelich
> Sitz der Gesellschaft: Juelich
> Eingetragen im Handelsregister des Amtsgerichts Dueren Nr. HR B 3498
> Vorsitzender des Aufsichtsrats: MinDir Dr. Karl Eugen Huthmacher
> Geschaeftsfuehrung: Prof. Dr. Achim Bachem (Vorsitzender),
> Karsten Beneke (stellv. Vorsitzender), Prof. Dr.-Ing. Harald Bolt,
> Prof. Dr. Sebastian M. Schmidt
> ------------------------------------------------------------------------------------------------



--
******************************************************
Jay Luker               Astrophysics Data System (ADS)
jlu...@cfa.harvard.edu  Center for Astrophysics
617-495-4588            60 Garden Street  MS 67
617-495-7356 fax        Cambridge, MA  02138
******************************************************
