On Sat, Mar 31, 2007 at 07:41:06PM +0200, Andreas Korth wrote:
>
> On Mar 31, 2007, at 5:36 PM, Jeff Mallatt wrote:
>
> > I'm getting some results that I don't understand from a search.
> >
> > index << {:uid => 'one', :title => 'Some Title', :content => 'my
> > first text'}
> > index << {:uid => 'two', :title => 'Some Title', :content => 'some
> > second content'}
> > index << {:uid => 'three', :title => 'Other Title', :content => 'my
> > third text'}
> >
> > query(index, 'title:"Some"')
> > query(index, 'title:"Title"')
> > query(index, 'uid:"two"')
>
> Nice one.
>
> When people don't understand search results, it usually has to do with
> stop words. The StandardAnalyzer, which parses documents and(!)
> queries, uses a list of stop words which are ignored. See
> Ferret::Analysis::FULL_ENGLISH_STOP_WORDS for a complete list of
> (English) stop words.
>
> In the case of "title:Some", "Some" is removed by the analyzer giving
> only "title:", i.e. an empty query which (surprisingly) matches all
> documents.
>
> However, the same should happen with "content:some" but this one
> returns only one document which leaves me completely puzzled. This
> just isn't consistent.
Adding the output of index.process_query to the script, I get:
Query 'content:"some"'...
processed to <title:content uid:content content:content>
Query 'title:"Some"'...
processed to <title:title uid:title content:title>
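So it seems the stop word is stripped first, then the query is
recognized as invalid, and the parser does its best to run it anyway:
it takes the remaining word that once was the field name and interprets
it as the query string, searched across all fields. That matches the
results: every title contains "Title", but only document 'two' has the
word "content" in its content.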
Setting handle_parse_errors to false turns this behaviour off and leads
to no results for the empty queries.
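
To make the sequence of steps concrete, here is a small toy model in plain
Ruby. It is not Ferret's parser, only a sketch of the behaviour inferred
from the process_query output above. The method name toy_process_query, the
shortened stop word list, and the choice to return an empty clause list in
the strict case are all assumptions made for illustration.

```ruby
# Toy model only: NOT Ferret's real query parser. It mimics the steps
# inferred from the process_query output quoted in this mail.

# Hypothetical, shortened stop word list (the real one is
# Ferret::Analysis::FULL_ENGLISH_STOP_WORDS).
TOY_STOP_WORDS = %w[some the a an].freeze

# The three fields used by the example index.
TOY_FIELDS = %w[title uid content].freeze

# Returns the list of "field:term" clauses the toy parser would search for.
def toy_process_query(query, handle_parse_errors: true)
  field, phrase = query.split(':', 2)

  # Analysis step: drop the quotes, downcase, and remove stop words.
  terms = phrase.to_s.delete('"').downcase.split
  terms = terms.reject { |term| TOY_STOP_WORDS.include?(term) }

  # Normal case: there is still something to look for in the named field.
  return terms.map { |term| "#{field}:#{term}" } unless terms.empty?

  # Only "field:" is left, which is not a valid query. In strict mode the
  # toy gives up and returns no clauses (assumption: nothing matches).
  return [] unless handle_parse_errors

  # Lenient mode: reinterpret the leftover field name as the search term
  # and look for it in every field.
  TOY_FIELDS.map { |f| "#{f}:#{field}" }
end

p toy_process_query('content:"some"')
p toy_process_query('title:"Some"', handle_parse_errors: false)
```

With the lenient default, the first call yields the same three clauses as
the processed query shown above; the second call yields no clauses at all.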
Jens
--
Jens Krämer