On Sat, Mar 31, 2007 at 07:41:06PM +0200, Andreas Korth wrote:
> 
> On Mar 31, 2007, at 5:36 PM, Jeff Mallatt wrote:
> 
> > I'm getting some results that I don't understand from a search.
> >
> > index << {:uid => 'one', :title => 'Some Title', :content => 'my  
> > first text'}
> > index << {:uid => 'two', :title => 'Some Title', :content => 'some  
> > second content'}
> > index << {:uid => 'three', :title => 'Other Title', :content => 'my  
> > third text'}
> >
> > query(index, 'title:"Some"')
> > query(index, 'title:"Title"')
> > query(index, 'uid:"two"')
> 
> Nice one.
> 
> When people don't understand search results, it's usually to do with  
> stop words. The StandardAnalyzer which parses documents and(!)  
> queries, uses a list of stop words which are ignored. See  
> Ferret::Analysis::FULL_ENGLISH_STOP_WORDS for a complete list of  
> (English) stop words.
> 
> In the case of "title:Some", "Some" is removed by the analyzer giving  
> only "title:", i.e. an empty query which (surprisingly) matches all  
> documents.
> 
> However, the same should happen with "content:some" but this one  
> returns only one document which leaves me completely puzzled. This  
> just isn't consistent.

Adding the output of index.process_query to the script, I get:

Query 'content:"some"'...
processed to <title:content uid:content content:content>
Query 'title:"Some"'...
processed to <title:title uid:title content:title>

So it seems the stop word is stripped first, then the query is
recognized as invalid, and the parser does its best to run it anyway:
it takes the remaining word that once was the field name and interprets
it as the query string, searched across all fields.

Setting handle_parse_errors to false turns this behaviour off and leads
to no results for the empty queries.
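
To make the sequence above concrete, here is a rough toy model in plain
Ruby. It is not Ferret's parser and makes no claim about its internals:
STOP_WORDS, FIELDS, DOCS, analyze, process_query and search are all
made-up names for illustration, and the handle_parse_errors keyword only
mimics the option of the same name. It assumes the fallback simply
re-reads the whole query string while ignoring punctuation.

```ruby
# Toy model only, NOT Ferret's implementation. All names are hypothetical.
STOP_WORDS = %w[my some the].freeze          # tiny stand-in for the real stop list
FIELDS     = %i[title uid content].freeze

DOCS = [
  { uid: 'one',   title: 'Some Title',  content: 'my first text' },
  { uid: 'two',   title: 'Some Title',  content: 'some second content' },
  { uid: 'three', title: 'Other Title', content: 'my third text' }
].freeze

# Lowercase, split into words, and drop stop words.
def analyze(text)
  text.downcase.scan(/\w+/).reject { |word| STOP_WORDS.include?(word) }
end

# Returns a hash of field => terms that the search should look for.
def process_query(query, handle_parse_errors: true)
  field, phrase = query.split(':', 2)
  terms = analyze(phrase.to_s)
  return { field.to_sym => terms } unless terms.empty?

  # Every term was a stop word, so the field query is empty.
  return {} unless handle_parse_errors

  # Assumed fallback: ignore punctuation, analyze the whole string again,
  # and look for whatever survives (the field name) in every field.
  leftover = analyze(query)
  return {} if leftover.empty?

  FIELDS.map { |f| [f, leftover] }.to_h
end

# Returns the uids of documents where any requested field holds all terms.
def search(query, handle_parse_errors: true)
  wanted = process_query(query, handle_parse_errors: handle_parse_errors)
  hits = DOCS.select do |doc|
    wanted.any? { |field, terms| (terms - analyze(doc[field])).empty? }
  end
  hits.map { |doc| doc[:uid] }
end
```

In this model, search('title:"Some"') ends up looking for the word
"title" in every field and so returns all three uids, while
search('content:"some"') looks for "content" and returns only 'two'.
With handle_parse_errors: false, both return an empty list.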


Jens

-- 
Jens Krämer
webit! Gesellschaft für neue Medien mbH
Schnorrstraße 76 | 01069 Dresden
Telefon +49 351 46766-0 | Telefax +49 351 46766-66
[EMAIL PROTECTED] | www.webit.de
 
Amtsgericht Dresden | HRB 15422
GF Sven Haubold, Hagen Malessa
_______________________________________________
Ferret-talk mailing list
[email protected]
http://rubyforge.org/mailman/listinfo/ferret-talk
