Hi Dave,

Excerpts from David Balmain's mail of 26 Sep 2006 (PDT):
> You need to downcase the term when you add it to a TermQuery. The
> StandardAnalyzer downcases all text so you need to do the same with
> any terms you add to any hand built queries.

Thanks for the response. Downcasing the string passed into the TermQuery
does, in fact, retrieve the document. BUT, I had indexed that field with
a WhiteSpaceAnalyzer with downcasing turned off, so the index should
have preserved case.

In fact, some experimentation shows:

> mid = "[EMAIL PROTECTED]"
> i = Ferret::Index::Index.new
> wsa = Ferret::Analysis::WhiteSpaceAnalyzer.new false
> wsa.token_stream(:message_id, mid).next
=> token["[EMAIL PROTECTED]":0:26:1]
> i.add_document({:message_id => mid}, wsa)
> i.search(Ferret::Search::TermQuery.new(:message_id, mid))
=> #<struct Ferret::Search::TopDocs total_hits=0, hits=[], max_score=0.0>
> i.search(Ferret::Search::TermQuery.new(:message_id, mid.downcase))
=> #<struct Ferret::Search::TopDocs total_hits=1, hits=[#<struct 
Ferret::Search::Hit doc=0, score=0.3068528175354>], max_score=0.3068528175354>

So it looks like WSA#token_stream does the right thing. Is it possible
that it's not actually being called at insertion time? Or am I
misunderstanding something?
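
To make the hypothesis concrete, here is a toy model in plain Ruby. It
is not Ferret code and says nothing about what Ferret actually does
internally: ToyIndex, its honor_override flag, and the message id are
all made up for illustration. It only shows that if a per-document
analyzer were ignored in favour of a downcasing default, the results
would look like the ones above.

```ruby
# Toy model only. None of this is Ferret's API; every name is hypothetical.
class ToyIndex
  # default_analyzer: a callable that turns text into an array of terms.
  def initialize(default_analyzer)
    @default_analyzer = default_analyzer
    @postings = Hash.new { |hash, term| hash[term] = [] }
    @doc_count = 0
  end

  # honor_override simulates whether the analyzer passed in with the
  # document is actually used at insertion time.
  def add_document(text, analyzer = nil, honor_override: true)
    chosen = honor_override && analyzer ? analyzer : @default_analyzer
    doc_id = @doc_count
    @doc_count += 1
    chosen.call(text).each { |term| @postings[term] << doc_id }
    doc_id
  end

  # Like a hand-built term query: the term is looked up exactly as given,
  # with no analysis applied to it.
  def term_query(term)
    @postings.fetch(term, [])
  end
end

whitespace  = ->(text) { text.split }           # preserves case
lowercasing = ->(text) { text.downcase.split }  # stand-in for a downcasing default

mid = "<Msg-ID.123@Example.COM>"  # made-up message id with mixed case

# Case 1: the per-document analyzer is silently ignored.
ignored = ToyIndex.new(lowercasing)
ignored.add_document(mid, whitespace, honor_override: false)
ignored_exact = ignored.term_query(mid)           # => []
ignored_down  = ignored.term_query(mid.downcase)  # => [0]

# Case 2: the per-document analyzer is honored.
honored = ToyIndex.new(lowercasing)
honored.add_document(mid, whitespace)
honored_exact = honored.term_query(mid)           # => [0]
honored_down  = honored.term_query(mid.downcase)  # => []
```

Case 1 matches what I'm seeing (only the downcased term hits), while
case 2 is what I expected to get.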

-- 
William <[EMAIL PROTECTED]>
_______________________________________________
Ferret-talk mailing list
[email protected]
http://rubyforge.org/mailman/listinfo/ferret-talk
