On 9/27/06, William Morgan <[EMAIL PROTECTED]> wrote:
> Hi Dave,
>
> Excerpts from David Balmain's mail of 24 Sep 2006 (PDT):
> > Did you rebuild the index? You'll need to do that before it makes any
> > difference.
>
> Yes, the original example now works---thanks! Unfortunately, I still see
> a lot of queries that return nothing in TermQuery form, but work fine in
> String form.
>
> For example:
>
> > (0..10).each do |j|
> > m = @i[j][:message_id]
> > n1 = @i.search(Ferret::Search::TermQuery.new(:message_id, m)).total_hits
> > n2 = @i.search("message_id:#{m}").total_hits
> > puts "#{m}: #{n1} #{n2}"
> > end
> [EMAIL PROTECTED]: 0 1
> [EMAIL PROTECTED]: 1 1
> [EMAIL PROTECTED]: 1 1
> [EMAIL PROTECTED]: 0 1
> [EMAIL PROTECTED]: 0 1
> [EMAIL PROTECTED]: 1 1
> [EMAIL PROTECTED]: 1 1
> [EMAIL PROTECTED]: 0 1
> [EMAIL PROTECTED]: 1 1
> [EMAIL PROTECTED]: 0 1
> [EMAIL PROTECTED]: 0 1
>
> Based on the first and third entries, I can't imagine this is a
> tokenization problem. What do you think?
>
> --
> William <[EMAIL PROTECTED]>
Hi William,

You need to downcase the term before you put it in a TermQuery. The
StandardAnalyzer downcases all text at index time, so you need to do the
same with any term you add to a hand-built query. The string form works
because the query parser runs the term through the analyzer for you.

One way to see what is going wrong is to run the term through the
analyzer yourself:

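require 'rubygems'
require 'ferret'
include Ferret::Analysis

EMAILS = [
  "[EMAIL PROTECTED]",
  "[EMAIL PROTECTED]",
  "[EMAIL PROTECTED]",
  "[EMAIL PROTECTED]",
  "[EMAIL PROTECTED]",
  "[EMAIL PROTECTED]",
  "[EMAIL PROTECTED]",
  "[EMAIL PROTECTED]",
  "[EMAIL PROTECTED]",
  "[EMAIL PROTECTED]",
  "[EMAIL PROTECTED]"
]

# Prints "true" when the first token the analyzer emits is identical to
# the original string, i.e. when the string can go into a TermQuery as-is.
a = StandardAnalyzer.new
EMAILS.each do |email|
  print email + ": "
  tz = a.token_stream(:field, email)
  token = tz.next  # nil if the analyzer emits no token at all
  puts(token ? email == token.text : false)
end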
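
To make the mismatch concrete without needing an index, here is a minimal sketch in plain Ruby. It is a toy model, not Ferret: ToyIndex, its methods, and the message id are all made up for illustration, and the "analyzer" does nothing but downcase.

```ruby
# Toy model of an index whose analyzer downcases terms.
# NOT Ferret: ToyIndex and the sample message id are hypothetical.
class ToyIndex
  def initialize
    @terms = Hash.new { |h, k| h[k] = [] }
  end

  # Indexing runs the value through the "analyzer" before storing it.
  def add(doc_id, value)
    @terms[analyze(value)] << doc_id
  end

  # Like a hand-built TermQuery: the term is looked up exactly as given.
  def term_hits(term)
    @terms.fetch(term, []).size
  end

  # Like the string form: the term is analyzed before the lookup.
  def parsed_hits(term)
    term_hits(analyze(term))
  end

  private

  # Stand-in for the analyzer; here it only downcases.
  def analyze(text)
    text.downcase
  end
end

index = ToyIndex.new
mixed_id = "ABC123@Example.COM"  # made-up id with upper-case letters
index.add(1, mixed_id)

raw_hits    = index.term_hits(mixed_id)           # exact lookup, original case
parsed_hits = index.parsed_hits(mixed_id)         # analyzed lookup
fixed_hits  = index.term_hits(mixed_id.downcase)  # exact lookup, downcased

puts "#{mixed_id}: #{raw_hits} #{parsed_hits} #{fixed_hits}"
```

In the loop from your mail, the equivalent change would be Ferret::Search::TermQuery.new(:message_id, m.downcase). If the analyzer check above still prints false for an id after you downcase it, the analyzer is probably splitting that id into more than one token, and a single TermQuery on the whole string won't match.
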

Hope that clears things up.

Cheers,
Dave
_______________________________________________
Ferret-talk mailing list
[email protected]
http://rubyforge.org/mailman/listinfo/ferret-talk