[Fedora-commons-users] exact phrase matching in gsearch

Mike Korcynski Wed, 06 May 2009 11:41:39 -0700

Hi,

I'm trying to implement gsearch. (I was using 2.1 until today, but upgraded 
to 2.2 in hopes of solving my problem.) I have almost everything 
working the way I would expect at this point, but I am having 
problems with exact phrase searches. As an example, I have a document 
with many chunks of text like this:


chunk.12    rights regarding immigration. Unlike other Latin Americans, Puerto Ricans are US. citizens. The right
chunk.2     Latin School for collaborating with us, especially Maira Perez and Melissa Lee. They have

I do an exact phrase search for "Latin School" and both of these chunks 
are returned in the results. I would expect only chunk.2 to be returned, 
since that is the only chunk containing the exact phrase. The score for 
the returned document differs depending on whether I quote the phrase 
(0.062 unquoted vs. 0.03 quoted), but it seems like it's returning false 
positives: the chunk that contains only "Latin", and not the entire 
phrase "Latin School", still comes back.
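
To make the expected behaviour concrete, here is a minimal Python sketch of 
the difference between matching any term and matching an exact phrase. It 
is not GSearch or Lucene code: the tokenizer is only a rough stand-in for 
an analyzer, and the helper names (tokenize, matches_any_term, 
matches_phrase) are hypothetical, chosen just for this illustration.

```python
import re

def tokenize(text):
    # Rough stand-in for an analyzer (assumption): lowercase the text and
    # split it on anything that is not a letter or digit.
    return re.findall(r"[a-z0-9]+", text.lower())

def matches_any_term(text, query):
    # Term-style matching: a hit if ANY query term occurs in the text.
    tokens = set(tokenize(text))
    return any(term in tokens for term in tokenize(query))

def matches_phrase(text, query):
    # Phrase-style matching: a hit only if the query terms occur
    # consecutively and in the same order.
    tokens = tokenize(text)
    phrase = tokenize(query)
    n = len(phrase)
    return n > 0 and any(
        tokens[i:i + n] == phrase for i in range(len(tokens) - n + 1)
    )

# The two chunks quoted above.
chunks = {
    "chunk.12": "rights regarding immigration. Unlike other Latin Americans, "
                "Puerto Ricans are US. citizens. The right",
    "chunk.2": "Latin School for collaborating with us, especially Maira "
               "Perez and Melissa Lee. They have",
}

term_hits = sorted(k for k, v in chunks.items() if matches_any_term(v, "Latin School"))
phrase_hits = sorted(k for k, v in chunks.items() if matches_phrase(v, "Latin School"))

print("any-term hits:", term_hits)
print("phrase hits:  ", phrase_hits)
```

In this sketch, term-style matching hits both chunks because each contains 
"Latin", while phrase-style matching hits only chunk.2. The results I am 
seeing look like the first kind of matching even when the phrase is quoted.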

Chunks are stored as TOKENIZED fields:
<IndexField index="TOKENIZED" store="YES" termVector="YES">

I am using the StandardAnalyzer, and the chunk fields are included in 
DefaultQueryFields.

Has anyone encountered a similar problem, or does anyone know of a way 
around this?

Thanks,

Mike


_______________________________________________
Fedora-commons-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/fedora-commons-users
