I don't mind adding "an" to the list, but should we be concerned about any backwards compatibility issues with this change?

Erik


On May 13, 2004, at 2:05 PM, [EMAIL PROTECTED] wrote:


DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG
RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT
<http://issues.apache.org/bugzilla/show_bug.cgi?id=28960>.
ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND
INSERTED IN THE BUG DATABASE.

           Summary: Add "an" to the English stop words
           Product: Lucene
           Version: unspecified
          Platform: PC
        OS/Version: Windows NT/2K
            Status: NEW
          Severity: Minor
          Priority: Other
         Component: Analysis
        AssignedTo: [EMAIL PROTECTED]
        ReportedBy: [EMAIL PROTECTED]


In org.apache.lucene.analysis.StopAnalyzer, the ENGLISH_STOP_WORDS array
contains "a" but not "an". So searching for "a fund" will get the same hits as
"fund", but searching for "an investment" will get many more hits than "investment".
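
To illustrate the behaviour described above, here is a minimal sketch in plain
Java. It does not use Lucene's classes: the StopWordSketch class, the analyze
helper, and the two stop sets are hypothetical, and the sets are small
illustrative subsets rather than the real ENGLISH_STOP_WORDS array. It only
shows how a missing "an" leaves an extra term in the analyzed query, which is
presumably what produces the additional hits.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

class StopWordSketch {
    // Illustrative subsets only, NOT the full ENGLISH_STOP_WORDS array.
    static final Set<String> WITHOUT_AN =
            new HashSet<>(Arrays.asList("a", "and", "the"));
    static final Set<String> WITH_AN =
            new HashSet<>(Arrays.asList("a", "an", "and", "the"));

    // Lower-case the text, split on non-letters, and drop stop words.
    static List<String> analyze(String text, Set<String> stopWords) {
        List<String> terms = new ArrayList<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z]+")) {
            if (!token.isEmpty() && !stopWords.contains(token)) {
                terms.add(token);
            }
        }
        return terms;
    }

    public static void main(String[] args) {
        System.out.println(analyze("a fund", WITHOUT_AN));        // [fund]
        System.out.println(analyze("an investment", WITHOUT_AN)); // [an, investment]
        System.out.println(analyze("an investment", WITH_AN));    // [investment]
    }
}
```

With "an" absent from the stop set, the query keeps "an" as a search term
alongside "investment"; with "an" present, only "investment" remains.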


This is true in the latest revision of the file, but appears to have always been
the case. I'm amazed nobody's pointed it out before now; our users had only
been testing for a few hours before they complained about it :-)


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]




