It's common practice to remove "stop words"
(http://en.wikipedia.org/wiki/Stopwords) from queries, but also to
provide some syntax for exceptions. For example, there should be a way
to find Hamlet's soliloquy by searching for "to be or not to be". One
technique is to remove individual query terms that are stop words, but
to leave quoted phrases intact.
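
As a rough sketch of that technique in Python: the stop-word list, the
regular expression, and the name strip_stop_words are all made up for
illustration and are not taken from any particular search engine. It
only handles straight double quotes.

```python
import re

# Illustrative stop-word list only; each application would pick its own.
STOP_WORDS = {"a", "an", "and", "i", "or", "the", "to"}

# Matches either a double-quoted phrase or a run of non-space characters.
TOKEN = re.compile(r'"[^"]*"|\S+')


def strip_stop_words(query):
    """Drop bare stop words, but keep quoted phrases exactly as typed."""
    kept = []
    for token in TOKEN.findall(query):
        is_phrase = len(token) >= 2 and token[0] == '"' and token[-1] == '"'
        if is_phrase or token.lower() not in STOP_WORDS:
            kept.append(token)
    return kept
```

With this sketch, a query such as
the soliloquy "to be or not to be" and Hamlet
keeps the quoted phrase whole and drops only the bare "the" and "and".
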
Other common practices are to lower-case individual query terms, to
remove some or all punctuation, and to remove singular possessives
(trailing "'s"). But not every application will implement all of these
techniques: requirements vary.
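
For what it's worth, here is one possible way to normalize a single
query term along those lines. It is only a sketch: normalize_term is a
hypothetical helper, it strips punctuation only at the ends of the term,
and it knows only about ASCII punctuation.

```python
import string


def normalize_term(term):
    """Lower-case a term, trim surrounding punctuation, drop a trailing 's."""
    term = term.lower()
    # Remove leading and trailing ASCII punctuation; internal marks are kept.
    term = term.strip(string.punctuation)
    # Remove a singular possessive suffix, e.g. "hamlet's" becomes "hamlet".
    if term.endswith("'s"):
        term = term[:-2]
    return term
```

Whether to keep internal punctuation (as in "well-known") is exactly the
kind of requirement that varies from one application to the next.
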
-- Mike
Paul M wrote:
Do you pre-process your search queries, so that common words are removed, such
as (and, the, to, I, an, a, etc...)? Does this speed search results noticeably?
(Many fragments returned when common words are used as search terms, correct?)
thank you
------------------------------------------------------------------------
_______________________________________________
General mailing list
[email protected]
http://xqzone.com/mailman/listinfo/general