Re: [htdig3-dev] Characters like 'ďż˝' 'ďż˝' 'ďż˝' ...

Gilles Detillieux Tue, 14 Dec 1999 13:00:57 -0800
According to Jerome ALET:
> however what we need in France, and probably in Italy and Spain too, among
> others, is more complicated: it is a way to have htsearch treat accented
> characters just like the same letters without the accent, while dealing
> correctly with uppercase/lowercase.
> 
> the database should store characters exactly as they are in the html
> documents, but the search function must return OK in these cases: 
> 
>       * the user typed the word without any accent
>       * the user typed the word with some or all accents (if all then
> the user typed the word exactly as it is stored in the database)
>       * the user typed the word with wrong accents (yes)
>       * the user typed the word mixing lower and upper, possibly
> accented, letters.
>       * ... (Did I forget something ?)

Yes, you also want to allow matching of words with incorrect or missing
accents within the documents, even if the user typed the search words
with the correct accents.  Some authors are sloppy typists, but it doesn't
mean that what they're writing about isn't relevant to the searcher's
query.  :)

All of these requirements fit the mold of fuzzy matching.  In particular,
the requirement that "the database should store characters exactly as they
are in the html documents" can only be met by fuzzy matching.  None of the
hacks that have been posted so far meet that requirement.
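
To make the idea concrete, here is a minimal sketch in Python of what I
mean by fuzzy matching here. It is not ht://Dig code: the names fold,
FuzzyIndex, add and expand are all hypothetical, and it assumes that
stripping Unicode combining marks is an acceptable definition of "the same
letter without the accent". The words are stored untouched; only a lookup
key is folded, and a query is expanded to every stored spelling that
shares its key.

```python
import unicodedata
from collections import defaultdict


def fold(word):
    """Hypothetical folding key: lowercase, then drop combining accents."""
    # NFD splits an accented letter into base letter + combining mark(s).
    decomposed = unicodedata.normalize("NFD", word.lower())
    return "".join(c for c in decomposed if not unicodedata.combining(c))


class FuzzyIndex:
    """Maps a folded key to every spelling seen in the documents."""

    def __init__(self):
        self._by_key = defaultdict(set)

    def add(self, word):
        # The word itself is kept exactly as it appeared in the document.
        self._by_key[fold(word)].add(word)

    def expand(self, query):
        # All stored spellings that match the query once both are folded.
        return sorted(self._by_key.get(fold(query), set()))


index = FuzzyIndex()
for w in ["Élève", "eleve", "élevé", "école"]:
    index.add(w)

matches = index.expand("ELEVE")
```

With this arrangement the missing, wrong and mixed-case variants all land
on the same key, whether the sloppy typing is in the query or in the
document. One limitation of this particular folding: letters that have no
Unicode decomposition (ligatures such as "œ", for example) are left as
they are and would need an explicit mapping.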

-- 
Gilles R. Detillieux              E-mail: <[EMAIL PROTECTED]>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930
