Re: [htdig] A Suggestion on Accents

D. J. Adams Tue, 16 May 2000 01:11:10 -0700

> >Rather than a fuzzy accents search method, why not make the htdig database
> >accent independent?  After all, it is case independent already!
> >For example:
> >
> >Gar&ccedil;on  ->   Garçon   ->   garçon   ->   garcon
> 
> I would make the analogy to word suffixes rather than to case. There 
> is an endings fuzzy rather than a general stemming step during 
> indexing. IMHO, this makes searches a bit more precise because the 
> alternatives will get less weight than what the user actually 
> entered. (Remember the old maxim "the customer is always right?")
> 
> Besides, there are some situations where the unaccented word and the 
> accented word do *not* mean the same thing.

Yes, and when I search for 'garçon' am I looking for a waiter or a school boy?

> 
> (BTW, the 3.2 code isn't completely case independent. It stores a 
> flag when the word is capitalized. My feeling is that user queries 
> with capitals should return capitals preferentially.)
> 

Neat idea.

> All that said, it would be possible to patch the code in WordList.cc 
> and remove accents before storing the word.
> 

I'll take a look at the 3.1.5 code, but don't hold your breath.
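
For what it's worth, a rough sketch of the kind of folding step being
discussed might look like the following. It is not taken from WordList.cc or
any other ht://Dig source: the name fold_word is made up for illustration, and
it assumes words arrive as ISO-8859-1 (one byte per character) with HTML
entities such as &ccedil; already decoded.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper, NOT part of the ht://Dig sources.
// Assumption: `word` is ISO-8859-1 encoded, one byte per character.
// Maps accented Latin-1 letters to their unaccented ASCII base letter
// and lowercases the result, e.g. "Gar\xE7on" becomes "garcon".
std::string fold_word(const std::string& word)
{
    // Replacements for bytes 0xC0..0xFF. A '.' entry means "leave the
    // byte unchanged" (ligatures, eth, thorn, sharp s, and the
    // multiplication and division signs have no single-letter base).
    static const char table[] =
        "AAAAAA.CEEEEIIII.NOOOOO.OUUUUY.."   // 0xC0..0xDF
        "aaaaaa.ceeeeiiii.nooooo.ouuuuy.y";  // 0xE0..0xFF

    std::string out;
    out.reserve(word.size());
    for (std::string::size_type i = 0; i < word.size(); ++i) {
        unsigned char c = static_cast<unsigned char>(word[i]);
        if (c >= 0xC0 && table[c - 0xC0] != '.')
            c = static_cast<unsigned char>(table[c - 0xC0]);
        if (c < 0x80)  // only lowercase plain ASCII; avoids locale surprises
            c = static_cast<unsigned char>(std::tolower(c));
        out += static_cast<char>(c);
    }
    return out;
}
```

Calling this before a word is stored would make the database accent
independent, at the cost of losing the distinction between accented and
unaccented words mentioned above. Applying it only to the query, as one more
fuzzy alternative, would leave the stored words as they are.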

> --
> -Geoff Hutchison
> Williams Students Online
> http://wso.williams.edu/

-- 
 
David J Adams
<[EMAIL PROTECTED]>
Computing Services
University of Southampton

------------------------------------
To unsubscribe from the htdig mailing list, send a message to
[EMAIL PROTECTED]
You will receive a message to confirm this.