On Tue, 12 Feb 2002, Alois Treindl wrote:
> When I look into db.wordlist while it is generated by 'rundig',
> it mutilates all words containing an Umlaut or other accented
> character.
>
> It treats an accented character is a word-splitting character, instead
> of mapping it to a non accented equivalent.
>
> Example: German word 'ungemütlich' (uncomfortable)
> creates wordlist entries
> ungem i:1490 l:327 w:673 a:1
> tlich i:1433 l:826 w:174 a:50
>
> instead of mapping the umlaut to 'u' as in ungemutlich
>
> Searching for 'ungemütlich' results in no hit at all.

I solved PART of the problem myself in the meantime.

On my HP-UX system, I recompiled htdig with the environment variable
    LANG=de_DE.iso88591
set, and I added
    locale: de_DE.iso88591
to htdig.conf.

Now the 8-bit characters are entered into the wordlist, i.e.
    ungemütlich i:310 l:327 w:673 a:1
and searching for 'ungemütlich' works as well.

This is better, but it is not exactly what I want. I would prefer that
both htdig and htsearch map the word to 'ungemutlich'.

For example, on a US ASCII keyboard it is difficult to enter accented
characters into a search form. It would be an advantage for users with
such keyboards to have the mapping enabled.

Can I get the mapping activated somehow?

Alois

_______________________________________________
htdig-general mailing list <[EMAIL PROTECTED]>
To unsubscribe, send a message to <[EMAIL PROTECTED]> with a subject of unsubscribe
FAQ: http://htdig.sourceforge.net/FAQ.html
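
To make the requested behaviour concrete, here is a minimal sketch in Python of the two cases described above: a tokenizer that treats the umlaut as a word separator, and the same tokenizer applied after folding accented letters to their base letters. This is not htdig code and does not reflect how htdig or htsearch work internally; the function names fold_accents and ascii_words are hypothetical and exist only for illustration.

```python
import re
import unicodedata


def fold_accents(text):
    """Map accented letters to their unaccented base letters (e.g. u-umlaut -> u)."""
    # NFKD splits a precomposed letter into base letter + combining mark.
    decomposed = unicodedata.normalize("NFKD", text)
    # Drop the combining marks and keep only the base characters.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def ascii_words(text):
    """Hypothetical tokenizer: anything outside a-z acts as a word separator."""
    return re.findall(r"[a-z]+", text.lower())


word = "ungem\u00fctlich"  # 'ungemütlich', written with an escape to avoid encoding issues

# Without folding, the umlaut splits the word in two, as in the wordlist above.
print(ascii_words(word))

# With folding applied first, the word survives as a single entry.
print(ascii_words(fold_accents(word)))
```

For the mapping to be useful, the same folding would have to be applied both when indexing and when processing the search query, so that 'ungemutlich' and 'ungemütlich' end up as the same entry. Note that this approach only covers letters with a Unicode decomposition; 'ß', for instance, has none and would need a separate rule.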

