> Will it work if we don't use HTML entities for non-ASCII, but use
> real characters instead (i.e. use ä instead of &auml;)?
>
> If this is not implemented, please could someone point me to where in the
> code such a conversion would fit? I might just write it myself ;)
Non-ASCII is a headache in itself, as you surely know already... I can send
you a quick-and-ugly patch that should be on its way to a better shape in
the not-so-distant future. Till then, here is what I have working at our
sites:
- HTML texts are in Romanian (change that as you wish), so we have lots of
non-ASCII chars in them (most of them as real chars, with an appropriate
"charset=iso-8859-2" META option)
- before indexing, non-ASCII chars are mapped to plain ASCII text (both
from real chars and from HTML entities)
- the same mapping is applied to search phrases
The only inconvenience so far: users also get pages they don't actually want
(all accented chars 'suddenly' lose their accents when searched).
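
For illustration only, here is a rough Python sketch of that kind of mapping.
It is not the patch itself and not ht://Dig code: fold_to_ascii is a made-up
name, and stripping accents through Unicode decomposition is just one assumed
way of doing the mapping. It decodes HTML entities first, so real chars and
entities end up as the same ASCII word, and the same function is applied to
the search phrase.

```python
import html
import unicodedata


def fold_to_ascii(text: str) -> str:
    """Map accented characters and HTML entities to plain ASCII letters.

    Hypothetical helper for illustration; not part of ht://Dig.
    """
    # Turn entities such as &auml; or &#259; into real characters first,
    # so both spellings go through the same mapping.
    decoded = html.unescape(text)
    # NFKD splits an accented letter into base letter + combining mark.
    decomposed = unicodedata.normalize("NFKD", decoded)
    # Drop the combining marks, keeping only the base letters.
    stripped = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    # Anything still outside ASCII (no decomposition available) is dropped.
    return stripped.encode("ascii", "ignore").decode("ascii")


# Raw bytes as they would arrive from an iso-8859-2 page
# (\u0103 is a-breve, a common Romanian letter).
page_bytes = "p\u0103dure".encode("iso-8859-2")
indexed_word = fold_to_ascii(page_bytes.decode("iso-8859-2"))

# The same word written with a numeric entity, and as a search phrase.
from_entity = fold_to_ascii("p&#259;dure")
query = fold_to_ascii("p\u0103dure")

# The drawback mentioned above: two different words collapse into one.
collides = fold_to_ascii("p\u0103r") == fold_to_ascii("par")
```

With this, the indexed word, the entity form and the search phrase all become
the same ASCII string, which is why searches match regardless of accents, and
also why words that differ only by an accent can no longer be told apart.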
Let me know if interested.
Iosif Fettich
-----------------------------------------------------------------------
Iosif Fettich | e-mail: [EMAIL PROTECTED] ICQ UIN: 5496730
Mng. Director | phone/fax: +40-(0)65-162614
NetSoft SRL | mail: NetSoft SRL,4300 Tg.Mures,O.P.1-C.P.182,Romania
----------------------------------------------------------------------