> Will it work if we don't use HTML entities for non-ASCII, but use
> real characters instead (i.e. use ä instead of &auml;)?
> 
> If this is not implemented, please could someone point me to where in the
> code such a conversion would fit? I might just write it myself ;)

Non-ASCII is a headache in itself, as you surely know already... I can send
you a quick-and-ugly patch that should be on its way to a better shape in
the not-too-distant future. Until then, here is what I have working at our
sites:

- HTML texts are in Romanian (change it as you wish), so we have lots of
non-ASCII chars in them (most of them as real chars, with an appropriate
"charset=iso-8859-2" META option)

- before indexing, non-ASCII chars are mapped to plain ASCII text (both
from real chars and from HTML entities)

- the same mapping occurs in search phrases
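
To make the three points above concrete, here is a rough sketch of that kind
of mapping. It is not the patch itself and not ht://Dig code: it is written
in Python for brevity, and the names fold_to_ascii and decode_and_fold are
made up for this illustration. It assumes that stripping accents via Unicode
decomposition is an acceptable mapping for your language.

```python
import html
import unicodedata


def fold_to_ascii(text: str) -> str:
    """Hypothetical helper: map entities and accented chars to plain ASCII."""
    # Turn HTML entities such as &auml; into the real character first,
    # so both spellings go through the same path afterwards.
    decoded = html.unescape(text)
    # NFKD splits an accented letter into base letter + combining mark.
    decomposed = unicodedata.normalize("NFKD", decoded)
    # Drop the combining marks, keeping only the base letters.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Anything still outside ASCII is discarded (an assumption of this sketch).
    return stripped.encode("ascii", "ignore").decode("ascii")


def decode_and_fold(raw: bytes, charset: str = "iso-8859-2") -> str:
    """Hypothetical helper: decode page bytes using the declared charset, then fold."""
    return fold_to_ascii(raw.decode(charset))


# The same function is applied to document words and to the search phrase.
indexed_word = decode_and_fold("păr".encode("iso-8859-2"))
query_word = fold_to_ascii("p&abreve;r")
```

Because both sides go through the same function, a word typed as a real
character, as an entity, or without any accent at all ends up as the same
index key.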

The only inconvenience so far: users also get pages they don't actually want
(all accented chars 'suddenly' lose their accents when searched).

Let me know if interested.

Iosif Fettich

-----------------------------------------------------------------------
Iosif Fettich | e-mail: [EMAIL PROTECTED]            ICQ UIN: 5496730
Mng. Director |                             phone/fax: +40-(0)65-162614     
NetSoft SRL   | mail:   NetSoft SRL,4300 Tg.Mures,O.P.1-C.P.182,Romania

----------------------------------------------------------------------
To unsubscribe from the htdig mailing list, send a message to
[EMAIL PROTECTED] containing the single word "unsubscribe" in
the body of the message.
