> The "right thing" to do would be to either not decode SGML entities
> at all, but somehow compensate for that in the word matching, or to
> decode all standard or proposed entities UNAMBIGUOUSLY so that you can
> map them back correctly in htsearch. This means not being limited to
> 256 characters in a single byte. htsearch would then have to be aware
> of the encoding used on output, and map the characters to the correct
> single character or SGML encoding as appropriate.

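For what it's worth, here is a rough sketch of the second option in the quoted paragraph: decode every entity to a full Unicode code point, then on output keep what the target charset can represent and fall back to a numeric reference for the rest. This is not ht://Dig code. It uses Python's standard `html` module as a stand-in, and the names `decode_entities` and `encode_for_output` are made up for illustration.

```python
import html


def decode_entities(text: str) -> str:
    """Decode named and numeric entities to Unicode code points.

    The result is a Python str, so it is not limited to 256
    single-byte characters.
    """
    return html.unescape(text)


def encode_for_output(text: str, charset: str) -> str:
    """Map decoded text back for a given output charset.

    Markup-significant characters (&, <, >) are escaped first.
    Characters the charset can represent are kept as single
    characters; anything else becomes a numeric character reference.
    """
    escaped = html.escape(text, quote=False)
    return escaped.encode(charset, errors="xmlcharrefreplace").decode(charset)


if __name__ == "__main__":
    decoded = decode_entities("caf&eacute; &#8364;5 &oelig;uvre")
    print(decoded)
    print(encode_for_output(decoded, "iso-8859-1"))
    print(encode_for_output(decoded, "utf-8"))
```

With an ISO-8859-1 output, the e-acute stays a single character while the euro sign and the oe ligature come back as numeric references; with UTF-8 output all three stay as plain characters. The search side would need to know the output charset to pick the right form, which is the point made above.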
Thanks a lot Gilles, I'll keep monitoring the list for news on that.

Ionut Nistor
[EMAIL PROTECTED]
_______________________________________________
htdig-dev mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/htdig-dev
