At 4:02 AM -0500 11/13/98, Jacques Reynes wrote:
>If somebody has found a solution, I will be very glad to have an answer.
>This problem might appear in many languages: German, Spanish, ...
A proposal was made to me by Iosif Fettich, specifically regarding Romanian:
>To prevent misleading indexing, I prefer to 'ASCIIfy' everything before
>feeding the databases.
>Of course, I won't be able any more to differentiate two words
>that have the same spelling after ASCII-fying (and that differed
>before that), but that is the lesser evil in most cases.
>
>Then, I have to do the same ASCII-fying mapping when searching.
>
>When delivering the real documents, they will be seen OK, even if the
>simplification done is visible in the excerpts that htdig is showing up.
>That is rather helpful, indeed.
So I suggested that we include a general "ASCIIfy" function that takes a
filename as an argument. If the filename is empty (the default), it does
nothing. Otherwise, it uses the file to define an 8-bit -> 7-bit
mapping and applies it to all text.
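As a rough sketch of what such a function might look like (written in
Python for brevity, not as ht://Dig code): the mapping file format here,
one "character replacement" pair per line with "#" comments, is purely an
assumption, as are the names load_mapping and asciify and the default
file encoding.

```python
def load_mapping(lines):
    """Build a translation table from hypothetical mapping lines.

    Each non-empty, non-comment line is assumed to look like
    '<single character> <ASCII replacement>', e.g. 'é e' or 'ß ss'.
    """
    table = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        src, _, dst = line.partition(" ")
        if len(src) != 1:
            continue  # ignore malformed entries rather than fail
        table[ord(src)] = dst.strip()
    return table


def asciify(text, mapping_file="", encoding="utf-8"):
    """Apply the mapping in mapping_file to text.

    With an empty filename (the default) the text is returned unchanged.
    The encoding default is an assumption; a Latin-1 or Latin-2 mapping
    file would need the matching codec name passed in.
    """
    if not mapping_file:
        return text
    with open(mapping_file, encoding=encoding) as fh:
        table = load_mapping(fh)
    return text.translate(table)
```

The same call would be applied to document text at indexing time and to
the query string at search time, so that both sides agree on the
simplified spelling.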
Granted, using full Unicode support and some other internationalization
would be best. But this sounds like a reasonable alternative. Comments?
-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/