> however what we need in France, and probably in Italy and Spain too, among
> others, is more complicated: it is a way to make accentuated (?) 
> characters treated by htsearch just like the same letter without the
> accent, and dealing correctly with the uppercase/lowercase. 

I have an older patch that solves exactly this issue, but for Romanian
special chars. It can easily be adapted to any language; however, it's still
as ugly as it was from the beginning and requires modifying the source
code.

Although I promised to get this into the official release, I never
actually managed to do it. Shame on me - it's still easier to apply the
patch in 5 minutes to the newest version than to work a few hours on
making it as it should be done... :(((

I'll attach the patches - the question comes up often enough, so I suppose
some people will benefit from them even so.

Sincerely,

Iosif Fettich

--------------
The patches are for version 3.1.3, in directory htdig:

*********************
SGMLEntities.cc:
*********************

164,183c164
< //PATCH to make romanian ISO_8859_2 chars fit into plain ASCII//
<         unsigned char x;
<         x = atoi (entity + 1);
<         if (x == 227 || x == 226 || x == 225 ) return 'a';
<         if (x == 195 || x == 194 || x == 193 ) return 'A';
<         if (x == 233) return 'e';
<         if (x == 201) return 'E';
<         if (x == 238 || x == 237) return 'i';
<         if (x == 206 || x == 205) return 'I';
<         if (x == 243 || x == 245 || x == 246) return 'o';
<         if (x == 211 || x == 213 || x == 214) return 'O';
<         if (x == 186) return 's';
<         if (x == 170) return 'S';
<         if (x == 254) return 't';
<         if (x == 222) return 'T';
<         if (x == 250 || x == 251 || x == 252) return 'u';
<         if (x == 218 || x == 219 || x == 220) return 'U';
<         return x;
< //END OF PATCH
< //    return atoi(entity + 1);
---
>       return atoi(entity + 1);


****************
HTML.cc
****************

162,184d161
< 
< //PATCH to make romanian ISO_8859_2 chars fit into plain ASCII//
<     start = position;
<     while (*position)
<     {
<         if (*position == 227 || *position == 226 || *position == 225 ) *position = 'a';
<         else if (*position == 195 || *position == 194 || *position == 193 ) *position = 'A';
<         else if (*position == 233) *position = 'e';
<         else if (*position == 201) *position = 'E';
<         else if (*position == 238 || *position == 237) *position = 'i';
<         else if (*position == 206 || *position == 205) *position = 'I';
<         else if (*position == 243 || *position == 245 || *position == 246) *position = 'o';
<         else if (*position == 211 || *position == 213 || *position == 214) *position = 'O';
<         else if (*position == 186) *position = 's';
<         else if (*position == 170) *position = 'S';
<         else if (*position == 254) *position = 't';
<         else if (*position == 222) *position = 'T';
<         else if (*position == 250 || *position == 251 || *position == 252) *position = 'u';
<         else if (*position == 218 || *position == 219 || *position == 220) *position = 'U';
<         ++position;
<     }
<     position = start;
< //END OF PATCH
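
Both hunks above repeat the same byte-to-letter mapping as if/else chains.
For anyone adapting this to another language, here is a rough sketch of
how the mapping could be kept in one table instead. This is not part of
ht://Dig or of the patch: the names Fold, kFolds, fold_byte and
fold_buffer are made up for illustration, the byte values are simply
copied from the patch, and the lazy table setup is not thread-safe.

```cpp
#include <cstddef>

// Hypothetical helper, not ht://Dig code: one (byte, replacement) pair.
struct Fold {
    unsigned char from;  // byte value as it appears in the document
    char to;             // plain ASCII letter to substitute
};

// Same byte values as in the patch above. To adapt to another language,
// only this table should need to change.
static const Fold kFolds[] = {
    {225, 'a'}, {226, 'a'}, {227, 'a'},
    {193, 'A'}, {194, 'A'}, {195, 'A'},
    {233, 'e'}, {201, 'E'},
    {237, 'i'}, {238, 'i'},
    {205, 'I'}, {206, 'I'},
    {243, 'o'}, {245, 'o'}, {246, 'o'},
    {211, 'O'}, {213, 'O'}, {214, 'O'},
    {186, 's'}, {170, 'S'},
    {254, 't'}, {222, 'T'},
    {250, 'u'}, {251, 'u'}, {252, 'u'},
    {218, 'U'}, {219, 'U'}, {220, 'U'},
};

// Map a single byte; bytes not listed in kFolds are returned unchanged.
static unsigned char fold_byte(unsigned char c)
{
    static unsigned char table[256];
    static bool ready = false;
    if (!ready) {
        // Start from the identity mapping, then overlay the folds.
        for (int i = 0; i < 256; ++i)
            table[i] = static_cast<unsigned char>(i);
        for (std::size_t i = 0; i < sizeof(kFolds) / sizeof(kFolds[0]); ++i)
            table[kFolds[i].from] = static_cast<unsigned char>(kFolds[i].to);
        ready = true;
    }
    return table[c];
}

// Fold a NUL-terminated buffer in place, as the HTML.cc hunk does.
static void fold_buffer(unsigned char *p)
{
    for (; *p; ++p)
        *p = fold_byte(*p);
}
```

With something like this, the SGMLEntities.cc hunk would reduce to a
single fold_byte() call on the decoded entity value, and the HTML.cc hunk
to a single fold_buffer() call on the document contents. Again, this is
only a sketch under the assumptions stated above.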


