Hi,
for the last few weeks I wondered why htdig doesn't like any word containing the
German u umlaut (char 252) on my Solaris server. All locale settings were correct,
and the same configuration runs on a Linux box without any problems.
Today I discovered the reason: WordList::valid_word() is not 8-bit clean on
Sun Solaris 2.6!
(For a plain char c holding 252, iscntrl(c) returns 1, but
iscntrl((unsigned char)c) returns 0.)
The patch to htdig-3.1.4/htcommon/WordList.cc is simple:
111c111
< if (HtIsStrictWordChar((unsigned char)*word) && !isdigit(*word))
---
> if (HtIsStrictWordChar((unsigned char)*word) && !isdigit((unsigned char)*word))
116c116
< else if (allow_numbers && isdigit(*word))
---
> else if (allow_numbers && isdigit((unsigned char)*word))
122c122
< else if (iscntrl(*word))
---
> else if (iscntrl((unsigned char)*word))
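The same cast could also be wrapped in small helpers so that it is not
forgotten at other call sites. This is only a sketch: the names ctype_arg,
is_cntrl_8bit, is_digit_8bit and check_8bit_helpers are hypothetical and are
not part of htdig. The point is that where plain char is signed, a char
holding 252 reaches the <cctype> functions as a negative int, which is
outside the range they are defined for.

```cpp
#include <cassert>
#include <cctype>

// Hypothetical helpers (not from htdig): route every character through
// unsigned char before it reaches the <cctype> classification functions.
inline int ctype_arg(char c)
{
    // Always in 0..255, whether plain char is signed or unsigned here.
    return static_cast<unsigned char>(c);
}

inline bool is_cntrl_8bit(char c) { return std::iscntrl(ctype_arg(c)) != 0; }
inline bool is_digit_8bit(char c) { return std::isdigit(ctype_arg(c)) != 0; }

// Small self-check for the helpers above.
inline void check_8bit_helpers()
{
    char u_umlaut = static_cast<char>(252);  // u umlaut in ISO 8859-1
    assert(ctype_arg(u_umlaut) == 252);      // never negative after the cast
    assert(!is_digit_8bit(u_umlaut));        // an umlaut is not a digit
    assert(is_digit_8bit('7'));
    assert(is_cntrl_8bit('\n'));             // newline is a control character
}
```

With helpers like these, a test such as iscntrl(*word) would become
is_cntrl_8bit(*word), and the cast lives in one place only.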
Marc
Marc Pohl, Online-Service-Center, Westdeutscher Rundfunk, D-50600 Koeln
[EMAIL PROTECTED], +49 221 220 8618, http://www.wdr.de/