It can get complicated... Document encoding needs to be consistent throughout the indexing chain.
Internal document encoding should be the same as what the web server serves. Character encoding of data in a database should be the same as what the web server serves. Htdig should be configured to serve with the same document encoding as the web server (I think).

If the pages are displayed fine on the web site, but messed up in search results, it seems pretty clear the problem is with htdig. I'm pretty sure that htdig has problems with Unicode (UTF-8, etc.) but I don't know if these symptoms are indicative of these problems.

Content in different languages can use the same encoding, but with some limitations (Asian languages can't necessarily be encoded the same as Western European languages).

I too would spend time reading the FAQ.

HTH

Ted Stresen-Reuter

On Apr 2, 2007, at 6:49 PM, Mac OS X Server Administrator wrote:

> Hi,
>
> I've gone through the list archives, but most of the promising results
> (e.g. http://www.geocrawler.com/archives/3/8822/2002/2/0/7814312/)
> return 404s from SourceForge...
>
> Any accented characters are parsed as an empty box (IE/Win) or a
> question mark (in a diamond in Safari) in search results pages.
>
> Should this be handled via set_locale? If so, then how does one handle
> German- or Spanish-accented characters on an otherwise English site
> (e.g. proper nouns), which has an en_US locale set? We do use the span
> tag with a language attribute in the XHTML; is there any way to hook
> that into ht://Dig?
>
> HTML entities show up as plain text -- m-dashes, curly-quotes, curly
> apostrophes, etc. render as source code, not as m-dashes, curly-quotes
> and curly apostrophes.
>
> Are these separate problems, and what can be done?
>
> _______________________________________________
> ht://Dig general mailing list: <[email protected]>
> ht://Dig FAQ: http://htdig.sourceforge.net/FAQ.html
> List information (subscribe/unsubscribe, etc.)
> https://lists.sourceforge.net/lists/listinfo/htdig-general
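
To make the "consistent encoding throughout the chain" point concrete, here is a small Python sketch. It is not ht://Dig code and says nothing certain about what ht://Dig does internally; the function name show_mismatch and the sample word are made up for illustration. It only shows, in general terms, how the two reported symptoms can arise: text encoded one way and decoded another, and HTML entities that were escaped a second time before display.

```python
import html


def show_mismatch(text: str) -> dict:
    """Encode text one way and decode it another, mimicking a chain
    whose stages disagree about the character encoding."""
    latin1_bytes = text.encode("iso-8859-1")
    utf8_bytes = text.encode("utf-8")
    return {
        # ISO-8859-1 bytes read as UTF-8: the invalid byte becomes U+FFFD,
        # the replacement character many browsers draw as "?" in a diamond.
        "latin1_read_as_utf8": latin1_bytes.decode("utf-8", errors="replace"),
        # UTF-8 bytes read as ISO-8859-1: the accented letter turns into
        # two unrelated characters.
        "utf8_read_as_latin1": utf8_bytes.decode("iso-8859-1"),
        # Same encoding on both sides: the text survives intact.
        "consistent": utf8_bytes.decode("utf-8"),
    }


# Hypothetical sample: a German proper noun on an otherwise English page.
result = show_mismatch("M\u00fcller")
for stage, value in result.items():
    print(f"{stage}: {value!r}")

# Separate issue: an entity that has been escaped twice. A browser undoes
# only one level of escaping, so the reader sees the entity's source text.
double_escaped = html.escape("&mdash;")        # "&amp;mdash;"
seen_by_reader = html.unescape(double_escaped)  # "&mdash;" shown literally
print(seen_by_reader)
```

If the search results show replacement characters while the site itself looks fine, comparing the charset the web server declares with the one used on the results page is a reasonable first check. The entity problem may well be unrelated to encoding, as the example treats it.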

