Hi Ken,

I have now set the JSP pages to UTF-8 encoding, but the result is still the same. Do you have any other ideas?

There is another problem. If I type non-English characters into the search text box, then after clicking Search the content of the text box is wrong, i.e. the text is not displayed correctly.
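The text-box symptom would be consistent with the form parameter being decoded in a different charset than the one the browser used to send it, although nothing in this thread confirms that yet. Below is a minimal sketch of that effect using only the JDK, outside any webapp. The class and method names (CharsetMismatchDemo, formRoundTrip, squeezeThrough) are made up for illustration, and the choice of UTF-8 on the browser side and ISO-8859-1 on the server side is an assumption, not something measured on the real server.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetMismatchDemo {

    // Simulates a form submission: the "browser" percent-encodes the text
    // using one charset and the "server" decodes it using another.
    // Both charsets are hypothetical stand-ins for the real setup.
    public static String formRoundTrip(String text, Charset browser, Charset server) {
        try {
            String onTheWire = URLEncoder.encode(text, browser.name());
            return URLDecoder.decode(onTheWire, server.name());
        } catch (UnsupportedEncodingException e) {
            // Cannot happen for a Charset object the JVM already resolved.
            throw new IllegalStateException(e);
        }
    }

    // Encodes text into the target charset and decodes it again with the
    // same charset. Characters the charset cannot represent are replaced
    // by '?' during encoding.
    public static String squeezeThrough(String text, Charset target) {
        return new String(text.getBytes(target), target);
    }

    public static void main(String[] args) {
        // The sample characters from this thread, written as escapes so the
        // source file encoding does not matter.
        String sample = "\u00e1\u00e9\u00fa\u0151";

        // Same charset on both sides: the text survives.
        System.out.println(formRoundTrip(sample,
                StandardCharsets.UTF_8, StandardCharsets.UTF_8).equals(sample));

        // UTF-8 sent, ISO-8859-1 assumed on receipt: the text is garbled.
        System.out.println(formRoundTrip(sample,
                StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1).equals(sample));

        // U+0151 does not exist in ISO-8859-1, so it comes back as '?'.
        System.out.println(squeezeThrough("\u0151", StandardCharsets.ISO_8859_1));
    }
}
```

In a servlet container the standard way to control this for request bodies is to call request.setCharacterEncoding("UTF-8") before the first parameter is read. For parameters carried in the URL query string, the decoding charset is configured in the container itself, so the right setting depends on which server is in use.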
Many thanks,
Zsolt

Ken Krugler ([EMAIL PROTECTED]) wrote:
>
> Hi Zsolt,
>
> >Here is the cache view:
> >http://64.34.163.57:8080/nutch-0.9/cached.jsp?idx=0&id=0
>
> When I hit this with curl, I see that it's returning
> Content-Type: text/html;charset=iso-8859-2 in the response
> header, and the content has
> <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-2">.
>
> But I see that the base href is:
>
> <base href="http://www.daganatok.hu/">
>
> And when I hit that URL, I get back:
>
> < Content-Type: text/html; charset=utf-8
> <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
>
> The data seems to be valid UTF-8, and from my experience
> Nutch works correctly with correctly identified UTF-8 web pages.
>
> So I'm guessing the '?' characters come about when your
> webapp container/server tries to convert the UTF-8 data
> to 8859-2.
>
> -- Ken
>
> >Ken Krugler ([EMAIL PROTECTED]) wrote:
> >>
> >> >Hi All,
> >> >
> >> >I would like to share an issue regarding the encoding
> >> >using Nutch 0.9.x.
> >> >
> >> >When I'm indexing some sites which contain a lot of
> >> >ISO-8859-2 characters (these are mainly Eastern European
> >> >sites, mainly Hungarian ones), then on the search page
> >> >I cannot see the characters correctly. Even in the cached
> >> >view, non-English characters like áéúő are shown
> >> >as question marks.
> >> >
> >> >If any of you have experience with this issue,
> >> >I would be glad if you could help me.
> >>
> >> What's the URL of an example page with this type of problem?
> >>
> >
> -- Ken
>
> --
> Ken Krugler
> Krugle, Inc.
> +1 530-210-6378
> "Find Code, Find Answers"

_______________________________________________
Nutch-general mailing list
https://lists.sourceforge.net/lists/listinfo/nutch-general
