[Nutch-dev] Re: search.jsp (and opensearchservlet) and query (utf-8) encoding

2005-10-06 Thread Dawid Weiss
> 5. queryString = new String(queryString.getBytes("ISO-8859-1"), "UTF-8"); This will work, but it is, I'd say, a dirty solution :) Instead, add a configuration option to Tomcat that forces the container to treat URL-encoded arguments as UTF-8. In Tomcat 5 it means you need to
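For reference, the re-decoding trick quoted above can be sketched as follows. This is only an illustration: the class name `QueryRedecode` and the sample word are made up, and the "container" is simulated by decoding UTF-8 bytes as ISO-8859-1 by hand. It works because ISO-8859-1 maps every byte to exactly one character, so the original bytes can be recovered without loss.

```java
import java.nio.charset.StandardCharsets;

public class QueryRedecode {
    // Re-interprets a string that was decoded as ISO-8859-1
    // although the client actually sent UTF-8 bytes.
    static String redecode(String misdecoded) {
        byte[] raw = misdecoded.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Hypothetical sample query containing Polish letters.
        String original = "za\u017C\u00F3\u0142\u0107";

        // Simulate a container that assumes ISO-8859-1 for URL-encoded arguments.
        String misdecoded = new String(
                original.getBytes(StandardCharsets.UTF_8),
                StandardCharsets.ISO_8859_1);

        String repaired = redecode(misdecoded);
        System.out.println(repaired.equals(original)); // prints "true"
    }
}
```

The catch is that the hack silently corrupts input that was already decoded correctly, which is one reason a container-level setting is the cleaner route.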

[Nutch-dev] Content-Type and character encoding

2004-05-26 Thread Andrzej Bialecki
characters in the index, and to mixed-encoded cached content. It's almost impossible to search or display such mixed content consistently. E.g. for Polish there are at least two or three popular encodings other than ISO-8859-1, and the text becomes garbled when the code assumes this enc
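A small sketch of the garbling described here, assuming the page is really in ISO-8859-2 (one of the common Polish encodings) and the code wrongly assumes ISO-8859-1. The class name `MixedEncodingDemo` and the sample word are illustrative only, and the example assumes the JRE ships the ISO-8859-2 charset, which is not one of the charsets Java guarantees.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class MixedEncodingDemo {
    // Decodes raw page bytes with whatever charset the caller assumes.
    static String decode(byte[] pageBytes, Charset assumed) {
        return new String(pageBytes, assumed);
    }

    public static void main(String[] args) {
        // Assumption: ISO-8859-2 is available in this JRE.
        Charset latin2 = Charset.forName("ISO-8859-2");

        String original = "\u0142\u0105ka"; // Polish word with two non-ASCII letters
        byte[] pageBytes = original.getBytes(latin2);

        String wrong = decode(pageBytes, StandardCharsets.ISO_8859_1);
        String right = decode(pageBytes, latin2);

        System.out.println(wrong.equals(original)); // prints "false"
        System.out.println(right.equals(original)); // prints "true"
    }
}
```

Once text decoded under the wrong assumption reaches the index, queries typed with the correct characters no longer match it.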

Re: Re: [Nutch-dev] RE: A problem about Chinese word segment

2005-03-17 Thread cao yuzhong
13:37 +0800 Weird! Nutch supports searching for Chinese characters. Can you print your query string in search.jsp? NOTE: the page should be encoded in UTF-8. /Jack === At 2005-03-17, 13:49:00 you wrote: === >I have added Chinese stopwords in String[] STOP_WORDS in NutchAnalysis.jj. >My
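One way to follow the "print your query string" advice is to dump the code points rather than the characters, since the console or page encoding can itself hide the problem. The helper below is a hypothetical sketch (the name `QueryDump` is not part of Nutch) and would have to be adapted to be called from search.jsp.

```java
public class QueryDump {
    // Renders each code point of the query as U+XXXX, separated by spaces.
    static String dump(String query) {
        StringBuilder sb = new StringBuilder();
        query.codePoints().forEach(cp -> {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
        });
        return sb.toString();
    }

    public static void main(String[] args) {
        // Two Chinese characters given as escapes to avoid source-encoding issues.
        System.out.println(dump("\u4E2D\u6587")); // prints "U+4E2D U+6587"
    }
}
```

If the dump shows several code points below U+0100 per Chinese character, the query was most likely decoded with a single-byte charset instead of UTF-8.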

Re: Re: [Nutch-dev] RE: A problem about Chinese word segment

2005-03-16 Thread Jason Tang
Weird! Nutch supports searching for Chinese characters. Can you print your query string in search.jsp? NOTE: the page should be encoded in UTF-8. /Jack === At 2005-03-17, 13:49:00 you wrote: === >I have added Chinese stopwords in String[] STOP_WORDS in NutchAnalysis.jj. >My prob