It works well for me. My language is Portuguese, and I use the language-identifier plugin to recognize it. Take a look at my nutch-site.xml:
....
<nutch-conf>
  <property>
    <name>plugin.includes</name>
    <value>nutch-extensionpoints|protocol-http|language-identifier|urlfilter-regex|parse-(text|html|pdf|msword)|index-basic|query-(basic|site|url)</value>
    <description>Regular expression naming plugin directory names to include. Any plugin not matching this expression is excluded.</description>
  </property>
  <property>
    <name>http.content.limit</name>
    <value>-1</value>
    <description>The length limit for downloaded content, in bytes. If this value is nonnegative (>=0), content longer than it will be truncated; otherwise, no truncation at all.</description>
  </property>
</nutch-conf>

I hope I've helped you :)

Regards,
Lourival Junior

On 7/10/06, Teruhiko Kurosaka <[EMAIL PROTECTED]> wrote:
If I set my preferred language to non-English (German, for example), and choose "de" from the list of country/language (mixed) codes, the first screen looks good. But in the search result screen, I see character corruptions. Is this working well for everybody else? -kuro
--
Lourival Junior
Universidade Federal do Pará
Curso de Bacharelado em Sistemas de Informação
http://www.ufpa.br/cbsi
Msn: [EMAIL PROTECTED]
_______________________________________________ Nutch-general mailing list [email protected] https://lists.sourceforge.net/lists/listinfo/nutch-general
