chee wu wrote:
> Thanks Sami. I tried LanguageIndexingFilter, and it seems the
> LanguageIdentifier can't recognize Chinese now?
No, it doesn't. The list of supported languages can be checked here (the *.ngp files):

http://svn.apache.org/viewvc/lucene/nutch/branches/branch-0.8/src/plugin/languageidentifier/src/java/org/apache/nutch/analysis/lang/

You can build an ngp profile for Chinese, but I think that in the language identifier's current form it might not work that well. You could also build a specialized identifier and add it as an indexing filter. The most basic form could just blindly set lang to Chinese, if that suits your use case.

--
Sami Siren

> ----- Original Message -----
> From: "Sami Siren" <[EMAIL PROTECTED]>
> To: <[email protected]>
> Sent: Sunday, January 07, 2007 5:47 PM
> Subject: Re: Nutch .81: the process to add a new analyzer ?
>
>> Chee Wu wrote:
>>> Hi,
>>> I am trying to add a new analyzer for Chinese, and I found the
>>> code below in the "org.apache.nutch.indexer.Indexer"
>>>
>>> The question of mine is:
>>> For doc.get("lang"). Where and how can I set the "lang" property for
>> lang field is put there by language identifier plugin if it is active.
>>
>> http://lucene.apache.org/nutch/apidocs-0.8.x/org/apache/nutch/analysis/lang/LanguageIndexingFilter.html
>>
>> --
>> Sami Siren
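
To make the "specialized identifier" idea from the reply a bit more concrete, here is a minimal sketch of the detection part only, assuming a simple script-based heuristic is good enough for your crawl. The class name HanScriptGuesser and the method guessLang are hypothetical and not part of Nutch. Wiring the result into an indexing filter (so that it ends up in the "lang" field) is not shown, because the exact IndexingFilter signature depends on your Nutch version.

```java
// Hypothetical helper, NOT part of Nutch: guesses "zh" when most letters are Han.
// Assumption: a plain majority of Han ideographs is a good enough signal.
final class HanScriptGuesser {
    private HanScriptGuesser() {}

    /** Returns "zh" if more than half of the letters in text are Han ideographs, else null. */
    static String guessLang(String text) {
        if (text == null) {
            return null;
        }
        int letters = 0;
        int han = 0;
        // Iterate by code point so supplementary-plane ideographs are counted once.
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            i += Character.charCount(cp);
            if (!Character.isLetter(cp)) {
                continue; // skip digits, punctuation and whitespace
            }
            letters++;
            if (Character.UnicodeScript.of(cp) == Character.UnicodeScript.HAN) {
                han++;
            }
        }
        return (letters > 0 && han * 2 > letters) ? "zh" : null;
    }
}
```

Note that this only looks at the script, so Japanese text that is mostly kanji would also be reported as "zh". If every page in the crawl is known to be Chinese, the "blindly set lang" variant needs no detection at all.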
