Hello Neera,

I have also been looking for a solution to the same problem, but I have not found one yet. Please let me know if you find one.

Thanks,
Kunal

Neera wrote:
> Hi,
>
> I am trying to use the LanguageIdentifier plugin to detect the language of
> crawled results and found the following link:
> http://wiki.apache.org/nutch/LanguageIdentifier
>
> This page mentions some open issues on the lab test benchmark. Since these
> numbers were reported by analyzing results from the previous version
> (nutch-0.7), I am curious whether these issues have been fixed in the newer
> version (nutch-0.9). Is there a newer link or thread for the
> LanguageIdentifier plugin?
>
> Also, this plugin API assumes that the given contents are in UTF-8 format.
> Are the contents of the nutch dump file in UTF-8 format?
>
> Thanks and Regards,
> Neera

--
View this message in context: http://www.nabble.com/Language-Identifier-plugin-tp22318564p23041507.html
Sent from the Nutch - User mailing list archive at Nabble.com.
