Hi, I want to index & search Tamil (an Indian language) pages using Nutch. I have some knowledge of Lucene and just got the "Nutch Basic Tutorial" working.
Where should I look for information on indexing pages in Tamil or any other Indian language? I'm looking for:

* "step-by-step" documentation for indexing and searching foreign-language pages, particularly Indian languages (hope that's not too much)
* some examples, samples, or tutorials would be nice

Or if you could just point me in the right direction, that'll be fine too.

I saw some postings from "saran" & "saravana kumar" talking about this same thing. Guys, did you figure this out? If yes, could you please help?

Could someone help?

thanks,
Surya

saran wrote:
>
> i try to set the classpath
>
> as and i run this command in
> ...../nutch0.9 the nutch directory
>
> java -classpath build/language-identifier/language-identifier.jar:build/language-identifier/classes/org/apache/nutch/analysis/lang/NGramProfile -create ta sample.ta.utf8.txt UTF8
>
> and i got the error as follows
>
> Exception in thread "main" java.lang.NoClassDefFoundError: loaded class
> NGramProfile was in fact named org.apache.nutch.analysis.lang.NGramProfile
>    at java.lang.VMClassLoader.defineClass(libgcj.so.7)
>    at java.lang.ClassLoader.defineClass(libgcj.so.7)
>    at java.security.SecureClassLoader.defineClass(libgcj.so.7)
>    at java.net.URLClassLoader.findClass(libgcj.so.7)
>    at java.lang.ClassLoader.loadClass(libgcj.so.7)
>    at java.lang.ClassLoader.loadClass(libgcj.so.7)
>    at java.lang.Class.forName(libgcj.so.7)
>    at gnu.java.lang.MainThread.run(libgcj.so.7)
>
>
> and i am not able to figure out that can u please help me .....
>
>

--
View this message in context: http://www.nabble.com/help-regarding-creating-the-NGramProfile-for-Tamil-language-tp12205855p21657675.html
Sent from the Nutch - User mailing list archive at Nabble.com.
