Hi,
I want to index & search Tamil (an Indian language) pages using Nutch. I
have some knowledge of Lucene and just got the "Nutch Basic Tutorial"
working. 

Where should I look for information on indexing Tamil or other Indian-language pages?

I'm looking for:
* step-by-step documentation for indexing and searching foreign-language pages,
  particularly Indian languages (hope that's not too much)
* some examples, samples, or tutorials would be nice

Or if you could just point me in the right direction, that'll be fine too. 

I saw some postings from "saran" & "saravana kumar" talking about this same
thing. Guys, did you figure this out? If so, could you please help?

Could someone help? 

thanks,
Surya



saran wrote:
> 
> I tried to set the classpath and ran this command in
> ...../nutch0.9 (the Nutch directory):
> 
> java -classpath build/language-identifier/language-identifier.jar:build/language-identifier/classes/org/apache/nutch/analysis/lang/NGramProfile -create ta sample.ta.utf8.txt UTF8
> 
> and I got the following error:
> 
> Exception in thread "main" java.lang.NoClassDefFoundError: loaded class
> NGramProfile was in fact named org.apache.nutch.analysis.lang.NGramProfile
>    at java.lang.VMClassLoader.defineClass(libgcj.so.7)
>    at java.lang.ClassLoader.defineClass(libgcj.so.7)
>    at java.security.SecureClassLoader.defineClass(libgcj.so.7)
>    at java.net.URLClassLoader.findClass (libgcj.so.7)
>    at java.lang.ClassLoader.loadClass(libgcj.so.7)
>    at java.lang.ClassLoader.loadClass(libgcj.so.7)
>    at java.lang.Class.forName(libgcj.so.7)
>    at gnu.java.lang.MainThread.run(libgcj.so.7)
> 
> 
> I am not able to figure this out. Can you please help me?
> 
> 
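
One observation on the quoted error, in case it is useful: the message "loaded class NGramProfile was in fact named org.apache.nutch.analysis.lang.NGramProfile" suggests the class was requested by its short name while the classpath pointed inside the package directories. Java normally expects the fully qualified name (here org.apache.nutch.analysis.lang.NGramProfile) and a classpath entry that is the jar or the root of the classes tree. I have not verified this against Nutch itself, and other jars may also be needed on the classpath. The sketch below only illustrates the general rule using a JDK class; ClassNameDemo and canLoad are made-up names for illustration.

```java
public class ClassNameDemo {
    // Returns true if the JVM can load a class under exactly this name.
    // Hypothetical helper, not part of Nutch.
    static boolean canLoad(String name) {
        try {
            Class.forName(name);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // Short names fail: the loader looks up classes by package-qualified name.
            return false;
        }
    }

    public static void main(String[] args) {
        // Fully qualified name: found.
        System.out.println(canLoad("java.util.ArrayList"));
        // Short name only: not found, analogous to asking for plain "NGramProfile".
        System.out.println(canLoad("ArrayList"));
    }
}
```

If the same rule applies here, the command would name the class as org.apache.nutch.analysis.lang.NGramProfile and list only the jar (or the classes root directory) after -classpath.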

-- 
View this message in context: 
http://www.nabble.com/help-regarding-creating-the-NGramProfile-for-Tamil-language-tp12205855p21657675.html
Sent from the Nutch - User mailing list archive at Nabble.com.
