Re: Problems compiling Nutch in Eclipse
inverted index - A sequence of (key, pointer) pairs where each pointer points to a record in a database which contains the key value in some particular field. The index is sorted on the key values to allow rapid searching for a particular key value, using e.g. binary search. The index is "inverted" in the sense that the key value is used to find the record rather than the other way round.

In Nutch, indexes are created:

- from parse, for title, metadata, etc.
- from parse, for text
- from invert, for anchors
- from fetch, for fetch date

Check out the indexes folder after crawling.

On Mon, Mar 23, 2009 at 7:56 PM, Rodrigo Reyes C. wrote:
> Ninad
>
> I've been reading your blog, specifically the article named "Nutch
> Architecture". I posted a comment there, but I am not sure you have noticed
> it, so I will post it here too.
>
> What do you mean by:
>
> *"The index is the inverted index of all of the pages the system has
> retrieved, and is created by merging all of the individual segment indexes."*
>
> Can you give us an example of what the original segment index looks like and
> how it is inverted? Thanks
>
> Rodrigo
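To make the definition above concrete, here is a minimal sketch of what "inverting" means. It is a toy illustration, not Nutch or Lucene code: the class name `InvertedIndexSketch`, the `invert` method, and the sample documents are all made up for this example, and a `TreeMap` stands in for the sorted (key, pointer) sequence, so lookups use a balanced tree rather than a literal binary search. The "forward" view maps each document to its terms; the inverted view maps each term to the documents that contain it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy illustration only: this is not how Nutch or Lucene store their indexes.
class InvertedIndexSketch {

    // Forward view:  document id -> terms found in that document.
    // Inverted view: term -> ascending list of ids of documents containing it.
    static TreeMap<String, List<Integer>> invert(Map<Integer, List<String>> forward) {
        TreeMap<String, List<Integer>> inverted = new TreeMap<>();
        // Visit documents in ascending id order so each posting list stays sorted.
        for (Map.Entry<Integer, List<String>> doc : new TreeMap<>(forward).entrySet()) {
            for (String term : doc.getValue()) {
                List<Integer> postings = inverted.computeIfAbsent(term, t -> new ArrayList<>());
                // Record each document at most once per term.
                if (postings.isEmpty() || !postings.get(postings.size() - 1).equals(doc.getKey())) {
                    postings.add(doc.getKey());
                }
            }
        }
        return inverted;
    }

    public static void main(String[] args) {
        // Hypothetical sample data: three tiny "pages".
        Map<Integer, List<String>> forward = new TreeMap<>();
        forward.put(1, Arrays.asList("apache", "nutch", "crawler"));
        forward.put(2, Arrays.asList("apache", "lucene", "index"));
        forward.put(3, Arrays.asList("nutch", "index"));

        TreeMap<String, List<Integer>> inverted = invert(forward);
        // Prints {apache=[1, 2], crawler=[1], index=[2, 3], lucene=[2], nutch=[1, 3]}
        System.out.println(inverted);
        // Key lookup: which documents mention "nutch"? Prints [1, 3]
        System.out.println(inverted.getOrDefault("nutch", Collections.emptyList()));
    }
}
```

The point of the sketch is only the direction of the lookup: the forward map answers "which terms are in document 3?", while the inverted map answers "which documents contain 'nutch'?" without scanning every document.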
Re: Problems compiling Nutch in Eclipse
Ninad

I've been reading your blog, specifically the article named "Nutch Architecture". I posted a comment there, but I am not sure you have noticed it, so I will post it here too.

What do you mean by:

*"The index is the inverted index of all of the pages the system has retrieved, and is created by merging all of the individual segment indexes."*

Can you give us an example of what the original segment index looks like and how it is inverted? Thanks

Rodrigo

2009/3/21 Ninad Raut
> Check out my blog:
> http://j2eewebsearch.blogspot.com/
>
> Check out the third point...
>
> Let me know if you get it all right. Your comments will be appreciated.
>
> Regards,
> Ninad
Re: Problems compiling Nutch in Eclipse
Doğacan

This answers my questions. Thank you so much.

Rodrigo

2009/3/21 Doğacan Güney
> The RTF parser is not built by default because the jars it uses have some
> licensing issues. It is also out of sync with current trunk, so it
> does not even build anymore.
>
> This issue may help:
> https://issues.apache.org/jira/browse/NUTCH-644

--
Rodrigo Reyes C.
Software Developer
Avity LLC
105 Court Street, Suite 401
New Haven, CT 06511-6957
O rrc179 F 203-643-2002
rodrigo.re...@avity.com
www.avity.com
Re: Problems compiling Nutch in Eclipse
The RTF parser is not built by default because the jars it uses have some licensing issues. It is also out of sync with current trunk, so it does not even build anymore.

This issue may help:
https://issues.apache.org/jira/browse/NUTCH-644

On Sat, Mar 21, 2009 at 03:02, Rodrigo Reyes C. wrote:
> Hi
>
> I have configured my eclipse project as stated here
>
> http://wiki.apache.org/nutch/RunNutchInEclipse0.9
>
> Still, I am getting the following errors:
>
> The return type is incompatible with Parser.getParse(Content)
> RTFParseFactory.java
> nutch/src/plugin/parse-rtf/src/java/org/apache/nutch/parse/rtf line 52
> Java Problem
> Type mismatch: cannot convert from ParseResult to Parse
> TestRTFParser.java
> nutch/src/plugin/parse-rtf/src/test/org/apache/nutch/parse/rtf line 78
> Java Problem
>
> Any ideas on what could be wrong? I already included both
> http://nutch.cvs.sourceforge.net/nutch/nutch/src/plugin/parse-mp3/lib/ and
> http://nutch.cvs.sourceforge.net/nutch/nutch/src/plugin/parse-rtf/lib/ jars.
>
> Thanks in advance
>
> --
> Rodrigo Reyes C.

--
Doğacan Güney
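The two compiler errors quoted in this thread are what one would expect when an interface method has changed its return type and an implementation has not followed. The sketch below reproduces that situation with stand-in types. Everything in it is hypothetical: the class `ReturnTypeMismatchSketch`, the nested `Parse`, `ParseResult` and `Parser` types, and the `getParse(String, String)` signature are simplified placeholders, not the real Nutch classes, and the shape of `ParseResult` (several parses keyed by URL) is an assumption made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Stand-in types for illustration only; these are NOT the real Nutch classes.
class ReturnTypeMismatchSketch {

    // Old-style result: a single parse per piece of content.
    static class Parse {
        final String text;
        Parse(String text) { this.text = text; }
    }

    // Newer-style result (assumed shape): several parses keyed by URL.
    static class ParseResult {
        private final Map<String, Parse> parses = new LinkedHashMap<>();
        void put(String url, Parse parse) { parses.put(url, parse); }
        Parse get(String url) { return parses.get(url); }
        int size() { return parses.size(); }
    }

    // The interface as the (hypothetical) current code base declares it.
    interface Parser {
        ParseResult getParse(String url, String content);
    }

    // A plugin still written against the old signature would look like this.
    // It is left commented out because the compiler rejects it with
    // "The return type is incompatible with Parser.getParse(...)":
    //
    //   static class StaleParser implements Parser {
    //       public Parse getParse(String url, String content) { ... }
    //   }

    // Updated plugin: same parsing logic, wrapped in the new return type.
    static class UpdatedParser implements Parser {
        @Override
        public ParseResult getParse(String url, String content) {
            ParseResult result = new ParseResult();
            result.put(url, new Parse(content.trim()));
            return result;
        }
    }

    public static void main(String[] args) {
        String url = "http://example.com/a.rtf"; // placeholder URL
        Parser parser = new UpdatedParser();
        // Old test code of the form `Parse p = parser.getParse(...)` fails with
        // "Type mismatch: cannot convert from ParseResult to Parse".
        // The caller has to unwrap the result instead:
        Parse p = parser.getParse(url, "  hello  ").get(url);
        System.out.println(p.text);
    }
}
```

If the RTF plugin is in this state, adding more jars to the Eclipse build path would not help; the plugin source itself would need to be brought in line with the interface, which is presumably what the JIRA issue mentioned above is about.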
Re: Problems compiling Nutch in Eclipse
Ninad

Thanks for your answer. I have to say I am eager to read everything you have written in your blog about Nutch's inner workings.

I've already done everything your blog post says to do (plus a couple more things, like downloading a couple of extra jars that are not included in the SVN version). Nevertheless, I am still getting the errors I described. I should also mention that I am not working on the 0.9 code base but on the trunk code base. Maybe that is why I am getting these errors.

Rodrigo

PS: By the way, I did manage to get Nutch crawling late last night. Still, I haven't been able to compile this specific plugin (the RTF plugin).

2009/3/21 Ninad Raut
> Check out my blog:
> http://j2eewebsearch.blogspot.com/
>
> Check out the third point...
>
> Let me know if you get it all right. Your comments will be appreciated.
>
> Regards,
> Ninad
Re: Problems compiling Nutch in Eclipse
Check out my blog:
http://j2eewebsearch.blogspot.com/

Check out the third point...

Let me know if you get it all right. Your comments will be appreciated.

Regards,
Ninad

On Sat, Mar 21, 2009 at 6:32 AM, Rodrigo Reyes C. wrote:
> Hi
>
> I have configured my eclipse project as stated here
>
> http://wiki.apache.org/nutch/RunNutchInEclipse0.9
>
> Still, I am getting the following errors:
>
> - The return type is incompatible with Parser.getParse(Content)
>   RTFParseFactory.java
>   nutch/src/plugin/parse-rtf/src/java/org/apache/nutch/parse/rtf
>   line 52
>   Java Problem
> - Type mismatch: cannot convert from ParseResult to Parse
>   TestRTFParser.java
>   nutch/src/plugin/parse-rtf/src/test/org/apache/nutch/parse/rtf
>   line 78
>   Java Problem
>
> Any ideas on what could be wrong? I already included both
> http://nutch.cvs.sourceforge.net/nutch/nutch/src/plugin/parse-mp3/lib/ and
> http://nutch.cvs.sourceforge.net/nutch/nutch/src/plugin/parse-rtf/lib/ jars.
>
> Thanks in advance
>
> --
> Rodrigo Reyes C.
Problems compiling Nutch in Eclipse
Hi

I have configured my Eclipse project as stated here:

http://wiki.apache.org/nutch/RunNutchInEclipse0.9

Still, I am getting the following errors:

- The return type is incompatible with Parser.getParse(Content)
  RTFParseFactory.java
  nutch/src/plugin/parse-rtf/src/java/org/apache/nutch/parse/rtf
  line 52
  Java Problem
- Type mismatch: cannot convert from ParseResult to Parse
  TestRTFParser.java
  nutch/src/plugin/parse-rtf/src/test/org/apache/nutch/parse/rtf
  line 78
  Java Problem

Any ideas on what could be wrong? I already included both
http://nutch.cvs.sourceforge.net/nutch/nutch/src/plugin/parse-mp3/lib/ and
http://nutch.cvs.sourceforge.net/nutch/nutch/src/plugin/parse-rtf/lib/ jars.

Thanks in advance

--
Rodrigo Reyes C.