Hi Tony,

The simplified steps with snapshots have now been added to the Nutch wiki [0]. It would be helpful if you could try them out and let us know of any improvements or corrections you would suggest.
PS: A few images look shrunk. I will fix that soon.

[0] : https://wiki.apache.org/nutch/RunNutchInEclipse

On Mon, Jun 10, 2013 at 2:57 PM, Tejas Patil <tejas.patil...@gmail.com> wrote:

> I have created a google doc [0] with several snapshots describing how to
> setup nutch 2.x + eclipse. This is different from the one over the wiki
> page and tailored for Nutch 2.x. Please try it out, let us know if you
> still have issues with that. Based on your comments, I would add the same
> over nutch wiki.
>
> [0] :
> https://docs.google.com/document/d/1qvJwrZ9Sc0NAF9p3ie4uV7JsfCHxnrh9QF19HINw48c/edit?usp=sharing
>
>
> On Mon, Jun 10, 2013 at 11:32 AM, Tejas Patil <tejas.patil...@gmail.com> wrote:
>
>> yes.
>>
>> - Close the project in eclipse. Right click on the project, click on
>>   "Properties" and get the location of the project.
>> - Goto that location in terminal
>> - Run 'ant eclipse'. (Note that you need to have Apache Ant
>>   <http://ant.apache.org/manual/index.html> installed and configured)
>>
>> After going command line, you might as well do this:
>> Specify the GORA backend in nutch-site.xml, uncomment its dependency in
>> ivy/ivy.xml and ensure that the store you selected is set as the default
>> datastore in gora.properties
>>
>>
>> On Mon, Jun 10, 2013 at 11:21 AM, Tony Mullins <tonymullins...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> So the latest Nutch2.x includes the Teja's Patch (
>>> https://issues.apache.org/jira/browse/NUTCH-1577) , means if I have latest
>>> source then it already has that patch.
>>>
>>> Now can some one please help me here what is meant by the 2nd last step
>>> 'Run 'ant eclipse' on http://wiki.apache.org/nutch/RunNutchInEclipse.
>>>
>>> Do I need to go to the location where source is and give ant command 'ant
>>> -f build.xml' , or its something else ???
>>> And after refreshing the source, Eclipse would let compile and run my
>>> code ?
>>>
>>> Thanks,
>>> Tony
>>>
>>>
>>> On Mon, Jun 10, 2013 at 6:56 PM, Tony Mullins <tonymullins...@gmail.com> wrote:
>>>
>>> > Hi Lewis,
>>> >
>>> > I understand this, that there may be something wrong on my end. And as I
>>> > said I get different errors on running Nutch 2.x with Eclipse, after
>>> > following different tutorials.
>>> >
>>> > My background is in .NET and I might will just move to JAVA , just because
>>> > of this project (Nutch). But at the moment I am having difficult time
>>> > understanding the 'setup/configuration' required to run Nutch in Eclipse.
>>> >
>>> > When you say '...*you may find it convenient to patch
>>> > your dist with Tejas' Eclipse ant target and simply run 'ant eclipse' from
>>> > within your terminal prior to doing a file, import, existing projects in to
>>> > workspace from within Eclipse..*.'
>>> >
>>> > which patch do I need to get and how to apply it ?
>>> > And by running 'ant eclipse' , do you mean dropping build.xml to Ant
>>> > window in Eclipse , OR building the Nutch source by using the "ant -f
>>> > build.xml" command in terminal ? ( by the way I have done both and both
>>> > successfully builds the source , but eclipse doesn't run the source).
>>> >
>>> > So could you please guide me here in more details, I would be really
>>> > grateful to you and Nutch community.
>>> >
>>> > Thanks,
>>> > Tony.
>>> >
>>> >
>>> > On Mon, Jun 10, 2013 at 6:38 PM, Lewis John Mcgibbney <
>>> > lewis.mcgibb...@gmail.com> wrote:
>>> >
>>> >> Hi Tony,
>>> >> These issues stem from your environment not being correct.
>>> >> I, as many other, have been able to DEBUG and develop Nutch 1.7 and 2.x
>>> >> series from within Eclipse.
>>> >> As you are working with 2.x source, you may find it convenient to patch
>>> >> your dist with Tejas' Eclipse ant target and simply run 'ant eclipse' from
>>> >> within your terminal prior to doing a file, import, existing projects in to
>>> >> workspace from within Eclipse.
>>> >> I can guarantee you, the reason the tutorial is on the Nutch wiki is
>>> >> because as some stage, someone (many many people), somewhere have found it
>>> >> useful for developing Nutch in Eclipse. I don't want to sound like a baloon
>>> >> here, but your java security exceptions are not a problem with Nutch...
>>> >> it's your environment.
>>> >> hth
>>> >>
>>> >> On Monday, June 10, 2013, Tony Mullins <tonymullins...@gmail.com> wrote:
>>> >> > Hi ,
>>> >> > Ok now I have followed this tutorial word by word.
>>> >> > http://wiki.apache.org/nutch/RunNutchInEclipse#Checkout_Nutch_in_Eclipse .
>>> >> >
>>> >> > After getting new source 2.2 , I have build it using Ant - which was
>>> >> > successful then set the configurations and comment the 'hsqldb' dependency
>>> >> > and uncomment the cassandra dependency ( as I want to run it against
>>> >> > cassandra). After doing this all when I run the code from eclipse I get
>>> >> > error
>>> >> > "Exception in thread "main" java.lang.SecurityException: Prohibited
>>> >> > package name: java.org.apache.nutch.crawl
>>> >> > at java.lang.ClassLoader.preDefineClass(ClassLoader.java:649)
>>> >> > at java.lang.ClassLoader.defineClass(ClassLoader.java:785)
>>> >> > at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)...."
>>> >> >
>>> >> > and have red '*' all over my code. Please see the attached image.
>>> >> >
>>> >> > Now what I do ?
>>> >> > Please any one could tell me that is it even possible to
>>> >> > compile/run/debug latest Nutch 2.x branch from Eclipse ?
>>> >> >
>>> >> > I need help here...............
>>> >> >
>>> >> > Tony !!!
>>> >> >
>>> >> > On Mon, Jun 10, 2013 at 12:15 PM, Tejas Patil <tejas.patil...@gmail.com> wrote:
>>> >> >>
>>> >> >> Hi Tony,
>>> >> >>
>>> >> >> That tutorial is based on some earlier nutch version. Please follow
>>> >> >> http://wiki.apache.org/nutch/RunNutchInEclipse#Checkout_Nutch_in_Eclipse .
>>> >> >> There has been recent changes to that wiki page and those new steps would
>>> >> >> take care of getting automation.jar and etc dependencies in place.
>>> >> >>
>>> >> >>
>>> >> >> On Sun, Jun 9, 2013 at 11:58 PM, Tony Mullins <tonymullins...@gmail.com> wrote:
>>> >> >>
>>> >> >> > Hi ,
>>> >> >> >
>>> >> >> > The last try I made was with this tutorial '
>>> >> >> > run nutch in eclipse | profilerajanimaski'
>>> >> >> > ,
>>> >> >> > after following word to word ( which didn't work for me) then I made some
>>> >> >> > modifications to it as for step 11 I added 'bin' , 'gora' , 'java' ,'test'
>>> >> >> > , 'testprocess' , 'testresources' . And for step 14 I couldn't find
>>> >> >> > 'src/plugin/url-filter-automation/lib/automation.jar' in my source.
>>> >> >> >
>>> >> >> > And when I try to run main 'Crawler' project it says there are errors and
>>> >> >> > give me option to proceed with errors and when I proceed with errors I am
>>> >> >> > getting this error:
>>> >> >> >
>>> >> >> > "InjectorJob: Using class org.apache.gora.memory.store.MemStore as the
>>> >> >> > Gora storage class.
>>> >> >> > InjectorJob: total number of urls rejected by filters: 0
>>> >> >> > InjectorJob: total number of urls injected after normalization and
>>> >> >> > filtering: 0
>>> >> >> > Exception in thread "main" java.lang.RuntimeException: job failed:
>>> >> >> > name=generate: null, jobid=job_local_0002.......
>>> >> >> > .....
>>> >> >> > "
>>> >> >> >
>>> >> >> > So please help me what I am doing wrong here or guide me to a tutorial
>>> >> >> > which works....
>>> >> >> > If the latest Nutch 2.2 source doesn't work with these tutorials then
>>> >> >> > which version of 2.x will work and how ?
>>> >> >> >
>>> >> >> > Thanks.
>>> >> >> > Tony
>>> >> >> >
>>> >> >> >
>>> >> >> > On Mon, Jun 10, 2013 at 7:20 AM, Tejas Patil <tejas.patil...@gmail.com> wrote:
>>> >> >> >
>>> >> >> >> Could you try closing and re-opening the eclipse and then let eclipse
>>> >> >> >> rebuild workspace. BTW: On which packages / classes do you see red dots ?
>>> >> >> >>
>>> >> >> >>
>>> >> >> >> On Sun, Jun 9, 2013 at 9:23 AM, Lewis John Mcgibbney <
>>> >> >> >> lewis.mcgibb...@gmail.com> wrote:
>>> >> >> >>
>>> >> >> >> > Hi Tony,
>>> >> >> >> > This source has literally just been released. The tutorial on the Nutch
>>> >> >> >> > wiki has also just been updated but you need to follow it closely and pay
>>> >> >> >> > attention to each step. It sounds like the red dots problem your having is
>>> >> >> >> > explained in the 2nd to last bullet point below
>>> >> >> >> >
>>> >> >> >> > http://wiki.apache.org/nutch/RunNutchInEclipse#Checkout_Nutch_in_Eclipse
>>> >> >> >> >
>>> >> >> >> > Also, you've not actually said what went wrong!
>>> >> >> >> > Lewis
>>> >> >> >> >
>>> >> >> >> >
>>> >> >> >> > On Sunday, June 9, 2013, Tony Mullins <tonymullins...@gmail.com> wrote:
>>> >> >> >> > > Hi,
>>> >> >> >> > >
>>> >> >> >> > > I am new to Nutch. I am trying to use Nutch with Cassandra and have
>>> >> >> >> > > successfully build the Nutch 2.x (
>>> >> >> >> > > http://svn.apache.org/repos/asf/nutch/branches/2.x/).
>>> >> >> >> > >
>>> >> >> >> > > But I get errors ( different errors after following different tutorials)
>>> >> >> >> > > when I try to run it directly from Eclipse ( I am on CentOS 6.4) , I have
>>> >> >> >> > > tried to follow these tutorials to run Nutch source from Eclipse but no use.
>>> >> >> >> > >
>>> >> >> >> > > http://wiki.apache.org/nutch/RunNutchInEclipse
>>> >> >> >> > > run nutch in eclipse | profilerajanimaski
>>> >> >> >> > > http://jarpit83.blogspot.com/2012/07/configuring-nutch-in-eclipse.html
>>> >> >> >> > > http://techvineyard.blogspot.com/2010/12/build-nutch-20.html
>>> >> >> >> > >
>>> >> >> >> > > Whatever I do, I get red "*" on my source and it doesn't get run by
>>> >> >> >> > > Eclipse , but it always get build successfully using Ant.
>>> >> >> >> > >
>>> >> >> >> > > Pleeeeaaase help me here, could any one please guide me to single web
>>> >> >> >> > > tutorial which actually could help me compile and run latest Nutch 2.x with
>>> >> >> >> > > Eclipse (Juno) on CentOS.
>>> >> >> >> > >
>>> >> >> >> > > Thanksss.
>>> >> >> >> > > Tony.
>>> >> >> >> > >
>>> >> >> >> >
>>> >> >> >> > --
>>> >> >> >> > *Lewis*
>>> >> >> >> >
>>> >>
>>> >> --
>>> >> *Lewis*
>>> >>
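
For reference, the 'ant eclipse' step discussed in this thread (get the project location from Eclipse, go to that location in a terminal, run 'ant eclipse') might look roughly like the shell helper below. This is only a sketch: the function name nutch_ant_eclipse and the example path in the usage note are made up for illustration, and it assumes the checkout has a build.xml that defines the 'eclipse' target, as the thread says recent 2.x source does.

```shell
# Hypothetical helper (not part of Nutch): regenerate the Eclipse project
# files for a Nutch 2.x checkout by running the 'eclipse' Ant target.
# $1 is the project location shown under Properties in Eclipse.
nutch_ant_eclipse() {
    project_dir="$1"

    # The Ant build file must be at the top of the checkout.
    if [ ! -f "$project_dir/build.xml" ]; then
        echo "no build.xml found in '$project_dir'" >&2
        return 1
    fi

    # Apache Ant has to be installed and on the PATH.
    if ! command -v ant >/dev/null 2>&1; then
        echo "Apache Ant not found on PATH" >&2
        return 2
    fi

    # Run the target from inside the project directory; the subshell keeps
    # the caller's working directory unchanged.
    (cd "$project_dir" && ant eclipse)
}
```

It would be called as, say, nutch_ant_eclipse "$HOME/workspace/nutch-2.x" with the project closed in Eclipse, followed by re-opening or refreshing the project there.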