Hi,

Thanks for your mail. It helped me.

Thanks,
David

On Mon, Jan 7, 2013 at 8:30 PM, kiran chitturi <[email protected]> wrote:

> Hi Michael,
>
> The Nutch2Tutorial [1] only covers configuring HBase with Nutch. The
> 'readdb' command needs a parameter to work with.
>
> Please check [2] for steps to crawl using Nutch 2 and HBase. There is also
> a patch in the issue [3] for using a script for crawling with Nutch 2.
>
> [1] - http://wiki.apache.org/nutch/Nutch2Tutorial
> [2] - http://sujitpal.blogspot.com/2011/01/exploring-nutch-20-hbase-storage.html
> [3] - https://issues.apache.org/jira/browse/NUTCH-1087
>
> Hope this helps!
>
> Regards,
> Kiran.
>
> On Mon, Jan 7, 2013 at 11:52 AM, Michael Gang <[email protected]>
> wrote:
>
> > Hi all,
> >
> > I am trying to follow the Nutch 2 tutorial at
> > http://wiki.apache.org/nutch/Nutch2Tutorial
> > but the tutorial ends after the inject step and I don't know how to
> > continue from there.
> >
> > When I try to run
> >
> > nutch readdb
> >
> > I get an error:
> >
> > :bin/nutch readdb
> > Usage: WebTableReader (-stats | -url [url] | -dump <out_dir> [-regex regex])
> >                       [-crawlId <id>] [-content] [-headers] [-links] [-text]
> >     -crawlId <id> - the id to prefix the schemas to operate on,
> >                     (default: storage.crawl.id)
> >     -stats [-sort] - print overall statistics to System.out
> >     [-sort] - list status sorted by host
> >     -url <url> - print information on <url> to System.out
> >     -dump <out_dir> [-regex regex] - dump the webtable to a text file in
> >                     <out_dir>
> >     -content - dump also raw content
> >     -headers - dump protocol headers
> >     -links - dump links
> >     -text - dump extracted text
> >     [-regex] - filter on the URL of the webtable entry
> >
> > I would like to know how I can configure Nutch so that it will crawl a
> > certain page and all its child pages.
> >
> > I see that this is the topic of the tutorial
> > http://wiki.apache.org/nutch/NutchTutorial
> > but I am not sure from which point to continue, as in Nutch 2 I am working
> > against HBase and not against a directory.
> >
> > Thanks,
> > David
> >
>
> --
> Kiran Chitturi
>
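
As noted in the reply, 'readdb' prints its usage text when called without a parameter. A minimal sketch of invocations that use only the flags listed in that usage output follows. The URL and the output directory name 'webtable_dump' are hypothetical placeholders, and the 'run' helper is an illustrative assumption (not part of Nutch) that only prints the command when bin/nutch is not found in the current directory.

```shell
#!/bin/sh
# Path to the Nutch launcher; assumes the current directory is the Nutch runtime.
NUTCH="${NUTCH:-bin/nutch}"

# Hypothetical helper (not part of Nutch): run the command if the launcher
# exists and is executable, otherwise only print what would be run.
run() {
    if [ -x "$NUTCH" ]; then
        "$NUTCH" "$@"
    else
        echo "would run: $NUTCH $*"
    fi
}

# Print overall statistics for the webtable.
run readdb -stats

# Print information on a single URL (placeholder URL).
run readdb -url "http://example.com/"

# Dump the webtable to a directory, including extracted text and links
# (placeholder directory name).
run readdb -dump webtable_dump -text -links
```

If several crawls share one HBase instance, the '-crawlId <id>' option from the usage text can be appended to any of these calls to select the schema prefix.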

