Hi All,

I have done as per below and can create a table from within the hbase shell. I found the appropriate create table method

bin/nutch org.apache.nutch.storage.WebTableCreator webtable

but it only returns null.
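For reference, creating the table by hand in the hbase shell would look roughly like the following. This is only a sketch: it assumes the table name and column family names from the gora-hbase-mapping.xml quoted below, and it is not a transcript of what was actually typed.

```
# Assumed: table name and families copied from gora-hbase-mapping.xml
create 'webtable', 'p', 'f', 's', 'il', 'ol', 'h', 'mtdt', 'mk'

# Check that the table and its column families exist
describe 'webtable'
```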
Any help would be great

Regards
Dave

On 2 Sep 2010, at 13:12, Julien Nioche wrote:

> Hi David,
>
> I haven't used the Hbase backend with GORA for quite some time but from what
> I can remember you'll need the following things:
>
> * conf/hbase-site.xml => this should correspond to your local configuration
> * conf/gora-hbase-mapping.xml => see below
> * conf/gora.properties => don't think there is anything you need to specify
>   for Hbase
>
> * in nutch-site.xml
>
> <property>
>   <name>storage.data.store.class</name>
>   <value>org.gora.hbase.store.HbaseStore</value>
>   <description>Default class for storing data</description>
> </property>
>
> and of course all the necessary Hbase jars in the /lib dir - probably easier
> to modify ivy/ivy.xml so that it includes Hbase
>
> gora-hbase-mapping.xml : not sure this is the latest version though
>
> <?xml version="1.0" encoding="UTF-8"?>
>
> <gora-orm>
>
>   <table name="webtable">
>     <family name="p"/> <!-- This can also have params like compression, bloom filters -->
>     <family name="f"/>
>     <family name="s"/>
>     <family name="il"/>
>     <family name="ol"/>
>     <family name="h"/>
>     <family name="mtdt"/>
>     <family name="mk"/>
>   </table>
>
>   <class table="webtable" keyClass="java.lang.String" name="org.apache.nutch.storage.WebPage">
>     <!-- fetch fields -->
>     <field name="baseUrl" family="f" qualifier="bas"/>
>     <field name="status" family="f" qualifier="st"/>
>     <field name="prevFetchTime" family="f" qualifier="pts"/>
>     <field name="fetchTime" family="f" qualifier="ts"/>
>     <field name="fetchInterval" family="f" qualifier="fi"/>
>     <field name="retriesSinceFetch" family="f" qualifier="rsf"/>
>     <field name="reprUrl" family="f" qualifier="rpr"/>
>     <field name="content" family="f" qualifier="cnt"/>
>     <field name="contentType" family="f" qualifier="typ"/>
>     <field name="protocolStatus" family="f" qualifier="prot"/>
>     <field name="modifiedTime" family="f" qualifier="mod"/>
>
>     <!-- parse fields -->
>     <field name="title" family="p" qualifier="t"/>
>     <field name="text" family="p" qualifier="c"/>
>     <field name="parseStatus" family="p" qualifier="st"/>
>     <field name="signature" family="p" qualifier="sig"/>
>     <field name="prevSignature" family="p" qualifier="psig"/>
>
>     <!-- score fields -->
>     <field name="score" family="s" qualifier="s"/>
>
>     <field name="headers" family="h"/>
>
>     <field name="inlinks" family="il"/>
>
>     <field name="outlinks" family="ol"/>
>
>     <field name="metadata" family="mtdt"/>
>
>     <field name="markers" family="mk"/>
>
>   </class>
>
> </gora-orm>
>
> HTH
>
> Good luck!
>
> Julien
>
> --
> Open Source Solutions for Text Engineering
>
> http://digitalpebble.blogspot.com/
> http://www.digitalpebble.com
>
> On 2 September 2010 12:58, David Stuart
> <david.stu...@progressivealliance.co.uk> wrote:
>
> Hey All,
>
> I have set up the latest version of Nutch from trunk and am running into a
> few issues with hbase and injecting urls. When I run the command
>
> runtime/local/bin/nutch inject runtime/local/seed/
>
> I get
>
> InjectorJob: java.lang.RuntimeException: Could not create datastore
>     at org.apache.nutch.storage.StorageUtils.initMapperJob(StorageUtils.java:70)
>     at org.apache.nutch.storage.StorageUtils.initMapperJob(StorageUtils.java:50)
>     at org.apache.nutch.crawl.InjectorJob.inject(InjectorJob.java:233)
>     at org.apache.nutch.crawl.InjectorJob.run(InjectorJob.java:246)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>     at org.apache.nutch.crawl.InjectorJob.main(InjectorJob.java:256)
>
> Under the gora properties it should be pointing at localhost/nutchtest and I
> created that store manually in hbase. Is that right?
>
> I have found a few tutorials around nutchbase but the API seems to have
> changed since the merge with Nutch trunk.
>
> Any help would be appreciated and I will try to do a how-to writeup.
>
> Regards,
>
> Dave