Hi Doğacan,

I have checked out Nutchbase from
http://svn.apache.org/repos/asf/lucene/nutch/branches/nutchbase/
My HBase version is 0.20.2.

createtable succeeded, but inject doesn't work.

$ bin/nutch createtable crawl

Here is the status of HBase:

hbase(main):014:0> list
10/01/12 15:37:43 DEBUG client.HConnectionManager$TableServers: Cache hit for row <> in tableName .META.: location server 10.214.10.146:34592, location region name .META.,,1
crawl
1 row(s) in 0.0110 seconds

$ bin/nutch inject crawl urls
Injector: starting
Injector: urlDir: urls
Injecting new users failed!

Here is the log:

2010-01-12 15:38:57,515 WARN mapred.LocalJobRunner - job_local_0001
java.lang.reflect.UndeclaredThrowableException
        at $Proxy0.getRegionInfo(Unknown Source)
        at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRootRegion(HConnectionManager.java:874)
        at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:515)
        at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:491)
        at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegionInMeta(HConnectionManager.java:565)
        at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:524)
        at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:491)
        at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegionInMeta(HConnectionManager.java:565)
        at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:528)
        at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:491)
        at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:123)
        at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:101)
        at org.apache.nutch.crawl.Injector$UrlMapper.setup(Injector.java:102)
        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:518)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:303)
        at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:176)
Caused by: org.apache.hadoop.ipc.RemoteException: java.io.IOException: java.lang.NullPointerException
        at java.lang.Class.searchMethods(Class.java:2646)
        at java.lang.Class.getMethod0(Class.java:2670)
        at java.lang.Class.getMethod(Class.java:1603)
        at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:643)
        at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)

        at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:720)
        at org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:329)
        ... 17 more
2010-01-12 15:38:57,806 WARN crawl.Injector - Injecting new users failed!

What's the problem?

Thanks!
Xiao

2009/8/17 Doğacan Güney (JIRA) <j...@apache.org>:
>
> [ https://issues.apache.org/jira/browse/NUTCH-650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12743919#action_12743919 ]
>
> Doğacan Güney commented on NUTCH-650:
> -------------------------------------
>
> I just committed code to branch nutchbase. The scoring API did not turn out as clean as I expected but I decided to put in what I have. Also, I made some changes so that web UI also works.
>
> I am leaving this issue open because I will add documentation tomorrow. Meanwhile,
>
> To download:
>
> svn co http://svn.apache.org/repos/asf/lucene/nutch/branches/nutchbase
>
> Usage:
>
> After starting hbase 0.20 (checkout rev. 804408 from hbase branch 0.20), create a webtable with
>
> bin/nutch createtable webtable
>
> After that, usage is similar.
>
> bin/nutch inject webtable url_dir # inject urls
>
> for as many cycles as you want;
> bin/nutch generate webtable # -topN N works
> bin/nutch fetch webtable # -threads N works
> bin/nutch parse webtable
> bin/nutch updatetable webtable
>
> bin/nutch index <index> webtable
> or
> bin/nutch solrindex <solr url> webtable
>
> To use solr, use this schema file
> http://www.ceng.metu.edu.tr/~e1345172/schema.xml
>
> Again, a note of warning: This is extremely new code. I hope people will test and use it but there is no guarantee that it will work :)
>
>> Hbase Integration
>> -----------------
>>
>> Key: NUTCH-650
>> URL: https://issues.apache.org/jira/browse/NUTCH-650
>> Project: Nutch
>> Issue Type: New Feature
>> Affects Versions: 1.0.0
>> Reporter: Doğacan Güney
>> Assignee: Doğacan Güney
>> Fix For: 1.1
>>
>> Attachments: hbase-integration_v1.patch, hbase_v2.patch, malformedurl.patch, meta.patch, meta2.patch, nofollow-hbase.patch, nutch-habase.patch, searching.diff, slash.patch
>>
>>
>> This issue will track nutch/hbase integration
>
> --
> This message is automatically generated by JIRA.
> -
> You can reply to this email to add a comment to the issue online.
>
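
For reference, the cycle described in the quoted usage notes could be wrapped in a small shell function along these lines. This is only a sketch: the function name run_crawl, the NUTCH variable, and the default of 2 cycles are my own illustrative choices, not something shipped with Nutchbase. It assumes the table was already created with bin/nutch createtable, and it leaves out the optional -topN and -threads flags and the final index/solrindex step.

```shell
#!/bin/sh
# Sketch of the crawl cycle from the quoted usage notes.
# NUTCH, run_crawl and the default cycle count are illustrative assumptions.
NUTCH="${NUTCH:-bin/nutch}"

run_crawl() {
    table="$1"        # table created beforehand, e.g. with: bin/nutch createtable webtable
    url_dir="$2"      # directory containing the seed URL files
    cycles="${3:-2}"  # number of generate/fetch/parse/updatetable rounds

    # Seed the table once; stop here if injection fails.
    $NUTCH inject "$table" "$url_dir" || return 1

    i=1
    while [ "$i" -le "$cycles" ]; do
        # Each step depends on the previous one, so abort on the first failure.
        $NUTCH generate "$table"    || return 1
        $NUTCH fetch "$table"       || return 1
        $NUTCH parse "$table"       || return 1
        $NUTCH updatetable "$table" || return 1
        i=$((i + 1))
    done
}
```

It would be called from the Nutch checkout directory as, for example, run_crawl webtable url_dir 3. Returning on the first failing step means a failure in inject (like the one reported above) stops the run before generate is attempted.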