[ https://issues.apache.org/jira/browse/NUTCH-650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12803623#action_12803623 ]
Xiao Yang edited comment on NUTCH-650 at 1/22/10 7:59 AM:
----------------------------------------------------------

Some notes on NUTCH-650.patch:
1. The API in hbase-0.20.0-r804408.jar differs from the final release.
2. Avoid some NullPointerException errors.
3. Change the invalid column family name.
4. Add an "id" field to the index to avoid this error:
java.lang.IllegalArgumentException: it doesn't make sense to have a field that is neither indexed nor stored
    at org.apache.lucene.document.Field.<init>(Field.java:279)
    at org.apache.nutch.indexer.lucene.LuceneWriter.createLuceneDoc(LuceneWriter.java:136)
    at org.apache.nutch.indexer.lucene.LuceneWriter.write(LuceneWriter.java:245)
    at org.apache.nutch.indexer.IndexerOutputFormat$1.write(IndexerOutputFormat.java:46)
    at org.apache.nutch.indexer.IndexerOutputFormat$1.write(IndexerOutputFormat.java:41)
    at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
    at org.apache.nutch.indexer.IndexerReducer.reduce(IndexerReducer.java:79)
    at org.apache.nutch.indexer.IndexerReducer.reduce(IndexerReducer.java:20)
    at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:174)
    at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:563)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:408)
    at org.apache.hadoop.mapred.Child.main(Child.java:170)

      was (Author: yangxiao):
    1. The API in hbase-0.20.0-r804408.jar differs from the final release.
2. Avoid some NullPointerException errors.
3. Change the invalid column family name.
4. Add an "id" field to the index to avoid this error:
java.lang.IllegalArgumentException: it doesn't make sense to have a field that is neither indexed nor stored
    at org.apache.lucene.document.Field.<init>(Field.java:279)
    at org.apache.nutch.indexer.lucene.LuceneWriter.createLuceneDoc(LuceneWriter.java:136)
    at org.apache.nutch.indexer.lucene.LuceneWriter.write(LuceneWriter.java:245)
    at org.apache.nutch.indexer.IndexerOutputFormat$1.write(IndexerOutputFormat.java:46)
    at org.apache.nutch.indexer.IndexerOutputFormat$1.write(IndexerOutputFormat.java:41)
    at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
    at org.apache.nutch.indexer.IndexerReducer.reduce(IndexerReducer.java:79)
    at org.apache.nutch.indexer.IndexerReducer.reduce(IndexerReducer.java:20)
    at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:174)
    at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:563)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:408)
    at org.apache.hadoop.mapred.Child.main(Child.java:170)

> Hbase Integration
> -----------------
>
>                 Key: NUTCH-650
>                 URL: https://issues.apache.org/jira/browse/NUTCH-650
>             Project: Nutch
>          Issue Type: New Feature
>    Affects Versions: 1.0.0
>            Reporter: Doğacan Güney
>            Assignee: Doğacan Güney
>             Fix For: 1.1
>
>         Attachments: hbase-integration_v1.patch, hbase_v2.patch, malformedurl.patch, meta.patch, meta2.patch, nofollow-hbase.patch, NUTCH-650.patch, nutch-habase.patch, searching.diff, slash.patch
>
>
> This issue will track nutch/hbase integration

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.