[jira] Issue Comment Edited: (NUTCH-650) Hbase Integration

Xiao Yang (JIRA) Fri, 22 Jan 2010 00:00:49 -0800

    [ https://issues.apache.org/jira/browse/NUTCH-650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12803623#action_12803623 ]


Xiao Yang edited comment on NUTCH-650 at 1/22/10 7:59 AM:
----------------------------------------------------------

Notes on NUTCH-650.patch:
1. The API in hbase-0.20.0-r804408.jar differs from the API in the final release.
2. Avoid some NullPointerExceptions.
3. Change an invalid column family name.
4. Add an "id" field to the index to avoid this error:
java.lang.IllegalArgumentException: it doesn't make sense to have a field that is neither indexed nor stored
        at org.apache.lucene.document.Field.<init>(Field.java:279)
        at org.apache.nutch.indexer.lucene.LuceneWriter.createLuceneDoc(LuceneWriter.java:136)
        at org.apache.nutch.indexer.lucene.LuceneWriter.write(LuceneWriter.java:245)
        at org.apache.nutch.indexer.IndexerOutputFormat$1.write(IndexerOutputFormat.java:46)
        at org.apache.nutch.indexer.IndexerOutputFormat$1.write(IndexerOutputFormat.java:41)
        at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
        at org.apache.nutch.indexer.IndexerReducer.reduce(IndexerReducer.java:79)
        at org.apache.nutch.indexer.IndexerReducer.reduce(IndexerReducer.java:20)
        at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:174)
        at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:563)
        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:408)
        at org.apache.hadoop.mapred.Child.main(Child.java:170)
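
The exception in item 4 is thrown from the Lucene Field constructor when a field is set up as neither stored nor indexed. As a rough illustration of that rule only, here is a minimal self-contained sketch. The enums and the checkFieldOptions/isValid helpers are hypothetical stand-ins written for this note, not the real Lucene or Nutch classes, and the sketch does not show what NUTCH-650.patch actually changes.

```java
public class FieldOptionsSketch {
    // Hypothetical stand-ins for the store/index options a Lucene field carries.
    enum Store { YES, NO }
    enum Index { ANALYZED, NOT_ANALYZED, NO }

    // Models the validation behind the exception in the trace above:
    // a field that is neither stored nor indexed is rejected.
    static void checkFieldOptions(String name, Store store, Index index) {
        if (store == Store.NO && index == Index.NO) {
            throw new IllegalArgumentException(
                "it doesn't make sense to have a field that is neither indexed nor stored");
        }
    }

    // Convenience wrapper: true if the option pair would be accepted.
    static boolean isValid(String name, Store store, Index index) {
        try {
            checkFieldOptions(name, store, index);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // A stored, non-analyzed "id" field passes the check.
        System.out.println(isValid("id", Store.YES, Index.NOT_ANALYZED));
        // A field with no store and no index option fails it.
        System.out.println(isValid("id", Store.NO, Index.NO));
    }
}
```

Under this assumption, any field written to the index needs at least one of the two options enabled, which is consistent with the error going away once the "id" field is given index options.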



      was (Author: yangxiao):
1. The API in hbase-0.20.0-r804408.jar differs from the API in the final release.
2. Avoid some NullPointerExceptions.
3. Change an invalid column family name.
4. Add an "id" field to the index to avoid this error:
java.lang.IllegalArgumentException: it doesn't make sense to have a field that is neither indexed nor stored
        at org.apache.lucene.document.Field.<init>(Field.java:279)
        at org.apache.nutch.indexer.lucene.LuceneWriter.createLuceneDoc(LuceneWriter.java:136)
        at org.apache.nutch.indexer.lucene.LuceneWriter.write(LuceneWriter.java:245)
        at org.apache.nutch.indexer.IndexerOutputFormat$1.write(IndexerOutputFormat.java:46)
        at org.apache.nutch.indexer.IndexerOutputFormat$1.write(IndexerOutputFormat.java:41)
        at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
        at org.apache.nutch.indexer.IndexerReducer.reduce(IndexerReducer.java:79)
        at org.apache.nutch.indexer.IndexerReducer.reduce(IndexerReducer.java:20)
        at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:174)
        at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:563)
        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:408)
        at org.apache.hadoop.mapred.Child.main(Child.java:170)


  
> Hbase Integration
> -----------------
>
>                 Key: NUTCH-650
>                 URL: https://issues.apache.org/jira/browse/NUTCH-650
>             Project: Nutch
>          Issue Type: New Feature
>    Affects Versions: 1.0.0
>            Reporter: Doğacan Güney
>            Assignee: Doğacan Güney
>             Fix For: 1.1
>
>         Attachments: hbase-integration_v1.patch, hbase_v2.patch, 
> malformedurl.patch, meta.patch, meta2.patch, nofollow-hbase.patch, 
> NUTCH-650.patch, nutch-habase.patch, searching.diff, slash.patch
>
>
> This issue will track nutch/hbase integration

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
