Hi,

I am running Nutch 2.3 with Gora and HBase. Below are sample field values
from an HBase scan:

baseUrl: value=http://www.apache.org
status: value=\x00\x00\x00\x01
prevFetchTime: value=\x00\x00\x01L\x91]\xF5\x1C
fetchTime: value=\x00\x00\x01L\x93\x92\x0F\x5C
fetchInterval: value=\x00'\x8D\x00
retriesSinceFetch: value=\x00\x00\x00\x00
reprUrl: value=http://www.apache.org
protocolStatus: value=\x02\x00\x00
modifiedTime: value=\x00\x00\x01L\x93@\xE1H
prevModifiedTime: value=\x00\x00\x00\x00\x00\x00\x00\x00
batchId: value=1428399528-1598360492
parseStatus: value=\x02\x00\x00
signature: value=\xD7\xA7\x04pT7?E\xFA\x1A\x01"\x08\x89$0
prevSignature: value=\x85\xC2i@\xFC(\xDE\xEEt?\xE7\xFB\xE1rY\xAF
score: value=\x00\x00\x00\x00




Below is the relevant part of my gora-hbase-mapping.xml for these fields:

        <field name="baseUrl" family="f" qualifier="bas"/>
        <field name="status" family="f" qualifier="st"/>
        <field name="prevFetchTime" family="f" qualifier="pts"/>
        <field name="fetchTime" family="f" qualifier="ts"/>
        <field name="fetchInterval" family="f" qualifier="fi"/>
        <field name="retriesSinceFetch" family="f" qualifier="rsf"/>
        <field name="reprUrl" family="f" qualifier="rpr"/>
        <field name="content" family="f" qualifier="cnt"/>
        <field name="contentType" family="f" qualifier="typ"/>
        <field name="protocolStatus" family="f" qualifier="prot"/>
        <field name="modifiedTime" family="f" qualifier="mod"/>
        <field name="prevModifiedTime" family="f" qualifier="pmod"/>
        <field name="batchId" family="f" qualifier="bid"/>
        <field name="title" family="p" qualifier="t"/>
        <field name="text" family="p" qualifier="c"/>
        <field name="parseStatus" family="p" qualifier="st"/>
        <field name="signature" family="p" qualifier="sig"/>
        <field name="prevSignature" family="p" qualifier="psig"/>
        <field name="score" family="s" qualifier="s"/>



Q: Is there a way to configure Nutch/Gora/HBase so that it stores the values
in readable form, like the following, so that no field type conversion is needed?

baseUrl:    null
status: 4 (status_redir_temp)
fetchTime:  1426888912463
prevFetchTime:  1424296904936
fetchInterval:  2592000
retriesSinceFetch:  0
modifiedTime:   0
prevModifiedTime:   0
protocolStatus: (null)
parseStatus:    (null)
title:  null
score:  1.0
marker _injmrk_ :   y
marker dist :   0
reprUrl:    null
batchId:    1424296906-20007
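
For reference, the conversion in question can be sketched roughly as follows. This is only a minimal sketch, and it assumes the binary cells in the scan are fixed-width big-endian values (the layout that HBase's Bytes utility produces for int, long and float). The class name HBaseScanDecode and its helper methods are made up for illustration and are not part of Nutch, Gora or HBase. It uses only the JDK, so it runs without a cluster.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;

// Hypothetical helper, not part of Nutch/Gora/HBase: decodes values copied
// from an HBase shell scan, assuming big-endian fixed-width encoding.
public class HBaseScanDecode {

    // Turn HBase-shell escaped text (e.g. \x00'\x8D\x00) back into raw bytes.
    // "\xNN" becomes one byte; any other character is taken as its ASCII byte.
    static byte[] unescape(String s) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int i = 0;
        while (i < s.length()) {
            char c = s.charAt(i);
            if (c == '\\' && i + 3 < s.length() && s.charAt(i + 1) == 'x') {
                out.write(Integer.parseInt(s.substring(i + 2, i + 4), 16));
                i += 4;
            } else {
                out.write(c);
                i += 1;
            }
        }
        return out.toByteArray();
    }

    // ByteBuffer reads big-endian by default.
    static int toInt(String escaped) {
        return ByteBuffer.wrap(unescape(escaped)).getInt();
    }

    static long toLong(String escaped) {
        return ByteBuffer.wrap(unescape(escaped)).getLong();
    }

    static float toFloat(String escaped) {
        return ByteBuffer.wrap(unescape(escaped)).getFloat();
    }

    public static void main(String[] args) {
        // Values copied from the scan output earlier in this message.
        System.out.println("status        = " + toInt("\\x00\\x00\\x00\\x01"));                   // 1
        System.out.println("fetchInterval = " + toInt("\\x00'\\x8D\\x00"));                       // 2592000
        System.out.println("fetchTime     = " + toLong("\\x00\\x00\\x01L\\x93\\x92\\x0F\\x5C"));  // 1428404965212
        System.out.println("score         = " + toFloat("\\x00\\x00\\x00\\x00"));                 // 0.0
    }
}
```

Under that assumption, fetchInterval decodes to 2592000 (30 days in seconds) and fetchTime to an epoch-millisecond timestamp in April 2015, which is in line with the batchId. Fields such as protocolStatus and parseStatus do not look like plain numbers and are not covered by this sketch.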


Please help!

Regards
