Thanks Feng!!!

Renato M.

2013/5/15 Adriana Farina <[email protected]>:
> Thank you very much!
>
> 2013/5/14 feng lu <[email protected]>
>
>> Yes, the id will be stored in HBase automatically. The outlinks extracted
>> from a seed url will not have any of this information: it is stored only
>> as part of the metadata of the current url.
>>
>>
>>
>>
>> On Fri, May 10, 2013 at 10:59 PM, Renato Marroquín Mogrovejo <
>> [email protected]> wrote:
>>
>> > Hi Feng,
>> >
>> > So this means I could put any type of information for the seed urls, but
>> > what about the ones fetched in the next cycles? They won't have any of
>> > this information, right?
>> > And where is this information stored? As part of the fetched or the
>> > parsed information?
>> > Thanks!
>> >
>> > Renato M.
>> > On May 10, 2013 9:46 AM, "Adriana Farina" <[email protected]>
>> > wrote:
>> >
>> > > And the ids will be automatically stored in HBase?
>> > >
>> > >
>> > > 2013/5/10 feng lu <[email protected]>
>> > >
>> > > > Hi Adriana
>> > > >
>> > > > You can add metadata to each seed url like this:
>> > > >
>> > > > http://www.example.com  id=123
>> > > > http://www.example.com  id=456
>> > > >
>> > > > Each CrawlDatum can hold many metadata entries, so you can use them to
>> > > > store any information about a url.
>> > > >
>> > > >
>> > > >
>> > > >
>> > > >
>> > > > On Fri, May 10, 2013 at 5:26 PM, Adriana Farina <[email protected]> wrote:
>> > > >
>> > > > > Hello,
>> > > > >
>> > > > > I'm using Nutch 2.1 on top of Hadoop 1.0.4, with HBase 0.90.4 as the
>> > > > > storage system. I run Nutch in distributed mode.
>> > > > >
>> > > > > I need to associate an id with each url in the Nutch seed list and
>> > > > > store this information in HBase. I think I have to create a new
>> > > > > column family in HBase and modify the gora and hbase configuration
>> > > > > files in the nutch conf folder.
>> > > > >
>> > > > > I think I also need to modify the Nutch code, but I don't know
>> > > > > which classes I have to modify. I googled a bit, but I didn't find
>> > > > > any documentation; I've searched inside the code but I wasn't able
>> > > > > to solve my problem.
>> > > > >
>> > > > > Can anybody help me?
>> > > > >
>> > > > > Thank you!
>> > > > >
>> > > > >
>> > > > > --
>> > > > > Adriana Farina
>> > > > >
>> > > >
>> > > >
>> > > >
>> > > > --
>> > > > Don't Grow Old, Grow Up... :-)
>> > > >
>> > >
>> > >
>> > >
>> > > --
>> > > Adriana Farina
>> > >
>> >
>>
>>
>>
>> --
>> Don't Grow Old, Grow Up... :-)
>>
>
>
>
> --
> Adriana Farina
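
The seed-list format Feng describes in the thread above (a url followed by key=value metadata) can be sketched as a small parser. This is only an illustration of the format, not Nutch's injector code: the function name `parse_seed_line` is hypothetical, and the sketch assumes fields are separated by whitespace (as far as I know, real Nutch seed files use a tab between the url and each key=value pair).

```python
def parse_seed_line(line):
    """Split one seed-list line into (url, metadata dict).

    Hypothetical helper for illustration only; it is not part of Nutch.
    """
    # Assumption: fields are whitespace-separated (a tab in a real seed file).
    fields = line.split()
    if not fields or fields[0].startswith("#"):
        return None  # skip blank lines and comment lines
    url = fields[0]
    metadata = {}
    for field in fields[1:]:
        key, sep, value = field.partition("=")
        if sep:  # keep only well-formed key=value pairs
            metadata[key] = value
    return url, metadata


# The two seed entries from the thread, written with a tab separator.
seeds = [
    "http://www.example.com\tid=123",
    "http://www.example.com\tid=456",
]

for seed in seeds:
    url, metadata = parse_seed_line(seed)
    print(url, metadata)
```

As Feng notes, the metadata stays with the url it was injected with; outlinks discovered in later cycles would not carry the `id` value unless something copies it over explicitly.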
