Thanks Feng!!!
Renato M.

2013/5/15 Adriana Farina <[email protected]>:
> Thank you very much!
>
> 2013/5/14 feng lu <[email protected]>
>
>> Yes, the id will be automatically stored in HBase, and the outlinks
>> extracted from the seed url will not have any of this information. The
>> information is stored as part of the metadata of the current url.
>>
>> On Fri, May 10, 2013 at 10:59 PM, Renato Marroquín Mogrovejo
>> <[email protected]> wrote:
>>
>> > Hi Feng,
>> >
>> > So this means I could put any type of information for the seed urls,
>> > but what about the ones fetched in the next cycles? They won't have
>> > any of this information, right?
>> > And where is this information stored? As part of the fetched or the
>> > parsed information?
>> > Thanks!
>> >
>> > Renato M.
>> > On May 10, 2013 9:46 AM, "Adriana Farina" <[email protected]>
>> > wrote:
>> >
>> > > And the ids will be automatically stored in HBase?
>> > >
>> > > 2013/5/10 feng lu <[email protected]>
>> > >
>> > > > Hi Adriana
>> > > >
>> > > > You can add metadata to each seed url like this:
>> > > >
>> > > > http://www.example.com id=123
>> > > > http://www.example.com id=456
>> > > >
>> > > > Each CrawlDatum includes metadata; you can use that to store any
>> > > > information about the url.
>> > > >
>> > > > On Fri, May 10, 2013 at 5:26 PM, Adriana Farina
>> > > > <[email protected]> wrote:
>> > > >
>> > > > > Hello,
>> > > > >
>> > > > > I'm using Nutch 2.1 on top of Hadoop 1.0.4, with HBase 0.90.4
>> > > > > as storage system. I run Nutch in distributed mode.
>> > > > >
>> > > > > I need to associate an id with each url inside the seed list of
>> > > > > Nutch and to store this information in HBase. I think that I
>> > > > > have to create a new column family in HBase and modify the gora
>> > > > > and hbase configuration files in the nutch conf folder.
>> > > > >
>> > > > > However, I think I need to modify the code of Nutch, but I
>> > > > > don't know which classes I have to modify. I googled a bit, but
>> > > > > I didn't find any documentation; I've searched inside the code
>> > > > > but I wasn't able to solve my problem.
>> > > > >
>> > > > > Can anybody help me?
>> > > > >
>> > > > > Thank you!
>> > > > >
>> > > > > --
>> > > > > Adriana Farina
>> > > >
>> > > > --
>> > > > Don't Grow Old, Grow Up... :-)
>> > >
>> > > --
>> > > Adriana Farina
>>
>> --
>> Don't Grow Old, Grow Up... :-)
>
> --
> Adriana Farina
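
For anyone preparing such a seed list by script, here is a minimal sketch of the "url key=value" format Feng describes. It assumes the url and each key=value pair are separated by a TAB character; the flattened mail above does not show the separator, so check that against your Nutch version. The helper names format_seed_line and parse_seed_line are made up for this illustration and are not part of Nutch.

```python
# Sketch only: build and parse seed-list lines that carry per-url metadata.
# Assumption: fields are TAB-separated (url, then key=value pairs).
# format_seed_line / parse_seed_line are hypothetical helpers, not Nutch APIs.

def format_seed_line(url, metadata):
    """Return one seed-list line: the url followed by key=value fields."""
    fields = [url] + [f"{key}={value}" for key, value in metadata.items()]
    return "\t".join(fields)


def parse_seed_line(line):
    """Split a seed-list line back into (url, metadata dict)."""
    url, *pairs = line.rstrip("\n").split("\t")
    metadata = {}
    for pair in pairs:
        key, sep, value = pair.partition("=")
        if sep:  # ignore fields that are not of the form key=value
            metadata[key.strip()] = value.strip()
    return url, metadata


if __name__ == "__main__":
    # Example data taken from the thread: one url, one id.
    line = format_seed_line("http://www.example.com", {"id": "123"})
    print(repr(line))
    print(parse_seed_line(line))
```

Writing the resulting lines to the seed file and injecting it as usual should then attach the id to each seed url only; as noted in the thread, urls discovered in later cycles will not carry it.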

