Re: [Nutch-dev] get CrawlDatum

2006-08-30 Thread HUYLEBROECK Jeremy RD-ILAB-SSF
My current solution is a modified Fetcher that puts the info into the Parse Metadata in its output method. That info can then be used during parsing and in later steps. As Andrzej said, I also had to create my own OutputFormat.
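A minimal sketch of that idea, not the actual patch: the fetcher side copies per-URL info into the parse metadata, and a later output step reads it back without needing the original CrawlDatum. `DatumStub`, `ParseMetaStub`, the `datum.` key prefix and both helper methods are hypothetical stand-ins made up for illustration; they are not Nutch classes or Nutch API.

```java
import java.util.HashMap;
import java.util.Map;

class CrawlDatumMetadataSketch {

    /** Hypothetical stand-in for the per-URL crawl record (NOT Nutch's CrawlDatum). */
    static final class DatumStub {
        final Map<String, String> meta = new HashMap<>();
    }

    /** Hypothetical stand-in for parse-level metadata (NOT Nutch's ParseData). */
    static final class ParseMetaStub {
        final Map<String, String> meta = new HashMap<>();
    }

    /** Assumed key prefix that marks entries copied from the datum. */
    static final String PREFIX = "datum.";

    /** Fetcher-side step: copy the datum's info into the parse metadata. */
    static void copyDatumInfo(DatumStub datum, ParseMetaStub parseMeta) {
        for (Map.Entry<String, String> e : datum.meta.entrySet()) {
            parseMeta.meta.put(PREFIX + e.getKey(), e.getValue());
        }
    }

    /** Output-side step: only the parse metadata is visible here, so read the copied entries back. */
    static Map<String, String> metadataForOutlinks(ParseMetaStub parseMeta) {
        Map<String, String> out = new HashMap<>();
        for (Map.Entry<String, String> e : parseMeta.meta.entrySet()) {
            if (e.getKey().startsWith(PREFIX)) {
                out.put(e.getKey().substring(PREFIX.length()), e.getValue());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        DatumStub datum = new DatumStub();
        datum.meta.put("category", "news");

        ParseMetaStub parseMeta = new ParseMetaStub();
        parseMeta.meta.put("title", "Example page");

        copyDatumInfo(datum, parseMeta);
        System.out.println(metadataForOutlinks(parseMeta));
    }
}
```

The prefix keeps the copied entries separate from metadata the parser adds itself, so the output step can pick out only the values that came from the URL being crawled.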

Re: [Nutch-dev] get CrawlDatum

2006-08-30 Thread Uroš Gruber
Andrzej Bialecki wrote:
> Uroš Gruber wrote:
>> ParseData.metadata sounds nice, but I think I'm lost again :)
>> If I understand code flow the best place would be in Fetcher [262]
>>
>> but i'm not sure that datum holds info of url being fetched
>
> On the input to the fetcher you get a URL and a CrawlDatum (o

Re: [Nutch-dev] get CrawlDatum

2006-08-30 Thread Andrzej Bialecki
Uroš Gruber wrote:
> ParseData.metadata sounds nice, but I think I'm lost again :)
> If I understand code flow the best place would be in Fetcher [262]
>
> but i'm not sure that datum holds info of url being fetched

On the input to the fetcher you get a URL and a CrawlDatum (originally coming fro

Re: [Nutch-dev] get CrawlDatum

2006-08-30 Thread Uroš Gruber
Andrzej Bialecki wrote:
> Uroš Gruber wrote:
>> Hi,
>>
>> Could someone point me how to get CrawlDatum data from key url in
>> ParseOutputFormat.write [83].
>> I would like to add data to link urls but this data depend on data of
>> url being crawled.
>
> You can't, because that instance of Crawl

Re: [Nutch-dev] get CrawlDatum

2006-08-30 Thread Andrzej Bialecki
Uroš Gruber wrote:
> Hi,
>
> Could someone point me how to get CrawlDatum data from key url in
> ParseOutputFormat.write [83].
> I would like to add data to link urls but this data depend on data of
> url being crawled.

You can't, because that instance of CrawlDatum is not available at this pla

[Nutch-dev] get CrawlDatum

2006-08-30 Thread Uroš Gruber
Hi,

Could someone point me to how to get the CrawlDatum data for the key URL in ParseOutputFormat.write [83]? I would like to add data to the link URLs, but this data depends on the data of the URL being crawled.

I hope I was clear enough about my problem.

Regards,
Uros