Implement a ScoringFilter, specifically the generatorSortValue() method, and 
emit a high float for image MIME types.
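
A rough sketch of that idea, in plain Java so it compiles without Nutch on the classpath. In a real plugin this logic would sit inside a class implementing org.apache.nutch.scoring.ScoringFilter, in its generatorSortValue(Text url, CrawlDatum datum, float initSort) method. The class name ImageSortBoost, the extension list, and the boost constant are all hypothetical. Guessing the type from the URL extension is also an assumption on my part: a link that has never been fetched has no MIME type recorded yet, so the extension is used here as a stand-in.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

/** Hypothetical helper, not part of Nutch: boosts the generator sort value of image-like URLs. */
final class ImageSortBoost {

    // Assumed list of extensions treated as images; adjust to your crawl.
    private static final Set<String> IMAGE_EXTENSIONS =
        new HashSet<>(Arrays.asList("jpg", "jpeg", "png", "gif", "bmp", "webp"));

    // Assumed constant, chosen to be larger than any ordinary page score.
    static final float IMAGE_BOOST = 1000.0f;

    private ImageSortBoost() {
    }

    /** Returns true if the URL path ends in one of the assumed image extensions. */
    static boolean looksLikeImage(String url) {
        String path = url;
        // Drop the query string and fragment before looking at the extension.
        int cut = path.indexOf('?');
        if (cut >= 0) {
            path = path.substring(0, cut);
        }
        cut = path.indexOf('#');
        if (cut >= 0) {
            path = path.substring(0, cut);
        }
        int dot = path.lastIndexOf('.');
        int slash = path.lastIndexOf('/');
        // No dot at all, or the last dot is before the last slash: no file extension.
        if (dot < 0 || dot < slash) {
            return false;
        }
        String ext = path.substring(dot + 1).toLowerCase(Locale.ROOT);
        return IMAGE_EXTENSIONS.contains(ext);
    }

    /**
     * Mirrors the shape of ScoringFilter.generatorSortValue, with the URL as a
     * String and without the CrawlDatum. Image-like URLs get a large additive
     * boost; everything else keeps its incoming sort value.
     */
    static float generatorSortValue(String url, float initSort) {
        return looksLikeImage(url) ? initSort + IMAGE_BOOST : initSort;
    }
}
```

The boost is added rather than multiplied so that an image link with a zero or very small initial score still sorts above ordinary pages. The filter only has an effect if the plugin is enabled and the generate step is limited (for example with -topN), so that the highest sort values are selected first.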
 
 
-----Original message-----
> From:Eyeris RodrIguez Rueda <eru...@uci.cu>
> Sent: Friday 6th February 2015 19:54
> To: user@nutch.apache.org
> Subject: how to crawl image first on every round of nutch?
> 
> Hi all.
> 
> I want to use Nutch to crawl images, but my problem is how to fetch 
> images first from the crawldb on every crawl round.
> I was reading about the AdaptiveFetchSchedule MIME-type option, but I'm not 
> sure it solves my problem, because it only works once Nutch has crawled 
> the link at least once and extracted its metadata.
> 
> In my case, Nutch crawls page A and discovers 5 links to images; I want to 
> fetch those images in the next round, before other types of documents in 
> the crawldb.
> Is there any way to prioritize images on every crawl round?
> 
> I'm using Nutch 1.9 and Solr 4.10 in local mode.
> 
> 
