Hi. 

I'm using Nutch 1.9 with Solr 4.10 in a local environment. 
I need a way to prioritize some links in the fetching step by filtering 
the new links discovered in previous crawls according to some criteria, for 
example the extension of the resource. The goal is to fetch images, documents, 
etc. before HTML pages during the crawl. 
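
To make the goal concrete, here is a rough sketch in plain Java of the kind of 
rule I have in mind. It is not Nutch code: the class ExtensionPriority, the 
method boostFor, the extension list and the boost values are all made up for 
illustration. My assumption (not verified) is that something like this could 
be applied to a link's score from a scoring plugin, so that higher-scored URLs 
are selected first when the fetch list is generated. 

```java
import java.net.URI;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, NOT part of Nutch: maps a URL's file extension
// to a score boost so that some resource types can be ranked first.
final class ExtensionPriority {

    // Illustrative values only; the extensions and boosts are assumptions.
    private static final Map<String, Float> BOOSTS = Map.of(
            "jpg", 10.0f, "png", 10.0f, "gif", 10.0f,   // images
            "pdf", 5.0f, "doc", 5.0f, "docx", 5.0f);    // documents

    // Used for HTML pages and for anything without a known extension.
    private static final float DEFAULT_BOOST = 1.0f;

    static float boostFor(String url) {
        String path;
        try {
            // getPath() drops the query string and fragment.
            path = URI.create(url).getPath();
        } catch (IllegalArgumentException e) {
            return DEFAULT_BOOST; // malformed URL: no special priority
        }
        if (path == null) {
            return DEFAULT_BOOST; // opaque URI such as mailto:
        }
        int slash = path.lastIndexOf('/');
        int dot = path.lastIndexOf('.');
        // No dot in the last path segment, or a trailing dot: no extension.
        if (dot <= slash || dot == path.length() - 1) {
            return DEFAULT_BOOST;
        }
        String ext = path.substring(dot + 1).toLowerCase(Locale.ROOT);
        return BOOSTS.getOrDefault(ext, DEFAULT_BOOST);
    }
}
```

With these made-up values, boostFor("http://example.com/report.pdf") gives 5.0 
and boostFor("http://example.com/index.html") gives 1.0, so the PDF would be 
ranked ahead of the HTML page. 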

Is there a property in nutch-site.xml, or a plugin, that can do this? If not, 
how can I achieve it? 

I welcome any suggestions, or source code snippets for creating a new plugin 
for Nutch. 

Best regards 

-- 
Ing. Yulio Aleman Jimenez 
Dpto. Soluciones Informáticas para Internet. CIDI 
Universidad de las Ciencias Informáticas (UCI) 