I have written a custom URLFilter that resolves the hostname into an IP 
address and checks the latter against a GeoIP database. Unfortunately the 
source code was developed under a commercial contract, and is not freely 
available.
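
Since I cannot share the real code, here is only a rough sketch of the idea, not the actual implementation. It assumes the Nutch URLFilter contract of returning the URL to accept it and null to reject it. The class name GeoIpUrlFilterSketch and the countryLookup function are hypothetical; countryLookup stands in for whatever GeoIP database you use. A real plugin would implement org.apache.nutch.net.URLFilter and be registered like any other URL filter plugin.

```java
import java.net.InetAddress;
import java.net.URI;
import java.net.UnknownHostException;
import java.util.Set;
import java.util.function.Function;

// Hypothetical sketch: mirrors the accept/reject contract of a Nutch
// URLFilter (return the URL to keep it, null to drop it) without
// depending on Nutch itself.
final class GeoIpUrlFilterSketch {

    // Assumption: stands in for a GeoIP database lookup and returns an
    // ISO country code such as "UY", or null when the address is unknown.
    private final Function<InetAddress, String> countryLookup;
    private final Set<String> allowedCountries;

    GeoIpUrlFilterSketch(Function<InetAddress, String> countryLookup,
                         Set<String> allowedCountries) {
        this.countryLookup = countryLookup;
        this.allowedCountries = allowedCountries;
    }

    // Returns urlString if its host resolves to an allowed country,
    // otherwise null.
    String filter(String urlString) {
        try {
            String host = URI.create(urlString).getHost();
            if (host == null) {
                return null; // no host part, nothing to resolve
            }
            // Resolves the hostname (a DNS lookup unless host is an IP literal).
            InetAddress address = InetAddress.getByName(host);
            String country = countryLookup.apply(address);
            boolean allowed = country != null && allowedCountries.contains(country);
            return allowed ? urlString : null;
        } catch (IllegalArgumentException | UnknownHostException e) {
            return null; // malformed URL or unresolvable host: reject
        }
    }
}
```

Keep in mind that a filter like this does one DNS lookup per URL, so you would probably want to cache the result per host.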

Enzo

----- Original Message ----- 
From: "Cesar Voulgaris" <[EMAIL PROTECTED]>
To: "nutch user" <[EMAIL PROTECTED]>
Sent: Monday, June 11, 2007 9:24 AM
Subject: crawling by ip range


Hi all, I have had a problem for some time: I want to crawl only sites from my
country or related to it. The problem is that crawling only by domain (in my
case I set the regex-urlfilter regex to catch "(com|org|..).uy") leaves out a
lot of sites which don't end in .uy but in .com, .org, .... I don't want to
crawl to a certain depth and expand the crawled pages outside the country. Is
there any clever method to crawl over a range of IPs without touching the
code? If not, which plugin or extension point do I have to extend to add
something like IP checking for a given URL?

Thanks in advance.


_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general