Hi all, I'm trying to index pages inside my country. I set the
regex-urlfilter to crawl within my country domain (.uy).
The problem, of course, is that there are sites in the country whose URLs
do not necessarily end in .uy.
I tried putting a regular expression (even a single IP!) in the
regex-urlfilter, using IPs in my country's range (e.g.
http://201.111.103.1/). The
crawl seems to work OK, but when I check the fetched pages (with readdb)
there is nothing: the db seems to be empty (the readdb command throws a
null pointer exception).

Can I set a pattern directly with IPs in the regex-urlfilter?
If not, how can I crawl a range of IPs?
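
To make the question concrete, here is a minimal sketch of the kind of rules meant above, written as a small stand-alone Java check rather than as actual Nutch code. The class name IpRangeFilterSketch, the exact regular expressions, and the 201.111.103.x range are assumptions for illustration only; the sketch also assumes the filter applies its rules in order, with the first matching rule deciding, where a rule would appear in regex-urlfilter.txt as a line starting with + (accept) or - (reject) followed by the regex.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical stand-in for a regex-based URL filter (not the Nutch class):
// rules are checked in order and the first pattern found in the URL decides.
class IpRangeFilterSketch {

    // One rule: accept (+) or reject (-), plus the compiled pattern.
    private static final class Rule {
        final boolean accept;
        final Pattern pattern;

        Rule(boolean accept, String regex) {
            this.accept = accept;
            this.pattern = Pattern.compile(regex);
        }
    }

    private static final List<Rule> RULES = new ArrayList<>();

    static {
        // Accept hosts under the .uy country domain.
        RULES.add(new Rule(true, "^https?://([a-z0-9-]+\\.)+uy(:[0-9]+)?(/|$)"));
        // Accept URLs whose host is a literal IP in 201.111.103.x
        // (example range only; the last octet is matched loosely as 1-3 digits).
        RULES.add(new Rule(true, "^https?://201\\.111\\.103\\.[0-9]{1,3}(:[0-9]+)?(/|$)"));
        // Reject everything else.
        RULES.add(new Rule(false, "."));
    }

    // Returns true if the first matching rule is an accept rule.
    static boolean accepts(String url) {
        for (Rule rule : RULES) {
            if (rule.pattern.matcher(url).find()) {
                return rule.accept;
            }
        }
        return false; // no rule matched at all
    }

    public static void main(String[] args) {
        String[] samples = {
            "http://201.111.103.1/",
            "http://www.example.com.uy/index.html",
            "http://www.example.com/",
            "http://201.111.104.1/"
        };
        for (String url : samples) {
            System.out.println(url + " -> " + accepts(url));
        }
    }
}
```

Note that rules like these only look at the text of the URL. An IP pattern therefore matches only URLs written with the IP as the host; a site hosted inside that IP range but linked by a hostname that does not end in .uy would still be rejected, since no DNS lookup is involved.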

Thanks in advance
_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
