If IP filtering is not done, a malicious user controlling a single domain could feed links that make the crawler fetch from your internal servers (if they are accessible from the host running the fetcher). That is why I think this issue should be addressed at some point.
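A minimal sketch of what such a check could look like, assuming it runs on the fetcher host just before a URL is fetched. The class and method names are hypothetical and not part of Nutch; only the standard `java.net.InetAddress` API is used.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper for illustration only; not an existing Nutch class.
final class InternalAddressFilter {

    // True for loopback (127.0.0.0/8, ::1), wildcard (0.0.0.0),
    // link-local (169.254.0.0/16) and site-local (10/8, 172.16/12, 192.168/16) addresses.
    static boolean isInternal(InetAddress addr) {
        return addr.isLoopbackAddress()
            || addr.isAnyLocalAddress()
            || addr.isLinkLocalAddress()
            || addr.isSiteLocalAddress();
    }

    // Resolves the host and reports whether ANY of its addresses is internal.
    // Assumption: an unresolvable host is reported as "not internal",
    // since the fetch would fail on its own anyway.
    static boolean resolvesToInternal(String host) {
        try {
            for (InetAddress addr : InetAddress.getAllByName(host)) {
                if (isInternal(addr)) {
                    return true;
                }
            }
            return false;
        } catch (UnknownHostException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // IP literals are parsed directly, so this demo needs no DNS lookup.
        for (String host : new String[] {"127.0.0.1", "192.168.1.10", "8.8.8.8"}) {
            System.out.println(host + " internal? " + resolvesToInternal(host));
        }
    }
}
```

For an intranet crawl the check would have to be switchable, since internal addresses are exactly what is wanted there. The sketch also does not cover IPv6 unique local addresses (fc00::/7), which `isSiteLocalAddress()` does not recognise.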

--
 Sami Siren

Stefan Groschupf wrote:
Hi,
this is a curious DNS entry.
Checking the IP during fetch-list generation could be a performance problem; checking the IP during crawling would be the only option I see.
Anyway, since this is a problem of domain registration, we should think about just informing DENIC instead of writing a filter.




Stefan


On 18.03.2005 at 20:02, Sami Siren wrote:

Hi,

I agree. At least in theory, the current behaviour might expose some unwanted content if the search results were public.

Could you please submit this to JIRA?

--
 Sami Siren


Matthias Jaekle wrote:

For the domain www.tik24.de there is a DNS entry of 127.0.0.1.
I think Nutch should detect that and ignore such domains, as long as this does not cause a problem for intranet crawling.



---------------------------------------------------------------
company:        http://www.media-style.com
forum:        http://www.text-mining.org
blog:            http://www.find23.net





_______________________________________________
Nutch-developers mailing list
https://lists.sourceforge.net/lists/listinfo/nutch-developers
