Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Nutch Wiki" for change 
notification.

The "SetupNutchAndTor" page has been changed by LewisJohnMcgibbney:
https://wiki.apache.org/nutch/SetupNutchAndTor?action=diff&rev1=4&rev2=5

  <<TableOfContents(4)>>
  
  == Important Note ==
- The aim of this tutorial is to explain *crawling of* hidden services... not 
for us to use hidden services to crawl. This is a critical point which should 
both be taken into consideration when reading and using Nutch to crawl the Tor 
network.
+ The aim of this tutorial is to explain '''crawling of''' hidden services, 
not using hidden services to crawl. This is a critical point which should be 
taken into consideration both when reading this tutorial and when using Nutch 
to crawl the Tor network. Crawling normal websites via Tor can overload the Tor 
network and, more importantly, can lead those websites to block connections 
from Tor, preventing normal users from being able to reach or use those 
websites.
- If you are looking to use Nutch to crawl the web from behind the Tor network, 
then you are in the wrong place.
+ '''If you are looking to use Nutch to crawl the web from behind the Tor 
network, then you are in the wrong place.'''
  
  == Introduction ==
  [[https://www.torproject.org/|Tor]] is a network of virtual tunnels that 
allows people and groups to improve their privacy and security on the Internet. 
It also enables software developers to create new communication tools with 
built-in privacy features. Tor provides the foundation for a range of 
applications that allow organizations and individuals to share information over 
public networks without compromising their privacy. 
