The following page has been changed by tyrellperera:
http://wiki.apache.org/nutch/Nutch_-_The_Java_Search_Engine

------------------------------------------------------------------------------
        
  === 3.2.2 Edit the file conf/crawl-urlfilter.txt ===
  
- and replace the existing domain name with the name of the domain you wish to 
crawl. For example, if you wished to limit the crawl to the openreach.co.uk 
domain, the line should read:
+ and replace the existing domain name with the name of the domain you wish to 
crawl. For example, if you wished to limit the crawl to the virtusa.com domain, 
the line should read:
  
                {{{ +^http://([a-z0-9]*\.)*virtusa.com/ }}}
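  The effect of the rule above can be checked outside Nutch with a small script. This is only a rough sketch in Python, not Nutch's own filter code: the helper names `parse_rule` and `accepts` are hypothetical, and it assumes that a leading `+` means include, a leading `-` means exclude, that the first matching rule decides, and that a URL matching no rule is rejected.

```python
import re

# Rule copied from the line above; the leading '+' is assumed to mean "include".
RULE = r"+^http://([a-z0-9]*\.)*virtusa.com/"


def parse_rule(rule):
    """Split a crawl-urlfilter.txt style rule into (include, compiled_regex).

    Hypothetical helper for illustration only.
    """
    sign, pattern = rule[0], rule[1:]
    if sign not in ("+", "-"):
        raise ValueError("rule must start with '+' or '-'")
    return sign == "+", re.compile(pattern)


def accepts(url, rules):
    """Return the verdict of the first rule whose pattern is found in the URL.

    Assumption: first match wins, and an unmatched URL is rejected.
    """
    for include, regex in rules:
        if regex.search(url):
            return include
    return False


rules = [parse_rule(RULE)]

for url in (
    "http://www.virtusa.com/about",  # subdomain of the target domain
    "http://virtusa.com/",           # bare domain
    "http://www.example.com/",       # different domain
    "https://www.virtusa.com/",      # different scheme
    "http://virtusaXcom/",           # the unescaped '.' matches any character
):
    print(url, accepts(url, rules))
```

  Under these assumptions the first two URLs and the last one are accepted, while the other two are rejected. The last case is accepted only because the dot in `virtusa.com` is not escaped; writing `virtusa\.com` would make the rule stricter, if that matters for your crawl.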
  
@@ -154, +154 @@

  
  == 3.3 Configuring the Nutch Web Application ==
  
- The search web application is already integrated and deployed along with the 
ORPG application. In order for the nutch search web application to function 
properly, it needs to know where to find the indexes. We need to map our 
indexes by editing the ‘nutch-site.xml’ file.
+ The search web application is included in your downloaded Nutch archive. For 
the Nutch search web application to function properly, it needs to know where 
to find the indexes. We point it at our indexes by editing the 
‘nutch-site.xml’ file.
  
  NOTE: the steps below assume that the 
  
