Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Nutch Wiki" for change 
notification.

The following page has been changed by ThorstenScherler:
http://wiki.apache.org/nutch/Nutch_-_The_Java_Search_Engine

The comment on the change is:
Adding more information about working with trunk

------------------------------------------------------------------------------
        
  === 3.2.2 Edit the file conf/crawl-urlfilter.txt ===
  
+ If you are using trunk then there is no file called conf/crawl-urlfilter.txt, only conf/crawl-urlfilter.txt.template. Create the file from the template with:
+ {{{
+  sed 's/MY\.DOMAIN\.NAME/criaturitas.org/g' conf/crawl-urlfilter.txt.template > conf/crawl-urlfilter.txt
+ }}}
- and replace the existing domain name with the name of the domain you wish to 
crawl. For example, if you wished to limit the crawl to the virtusa.com domain, 
the line should read:
+ If you already have this file then replace the existing domain name with the 
name of the domain you wish to crawl. For example, if you wished to limit the 
crawl to the virtusa.com domain, the line should read:
  
                {{{ +^http://([a-z0-9]*\.)*virtusa.com/ }}}
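
The two steps above (fill in the template, then check the resulting rule) can be sketched in shell as follows. This is only an illustration under stated assumptions: the helper names `fill_template` and `url_allowed` are hypothetical, the template contents are a stand-in rather than the file shipped with Nutch, and the check looks only at `+` rules, ignoring `-` rules and rule ordering.

```shell
#!/bin/sh
# Hypothetical helper: copy TEMPLATE to OUTPUT, replacing the
# MY.DOMAIN.NAME placeholder with DOMAIN.
# Usage: fill_template DOMAIN TEMPLATE OUTPUT
fill_template() {
    # The dots in the placeholder are escaped so they match literally.
    sed "s/MY\.DOMAIN\.NAME/$1/g" "$2" > "$3"
}

# Hypothetical helper: succeed if URL matches at least one "+" rule.
# Simplification: "-" rules and rule order are ignored here.
# Usage: url_allowed URL FILTER_FILE
url_allowed() {
    # Collect the "+" rules, one regular expression per line.
    patterns=$(sed -n 's/^+//p' "$2")
    [ -n "$patterns" ] || return 1
    printf '%s\n' "$1" | grep -Eq -e "$patterns"
}

# Work in a scratch directory so no real configuration is touched.
workdir=$(mktemp -d)

# Stand-in template (assumed contents, not the real Nutch file).
printf '%s\n' \
    '# accept hosts in MY.DOMAIN.NAME' \
    '+^http://([a-z0-9]*\.)*MY.DOMAIN.NAME/' \
    > "$workdir/crawl-urlfilter.txt.template"

fill_template virtusa.com \
    "$workdir/crawl-urlfilter.txt.template" \
    "$workdir/crawl-urlfilter.txt"

cat "$workdir/crawl-urlfilter.txt"
```

Calling `url_allowed 'http://www.virtusa.com/index.html' "$workdir/crawl-urlfilter.txt"` afterwards is a quick way to see whether a given URL would pass the generated rule before starting a crawl.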
  

_______________________________________________
Nutch-cvs mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-cvs
