The following page has been changed by pannous:
http://wiki.apache.org/nutch/Nutch_on_windows_without_cygwin

New page:
It is possible to run a simple Nutch instance on Windows without Cygwin!

This page is intended for Java users who want to know how to use Nutch 
without Cygwin.

After 
configuring the hadoop.xml file for [[Nutch on local filesystem]], 
configuring log4j.properties,
configuring folders and
configuring plugins
just as described in other tutorials,

a few small patches were necessary to make Nutch 0.8 cooperate with 
Hadoop 0.11:
http://files.pannous.de/org.rar

Other combinations of versions might work without patches. To get to know Nutch, 
it can be useful to play with the sources. 

After all exceptions have been eliminated, we are able to use Nutch from Java:

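CRAWL:

// dirWithUrls: directory holding the URLs to crawl;
// indexDirToBeCreated: directory the crawl will create (passed via -dir)
Crawl.main(new String[]{dirWithUrls, "-dir", indexDirToBeCreated});

SEARCH:

// configuration and path are set up by the caller
NutchBean bean = new NutchBean(configuration, path);
// parse the query string "Google" and ask for the top 10 hits
Hits hits = bean.search(Query.parse("Google", configuration), 10);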


-------------------------

These patches were necessary:
* eliminating spaces from the $PATH variable (for runChild in TaskRunner)
* getting rid of the LOG.warn(dir + " already exists."); inconsistency: 
new File(index + "/crawldb/current").mkdirs();
new File(index + "/linkdb/current").mkdirs();
* fixing some NoMethodFound conflicts in the fetcher package
* fixing one UTF8 / Text ClassCastException caused by a version conflict
* No Hadoop services have to be started by hand at all, but you have to 
set 
  <property>
    <name>mapred.job.tracker</name>
    <value>local</value>
  </property>
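
The directory workaround in the list above can be tried in isolation with plain Java. The following is only a minimal sketch: the class name CrawlDirs and the method prepare are made up for illustration and are not part of Nutch or of the patch archive. It assumes the goal is simply that both "current" folders exist before Nutch touches them.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

// Hypothetical helper, not part of Nutch: pre-creates the two folders
// named in the patch list so that they exist before a crawl starts.
public class CrawlDirs {

    // Creates <index>/crawldb/current and <index>/linkdb/current if missing.
    // Returns true when both directories exist afterwards.
    static boolean prepare(File index) {
        File crawlDb = new File(index, "crawldb/current");
        File linkDb = new File(index, "linkdb/current");
        // mkdirs() returns false when the directory is already there,
        // so isDirectory() is checked as a fallback.
        boolean crawlOk = crawlDb.mkdirs() || crawlDb.isDirectory();
        boolean linkOk = linkDb.mkdirs() || linkDb.isDirectory();
        return crawlOk && linkOk;
    }

    public static void main(String[] args) throws IOException {
        // A temporary folder stands in for the real index directory here.
        File index = Files.createTempDirectory("nutch-index").toFile();
        System.out.println("directories ready: " + prepare(index));
    }
}
```

Calling prepare a second time on the same folder still returns true, because already existing directories are accepted.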

Again: other combinations of versions might work without patches. 



_______________________________________________
Nutch-cvs mailing list
Nutch-cvs@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nutch-cvs