Hello,
I'm using Nutch 1.0 on Windows Server 2003 (currently Windows XP for testing) for
our intranet. I'm having a hard time creating a batch file for an automated crawl
(via Task Scheduler).
Nutch is located in C:\cygwin\bin and this is the content of the batch file:
--
@echo off
C:
chdir C:\cygwin\bin
bash --login -i -c "C:/cygwin/bin/nutch crawl C:/cygwin/urls -dir crawl-test9 -depth 5"
--
The problem is the following error message, which I get on the command line (with
"@echo off" commented out):
~~
WARN plugin.PluginRepository: Plugins: directory not found: plugins
[...]
WARN mapred.LocalJobRunner: job_local_0001
java.lang.RuntimeException: x point org.apache.nutch.net.URLNormalizer not found.
[...]
Exception in thread "main" java.io.IOException: Job failed!
at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1232)
at org.apache.nutch.crawl.Injector.inject(Injector.java:160)
at org.apache.nutch.crawl.Crawl.main(Crawl.java:113)
~~
Am I missing a parameter that sets the path where Nutch should look for the
plugins directory? It's located at C:\cygwin\plugins.
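One setting that looks relevant is the plugin.folders property from
nutch-default.xml: its default value is the relative path "plugins", which
matches the directory name in the warning. Below is a minimal sketch of an
override in conf/nutch-site.xml. The absolute path and its forward-slash form
are assumptions based on my setup, and I have not verified that this is
accepted under Cygwin:

```xml
<?xml version="1.0"?>
<configuration>
  <property>
    <name>plugin.folders</name>
    <!-- Assumption: absolute path to my plugins directory (C:\cygwin\plugins),
         written with forward slashes; untested on Windows/Cygwin. -->
    <value>C:/cygwin/plugins</value>
  </property>
</configuration>
```

If the file already has a configuration element, only the property element
would need to go inside it.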
Any help is highly appreciated :)