Hi,
I am trying to configure a recent Nutch (0.8+) to fetch directly from the
file system instead of over HTTP, which is fairly slow. The fetcher hits a
404 - File not found (see below). When I copy the same file:/// URL into
lynx, the file is found without any problems.
2006-09-15 10:29:57,739 INFO fetcher.Fetcher - fetching
file:///mnt/smbfs/hollywood/projects/Telstra/Keystone\ -\ Leapfrog/Keystone/Architecture/Archives/info.txt
2006-09-15 10:29:57,746 INFO fetcher.Fetcher - fetch of
file:///mnt/smbfs/hollywood/projects/Telstra/Keystone\ -\ Leapfrog/Keystone/Architecture/Archives/info.txt failed with:
org.apache.nutch.protocol.file.FileError: File Error: 404
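The backslashes in the logged URL look like shell-style escapes for the
spaces in "Keystone - Leapfrog". Whether that is what trips up the file
protocol plugin is only a guess, but for comparison, here is a minimal
sketch in plain Java (no Nutch classes; FileUrlSketch and toFileUrl are
made-up names for illustration) of how the same path would look as a
percent-encoded file URL:

```java
import java.net.URI;
import java.net.URISyntaxException;

public class FileUrlSketch {

    // Hypothetical helper: turns a shell-escaped absolute path
    // ("\ " standing for a space) into a percent-encoded file URL.
    static String toFileUrl(String shellEscapedPath) {
        // Undo the shell escaping so the path contains real spaces.
        String plainPath = shellEscapedPath.replace("\\ ", " ");
        try {
            // The multi-argument URI constructor quotes characters that are
            // illegal in a path (such as spaces); the empty authority
            // yields the "file:///" prefix.
            return new URI("file", "", plainPath, null, null).toString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Not a usable path: " + plainPath, e);
        }
    }

    public static void main(String[] args) {
        String path = "/mnt/smbfs/hollywood/projects/Telstra/"
                + "Keystone\\ -\\ Leapfrog/Keystone/Architecture/Archives/info.txt";
        System.out.println(toFileUrl(path));
    }
}
```

The printed URL could then be tried in the seed list in place of the
backslash-escaped one, to see whether the 404 goes away.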
Has anybody had a similar problem or, better, found a resolution?
Cheers, Bruno
_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general