An update:
Doug Cutting wrote:
Sometime soon I will move the code from CVS at Sourceforge into
Subversion at Apache. The Subversion repository will be:
https://svn.apache.org/repos/asf/incubator/nutch/trunk/
We're almost ready to do this. All Nutch committers now have Apache
accounts. We now
Dear valued PayPal® member:
It has come to our attention that your PayPal® account information needs to be updated as part of our continuing commitment to protect your account and to reduce the instance of fraud on our website. If you could please take 5-10 minutes out of your online exper
Greetings,
We have renamed regex-urlfilter.txt to xxx-regex-urlfilter.txt,
and we have also updated nutch-site.xml to reference
xxx-regex-urlfilter.txt. When crawling, we got the following error.
===
found resource regex-urlfilter-group.txt at
/conf/regex-urlfilter-gr
Hello, all!
I have run more tests and think that the parser always stops on entry 16037.
Last output (specified 1 thread for parse):
050210 085448 Entry: 16037 3360 wait=0 read=0 parse=4 wait=0 write=0ms
050210 085448 Read in entry 16038
050210 085448 parsing http://www5a.biglobe.ne.jp/~wakers/top
Hello, all!
I wrote earlier about a problem with the dmoz db fetch. After
unsuccessful retries I started a fetch without parsing, using the
"-noParsing" option. All links were fetched successfully, and
now I am trying to run "nutch parse" and again see that the parse
stops after ~146