Hi Stefan,
thanks for your reply.
I've tried a depth of 20 and it works better; it can crawl almost all of the pages. However, it still has not crawled all of them.
I'll try a bigger depth like 30 later...
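For reference, a re-crawl with a larger depth would look roughly like this (a sketch assuming the Nutch 0.7 crawl tool options; urls.txt is a placeholder for the seed list file, not my actual file name):

  bin/nutch crawl urls.txt -dir crawl-haha365 -depth 30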
Stefan Groschupf wrote:
Hi,
maybe you can try a much higher depth, something like 20?
However, in general check:
+ the regex URL filter file (see the sketch below).
+ the robots.txt of the site.
+ nofollow tags in the pages.
+ the number of outlinks to extract, in nutch-default.xml (see below).
Stefan
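For the filter file, a minimal crawl-urlfilter.txt that keeps the crawl inside the target site might look like this (a sketch; the host pattern is just filled in for kevin's example site):

  # skip file:, ftp:, and mailto: URLs
  -^(file|ftp|mailto):
  # accept everything under haha365.com
  +^http://([a-z0-9]*\.)*haha365.com/
  # skip everything else
  -.

And for the outlink limit, the setting I believe Stefan means is db.max.outlinks.per.page; overriding it in nutch-site.xml would look something like the following (the value 1000 is just an example, the stock default is much lower):

  <property>
    <name>db.max.outlinks.per.page</name>
    <value>1000</value>
    <description>The maximum number of outlinks to take from a page.</description>
  </property>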
On 06.07.2006, at 19:12, kevin pang wrote:
I set up Nutch to crawl the URL http://www.haha365.com/gd_joke/
but after the crawl completed, only 54 pages were fetched.
Here is the log info:
060705 154332 parsing file:/C:/cygwin/nutch-0.7.2/conf/nutch-default.xml
060705 154332 parsing file:/C:/cygwin/nutch-0.7.2/conf/crawl-tool.xml
060705 1