Hello, Thanks.
I am running Nutch 0.8.1.
What is this property for? Should I set it to 120, as the error message
suggests?
Another problem I have is that on some websites not all pages are fetched,
and, even stranger, some of the pages that are fetched don't actually exist...
Thanking you in advance,
Mat
----- Original Message ----
From: Sami Siren <[EMAIL PROTECTED]>
To: [email protected]
Sent: Wednesday, 22 November 2006, 22:40:10
Subject: Re: Fetch fails
frgrfg gfsdgffsd wrote:
> Hi all,
>
> I have a problem with the crawl/fetch of one website (www.lequipe.fr),
> although it works fine for another (www.lemonde.fr).
>
> Here are the errors:
> ERROR [MAT] 2006-11-22 00:36:20,860 - Http.invoke0(?) |
> java.lang.IllegalArgumentException: null metadata
> ERROR [MAT] 2006-11-22 00:36:20,870 - Http.invoke0(?) | at
> org.apache.nutch.protocol.Content.<init>(Content.java:60)
> ERROR [MAT] 2006-11-22 00:36:20,870 - Http.invoke0(?) | at
> org.apache.nutch.protocol.http.api.HttpBase.getProtocolOutput(HttpBase.java:196)
> ERROR [MAT] 2006-11-22 00:36:20,870 - Http.invoke0(?) | at
> org.apache.nutch.fetcher.Fetcher$FetcherThread.run(Fetcher.java:162)
>
> I don't understand why the metadata is null when the pages do contain
> some metadata...
>
What version of Nutch are you running?
> I also get this message just before:
> INFO [MAT] 2006-11-22 00:36:32,477 - HttpBase.getProtocolOutput(194) |
> Skipping: http://www.lequipe.fr/ exceeds fetcher.max.crawl.delay, max=30,
> Crawl-Delay=120
>
> and I can't find this property in nutch-site.xml
You need to add it there.
<property>
<name>fetcher.max.crawl.delay</name>
<value> your value here </value>
</property>
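As a rough illustration, a filled-in nutch-site.xml might look like the sketch below. Going by the log message quoted above, the fetcher skips a site whose robots.txt Crawl-Delay (120 here) is larger than fetcher.max.crawl.delay (30 here), so the value has to be at least as large as the Crawl-Delay for the site to be fetched. The value 120 is only an assumption taken from that log line, and the enclosing <configuration> element is assumed to be the usual root element of the file in the 0.8.x line; if your nutch-site.xml already has a root element, put only the <property> block inside it.

```xml
<?xml version="1.0"?>
<!-- Sketch of conf/nutch-site.xml: site-specific overrides of the defaults. -->
<configuration>
  <property>
    <name>fetcher.max.crawl.delay</name>
    <!-- Assumed value, in seconds: matches the Crawl-Delay=120 reported in the log. -->
    <value>120</value>
  </property>
</configuration>
```

Keep in mind that honoring a Crawl-Delay of 120 means waiting that long between requests to the same host, so fetching that site will be slow.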
--
Sami Siren
_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general