Stefan Groschupf wrote:
Andrzej,
... in LinkDB line 114, in the configure method, and it is used in lines
168 and 176.
Ah, true - yes, it should be added and documented. I'll do it.
--
Best regards,
Andrzej Bialecki <><
Andrzej,
... in LinkDB line 114, in the configure method, and it is used in
lines 168 and 176.
Stefan
On 18.07.2006 at 16:02, Andrzej Bialecki wrote:
Stefan Groschupf wrote:
Hi,
Shouldn't db.max.inlinks be in the nutch-default.xml configuration?
Where is this used?
--
Best regards,
An
Stefan Groschupf wrote:
Hi,
Shouldn't db.max.inlinks be in the nutch-default.xml configuration?
Where is this used?
--
Best regards,
Andrzej Bialecki <><
___. ___ ___ ___ _ _ __
[__ || __|__/|__||\/| Information Retrieval, Semantic Web
___|||__|| \| |
Hi,
Shouldn't db.max.inlinks be in the nutch-default.xml configuration?
Stefan
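A key like db.max.inlinks can go unnoticed because code that reads it with a fallback value behaves the same whether or not the key appears in the defaults file. Below is a minimal sketch of that pattern, using java.util.Properties as a stand-in for the Nutch configuration object. The class name, the method name and the fallback of 10000 are assumptions for illustration and are not taken from this thread.

```java
import java.util.Properties;

// Hypothetical illustration; not the actual LinkDB code.
final class InlinkLimitLookup {
    static final String KEY = "db.max.inlinks";

    // Assumed fallback, used only when the key is absent from the configuration.
    static final int ASSUMED_FALLBACK = 10000;

    // Returns the configured limit, or the fallback if the key is not set.
    static int maxInlinks(Properties conf) {
        String raw = conf.getProperty(KEY, String.valueOf(ASSUMED_FALLBACK));
        return Integer.parseInt(raw.trim());
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        // Key is undocumented and unset: the fallback applies silently.
        System.out.println("unset: " + maxInlinks(conf));

        // Once the key is listed (for example in a defaults file), it is visible and tunable.
        conf.setProperty(KEY, "500");
        System.out.println("set:   " + maxInlinks(conf));
    }
}
```

Listing the key in the defaults file does not change this lookup; it only makes the setting discoverable and gives it a documented default.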
[ http://issues.apache.org/jira/browse/NUTCH-293?page=comments#action_12421930 ]
Sami Siren commented on NUTCH-293:
--
Perhaps instead of
  delay = crawlDelay > 0 ? crawlDelay : serverDelay;
we could do
  delay = Math.max(crawlDelay, serverDelay);
als
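The two expressions quoted in the comment above do not give the same result for every input, and a small sketch can make the difference concrete. The class and method names below are hypothetical, and the long type and the sample values are assumptions, since the comment does not show how the variables are declared.

```java
// Hypothetical helper for comparing the two expressions; not taken from the Nutch source.
final class DelayChoice {
    // Original form: use crawlDelay whenever it is positive, otherwise serverDelay.
    static long ternaryDelay(long crawlDelay, long serverDelay) {
        return crawlDelay > 0 ? crawlDelay : serverDelay;
    }

    // Proposed form: always use the larger of the two values.
    static long maxDelay(long crawlDelay, long serverDelay) {
        return Math.max(crawlDelay, serverDelay);
    }

    public static void main(String[] args) {
        // Sample values are illustrative only.
        long[][] cases = { {0L, 5000L}, {2000L, 5000L}, {10000L, 5000L} };
        for (long[] c : cases) {
            System.out.println("crawlDelay=" + c[0] + ", serverDelay=" + c[1]
                    + " -> ternary=" + ternaryDelay(c[0], c[1])
                    + ", max=" + maxDelay(c[0], c[1]));
        }
    }
}
```

The forms agree when crawlDelay is zero or when it is at least serverDelay. They differ when crawlDelay is positive but smaller than serverDelay: with crawlDelay = 2000 and serverDelay = 5000, the ternary returns 2000 and Math.max returns 5000.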
Very nice.
I visited the site, searched for NLP and found 5 listings!
How often will the crawl run? How hard was it getting the app to run on
GoDaddy? Do you run the crawl from GoDaddy or elsewhere and then either
upload or reference your index site?
Thank you, and please tell me how I can help
Hello Nutchians,
Please visit the site http://www.myopensourcejobs.com. The site is built
using LAMP and Nutch.
I use the Nutch crawler to crawl jobs from commercial sites such as Hotjobs,
DICE and CareerBuilder (as of today), specifically for open source skill sets.
Basically the site filter