Re: db.max.inlinks

2006-07-18 Thread Andrzej Bialecki
Stefan Groschupf wrote: Andrzej, ... in LinkDB line 114, in the configure method, and it is used in lines 168 and 176. Ah, true - yes, it should be added and documented. I'll do it. -- Best regards, Andrzej Bialecki

Re: db.max.inlinks

2006-07-18 Thread Stefan Groschupf
Andrzej, ... in LinkDB line 114, in the configure method, and it is used in lines 168 and 176. Stefan On 18.07.2006 at 16:02, Andrzej Bialecki wrote: Stefan Groschupf wrote: Hi, shouldn't db.max.inlinks be in the nutch-default.xml configuration? Where is this used? -- Best regards, An
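The exchange above turns on a property that the code reads but the defaults file does not list. A minimal Java sketch of that situation follows. It is not Nutch's LinkDB code: the class name, the use of java.util.Properties in place of the real configuration object, and the fallback value of 10000 are all assumptions made for illustration.

```java
import java.util.Properties;

// Hypothetical sketch, not Nutch's actual LinkDB code.
class InlinkLimitSketch {
    static final String KEY = "db.max.inlinks";
    // Assumed fallback value, chosen only for this illustration.
    static final int ASSUMED_DEFAULT = 10000;

    // Returns the configured limit, or the hard-coded fallback when the key is absent.
    static int maxInlinks(Properties conf) {
        String raw = conf.getProperty(KEY, String.valueOf(ASSUMED_DEFAULT));
        return Integer.parseInt(raw.trim());
    }

    public static void main(String[] args) {
        // Stands in for a defaults file that does not mention the key.
        Properties conf = new Properties();
        System.out.println("undocumented key -> " + maxInlinks(conf));

        // Once the key is set, the configured value takes effect.
        conf.setProperty(KEY, "500");
        System.out.println("documented key   -> " + maxInlinks(conf));
    }
}
```

Because the fallback lives in code, the setting works even when the defaults file omits it, but nobody reading that file learns the setting exists. That is the gap the thread proposes to close by adding and documenting the property.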

Re: db.max.inlinks

2006-07-18 Thread Andrzej Bialecki
Stefan Groschupf wrote: Hi, shouldn't db.max.inlinks be in the nutch-default.xml configuration? Where is this used? -- Best regards, Andrzej Bialecki

db.max.inlinks

2006-07-18 Thread Stefan Groschupf
Hi, shouldn't db.max.inlinks be in the nutch-default.xml configuration? Stefan

[jira] Commented: (NUTCH-293) support for Crawl-delay in Robots.txt

2006-07-18 Thread Sami Siren (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-293?page=comments#action_12421930 ] Sami Siren commented on NUTCH-293: Perhaps instead of delay = crawlDelay > 0 ? crawlDelay : serverDelay; we could do delay = Math.max(crawlDelay, serverDelay); als
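The two expressions in the comment above differ only when robots.txt gives a positive Crawl-delay that is shorter than the locally configured server delay. The following Java sketch compares them side by side. The class and method names are hypothetical, and it assumes both values are in milliseconds with 0 meaning "no Crawl-delay given"; none of this is taken from the NUTCH-293 patch.

```java
// Hypothetical comparison of the two expressions quoted in the comment;
// names and the millisecond unit are assumptions for illustration.
class CrawlDelayChoice {
    // First form: use the robots.txt delay whenever one is given.
    static long ternaryDelay(long crawlDelay, long serverDelay) {
        return crawlDelay > 0 ? crawlDelay : serverDelay;
    }

    // Suggested form: never wait less than the configured server delay.
    static long maxDelay(long crawlDelay, long serverDelay) {
        return Math.max(crawlDelay, serverDelay);
    }

    public static void main(String[] args) {
        long serverDelay = 5000;
        long[] crawlDelays = {0, 1000, 10000};
        for (long crawlDelay : crawlDelays) {
            System.out.println("crawlDelay=" + crawlDelay
                    + " ternary=" + ternaryDelay(crawlDelay, serverDelay)
                    + " max=" + maxDelay(crawlDelay, serverDelay));
        }
    }
}
```

With a server delay of 5000, the forms agree for a Crawl-delay of 0 and of 10000. For a Crawl-delay of 1000 the ternary form waits 1000 while the Math.max form waits 5000, so the second form treats the configured delay as a lower bound.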

Re: Vertical Search (Nutch) for Opensource Jobs- http://www.myopensourcejobs.com

2006-07-18 Thread William Surowiec
Very nice. I visited the site, searched for nlp and found 5 listings! How often will the crawl run? How hard was it getting the app to run on GoDaddy? Do you run the crawl from GoDaddy or elsewhere and then either upload or reference your index site? Thank you, and please tell me how I can help

Vertical Search (Nutch) for Opensource Jobs- http://www.myopensourcejobs.com

2006-07-18 Thread Sudhi Seshachala
Hello Nutchians, Please visit the site http://www.myopensourcejobs.com. The site is built using LAMP and Nutch. I use the Nutch crawler to crawl jobs from commercial sites such as Hotjobs, DICE and CareerBuilder (as of today), specifically for open source skill sets. Basically the site filter