Hi,

I need advice on whether I can do the following using Nutch:

- specify the start and end points of a crawl
- regularly monitor pre-programmed sites, scheduling an automatic crawl of a list of URLs at regular intervals on a given day
- allow additional sites to be added on an ad hoc basis
- manage user rights and security levels
- crawl password-protected sites
- match results against predetermined patterns, both automatically and manually
- integrate it with a database (either MySQL or MSSQL)

I would also like to integrate the crawler with another hosted web app, so that confirmed results from the crawler are passed on to the web app for display and analysis.

If any of the above is not possible with Nutch, could anyone recommend another web crawler, even a non-open-source one?

Many thanks,
Rahil
