Hello Markus,

Sorry for the duplicate question. I added the selenium plugin in conf/nutch-default.xml and included the following:
<name>plugin.includes</name>
<value>protocol-http|protocol-httpclient|protocol-selenium|urlfilter-(regex|validator)|parse-(html|tika)|index-(basic|anchor)|indexer-solr|scoring-opic|urlnormalizer-(pass|regex|basic)</value>

The site is still not being crawled. Are there any additional steps required to install Selenium? Please advise.

Thanks and Regards
Raj Chidara

----- Original Message -----
From: Markus Jelsma (markus.jel...@openindex.io)
Date: 30-01-2023 16:26
To: user@nutch.apache.org
Subject: Re: Siet is not crawling

Hello Raj,

I think the same question about the same site was asked here some time ago. Anyway, this site loads its content via JavaScript. You will need a protocol plugin that supports it, either protocol-htmlunit or protocol-selenium, instead of protocol-http or any other. Change the configuration for plugin.includes, and it should work.

Markus

On Mon, 30 Jan 2023 at 10:39, Raj Chidara <raj.chid...@ddismart.com> wrote:

> Hello,
>
> Nutch is not able to crawl this site. Are there any Nutch configuration
> changes required for this site?
>
> https://www.ich.org/
>
> Thanks and Regards
>
> Raj Chidara
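
For reference, Markus's advice is to use protocol-selenium instead of protocol-http, not in addition to it. A minimal sketch of what that could look like, assuming the override is placed in conf/nutch-site.xml (the usual file for local settings, rather than nutch-default.xml) and that the rest of the plugin list stays exactly as posted. Any Selenium driver or browser setup the plugin may need on the crawl machine is not covered here.

```xml
<!-- conf/nutch-site.xml (assumed location; the original post edited nutch-default.xml) -->
<configuration>
  <property>
    <name>plugin.includes</name>
    <!-- protocol-http and protocol-httpclient are removed so that
         protocol-selenium is the only protocol plugin listed -->
    <value>protocol-selenium|urlfilter-(regex|validator)|parse-(html|tika)|index-(basic|anchor)|indexer-solr|scoring-opic|urlnormalizer-(pass|regex|basic)</value>
  </property>
</configuration>
```

Settings in nutch-site.xml take precedence over nutch-default.xml, so the default file can be left unmodified.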