Already unsubscribed. Why do I still get this email?
Thanks

Steven

On Mon, Jan 30, 2023 at 7:06 AM Markus Jelsma <markus.jel...@openindex.io>
wrote:

> Yes, remove the other protocol-* plugins from the configuration. With all
> three active, it is not deterministic which one will do the work.
>
> On Mon, 30 Jan 2023 at 12:50, Raj Chidara <raj.chid...@ddismart.com>
> wrote:
>
> >
> > Hello Markus
> >   Sorry for the duplicate question.  I added the selenium plugin in
> > conf/nutch-default.xml and included the following:
> >
> > <name>plugin.includes</name>
> > <value>protocol-http|protocol-httpclient|protocol-selenium|urlfilter-(regex|validator)|parse-(html|tika)|index-(basic|anchor)|indexer-solr|scoring-opic|urlnormalizer-(pass|regex|basic)</value>
> >
> > The site is still not being crawled.  Are there any additional steps
> > required to install selenium?  Please advise.
> >
> >
> > Thanks and Regards
> >
> > Raj Chidara
> >
> > ----- Original Message -----
> > From: Markus Jelsma (markus.jel...@openindex.io)
> > Date: 30-01-2023 16:26
> > To: user@nutch.apache.org
> > Subject: Re: Site is not crawling
> >
> > Hello Raj,
> >
> > I think the same question about the same site was asked here some time
> > ago. Anyway, this site loads its content via JavaScript. You will need a
> > protocol plugin that supports it, either protocol-htmlunit or
> > protocol-selenium, instead of protocol-http or any other.
> >
> > Change the configuration for plugin.includes, and it should work.
> >
> > Markus
> >
> > On Mon, 30 Jan 2023 at 10:39, Raj Chidara <raj.chid...@ddismart.com>
> > wrote:
> >
> > >
> > > Hello,
> > >
> > >   Nutch is not able to crawl this site.  Are there any Nutch
> > > configuration changes required for this site?
> > >
> > > https://www.ich.org/
> > >
> > >
> > > Thanks and Regards
> > >
> > > Raj Chidara
> > >
> > >
> > >
> >
> >
>
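
A minimal sketch of the change Markus describes in the quoted thread: keep protocol-selenium as the only protocol-* plugin in plugin.includes. The rest of the value is copied unchanged from Raj's message. Placing the override in conf/nutch-site.xml rather than conf/nutch-default.xml is an assumption based on the usual Nutch convention for local overrides; the thread itself only mentions nutch-default.xml.

```xml
<!-- conf/nutch-site.xml (assumed location for local overrides) -->
<property>
  <name>plugin.includes</name>
  <!-- Only one protocol-* plugin is listed, so there is no ambiguity
       about which plugin fetches the page. -->
  <value>protocol-selenium|urlfilter-(regex|validator)|parse-(html|tika)|index-(basic|anchor)|indexer-solr|scoring-opic|urlnormalizer-(pass|regex|basic)</value>
</property>
```

protocol-selenium drives a real browser, so a browser and a matching driver probably also need to be available on the crawling machine. The thread does not cover that part of the setup.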
