Re: Crawling / Indexation Query
Many Thanks

On Thu, May 7, 2020 at 4:11 PM Karl Wright wrote:

> Hi,
>
> ManifoldCF is not a crawler, it's a synchronizer. If robots says not to
> crawl something, then it will not be indexed. If robots is changed to
> prohibit crawling of certain documents, then yes, those documents will be
> removed from the index.
>
> But you can override the robots behavior in the document specification or
> configuration, I believe.
>
> Karl
>
> On Thu, May 7, 2020 at 6:27 AM ritika jain wrote:
>
>> Hi All,
>>
>> Can any body explain
>> If a URL was indexed, and afterwards a noindex tag was added - will that
>> URL then be deleted from the index when it is visited again by the crawler?
>>
>> Say a url was previously having indexation required meta tag and was
>> present in Elastic index, but then afterwards
>>
>> was added to page design afterwards.
>>
>> Should it be deleted from Index when the Manifoldcf job crawl that url
>> again or the URL will still be present in the index.
>>
>> Thanks
Re: Crawling / Indexation Query
Hi,

ManifoldCF is not a crawler, it's a synchronizer. If robots says not to crawl something, then it will not be indexed. If robots is changed to prohibit crawling of certain documents, then yes, those documents will be removed from the index.

But you can override the robots behavior in the document specification or configuration, I believe.

Karl

On Thu, May 7, 2020 at 6:27 AM ritika jain wrote:

> Hi All,
>
> Can any body explain
> If a URL was indexed, and afterwards a noindex tag was added - will that
> URL then be deleted from the index when it is visited again by the crawler?
>
> Say a url was previously having indexation required meta tag and was
> present in Elastic index, but then afterwards
>
> was added to page design afterwards.
>
> Should it be deleted from Index when the Manifoldcf job crawl that url
> again or the URL will still be present in the index.
>
> Thanks
Crawling / Indexation Query
Hi All,

Can anybody explain: if a URL was indexed, and a noindex tag was added afterwards, will that URL be deleted from the index when the crawler visits it again?

Say a URL previously had a meta tag that allowed indexing and was present in the Elastic index, but a noindex tag was later added to the page design.

Should it be deleted from the index when the ManifoldCF job crawls that URL again, or will the URL still be present in the index?

Thanks
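For readers unfamiliar with the tag in question, the following is a minimal Python sketch of how a robots `noindex` meta tag can be detected in a page, and of the keep-or-delete decision a synchronizer would then face on a re-crawl. It is not ManifoldCF code and does not reflect how its Web connector is implemented. The names `RobotsMetaParser`, `has_noindex`, and `sync_action` are hypothetical and exist only for illustration.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values may be None (e.g. a bare attribute), so normalize.
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").strip().lower() != "robots":
            return
        for part in attr_map.get("content", "").split(","):
            directive = part.strip().lower()
            if directive:
                self.directives.add(directive)


def has_noindex(html):
    """Return True if the page carries a robots meta tag with 'noindex'."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


def sync_action(html, currently_indexed):
    """Hypothetical synchronizer decision for one re-crawled URL.

    Assumption: a page marked noindex should not stay in the index.
    """
    if has_noindex(html):
        return "delete" if currently_indexed else "skip"
    return "update" if currently_indexed else "add"


if __name__ == "__main__":
    page = '<html><head><meta name="robots" content="noindex, nofollow"></head></html>'
    print(sync_action(page, currently_indexed=True))
```

Under the assumption stated in the code, a page that was indexed earlier and now carries `noindex` maps to a delete, which is the outcome the question asks about.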
Re: ES 7.6.2
Hi Ritika,

ManifoldCF's ElasticSearch connector does not include any code that requires Java 11, so you are all set. Because JDK 11 removes many packages, however, you should expect to run ManifoldCF 2.14 with Java 8. ManifoldCF 2.16, just released, supports Java 11.

Karl

On Thu, May 7, 2020 at 5:14 AM ritika jain wrote:

> Hi,
>
> Can any body tell me please whether Manifoldcf 2.14 version is compatible
> with Elastic Search Version 7.6.2 as it requires Java 11.
>
> Thanks
> Ritika
ES 7.6.2
Hi,

Can anybody please tell me whether ManifoldCF version 2.14 is compatible with Elasticsearch version 7.6.2, given that it requires Java 11?

Thanks
Ritika