Re: Manifold CF-Non existent of URL

2019-09-03 Thread Karl Wright
Yes, if MCF receives a 404 response it will delete the document from the index. With continuous crawling, though, the document may not be retried for a long time. Exponential backoff is used.
Karl
On Tue, Sep 3, 2019, 1:36 AM Priya Arora wrote:
> Yes, it's a continuous job.
>
> On Tue, Sep
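The behaviour described in the reply above (delete on 404, recheck on an exponentially growing interval) can be illustrated with a small sketch. This is not ManifoldCF code: the function names `action_for_status` and `recheck_interval`, the one-hour base interval, the doubling factor, and the 30-day cap are all assumptions chosen for illustration. The actual intervals depend on how the job is configured.

```python
from datetime import timedelta


def action_for_status(status: int) -> str:
    """Map an HTTP status to a crawler action (hypothetical labels).

    Only the 404 case comes from the thread; how other statuses are
    handled is not covered there, so they are lumped together here.
    """
    if status == 404:
        return "delete_from_index"
    return "process_normally"


def recheck_interval(
    unchanged_checks: int,
    base: timedelta = timedelta(hours=1),   # assumed starting interval
    factor: float = 2.0,                    # assumed growth factor
    cap: timedelta = timedelta(days=30),    # assumed upper bound
) -> timedelta:
    """Return the wait before the next recheck of a document.

    The interval grows by `factor` for every consecutive check that
    found the document unchanged, and never exceeds `cap`.
    """
    if unchanged_checks < 0:
        raise ValueError("unchanged_checks must be non-negative")
    interval = base
    for _ in range(unchanged_checks):
        interval = min(interval * factor, cap)
        if interval >= cap:
            break  # further growth would be clipped anyway
    return min(interval, cap)


if __name__ == "__main__":
    for n in range(6):
        print(n, recheck_interval(n))
    print(action_for_status(404))
```

With these assumed defaults the waits grow as 1 h, 2 h, 4 h, 8 h and so on, which shows why a page that has been stable for a long time may return 404 for a while before the crawler revisits it and removes it from the index.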

Re: Manifold CF-Non existent of URL

2019-09-02 Thread Priya Arora
Yes, it's a continuous job.
On Tue, Sep 3, 2019 at 11:05 AM Priya Arora wrote:
> Hi,
> I have a job, myuniversity_intranet, which crawls data from an
> intranet site, and the data has been indexed in an index.
> My question is: does ManifoldCF have some functionality to test a url

Re: Manifold CF-Non existent of URL

2019-09-02 Thread Priya Arora
Hi,
I have a job, myuniversity_intranet, which crawls data from an intranet site, and the data has been indexed in an index. My question is: does ManifoldCF have some functionality to test, before indexing, whether a URL exists or not? Likewise, in my index (say index

Re: Manifold CF-Non existent of URL

2019-09-02 Thread Karl Wright
Hi,
You aren't giving me enough information to know why your job isn't rechecking URLs. Please tell me how your job is configured, specifically whether it's continuous or not.
Thanks.
Karl
On Mon, Sep 2, 2019 at 4:47 AM Priya Arora wrote:
> Hi,
>
> I have a query regarding manifoldCF. Is

Manifold CF-Non existent of URL

2019-09-02 Thread Priya Arora
Hi,
I have a query regarding ManifoldCF. Does it have some kind of functionality to check whether the URL it is crawling actually exists or returns page not found (404)? I have a requirement in which I am crawling data for a university, and the job is continuously running. After some period it found that