So, I suppose the best solution could be:
Continuous recrawling, plus one periodic recrawl to delete orphaned documents.

Can I run the two jobs in parallel?

Mario Bisonti
Information and Communications Technology

VIMAR SpA
Tel. +39 0424 488 644
[email protected]<mailto:[email protected]>
Take care of the environment. Print only if necessary.





From: Karl Wright [mailto:[email protected]]
Sent: Tuesday, 12 August 2014 12:21
To: [email protected]
Subject: Re: How delete unreachable documents on continous crawling?

Hi Mario,

Yes, periodic recrawling allows ManifoldCF the opportunity to discover 
abandoned documents and remove them.

Karl
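
To illustrate why a crawl that finishes can remove abandoned documents, here is a minimal Python sketch. It is not ManifoldCF code: the names full_crawl, sweep_orphans, links and index are hypothetical and only model the general idea. Once a complete pass over the site has ended, anything in the index that the pass did not reach can be treated as orphaned. A continuous crawl never reaches that "end of pass" point, so it has no moment at which to make this comparison.

```python
def full_crawl(seeds, links):
    """Return every document reachable from the seeds in one complete pass.

    `links` is a hypothetical stand-in for the live site: it maps each
    existing document to the documents it links to.
    """
    seen, stack = set(), list(seeds)
    while stack:
        doc = stack.pop()
        # Skip documents already visited or no longer present on the site.
        if doc in seen or doc not in links:
            continue
        seen.add(doc)
        stack.extend(links[doc])
    return seen


def sweep_orphans(index, reachable):
    """Remove indexed documents that the finished crawl did not reach."""
    orphans = index - reachable
    index.difference_update(orphans)
    return orphans


# The folder now lists only a.pdf and b.pdf; old.pdf was deleted from the site.
links = {"folder/": ["a.pdf", "b.pdf"], "a.pdf": [], "b.pdf": []}
index = {"folder/", "a.pdf", "b.pdf", "old.pdf"}

reachable = full_crawl(["folder/"], links)
orphans = sweep_orphans(index, reachable)
print(sorted(orphans))
```

In this toy model the sweep is only safe because full_crawl has returned, which is the step a periodic recrawl provides.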

On Tue, Aug 12, 2014 at 6:18 AM, Bisonti Mario 
<[email protected]<mailto:[email protected]>> wrote:
OK, thanks.

So you suggest that I not use continuous crawling and instead schedule a 
periodic re-crawl of all documents?
Is that better?
Thanks a lot.



Mario





From: Karl Wright [mailto:[email protected]<mailto:[email protected]>]
Sent: Tuesday, 12 August 2014 12:16
To: [email protected]<mailto:[email protected]>
Subject: Re: How delete unreachable documents on continous crawling?

Hi Mario,
Please read ManifoldCF in Action Chapter 1.  Continuous crawling has no 
mechanism for deleting unreachable documents, and never will, because it is 
fundamentally impossible to do.
Thanks,
Karl

On Tue, Aug 12, 2014 at 6:10 AM, Bisonti Mario 
<[email protected]<mailto:[email protected]>> wrote:
Hello.
I set up continuous crawling on a folder of a website to index the PDF files 
it contains.

Schedule type: Rescan documents dynamically
Recrawl interval (if continuous): 5

I see that if documents are added to the folder, they are indexed, but if 
documents are deleted, they aren’t removed from the index.
I see that “ManifoldCF in Action” mentions “…that continuous 
crawling seems to be missing a phase – the “delete unreachable documents” 
phase.”

But how can I solve the problem, please?
Thanks a lot for your help.
Mario





