So, I suppose the best solution could be continuous recrawling plus one periodic recrawl to delete orphaned documents.
Can the two jobs overlap?

Mario Bisonti
Information and Communications Technology
VIMAR SpA

From: Karl Wright <[email protected]>
Sent: Tuesday, 12 August 2014 12:21
To: [email protected]
Subject: Re: How delete unreachable documents on continous crawling?

Hi Mario,

Yes, periodic recrawling gives ManifoldCF the opportunity to discover abandoned documents and remove them.

Karl

On Tue, Aug 12, 2014 at 6:18 AM, Bisonti Mario <[email protected]> wrote:

OK, thanks.

So you suggest that I not use continuous crawling, and instead schedule a periodic recrawl of all documents? Is that better?

Thanks a lot.
Mario

From: Karl Wright <[email protected]>
Sent: Tuesday, 12 August 2014 12:16
To: [email protected]
Subject: Re: How delete unreachable documents on continous crawling?

Hi Mario,

Please read ManifoldCF in Action, Chapter 1. Continuous crawling has no mechanism for deleting unreachable documents, and never will, because it is fundamentally impossible to do.

Thanks,
Karl

On Tue, Aug 12, 2014 at 6:10 AM, Bisonti Mario <[email protected]> wrote:

Hello.

I set up continuous crawling on a folder of a website to index the PDF files it contains.

Schedule type: Rescan documents dynamically
Recrawl interval (if continuous): 5

I see that documents added to the folder are indexed, but documents deleted from the folder are not removed from the index.

I see that “ManifoldCF in Action” mentions “…that continuous crawling seems to be missing a phase – the “delete unreachable documents” phase.”

But how can I solve the problem, please?

Thanks a lot for your help.
Mario
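
Karl's point about periodic recrawls can be illustrated with a minimal sketch. This is not ManifoldCF code or its API; the function names, the `links` map, and the example URLs are all hypothetical. The idea it assumes is that a complete crawl pass yields the full set of reachable documents, so anything already in the index but missing from that set is orphaned. A continuous crawl never finishes such a pass, so it has no complete set to compare against.

```python
from collections import deque


def crawl_reachable(seeds, links):
    """Return every document reachable from the seeds in one complete pass.

    `links` is a hypothetical map from a document to the documents it links to.
    """
    seen = set(seeds)
    queue = deque(seeds)
    while queue:
        doc = queue.popleft()
        for child in links.get(doc, ()):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


def find_orphans(indexed, seeds, links):
    """Documents that are in the index but were not reached by the full pass."""
    return set(indexed) - crawl_reachable(seeds, links)


# Hypothetical site: the folder page now links to only two PDFs,
# while the index still holds a third one that was deleted.
links = {"/docs/": ["/docs/a.pdf", "/docs/b.pdf"]}
indexed = {"/docs/", "/docs/a.pdf", "/docs/b.pdf", "/docs/old.pdf"}

orphans = find_orphans(indexed, ["/docs/"], links)
print(sorted(orphans))  # ['/docs/old.pdf']
```

Only after the pass has finished is it safe to treat the difference as deletions, which is why the cleanup belongs to a periodic full recrawl rather than to the continuous job.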
