Hi Timo,

As I said, I don't think your configuration is the source of the delete
issue. I suspect the searchblox connector.

In the absence of a thread dump, can you look for exceptions in the
manifoldcf log?
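
One way to do that scan is sketched below. It is only an illustration: the function name `find_exceptions` and the regular expression are assumptions chosen for this example, not part of ManifoldCF, and the pattern simply looks for Java-style exception names and "Caused by:" lines.

```python
import re
from typing import Iterable, List, Tuple

# Assumed pattern: Java exception/error class names such as
# "java.io.IOException", plus "Caused by:" stack-trace lines.
EXCEPTION_PATTERN = re.compile(r"(\b[\w.]*(?:Exception|Error)\b|^\s*Caused by:)")


def find_exceptions(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (line_number, text) for every line that looks like an exception."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if EXCEPTION_PATTERN.search(line):
            hits.append((number, line.rstrip("\n")))
    return hits
```

To use it, open the log file (its location depends on your installation) and pass the file object to `find_exceptions`, then print the returned line numbers and text.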

Karl

------------------------------
From: Timo Selvaraj
Sent: 5/8/2015 10:06 AM
To: [email protected]
Subject: Re: File system continuous crawl settings

When I change the settings to the following, updated or modified documents
are now indexed but deleting the documents that are removed is still an
issue:

Schedule type: Rescan documents dynamically
Minimum recrawl interval: 5 minutes
Maximum recrawl interval: 10 minutes
Expiration interval: Infinity
Reseed interval: 60 minutes
No scheduled run times
Maximum hop count for link type 'child': Unlimited
Hop count mode: Delete unreachable documents

Do I need to set the reseed interval to Infinity?

Any thoughts?


On May 8, 2015, at 6:18 AM, Karl Wright <[email protected]> wrote:

I just tried your configuration here.  A deleted document in the file
system was indeed picked up as expected.

I did notice that your "expiration" setting is, essentially, cleaning out
documents at a rapid clip.  With this setting, documents will be expired
before they are recrawled.  You probably want one strategy or the other but
not both.
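
The conflict between the two settings can be pictured with a deliberately simplified check. This is a sketch under a stated assumption, not ManifoldCF's actual scheduling logic: it assumes a document is expired before it is recrawled whenever the expiration interval is no longer than the minimum recrawl interval. The function name is hypothetical.

```python
from typing import Optional


def expires_before_recrawl(min_recrawl_min: float,
                           expiration_min: Optional[float]) -> bool:
    """Simplified model: True if expiry happens no later than the earliest recrawl.

    expiration_min=None stands for an expiration interval of "Infinity".
    """
    if expiration_min is None:
        return False
    return expiration_min <= min_recrawl_min
```

Under this model, a 5 minute expiration interval with a 5 minute minimum recrawl interval returns True, while an expiration interval of Infinity returns False.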

As for why a deleted document is "stuck" in Processing: the only thing I
can think of is that the output connection you've chosen is having trouble
deleting the document from the index.  What output connector are you using?

Karl


On Fri, May 8, 2015 at 4:36 AM, Timo Selvaraj <[email protected]>
wrote:

> Hi,
>
> We are testing the continuous crawl feature of the file system connector on a
> small folder, to check whether the continuous crawl job handles new documents
> added to the folder, missing documents removed, and modified documents
> updated.
>
> Here are the settings we use:
>
> Schedule type: Rescan documents dynamically
> Minimum recrawl interval: 5 minutes
> Maximum recrawl interval: 10 minutes
> Expiration interval: 5 minutes
> Reseed interval: 10 minutes
> No scheduled run times
> Maximum hop count for link type 'child': Unlimited
> Hop count mode: Delete unreachable documents
>
> New documents seem to be picked up by the job; however, the removal of a
> document or an update to a document is not being picked up.
>
> Am I missing any settings for the deletions or updates? I do see the
> document that has been removed is showing as Processing under Queue Status
> and others are showing as Waiting for Processing.
>
> Any idea what setting is missing for the deletes/updates to be recognized
> and re-indexed?
>
> Thanks,
> Timo
>
