Hi Karl,

The only error message that is thrown continuously in the ManifoldCF log is:

FATAL 2015-05-08 18:42:47,043 (Worker thread '40') - Error tossed: null
java.lang.NullPointerException

I do notice that the file that needs to be deleted is shown in the Queue Status 
report and keeps jumping between “Processing” and “About to Process” statuses 
every 30 seconds.
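
For reference, here is a minimal sketch of how the log could be scanned for 
recurring entries like the one above. It assumes every entry starts with a 
header line shaped like the FATAL line quoted here (level, timestamp, thread 
name in parentheses, then the message) and that any following non-header lines 
belong to that entry's stack trace. The names parse_entries and summarize are 
made up for illustration and are not part of ManifoldCF.

```python
import re
from collections import Counter

# Assumed header layout, inferred only from the single log line quoted above:
#   LEVEL yyyy-mm-dd HH:MM:SS,mmm (thread name) - message
HEADER = re.compile(
    r"^(?P<level>[A-Z]+)\s+"
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+"
    r"\((?P<thread>.*?)\)\s+-\s+(?P<msg>.*)$"
)


def parse_entries(lines):
    """Group log lines into entries; non-header lines are kept as trace lines."""
    entries = []
    for raw in lines:
        line = raw.rstrip("\n")
        match = HEADER.match(line)
        if match:
            entry = match.groupdict()
            entry["trace"] = []
            entries.append(entry)
        elif entries and line.strip():
            # Continuation line (exception class or stack frame) of the last entry.
            entries[-1]["trace"].append(line.strip())
    return entries


def summarize(entries, levels=("FATAL", "ERROR")):
    """Count entries per (level, message, first trace line) for the given levels."""
    counts = Counter()
    for entry in entries:
        if entry["level"] in levels:
            first_trace = entry["trace"][0] if entry["trace"] else ""
            counts[(entry["level"], entry["msg"], first_trace)] += 1
    return counts


# The two lines quoted above, used as sample input.
sample = [
    "FATAL 2015-05-08 18:42:47,043 (Worker thread '40') - Error tossed: null",
    "java.lang.NullPointerException",
]

for key, count in summarize(parse_entries(sample)).items():
    print(count, *key)
```

To run it against a real log, the sample list can be replaced with an open 
file object, since parse_entries only iterates over lines.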

Timo


> On May 8, 2015, at 1:40 PM, Karl Wright <[email protected]> wrote:
> 
> Hi Timo,
> 
> As I said, I don't think your configuration is the source of the delete 
> issue. I suspect the searchblox connector.
> 
> In the absence of a thread dump, can you look for exceptions in the 
> manifoldcf log?
> 
> Karl
> 
> Sent from my Windows Phone
> From: Timo Selvaraj
> Sent: 5/8/2015 10:06 AM
> To: [email protected]
> Subject: Re: File system continuous crawl settings
> 
> When I change the settings to the following, updated or modified documents 
> are now indexed but deleting the documents that are removed is still an issue:
> 
> Schedule type:        Rescan documents dynamically
> Minimum recrawl interval:     5 minutes       Maximum recrawl interval:       10 minutes
> Expiration interval:  Infinity        Reseed interval:        60 minutes
> No scheduled run times
> Maximum hop count for link type 'child':      Unlimited
> Hop count mode:       Delete unreachable documents
> 
> Do I need to set the reseed interval to Infinity?
> 
> Any thoughts?
> 
> 
>> On May 8, 2015, at 6:18 AM, Karl Wright <[email protected]> wrote:
>> 
>> I just tried your configuration here.  A deleted document in the file system 
>> was indeed picked up as expected.
>> 
>> I did notice that your "expiration" setting is, essentially, cleaning out 
>> documents at a rapid clip.  With this setting, documents will be expired 
>> before they are recrawled.  You probably want one strategy or the other but 
>> not both.
>> 
>> As for why a deleted document is "stuck" in Processing: the only thing I can 
>> think of is that the output connection you've chosen is having trouble 
>> deleting the document from the index.  What output connector are you using?
>> 
>> Karl
>> 
>> 
>> On Fri, May 8, 2015 at 4:36 AM, Timo Selvaraj <[email protected]> wrote:
>> Hi,
>> 
>> We are testing the continuous crawl feature of the file system connector on a 
>> small folder to check whether documents added to the folder, documents 
>> removed from it, and modified documents are all handled by the continuous 
>> crawl job.
>> 
>> Here are the settings we use:
>> 
>> Schedule type:       Rescan documents dynamically
>> Minimum recrawl interval:    5 minutes       Maximum recrawl interval:       10 minutes
>> Expiration interval: 5 minutes       Reseed interval:        10 minutes
>> No scheduled run times
>> Maximum hop count for link type 'child':     Unlimited
>> Hop count mode:      Delete unreachable documents
>> 
>> 
>> New documents seem to be picked up by the job; however, the removal of a 
>> document or an update to a document is not being picked up.
>> 
>> Am I missing any settings for the deletions or updates? I do see the 
>> document that has been removed is showing as Processing under Queue Status 
>> and others are showing as Waiting for Processing.
>> 
>> Any idea what setting is missing for the deletes/updates to be recognized 
>> and re-indexed?
>> 
>> Thanks,
>> Timo 
>> 
> 
