Hi Cheng,

Dynamic recrawl revisits documents based on how often they have changed
in the past, so it is hard to predict whether a given document will be
recrawled within a given time interval. To discover new documents in
SharePoint, the existing directories that contain them must be recrawled.
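
As a rough illustration only, here is a toy Python model of why the timing
gets hard to predict. It is not ManifoldCF's actual scheduling code: the
doubling rule and the names next_interval and simulate are assumptions made
up for this example. The only idea it takes from the real feature is that a
document that keeps turning up unchanged is revisited less and less often,
and that an empty maximum recrawl interval leaves that growth uncapped.

```python
def next_interval(current, changed, base=1440, maximum=None):
    """Toy rule (an assumption, not ManifoldCF's formula):
    reset to the base interval when the document changed,
    otherwise double the interval, capped at `maximum` if one is set."""
    if changed:
        return base
    grown = current * 2
    return grown if maximum is None else min(grown, maximum)


def simulate(unchanged_fetches, base=1440, maximum=None):
    """Return the interval (in minutes) after each consecutive unchanged fetch."""
    interval = base
    history = []
    for _ in range(unchanged_fetches):
        interval = next_interval(interval, changed=False,
                                 base=base, maximum=maximum)
        history.append(interval)
    return history


if __name__ == "__main__":
    # Uncapped: the interval keeps growing while the item looks unchanged.
    print(simulate(3))
    # Capped at three days: the growth stops at the maximum.
    print(simulate(3, maximum=4320))
```

In this toy model, a list or folder that looked unchanged a few times in a
row is revisited far less often than the configured 1440 minutes, which
would delay the discovery of new documents inside it.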

If you want more predictable crawling, I'd suggest doing standard minimal
crawls on a fixed schedule.  That will pick up any new documents added.
Then do full crawls (not dynamic) periodically (once a week?) to clean up
any deleted documents.

Thanks,
Karl


On Mon, Jul 30, 2018 at 4:35 AM Cheng Zeng <ze...@hotmail.co.uk> wrote:

> Hi Karl,
>
>
> I have a question about the schedule-related configuration of a job. I
> have a continuously running job that crawls the documents in SharePoint
> 2013, and it is configured to re-crawl about 26,000 docs every 24 hours.
> However, something seems to be wrong with my configuration: after the
> continuous job had been running for a few weeks, the number of active
> documents had increased by only 1 or 2, although about 20 new documents
> had been created in SharePoint. When I restarted the job, more active
> documents were found, and the count then matched the actual number of
> documents in the SharePoint lists. It seems that the job is not
> re-scanning all the documents, so I suspect that my scheduling
> configuration is wrong. I have read the section on schedule-related
> configuration in the end-user documentation at
> https://manifoldcf.apache.org/release/release-2.10/en_US/end-user-documentation.html#jobs,
> but I am still confused by the incorrect number of active documents after
> the continuous job has run for a few weeks. The version of MCF I am
> using is 2.6.
>
>
> My schedule configuration is as follows:
>
>
> Schedule type: Rescan Documents Dynamically
>
> Recrawl interval (if continuous): 1440 minutes
>
> Maximum recrawl interval (if continuous): blank
>
> Expiration interval (if continuous): blank
>
> Reseed interval (if continuous): blank
>
>
>
> Scheduled time:
>
> Schedule 1: Any day of week, 5am plus 0, every month of year, on any day
>             of month
>             Job invocation: complete
>             Maximum run time: 3000 minutes
>
>
> Schedule 2: Any day of week, 12pm plus 0, every month of year, on any day
>             of month
>             Job invocation: complete
>             Maximum run time: 3000 minutes
>
>
> The screenshot of the scheduling is attached.
>
>
> Could you please give me some advice about this problem?
>
>
> BTW: Does MCF support Domino? Are there any methods to extract documents
> from Domino?
>
>
>
> Best wishes,
>
>
> Cheng
>
>
>
>
