[ 
https://issues.apache.org/jira/browse/CONNECTORS-880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16795722#comment-16795722
 ] 

Karl Wright commented on CONNECTORS-880:
----------------------------------------

[~SubasiniR], your issue has nothing whatsoever to do with this ticket.  It 
really belongs first on the user list.

The issue is that your database is going offline for 2700 seconds (45 minutes) 
while your crawl is taking place.  Queries that would normally be instantaneous 
are therefore not completing at all during that period.  The plans look fine, 
so that isn't the cause.

If this is using HSQLDB (which is the default database for the single-process 
example), then you have probably exceeded its capacity.  It stores all of its 
tables in memory.  You will want to upgrade to a real database instead.  I 
would prefer PostgreSQL over MySQL because MySQL has been having transactional 
integrity issues for a couple of versions now, and that will be fatal when used 
with ManifoldCF.

By the way, "Illegal seed URL" is a warning and does not impact behavior other 
than to notify you that one of the seeds you are using in your crawl is not 
valid according to the W3C spec.  That seed will not be used.





> Under the right conditions, job aborts do not update "last checked" time
> ------------------------------------------------------------------------
>
>                 Key: CONNECTORS-880
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-880
>             Project: ManifoldCF
>          Issue Type: Bug
>          Components: Framework crawler agent
>    Affects Versions: ManifoldCF 1.4.1
>            Reporter: Karl Wright
>            Assignee: Karl Wright
>            Priority: Major
>             Fix For: ManifoldCF 1.6
>
>
> When a scheduled job is being considered to be started, MCF updates the 
> last-check field ONLY if the job didn't start.  It relies on the job's 
> completion to set the last-check field in the case where the job does start.  
> But if the job aborts, in at least one case the last-check field is NOT 
> updated.  This leads to the job being run over and over again within the 
> schedule window.
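
The behavior described in the ticket can be illustrated with a small simulation.  
This is only a sketch and not ManifoldCF code: the Job class, the 
consider_start and simulate functions, the integer clock, and the single 
schedule window from 10 to 20 are all hypothetical names and values chosen for 
illustration.

```python
from dataclasses import dataclass


@dataclass
class Job:
    last_checked: int = 0  # time of the last schedule check (hypothetical field)
    runs: int = 0          # how many times the job has been started


def consider_start(job, now, window_start, window_end, aborts, fix_applied):
    """One scheduler pass. Returns True if the job was started."""
    in_window = window_start <= now < window_end
    already_handled = job.last_checked >= window_start
    if not in_window or already_handled:
        # The job did not start, so the scheduler itself updates last-check.
        job.last_checked = now
        return False
    job.runs += 1
    # When the job starts, last-check is left for job completion to set.
    # In the buggy case an aborted job never sets it.
    if not aborts or fix_applied:
        job.last_checked = now
    return True


def simulate(aborts, fix_applied, window_start=10, window_end=20, ticks=25):
    """Run the scheduler once per tick and return the number of job starts."""
    job = Job()
    for now in range(ticks):
        consider_start(job, now, window_start, window_end, aborts, fix_applied)
    return job.runs


if __name__ == "__main__":
    print("abort, last-check not updated:", simulate(True, False))
    print("abort, last-check updated:    ", simulate(True, True))
    print("normal completion:            ", simulate(False, False))
```

In this toy model an aborting job whose last-check field is never updated is 
started on every tick inside the window, while updating the field on abort 
limits it to a single start per window.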



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
