[ https://issues.apache.org/jira/browse/CONNECTORS-850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13900271#comment-13900271 ]
Karl Wright commented on CONNECTORS-850:
----------------------------------------

Anything you change for the job that affects what is indexed will cause a reindex to take place: forced metadata, Solr variable mapping, and so on.

If you want to get specific, there are two version strings, one for the repository connection and another for the output connection. It's up to each connector what to put in them. For the web connector, the document's metadata (from ALL header fields), URL mappings (if any), and a checksum of the content go into its version string. For the Solr output connector, metadata and some kinds of configuration information go into its version string.

If this isn't making any sense in your case, I suppose you can debug it, or look in the database at some version fields to see what is changing.

> Maximum interval in dynamic crawling
> ------------------------------------
>
>                 Key: CONNECTORS-850
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-850
>             Project: ManifoldCF
>          Issue Type: New Feature
>          Components: Framework crawler agent
>    Affects Versions: ManifoldCF 1.4.1
>            Reporter: Florian Schmedding
>            Assignee: Karl Wright
>            Priority: Minor
>              Labels: features
>             Fix For: ManifoldCF 1.5
>
>
> Currently, the dynamic crawling method used for a continuous job extends the
> reseed and recrawl intervals when no changes are found in a checked document.
> However, it should be possible to restrict this extension to a maximum value
> in order to make sure that new documents are discovered within a certain
> interval.

--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
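
As an illustration of the version-string behaviour described in the comment, here is a minimal sketch in Python. It is not ManifoldCF code: the function names `version_string` and `needs_reindex`, the use of SHA-256, and the exact way the inputs are combined are all assumptions made for the example. It only shows why a change in any header field, URL mapping, or the content itself would make the stored and current versions differ.

```python
import hashlib


def version_string(headers, url_mappings, content):
    """Hypothetical version string built from the three inputs the comment
    names for the web connector: header metadata, URL mappings, and a
    checksum of the content. The real connector's format is not shown here."""
    h = hashlib.sha256()
    # Sort header fields so that ordering alone never changes the result.
    for name, value in sorted(headers.items()):
        h.update(f"{name}={value}\n".encode("utf-8"))
    for mapping in url_mappings:
        h.update(f"map:{mapping}\n".encode("utf-8"))
    # Checksum of the document body.
    h.update(hashlib.sha256(content).digest())
    return h.hexdigest()


def needs_reindex(stored_version, current_version):
    """Assumed rule: the document is reindexed when the version differs."""
    return stored_version != current_version


body = b"<html>unchanged page</html>"
before = version_string({"Content-Type": "text/html", "Date": "Mon"}, [], body)
after = version_string({"Content-Type": "text/html", "Date": "Tue"}, [], body)
print(needs_reindex(before, after))
```

In this sketch the body is identical in both fetches, yet the versions differ because one header field changed. If a document appears to be reindexed without any visible change, a header value that varies between fetches is one thing worth checking in the stored version fields.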