Yes, it's a continuous job.
On Tue, Sep 3, 2019 at 11:05 AM Priya Arora wrote:
> Hi,
> I have a job, myuniversity_intranet (which crawls data from an
> intranet site), and the data has been indexed in an index.
> My question is: does ManifoldCF have some functionality to test a URL
Hi,
I have a job, myuniversity_intranet (which crawls data from an
intranet site), and the data has been indexed in an index.
My question is: does ManifoldCF have some functionality to test a URL
before indexing, to check whether the URL exists?
Likewise, in my index (say index
[
https://issues.apache.org/jira/browse/CONNECTORS-1566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16921084#comment-16921084
]
Karl Wright commented on CONNECTORS-1566:
-
The only thing that is preventing this from going
[
https://issues.apache.org/jira/browse/CONNECTORS-1508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Karl Wright updated CONNECTORS-1508:
Fix Version/s: ManifoldCF 2.15 (was: ManifoldCF 2.14)
> Add
[
https://issues.apache.org/jira/browse/CONNECTORS-1521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Karl Wright updated CONNECTORS-1521:
Fix Version/s: ManifoldCF 2.15 (was: ManifoldCF 2.14)
>
[
https://issues.apache.org/jira/browse/CONNECTORS-1622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Cihad Guzel updated CONNECTORS-1622:
Summary: Upgrade to Tika 1.22 (was: Upgrade to Tika 1.22 when available)
> Upgrade
Cihad Guzel created CONNECTORS-1622:
---
Summary: Upgrade to Tika 1.22 when available
Key: CONNECTORS-1622
URL: https://issues.apache.org/jira/browse/CONNECTORS-1622
Project: ManifoldCF
Issue
Hi,
You aren't giving me enough information to know why your job isn't
rechecking URLs. Please tell me how your job is configured, specifically
whether it's continuous or not. Thanks.
Karl
On Mon, Sep 2, 2019 at 4:47 AM Priya Arora wrote:
> Hi,
>
> I have a question regarding ManifoldCF. Is
Hi,
I have a question regarding ManifoldCF. Does it have some kind of
functionality to check whether the URL it is crawling actually exists or
returns page not found (404)?
For example, I have a requirement in which I am crawling data for a
university, and the job is running continuously. After some period it found that