+1.
I respect the design concept of ManifoldCF, but I think force-delete options would make MCF more
useful for those who use it as a crawler. Adding force-delete options doesn't change the default
behavior, and it doesn't break backward compatibility.
Koji
On 2022/06/14 14:46, Ricardo Ruiz wrote:
Hi Karl
We are using ManifoldCF as a crawler more than as a synchronizer. We are thinking of contributing to
ManifoldCF by adding a force job delete and a force output connector delete, taking into account, of
course, the things that need to be deleted along with them (DB, etc.). Do you think this is possible?
We think that not only we but the whole community would benefit from this kind
of functionality.
Ricardo.
On Mon, Jun 13, 2022 at 7:34 PM Karl Wright <daddy...@gmail.com> wrote:
Because ManifoldCF is not just a crawler but a synchronizer, a job represents and includes a list of
documents that have been indexed. Deleting the job also requires deleting the documents that have
been indexed. It's part of the basic model.
So if you tear down your target output instance and then try to tear down the job, it won't work.
ManifoldCF won't just throw away the memory of those documents and act as if nothing happened.
If you're just using ManifoldCF as a crawler, therefore, your fix is about as good as it gets.
You can get into similar trouble if, for example, you reinstall ManifoldCF but forget to include a
connector class that was there before. Carnage ensues.
Karl
On Mon, Jun 13, 2022 at 1:39 AM Ricardo Ruiz <ricrui3s...@gmail.com> wrote:
Hi all
My team uses MCF to crawl documents and index them into Solr instances, but for reasons beyond our
control, the instances or collections are sometimes deleted.
When we try to delete a job and the Solr instance or collection doesn't exist anymore, the job
reaches the "End notification" status and gets stuck there. No other job can be aborted or deleted
until the initial error is fixed.
We are able to clean up the errors with the following steps:
1. Reconfigure the output connector to an existing Solr instance and collection.
2. Reset the output connection, so it forgets any indexed documents.
3. Reset the job, so it forgets any indexed documents.
4. Restart the ManifoldCF server.
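The four steps above can be read as an ordered runbook. The sketch below assumes that each step is only worth attempting once the previous one has succeeded; that ordering rule is an assumption, not something ManifoldCF enforces. It calls no ManifoldCF API: `run_recovery`, `build_steps` and the action callables are hypothetical names standing in for whatever is done by hand in the UI.

```python
from typing import Callable, List, Tuple

# A step is a (description, action) pair. The actions are hypothetical
# placeholders for manual operations; they return True on success.
Step = Tuple[str, Callable[[], bool]]

# Descriptions taken from the four manual steps listed above.
DESCRIPTIONS = [
    "Reconfigure the output connector to an existing Solr instance and collection",
    "Reset the output connection",
    "Reset the job",
    "Restart the ManifoldCF server",
]


def build_steps(actions: List[Callable[[], bool]]) -> List[Step]:
    """Pair each description with its action, in the documented order."""
    if len(actions) != len(DESCRIPTIONS):
        raise ValueError("expected exactly one action per step")
    return list(zip(DESCRIPTIONS, actions))


def run_recovery(steps: List[Step]) -> List[str]:
    """Run the steps in order and stop at the first one that fails.

    Returns the descriptions of the steps that completed successfully.
    """
    completed: List[str] = []
    for description, action in steps:
        if not action():
            break  # later steps are skipped once one step fails
        completed.append(description)
    return completed
```

If the returned list is shorter than four entries, the first missing description is the step that needs attention before the stuck job can be cleared.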
Is there any other way we can solve this error? Is there any way we can force delete the job if we
don't care about the job's documents anymore?
Thanks in advance.
Ricardo.