Hi there,

Over the past few days I've been playing around with Solr and ManifoldCF... and I have to say, I'm quite impressed that everything works so well :)
However, I have a short question: when using the filesystem connector, is there a way to make ManifoldCF always send all documents to Solr, no matter whether they have been crawled/sent before?

I set up a job to crawl the local file system and send documents to Solr. When starting the job for the first time, everything works perfectly well and the document is successfully ingested into Solr. The problem is that when I delete Solr's index (I'm still playing around with Solr, so that happens from time to time) and restart the ManifoldCF job, the document is not sent to Solr again - probably because ManifoldCF assumes that this is not necessary, since the document did not change.

What I then did was to clear the crawled directory, start the crawl job (ManifoldCF realises that the directory is empty), re-populate the directory and restart the crawl job. I definitely don't want to set the crawl jobs up like this later on, but for testing it would be quite handy.

I hope I didn't accidentally overlook something in the user documentation or on the mailing list... Any help/hint is appreciated.

Cheers,
Tasat
