Hi Anupam, I looked at the code at some length, and there is another deletion pathway that would not have been caught by my previous patch. However, this pathway is only triggered *before* indexing. Still, we should rule it out.
Can you remove the old patch I gave you and apply this one instead:

Index: framework/agents/src/main/java/org/apache/manifoldcf/agents/incrementalingest/IncrementalIngester.java
===================================================================
--- framework/agents/src/main/java/org/apache/manifoldcf/agents/incrementalingest/IncrementalIngester.java (revision 1307815)
+++ framework/agents/src/main/java/org/apache/manifoldcf/agents/incrementalingest/IncrementalIngester.java (working copy)
@@ -1589,6 +1589,7 @@
   protected void removeDocument(IOutputConnection connection, String documentURI, String outputDescription, IOutputRemoveActivity activities)
     throws ManifoldCFException, ServiceInterruption
   {
+    Logging.ingest.error("Removing document",new Exception("Removing document"));
     IOutputConnector connector = OutputConnectorFactory.grab(threadContext,connection.getClassName(),connection.getConfigParams(),connection.getMaxConnections());
     if (connector == null)
       // The connector is not installed; treat this as a service interruption.

Same instructions as before.

Thanks,
Karl

On Sat, Mar 31, 2012 at 4:34 PM, Karl Wright <daddy...@gmail.com> wrote:
> I tried modifying the file system connector here to use the same crawling
> model as the document connector. Everything still behaved exactly as
> expected. So we are really going to need that trace to make any further
> progress.
>
>
> Karl
>
> Sent from my Windows Phone
> ________________________________
> From: Karl Wright
> Sent: 3/31/2012 4:11 PM
> To: Anupam Bhattacharya
> Subject: RE: Running 2 jobs to update same document Index but different
>
> The output from the patch I gave you will go to manifoldcf.log. If you
> aren't seeing it please send me your properties.xml file.
>
> Karl
>
> Sent from my Windows Phone
> ________________________________
> From: Anupam Bhattacharya
> Sent: 3/31/2012 10:37 AM
> To: Karl Wright
> Subject: Re: Running 2 jobs to update same document Index but different
>
> Hello Karl,
>
> Today I ran the filesystem crawl, with the output connector set to Null and
> the input set to the Filesystem Connector.
> The job ran properly without deleting the files which I crawled.
>
> However, I could not find any log messages in any of the manifoldcf.log
> files. I did rebuild after applying the change to IncrementalIngester.java.
>
> Can you please tell me where I need to look for log messages related to
> this?
>
> Regards
> Anupam
>
>
> On Fri, Mar 30, 2012 at 4:21 PM, Karl Wright <daddy...@gmail.com> wrote:
>>
>> I did not see that you tried creating a filesystem connection and job.
>> Did you do that, and did it work for you without sending a deletion?
>> If not, please go back to using the manifoldcf id field and try that
>> first.
>>
>> Here is the patch I'd like you to apply:
>>
>> ===================================================================
>> --- framework/agents/src/main/java/org/apache/manifoldcf/agents/incrementalingest/IncrementalIngester.java (revision 1307149)
>> +++ framework/agents/src/main/java/org/apache/manifoldcf/agents/incrementalingest/IncrementalIngester.java (working copy)
>> @@ -697,6 +697,8 @@
>>      {
>>        IOutputConnection connection = connectionManager.load(outputConnectionName);
>>
>> +      Logging.ingest.error("Deleting documents!", new Exception("Deletion stack trace"));
>> +
>>        if (Logging.ingest.isDebugEnabled())
>>        {
>>          int i = 0;
>>
>>
>> Then, rebuild ManifoldCF. Every document that is deleted from the
>> index will generate a trace in the log. Run your crawl and send me
>> one of those traces.
>>
>> Karl
>>
>>
>> On Fri, Mar 30, 2012 at 6:06 AM, Anupam Bhattacharya
>> <anupam...@gmail.com> wrote:
>> > I checked the ManifoldCF logs and there were no exceptions.
>> >
>> > Additionally, I changed the id (uniqueKey) in SOLR to the
>> > Documentum-specific unique id, i.e. r_object_id, and ran the job.
>> > This time I could easily create the indexes.
>> >
>> > For (4), please tell me where I need to enable logging.
>> >
>> > On Thu, Mar 29, 2012 at 6:56 PM, Karl Wright <daddy...@gmail.com> wrote:
>> >>
>> >> "But as per my observation the deletion happens only when uniqueKey in
>> >> SOLR schema is set to id. "
>> >>
>> >> The SOLR setup cannot influence the flow in ManifoldCF unless it causes
>> >> SOLR to reject the ManifoldCF requests. So I suspect that the delete
>> >> request is happening in both cases, and it is not getting acted upon by
>> >> SOLR in the case where uniqueKey is not set to "id". That's because the
>> >> delete request from ManifoldCF will be for a key that SOLR doesn't
>> >> recognize as such.
>> >>
>> >> Please do try recommendations (3) and (4).
>> >>
>> >> Karl
>> >>
>> >
>
> --
> Thanks & Regards
> Anupam Bhattacharya
>
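
Both patches in this thread rely on the same diagnostic trick: passing a freshly constructed exception to the logger so that the log records the call stack of whoever triggered the deletion. Below is a minimal, self-contained sketch of that idea. It assumes java.util.logging as a stand-in for ManifoldCF's Logging.ingest logger, and the class DeletionTraceDemo and the caller expireJob are hypothetical names made up for illustration; only the "log a new Exception" pattern is taken from the patches.

```java
import java.io.PrintWriter;
import java.io.StringWriter;
import java.util.logging.Level;
import java.util.logging.Logger;

class DeletionTraceDemo {
    // Stand-in for Logging.ingest (assumption: java.util.logging is used here
    // only so the sketch runs without ManifoldCF on the classpath).
    private static final Logger LOG = Logger.getLogger("ingest");

    // Renders a throwable's stack trace as text, i.e. what ends up in the log.
    static String traceOf(Throwable t) {
        StringWriter buffer = new StringWriter();
        t.printStackTrace(new PrintWriter(buffer, true));
        return buffer.toString();
    }

    // Hypothetical removal routine. The exception is never thrown; it exists
    // only to carry the current call stack into the log record.
    static String removeDocument(String documentURI) {
        Exception marker = new Exception("Removing document");
        LOG.log(Level.SEVERE, "Removing document " + documentURI, marker);
        return traceOf(marker);
    }

    // Hypothetical caller, so the trace shows more than one frame.
    static String expireJob(String documentURI) {
        return removeDocument(documentURI);
    }

    public static void main(String[] args) {
        System.out.println(expireJob("file:///tmp/example.txt"));
    }
}
```

Reading the frames beneath removeDocument in the logged trace shows which code path requested the removal, which is the information being asked for in this thread.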