Hi Karl,

Thanks a lot for your reply. I didn't change any code in the framework except my own repository connector.
I found that there are five overloads of the method for injecting document identifiers. Could you please tell me how I should choose the right one?

  activities.addDocumentReference(documentIdentifier);
  activities.addDocumentReference(documentIdentifier, parentIdentifier, relationshipType);
  activities.addDocumentReference(documentIdentifier, parentIdentifier, relationshipType, dataNames, dataValues);
  activities.addDocumentReference(documentIdentifier, parentIdentifier, relationshipType, dataNames, dataValues, originationTime);
  activities.addDocumentReference(documentIdentifier, parentIdentifier, relationshipType, dataNames, dataValues, originationTime, prereqEventNames);

The way I inject document identifiers is as follows:

  activities.addDocumentReference(docUri, documentIdentifier, RELATIONSHIP_CHILD);

Here docUri is the document URL that is supposed to be fetched, e.g.

  http://domino_server:80/path/dep1/database_name.nsf/api/data/documents

and documentIdentifier is the parent URL, e.g.

  http://domino_server:80/path/dep1/database_name.nsf/api/data/documents/unid/B0F9484E94DEA3204825813E001034E1

I am afraid that no full stack trace is thrown. All I get is the

  new IllegalArgumentException("Unrecognized document identifier: '"+documentIdentifier+"'");

raised by the following code in WorkerThread.java (org.apache.manifoldcf.crawler.system). I've found the document identifier in the "jobqueue" table, and its "dochash" value matches the hash generated by the hash method. For some document identifiers, previousDocuments.get(documentIdentifierHash) returns the queued document, but for several others it returns null. Could you please give me some indication?
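To make the difference between the simplest overloads concrete, here is a small self-contained sketch. The StubActivities class below is hypothetical, not ManifoldCF code; it only mirrors the two shortest signatures quoted above so the argument order is visible. The URLs and the "child" relationship string are the ones from my connector, reused purely for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class AddReferenceSketch {
  // Hypothetical stand-in for the activities object a connector receives;
  // it just records which overload was called and with what arguments.
  static class StubActivities {
    final List<String> calls = new ArrayList<>();

    // Simplest form: identifier only, no parent/relationship information.
    void addDocumentReference(String documentIdentifier) {
      calls.add("id-only:" + documentIdentifier);
    }

    // Identifier plus parent and relationship type, for when the parent
    // relationship matters (e.g. hopcount-style filtering).
    void addDocumentReference(String documentIdentifier, String parentIdentifier,
        String relationshipType) {
      calls.add("with-parent:" + documentIdentifier + "<-" + parentIdentifier
          + ":" + relationshipType);
    }
  }

  public static void main(String[] args) {
    StubActivities activities = new StubActivities();
    String parent = "http://domino_server:80/path/dep1/database_name.nsf/api/data/documents";
    String child = parent + "/unid/B0F9484E94DEA3204825813E001034E1";
    // Note the argument order: the document to be fetched comes first, its parent second.
    activities.addDocumentReference(child, parent, "child");
    System.out.println(activities.calls.get(0));
  }
}
```

The point of the sketch is only the argument order: whichever overload is chosen, the first argument is always the identifier of the document being queued, and the parent (if any) comes second.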
  protected IPipelineSpecificationWithVersions computePipelineSpecificationWithVersions(
    String documentIdentifierHash,
    String componentIdentifierHash,
    String documentIdentifier)
  {
    // Returns null. The problem is here.
    QueuedDocument qd = previousDocuments.get(documentIdentifierHash);
    if (qd == null)
      throw new IllegalArgumentException("Unrecognized document identifier: '"+documentIdentifier+"'");
    return new PipelineSpecificationWithVersions(pipelineSpecification, qd, componentIdentifierHash);
  }

Best wishes,
Cheng

________________________________
From: Karl Wright <daddy...@gmail.com>
Sent: 12 November 2018 18:46
To: user@manifoldcf.apache.org
Subject: Re: Job stuck - WorkerThread functions return null

Hi,

Have you been modifying the framework code? If so, I really cannot help you.

If you haven't -- it looks like you've got code that is injecting document identifiers that are incorrect. But I will need to see a full stack trace to be sure of that.

Thanks,
Karl

On Mon, Nov 12, 2018 at 4:06 AM Cheng Zeng <ze...@hotmail.co.uk> wrote:

Hi Karl,

I am developing my own repository connector, for which I borrowed some code from the file repository connector. I use my repository connector to crawl documents from an IBM Domino system. I managed to retrieve all the files in Domino; however, when I restarted my job to recrawl the Domino database, I ran into a problem with the following code, where previousDocuments.get(documentIdentifierHash) in WorkerThread.java (org.apache.manifoldcf.crawler.system) returns null for some of the document ids. As a result, the job gets stuck on a specific document id. Could you please tell me how I could fix the problem?

  protected IPipelineSpecificationWithVersions computePipelineSpecificationWithVersions(
    String documentIdentifierHash,
    String componentIdentifierHash,
    String documentIdentifier)
  {
    // Returns null. The problem is here.
    QueuedDocument qd = previousDocuments.get(documentIdentifierHash);
    if (qd == null)
      throw new IllegalArgumentException("Unrecognized document identifier: '"+documentIdentifier+"'");
    return new PipelineSpecificationWithVersions(pipelineSpecification, qd, componentIdentifierHash);
  }

Thanks a lot.
Cheng
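As a postscript to the thread above: the symptom (a hash-keyed lookup returning null for only some identifiers) can be reproduced with a tiny self-contained sketch. Nothing here is ManifoldCF code; the hash function is a simple stand-in, and the trailing-slash variation is just one hypothetical way an identifier injected at reference time can differ from the identifier that was originally queued.

```java
import java.util.HashMap;
import java.util.Map;

public class HashLookupSketch {
  // Hypothetical stand-in for the framework's identifier hash; the real one differs,
  // but any deterministic hash shows the same effect.
  static String hash(String documentIdentifier) {
    return Integer.toHexString(documentIdentifier.hashCode());
  }

  public static void main(String[] args) {
    Map<String, String> previousDocuments = new HashMap<>();
    String queuedId = "http://domino_server:80/path/db.nsf/api/data/documents/unid/ABC";
    previousDocuments.put(hash(queuedId), "queuedDocument");

    // Looking up the exact identifier that was queued succeeds.
    System.out.println(previousDocuments.get(hash(queuedId)) != null);   // true

    // The same identifier re-injected with a trailing slash hashes differently,
    // so the lookup misses and returns null.
    String reinjectedId = queuedId + "/";
    System.out.println(previousDocuments.get(hash(reinjectedId)) == null); // true
  }
}
```

The takeaway is that the map lookup only works if the connector injects byte-for-byte the same identifier string every time; any variation (slash, case, encoding) produces a different hash key and a null result.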