Re: Connectors post processing

Karl Wright Fri, 08 Mar 2013 00:18:45 -0800

I am suggesting you need to isolate DFC activity in a separate process
from the rest of ManifoldCF.  It does not have to be in the Documentum
connector, or even in the Documentum Connector server process, and
indeed it would be a challenge to set up build dependencies from one
connector to another.  So I am suggesting that you create your own
custom Documentum Solr Connector process, using RMI for communication,
based on the code in connectors/documentum, that does what you want.
The book has an RMI example also.
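
A minimal sketch of that split, using only the JDK's java.rmi classes. It is
an illustration under stated assumptions, not code from connectors/documentum:
the QueueItemUpdater interface, the binding name, and port 18099 are all
hypothetical, and the place where the real DFC calls would go is only a
comment. The client half is what a custom output connector could call from
noteJobComplete(); the server half would run in its own JVM with the DFC jars
and dfc.properties on its classpath.

```java
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.registry.LocateRegistry;
import java.rmi.registry.Registry;
import java.rmi.server.UnicastRemoteObject;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

public class DfcIsolationSketch {

    /** Hypothetical remote contract: only plain serializable types cross the wire. */
    public interface QueueItemUpdater extends Remote {
        boolean markIngested(String objectId) throws RemoteException;
    }

    /** Lives in the separate DFC process; the real version would call DFC here. */
    public static class QueueItemUpdaterImpl implements QueueItemUpdater {
        public final List<String> updated = new CopyOnWriteArrayList<>();

        @Override
        public boolean markIngested(String objectId) {
            // Placeholder for the DFC work (get session, fetch object, set attribute, save).
            return updated.add(objectId);
        }
    }

    static final int PORT = 18099;                    // assumed to be a free port
    static final String NAME = "queue-item-updater";  // hypothetical binding name

    /** Server side: export the implementation and bind it in a local registry. */
    public static Registry startServer(QueueItemUpdaterImpl impl) throws Exception {
        // Pin the address embedded in stubs so the demo works on any host.
        System.setProperty("java.rmi.server.hostname", "127.0.0.1");
        Registry registry = LocateRegistry.createRegistry(PORT);
        Remote stub = UnicastRemoteObject.exportObject(impl, 0);
        registry.rebind(NAME, stub);
        return registry;
    }

    /** Client side: this JVM needs no DFC classes, only the remote interface. */
    public static boolean notifyIngested(String objectId) throws Exception {
        Registry registry = LocateRegistry.getRegistry("127.0.0.1", PORT);
        QueueItemUpdater updater = (QueueItemUpdater) registry.lookup(NAME);
        return updater.markIngested(objectId);
    }

    /** Unbind and unexport so the JVM can exit cleanly. */
    public static void stopServer(Registry registry, QueueItemUpdaterImpl impl) throws Exception {
        registry.unbind(NAME);
        UnicastRemoteObject.unexportObject(impl, true);
        UnicastRemoteObject.unexportObject(registry, true);
    }

    public static void main(String[] args) throws Exception {
        // Both halves run in one JVM here only to keep the demo self-contained.
        QueueItemUpdaterImpl impl = new QueueItemUpdaterImpl();
        Registry registry = startServer(impl);
        try {
            boolean ok = notifyIngested("example-object-id");
            System.out.println("remote call ok: " + ok + ", recorded: " + impl.updated);
        } finally {
            stopServer(registry, impl);
        }
    }
}
```

Keeping the remote interface down to strings and booleans means the caller
never needs to load a DFC class, which is the point of the separation.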


Karl

On Fri, Mar 8, 2013 at 3:11 AM,  <pankaj.pand...@wipro.com> wrote:
> Hi Karl,
>
> Thanks for your quick response!
>
> If I understood you correctly, you are suggesting that I should perform the
> Documentum update operation via the ManifoldCF Documentum connector rather
> than directly from the Solr connector; correct me if I am wrong. If so, can
> you please suggest where I should place my logic and how I should invoke
> it?
>
>
> Regards,
> Pankaj
>
> -----Original Message-----
> From: Karl Wright [mailto:daddy...@gmail.com]
> Sent: Friday, March 08, 2013 1:23 PM
> To: dev@manifoldcf.apache.org
> Subject: Re: Connectors post processing
>
> Here's one hint: I separated all communication with Documentum via DFC into 
> its own process for a reason.  It was not an arbitrary decision, believe me.  
> The ManifoldCF in Action book describes some of those reasons.  They are just 
> as applicable to your situation as they are to the Documentum repository 
> connector.
>
> To directly answer your first question, ManifoldCF loads connector jars from 
> paths which are configurable using properties.xml.  This is done via a 
> classloader.
>
> The mcf-combined war puts together all of ManifoldCF in one war.  You either 
> run that, or you run the other three wars and the agents process.  All of 
> this is covered in the how-to-build-and-deploy page.
>
> Karl
>
>
>
> On Fri, Mar 8, 2013 at 12:11 AM,  <pankaj.pand...@wipro.com> wrote:
>> Hi Karl,
>>
>> As you mentioned, I am trying to customize the Solr output connector to
>> perform an update in the Documentum system after ingestion completes
>> successfully. However, I am having trouble getting a DFC session in the
>> Solr connector's noteJobComplete() method. It either complains with
>> "[DM_DOCBROKER_E_NO_DOCBROKERS]error: No DocBrokers are configured" or
>> sometimes is unable to find the "aspectjrt.jar" file. I am using the
>> ManifoldCF REST API to invoke a job from our custom application. I have
>> placed "dfc.properties" in the Solr connector jar, and dfc.jar and
>> aspectjrt.jar are present in the "connector-lib-proprietary" folder. I also
>> placed the two jars in the mcf-api-service and mcf-combined-service apps on
>> Tomcat, but I still get the same error.
>>
>> Can you please assist me with this? Also, I am wondering how the connector
>> jars are invoked when a job is started using the REST API. Is the
>> corresponding connector code executed by mcf-api-service on the Tomcat app
>> server, or by the batch file running on the machine? The reason I ask is
>> that when I add System.out calls to my connector code, I see the message
>> printed in both Tomcat's stdout.log and the output of ManifoldCF's
>> start-agents.bat.
>>
>> Also, what purpose does the "mcf-combined-service" app serve?
>>
>> Thanks!
>>
>> Regards,
>> Pankaj
>>
>> -----Original Message-----
>> From: Karl Wright [mailto:daddy...@gmail.com]
>> Sent: Tuesday, March 05, 2013 4:52 PM
>> To: dev@manifoldcf.apache.org
>> Subject: Re: Connectors post processing
>>
>> If you are asking if there is an IRepositoryConnector method that is called 
>> on job completion, there is not.
>>
>> Karl
>>
>> On Tue, Mar 5, 2013 at 4:49 AM,  <pankaj.pand...@wipro.com> wrote:
>>> Hi Karl,
>>>
>>> Thanks for the quick response!
>>>
>>> Is there another method implemented at the connector level, common to all
>>> connectors, that gets called before the session for a particular job is
>>> released?
>>>
>>>
>>> Thanks!
>>>
>>> Regards,
>>> Pankaj
>>>
>>> -----Original Message-----
>>> From: Karl Wright [mailto:daddy...@gmail.com]
>>> Sent: Monday, March 04, 2013 3:05 PM
>>> To: dev@manifoldcf.apache.org
>>> Subject: Re: Connectors post processing
>>>
>>> Hi Pankaj,
>>>
>>> ManifoldCF is not set up as a document pipeline.  The model used presumes 
>>> that any document modification is a downstream responsibility of whatever 
>>> system the documents are output to.  So you would want to think of the 
>>> problem as simply getting all the necessary information to that system 
>>> through ManifoldCF.  Furthermore, updating systems that ManifoldCF crawls 
>>> is expressly prohibited in most situations our users find themselves in.
>>>
>>> What I would suggest is one of the following:
>>>
>>> (1) Configure the document extraction pipeline for whatever search engine 
>>> is your target, to add a stage that does what you want.  If it is Solr, you 
>>> would modify the Tika pipeline, for instance.  You won't be able to use any 
>>> ManifoldCF code for this, except as perhaps an example.  This plugin would 
>>> modify documents back in Documentum.
>>>
>>> (2) If there is no such pipeline available, you can build a custom output 
>>> connector that does essentially the same thing.  There is a method that 
>>> output connectors have which is called at the end of all jobs, called 
>>> noteJobComplete().
>>>
>>> In either case, DFC has such a massive (and outdated) dependency list that 
>>> you probably cannot run it in the same JVM as either your search engine or 
>>> ManifoldCF.  That is why ManifoldCF communicates with Documentum only 
>>> through the MCF Documentum server process, using RMI to invoke methods in 
>>> that process.  You will also need to make sure all the required information 
>>> for the postprocessing is included as metadata in the RepositoryDocument 
>>> object.
>>>
>>> Karl
>>>
>>>
>>> On Mon, Mar 4, 2013 at 12:15 AM,  <pankaj.pand...@wipro.com> wrote:
>>>> Hi,
>>>>
>>>> I want to execute a piece of code (post-processing logic) after a
>>>> Documentum/FileNet/LiveLink connector is done with the extraction process.
>>>> The post-processing logic will basically update one attribute value on a
>>>> Documentum object (IDfQueueItem) corresponding to the successfully
>>>> ingested document. Can you please help me with the issues below?
>>>>
>>>> 1. Is there a common method that is called towards the end of the
>>>> extraction process, where I can place my post-processing logic? I tried
>>>> placing the logic in processDocuments() of DCTM.java, but it seems to be
>>>> called for each document and not at the end of the entire operation.
>>>>
>>>> 2. Is there a way to convert an IDocumentum to an IDfSession? Currently, if
>>>> I try to fetch an object using IDocumentum.getObjectByQualification(), it
>>>> throws a ClassCastException involving some Proxy28 class. As a workaround,
>>>> I tried to get an explicit Documentum session in DCTM.java, but it always
>>>> throws NO_DOCBROKERS_CONFIGURED (because it can't find the dfc.properties
>>>> file). I tried placing the jar file in the connector folder (and several
>>>> others) and then placed it under the mcf-dctm-connector.jar file as well,
>>>> but got the same error. Can you advise how to resolve this error, or
>>>> suggest a workaround?
>>>>
>>>>
>>>> Thanks!
>>>>
>>>> Regards,
>>>> Pankaj
>>>>
>>>> Please do not print this email unless it is absolutely necessary.
>>>>
>>>> The information contained in this electronic message and any attachments 
>>>> to this message are intended for the exclusive use of the addressee(s) and 
>>>> may contain proprietary, confidential or privileged information. If you 
>>>> are not the intended recipient, you should not disseminate, distribute or 
>>>> copy this e-mail. Please notify the sender immediately and destroy all 
>>>> copies of this message and any attachments.
>>>>
>>>> WARNING: Computer viruses can be transmitted via email. The recipient 
>>>> should check this email and any attachments for the presence of viruses. 
>>>> The company accepts no liability for any damage caused by any virus 
>>>> transmitted by this email.
>>>>
>>>> www.wipro.com
