Hi, I think a better option would be to follow the example of the Solr Data Import Handler and create a CMIS Import Handler. For example, the CMIS spec supports change logs: a list of the content that has been created, updated or deleted. Stanbol could track the changes in a CMIS repository and run the extraction on them.
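To make the idea a bit more concrete, here is a rough sketch of such a polling importer. Nothing in it is an existing Stanbol or CMIS client API: `ChangeEntry`, `ChangeLogSource` and `CmisChangeImporter` are hypothetical names, and I am assuming the repository can hand back the changes recorded after a given change log token.

```java
import java.util.List;
import java.util.function.Consumer;

/** One change-log entry. Illustrative type, not a CMIS binding class. */
final class ChangeEntry {
    enum Type { CREATED, UPDATED, DELETED }

    final String objectId;
    final Type type;
    final String token; // change log token identifying this entry

    ChangeEntry(String objectId, Type type, String token) {
        this.objectId = objectId;
        this.type = type;
        this.token = token;
    }
}

/** Hypothetical stand-in for whatever CMIS client is used to read the change log. */
interface ChangeLogSource {
    /** Returns up to maxItems changes recorded after the given token (null = from the start). */
    List<ChangeEntry> changesSince(String token, int maxItems);
}

/** Polls the change log and forwards each change to an extraction or cleanup callback. */
class CmisChangeImporter {
    private final ChangeLogSource source;
    private final Consumer<String> enhance; // called for created or updated objects
    private final Consumer<String> remove;  // called for deleted objects
    private String lastToken;               // last change already processed

    CmisChangeImporter(ChangeLogSource source, Consumer<String> enhance, Consumer<String> remove) {
        this.source = source;
        this.enhance = enhance;
        this.remove = remove;
    }

    /** Processes one batch of changes and returns how many entries were handled. */
    int poll(int maxItems) {
        List<ChangeEntry> changes = source.changesSince(lastToken, maxItems);
        for (ChangeEntry change : changes) {
            if (change.type == ChangeEntry.Type.DELETED) {
                remove.accept(change.objectId);
            } else {
                enhance.accept(change.objectId);
            }
            lastToken = change.token; // remember progress so the next poll resumes here
        }
        return changes.size();
    }
}
```

A scheduled task could call `poll` at a fixed interval, so the CMS never waits on the enhancer while content is being created.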
Ainga

On 7 Jul 2011, at 05:06, Olivier Grisel wrote:

> 2011/7/6 Ezequiel Foncubierta <[email protected]>:
>> Hi,
>>
>> I'm not sure if this option is currently supported or there is a plan to do
>> it. I think that would be interesting to enable an async option when you
>> call the engines/ resource. If the system is overloaded, or some engines
>> take a long time to process the content, it should be an option to run
>> asynchronous transactions. Some integrations requires synchronous calls,
>> because they want to tag contents in the same transactions. But, could be
>> some others in which the synchronous calls are not a required feature.
>>
>> This proposal is because, almost the all the current integration uses
>> synchronous calls. It means that, the content creation process is as
>> following:
>>
>> 1. Create the content in the CMS
>> 2. Send the content to the enhancer
>> 3. Write the enhancer results
>> 4. Relate the content with the extracted entities
>>
>> So, the CMS performance depends on the Apache Stanbol performance. An
>> alternative, would be creating the content and run a background process to
>> extract the enhancements (using different transactions).
>>
>> A first way to get it, is by sending a url parameter (e.g.
>> referer=http://system/listener/service). If this parameter is present, then
>> run the enhancements in a background thread and, once finished, then send
>> the results to the specified url in the referer parameter.
>
> Sounds like a good approach but nothing is implemented yet. The Async
> stuff that occurs in the source code is a left over of early code
> prototyping that happened during the first sprint and was never lead
> to it's term.
>
>> You know better than me the pros and cons of the current implementation,
>> so... what do you think about the asynchronous calls?
>
> They would be useful. Please file a new jira for this.
>
> We need to extend the JobManager to fork threads for such stuff. We
> could use the JDK ThreadPoolExecutor API to do this quite easily (in
> memory queued tasks with basic multicore parallelism).
>
> Later we might also want to provide a way to query for some monitoring
> / progress info: the first query returns a token id that could be used
> to query for a description of the progress of the job.
>
> --
> Olivier
> http://twitter.com/ogrisel - http://github.com/ogrisel
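
For reference, the ThreadPoolExecutor suggestion with a token for progress queries could look roughly like the sketch below. This is not the existing JobManager code: `AsyncEnhancementJobs`, its methods and the `Function<String, String>` standing in for the engine chain are all hypothetical, and the pool sizing (one thread per core, unbounded in-memory queue) is only an assumption.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;

/** Hypothetical async front end for enhancement jobs; not the actual Stanbol JobManager. */
class AsyncEnhancementJobs {
    enum Status { RUNNING, DONE, FAILED, UNKNOWN }

    private final ThreadPoolExecutor pool;
    private final Function<String, String> enhancer; // stand-in for the engine chain
    private final Map<String, CompletableFuture<String>> jobs = new ConcurrentHashMap<>();

    AsyncEnhancementJobs(Function<String, String> enhancer) {
        int cores = Runtime.getRuntime().availableProcessors();
        // Fixed-size pool, one thread per core, with an unbounded in-memory task queue.
        this.pool = new ThreadPoolExecutor(cores, cores, 0L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<>());
        this.enhancer = enhancer;
    }

    /** Queues the content for enhancement and immediately returns a token id. */
    String submit(String content) {
        String token = UUID.randomUUID().toString();
        jobs.put(token, CompletableFuture.supplyAsync(() -> enhancer.apply(content), pool));
        return token;
    }

    /** Reports the progress of the job identified by the token. */
    Status status(String token) {
        CompletableFuture<String> job = jobs.get(token);
        if (job == null) {
            return Status.UNKNOWN;
        }
        if (!job.isDone()) {
            return Status.RUNNING;
        }
        return job.isCompletedExceptionally() ? Status.FAILED : Status.DONE;
    }

    /** Blocks until the job has finished and returns its result. */
    String awaitResult(String token) {
        CompletableFuture<String> job = jobs.get(token);
        if (job == null) {
            throw new IllegalArgumentException("unknown token: " + token);
        }
        return job.join();
    }

    /** Stops accepting new jobs; already queued jobs still run. */
    void shutdown() {
        pool.shutdown();
    }
}
```

The callback variant proposed earlier in the thread (sending the results to the URL given in the referer parameter) could be attached to the same future with `thenAccept`, so the two approaches do not exclude each other.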
