[jira] [Commented] (STANBOL-263) Asynchronous calls support for the engines resource

Ezequiel Foncubierta (JIRA) Mon, 11 Jul 2011 02:32:31 -0700

    [ 
https://issues.apache.org/jira/browse/STANBOL-263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13063243#comment-13063243
 ]


Ezequiel Foncubierta commented on STANBOL-263:
----------------------------------------------

Bertrand Delacretaz says on the mailing list:

I agree with the need for async enhancement engines, one of the ideas
that we had when discussing the initial FISE design (that led to the
Stanbol enhancer) was to have some metadata that says "we're still
working on some parts of this content and might get some more metadata
later" in the content items to handle status info about asynchronous
enhancement.

Async processing should IMO take things like Mechanical Turk into
account, i.e. extremely slow processing, using things like
http://groups.csail.mit.edu/uid/turkit/ maybe.

Another idea, more complicated but potentially much more powerful, is
to use a kind of tuple space for enhancement engines to collaborate,
something like:

-Content item CI is added to the space
-Engine A sees CI, works on it and as a result adds triple A1 to the space
-Engine B sees triple A1, works on it and adds triple B1 to the space
-Engine A sees triple B1 and adds more metadata, based on it and CI, to the space

Engines can then work iteratively on finding out more things about
content items, and this would also allow for correlating metadata
supplied by several engines to improve the metadata quality.
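
A minimal sketch of the tuple-space steps above, assuming a toy in-memory
space and two hypothetical engines (the names TupleSpace, engine_a and
engine_b are illustrative only and are not part of Stanbol):

```python
from collections import deque


class TupleSpace:
    """Toy in-memory space: each new item is shown to every engine once."""

    def __init__(self, engines):
        self.engines = engines
        self.items = []        # everything added so far, in insertion order
        self._seen = set()     # de-duplication so the iteration terminates
        self._queue = deque()  # items not yet shown to the engines

    def add(self, item):
        if item not in self._seen:
            self._seen.add(item)
            self.items.append(item)
            self._queue.append(item)

    def run(self):
        # Keep going until no engine has anything new to contribute.
        while self._queue:
            item = self._queue.popleft()
            for engine in self.engines:
                for produced in engine(item):
                    self.add(produced)


def engine_a(item):
    # Hypothetical engine: reacts to content items and to triple B1.
    kind, label, ci = item
    if kind == "content":
        return [("triple", "A1", ci)]
    if kind == "triple" and label == "B1":
        return [("triple", "A2", ci)]  # more metadata, based on B1 and CI
    return []


def engine_b(item):
    # Hypothetical engine: reacts only to triple A1.
    kind, label, ci = item
    if kind == "triple" and label == "A1":
        return [("triple", "B1", ci)]
    return []


space = TupleSpace([engine_a, engine_b])
space.add(("content", "CI", "CI"))
space.run()
labels = [label for _, label, _ in space.items]
```

The engines never call each other directly; they only react to what
appears in the space, which is what would let a chain iterate instead of
running strictly in sequence.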

I'm basically just dreaming out loud here, but I think we should take
this into account if we introduce multiple processing chains in the
enhancer: some chains might use a totally different engine
collaboration mechanism like the above, instead of the current
sequential processing.

> Asynchronous calls support for the engines resource
> ---------------------------------------------------
>
>                 Key: STANBOL-263
>                 URL: https://issues.apache.org/jira/browse/STANBOL-263
>             Project: Stanbol
>          Issue Type: New Feature
>          Components: Enhancer
>            Reporter: Ezequiel Foncubierta
>
> Enable an async option when calling the engines/ resource. If the system is
> overloaded, or some engines take a long time to process the content, there
> should be an option to run asynchronous transactions. Some integrations
> require synchronous calls because they want to tag content in the same
> transaction, but for others synchronous calls are not a requirement.
> This proposal arises because almost all current integrations use
> synchronous calls, which means the content creation process is as
> follows:
> 1. Create the content in the CMS
> 2. Send the content to the enhancer
> 3. Write the enhancer results
> 4. Relate the content with the extracted entities
> So the CMS performance depends on the Apache Stanbol performance. An
> alternative would be to create the content and run a background process to
> extract the enhancements (using different transactions).
> A first way to achieve this is to send a URL parameter (e.g.
> referer=http://system/listener/service). If this parameter is present,
> run the enhancements in a background thread and, once finished, send the
> results to the URL specified in the referer parameter.
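
A rough sketch of the proposed referer behaviour, under stated
assumptions: run_engines stands in for the real enhancement chain, and
deliver is a hypothetical hook that a real service would implement as an
HTTP POST to the listener URL. Neither name comes from Stanbol.

```python
from concurrent.futures import ThreadPoolExecutor

_executor = ThreadPoolExecutor(max_workers=2)


def run_engines(content):
    # Placeholder for the enhancement chain (assumption, not Stanbol code).
    return {"content": content, "entities": content.split()}


def handle_request(content, referer=None, deliver=None):
    """Return results directly, or schedule delivery to `referer`."""
    if referer is None:
        # Synchronous call: the caller waits for the enhancements.
        return run_engines(content)

    def job():
        # Runs in a background thread, then pushes the results
        # to the listener named by the referer parameter.
        deliver(referer, run_engines(content))

    # Asynchronous call: returns a Future immediately.
    return _executor.submit(job)


received = []
future = handle_request(
    "Paris France",
    referer="http://system/listener/service",
    deliver=lambda url, result: received.append((url, result)),
)
future.result()  # demo only: wait for the background job to finish
sync_result = handle_request("Paris France")
```

With this shape the CMS can commit its own transaction right after the
call returns and relate the content to the extracted entities later,
when the listener receives the results.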

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
