Hi,

On Sun, Jul 10, 2011 at 9:46 PM, Ezequiel Foncubierta
<[email protected]> wrote:
> ...If you need to process large files, an asynchronous option would be nice 
> to avoid having connections waiting for a
> response. I think that both, configurable chains and asynchronous calls, 
> ought to be implemented...

I agree with the need for async enhancement engines. One of the ideas
that we had when discussing the initial FISE design (which led to the
Stanbol enhancer) was to have some metadata in the content items that
says "we're still working on some parts of this content and might get
some more metadata later", to handle status info about asynchronous
enhancement.
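
To make the status idea a bit more concrete, here is a rough sketch in
Python. It is not Stanbol code: the ContentItem class, its fields and the
"in-progress"/"complete" values are all made up for illustration, assuming
each engine registers itself as pending when it starts and removes itself
when it delivers its metadata.

```python
from dataclasses import dataclass, field


@dataclass
class ContentItem:
    """Hypothetical content item that tracks which engines are still working."""
    uri: str
    metadata: list = field(default_factory=list)   # triples collected so far
    pending: set = field(default_factory=set)      # names of engines still running

    def start(self, engine_name):
        # An async engine announces that it has begun working on this item.
        self.pending.add(engine_name)

    def finish(self, engine_name, triples):
        # The engine delivers its triples and is no longer pending.
        self.metadata.extend(triples)
        self.pending.discard(engine_name)

    @property
    def status(self):
        # "in-progress" means more metadata might still arrive later.
        return "in-progress" if self.pending else "complete"


item = ContentItem("urn:example:ci-1")
item.start("slow-engine")
status_while_running = item.status
item.finish("slow-engine", [("urn:example:ci-1", "ex:topic", "ex:Stanbol")])
status_when_done = item.status
```

A client could then read whatever metadata is already there and check the
status later, instead of holding a connection open while a slow engine runs.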

Async processing should IMO take things like Mechanical Turk into
account, i.e. extremely slow processing, using things like
http://groups.csail.mit.edu/uid/turkit/ maybe.

Another idea, more complicated but potentially much more powerful, is
to use a kind of tuple space for enhancement engines to collaborate,
something like:

-Content item CI is added to the space
-Engine A sees CI, works on it and as a result adds triple A1 to the space
-Engine B sees triple A1, works on it and adds triple B1 to the space
-Engine A sees triple B1 and adds more metadata, based on it and CI, to the space

Engines can then work iteratively on finding out more things about
content items, and this would also allow for correlating metadata
supplied by several engines to improve the metadata quality.
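
The steps above can be sketched as a toy in-memory space, again in Python
and again purely hypothetical: the Space class, the engine functions and
the fact names (CI, A1, B1, A2) are placeholders for illustration, not an
existing API. It assumes engines are plain functions that get called for
every new fact and return whatever facts they want to add.

```python
from collections import deque


class Space:
    """Hypothetical shared space: stores facts and shows each new one to all engines."""

    def __init__(self):
        self.facts = []        # every fact added so far, in insertion order
        self.engines = []      # callables: engine(fact, space) -> list of new facts
        self._queue = deque()  # facts not yet shown to the engines

    def register(self, engine):
        self.engines.append(engine)

    def add(self, fact):
        # Ignore duplicates so that engines reacting to each other terminate.
        if fact not in self.facts:
            self.facts.append(fact)
            self._queue.append(fact)

    def run(self):
        # Keep notifying engines until no engine adds anything new.
        while self._queue:
            fact = self._queue.popleft()
            for engine in self.engines:
                for new_fact in engine(fact, self):
                    self.add(new_fact)


def engine_a(fact, space):
    if fact == ("content", "CI"):
        return [("A", "A1")]   # first pass over the content item
    if fact == ("B", "B1") and ("content", "CI") in space.facts:
        return [("A", "A2")]   # second pass, based on B1 and CI
    return []


def engine_b(fact, space):
    if fact == ("A", "A1"):
        return [("B", "B1")]   # reacts to engine A's output
    return []


space = Space()
space.register(engine_a)
space.register(engine_b)
space.add(("content", "CI"))
space.run()
```

After run() returns, space.facts holds CI, A1, B1 and A2 in that order. A
real version would need persistence, concurrency and some way of deciding
when an item is "done", none of which this sketch attempts.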

I'm basically just dreaming out loud here, but I think we should take
this into account if we introduce multiple processing chains in the
enhancer: some chains might use a totally different engine
collaboration mechanism like the above, instead of the current
sequential processing.

-Bertrand
