Hi,

On Sun, Jul 10, 2011 at 9:46 PM, Ezequiel Foncubierta <[email protected]> wrote:
> ...If you need to process large files, an asynchronous option would be nice
> to avoid having connections waiting for a response. I think that both,
> configurable chains and asynchronous calls, ought to be implemented...

I agree with the need for async enhancement engines. One of the ideas that we had when discussing the initial FISE design (that led to the Stanbol enhancer) was to have some metadata in the content items that says "we're still working on some parts of this content and might get some more metadata later", to handle status info about asynchronous enhancement.

Async processing should IMO take things like Mechanical Turk into account, i.e. extremely slow processing, using things like http://groups.csail.mit.edu/uid/turkit/ maybe.

Another idea, more complicated but potentially much more powerful, is to use a kind of tuple space for enhancement engines to collaborate, something like:

-Content item CI is added to the space
-Engine A sees CI, works on it and as a result adds triple A1 to the space
-Engine B sees triple A1, works on it and adds triple B1 to the space
-Engine A sees triple B1 and adds more metadata, based on it and CI, to the space

Engines can then work iteratively on finding out more things about content items, and this would also allow for correlating metadata supplied by several engines to improve the metadata quality.

I'm basically just dreaming out loud here, but I think we should take this into account if we introduce multiple processing chains in the enhancer: some chains might use a totally different engine collaboration mechanism like the above, instead of the current sequential processing.

-Bertrand
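
To make the tuple space idea above a bit more concrete, here is a rough sketch in Python. It is only an illustration of the collaboration pattern, not Stanbol code: the Space class, the engine_a and engine_b functions, the tuple layout and the extra triple "A2" are all made-up names, and a real version would presumably store RDF triples and run engines concurrently rather than in a single loop.

```python
from collections import deque


class Space:
    """Toy in-memory space (hypothetical): stores items and shows each
    new item to every registered engine."""

    def __init__(self):
        self.items = []          # everything added so far, in order
        self.engines = []        # callables: engine(item, space) -> list of new items
        self._pending = deque()  # items not yet shown to the engines

    def register(self, engine):
        self.engines.append(engine)

    def add(self, item):
        # Ignore duplicates so that iterative processing terminates.
        if item in self.items:
            return
        self.items.append(item)
        self._pending.append(item)

    def run(self):
        # Keep notifying engines until nobody adds anything new.
        while self._pending:
            item = self._pending.popleft()
            for engine in self.engines:
                for new_item in engine(item, self):
                    self.add(new_item)


def engine_a(item, space):
    """Hypothetical engine A: reacts to the content item and to triple B1."""
    if item == ("content", "CI"):
        return [("triple", "A1")]
    # Uses both B1 and the original content item to add more metadata.
    if item == ("triple", "B1") and ("content", "CI") in space.items:
        return [("triple", "A2")]
    return []


def engine_b(item, space):
    """Hypothetical engine B: reacts only to triple A1."""
    if item == ("triple", "A1"):
        return [("triple", "B1")]
    return []


def demo():
    space = Space()
    space.register(engine_a)
    space.register(engine_b)
    space.add(("content", "CI"))
    space.run()
    return space.items


if __name__ == "__main__":
    for entry in demo():
        print(entry)
```

The point of the sketch is that no engine calls another directly: each one only watches the space, so the order of processing falls out of what gets added rather than from a fixed sequential chain.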
