Hi there!

Here is a first implementation of the Camel/Stanbol integration.
There is plenty of room for improvement, but it gives a first idea.

You will find the branch here: https://svn.apache.org/repos/asf/incubator/stanbol/branches/cameltrial/

Changes are:
* in /enhancer: modified the engine endpoint to handle routes/chains, and added the cameljobmanager
* in /launchers: added a camellauncher

°°°°°° build and start °°°°°°

To try it, run "mvn3 clean install" in /enhancer and in /launchers/camellauncher.

Then start the launcher as usual.

REMARK: as configured, this will create 2 folders in your /tmp folder:
- chaininput: folder continuously scanned for text files that have to be enhanced
- chainoutput: results of the processing, written as RDF files.

°°°°°° use it °°°°°°

- A default route/chain is defined. It does the same as the weightedjobmanager, and it is the one called by the web interface.

- Other routes are defined here [1].

- They can be fired through the usual engines REST API; you just need to append the chain name to the URL: http://localhost:8080/engines/{chainName} [2]

So you have these callable chains:
1) metaxa: just calls the metaxa engine and sends the output back to curl
2) metaxa2: calls metaxa, then the langid engine, and sends the output back to curl *and* creates an RDF file of the result
3) chainlink: calls metaxa, then another defined route (here the default one, the weighted chain)

And this "pool" chain:
It is a chain that scans for files in /tmp/chaininput, processes these files and puts the RDF output in /tmp/chainoutput.

Files are deleted from the chaininput folder, but that is a choice: it can be configured to keep them in place (add a noop=true parameter to the Camel config URL).

For now only plain text is accepted, but it is just a matter of adding a Tika MIME type detector to enable it for any MIME type (extraction would still rely on metaxa).
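
To make the "pool" chain concrete, here is a small Python sketch of how a client could feed it: drop a plain-text file into the scanned folder, then watch the output folder for a new file. The two folder paths come from the description above; the helper names (submit_text, wait_for_new_output), the timeout values and the idea of detecting results by "any new file in chainoutput" are my own assumptions, since the naming of the output files is not described here.

```python
import time
from pathlib import Path

# Folders as described above; adjust if the route is configured differently.
INPUT_DIR = Path("/tmp/chaininput")
OUTPUT_DIR = Path("/tmp/chainoutput")


def submit_text(text, name, input_dir=INPUT_DIR):
    """Write a plain-text file into the scanned folder and return its path.

    Hypothetical helper: the route only needs a text file to show up there.
    """
    input_dir.mkdir(parents=True, exist_ok=True)
    target = input_dir / name
    target.write_text(text, encoding="utf-8")
    return target


def wait_for_new_output(before, output_dir=OUTPUT_DIR, timeout=30.0, interval=0.5):
    """Poll output_dir until a file that is not in `before` appears.

    Returns the sorted list of new paths, or an empty list on timeout.
    Assumption: any new file in the folder counts as a result.
    """
    deadline = time.monotonic() + timeout
    while True:
        current = set(output_dir.glob("*")) if output_dir.exists() else set()
        new = sorted(current - set(before))
        if new:
            return new
        if time.monotonic() >= deadline:
            return []
        time.sleep(interval)
```

Typical use would be to snapshot the output folder first (before = set(OUTPUT_DIR.glob("*"))), call submit_text("some text", "test.txt"), then call wait_for_new_output(before). With the default configuration the input file disappears from chaininput, so do not rely on it still being there afterwards.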

++



[1] : https://svn.apache.org/repos/asf/incubator/stanbol/branches/cameltrial/enhancer/jobmanager/cameljobmanager/filepoolchain/src/main/java/org/apache/stanbol/enhancer/jobmanager/defaultRoute/FileRoute.java

[2] examples:
curl -X POST -H "Accept: text/turtle" -H "Content-type: text/plain" --data "Here comes a little test with Paris as content and also Berlin but why not detect city as Boston." http://localhost:8080/engines/metaxa

curl -X POST -H "Accept: text/turtle" -H "Content-type: text/plain" --data "Here comes a little test with Paris as content and also Berlin but why not detect city as Boston." http://localhost:8080/engines/chainlink
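
For those who would rather script this than use curl, here is a rough Python equivalent of the two calls above, using only the standard library. The URL pattern and the headers are the ones from the curl commands; the function names (build_enhance_request, enhance) are just illustrative, and I am assuming the server is running locally on port 8080 as in the examples.

```python
import urllib.request

# Base URL taken from the curl examples above.
BASE_URL = "http://localhost:8080/engines"


def build_enhance_request(text, chain_name, accept="text/turtle"):
    """Build (without sending) the POST request for the given chain name."""
    return urllib.request.Request(
        BASE_URL + "/" + chain_name,
        data=text.encode("utf-8"),
        headers={"Accept": accept, "Content-Type": "text/plain"},
        method="POST",
    )


def enhance(text, chain_name):
    """Send the text to the chain and return the response body as a string.

    Needs a running launcher; raises urllib.error.URLError otherwise.
    """
    request = build_enhance_request(text, chain_name)
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

For example, enhance("Here comes a little test with Paris as content.", "metaxa") should give back the same Turtle output as the first curl command, and passing "chainlink" instead corresponds to the second one.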
