Hi Mihály

On Tue, Sep 25, 2012 at 9:07 PM, Mihály Héder <[email protected]> wrote:
> Hi All,
>
> I have written a blog post about the lessons learnt from the EAP project I
> had been working on:
> http://blog.iks-project.eu/lessons-learnt-while-working-with-apache-stanbol/
>

Thanks for this blog post. It is really valuable feedback.
I will try to answer some of your questions.

> The reason I'm citing this here is that I'm interested in your opinion on
> the following mid-term development questions and suggestions (discussed in
> detail in the post):
> -What is the best way to monitor a running Stanbol instance with
> munin/nagios/icinga, etc? How can I extract e.g. an enhancements/hour
> statistic from Stanbol?

Within Apache Stanbol the EnhancementJobManager collects the
ExecutionMetadata [1]. It is stored in a separate ContentPart of the
processed ContentItem.

So one possibility would be to add a feature to the EnhancementJobManager
that logs this information (or even stores it in an RDF triple store).

If we do that, it would allow very fine-grained analysis of the requests
processed by the Stanbol Enhancer.
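
As a rough sketch of the kind of analysis such a log would enable: the
following assumes (hypothetically, no such log exists today) that the
EnhancementJobManager writes one ISO timestamp per completed request. The
function name and the sample data are made up for illustration. It simply
buckets the timestamps by hour, which is the figure a munin/nagios plugin
would report.

```python
from collections import Counter
from datetime import datetime


def enhancements_per_hour(completion_times):
    """Count completed enhancement requests per clock hour."""
    counts = Counter()
    for stamp in completion_times:
        finished = datetime.fromisoformat(stamp)
        # Truncate to the start of the hour to form the bucket key.
        bucket = finished.replace(minute=0, second=0, microsecond=0)
        counts[bucket.isoformat()] += 1
    return dict(counts)


# Hypothetical completion timestamps, one per processed ContentItem.
sample = [
    "2012-09-25T21:07:13",
    "2012-09-25T21:45:02",
    "2012-09-25T22:01:30",
]
hourly = enhancements_per_hour(sample)
```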


[1] http://stanbol.apache.org/docs/trunk/components/enhancer/executionmetadata.html

> -I think at some point we should create a standardized REST API through
> which non-java EEs could be accessed.

I am not sure what such an interface should look like. I could think
of an interface that POSTs the current metadata of the ContentItem
to some URI. The result could again be RDF that is then added to the
ContentItem. Maybe one could even allow the definition of some kind of
filter, so that not all of the RDF metadata needs to be serialized.
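
To make the idea a bit more concrete, here is a minimal sketch of the
request Stanbol might send to an external engine. Everything specific in it
is an assumption: the engine URI, the choice of Turtle as the serialization,
and the sample triple are placeholders, not an agreed interface. The request
is only built, not sent.

```python
from urllib.request import Request


def build_engine_request(engine_uri, rdf_turtle):
    """Build a POST carrying the ContentItem metadata as Turtle."""
    body = rdf_turtle.encode("utf-8")
    return Request(
        engine_uri,
        data=body,
        # Assumed media type; the engine would answer with RDF as well.
        headers={"Content-Type": "text/turtle", "Accept": "text/turtle"},
        method="POST",
    )


# Placeholder metadata and a hypothetical external engine endpoint.
metadata = '<urn:content-item:example> <http://purl.org/dc/terms/language> "en" .\n'
request = build_engine_request("http://localhost:9000/external-engine", metadata)
```

The RDF returned by the engine would then be merged into the metadata of
the ContentItem.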

Non-Java EEs that also need the content (e.g. the text/plain Blob)
would need a different kind of interface.

BTW: Serialization/deserialization of ContentItems is already
implemented (using multipart MIME).

> -Also, I think that if we had some standardized description XML or whatever
> format that would tell what kind of output a certain EE produces, that
> would be helpful.

I would really like to have EnhancementEngines providing RDF
descriptions of themselves when making a GET request to

    http://{stanbol-instance}/enhancer/engine/{engine-name}

If those descriptions also included information about the
consumed/produced elements, that would be great.
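
From the client side, asking for such a description could look roughly like
the sketch below. The host, port, and engine name are placeholders, and the
Accept header is only an assumption about which RDF serialization would be
offered. The request is built but not sent.

```python
from urllib.parse import quote
from urllib.request import Request


def engine_description_request(stanbol_base, engine_name):
    """Build a GET for the (proposed) RDF description of one engine."""
    base = stanbol_base.rstrip("/")
    # Escape the engine name so it is safe as a single path segment.
    url = f"{base}/enhancer/engine/{quote(engine_name, safe='')}"
    return Request(url, headers={"Accept": "application/rdf+xml"}, method="GET")


# Placeholder instance URL and engine name.
request = engine_description_request("http://localhost:8080/", "langid")
```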

However, this feature is much more important for UIMA than for Stanbol,
because in Stanbol EnhancementEngines are expected to create
annotations that conform to the EnhancementStructure.

best
Rupert


-- 
| Rupert Westenthaler             [email protected]
| Bodenlehenstraße 11                             ++43-699-11108907
| A-5500 Bischofshofen
