[ https://issues.apache.org/jira/browse/STANBOL-652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13401422#comment-13401422 ]
Pablo Mendes commented on STANBOL-652:
--------------------------------------
I like it a lot.
Saving the evaluation results to an RDF store would then let anyone slice and
dice the results however they want.
For example: how many correct results are there for person annotations? A query
along these lines (with illustrative prefixes) would answer that:
# Namespace URIs here are illustrative placeholders, not Stanbol's actual ones.
PREFIX sbc: <http://stanbol.apache.org/ontology/sparql-benchmark#>
PREFIX dbpedia: <http://dbpedia.org/ontology/>

SELECT (COUNT(?result) AS ?correctPersonResults)
WHERE {
  ?result sbc:state sbc:benchmark-state-succeeded .
  ?result sbc:about ?annotation .
  # The original sketch used a bare "entity"; sbc:entity is assumed here.
  ?annotation sbc:entity ?person .
  ?person a dbpedia:Person .
}
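Dividing that count by the total number of results about person annotations
(the same query minus the sbc:state constraint) would then give precision for
the person type, which is exactly the kind of summary score this issue asks for.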
Sharing this RDF via a SPARQL endpoint would then allow quick Web-based report
generation (i.e. visualization of results) with something like Sgvizler:
http://code.google.com/p/sgvizler/
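Sgvizler draws the result table of a SELECT query as a chart, so a grouped
count per benchmark state (same illustrative sbc: vocabulary as above) would be
directly chartable:

PREFIX sbc: <http://stanbol.apache.org/ontology/sparql-benchmark#>

SELECT ?state (COUNT(?result) AS ?results)
WHERE {
  ?result sbc:state ?state .
}
GROUP BY ?state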
Although what I'd actually do is export a CSV from this and use R to analyze
the results. But even getting that CSV should be trivial from the RDF.
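Many SPARQL endpoints can serialize SELECT results directly as CSV, so a flat
query like this sketch (again with the illustrative vocabulary) yields one row
per benchmark result, ready for read.csv() in R:

PREFIX sbc: <http://stanbol.apache.org/ontology/sparql-benchmark#>

SELECT ?result ?state ?annotation ?entity
WHERE {
  ?result sbc:state ?state ;
          sbc:about ?annotation .
  OPTIONAL { ?annotation sbc:entity ?entity . }
}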
> Benchmark should report evaluation summary
> ------------------------------------------
>
> Key: STANBOL-652
> URL: https://issues.apache.org/jira/browse/STANBOL-652
> Project: Stanbol
> Issue Type: Improvement
> Components: Testing
> Reporter: Pablo Mendes
> Priority: Minor
> Labels: benchmark, evaluation
>
> The SBC is a nice way to perform manual inspection of the behavior of the
> enhancement chain for different examples in the evaluation dataset. However,
> for evaluations with several hundred examples, it would be interesting to
> have scores that summarize the performance over the entire dataset, for
> example precision, recall, and F1. An evaluation dataset is
> available here in BDL: http://spotlight.dbpedia.org/download/stanbol/