
First:
* I had trouble applying your patch. I think I finally managed to apply it correctly, but you might want to validate this.
* To make further work easier, I created a separate branch: https://svn.apache.org/repos/asf/incubator/stanbol/branches/celi-enhancement-engines
Generated Enhancements:
To make the validation of the Stanbol Enhancement Structure consistent across all EnhancementEngines, I implemented STANBOL-612. This new validation utility is now also used for the CELI NER engine, and it identified several issues with the created Enhancements. I have already fixed some of them, but two remain where I will most likely need your help.
1) The NER enhancement for "28 septembre 1934" (time) returns "28 eptembre 1934 " as formKind. Because of that, the "selected-text" does not correspond to the parsed text and the validation fails.
2) The start/end positions for "Paris" are off by two chars. The validation states that "ris, " is selected instead of "Paris".
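To illustrate the kind of check that fails in both cases, here is a minimal sketch that compares the reported selected-text against the substring addressed by the start/end positions. The class `SelectionCheck` and its `validate` method are hypothetical names for illustration only; this is not the code of the STANBOL-612 validation utility.

```java
import java.util.Optional;

/** Hypothetical helper for illustration; not part of Stanbol. */
final class SelectionCheck {
    private SelectionCheck() {}

    /**
     * Compares the text addressed by [start, end) with the reported selected-text.
     * Returns an empty Optional if they match, otherwise a description of the mismatch.
     */
    static Optional<String> validate(String content, int start, int end, String selectedText) {
        if (start < 0 || end > content.length() || start > end) {
            return Optional.of("invalid range [" + start + ", " + end
                    + ") for text of length " + content.length());
        }
        String actual = content.substring(start, end);
        if (actual.equals(selectedText)) {
            return Optional.empty();
        }
        return Optional.of("expected \"" + selectedText
                + "\" but the range selects \"" + actual + "\"");
    }
}
```

With a sentence containing "Paris, ", shifting both positions by two chars makes such a check report "ris, " as the selected range, which is the symptom described in 2).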
Alessio, it would be good if you could have a look at the two issues described above, as having access to the server-side logs seems critical for working on them.
best
Rupert
Detailed list of my changes to the CELI NER engine:
* I have made the supported language(s) configurable, as I expect that configuring a different service URL might make it possible to support other languages. Multiple languages can be configured as
  * comma separated String e.g. "fr;it;de"
  * Array or Collection of Strings e.g. ["fr","it","de"]
* Enhancement creation:
  * start/end positions are now xsd:int
  * implemented a simple extraction of the selection-context (max 50 char prefix/suffix around the selection, but it tries to cut at word boundaries)
  * selected-text and selection-context are now PlainLiterals and use the language detected for the text.
* UnitTest now uses the EnhancementStructureHelper to validate created enhancements
* Instead of using the CELI language identification engine, the test now statically adds the triple "ci.getUri(),dc:language,'fr'" to the Enhancement graph.
* NER http client:
  * tried to add "<?xml version="1.0" encoding="UTF-8"?>" to the request. However, this did not have the expected result.
  * revision #1338567 adds support for streaming the XML-escaped text directly to the HTTP request.
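
For reference, the selection-context extraction mentioned above (at most 50 chars before and after the selection, cut at word boundaries where possible) could look roughly like the following sketch. The class `SelectionContext`, its `extract` method, and the exact cutting rules are assumptions made for illustration; the code committed to the branch may differ.

```java
/** Hypothetical sketch for illustration; not the code committed to the branch. */
final class SelectionContext {
    /** Maximum number of chars kept before and after the selection. */
    static final int MAX_CONTEXT = 50;

    private SelectionContext() {}

    /** Returns the selection [start, end) plus up to MAX_CONTEXT chars on each side. */
    static String extract(String text, int start, int end) {
        if (start < 0 || end > text.length() || start > end) {
            throw new IllegalArgumentException("invalid selection range");
        }
        int from = Math.max(0, start - MAX_CONTEXT);
        int to = Math.min(text.length(), end + MAX_CONTEXT);

        // The window starts inside a word: skip ahead to the next word,
        // unless that would cut into the selection itself.
        if (from > 0 && !Character.isWhitespace(text.charAt(from - 1))) {
            int ws = text.indexOf(' ', from);
            if (ws >= 0 && ws < start) {
                from = ws + 1;
            }
        }
        // The window ends inside a word: step back to the previous space,
        // unless that would cut into the selection itself.
        if (to < text.length() && !Character.isWhitespace(text.charAt(to))) {
            int ws = text.lastIndexOf(' ', to - 1);
            if (ws >= end) {
                to = ws;
            }
        }
        return text.substring(from, to);
    }
}
```

If no space is found inside the window, this sketch falls back to the hard 50 char cut, so the context never grows beyond the limit.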