[ 
https://issues.apache.org/jira/browse/STANBOL-613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13277685#comment-13277685
 ] 

Rupert Westenthaler commented on STANBOL-613:
---------------------------------------------

fise:TextAnnotations describing the language of an analyzed text should also use 
"http://purl.org/dc/terms/LinguisticSystem" as dc:type:


    ?la rdf:type fise:TextAnnotation, fise:Enhancement
    ?la dc:type dc:LinguisticSystem
    ?la dc:language ?lang

and all properties required by fise:Enhancement
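
As a rough illustration of the pattern above, the following sketch selects the annotations that match it. It assumes the enhancement graph is available as plain (subject, predicate, object) string tuples with prefixed names; the function name and the example data are hypothetical and are not part of the Stanbol API.

```python
# Hypothetical sketch: triples are plain (subject, predicate, object) tuples
# of strings using prefixed names, not a real RDF API.
REQUIRED_RDF_TYPES = {"fise:TextAnnotation", "fise:Enhancement"}


def language_annotations(triples):
    """Return {subject: language} for annotations matching the pattern."""
    rdf_types, dc_types, languages = {}, {}, {}
    for s, p, o in triples:
        if p == "rdf:type":
            rdf_types.setdefault(s, set()).add(o)
        elif p == "dc:type":
            dc_types.setdefault(s, set()).add(o)
        elif p == "dc:language":
            languages[s] = o
    return {
        s: lang
        for s, lang in languages.items()
        if REQUIRED_RDF_TYPES <= rdf_types.get(s, set())
        and "dc:LinguisticSystem" in dc_types.get(s, set())
    }


EXAMPLE = [
    ("urn:la1", "rdf:type", "fise:TextAnnotation"),
    ("urn:la1", "rdf:type", "fise:Enhancement"),
    ("urn:la1", "dc:type", "dc:LinguisticSystem"),
    ("urn:la1", "dc:language", "en"),
    # Has a dc:language but no dc:type, so it is not selected.
    ("urn:ta2", "rdf:type", "fise:TextAnnotation"),
    ("urn:ta2", "rdf:type", "fise:Enhancement"),
    ("urn:ta2", "dc:language", "de"),
]
```

With the example data only urn:la1 is selected, because urn:ta2 lacks the dc:type value.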
                
> Define a standard way on how to obtain the extracted language
> -------------------------------------------------------------
>
>                 Key: STANBOL-613
>                 URL: https://issues.apache.org/jira/browse/STANBOL-613
>             Project: Stanbol
>          Issue Type: Sub-task
>          Components: Enhancer
>            Reporter: Rupert Westenthaler
>            Assignee: Rupert Westenthaler
>            Priority: Minor
>             Fix For: 0.10.0-incubating
>
>
> With the addition of the CELI Language Identification Engine there are now 
> two different engines that support the same feature.
> However, engines that consume the detected language are currently "hard 
> coded" to the LangId Engine (enhancer/engines/langid). This needs to 
> be changed to allow the adoption of alternatives, such as the CELI based 
> implementation.
> The suggestion is to use the following patterns to extract the language:
> (1) via Annotations:
>   ?x rdf:type fise:TextAnnotation .
>   ?x dc:language ?language .
>   OPTIONAL {
>     ?x dc:creator ?engine
>   }
>   OPTIONAL {
>     ?x fise:confidence ?confidence
>   }
> (2) via ContentItem metadata
>   ?ci dc:language ?language
> (2) is a fallback if (1) delivers no results.
> Methods that
>  * extract the language (with the highest confidence) - including fallback to 
> (2)
>  * extract all languages (sorted by confidence) - including fallback to (2)
>  * extract all TextAnnotations with dc:language values
> are added to the EnhancementEngineHelper utility of the enhancer.servicesapi 
> module
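
The lookup described in the issue can be sketched roughly as follows. This is only an illustration of patterns (1) and (2) with the fallback, not the actual EnhancementEngineHelper code: the function names are hypothetical, the graph is assumed to be a list of (subject, predicate, object) string tuples, and annotations without a fise:confidence value are assumed to rank as 0.0.

```python
# Hypothetical sketch of the language lookup with fallback; triples are
# plain (subject, predicate, object) tuples of strings.
def get_languages(triples, content_item):
    """All detected languages, highest confidence first.

    Falls back to the dc:language values of the content item (pattern 2)
    if no TextAnnotation carries a dc:language (pattern 1).
    """
    annotations = {s for s, p, o in triples
                   if p == "rdf:type" and o == "fise:TextAnnotation"}
    confidence = {s: float(o) for s, p, o in triples
                  if p == "fise:confidence"}
    # Assumption: a missing confidence is treated as 0.0.
    scored = [(confidence.get(s, 0.0), o) for s, p, o in triples
              if p == "dc:language" and s in annotations]
    scored.sort(key=lambda pair: -pair[0])  # stable sort, best first
    languages = []
    for _, language in scored:
        if language not in languages:
            languages.append(language)
    if languages:
        return languages
    return [o for s, p, o in triples
            if s == content_item and p == "dc:language"]


def get_language(triples, content_item):
    """The language with the highest confidence, or None if none is known."""
    languages = get_languages(triples, content_item)
    return languages[0] if languages else None
```

A consumer would then call get_language(graph, ci_uri) instead of looking for the output of one specific engine.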

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
